Redaction
Schema for redaction.yaml — channels, named functions, per-topic transforms, metadata mappings
The canonical reference for redaction.yaml. For the why-and-when, see Redact.

    # Master switch.
    enabled: true

    # Optional: merge named functions from additional files.
    includes:
      - /etc/alloy/transforms-common.yaml

    # Topic-level allow/deny. Cheapest filter; runs before any decode.
    channels:
      allow: ["*"]
      deny: ["/user/*", "/audio/**"]

    # Salt for hash(...) helpers. ${VAR} is read from the environment at load time.
    hash_salt: "${ALLOY_HASH_SALT}"

    # Per-channel mappings. First match wins.
    transforms: [...]

    # Per-metadata-record mappings. Same shape as transforms.
    metadata: [...]

    # The named-function library. Referenced by transforms / metadata via `function:`.
    functions: { ... }

    # Optional: rule-file-level audit defaults. Operational overrides (CLI flags,
    # edge-sync.yaml's `audit:` block) take precedence. Useful for "this rule set
    # always wants its audit embedded" without forcing every call site to pass a flag.
    audit:
      embed_in_mcap: true

    # Write filtered output to a sandbox dir and skip uploads.
    dry_run: false

Top-level
| Field | Type | Default | Description |
|---|---|---|---|
| `enabled` | bool | `false` | Master switch. `false` → the redactor is bypassed even though edge-sync's `redaction.enabled` may be `true`. |
| `includes` | list of paths | `[]` | Additional files merged into the same `functions:` namespace, in declaration order. Last file wins on name collision; the entry-point file's own `functions:` always wins overall. Paths are resolved relative to the directory of the file that contains them. |
| `channels` | object | `{}` | Topic-level allow/deny. See below. |
| `hash_salt` | string | (none) | Salt for `hash(...)` and friends. `${VAR}` is interpolated from the environment at load time. Missing env var → load-time error. |
| `functions` | map | `{}` | Named transform bodies: the "library" referenced by `transforms:` and `metadata:`. |
| `transforms` | list | `[]` | Per-channel mappings. First match wins. |
| `metadata` | list | `[]` | Per-metadata-record mappings. Same shape as `transforms`, but `match:` globs against record names. |
| `audit` | object | (none) | Rule-file-level audit defaults. Currently one field: `embed_in_mcap` (bool). Absent → defer to edge-sync.yaml's `audit:` block / CLI flags. |
| `dry_run` | bool | `false` | When `true`, write filtered output to `<input_dir>/.dry-run/<timestamp>/` and skip uploads. CLI `--dry-run` overrides the file. |
channels
Topic-level allow/deny — the cheapest filter. Runs before any decode, so denied channels never pay the CDR-decode cost.
| Field | Type | Default | Description |
|---|---|---|---|
| `allow` | list of globs | (none) | Fail-closed. `null` or `[]` → no topic passes. `["*"]` → all topics pass (only `deny` filters). `["/odom", "/tf*"]` → only matching topics pass. |
| `deny` | list of globs | `[]` | Always wins on conflict: a deny match drops the topic regardless of `allow`. |
Globs are POSIX-style: * matches any segment, ** matches across segments.
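
To make the fail-closed and deny-wins behaviour concrete, here is a minimal sketch. The topic names in the comments (`/tf_static`, `/camera/image`) are illustrative assumptions, not part of any shipped configuration.

```yaml
channels:
  # Only topics matching one of these globs are candidates to pass.
  allow: ["/odom", "/tf*"]
  # A deny match drops the topic even though "/tf*" also matches it.
  deny: ["/tf_static"]

# Outcome expected under the rules described above:
#   /odom          passes  (matches allow, no deny match)
#   /tf            passes  (matches "/tf*")
#   /tf_static     dropped (deny always wins on conflict)
#   /camera/image  dropped (allow is fail-closed; no glob matches)
```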
transforms[] / metadata[]
Each entry has exactly one of function: or transform:. Both or neither → load-time error.
| Field | Type | Required | Description |
|---|---|---|---|
| `match` | string or object | yes | Channel/record selector. See Match selector below. |
| `function` | string | one of | Reference to a named function in `functions:` (or merged from `includes:`). |
| `transform` | object | one of | Inline transform body, same shape as a `functions:` entry. |
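
The two entry shapes can be sketched side by side as follows. This is an illustrative fragment only: the topics `/odom` and `/operator/cmd` are assumed names, and the inline body simply reuses the `notes` override pattern shown elsewhere in this reference.

```yaml
functions:
  # Identity put: no template, so the runtime passes the message through.
  identity_pass:
    type: put

transforms:
  # Shape 1: `function:` references a named body from `functions:`.
  - match: "/odom"                 # hypothetical topic
    function: identity_pass

  # Shape 2: `transform:` carries the body inline at the call site.
  - match: "/operator/cmd"         # hypothetical topic
    transform:
      type: patch
      schema: "your_msgs/OperatorCommand"
      overrides:
        notes: '""'

  # An entry that sets both `function:` and `transform:`, or neither,
  # is rejected at load time.
```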
Match selector
Two equivalent forms — pick the shorthand for the common case.

    # Shorthand: channel only (or record-name only for `metadata:`)
    match: "/diagnostics"

    # Object form: full selector
    match:
      channel: "/diagnostics"
      schema: "diagnostic_msgs/DiagnosticArray"  # disambiguates same-topic-different-schema (rare but legal)
      encoding: "cdr"                            # or "ros2msg", "protobuf", etc.

For the metadata: block, the shorthand maps to record_name instead of channel:

    match: "operator_*"  # shorthand → record_name

    match:
      record_name: "operator_*"  # object form
      key: "operator_email"      # optional inner-key glob

| Field | Used by | Description |
|---|---|---|
| `channel` | transforms | Topic glob (e.g. `/diagnostics`, `/sensors/*`). |
| `schema` | transforms | Optional schema-name glob; disambiguates two channels with the same topic but different schemas. |
| `encoding` | transforms | Optional encoding glob, e.g. `cdr`, `ros2msg`, `protobuf`. |
| `record_name` | metadata | MCAP metadata-record name glob. |
| `key` | metadata | Optional inner-key glob within a metadata record. |
Transform bodies — put and patch
Each transform body has a type: discriminator. v1 ships two: put and patch.
type: put — whitelist
Spell out the entire output. Fields not mentioned in the template disappear. New upstream fields don't leak.
| Field | Type | Default | Description |
|---|---|---|---|
| `schema` | string | (none) | ROS2 schema name the template targets (e.g. `diagnostic_msgs/DiagnosticArray`). Optional for `metadata:` rules. |
| `available_fields` | list of strings | `[]` | Documentation-only: human-readable list of fields the template references. Not enforced. |
| `template` | string (Jinja2) | (none) | Jinja2 template rendering the full output JSON. Omit `template:` for the identity put: the runtime short-circuits to a zero-copy passthrough (same cost as having no rule), useful when an operator wants the config to document a deliberate "this channel is fine, leave it alone." |
type: patch — denylist
Pass everything through, override only the listed fields.
| Field | Type | Default | Description |
|---|---|---|---|
| `schema` | string | (none) | ROS2 schema name. Optional for `metadata:` rules. |
| `overrides` | map | `{}` | Field-path → Jinja2 expression. Path syntax is `field`, `field.sub`, `array[].field`. |
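
The three path forms might look like this in practice. Treat it as a sketch under stated assumptions: the function name `redact_operator_cmd_paths`, the `header` and `attachments` fields, and the reading of `array[].field` as "that field on each element" are all hypothetical and not confirmed by this reference.

```yaml
# hash(...) is used below, so a salt must be configured.
hash_salt: "${ALLOY_HASH_SALT}"

functions:
  redact_operator_cmd_paths:        # hypothetical function name
    type: patch
    schema: "your_msgs/OperatorCommand"
    overrides:
      # field: a top-level field
      notes: '""'
      # field.sub: a nested field (assumes the message carries a header)
      "header.frame_id": '{{ original | zero }}'
      # array[].field: assumed to address that field on each array element
      "attachments[].filename": '{{ original | hash(algo="sha256") }}'
```

Fields not listed under `overrides:` pass through unchanged, which is what distinguishes `patch` from `put`.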
functions
Named transform bodies. Same shape as a transform: entry, minus the match: key (which lives on the call site).

    functions:
      # Identity put: runtime short-circuits to zero-copy passthrough.
      identity_pass:
        type: put

      # Whitelist redactor for an operator-command message.
      redact_operator_cmd:
        type: put
        schema: "your_msgs/OperatorCommand"
        template: |-
          {
            "operator_id": {{ operator_id | hash(algo="sha256") | tojson }},
            "command": {{ command | tojson }},
            "timestamp": {{ timestamp | tojson }}
          }

      # Denylist counterpart: same schema, only override two fields.
      redact_operator_cmd_lite:
        type: patch
        schema: "your_msgs/OperatorCommand"
        overrides:
          operator_id: '{{ original | hash(algo="sha256") }}'
          notes: '""'

Functions from includes: land in the same namespace; local definitions in the entry-point file always win on collision.
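
A possible two-file layout illustrating the collision rule is sketched below. The files are shown in one block separated by `---` for display only. The assumption that an included file carries its own top-level `functions:` key, and the topic names, are illustrative.

```yaml
# /etc/alloy/transforms-common.yaml (included file)
functions:
  identity_pass:
    type: put
  redact_operator_cmd:
    type: patch
    schema: "your_msgs/OperatorCommand"
    overrides:
      operator_id: '{{ original | zero }}'
---
# redaction.yaml (entry-point file)
enabled: true
includes:
  - /etc/alloy/transforms-common.yaml
hash_salt: "${ALLOY_HASH_SALT}"

functions:
  # Same name as the included definition: this local body is the one used.
  redact_operator_cmd:
    type: patch
    schema: "your_msgs/OperatorCommand"
    overrides:
      operator_id: '{{ original | hash(algo="sha256") }}'

transforms:
  - match: "/operator/cmd"          # hypothetical topic
    function: redact_operator_cmd   # resolves to the local definition
  - match: "/odom"                  # hypothetical topic
    function: identity_pass         # defined only in the include
```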
Template helpers
Inside template: and overrides: expressions, the rendering engine is MiniJinja (Jinja2-compatible). On top of the standard filters, the redactor adds:
| Helper | Use | Example |
|---|---|---|
| `hash(algo=...)` | Full hex digest of `sha256(salt + value)`. `algo`: `md5`, `sha256`. Use this for de-identification. Requires `hash_salt`. | `'{{ original \| hash(algo="sha256") }}'` |
| `sha256_short` | First 8 hex chars of `sha256(value)`, unsalted. Content fingerprint, not de-identification (reversible with a wordlist). | `'{{ s.name \| sha256_short }}'` |
| `regex_strip(pattern, replacement)` | Replace regex matches in the value with a literal string. | `'{{ original \| regex_strip("[A-Z]+", "[X]") }}'` |
| `regex_redact([patterns])` | Apply a list of regex patterns; replace every match with `[REDACTED]`. Convenience wrapper for chaining several `regex_strip` calls; patterns are user-supplied. | `'{{ original \| regex_redact([email_re, phone_re]) }}'` |
| `redact_if(predicate, replacement)` | Replace the value only when the predicate matches; otherwise pass through. | `'{{ original \| redact_if("hostname", "[HOST]") }}'` |
| `zero` | Type-aware zero: `""` for strings, `0` for numbers, etc. | `'{{ original \| zero }}'` |
| `tojson` | JSON-encode a value (custom registration; minijinja's built-in `tojson` requires a feature this crate doesn't pull in). | `'{{ header \| tojson }}'` |
original is a context variable, not a filter. It's bound only inside patch.overrides (the pre-override field value) and metadata: rule expressions (the pre-redaction string). put.template does not see original — it sees the full decoded message and addresses fields by name ({{ header | tojson }}, {{ status[0].level }}).
hash_salt must be set if any rule calls hash(...). sha256_short does not consume the salt.
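
As a rough illustration of how several helpers could be combined in one patch body, consider the fragment below. The function name `scrub_operator_cmd` is hypothetical, and the choice of helper per field is only for demonstration. The field names come from the `your_msgs/OperatorCommand` example in this reference.

```yaml
# Required because hash(...) is used below; sha256_short ignores it.
hash_salt: "${ALLOY_HASH_SALT}"

functions:
  scrub_operator_cmd:               # hypothetical function name
    type: patch
    schema: "your_msgs/OperatorCommand"
    overrides:
      # Salted digest: the helper intended for de-identification.
      operator_id: '{{ original | hash(algo="sha256") }}'
      # Unsalted 8-character fingerprint: not de-identification.
      command: '{{ original | sha256_short }}'
      # Replace every run of digits with a literal marker.
      notes: '{{ original | regex_strip("[0-9]+", "[N]") }}'
      # Type-aware zero for the field's type.
      timestamp: '{{ original | zero }}'
```

Every expression here uses `original` because the body is a `patch`. The same helpers in a `put` template would address fields by name instead.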
Resolution rules
- Channel filter runs first. Denied topics drop before any decode.
- First match wins per channel / record name. Anything not listed in transforms: is an implicit passthrough (no decode, no allocation, byte-for-byte copy).
- includes: merge in declaration order (last include wins on function-name collision); local functions: in the entry-point file override every include.
- Bad templates fail at load time. A Jinja syntax error in any template / override expression rejects the rules file before the agent starts.
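
Because the first match wins, entry order matters when a specific topic and a broader glob overlap. The sketch below assumes hypothetical topics under `/sensors/gps/` and a hypothetical function `zero_position`. The `latitude` and `longitude` fields are those of `sensor_msgs/NavSatFix`.

```yaml
functions:
  identity_pass:
    type: put
  zero_position:                    # hypothetical function name
    type: patch
    schema: "sensor_msgs/NavSatFix"
    overrides:
      latitude: '{{ original | zero }}'
      longitude: '{{ original | zero }}'

transforms:
  # Listed first, so this is the rule applied to /sensors/gps/fix.
  - match: "/sensors/gps/fix"
    function: zero_position
  # Also matches /sensors/gps/fix, but only other /sensors/gps/ topics reach it.
  - match: "/sensors/gps/*"
    function: identity_pass

# Swapping the two entries would let the glob win for /sensors/gps/fix,
# and zero_position would never run. Topics matching neither entry are
# implicit passthroughs.
```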
Reload semantics
Edits to redaction.yaml (or any included file) take effect only after edge-sync restarts. The rules file is read once at startup; there is no file watcher today. Hot-reload is planned for a future release, using the source_files set the loader populates.
The merged config is hashed (sha256:<hex>, recorded as rules_hash on every audit entry) so you can prove later which version of the rules produced a given file.