Processors
Processors sit between sources and sinks. They transform, enrich, filter, batch, and sample telemetry. Processors can be chained: each one lists the upstream sources or processors it reads from in its sources option.
batch — Batching
Buffers records and flushes downstream on a time or size threshold. Always place a batch processor before sinks to reduce the number of export requests.
processors:
  batch_main:
    type: batch
    timeout: 5s
    max_size: 10000
    sources: [otlp_in, host_metrics]
| Option | Default | Description |
|---|---|---|
| timeout | 5s | Flush if this much time passes since last flush |
| max_size | 10000 | Flush when the batch reaches this many records |
| sources | required | Upstream source or processor names |
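The flush rule above (flush on size, or on time since the last flush) can be sketched in Python. This is an illustration only, not the agent's implementation: the `Batcher` class, its `tick` method, and the injectable `clock` are hypothetical names introduced for the example.

```python
import time
from typing import Any, Callable, List


class Batcher:
    """Buffers records and flushes on a size or time threshold (illustrative)."""

    def __init__(self, flush: Callable[[List[Any]], None],
                 timeout_s: float = 5.0, max_size: int = 10000,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.flush_fn = flush
        self.timeout_s = timeout_s
        self.max_size = max_size
        self.clock = clock
        self.buffer: List[Any] = []
        self.last_flush = clock()

    def add(self, record: Any) -> None:
        self.buffer.append(record)
        if len(self.buffer) >= self.max_size:
            self._flush()  # size threshold reached

    def tick(self) -> None:
        """Call periodically; flushes a non-empty buffer once the timeout has elapsed."""
        if self.buffer and self.clock() - self.last_flush >= self.timeout_s:
            self._flush()

    def _flush(self) -> None:
        batch, self.buffer = self.buffer, []
        self.last_flush = self.clock()
        self.flush_fn(batch)


# Usage with a fake clock so the behaviour is deterministic.
sent: List[List[str]] = []
now = [0.0]
batcher = Batcher(sent.append, timeout_s=5.0, max_size=3, clock=lambda: now[0])
for record in ("a", "b", "c", "d"):
    batcher.add(record)   # the third add reaches max_size and flushes
now[0] = 5.0
batcher.tick()            # timeout elapsed, so the remaining record is flushed
```

With a small `max_size`, the sink receives one full batch and then a partial batch once the timeout fires, which is why a short timeout keeps latency bounded even at low volume.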
attributes — Attribute Mutations
Add, remove, rename, update, or hash attributes on log records, metric data points, and trace spans.
processors:
  add_env:
    type: attributes
    sources: [otlp_in]
    actions:
      - action: insert
        key: deployment.environment
        value: "production"
      - action: upsert
        key: host.name
        value: "${HOSTNAME}"
      - action: delete
        key: http.request.header.authorization
      - action: hash
        key: user.email
      - action: rename
        key: old_key
        new_key: new_key
| Action | Description |
|---|---|
| insert | Add the key only if it does not exist |
| upsert | Add or overwrite the key |
| delete | Remove the key |
| hash | Replace the value with its SHA-256 hex digest |
| rename | Rename a key while preserving its value |
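To make the difference between the actions concrete, here is a small Python sketch of the semantics in the table above. It is an assumption-laden illustration: `apply_actions` is a hypothetical helper, values are assumed to be UTF-8 encoded before hashing, and actions on a missing key are assumed to be no-ops.

```python
import hashlib
from typing import Any, Dict, List


def apply_actions(attrs: Dict[str, Any], actions: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Apply attribute actions in order and return a new attribute map."""
    out = dict(attrs)
    for a in actions:
        kind, key = a["action"], a["key"]
        if kind == "insert":
            out.setdefault(key, a["value"])       # only if the key is absent
        elif kind == "upsert":
            out[key] = a["value"]                 # add or overwrite
        elif kind == "delete":
            out.pop(key, None)
        elif kind == "hash" and key in out:
            # Assumption: the value is stringified and UTF-8 encoded first.
            out[key] = hashlib.sha256(str(out[key]).encode("utf-8")).hexdigest()
        elif kind == "rename" and key in out:
            out[a["new_key"]] = out.pop(key)      # value is preserved
    return out


result = apply_actions(
    {"deployment.environment": "staging", "user.email": "a@example.com", "old_key": 1},
    [
        {"action": "insert", "key": "deployment.environment", "value": "production"},
        {"action": "hash", "key": "user.email"},
        {"action": "rename", "key": "old_key", "new_key": "new_key"},
    ],
)
```

Here `insert` leaves the existing `staging` value untouched, whereas `upsert` would have replaced it with `production`.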
filter — Record Filtering
Keeps records that match the configured conditions, such as severity, attributes, or body content. Records that do not match are dropped.
processors:
  drop_debug:
    type: filter
    sources: [add_env]
    logs:
      severity_number:
        min: 9   # keep INFO (9) and above; drop TRACE (1-4) and DEBUG (5-8)
  production_only:
    type: filter
    sources: [otlp_in]
    logs:
      attributes:
        deployment.environment: "production"
Log filter options:
| Option | Description |
|---|---|
| severity_number.min | Minimum OTLP severity number (1=TRACE, 5=DEBUG, 9=INFO, 13=WARN, 17=ERROR, 21=FATAL) |
| severity_number.max | Maximum OTLP severity number |
| attributes | Key-value map; all conditions must match |
| body_regex | Keep records whose body matches this regex |
Metric filter options:
| Option | Description |
|---|---|
| metric_names | List of metric name regexes to keep |
| attributes | Key-value attribute conditions |
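The log filter options listed above can be read as a single keep/drop predicate. The Python sketch below shows one plausible reading. It assumes that all configured options must pass together and that `body_regex` is an unanchored search; neither is stated in this document, and `keep_log` is a hypothetical name.

```python
import re
from typing import Any, Dict, Optional


def keep_log(record: Dict[str, Any], *,
             severity_min: Optional[int] = None,
             severity_max: Optional[int] = None,
             attributes: Optional[Dict[str, Any]] = None,
             body_regex: Optional[str] = None) -> bool:
    """Return True if the record should be kept, False if it should be dropped."""
    severity = record.get("severity_number", 0)
    if severity_min is not None and severity < severity_min:
        return False
    if severity_max is not None and severity > severity_max:
        return False
    record_attrs = record.get("attributes", {})
    for key, expected in (attributes or {}).items():
        if record_attrs.get(key) != expected:   # every condition must match
            return False
    if body_regex is not None and not re.search(body_regex, str(record.get("body", ""))):
        return False
    return True


info = {"severity_number": 9, "body": "started",
        "attributes": {"deployment.environment": "production"}}
debug = {"severity_number": 5, "body": "tick",
         "attributes": {"deployment.environment": "production"}}

kept_info = keep_log(info, severity_min=9)     # INFO passes min: 9
kept_debug = keep_log(debug, severity_min=9)   # DEBUG (5) is below the minimum
```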
transform — CEL Expressions
Mutate fields using Common Expression Language (CEL) expressions. Supports an optional where clause to apply mutations selectively.
processors:
  normalize:
    type: transform
    sources: [otlp_in]
    log_statements:
      - context: log
        where: 'severity_number < 9'
        statements:
          - 'set(severity_text, "DEBUG")'
      - context: log
        statements:
          - 'set(attributes["http.url"], redact_url(attributes["http.url"]))'
    metric_statements:
      - context: datapoint
        where: 'metric.name == "http.server.duration"'
        statements:
          - 'set(value_double, value_double / 1000)'   # ms → seconds
Statements are evaluated in order. The where clause skips the statement block when false.
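The evaluation order described above can be modelled in a few lines of Python. This sketch uses plain callables in place of CEL expressions, so it illustrates only the control flow (blocks in order, `where` gating a whole block), not expression parsing. `run_blocks` and the `seen` field are hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional

Record = Dict[str, Any]


def run_blocks(record: Record, blocks: List[Dict[str, Any]]) -> Record:
    """Run statement blocks in order, skipping a block whose where clause is false."""
    for block in blocks:
        where: Optional[Callable[[Record], bool]] = block.get("where")
        if where is not None and not where(record):
            continue                      # skip every statement in this block
        for statement in block["statements"]:
            statement(record)             # statements mutate the record in order
    return record


blocks = [
    {"where": lambda r: r["severity_number"] < 9,
     "statements": [lambda r: r.update(severity_text="DEBUG")]},
    {"statements": [lambda r: r.update(seen=True)]},   # no where: always runs
]

low = run_blocks({"severity_number": 5, "severity_text": ""}, blocks)
high = run_blocks({"severity_number": 17, "severity_text": "ERROR"}, blocks)
```

The second record keeps its original `severity_text` because the first block is skipped, while the unconditional block still runs for both records.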
sampling — Trace Sampling
Reduces trace volume before export. Supports head-based policies (the decision is made up front, without waiting for the rest of the trace) and tail-based policies (the decision is made after the full trace is complete).
processors:
  trace_sample:
    type: sampling
    sources: [otlp_in]
    # Tail-based: buffer complete traces before deciding
    decision_wait: 10s     # how long to wait for all spans
    max_traces: 100000     # max traces in buffer before eviction
    policies:
      - name: always_errors
        type: error                # keep any trace with an error span
      - name: slow_requests
        type: latency
        threshold_ms: 500          # keep traces slower than 500ms
      - name: baseline
        type: probabilistic
        sampling_percentage: 5     # keep 5% of everything else
| Policy type | Description |
|---|---|
| always_sample | Keep all traces (useful for testing) |
| error | Keep traces where any span has status = Error |
| latency | Keep traces with root-span duration > threshold_ms |
| probabilistic | Keep sampling_percentage% based on trace ID hash |
Policies are evaluated in order. The first matching policy wins.
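The first-match rule can be sketched as follows. This Python example is a simplified model rather than the agent's code: the trace dictionary layout, the `decide` and `hash_bucket` helpers, and the use of SHA-256 to bucket trace IDs are all assumptions made for illustration.

```python
import hashlib
from typing import Any, Dict, List, Optional


def hash_bucket(trace_id: str) -> int:
    """Map a trace ID to a stable bucket in the range 0..9999."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 10_000


def matches(policy: Dict[str, Any], trace: Dict[str, Any]) -> bool:
    kind = policy["type"]
    if kind == "always_sample":
        return True
    if kind == "error":
        return any(span.get("status") == "Error" for span in trace["spans"])
    if kind == "latency":
        return trace["root_duration_ms"] > policy["threshold_ms"]
    if kind == "probabilistic":
        # Same trace ID always lands in the same bucket, so the decision is stable.
        return hash_bucket(trace["trace_id"]) < policy["sampling_percentage"] * 100
    return False


def decide(trace: Dict[str, Any], policies: List[Dict[str, Any]]) -> Optional[str]:
    """Return the name of the first matching policy, or None to drop the trace."""
    for policy in policies:
        if matches(policy, trace):
            return policy["name"]
    return None


policies = [
    {"name": "always_errors", "type": "error"},
    {"name": "slow_requests", "type": "latency", "threshold_ms": 500},
    {"name": "baseline", "type": "probabilistic", "sampling_percentage": 5},
]
failed = {"trace_id": "t1", "root_duration_ms": 900, "spans": [{"status": "Error"}]}
slow = {"trace_id": "t2", "root_duration_ms": 800, "spans": [{"status": "Ok"}]}

failed_policy = decide(failed, policies)   # error policy comes first, so it wins
slow_policy = decide(slow, policies)
```

Because the error policy is listed first, a trace that is both slow and failed is attributed to `always_errors`; only traces that match neither of the first two policies reach the probabilistic baseline.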
k8s_attributes — Kubernetes Metadata
Injects Kubernetes pod, node, and namespace metadata into logs, metrics, and traces. Reads from the Kubernetes API using the agent's service account.
processors:
  k8s_enrich:
    type: k8s_attributes
    sources: [otlp_in, app_logs]
    extract:
      - k8s.pod.name
      - k8s.namespace.name
      - k8s.node.name
      - k8s.deployment.name
      - k8s.container.name
| Option | Default | Description |
|---|---|---|
| extract | all listed | Which attributes to inject |
| kubeconfig | in-cluster | Path to kubeconfig (leave empty when running in-cluster) |
The processor uses the pod IP of the record's originating process to look up pod metadata from the Kubernetes API. Required RBAC: get/list/watch on pods and namespaces.
aggregate — Metric Pre-Aggregation
Reduces metric cardinality by aggregating high-cardinality label sets before export. Useful for keeping cost down when labels like user_id or request_id are attached.
processors:
  agg:
    type: aggregate
    sources: [otlp_in]
    metrics:
      - name: "http.server.request.count"
        drop_attributes: [user_id, request_id]
        aggregation: sum
        interval: 60s
| Option | Description |
|---|---|
| name | Metric name (regex supported) |
| drop_attributes | Attributes to remove before aggregating |
| aggregation | sum, min, max, last |
| interval | Aggregation window |
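To show why dropping attributes reduces cardinality, the Python sketch below aggregates the data points of a single interval with `sum`. It is a simplified model under stated assumptions: `aggregate_sum` and the data point layout are hypothetical, and windowing by `interval` is left out.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple

LabelSet = Tuple[Tuple[str, Any], ...]


def aggregate_sum(points: Iterable[Dict[str, Any]],
                  drop_attributes: Iterable[str]) -> List[Dict[str, Any]]:
    """Sum data points that share a label set once drop_attributes are removed."""
    dropped = set(drop_attributes)
    totals: Dict[LabelSet, float] = defaultdict(float)
    for point in points:
        kept = {k: v for k, v in point["attributes"].items() if k not in dropped}
        totals[tuple(sorted(kept.items()))] += point["value"]
    return [{"attributes": dict(labels), "value": value}
            for labels, value in totals.items()]


points = [
    {"attributes": {"method": "GET", "user_id": "u1"}, "value": 1},
    {"attributes": {"method": "GET", "user_id": "u2"}, "value": 2},
    {"attributes": {"method": "POST", "user_id": "u1"}, "value": 4},
]
series = aggregate_sum(points, drop_attributes=["user_id", "request_id"])
```

Three input series collapse into two output series (one per `method`), because `user_id` no longer distinguishes them.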
deduplicate — Deduplication
Drops duplicate records within a rolling time window using a content fingerprint. The fingerprint covers the body (for logs), metric name + labels (for metrics), or span name + trace ID (for traces).
processors:
  dedup:
    type: deduplicate
    sources: [syslog_in]
    window: 30s
    max_entries: 100000
| Option | Default | Description |
|---|---|---|
| window | 30s | How far back to remember fingerprints |
| max_entries | 100000 | Max fingerprints to retain (LRU eviction) |
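The interaction between `window` and `max_entries` can be sketched for log bodies in Python. This is an illustration under assumptions, not the agent's implementation: `Deduplicator` is a hypothetical class, the fingerprint is assumed to be a SHA-256 of the body, and the window is assumed to be measured from the last time a fingerprint was recorded as new.

```python
import hashlib
import time
from collections import OrderedDict
from typing import Callable


class Deduplicator:
    """Remembers body fingerprints for a time window with LRU eviction (illustrative)."""

    def __init__(self, window_s: float = 30.0, max_entries: int = 100000,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.window_s = window_s
        self.max_entries = max_entries
        self.clock = clock
        self.seen: "OrderedDict[str, float]" = OrderedDict()  # fingerprint -> recorded at

    def is_duplicate(self, body: str) -> bool:
        now = self.clock()
        fingerprint = hashlib.sha256(body.encode("utf-8")).hexdigest()
        recorded_at = self.seen.get(fingerprint)
        if recorded_at is not None and now - recorded_at <= self.window_s:
            self.seen.move_to_end(fingerprint)   # mark as recently used
            return True
        self.seen[fingerprint] = now             # new, or the window has expired
        self.seen.move_to_end(fingerprint)
        if len(self.seen) > self.max_entries:
            self.seen.popitem(last=False)        # evict the least recently used entry
        return False


now = [0.0]
dedup = Deduplicator(window_s=30.0, max_entries=2, clock=lambda: now[0])
first = dedup.is_duplicate("disk full")    # first sighting
now[0] = 10.0
second = dedup.is_duplicate("disk full")   # repeated inside the window
now[0] = 45.0
third = dedup.is_duplicate("disk full")    # window expired, treated as new
```

A small `max_entries` means fingerprints can be evicted before the window expires, in which case a repeat is forwarded rather than dropped.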
rate_limit — Rate Limiting
Applies a token-bucket rate limit per label set. Records that exceed the limit are dropped.
processors:
  limit_logs:
    type: rate_limit
    sources: [syslog_in]
    rate: 1000                   # records per second
    burst: 5000                  # burst capacity
    group_by: [service.name]     # separate bucket per service
| Option | Default | Description |
|---|---|---|
| rate | required | Sustained records per second |
| burst | rate * 5 | Maximum burst capacity |
| group_by | [] | Attributes to partition buckets by |
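A token bucket partitioned by `group_by` can be sketched in Python as below. The sketch assumes that each bucket starts full and that one record costs one token. `RateLimiter` and its `allow` method are hypothetical names, and the clock is injectable only to make the example deterministic.

```python
import time
from typing import Any, Callable, Dict, Mapping, Optional, Sequence, Tuple


class RateLimiter:
    """Token bucket per group_by key (illustrative)."""

    def __init__(self, rate: float, burst: Optional[float] = None,
                 group_by: Sequence[str] = (),
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.burst = burst if burst is not None else rate * 5   # documented default
        self.group_by = tuple(group_by)
        self.clock = clock
        # bucket key -> (tokens available, time of last refill)
        self.buckets: Dict[Tuple[Any, ...], Tuple[float, float]] = {}

    def allow(self, attributes: Mapping[str, Any]) -> bool:
        now = self.clock()
        key = tuple(attributes.get(name) for name in self.group_by)
        tokens, last = self.buckets.get(key, (float(self.burst), now))
        tokens = min(self.burst, tokens + (now - last) * self.rate)  # refill
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self.buckets[key] = (tokens, now)
        return allowed


now = [0.0]
limiter = RateLimiter(rate=1, burst=2, group_by=["service.name"], clock=lambda: now[0])
api = {"service.name": "api"}
db = {"service.name": "db"}

api_results = [limiter.allow(api) for _ in range(3)]   # burst of 2, then dropped
db_result = limiter.allow(db)                          # separate bucket for db
now[0] = 1.0
api_after_refill = limiter.allow(api)                  # one token refilled after 1s
```

Grouping by `service.name` keeps a noisy service from exhausting the budget of a quiet one, since each service draws from its own bucket.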
geoip — Geo-IP Enrichment
Looks up IP addresses in a MaxMind GeoLite2 or GeoIP2 database and injects location attributes.
processors:
  geo:
    type: geoip
    sources: [otlp_in]
    database: "/etc/infrasagent/GeoLite2-City.mmdb"
    ip_attribute: "client.address"   # attribute containing the IP
    target_prefix: "client.geo"      # prefix for injected attributes
With the configuration above, the injected attributes are client.geo.country_iso_code, client.geo.city_name, client.geo.latitude, and client.geo.longitude.
| Option | Default | Description |
|---|---|---|
| database | required | Path to MaxMind .mmdb file |
| ip_attribute | client.address | Attribute name holding the IP |
| target_prefix | geo | Prefix for injected attributes |