Processors

Processors sit between sources and sinks. They transform, enrich, filter, batch, and sample telemetry. Multiple processors can be chained: each one lists its upstream sources, or other processors, in its sources option.


batch — Batching

Buffers records and flushes downstream on a time or size threshold. Always place a batch processor before sinks to reduce the number of export requests.

processors:
  batch_main:
    type: batch
    timeout: 5s
    max_size: 10000
    sources: [otlp_in, host_metrics]

| Option | Default | Description |
| --- | --- | --- |
| timeout | 5s | Flush if this much time has passed since the last flush |
| max_size | 10000 | Flush when the batch reaches this many records |
| sources | required | Upstream source or processor names |
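
The interaction between the two thresholds can be sketched in a few lines of Python. This is an illustrative model only, not the agent's implementation: the `Batcher` class, its `add`/`tick` methods, and the choice to skip flushing an empty buffer are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Batcher:
    """Hypothetical model of a batch processor with time and size thresholds."""
    timeout: float = 5.0       # seconds, mirrors `timeout: 5s`
    max_size: int = 10000      # mirrors `max_size`
    buffer: list = field(default_factory=list)
    last_flush: float = 0.0

    def add(self, record, now: float):
        """Buffer a record; return the batch if the size threshold is reached."""
        self.buffer.append(record)
        if len(self.buffer) >= self.max_size:
            return self._flush(now)
        return None

    def tick(self, now: float):
        """Periodic check; flush if `timeout` has elapsed since the last flush."""
        if self.buffer and now - self.last_flush >= self.timeout:
            return self._flush(now)
        return None

    def _flush(self, now: float):
        batch, self.buffer = self.buffer, []
        self.last_flush = now
        return batch
```

Whichever threshold is hit first triggers the flush, so a quiet source still exports at least every `timeout` seconds while a busy one exports in chunks of `max_size`.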

attributes — Attribute Mutations

Add, remove, rename, update, or hash attributes on log records, metric data points, and trace spans.

processors:
  add_env:
    type: attributes
    sources: [otlp_in]
    actions:
      - action: insert
        key: deployment.environment
        value: "production"

      - action: upsert
        key: host.name
        value: "${HOSTNAME}"

      - action: delete
        key: http.request.header.authorization

      - action: hash
        key: user.email

      - action: rename
        key: old_key
        new_key: new_key

| Action | Description |
| --- | --- |
| insert | Add the key only if it does not exist |
| upsert | Add or overwrite the key |
| delete | Remove the key |
| hash | Replace the value with its SHA-256 hex digest |
| rename | Rename a key while preserving its value |
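
To make the difference between the actions concrete, here is a minimal Python sketch of how they could behave on a plain attribute dictionary. The function name `apply_actions` is hypothetical, and two details are assumptions rather than documented behavior: values are UTF-8 encoded before hashing, and `hash` and `rename` silently skip keys that are absent.

```python
import hashlib


def apply_actions(attrs: dict, actions: list[dict]) -> dict:
    """Apply attribute actions in order and return a new dict."""
    out = dict(attrs)
    for a in actions:
        kind, key = a["action"], a["key"]
        if kind == "insert":
            out.setdefault(key, a["value"])      # keep an existing value
        elif kind == "upsert":
            out[key] = a["value"]                # add or overwrite
        elif kind == "delete":
            out.pop(key, None)
        elif kind == "hash":
            if key in out:                       # assumption: skip missing keys
                raw = str(out[key]).encode("utf-8")
                out[key] = hashlib.sha256(raw).hexdigest()
        elif kind == "rename":
            if key in out:                       # assumption: skip missing keys
                out[a["new_key"]] = out.pop(key)
        else:
            raise ValueError(f"unknown action: {kind}")
    return out
```

Note that `insert` leaves a record already tagged `deployment.environment: staging` untouched, whereas `upsert` would overwrite it.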

filter — Record Filtering

Keeps records that match the configured conditions (severity, attributes, body, or metric name). Records that do not match are dropped.

processors:
  drop_debug:
    type: filter
    sources: [add_env]
    logs:
      severity_number:
        min: 9   # keep INFO (9) and above; drop TRACE (1-4) and DEBUG (5-8)

  production_only:
    type: filter
    sources: [otlp_in]
    logs:
      attributes:
        deployment.environment: "production"

Log filter options:

| Option | Description |
| --- | --- |
| severity_number.min | Minimum OTLP severity number (1=TRACE, 5=DEBUG, 9=INFO, 13=WARN, 17=ERROR) |
| severity_number.max | Maximum OTLP severity number |
| attributes | Key-value map; all conditions must match |
| body_regex | Keep records whose body matches this regex |
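
The log options can be read as a single keep/drop predicate. The Python sketch below is one plausible reading, not the agent's code: the name `keep_log` is hypothetical, and it assumes that all configured options must pass, that `body_regex` matches anywhere in the body, and that a record with no severity is treated as 0.

```python
import re


def keep_log(record: dict, cfg: dict) -> bool:
    """Return True if the log record passes every configured condition."""
    sev = cfg.get("severity_number", {})
    number = record.get("severity_number", 0)   # assumption: unset means 0
    if "min" in sev and number < sev["min"]:
        return False
    if "max" in sev and number > sev["max"]:
        return False

    # Every listed attribute must be present with exactly the expected value.
    attrs = record.get("attributes", {})
    for key, expected in cfg.get("attributes", {}).items():
        if attrs.get(key) != expected:
            return False

    # Assumption: the regex may match anywhere in the body (search, not fullmatch).
    pattern = cfg.get("body_regex")
    if pattern is not None and not re.search(pattern, record.get("body", "")):
        return False
    return True
```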

Metric filter options:

| Option | Description |
| --- | --- |
| metric_names | List of metric name regexes to keep |
| attributes | Key-value attribute conditions |

transform — CEL Expressions

Mutate fields using Common Expression Language (CEL) expressions. Supports an optional where clause to apply mutations selectively.

processors:
  normalize:
    type: transform
    sources: [otlp_in]
    log_statements:
      - context: log
        where: 'severity_number < 9'
        statements:
          - 'set(severity_text, "DEBUG")'

      - context: log
        statements:
          - 'set(attributes["http.url"], redact_url(attributes["http.url"]))'

    metric_statements:
      - context: datapoint
        where: 'metric.name == "http.server.duration"'
        statements:
          - 'set(value_double, value_double / 1000)'   # ms → seconds

Statements are evaluated in order. The where clause skips the statement block when false.


sampling — Trace Sampling

Reduces trace volume before export. Supports head-based (per-span decision) and tail-based (decision after the full trace is complete) policies.

processors:
  trace_sample:
    type: sampling
    sources: [otlp_in]

    # Tail-based: buffer complete traces before deciding
    decision_wait: 10s     # how long to wait for all spans
    max_traces: 100000     # max traces in buffer before eviction

    policies:
      - name: always_errors
        type: error                  # keep any trace with an error span

      - name: slow_requests
        type: latency
        threshold_ms: 500            # keep traces slower than 500ms

      - name: baseline
        type: probabilistic
        sampling_percentage: 5       # keep 5% of everything else

| Policy type | Description |
| --- | --- |
| always_sample | Keep all traces (useful for testing) |
| error | Keep traces where any span has status = Error |
| latency | Keep traces with root-span duration > threshold_ms |
| probabilistic | Keep sampling_percentage% based on trace ID hash |

Policies are evaluated in order. The first matching policy wins.
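
First-match evaluation can be sketched as a loop over the policy list. The Python below is a simplified model under stated assumptions: the trace is a plain dict with hypothetical fields `trace_id`, `root_duration_ms`, and `spans`; SHA-256 stands in for whatever hash the agent actually uses; and a trace that no policy keeps is dropped.

```python
import hashlib


def sampled_by_hash(trace_id: str, percentage: float) -> bool:
    """Deterministic keep/drop derived from the trace ID (hash is an assumption)."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10000   # 0..9999
    return bucket < percentage * 100


def decide(trace: dict, policies: list[dict]):
    """Return the name of the first policy that keeps the trace, else None."""
    for p in policies:
        kind = p["type"]
        if kind == "always_sample":
            matched = True
        elif kind == "error":
            matched = any(s["status"] == "Error" for s in trace["spans"])
        elif kind == "latency":
            matched = trace["root_duration_ms"] > p["threshold_ms"]
        elif kind == "probabilistic":
            matched = sampled_by_hash(trace["trace_id"], p["sampling_percentage"])
        else:
            raise ValueError(f"unknown policy type: {kind}")
        if matched:
            return p["name"]
    return None   # assumption: unmatched traces are dropped
```

Because the probabilistic decision depends only on the trace ID, the same trace gets the same answer every time it is evaluated, which keeps the baseline sample consistent across agents that share the configuration.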


k8s_attributes — Kubernetes Metadata

Injects Kubernetes pod, node, and namespace metadata into logs, metrics, and traces. Reads from the Kubernetes API using the agent's service account.

processors:
  k8s_enrich:
    type: k8s_attributes
    sources: [otlp_in, app_logs]
    extract:
      - k8s.pod.name
      - k8s.namespace.name
      - k8s.node.name
      - k8s.deployment.name
      - k8s.container.name

| Option | Default | Description |
| --- | --- | --- |
| extract | all listed | Which attributes to inject |
| kubeconfig | in-cluster | Path to kubeconfig (leave empty when running in-cluster) |

The processor uses the pod IP of the record's originating process to look up pod metadata from the Kubernetes API. Required RBAC: get/list/watch on pods and namespaces.


aggregate — Metric Pre-Aggregation

Reduces metric cardinality by aggregating high-cardinality label sets before export. Useful for keeping cost down when labels like user_id or request_id are attached.

processors:
  agg:
    type: aggregate
    sources: [otlp_in]
    metrics:
      - name: "http.server.request.count"
        drop_attributes: [user_id, request_id]
        aggregation: sum
        interval: 60s

| Option | Description |
| --- | --- |
| name | Metric name (regex supported) |
| drop_attributes | Attributes to remove before aggregating |
| aggregation | sum, min, max, last |
| interval | Aggregation window |
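
The cardinality reduction comes from merging data points that become identical once the dropped attributes are removed. A rough Python illustration for a single interval and the `sum` aggregation follows; the function name `aggregate_sum` and the `(attributes, value)` tuple shape are assumptions for the example.

```python
from collections import defaultdict


def aggregate_sum(points, drop_attributes):
    """Sum data points that share the same attributes after dropping some.

    `points` is an iterable of (attributes_dict, value) pairs collected
    during one aggregation interval.
    """
    totals = defaultdict(float)
    for attrs, value in points:
        # The remaining attributes, sorted, form the identity of the series.
        key = tuple(sorted(
            (k, v) for k, v in attrs.items() if k not in drop_attributes
        ))
        totals[key] += value
    return dict(totals)
```

Three input series that differ only by `user_id` collapse into as many output series as there are distinct values of the remaining attributes.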

deduplicate — Deduplication

Drops duplicate records within a rolling time window using a content fingerprint. The fingerprint covers the body (for logs), metric name + labels (for metrics), or span name + trace ID (for traces).

processors:
  dedup:
    type: deduplicate
    sources: [syslog_in]
    window: 30s
    max_entries: 100000

| Option | Default | Description |
| --- | --- | --- |
| window | 30s | How far back to remember fingerprints |
| max_entries | 100000 | Max fingerprints to retain (LRU eviction) |
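
How `window` and `max_entries` interact can be modeled with an ordered map of fingerprints. The Python sketch below handles log bodies only and is not the agent's implementation: the `Deduplicator` class is hypothetical, SHA-256 is an assumed fingerprint function, and it assumes each sighting (duplicate or not) refreshes the fingerprint's timestamp.

```python
import hashlib
from collections import OrderedDict


class Deduplicator:
    """Hypothetical rolling-window deduplicator with LRU eviction."""

    def __init__(self, window: float = 30.0, max_entries: int = 100000):
        self.window = window
        self.max_entries = max_entries
        self.seen = OrderedDict()   # fingerprint -> last-seen time, oldest first

    def is_duplicate(self, body: str, now: float) -> bool:
        fp = hashlib.sha256(body.encode("utf-8")).hexdigest()
        last = self.seen.get(fp)
        duplicate = last is not None and now - last < self.window
        # Record this sighting and mark the fingerprint as most recently used.
        self.seen[fp] = now
        self.seen.move_to_end(fp)
        # Evict least recently used fingerprints beyond the capacity.
        while len(self.seen) > self.max_entries:
            self.seen.popitem(last=False)
        return duplicate
```

A small `max_entries` relative to the number of distinct messages per window means fingerprints get evicted early and some duplicates slip through, so size it to the expected message variety rather than the raw record rate.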

rate_limit — Rate Limiting

Applies a token-bucket rate limit per label set. Records that exceed the limit are dropped.

processors:
  limit_logs:
    type: rate_limit
    sources: [syslog_in]
    rate: 1000                   # records per second
    burst: 5000                  # burst capacity
    group_by: [service.name]     # separate bucket per service

| Option | Default | Description |
| --- | --- | --- |
| rate | required | Sustained records per second |
| burst | rate * 5 | Maximum burst capacity |
| group_by | [] | Attributes to partition buckets by |
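
For readers unfamiliar with token buckets, the following Python sketch shows how `rate`, `burst`, and `group_by` relate. It is a generic token-bucket model, not the agent's code; the class names and the explicit `now` argument (used instead of a real clock so the behavior is reproducible) are choices made for the example.

```python
class TokenBucket:
    """Classic token bucket: refills at `rate` per second, capped at `burst`."""

    def __init__(self, rate: float, burst: float, now: float = 0.0):
        self.rate, self.burst = rate, burst
        self.tokens = burst          # start full so an initial burst is allowed
        self.updated = now

    def allow(self, now: float) -> bool:
        elapsed = now - self.updated
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False                 # over the limit: the record is dropped


class RateLimiter:
    """One bucket per distinct combination of the `group_by` attributes."""

    def __init__(self, rate, burst=None, group_by=()):
        self.rate = rate
        self.burst = burst if burst is not None else rate * 5   # documented default
        self.group_by = tuple(group_by)
        self.buckets = {}

    def allow(self, attrs: dict, now: float) -> bool:
        key = tuple(attrs.get(name) for name in self.group_by)
        bucket = self.buckets.get(key)
        if bucket is None:
            bucket = self.buckets[key] = TokenBucket(self.rate, self.burst, now)
        return bucket.allow(now)
```

With `group_by: [service.name]`, a noisy service exhausts only its own bucket and other services keep their full allowance. With an empty `group_by`, every record shares a single bucket.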

geoip — Geo-IP Enrichment

Looks up IP addresses in a MaxMind GeoLite2 or GeoIP2 database and injects location attributes.

processors:
  geo:
    type: geoip
    sources: [otlp_in]
    database: "/etc/infrasagent/GeoLite2-City.mmdb"
    ip_attribute: "client.address"    # attribute containing the IP
    target_prefix: "client.geo"       # prefix for injected attributes

With target_prefix set to client.geo as above, the injected attributes are client.geo.country_iso_code, client.geo.city_name, client.geo.latitude, and client.geo.longitude.

| Option | Default | Description |
| --- | --- | --- |
| database | required | Path to MaxMind .mmdb file |
| ip_attribute | client.address | Attribute name holding the IP |
| target_prefix | geo | Prefix for injected attributes |