Webhooks

InfraSage's webhook integration sends anomaly events to any HTTP endpoint. It supports pattern-based routing, payload transformation with jq, JavaScript, or Python, and retries with exponential backoff.


Configuring a Webhook

Webhooks are configured through the Admin UI or API:

# The JSON body is single-quoted, so the shell sends $MY_API_TOKEN literally;
# substitute the real token value if your endpoint expects one.
curl -X POST http://localhost:8080/api/v1/webhooks \
  -H "Authorization: Bearer $ADMIN_JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "custom-alerting",
    "url": "https://api.mycompany.com/alerts",
    "method": "POST",
    "headers": {
      "Authorization": "Bearer $MY_API_TOKEN",
      "Content-Type": "application/json"
    },
    "pattern": {
      "service_id_regex": "payment-.*",
      "min_anomaly_score": 0.7
    },
    "transform": {
      "type": "jq",
      "expression": "{alert: .service_id, score: .anomaly_score, cause: .root_cause.summary}"
    },
    "retry": {
      "max_attempts": 5,
      "initial_delay_ms": 1000,
      "backoff_multiplier": 2.0
    }
  }'
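
The same registration call can be sketched in Python, which also sidesteps shell quoting because the token is read from the environment. This is a minimal sketch, not an official client: `build_webhook_request` is a hypothetical helper, the base URL and JWT value are placeholders, and the request is only constructed here, not sent.

```python
import json
import os
import urllib.request


def build_webhook_request(base_url, admin_jwt, config):
    """Build (but do not send) the POST request that registers a webhook."""
    return urllib.request.Request(
        url=f"{base_url}/api/v1/webhooks",
        data=json.dumps(config).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {admin_jwt}",
            "Content-Type": "application/json",
        },
    )


# Same configuration as the curl example; the token comes from the environment.
config = {
    "name": "custom-alerting",
    "url": "https://api.mycompany.com/alerts",
    "method": "POST",
    "headers": {
        "Authorization": "Bearer " + os.environ.get("MY_API_TOKEN", ""),
        "Content-Type": "application/json",
    },
    "pattern": {"service_id_regex": "payment-.*", "min_anomaly_score": 0.7},
    "transform": {
        "type": "jq",
        "expression": "{alert: .service_id, score: .anomaly_score, cause: .root_cause.summary}",
    },
    "retry": {"max_attempts": 5, "initial_delay_ms": 1000, "backoff_multiplier": 2.0},
}

req = build_webhook_request("http://localhost:8080", "example-admin-jwt", config)
# To actually register the webhook: urllib.request.urlopen(req)
```
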

Pattern Matching

Control which anomalies trigger a webhook:

| Pattern Field | Type | Description |
| --- | --- | --- |
| service_id_regex | regex | Match service IDs (e.g., payment-.*, .*-api) |
| metric_name_regex | regex | Match metric names |
| min_anomaly_score | float | Only fire for scores above this threshold |
| root_cause_category | string | infrastructure, application, external |
| environments | string[] | production, staging |
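
To make the table concrete, here is a rough sketch of how such a pattern could be evaluated against an event. It rests on several assumptions that the text above does not confirm: all fields present in the pattern must match (logical AND), regexes must match the whole value, and the score comparison is strict. The event keys `metric_name`, `environment`, and `root_cause.category` are hypothetical names chosen for illustration.

```python
import re


def matches(pattern, event):
    """Return True when the event passes every field present in the pattern."""

    def regex_ok(key, value):
        # A missing pattern field places no constraint on the event.
        return key not in pattern or re.fullmatch(pattern[key], value or "") is not None

    if not regex_ok("service_id_regex", event.get("service_id")):
        return False
    if not regex_ok("metric_name_regex", event.get("metric_name")):
        return False
    if "min_anomaly_score" in pattern:
        # Assumption: strictly above the threshold, following the table wording.
        if not event.get("anomaly_score", 0.0) > pattern["min_anomaly_score"]:
            return False
    if "root_cause_category" in pattern:
        if event.get("root_cause", {}).get("category") != pattern["root_cause_category"]:
            return False
    if "environments" in pattern and event.get("environment") not in pattern["environments"]:
        return False
    return True


pattern = {"service_id_regex": "payment-.*", "min_anomaly_score": 0.7}
event = {"service_id": "payment-gateway", "anomaly_score": 0.85}
print(matches(pattern, event))
```
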

Data Transformation

Transform the InfraSage payload before sending to the endpoint.

jq Transform

"transform": {
  "type": "jq",
  "expression": "{title: (\"Anomaly: \" + .service_id), severity: (if .anomaly_score > 0.8 then \"critical\" elif .anomaly_score > 0.6 then \"high\" else \"medium\" end), description: .root_cause.summary, runbook: .root_cause.suggested_actions[0]}"
}

The expression is written on a single line because a JSON string cannot contain literal line breaks.
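
For readers less familiar with jq, the following Python function mirrors what the expression above computes. It is only an illustration of the mapping: the sample event is made up, and only the field names that appear in the jq expression are taken from this page.

```python
def to_alert(event):
    """Python equivalent of the jq expression: derive a severity from the score."""
    score = event["anomaly_score"]
    if score > 0.8:
        severity = "critical"
    elif score > 0.6:
        severity = "high"
    else:
        severity = "medium"

    root_cause = event.get("root_cause") or {}
    actions = root_cause.get("suggested_actions") or []
    return {
        "title": "Anomaly: " + event["service_id"],
        "severity": severity,
        "description": root_cause.get("summary"),
        # jq yields null for a missing first element; None plays that role here.
        "runbook": actions[0] if actions else None,
    }


# Hypothetical sample event, for illustration only.
sample = {
    "service_id": "payment-api",
    "anomaly_score": 0.85,
    "root_cause": {
        "summary": "Connection pool exhausted",
        "suggested_actions": ["Increase pool size", "Restart service"],
    },
}
print(to_alert(sample))
```
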

JavaScript Transform

"transform": {
  "type": "javascript",
  "code": "function transform(event) { return { alert_title: event.service_id + ' anomaly', body: event.root_cause.summary, tags: ['infrasage', event.service_id] }; }"
}

Python Transform

"transform": {
  "type": "python",
  "code": "def transform(event):\n    return {'title': f'Anomaly on {event[\"service_id\"]}', 'score': event['anomaly_score']}"
}
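
Because the Python code travels as an escaped JSON string, it can be worth checking locally that it decodes and runs before registering the webhook. The harness below is a sketch under the assumption that InfraSage calls a function named `transform` with the event as a dict, as the snippet above suggests; `run_transform` is a hypothetical helper and the sample event is invented.

```python
import json

# The transform block exactly as it would appear inside the webhook JSON.
transform_json = r'''{"type": "python", "code": "def transform(event):\n    return {'title': f'Anomaly on {event[\"service_id\"]}', 'score': event['anomaly_score']}"}'''


def run_transform(transform_config, event):
    """Decode the JSON-escaped code, define transform(), and call it on the event."""
    namespace = {}
    exec(transform_config["code"], namespace)  # only run code you wrote yourself
    return namespace["transform"](event)


transform_config = json.loads(transform_json)
sample_event = {"service_id": "payment-gateway", "anomaly_score": 0.85}
print(run_transform(transform_config, sample_event))
```
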

Retry Policy

Webhooks automatically retry on failure with exponential backoff:

"retry": {
  "max_attempts": 5,
  "initial_delay_ms": 1000,
  "backoff_multiplier": 2.0,
  "max_delay_ms": 30000
}

Retry schedule for default settings:

  • Attempt 1: immediate
  • Attempt 2: after 1 second
  • Attempt 3: after 2 seconds
  • Attempt 4: after 4 seconds
  • Attempt 5: after 8 seconds

After all attempts are exhausted, the event is logged in the dead-letter queue (DLQ).
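
The schedule above follows from the retry settings: each wait is the previous one multiplied by `backoff_multiplier`, starting at `initial_delay_ms`. The sketch below reproduces that arithmetic and assumes that `max_delay_ms` caps each individual wait; `retry_delays_ms` is a hypothetical helper, not part of InfraSage.

```python
def retry_delays_ms(max_attempts=5, initial_delay_ms=1000,
                    backoff_multiplier=2.0, max_delay_ms=30000):
    """Return the wait before each attempt, in milliseconds (first attempt waits 0)."""
    if max_attempts < 1:
        return []
    delays = [0]  # attempt 1 is immediate
    delay = float(initial_delay_ms)
    for _ in range(max_attempts - 1):
        delays.append(int(min(delay, max_delay_ms)))  # assumed cap per wait
        delay *= backoff_multiplier
    return delays


print(retry_delays_ms())                # [0, 1000, 2000, 4000, 8000]
print(retry_delays_ms(max_attempts=8))  # later waits are capped at 30000
```

With the settings shown, the cap never applies; it only takes effect from the seventh attempt onward, where the uncapped wait would be 32 seconds.
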


Incoming Webhooks (From Alertmanager)

InfraSage also accepts incoming webhooks from Prometheus Alertmanager:

# alertmanager.yml
receivers:
  - name: infrasage
    webhook_configs:
      - url: http://infrasage-aiops:9093/api/v1/alerts/webhook

route:
  receiver: infrasage

Incoming alerts trigger root cause analysis (RCA) and are processed exactly like internally detected anomalies.
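
To exercise this endpoint without a running Alertmanager, you can post a hand-built payload in Alertmanager's webhook format (version 4). The sketch below only builds the payload and the request. The label and annotation values are invented, `build_alertmanager_payload` and `build_request` are hypothetical helpers, and how InfraSage maps labels to services is not described on this page.

```python
import json
import urllib.request
from datetime import datetime, timezone

INFRASAGE_URL = "http://infrasage-aiops:9093/api/v1/alerts/webhook"


def build_alertmanager_payload(alertname, service, summary):
    """Build a minimal firing-alert payload in Alertmanager webhook format v4."""
    labels = {"alertname": alertname, "service": service}
    annotations = {"summary": summary}
    return {
        "version": "4",
        "groupKey": f'{{}}:{{alertname="{alertname}"}}',
        "status": "firing",
        "receiver": "infrasage",
        "groupLabels": {"alertname": alertname},
        "commonLabels": labels,
        "commonAnnotations": annotations,
        "externalURL": "http://alertmanager:9093",  # placeholder
        "alerts": [
            {
                "status": "firing",
                "labels": labels,
                "annotations": annotations,
                "startsAt": datetime.now(timezone.utc).isoformat(),
                "endsAt": "0001-01-01T00:00:00Z",  # zero time: not yet resolved
                "generatorURL": "",
            }
        ],
    }


def build_request(payload, url=INFRASAGE_URL):
    """Wrap the payload in a POST request; send it with urllib.request.urlopen."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


payload = build_alertmanager_payload("HighLatency", "payment-api", "p99 latency above 2s")
request = build_request(payload)
```
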


Testing a Webhook

# Send a test event to your webhook
curl -X POST http://localhost:8080/api/v1/webhooks/test \
  -H "Authorization: Bearer $ADMIN_JWT" \
  -H "Content-Type: application/json" \
  -d '{"webhook_name": "custom-alerting"}'
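
When the real endpoint is not ready yet, a throwaway local receiver makes it easy to see what the test event looks like after transformation. The following is a minimal sketch using only the Python standard library; `Receiver` and `start_receiver` are illustrative names, and you would point the webhook's `url` at the address it listens on.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # decoded JSON bodies, in arrival order


class Receiver(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and decode the JSON body sent by the webhook.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        received.append(json.loads(body or b"{}"))

        # Reply 200 so the delivery counts as successful.
        reply = b'{"ok": true}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, fmt, *args):
        pass  # silence the default per-request logging


def start_receiver(port=0):
    """Start the receiver in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), Receiver)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Usage:
#   server = start_receiver(9000)
#   ... trigger the test event, then inspect `received` ...
#   server.shutdown()
```
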

Webhook Delivery Logs

curl http://localhost:8080/api/v1/webhooks/custom-alerting/deliveries \
-H "Authorization: Bearer $ADMIN_JWT"

The response lists each delivery attempt with its HTTP status code, response time, and error message (if any).
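
Once the delivery log is fetched, a few lines of Python can condense it into a quick health summary. The response schema is not shown on this page, so the field names used here (`status_code`, `response_time_ms`, `error`) are assumptions made for illustration; adjust them to the actual response.

```python
def summarize_deliveries(deliveries):
    """Summarize delivery attempts (field names are assumed, not confirmed)."""
    failures = [
        d for d in deliveries
        if d.get("error") or not 200 <= d.get("status_code", 0) < 300
    ]
    times = [d["response_time_ms"] for d in deliveries if "response_time_ms" in d]
    return {
        "attempts": len(deliveries),
        "failures": len(failures),
        "avg_response_time_ms": sum(times) / len(times) if times else None,
        "last_error": failures[-1].get("error") if failures else None,
    }


# Hypothetical sample data shaped like the description above.
sample_deliveries = [
    {"attempt": 1, "status_code": 503, "response_time_ms": 120, "error": "Service Unavailable"},
    {"attempt": 2, "status_code": 200, "response_time_ms": 80, "error": None},
]
print(summarize_deliveries(sample_deliveries))
```
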