PagerDuty
InfraSage integrates bidirectionally with PagerDuty: it creates incidents when anomalies are detected, routes them to the correct on-call team, and syncs resolution status back.
Configuration
PAGERDUTY_API_TOKEN=your-pagerduty-api-token
PAGERDUTY_SERVICE_KEY=your-integration-key # Events API v2 key
Get your API token from PagerDuty → Integrations → API Access Keys.
Incident Lifecycle
1. InfraSage creates a PagerDuty incident
When an anomaly triggers root cause analysis (RCA) and its severity is high or critical, InfraSage sends a trigger event to the PagerDuty Events API v2:
{
  "routing_key": "$PAGERDUTY_SERVICE_KEY",
  "event_action": "trigger",
  "dedup_key": "infrasage-anom-7f3d",
  "payload": {
    "summary": "CPU anomaly on payment-api: 87% CPU (score: 0.93)",
    "severity": "critical",
    "source": "infrasage",
    "component": "payment-api",
    "group": "production",
    "custom_details": {
      "anomaly_id": "anom-7f3d",
      "root_cause": "CPU saturation from DB connection pool exhaustion",
      "blast_radius": ["user-service", "checkout-service"],
      "suggested_actions": ["Scale to 5 pods", "Restart pods"],
      "rca_confidence": 0.92,
      "infrasage_url": "https://infrasage.mycompany.com/incidents/anom-7f3d"
    }
  }
}
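As a rough illustration of how such a trigger event could be assembled and posted, here is a minimal Python sketch. It is not InfraSage's actual implementation: the `anomaly` dict shape, `build_trigger_event`, and `send_event` are hypothetical names, and the `dedup_key` is assumed to follow the `infrasage-<anomaly_id>` pattern used by the resolve event in step 3. The endpoint is PagerDuty's Events API v2 enqueue URL.

```python
import json
import urllib.request

# PagerDuty Events API v2 endpoint for trigger/acknowledge/resolve events.
EVENTS_API_URL = "https://events.pagerduty.com/v2/enqueue"


def build_trigger_event(routing_key, anomaly):
    """Build a trigger event from a hypothetical anomaly dict."""
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        # Assumption: a stable key derived from the anomaly ID, so that a
        # later resolve event can target the same alert.
        "dedup_key": f"infrasage-{anomaly['id']}",
        "payload": {
            "summary": anomaly["summary"],
            "severity": anomaly["severity"],
            "source": "infrasage",
            "component": anomaly["service"],
            "group": anomaly["environment"],
            "custom_details": anomaly.get("details", {}),
        },
    }


def send_event(event, timeout=10):
    """POST the event to PagerDuty and return the parsed JSON response."""
    request = urllib.request.Request(
        EVENTS_API_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping event construction separate from the network call makes the payload easy to inspect or unit-test without contacting PagerDuty.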
2. PagerDuty routes to on-call
PagerDuty uses your configured escalation policies to notify the on-call engineer. No additional configuration in InfraSage is required.
3. Resolution sync
When a runbook resolves an incident, InfraSage sends a PagerDuty resolve event:
{
  "routing_key": "$PAGERDUTY_SERVICE_KEY",
  "event_action": "resolve",
  "dedup_key": "infrasage-anom-7f3d"
}
When an incident is resolved in PagerDuty, InfraSage receives the webhook and marks the anomaly as resolved in ClickHouse.
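The resolve event only closes the right alert if its `dedup_key` matches the key associated with the original trigger. A minimal Python sketch of that pairing follows. It assumes the `infrasage-<anomaly_id>` naming seen in the example above, and the helper names are hypothetical.

```python
def dedup_key_for(anomaly_id):
    """Derive the PagerDuty dedup_key from an InfraSage anomaly ID.

    Assumption: keys follow the "infrasage-<anomaly_id>" pattern.
    """
    return f"infrasage-{anomaly_id}"


def build_resolve_event(routing_key, anomaly_id):
    """Build an Events API v2 resolve event for a previously triggered alert."""
    return {
        "routing_key": routing_key,
        "event_action": "resolve",
        "dedup_key": dedup_key_for(anomaly_id),
    }
```

Deriving the key in one place for both trigger and resolve events avoids the two drifting apart.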
On-Call Routing
InfraSage uses PagerDuty's service mapping to route different alert types to different teams:
# Map InfraSage service IDs to PagerDuty service keys
PAGERDUTY_ROUTING_RULES='[
  {"service_pattern": "payment-*", "service_key": "payments-team-key"},
  {"service_pattern": "infra-*", "service_key": "platform-team-key"},
  {"service_pattern": "*", "service_key": "default-key"}
]'
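To make the matching behavior concrete, here is a small Python sketch of how rules like these might be evaluated. It assumes shell-style glob patterns and first-match-wins ordering, which is why the catch-all `*` rule sits last. Neither assumption is confirmed by InfraSage, and `resolve_service_key` is a hypothetical helper.

```python
import json
from fnmatch import fnmatchcase

ROUTING_RULES_JSON = """[
  {"service_pattern": "payment-*", "service_key": "payments-team-key"},
  {"service_pattern": "infra-*", "service_key": "platform-team-key"},
  {"service_pattern": "*", "service_key": "default-key"}
]"""


def resolve_service_key(service_name, rules):
    """Return the service_key of the first rule whose pattern matches.

    Assumption: rules are checked top to bottom and the first match wins.
    Returns None when no rule matches.
    """
    for rule in rules:
        if fnmatchcase(service_name, rule["service_pattern"]):
            return rule["service_key"]
    return None


rules = json.loads(ROUTING_RULES_JSON)
payments_key = resolve_service_key("payment-api", rules)
fallback_key = resolve_service_key("user-service", rules)
```

Under these assumptions `payment-api` resolves to the payments team key, while `user-service` falls through to the default.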
Severity Mapping
| InfraSage Anomaly Score | PagerDuty Severity |
|---|---|
| 0.4–0.6 | warning |
| 0.6–0.8 | error |
| 0.8–1.0 | critical |
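The table can be read as a simple threshold function. The Python sketch below is one interpretation, not InfraSage's confirmed logic: it assumes a score on a shared boundary (0.6 or 0.8) takes the higher severity, and that scores below 0.4 produce no PagerDuty severity at all.

```python
def severity_for_score(score):
    """Map an anomaly score in [0, 1] to a PagerDuty severity.

    Assumptions: lower bounds are inclusive, so 0.6 maps to "error" and
    0.8 maps to "critical"; scores below 0.4 return None.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score >= 0.8:
        return "critical"
    if score >= 0.6:
        return "error"
    if score >= 0.4:
        return "warning"
    return None
```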
Bidirectional Webhook
Configure PagerDuty to send webhooks back to InfraSage:
- In PagerDuty: Services → Service → Extensions → Generic Webhook (v3)
- Webhook URL: https://your-infrasage-host:9093/api/v1/pagerduty/webhook
- Enable events: incident.acknowledged, incident.resolved, incident.assigned
InfraSage uses these webhooks to:
- Update the incident status in ClickHouse
- Record the acknowledging engineer in the audit log
- Trigger incident memory storage when resolved with a note
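The handling described above could be sketched as a small dispatch function. This Python example is illustrative only: it assumes the v3 webhook body wraps the event as `{"event": {"event_type": ..., "data": {"id": ...}}}`, the action labels are hypothetical, and `resolution_note` stands in for however InfraSage obtains the note attached at resolution.

```python
import json

STATUS_EVENTS = {"incident.acknowledged", "incident.resolved", "incident.assigned"}


def plan_actions(raw_body, resolution_note=None):
    """Decide which follow-up actions a webhook delivery should trigger.

    Returns (event_type, incident_id, actions). The action names are
    hypothetical labels, not real InfraSage function names.
    """
    event = json.loads(raw_body).get("event", {})
    event_type = event.get("event_type")
    incident_id = event.get("data", {}).get("id")

    actions = []
    if event_type in STATUS_EVENTS:
        actions.append("update_incident_status")
    if event_type == "incident.acknowledged":
        actions.append("record_acknowledger_in_audit_log")
    if event_type == "incident.resolved" and resolution_note:
        actions.append("store_incident_memory")
    return event_type, incident_id, actions
```

Unrecognized event types yield an empty action list, so the endpoint can acknowledge them without side effects.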
Verification
# Check PagerDuty incidents created by InfraSage
# -g (--globoff) keeps curl from treating the [] in the query string as a glob pattern
curl -g -H "Authorization: Token token=$PAGERDUTY_API_TOKEN" \
"https://api.pagerduty.com/incidents?service_ids[]=YOUR_SERVICE_ID&statuses[]=triggered"