Migration Guide
This page covers two migration paths, plus the reverse of the second:
- Datadog (or New Relic / Dynatrace) → InfraSage — moving from a SaaS observability tool
- InfraSage self-hosted → InfraSage Cloud — moving between deployment models
Datadog → InfraSage
Overview
The migration has two phases. Phase 1 runs InfraSage in parallel (no disruption). Phase 2 cuts over and decommissions Datadog agents.
Phase 1 (parallel)           Phase 2 (cutover)
──────────────────           ─────────────────
Services → Datadog agent     Services → InfraSage only
Services → InfraSage too     Datadog agents removed
(double-reporting)           Datadog subscription cancelled
Expect Phase 1 to last 1–2 weeks while you validate alert parity and tune thresholds.
Step 1: Map Your Datadog Monitors to InfraSage Services
List your existing Datadog monitors and group them by service:
# Export Datadog monitors via API
curl -X GET "https://api.datadoghq.com/api/v1/monitor" \
-H "DD-API-KEY: $DD_API_KEY" \
-H "DD-APPLICATION-KEY: $DD_APP_KEY" | jq '.[].name'
For each monitor, identify:
- Which service_id it corresponds to in InfraSage
- Which metric name to use (see metric naming)
- The threshold / sensitivity equivalent
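The grouping can be scripted once the monitor export is saved as JSON. The sketch below assumes each monitor object carries a `tags` list containing a `service:<name>` tag; monitors without one are collected under `unmapped` for manual review. The function name, the `unmapped` bucket, and the sample monitors are illustrative, not part of InfraSage or Datadog.

```python
import json
from collections import defaultdict


def group_monitors_by_service(monitors):
    """Group Datadog monitor dicts by their `service:<name>` tag.

    Monitors with no service tag are collected under "unmapped".
    """
    groups = defaultdict(list)
    for monitor in monitors:
        services = [
            tag.split(":", 1)[1]
            for tag in monitor.get("tags") or []
            if tag.startswith("service:")
        ]
        for service in services or ["unmapped"]:
            groups[service].append(monitor.get("name", "<unnamed>"))
    return dict(groups)


# Illustrative sample; in practice, pass the parsed API response instead.
sample = [
    {"name": "Checkout error rate", "tags": ["service:checkout", "env:prod"]},
    {"name": "Host disk usage", "tags": ["env:prod"]},
]
print(json.dumps(group_monitors_by_service(sample), indent=2))
```

Each resulting group is a candidate service_id; the `unmapped` bucket shows which monitors still need a manual decision.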
Step 2: Deploy InfraSage
Follow Installation to get InfraSage running. Create one tenant per team or environment.
Step 3: Send Metrics to Both (Parallel Phase)
Option A: Use the Datadog Agent as a forwarder
Configure the Datadog Agent to also forward metrics to InfraSage via the Prometheus remote-write endpoint:
# datadog.yaml
additional_endpoints:
  "http://infrasage:8080/api/v1/prometheus/remote_write":
    # The Agent expects a list of API key strings per endpoint URL
    - "isage_your_key"
Option B: Instrument directly alongside Datadog
Add InfraSage batch calls alongside existing Datadog metric calls. Since InfraSage uses HTTP, it adds minimal overhead.
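As a rough sketch of what "alongside" could look like, the helper below builds the same JSON shape as the `/api/v1/telemetry` curl example in Step 4 and posts it with the Python standard library. `build_payload` and `emit_metric` are illustrative names, and the single-event endpoint is used here only because this page does not show the batch request format.

```python
import json
import os
import time
import urllib.request

GATEWAY_URL = os.environ.get("INFRASAGE_GATEWAY_URL", "http://infrasage:8080")
API_KEY = os.environ.get("INFRASAGE_API_KEY", "")


def build_payload(service_id, metric_name, value, timestamp_ms=None):
    """Return a metric event in the shape used by the Step 4 curl example."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    return {
        "type": "metric",
        "service_id": service_id,
        "metric_name": metric_name,
        "value": value,
        "timestamp": timestamp_ms,
    }


def emit_metric(service_id, metric_name, value, timeout=2.0):
    """POST one metric to InfraSage; return False instead of raising."""
    body = json.dumps(build_payload(service_id, metric_name, value)).encode()
    request = urllib.request.Request(
        f"{GATEWAY_URL}/api/v1/telemetry",
        data=body,
        headers={"X-API-Key": API_KEY, "Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError and socket timeouts are OSError subclasses:
        # treat them as a dropped sample rather than an application error.
        return False
```

Calling `emit_metric("my-service", "error_rate", 0.02)` right after the existing Datadog metric call keeps both backends fed during Phase 1. Swallowing network errors is a deliberate choice here so that an InfraSage outage cannot break the instrumented service.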
Option C: OpenTelemetry Collector (recommended for new services)
Route OTEL Collector output to both backends during the migration window:
# otel-collector-config.yaml
exporters:
  datadog:
    api:
      key: ${DD_API_KEY}
  otlp/infrasage:
    endpoint: http://infrasage:8080/api/v1/otlp
    headers:
      X-API-Key: ${INFRASAGE_API_KEY}
service:
  pipelines:
    metrics:
      exporters: [datadog, otlp/infrasage]
    traces:
      exporters: [datadog, otlp/infrasage]
Step 4: Recreate Alerts in InfraSage
InfraSage anomaly detection is automatic (no manual threshold configuration required), but you'll want to validate it catches what your Datadog monitors were catching.
For each critical Datadog monitor, trigger a synthetic event and confirm InfraSage detects it:
# Send a spike that would have triggered your Datadog monitor
curl -X POST http://infrasage:8080/api/v1/telemetry \
  -H "X-API-Key: $INFRASAGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"type":"metric","service_id":"my-service","metric_name":"error_rate","value":0.15,"timestamp":'$(date +%s000)'}'
# Confirm anomaly appeared
curl "http://infrasage:8080/api/v1/anomalies?service_id=my-service" \
-H "X-API-Key: $INFRASAGE_API_KEY"
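To track parity across many monitors, the comparison can be reduced to a set difference. The sketch below is hedged: it assumes the anomalies response parses to a list of objects with `service_id` and `metric_name` keys, which this page does not confirm, so adjust the key names to the actual response. The function name and sample values are illustrative.

```python
def missing_detections(expected, anomalies):
    """Return expected (service_id, metric_name) pairs with no matching anomaly.

    ASSUMPTION: each anomaly is a dict with "service_id" and "metric_name"
    keys. Adjust the key names to match the real /api/v1/anomalies response.
    """
    detected = {(a.get("service_id"), a.get("metric_name")) for a in anomalies}
    return [pair for pair in expected if pair not in detected]


# Pairs derived from your critical Datadog monitors (illustrative values).
expected = [("my-service", "error_rate"), ("my-service", "latency_p99")]
# Stand-in for the parsed anomalies response.
anomalies = [{"service_id": "my-service", "metric_name": "error_rate"}]

print(missing_detections(expected, anomalies))
# → [('my-service', 'latency_p99')]
```

An empty result means every synthetic event you sent was detected; anything left in the list is a monitor whose coverage still needs investigation.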
Step 5: Migrate Dashboards
Datadog dashboards map to Grafana dashboards backed by InfraSage's Prometheus metrics endpoint. InfraSage ships pre-built Grafana dashboards — import them first, then add service-specific panels.
InfraSage exposes Prometheus metrics at:
http://infrasage:8080/metrics # gateway
http://infrasage-aiops:8081/metrics # aiops engine
Add these as scrape targets in your Prometheus server, point a Grafana Prometheus data source at that server, then use standard PromQL for custom panels.
Step 6: Cut Over and Decommission
Once you've validated alert parity for at least 1 week:
- Remove Datadog agent DaemonSets from your clusters
- Update OTEL Collector config to remove the Datadog exporter
- Cancel Datadog subscription (export your historical data first if needed)
Datadog Feature Mapping
| Datadog Feature | InfraSage Equivalent |
|---|---|
| Monitors | Anomaly detection (automatic) + configurable thresholds |
| Watchdog | AIops Engine Watchdog |
| APM | Traces via OTLP |
| Log Management | Log ingestion via batch API or OTLP |
| Synthetics | Not included — use an external synthetic monitor |
| NPM | Not included — use Cilium Hubble or similar |
| RUM | Not included |
| Incident Management | Jira integration + runbook automation |
| Notebooks | Grafana notebooks |
| SLOs | SLO telemetry type |
InfraSage Self-Hosted → InfraSage Cloud
When to Consider This
- You want to reduce operational overhead (no ClickHouse/Kafka to manage)
- Your data residency requirements are satisfied by InfraSage Cloud's EU/US region options
- You're growing faster than you want to scale and manage infrastructure yourself
Step 1: Export Your ClickHouse Data
Export historical telemetry from your self-hosted ClickHouse to parquet files:
-- Export telemetry table
SELECT * FROM telemetry
WHERE timestamp >= now() - INTERVAL 90 DAY
INTO OUTFILE '/tmp/telemetry-export.parquet'
FORMAT Parquet;
Contact InfraSage support to arrange a bulk import into your Cloud workspace.
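A single 90-day export can produce a very large file. One possible approach, sketched here, is to generate one statement per day so each Parquet file stays small and a failed day can be retried on its own. The table and column names come from the query above; the per-day file naming and the function name are assumptions, and day boundaries follow the ClickHouse server's time zone.

```python
from datetime import date, timedelta


def daily_export_queries(end_day, days=90, out_dir="/tmp"):
    """Yield one ClickHouse export statement per day, oldest first.

    Covers the `days` full days before `end_day` (end_day itself is excluded).
    Each statement selects the half-open range [day, day + 1).
    """
    for offset in range(days, 0, -1):
        day = end_day - timedelta(days=offset)
        next_day = day + timedelta(days=1)
        yield (
            f"SELECT * FROM telemetry "
            f"WHERE timestamp >= toDateTime('{day} 00:00:00') "
            f"AND timestamp < toDateTime('{next_day} 00:00:00') "
            f"INTO OUTFILE '{out_dir}/telemetry-{day}.parquet' "
            f"FORMAT Parquet;"
        )


# Print the statements for the last 3 full days as a quick preview.
for query in daily_export_queries(date.today(), days=3):
    print(query)
```

Each printed statement can be run through clickhouse-client in turn. Half-open ranges avoid exporting the same row into two adjacent files.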
Step 2: Provision Cloud Workspace
- Sign up at console.infrasage.dev
- Create a tenant with the same tenant_id as your self-hosted instance
- Generate new API keys (same scopes as your existing keys)
Step 3: Update Your Ingestion Endpoints
Change INFRASAGE_GATEWAY_URL in all instrumented services from your self-hosted endpoint to:
https://ingest.infrasage.dev
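If your services read the gateway URL from the environment, the parallel run described in Step 4 could be handled by resolving a list of targets instead of a single URL. This is only a sketch: `INFRASAGE_GATEWAY_URL` is the variable named above, while `INFRASAGE_LEGACY_GATEWAY_URL` and the function name are hypothetical and would need to be wired into your own instrumentation.

```python
import os

CLOUD_INGEST_URL = "https://ingest.infrasage.dev"


def ingestion_targets(env=None):
    """Return the gateway URLs a service should send telemetry to.

    INFRASAGE_GATEWAY_URL is the primary endpoint (defaults to Cloud).
    INFRASAGE_LEGACY_GATEWAY_URL is a HYPOTHETICAL variable holding the
    self-hosted endpoint, set only during the parallel-run window.
    """
    env = os.environ if env is None else env
    targets = [env.get("INFRASAGE_GATEWAY_URL", CLOUD_INGEST_URL).rstrip("/")]
    legacy = env.get("INFRASAGE_LEGACY_GATEWAY_URL", "").rstrip("/")
    if legacy and legacy not in targets:
        targets.append(legacy)
    return targets
```

Unsetting the legacy variable after validation then ends the parallel run without a code change.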
Step 4: Validate and Decommission
Run both endpoints in parallel for 48 hours, then shut down your self-hosted stack.
# Helm uninstall (after data validation)
helm uninstall infrasage -n infrasage
kubectl delete pvc -n infrasage --all
InfraSage Cloud → Self-Hosted
The reverse migration follows the same pattern. Export your telemetry data from Cloud (contact support), deploy self-hosted via Helm, and import the data.
This path is most common for:
- Teams that outgrow Cloud pricing tiers
- Regulatory changes requiring on-prem data residency
- Acquisition by a regulated entity