Migration Guide

This page covers three migration paths:

  1. Datadog (or New Relic / Dynatrace) → InfraSage: moving from a SaaS observability tool
  2. InfraSage self-hosted → InfraSage Cloud: moving between deployment models
  3. InfraSage Cloud → InfraSage self-hosted: the reverse move

Datadog → InfraSage

Overview

The migration has two phases. Phase 1 runs InfraSage in parallel (no disruption). Phase 2 cuts over and decommissions Datadog agents.

Phase 1 (parallel)            Phase 2 (cutover)
──────────────────            ─────────────────
Services → Datadog agent      Services → InfraSage only
Services → InfraSage too      Datadog agents removed
(double-reporting)            Datadog subscription cancelled

Expect Phase 1 to last 1–2 weeks while you validate alert parity and tune thresholds.


Step 1: Map Your Datadog Monitors to InfraSage Services

List your existing Datadog monitors and group them by service:

# Export Datadog monitors via API
curl -X GET "https://api.datadoghq.com/api/v1/monitor" \
  -H "DD-API-KEY: $DD_API_KEY" \
  -H "DD-APPLICATION-KEY: $DD_APP_KEY" | jq '.[].name'

For each monitor, identify:

  • Which service_id it corresponds to in InfraSage
  • Which metric name to use (see metric naming)
  • The threshold / sensitivity equivalent
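The grouping step can be sketched in Python. This is a minimal sketch, not an InfraSage tool: it assumes the monitor export was saved as JSON and that your monitors carry a `service:<name>` tag, which is a common Datadog convention but not guaranteed. The function name and the `unmapped` bucket are hypothetical.

```python
from collections import defaultdict


def group_monitors_by_service(monitors, tag_prefix="service:"):
    """Group Datadog monitor dicts by the value of their `service:` tag.

    Assumption: each monitor is a dict with `name`, `query`, and `tags`
    keys, as returned by the monitor list API. Monitors without a
    service tag land in a hypothetical "unmapped" bucket for manual review.
    """
    grouped = defaultdict(list)
    for monitor in monitors:
        tags = monitor.get("tags") or []
        services = [t[len(tag_prefix):] for t in tags if t.startswith(tag_prefix)]
        for service in services or ["unmapped"]:
            grouped[service].append(
                {"name": monitor.get("name"), "query": monitor.get("query")}
            )
    return dict(grouped)
```

Loading the saved export with `json.load` and passing the list to `group_monitors_by_service` gives one entry per candidate `service_id`; the `unmapped` bucket is the list of monitors you still need to assign by hand.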

Step 2: Deploy InfraSage

Follow Installation to get InfraSage running. Create one tenant per team or environment.


Step 3: Send Metrics to Both (Parallel Phase)

Option A: Use the Datadog Agent as a forwarder

Configure the Datadog Agent to also forward metrics to InfraSage via the Prometheus remote-write endpoint:

# datadog.yaml
additional_endpoints:
  "http://infrasage:8080/api/v1/prometheus/remote_write":
    - "isage_your_key"

Option B: Instrument directly alongside Datadog

Add InfraSage batch calls next to your existing Datadog metric calls. InfraSage ingestion is a plain HTTP call, so sending metrics in batches keeps the added overhead small.
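A minimal Python sketch of Option B follows. It makes several assumptions: the batch API path is not given on this page, so the sketch posts single events to the `/api/v1/telemetry` endpoint used in Step 4; the `Content-Type` header is assumed; and `report_metric`, `build_metric`, and `post_to_infrasage` are hypothetical helper names. The Datadog call is passed in as a function so your existing client stays untouched.

```python
import json
import time
import urllib.request

# Endpoint and payload shape taken from the Step 4 example on this page.
INFRASAGE_URL = "http://infrasage:8080/api/v1/telemetry"


def build_metric(service_id, metric_name, value, now=None):
    """Build a telemetry payload with a millisecond epoch timestamp."""
    ts_ms = int((time.time() if now is None else now) * 1000)
    return {
        "type": "metric",
        "service_id": service_id,
        "metric_name": metric_name,
        "value": value,
        "timestamp": ts_ms,
    }


def post_to_infrasage(payload, api_key, url=INFRASAGE_URL, timeout=2.0):
    """POST one payload to InfraSage (Content-Type header is an assumption)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def report_metric(service_id, metric_name, value, datadog_send, infrasage_send):
    """Report to Datadog first, then InfraSage.

    Network errors on the InfraSage side are swallowed so the parallel
    phase cannot break existing Datadog reporting. Returns True if the
    InfraSage send succeeded.
    """
    datadog_send(metric_name, value)
    try:
        infrasage_send(build_metric(service_id, metric_name, value))
        return True
    except OSError:  # urllib's URLError, HTTPError, and timeouts derive from OSError
        return False
```

In a service, `datadog_send` would wrap your current Datadog client call and `infrasage_send` would be something like `lambda p: post_to_infrasage(p, api_key)`.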

Option C: OpenTelemetry Collector (recommended for new services)

Route OTEL Collector output to both backends during the migration window:

# otel-collector-config.yaml
exporters:
  datadog:
    api:
      key: ${DD_API_KEY}
  otlp/infrasage:
    endpoint: http://infrasage:8080/api/v1/otlp
    headers:
      X-API-Key: ${INFRASAGE_API_KEY}

service:
  pipelines:
    metrics:
      exporters: [datadog, otlp/infrasage]
    traces:
      exporters: [datadog, otlp/infrasage]

Step 4: Recreate Alerts in InfraSage

InfraSage anomaly detection is automatic (no manual threshold configuration required), but you'll want to validate it catches what your Datadog monitors were catching.

For each critical Datadog monitor, trigger a synthetic event and confirm InfraSage detects it:

# Send a spike that would have triggered your Datadog monitor
# (timestamp is epoch milliseconds: seconds from `date` with "000" appended)
curl -X POST http://infrasage:8080/api/v1/telemetry \
  -H "X-API-Key: $INFRASAGE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"type":"metric","service_id":"my-service","metric_name":"error_rate","value":0.15,"timestamp":'"$(date +%s000)"'}'

# Confirm the anomaly appeared
curl "http://infrasage:8080/api/v1/anomalies?service_id=my-service" \
  -H "X-API-Key: $INFRASAGE_API_KEY"
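Detection may not be instantaneous, so a parity check is easier to automate with a short polling loop. The sketch below is illustrative only. It assumes the anomalies endpoint returns a JSON array of objects that each carry a `metric_name` field; that response shape is not documented on this page, so adjust the predicate to the real payload. `fetch_anomalies` and `wait_for_anomaly` are hypothetical names.

```python
import json
import time
import urllib.parse
import urllib.request


def fetch_anomalies(base_url, api_key, service_id):
    """GET /api/v1/anomalies for one service and decode the JSON body."""
    query = urllib.parse.urlencode({"service_id": service_id})
    request = urllib.request.Request(
        f"{base_url}/api/v1/anomalies?{query}",
        headers={"X-API-Key": api_key},
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response)


def wait_for_anomaly(fetch, metric_name, attempts=10, delay_s=6.0, sleep=time.sleep):
    """Poll `fetch()` until an anomaly for `metric_name` shows up.

    Assumption: `fetch()` returns a list of dicts with a `metric_name` key.
    Returns True on the first match, False once all attempts are used.
    """
    for attempt in range(attempts):
        anomalies = fetch()
        if any(a.get("metric_name") == metric_name for a in anomalies):
            return True
        if attempt < attempts - 1:
            sleep(delay_s)
    return False
```

A check for one monitor would look like `wait_for_anomaly(lambda: fetch_anomalies("http://infrasage:8080", api_key, "my-service"), "error_rate")`, run right after sending the synthetic spike.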

Step 5: Migrate Dashboards

Datadog dashboards map to Grafana dashboards backed by InfraSage's Prometheus metrics endpoint. InfraSage ships pre-built Grafana dashboards — import them first, then add service-specific panels.

InfraSage exposes Prometheus metrics at:

http://infrasage:8080/metrics # gateway
http://infrasage-aiops:8081/metrics # aiops engine

Add these as scrape targets in your Prometheus configuration, point a Grafana Prometheus data source at that Prometheus server, then use standard PromQL for custom panels.
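Before writing PromQL for custom panels, it helps to see which metric names the two endpoints actually expose. The following is a minimal sketch that reads the Prometheus text exposition format; the helper names are hypothetical, and no particular InfraSage metric names are assumed.

```python
import urllib.request


def scrape(url, timeout=5):
    """Fetch the raw text served by a /metrics endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def metric_names(exposition_text):
    """Return the sorted, de-duplicated metric names in a text exposition.

    Sample lines look like `name{label="x"} 1.0` or `name 1.0`;
    lines starting with `#` are HELP/TYPE metadata and are skipped.
    """
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The name is everything before the first `{` or whitespace.
        names.add(line.split("{", 1)[0].split()[0])
    return sorted(names)
```

For example, `metric_names(scrape("http://infrasage:8080/metrics"))` lists the gateway's metric names, which you can then use as the starting point for panel queries.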


Step 6: Cut Over and Decommission

Once you've validated alert parity for at least one week:

  1. Remove Datadog agent DaemonSets from your clusters
  2. Update OTEL Collector config to remove the Datadog exporter
  3. Cancel Datadog subscription (export your historical data first if needed)

Datadog Feature Mapping

Datadog Feature        InfraSage Equivalent
───────────────        ────────────────────
Monitors               Anomaly detection (automatic) + configurable thresholds
Watchdog               AIops Engine Watchdog
APM                    Traces via OTLP
Log Management         Log ingestion via batch API or OTLP
Synthetics             Not included; use an external synthetic monitor
NPM                    Not included; use Cilium Hubble or similar
RUM                    Not included
Incident Management    Jira integration + runbook automation
Notebooks              Grafana notebooks
SLOs                   SLO telemetry type

InfraSage Self-Hosted → InfraSage Cloud

When to Consider This

  • You want to reduce operational overhead (no ClickHouse/Kafka to manage)
  • Your data residency requirements are satisfied by InfraSage Cloud's EU/US region options
  • You're scaling faster than you want to manage infrastructure for

Step 1: Export Your ClickHouse Data

Export historical telemetry from your self-hosted ClickHouse to a Parquet file. Run the query from clickhouse-client, which writes the output file on the client machine:

-- Export telemetry table
SELECT * FROM telemetry
WHERE timestamp >= now() - INTERVAL 90 DAY
INTO OUTFILE '/tmp/telemetry-export.parquet'
FORMAT Parquet;

Contact InfraSage support to arrange a bulk import to your Cloud workspace.

Step 2: Provision Cloud Workspace

  1. Sign up at console.infrasage.dev
  2. Create a tenant with the same tenant_id as your self-hosted instance
  3. Generate new API keys (same scopes as your existing keys)

Step 3: Update Your Ingestion Endpoints

Change INFRASAGE_GATEWAY_URL in all instrumented services from your self-hosted endpoint to:

https://ingest.infrasage.dev
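If you want services to keep feeding the self-hosted gateway during the validation window in Step 4, one option is to resolve a list of gateways from the environment and fan each payload out to all of them. This is a sketch under assumptions: `INFRASAGE_LEGACY_GATEWAY_URL` is a hypothetical variable introduced here for the old endpoint, and `send` stands in for whatever HTTP call your instrumentation already makes.

```python
import os

CLOUD_GATEWAY = "https://ingest.infrasage.dev"


def gateway_urls(env=None):
    """Return the gateways to send to, primary first, without duplicates.

    INFRASAGE_GATEWAY_URL is the variable named in this guide.
    INFRASAGE_LEGACY_GATEWAY_URL is a hypothetical extra variable that
    keeps the self-hosted endpoint in the loop while you validate.
    """
    env = os.environ if env is None else env
    candidates = [
        env.get("INFRASAGE_GATEWAY_URL", CLOUD_GATEWAY),
        env.get("INFRASAGE_LEGACY_GATEWAY_URL", ""),
    ]
    urls = []
    for url in candidates:
        url = url.rstrip("/")
        if url and url not in urls:
            urls.append(url)
    return urls


def fan_out(payload, urls, send):
    """Send one payload to every gateway and record success per URL.

    A network failure on one gateway does not stop delivery to the others.
    """
    results = {}
    for url in urls:
        try:
            send(url, payload)
            results[url] = True
        except OSError:
            results[url] = False
    return results
```

Unsetting the legacy variable once validation finishes returns the service to Cloud-only ingestion without a code change.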

Step 4: Validate and Decommission

Run both endpoints in parallel for 48 hours, then shut down your self-hosted stack.

# Helm uninstall (after data validation)
helm uninstall infrasage -n infrasage
kubectl delete pvc -n infrasage --all

InfraSage Cloud → Self-Hosted

The reverse migration follows the same pattern. Export your telemetry data from Cloud (contact support), deploy self-hosted via Helm, and import the data.

This path is most common for:

  • Teams that outgrow Cloud pricing tiers
  • Regulatory changes requiring on-prem data residency
  • Acquisition by a regulated entity