Installation

InfraSage is available as a fully managed cloud service or as a self-hosted deployment within your own cloud infrastructure.


Cloud (InfraSage Hosted)

The hosted service at console.infrasage.dev requires zero infrastructure setup. InfraSage manages all components — ingestion gateways, Kafka brokers, ClickHouse clusters, the AIops Engine, and the console UI.

To get started:

  1. Sign up at console.infrasage.dev/register
  2. Create an API key under Settings → API Keys
  3. Start sending telemetry to your assigned ingestion endpoint

See the Quick Start guide for a step-by-step walkthrough.


Self-Hosted

Self-hosted deployments give your team complete control over data residency, network placement, and infrastructure sizing. InfraSage provides signed container images and production-ready deployment manifests as part of your license.

License & Access

Self-hosted InfraSage requires a commercial license. Contact [email protected] to:

  • Request a license
  • Get access to the InfraSage private artifact registry
  • Discuss architecture requirements and support tiers

Once licensed, you receive:

  • Access credentials for the private container registry
  • Kubernetes manifests and an optional Helm chart
  • A license key for activation

Requirements

Component            Minimum          Recommended (Production)
Kubernetes           1.25+            1.28+
ClickHouse           26+              Managed or dedicated cluster
Kafka / Redpanda     Redpanda 23.3+   3-node cluster
Node RAM (gateway)   8 GB             16–32 GB
Persistent storage   100 GB           Sized to retention policy
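
Before installing, it can help to confirm that the cluster meets the Kubernetes minimum in the table above. The following is a minimal sketch, not an InfraSage tool: `version_at_least` is a hypothetical helper name, and the usage line assumes `jq` is installed.

```shell
# Return 0 if version $1 (e.g. "v1.28.3") is at least major.minor $2 (e.g. "1.25").
# Hypothetical helper for illustration; not shipped with InfraSage.
version_at_least() {
  have=${1#v}                          # drop a leading "v"
  want=${2#v}
  have_major=${have%%.*}
  have_rest=${have#*.}
  have_minor=${have_rest%%.*}
  have_minor=${have_minor%%[!0-9]*}    # strip suffixes such as "+"
  want_major=${want%%.*}
  want_rest=${want#*.}
  want_minor=${want_rest%%.*}
  want_minor=${want_minor%%[!0-9]*}
  if [ "$have_major" -ne "$want_major" ]; then
    [ "$have_major" -gt "$want_major" ]
    return
  fi
  [ "$have_minor" -ge "$want_minor" ]
}

# Usage (assumes jq is available):
#   server=$(kubectl version -o json | jq -r '.serverVersion.gitVersion')
#   version_at_least "$server" 1.25 || echo "Kubernetes 1.25+ is required" >&2
```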

Step 1 — Authenticate with the Registry

docker login registry.infrasage.io \
--username <YOUR_LICENSE_EMAIL> \
--password <YOUR_REGISTRY_TOKEN>
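
Passing the token with `--password` leaves it in shell history and process listings. As an alternative, Docker's `--password-stdin` flag reads the token from standard input. The sketch below assumes the credentials are held in two environment variables, `INFRASAGE_REGISTRY_USER` and `INFRASAGE_REGISTRY_TOKEN`; both names are hypothetical and chosen only for this example.

```shell
# Log in to the InfraSage registry without putting the token on the command line.
# INFRASAGE_REGISTRY_USER and INFRASAGE_REGISTRY_TOKEN are hypothetical variable
# names used for this sketch.
registry_login() {
  : "${INFRASAGE_REGISTRY_USER:?set to your license email}"
  : "${INFRASAGE_REGISTRY_TOKEN:?set to your registry token}"
  printf '%s' "$INFRASAGE_REGISTRY_TOKEN" |
    docker login registry.infrasage.io \
      --username "$INFRASAGE_REGISTRY_USER" \
      --password-stdin
}
```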

Step 2 — Create the Namespace and Secrets

kubectl create namespace infrasage

# LLM secrets: your Anthropic API key (powers LLM-based RCA) and your Slack webhook URL
kubectl create secret generic llm-secrets \
--from-literal=ANTHROPIC_API_KEY=sk-ant-YOUR_KEY_HERE \
--from-literal=SLACK_WEBHOOK_URL=https://hooks.slack.com/services/YOUR/WEBHOOK \
-n infrasage

# ClickHouse credentials
kubectl create secret generic infrasage-clickhouse-secret \
--from-literal=password=YOUR_SECURE_PASSWORD \
-n infrasage

# InfraSage license key
kubectl create secret generic infrasage-license \
--from-literal=key=YOUR_LICENSE_KEY \
-n infrasage
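
A quick check that all three secrets exist before deploying can save a round of failed pods. This is a minimal sketch using the secret names from the commands above; `missing_secrets` is a hypothetical helper name.

```shell
# Print the name of each expected secret that is missing from the namespace.
# Secret names match the `kubectl create secret` commands in this step.
missing_secrets() {
  namespace=${1:-infrasage}
  for name in llm-secrets infrasage-clickhouse-secret infrasage-license; do
    if ! kubectl get secret "$name" -n "$namespace" >/dev/null 2>&1; then
      printf '%s\n' "$name"
    fi
  done
}

# Usage: an empty result means every secret was found.
#   missing_secrets infrasage
```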

Step 3 — Deploy InfraSage

kubectl apply -f deployments/kubernetes/

This deploys:

  • ClickHouse StatefulSet with persistent volumes
  • Redpanda StatefulSet (Kafka-compatible broker)
  • Ingestion Gateway Deployment + Service
  • Telemetry Operator Deployment + Service
  • AIops Engine Deployment + Service
  • Prometheus ConfigMap + Deployment
  • Grafana Deployment + Service

Step 4 — Verify All Pods Are Running

kubectl get pods -n infrasage
# NAME                     READY   STATUS    RESTARTS   AGE
# clickhouse-0             1/1     Running   0          2m
# redpanda-0               1/1     Running   0          2m
# ingestion-gateway-xxx    1/1     Running   0          90s
# telemetry-operator-xxx   1/1     Running   0          90s
# aiops-engine-xxx         1/1     Running   0          90s
# prometheus-xxx           1/1     Running   0          90s
# grafana-xxx              1/1     Running   0          90s
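
Instead of polling `kubectl get pods` by hand, you can wait on each rollout. The sketch below assumes the workload names match the pod name prefixes in the listing above (for example, `clickhouse-0` belonging to a StatefulSet named `clickhouse`); only `ingestion-gateway`, `telemetry-operator`, and `aiops-engine` are confirmed as Deployment names elsewhere on this page, so adjust the rest to your manifests. `wait_for_infrasage` is a hypothetical helper name.

```shell
# Block until every workload finishes rolling out, or fail on the first timeout.
# Workload names other than the three InfraSage deployments are inferred from
# pod names and may differ in your manifests.
wait_for_infrasage() {
  namespace=${1:-infrasage}
  timeout=${2:-300s}
  for workload in \
    statefulset/clickhouse statefulset/redpanda \
    deployment/ingestion-gateway deployment/telemetry-operator \
    deployment/aiops-engine deployment/prometheus deployment/grafana
  do
    kubectl rollout status "$workload" -n "$namespace" --timeout="$timeout" || return 1
  done
}

# Usage:
#   wait_for_infrasage infrasage 300s && echo "all workloads ready"
```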

Step 5 — Scale for Your Environment

# Scale ingestion horizontally
kubectl scale deployment ingestion-gateway -n infrasage --replicas=3

# Autoscale based on CPU
kubectl autoscale deployment ingestion-gateway \
-n infrasage --min=2 --max=10 --cpu-percent=70
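
To get a feel for how the autoscaler above will behave, the standard Kubernetes Horizontal Pod Autoscaler rule is desired = ceil(current replicas × current utilization / target utilization), clamped to the min and max. The sketch below applies that rule with the values from the command (70% target, 2 to 10 replicas). It ignores the autoscaler's tolerance band and stabilization window, and it assumes the gateway pods declare CPU requests, since utilization is measured against them. `hpa_desired_replicas` is a hypothetical helper name.

```shell
# Estimate the replica count the HPA would ask for.
# Arguments: current replicas, current CPU %, target CPU % (default 70),
# min replicas (default 2), max replicas (default 10).
hpa_desired_replicas() {
  current=$1
  cpu=$2
  target=${3:-70}
  min=${4:-2}
  max=${5:-10}
  # Integer ceiling of current * cpu / target.
  desired=$(( (current * cpu + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  printf '%s\n' "$desired"
}

hpa_desired_replicas 3 90    # prints 4: ceil(3 * 90 / 70)
hpa_desired_replicas 8 140   # prints 10: 16 is clamped to the maximum
```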

See Scale Profiles for sizing guidance across small, medium, and large deployments.


Service Ports

Service                HTTP Port   Metrics Port
Ingestion Gateway      8080        9090
Telemetry Operator     8081        9091
AIops Engine           9092
Alertmanager Webhook   9093
Prometheus             9999
Grafana                3000
ClickHouse HTTP        8123
ClickHouse Native      9000
Redpanda / Kafka       9092
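
For local access during setup, `kubectl port-forward` can expose one of these ports on your workstation. The sketch below covers three of the services listed with a Service object in Step 3. The Service names (`ingestion-gateway`, `telemetry-operator`, `grafana`) are assumptions based on the deployment names; run `kubectl get svc -n infrasage` to confirm them. `forward_port` is a hypothetical helper name.

```shell
# Forward a local port to an in-cluster service, using ports from the table above.
# Service object names are assumed to match the deployment names.
forward_port() {
  case $1 in
    ingestion-gateway)  port=8080 ;;
    telemetry-operator) port=8081 ;;
    grafana)            port=3000 ;;
    *) echo "unknown service: $1" >&2; return 2 ;;
  esac
  kubectl port-forward "svc/$1" "$port:$port" -n "${2:-infrasage}"
}

# Usage: then open http://localhost:3000 for Grafana.
#   forward_port grafana
```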

Upgrading

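# Updated manifests are provided with each release.
# The commands below roll each InfraSage deployment to the new image tag; replace NEW_VERSION with the target release.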
kubectl set image deployment/ingestion-gateway \
ingestion-gateway=registry.infrasage.io/ingestion-gateway:NEW_VERSION \
-n infrasage

kubectl set image deployment/telemetry-operator \
telemetry-operator=registry.infrasage.io/telemetry-operator:NEW_VERSION \
-n infrasage

kubectl set image deployment/aiops-engine \
aiops-engine=registry.infrasage.io/aiops-engine:NEW_VERSION \
-n infrasage
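
The three commands above can be wrapped in a loop that waits for each rollout and reverts it on failure. This is a sketch under the assumption that all three images share one version tag and that a 300-second rollout timeout is acceptable; `upgrade_infrasage` is a hypothetical helper name, not part of the InfraSage tooling.

```shell
# Roll the three InfraSage deployments to one version, one at a time.
# If a rollout does not complete, undo it and stop.
upgrade_infrasage() {
  version=${1:?usage: upgrade_infrasage NEW_VERSION [namespace]}
  namespace=${2:-infrasage}
  for name in ingestion-gateway telemetry-operator aiops-engine; do
    kubectl set image "deployment/$name" \
      "$name=registry.infrasage.io/$name:$version" -n "$namespace" || return 1
    if ! kubectl rollout status "deployment/$name" -n "$namespace" --timeout=300s; then
      kubectl rollout undo "deployment/$name" -n "$namespace"
      return 1
    fi
  done
}

# Usage:
#   upgrade_infrasage NEW_VERSION
```

Upgrading one deployment at a time keeps a failed rollout contained to a single component, at the cost of a slower overall upgrade.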

Security Hardening

Before going to production:

  • Replace any default CLICKHOUSE_PASSWORD value with a strong, unique password
  • Change the Grafana admin password from its default
  • Rotate ANTHROPIC_API_KEY quarterly
  • Enable TLS on ClickHouse connections
  • Set up Kubernetes NetworkPolicies to restrict inter-service communication
  • Use a secrets manager (AWS Secrets Manager, HashiCorp Vault) instead of raw Kubernetes secrets
  • Enable ClickHouse access logging
  • Set up regular ClickHouse backups
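
As a starting point for the NetworkPolicy item, a common pattern is a default-deny ingress policy for the namespace, followed by explicit allow rules per service. The sketch below only prints the default-deny manifest. Applied on its own it would block traffic between InfraSage components, so the allow rules must come with it; those depend on the pod labels in your manifests, which are not shown on this page. It also assumes your cluster's network plugin enforces NetworkPolicies. `default_deny_policy` is a hypothetical helper name.

```shell
# Print a default-deny ingress NetworkPolicy for the given namespace.
# An empty podSelector selects every pod in the namespace.
default_deny_policy() {
  namespace=${1:-infrasage}
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: $namespace
spec:
  podSelector: {}
  policyTypes:
    - Ingress
EOF
}

# Usage: validate locally first, without changing the cluster.
#   default_deny_policy infrasage | kubectl apply --dry-run=client -f -
```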

See the Security section for full guidance.