Data Residency & VPC Isolation
InfraSage is designed from the ground up for environments where telemetry data cannot leave your infrastructure perimeter. This page explains the architecture, data flows, and compliance posture.
The Core Guarantee
No telemetry data is sent to InfraSage, Inc.
When you deploy InfraSage self-hosted:
- All metrics, logs, traces, events, and profiles stay inside your VPC
- ClickHouse (storage), Kafka (streaming), and all processing run on your infrastructure
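- The only outbound call InfraSage makes on its own is to the Anthropic API for RCA reasoning, and this can be disabled or replaced with a self-hosted LLM (see below). Optional integrations (PagerDuty, Slack, Jira) add egress only if you enable them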
- InfraSage, Inc. has no access to your telemetry data
Data Flow Map
Your VPC / Data Center
┌──────────────────────────────────────────────────────────────────┐
│                                                                  │
│  Your services ──► Ingestion Gateway ──► Kafka                   │
│                                            │                     │
│                                            ▼                     │
│                                   Telemetry Operator             │
│                                            │                     │
│                                            ▼                     │
│                                       ClickHouse                 │
│                                            │                     │
│                                            ▼                     │
│                                      AIops Engine                │
│                                            │                     │
│                       ┌────────────────────┘                     │
│                       │                                          │
│                RCA triggered?                                    │
│                       │                                          │
│                Yes ──►│──► Anthropic API (optional)              │
│                       │    (structured prompt only,              │
│                       │     no raw telemetry)                    │
│                No ────┘                                          │
│                                                                  │
│  Admin UI ◄── AIops Engine (internal)                            │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
Everything inside the box is your infrastructure.
What Leaves Your Environment
Anthropic API (RCA only, optional)
When an anomaly triggers RCA, InfraSage sends a structured analytical prompt to the Anthropic API. This prompt contains:
- Service IDs and metric names (e.g., payment-service, error_rate)
- Statistical summaries (mean, stddev, Z-score)
- Causal graph edges (service A → service B dependency)
- Recent event descriptions (e.g., "deployment at 14:32 UTC")
What it does NOT contain:
- Raw log lines
- Trace payloads
- User PII or financial data
- Customer identifiers
- Any metric values beyond statistical summaries
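The boundary described in the two lists above can be illustrated with a minimal sketch: raw samples are reduced to statistical summaries first, and only those summaries, the dependency edges, and the event descriptions go into the prompt. The names `MetricSummary`, `summarize_metric`, and `build_rca_prompt`, the JSON layout, and the `checkout-service` example are hypothetical and do not describe InfraSage's actual internals.

```python
import json
import statistics
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class MetricSummary:
    """Hypothetical container: holds summaries only, never raw samples."""
    service_id: str
    metric_name: str
    mean: float
    stddev: float
    z_score: float


def summarize_metric(service_id, metric_name, baseline, current):
    """Reduce baseline samples and the current value to mean, stddev, and Z-score."""
    mean = statistics.fmean(baseline)
    stddev = statistics.pstdev(baseline)
    # A flat baseline has no spread, so the Z-score is undefined; report 0.0.
    z_score = (current - mean) / stddev if stddev > 0 else 0.0
    return MetricSummary(service_id, metric_name,
                         round(mean, 4), round(stddev, 4), round(z_score, 2))


def build_rca_prompt(summaries, edges, events):
    """Assemble a structured payload from summaries, causal edges, and events."""
    payload = {
        "metrics": [asdict(s) for s in summaries],
        "causal_edges": [{"from": src, "to": dst} for src, dst in edges],
        "events": list(events),
    }
    return json.dumps(payload, indent=2)


if __name__ == "__main__":
    summary = summarize_metric("payment-service", "error_rate",
                               baseline=[0.01, 0.02, 0.01, 0.02], current=0.09)
    print(build_rca_prompt([summary],
                           edges=[("checkout-service", "payment-service")],
                           events=["deployment at 14:32 UTC"]))
```

In this sketch the raw baseline samples and the current reading never appear in the payload; the reader of the prompt sees only how far the current value deviates from the baseline.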
Disabling the Anthropic API (Fully Air-Gapped)
To eliminate the outbound LLM call, set:
LLM_PROVIDER=none
RCA will continue to run — causal graph construction, blast radius scoring, and evidence gathering all execute locally. Only the final natural-language explanation is skipped.
Using a Self-Hosted LLM
To keep RCA reasoning fully inside your environment:
LLM_PROVIDER=openai_compatible
LLM_BASE_URL=http://ollama.internal:11434/v1
LLM_MODEL=llama3.1:70b
LLM_API_KEY=none
This works with any OpenAI-compatible API (Ollama, vLLM, LM Studio, Azure OpenAI private endpoint).
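As a rough illustration of how the `LLM_*` settings above could be interpreted, the sketch below resolves them into either "no LLM call" or a set of endpoint parameters. The `LLMConfig` type, the `resolve_llm_config` function, and the assumption that an unset `LLM_PROVIDER` falls back to `anthropic` are hypothetical; check your deployment's actual defaults.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class LLMConfig:
    """Hypothetical resolved settings for the explanation step."""
    provider: str
    base_url: Optional[str] = None
    model: Optional[str] = None
    api_key: Optional[str] = None


def resolve_llm_config(env: Mapping[str, str]) -> Optional[LLMConfig]:
    """Return None when no LLM should be called, otherwise the endpoint settings."""
    # Assumption: an unset provider means the hosted Anthropic API.
    provider = env.get("LLM_PROVIDER", "anthropic").strip().lower()
    if provider == "none":
        # Air-gapped mode: skip the natural-language explanation entirely.
        return None
    if provider == "openai_compatible":
        base_url = env.get("LLM_BASE_URL")
        model = env.get("LLM_MODEL")
        if not base_url or not model:
            raise ValueError("openai_compatible requires LLM_BASE_URL and LLM_MODEL")
        return LLMConfig(provider, base_url.rstrip("/"), model,
                         env.get("LLM_API_KEY", "none"))
    return LLMConfig(provider)


if __name__ == "__main__":
    print(resolve_llm_config({
        "LLM_PROVIDER": "openai_compatible",
        "LLM_BASE_URL": "http://ollama.internal:11434/v1",
        "LLM_MODEL": "llama3.1:70b",
        "LLM_API_KEY": "none",
    }))
```

Failing fast on a missing `LLM_BASE_URL` or `LLM_MODEL` is a design choice of this sketch: it avoids silently falling back to an external endpoint when a self-hosted one was intended.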
Compliance Posture
GDPR (EU)
When deployed in an EU region, InfraSage self-hosted supports compliance with GDPR's data transfer restrictions (Articles 44–49) by keeping all telemetry processing, including any personal data the telemetry contains, within the EU. Telemetry data is not transferred to a third-country processor.
Customer obligations:
- Deploy InfraSage in an EU-region VPC (e.g., eu-west-1, eu-central-1)
- Ensure ClickHouse persistent volumes are bound to EU-region storage
- Review log content for incidental PII; use the log sampling and field exclusion config to strip PII before ingestion
Relevant config:
# Strip fields from log payloads before storage
LOG_FIELD_EXCLUSIONS=user_id,email,ip_address,card_number
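To make the effect of `LOG_FIELD_EXCLUSIONS` concrete, here is a minimal sketch of key-based field stripping applied to a structured log record before storage. The helper names `parse_exclusions` and `strip_fields`, and the choice to match keys at every nesting level, are assumptions for illustration; InfraSage's actual matching rules may differ.

```python
from typing import Any, FrozenSet


def parse_exclusions(raw: str) -> FrozenSet[str]:
    """Parse a comma-separated list such as the LOG_FIELD_EXCLUSIONS value."""
    return frozenset(name.strip() for name in raw.split(",") if name.strip())


def strip_fields(record: Any, excluded: FrozenSet[str]) -> Any:
    """Return a copy of the record with excluded keys removed at every level."""
    if isinstance(record, dict):
        return {key: strip_fields(value, excluded)
                for key, value in record.items()
                if key not in excluded}
    if isinstance(record, list):
        return [strip_fields(item, excluded) for item in record]
    # Scalars (strings, numbers, None) are returned unchanged.
    return record


if __name__ == "__main__":
    exclusions = parse_exclusions("user_id,email,ip_address,card_number")
    log_record = {
        "message": "payment failed",
        "user_id": "u-123",
        "context": {"email": "someone@example.com", "region": "eu-west-1"},
    }
    print(strip_fields(log_record, exclusions))
```

Key-based exclusion of this kind does not catch PII embedded in free-text fields such as `message`, which is why reviewing log content remains a customer obligation.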
BaFin / DORA (Germany / EU Financial Services)
DORA (Digital Operational Resilience Act) and BaFin BAIT require that ICT third-party service providers either store data in the EU or provide contractual guarantees of equivalent protection.
InfraSage self-hosted removes InfraSage, Inc. from the ICT third-party chain entirely — you operate the software, you own the data. BaFin's concentration risk provisions around cloud providers do not apply to self-hosted software you control.
RBI Data Localization (India)
RBI's guidelines on storage of payment system data require that data related to payment transactions be stored only in India.
With InfraSage deployed in an India-region VPC (e.g., ap-south-1), all telemetry including payment service metrics stays within India's borders. InfraSage has no servers or data pipeline outside your deployment region.
HIPAA (US Healthcare)
HIPAA's Security Rule requires that PHI be protected at rest and in transit. InfraSage self-hosted ensures:
- No PHI transits to a third-party SaaS vendor
- Encryption at rest is configurable for ClickHouse storage
- mTLS is available for all internal service communication
- Audit logs of all API access are written to ClickHouse
No BAA is required with InfraSage, Inc. — because InfraSage, Inc. never processes your PHI. The software runs in your environment.
Network Architecture Recommendations
No Inbound Exposure Required
InfraSage does not require any inbound internet access. All components communicate over internal cluster networking. The Admin UI can be served internally and accessed via VPN or bastion.
Outbound Egress (Optional)
| Destination | Purpose | Required |
|---|---|---|
| api.anthropic.com:443 | RCA natural-language explanation | Optional |
| PagerDuty API | Incident creation | Optional |
| Slack API | Alert notifications | Optional |
| Jira API | Ticket creation | Optional |
All integration endpoints are configurable. Set LLM_PROVIDER=none and disable unused integrations to achieve zero required egress.
Recommended Network Policy (Kubernetes)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: infrasage-egress
  namespace: infrasage
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    # Allow DNS lookups (UDP, plus TCP for large responses)
    - ports:
        - port: 53
          protocol: UDP
        - port: 53
          protocol: TCP
    # Allow communication between pods in the infrasage namespace
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: infrasage
    # Allow outbound HTTPS for the Anthropic API (remove if using LLM_PROVIDER=none).
    # NetworkPolicy cannot match hostnames, so this rule permits port 443 to any address.
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - port: 443
          protocol: TCP
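If you maintain both an air-gapped and an LLM-enabled deployment, one option is to generate the policy rather than hand-edit it. The sketch below builds an equivalent manifest as a Python dictionary and omits the port 443 rule when HTTPS egress is not wanted. The function name `build_egress_policy` is hypothetical, and this variant also allows DNS over TCP, which the manifest above may or may not include.

```python
import json


def build_egress_policy(allow_https_egress: bool) -> dict:
    """Build the infrasage-egress NetworkPolicy as a plain dictionary."""
    egress = [
        # DNS lookups over UDP and TCP.
        {"ports": [{"port": 53, "protocol": "UDP"},
                   {"port": 53, "protocol": "TCP"}]},
        # Traffic between pods in the infrasage namespace.
        {"to": [{"namespaceSelector": {
            "matchLabels": {"kubernetes.io/metadata.name": "infrasage"}}}]},
    ]
    if allow_https_egress:
        # NetworkPolicy matches IP blocks, not hostnames, so this opens
        # port 443 to any address rather than to one API endpoint only.
        egress.append({
            "to": [{"ipBlock": {"cidr": "0.0.0.0/0"}}],
            "ports": [{"port": 443, "protocol": "TCP"}],
        })
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "infrasage-egress", "namespace": "infrasage"},
        "spec": {"podSelector": {}, "policyTypes": ["Egress"], "egress": egress},
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests, e.g. piped into: kubectl apply -f -
    print(json.dumps(build_egress_policy(allow_https_egress=False), indent=2))
```

Restricting port 443 to a single hostname generally requires an egress proxy or a CNI plugin with FQDN-aware policies, since the standard NetworkPolicy API only understands IP ranges.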
Encryption
In Transit
All inter-service communication within InfraSage supports mTLS. Configure via:
MTLS_ENABLED=true
MTLS_CA_CERT=/etc/certs/ca.crt
MTLS_SERVER_CERT=/etc/certs/server.crt
MTLS_SERVER_KEY=/etc/certs/server.key
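For readers unfamiliar with what these four settings imply, the following sketch shows how a Python service could turn them into a server-side TLS context that requires a client certificate signed by the configured CA. It uses only the standard `ssl` module; the function name `build_server_context` and the TLS 1.2 minimum are assumptions for illustration, not a description of InfraSage's implementation.

```python
import ssl
from typing import Mapping, Optional


def build_server_context(env: Mapping[str, str]) -> Optional[ssl.SSLContext]:
    """Return a mutual-TLS server context, or None when mTLS is disabled."""
    if env.get("MTLS_ENABLED", "false").strip().lower() != "true":
        return None

    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Assumption: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # Present this service's own certificate and private key.
    context.load_cert_chain(certfile=env["MTLS_SERVER_CERT"],
                            keyfile=env["MTLS_SERVER_KEY"])
    # Trust only client certificates issued by the configured CA.
    context.load_verify_locations(cafile=env["MTLS_CA_CERT"])
    # CERT_REQUIRED is what makes the TLS mutual: clients must authenticate.
    context.verify_mode = ssl.CERT_REQUIRED
    return context


if __name__ == "__main__":
    # With mTLS disabled no certificate files are read.
    print(build_server_context({"MTLS_ENABLED": "false"}))
```

The `ssl.CERT_REQUIRED` setting is the key line: without it the server would encrypt traffic but accept connections from clients that present no certificate.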
At Rest
ClickHouse supports encryption at rest via:
- Filesystem-level encryption (recommended: use cloud provider disk encryption)
- ClickHouse native encryption codecs per column
-- Example: encrypt a sensitive column (requires a key configured under encryption_codecs in the ClickHouse server config)
ALTER TABLE telemetry MODIFY COLUMN log_body CODEC(AES_256_GCM_SIV);
Audit Logging
Every API call to InfraSage is logged to the audit_log ClickHouse table:
SELECT
timestamp,
actor_email,
tenant_id,
action,
resource_type,
resource_id,
ip_address,
status_code
FROM audit_log
WHERE tenant_id = 'your-tenant'
ORDER BY timestamp DESC
LIMIT 100;
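By default, audit logs cannot be modified or deleted through the InfraSage API: the audit_log table uses ReplacingMergeTree, and the API exposes no delete operations. Anyone with direct ClickHouse access can still alter the table, so restrict database credentials accordingly.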
Penetration Testing & Security Reviews
InfraSage is open to customer-initiated penetration testing against self-hosted deployments. Contact [email protected] if you need architecture documentation for a third-party security review or vendor assessment.