Skip to main content

Data Residency & VPC Isolation

InfraSage is designed from the ground up for environments where telemetry data cannot leave your infrastructure perimeter. This page explains the architecture, data flows, and compliance posture.


The Core Guarantee

No telemetry data sent to InfraSage, Inc.

When you deploy InfraSage self-hosted:

  • All metrics, logs, traces, events, and profiles stay inside your VPC
  • ClickHouse (storage), Kafka (streaming), and all processing run on your infrastructure
  • The only outbound call is to the Anthropic API for RCA reasoning — and this can be replaced with a self-hosted LLM (see below)
  • InfraSage, Inc. has no access to your telemetry data

Data Flow Map

Your VPC / Data Center
┌──────────────────────────────────────────────────────────────────┐
│ │
│ Your services ──► Ingestion Gateway ──► Kafka │
│ │ │
│ ▼ │
│ Telemetry Operator │
│ │ │
│ ▼ │
│ ClickHouse │
│ │ │
│ ▼ │
│ AIops Engine │
│ │ │
│ ┌────────────────────┘ │
│ │ │
│ RCA triggered? │
│ │ │
│ Yes ──►│──► Anthropic API (optional) │
│ │ (structured prompt only, │
│ │ no raw telemetry) │
│ No ────┘ │
│ │
│ Admin UI ◄── AIops Engine (internal) │
│ │
└──────────────────────────────────────────────────────────────────┘

Everything inside the box is your infrastructure.


What Leaves Your Environment

Anthropic API (RCA only, optional)

When an anomaly triggers RCA, InfraSage sends a structured analytical prompt to the Anthropic API. This prompt contains:

  • Service IDs and metric names (e.g., payment-service, error_rate)
  • Statistical summaries (mean, stddev, Z-score)
  • Causal graph edges (service A → service B dependency)
  • Recent event descriptions (e.g., "deployment at 14:32 UTC")

What it does NOT contain:

  • Raw log lines
  • Trace payloads
  • User PII or financial data
  • Customer identifiers
  • Any metric values beyond statistical summaries

Disabling the Anthropic API (Fully Air-Gapped)

To eliminate all outbound calls entirely, set:

LLM_PROVIDER=none

RCA will continue to run — causal graph construction, blast radius scoring, and evidence gathering all execute locally. Only the final natural-language explanation is skipped.

Using a Self-Hosted LLM

To keep RCA reasoning fully inside your environment:

LLM_PROVIDER=openai_compatible
LLM_BASE_URL=http://ollama.internal:11434/v1
LLM_MODEL=llama3.1:70b
LLM_API_KEY=none

Compatible with any OpenAI-compatible API (Ollama, vLLM, LMStudio, Azure OpenAI private endpoint).


Compliance Posture

GDPR (EU)

InfraSage self-hosted satisfies GDPR's data transfer restrictions (Articles 44–49) by keeping all personal data processing within the EU. There is no cross-border transfer to a third-country processor for telemetry data.

Customer obligations:

  • Deploy InfraSage in an EU-region VPC (e.g., eu-west-1, eu-central-1)
  • Ensure ClickHouse persistent volumes are bound to EU-region storage
  • Review log content for incidental PII; use the log sampling and field exclusion config to strip PII before ingestion

Relevant config:

# Strip fields from log payloads before storage
LOG_FIELD_EXCLUSIONS=user_id,email,ip_address,card_number

BaFin / DORA (Germany / EU Financial Services)

DORA (Digital Operational Resilience Act) and BaFin BAIT require that ICT third-party service providers either store data in the EU or provide contractual guarantees of equivalent protection.

InfraSage self-hosted removes InfraSage, Inc. from the ICT third-party chain entirely — you operate the software, you own the data. BaFin's concentration risk provisions around cloud providers do not apply to self-hosted software you control.

RBI Data Localization (India)

RBI's guidelines on storage of payment system data require that data related to payment transactions be stored only in India.

With InfraSage deployed in an India-region VPC (e.g., ap-south-1), all telemetry including payment service metrics stays within India's borders. InfraSage has no servers or data pipeline outside your deployment region.

HIPAA (US Healthcare)

HIPAA's Security Rule requires that PHI be protected at rest and in transit. InfraSage self-hosted ensures:

  • No PHI transits to a third-party SaaS vendor
  • ClickHouse storage encryption is configurable at rest
  • mTLS is available for all internal service communication
  • Audit logs of all API access are written to ClickHouse

No BAA is required with InfraSage, Inc. — because InfraSage, Inc. never processes your PHI. The software runs in your environment.


Network Architecture Recommendations

No Inbound Exposure Required

InfraSage does not require any inbound internet access. All components communicate over internal cluster networking. The Admin UI can be served internally and accessed via VPN or bastion.

Outbound Egress (Optional)

DestinationPurposeRequired
api.anthropic.com:443RCA natural-language explanationOptional
PagerDuty APIIncident creationOptional
Slack APIAlert notificationsOptional
Jira APITicket creationOptional

All integration endpoints are configurable. Set LLM_PROVIDER=none and disable unused integrations to achieve zero required egress.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: infrasage-egress
namespace: infrasage
spec:
podSelector: {}
policyTypes:
- Egress
egress:
# Allow internal cluster DNS
- ports:
- port: 53
protocol: UDP
# Allow internal cluster communication
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: infrasage
# Allow Anthropic API (remove if using LLM_PROVIDER=none)
- to:
- ipBlock:
cidr: 0.0.0.0/0
ports:
- port: 443
protocol: TCP

Encryption

In Transit

All inter-service communication within InfraSage supports mTLS. Configure via:

MTLS_ENABLED=true
MTLS_CA_CERT=/etc/certs/ca.crt
MTLS_SERVER_CERT=/etc/certs/server.crt
MTLS_SERVER_KEY=/etc/certs/server.key

At Rest

ClickHouse supports encryption at rest via:

  • Filesystem-level encryption (recommended: use cloud provider disk encryption)
  • ClickHouse native encryption codecs per column
-- Example: encrypt a sensitive column
ALTER TABLE telemetry MODIFY COLUMN log_body Encrypted('AES-256-GCM-SIV')

Audit Logging

Every API call to InfraSage is logged to the audit_log ClickHouse table:

SELECT
timestamp,
actor_email,
tenant_id,
action,
resource_type,
resource_id,
ip_address,
status_code
FROM audit_log
WHERE tenant_id = 'your-tenant'
ORDER BY timestamp DESC
LIMIT 100;

Audit logs are immutable by default — the audit_log table uses ReplacingMergeTree with no delete operations permitted via the API.


Penetration Testing & Security Reviews

InfraSage is open to customer-initiated penetration testing against self-hosted deployments. Contact [email protected] if you need architecture documentation for a third-party security review or vendor assessment.