Most of your observability bill is data you never needed to store.
Causely processes telemetry locally before it reaches your observability vendor. What gets ingested is a fraction of what your system generates.
Observability platforms charge for everything your system generates. Most of it is noise.
Every metric, log, and trace your infrastructure produces gets shipped to your observability vendor, ingested, stored, and indexed. At scale, that raw volume is the largest driver of your bill. Most of it is queried rarely, if ever.
The standard response is sampling or retention policies: keep less data, for less time. The problem is that sampling is blunt. You don't know which signals matter until you need them. So teams keep everything and pay for it.
The structural fix is to process telemetry before it reaches your vendor: transform raw signals into structured state at the source, and transmit only what's needed for reasoning. That's what the Causely mediator does.
A 50-service cluster generates more telemetry than any team needs to store.
A mid-size engineering organization running 50 microservices at moderate traffic generates roughly 120 million spans and tens of thousands of metric time series every day. Most observability platforms charge separately for span ingestion, span indexing, and each unique metric time series. The bill compounds fast, and most of what you're paying to store is never queried.
- ~120M spans/day generated across 50 services at moderate traffic. With the mediator: 0 spans/day; raw spans never reach your vendor.
- Tens of thousands of unique metric time series, each a billable unit. With the mediator: structured state only; health signals, not raw time series.
- Ingested + indexed: charged twice by most platforms, once to collect and once to query. With the mediator: neither; processed locally, transmitted as semantic state.
- Grows with cardinality: every new tag combination is a new billable series. With the mediator: fixed structure; the semantic model doesn't expand with tag cardinality.
Volume estimates based on 50 services at 500 requests/minute with average trace depth of 4 spans. Billing model reflects standard usage-based pricing common across major observability platforms.
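As a back-of-envelope check on those inputs, the sketch below multiplies them out and illustrates why tag cardinality drives series counts. It assumes the 500 requests/minute rate applies to each service and is sustained for a full 24 hours; the footnote does not say either, so treat that as one possible reading. The function names and the example tag set are hypothetical.

```python
from math import prod

MINUTES_PER_DAY = 24 * 60


def spans_per_day(services: int, requests_per_minute: int, spans_per_trace: int) -> int:
    """Daily span count if every service sustains the same request rate all day."""
    return services * requests_per_minute * MINUTES_PER_DAY * spans_per_trace


def series_count(tag_cardinalities: dict) -> int:
    """Worst-case unique time series for one metric: the product of its tag cardinalities."""
    return prod(tag_cardinalities.values())


# Inputs taken from the volume-estimate footnote (per-service rate is an assumption).
daily_spans = spans_per_day(services=50, requests_per_minute=500, spans_per_trace=4)
print(f"{daily_spans:,} spans/day")

# Hypothetical tag set for a single request-latency metric.
tags = {"service": 50, "endpoint": 10, "status_code": 5}
series_before = series_count(tags)

# Adding one more tag multiplies the series count rather than adding to it.
tags["region"] = 3
series_after = series_count(tags)
print(series_before, series_after)
```

Under this reading the total comes to 144 million spans per day, the same order of magnitude as the ~120M figure; the quoted figure may rest on different traffic assumptions. The tag example shows the multiplicative effect: one new three-valued tag triples the series count for that metric.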
The mediator runs in your environment. Raw data never leaves.
Causely deploys a mediator component co-located with your telemetry sources. It processes raw metrics, logs, traces, and alerts locally, transforming them into structured semantic state: entities, relationships, and health signals. Only that structured state is transmitted to the Causely backend.
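To make "structured semantic state" concrete, here is a minimal sketch of the general idea: collapsing many raw spans into a small summary of entities, relationships, and health signals. This is not Causely's implementation. The `Span` type, the `summarize` function, and the 5% error-rate threshold are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """Hypothetical, simplified span record."""
    service: str
    parent_service: Optional[str]  # None for a root span
    is_error: bool


def summarize(spans, error_rate_threshold=0.05):
    """Reduce raw spans to entities, call relationships, and a per-service health signal."""
    calls = defaultdict(int)
    errors = defaultdict(int)
    edges = set()
    for span in spans:
        calls[span.service] += 1
        errors[span.service] += int(span.is_error)
        if span.parent_service and span.parent_service != span.service:
            edges.add((span.parent_service, span.service))

    # Health is a coarse label derived from the observed error rate (illustrative threshold).
    health = {
        svc: "degraded" if errors[svc] / calls[svc] > error_rate_threshold else "healthy"
        for svc in calls
    }
    entities = sorted(set(calls) | {parent for parent, _ in edges})
    return {"entities": entities, "relationships": sorted(edges), "health": health}


raw_spans = [
    Span("checkout", None, False),
    Span("checkout", None, False),
    Span("payments", "checkout", False),
    Span("payments", "checkout", True),
]
state = summarize(raw_spans)
print(state)
```

The size of the summary depends on the number of services and edges, not on the number of spans, which is the property the text describes: the transmitted structure stays fixed while raw volume grows.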
Your observability vendor receives a fraction of the raw volume your system generates. Less is ingested and less is stored, and your bill reflects both.
[Diagram. Causely flow: Causely Mediator → Causely Causal model. Observability flow (optional): reduced volume, send only what you need → observability vendor of your choice.]
Agents that do query your stack make the problem worse.
Even with reduced ingestion volume, every agent investigation that fans out through your observability stack costs you in two places simultaneously: LLM inference tokens consumed during the scan, and the data volume those queries pull from your stack. On metered platforms, that query volume has a direct line to your bill.
Without causal context, an ops agent investigating an incident queries metrics, pulls logs, enumerates traces, and repeats until it has enough signal to form a hypothesis. It may take 15–20 tool call cycles to reach an answer. Often that answer is wrong.
At 150 agent investigations per day, the difference is measurable.
Agent Investigation Savings
- −48% avg token reduction per investigation
- −79% avg tool calls per investigation
- ~2,200 fewer observability API calls per day at 150 investigations
Based on 72 experiments across Claude Code, Codex, and HolmesGPT.
Without Causely
- 18.8 tool calls avg per investigation
- ~2,820 observability API calls/day
- 433K avg tokens per investigation
- 2.4M worst-case token exposure (Codex)
- 75% of configs produced at least one missed diagnosis
With Causely
- 3.9 tool calls avg per investigation
- ~585 observability API calls/day
- 219K avg tokens per investigation
- 456K worst-case token exposure (Codex)
- 100% fault accuracy across all configs
~2,200 fewer observability queries per day at 150 agent investigations. Combined with mediator-level ingestion reduction, both layers of your observability bill move in the same direction.
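The daily figures can be reproduced from the per-investigation averages. The sketch below assumes each agent tool call corresponds to one observability API call, which is the relationship the daily totals imply; the variable and function names are illustrative.

```python
INVESTIGATIONS_PER_DAY = 150
TOOL_CALLS_WITHOUT = 18.8  # avg tool calls per investigation, without Causely
TOOL_CALLS_WITH = 3.9      # avg tool calls per investigation, with Causely


def daily_calls(calls_per_investigation, investigations=INVESTIGATIONS_PER_DAY):
    """Observability API calls per day, assuming one API call per tool call."""
    return calls_per_investigation * investigations


calls_without = daily_calls(TOOL_CALLS_WITHOUT)
calls_with = daily_calls(TOOL_CALLS_WITH)
calls_saved = calls_without - calls_with
tool_call_reduction = 1 - TOOL_CALLS_WITH / TOOL_CALLS_WITHOUT

print(f"without: {calls_without:.0f}/day, with: {calls_with:.0f}/day, saved: {calls_saved:.0f}/day")
print(f"tool-call reduction: {tool_call_reduction:.0%}")
```

At 150 investigations per day this gives roughly 2,820 calls without Causely, roughly 585 with it, and a difference of about 2,235, which rounds to the ~2,200 quoted; the per-investigation reduction works out to about 79%.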
Based on 72 experiments across Claude Code (Sonnet), Codex (GPT-5.4-mini), HolmesGPT (Gemini Flash Lite), and HolmesGPT (Sonnet).
Causely reduces your observability bill at two layers.
At ingestion: the mediator processes spans and metrics locally before they reach your vendor. Raw span volume and metric time series, the two largest drivers of observability spend at scale, are transformed into structured semantic state at the source. What gets transmitted is a fraction of what your system generates.
At query time: agents that connect via MCP query the causal model directly instead of scanning your stack. Fewer tool calls. Less data pulled. Lower token consumption per investigation. On metered platforms, both layers reduce spend.
This isn't prompt engineering. It's a structural change to how your agents interact with your stack.
See the full benchmark data.
72 experiments. Four agent configurations. Every metric broken down by configuration.
More use cases
Build an AI SRE
Your AI SRE works in demos but not in your production environment.
Causely provides the causal model of your system that makes it accurate.
Observe AI workloads
When your AI application breaks, nobody knows where to look.
Causely answers cause vs. symptom before your engineers open a single dashboard.
Connect business outcomes
The business sees the drop before engineering knows what caused it.
Causely connects business metrics to the infrastructure causing them.