If you've ever wondered why your observability bill keeps climbing while query performance stays sluggish, you're not alone. Traditional observability solutions built on Elasticsearch, Loki, or time-series databases like InfluxDB are reaching their limits. Enter ClickHouse—the columnar database that's quietly revolutionizing how we store and query observability data.
At Qorrelate, we chose ClickHouse as our foundation for one simple reason: it's built for exactly the type of analytical workloads that observability demands. Let's dive into why.
The Observability Data Challenge
Modern observability generates staggering amounts of data:
- Logs: Thousands to millions of events per second, each with timestamps, severity, messages, and arbitrary key-value attributes
- Metrics: High-cardinality time-series data with labels, requiring precise aggregations over time windows
- Traces: Distributed spans with parent-child relationships, service names, durations, and custom attributes
This data shares common characteristics that make traditional databases struggle:
- Write-heavy: Constant streaming ingestion with minimal updates
- Time-based: Nearly all queries filter by time range
- Analytical: Aggregations (count, sum, avg, percentiles) far outnumber point lookups
- Wide and sparse: Many columns, but queries typically touch only a few
ClickHouse was designed from the ground up to handle exactly these patterns.
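To make these patterns concrete, here is a minimal sketch of a logs table shaped for this kind of workload. The table and column names are illustrative, not a prescribed schema; the Map column is one common way to hold arbitrary key-value attributes:

```sql
-- Illustrative logs table: time-ordered, wide, with a Map for arbitrary attributes
CREATE TABLE logs (
    timestamp   DateTime64(9),
    service     LowCardinality(String),
    level       LowCardinality(String),
    message     String,
    attributes  Map(String, String)   -- arbitrary per-event key-value pairs
) ENGINE = MergeTree()
PARTITION BY toDate(timestamp)
ORDER BY (service, level, timestamp);
```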
1. Columnar Storage: Read Only What You Need
In a traditional row-based database like PostgreSQL or MySQL, data is stored row by row:
| timestamp | service | level | message |
|-----------|---------|-------|-------------------|
| 10:00:01 | api | INFO | Request started |
| 10:00:02 | api | ERROR | Connection failed |
| 10:00:03 | web | INFO | Page rendered |
When you run `SELECT count(*) FROM logs WHERE level = 'ERROR'`, the database reads every column of every row, even though you only care about the `level` column.
ClickHouse stores data by column:
```
timestamp: [10:00:01, 10:00:02, 10:00:03]
service:   [api, api, web]
level:     [INFO, ERROR, INFO]
message:   [Request started, Connection failed, Page rendered]
```
Now the same query reads only the level column—typically 10-100x less I/O for analytical queries. This is the fundamental reason ClickHouse is faster.
A query scanning 1 billion log entries for error counts might read 50GB in Elasticsearch but only 500MB in ClickHouse—a 100x reduction in I/O.
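You can check this on your own queries: ClickHouse records how much data every query read in the built-in system.query_log table (enabled by default), so a quick look at recent entries shows the effect of column pruning directly:

```sql
-- How much data did recent queries actually scan?
SELECT
    query,
    read_rows,
    formatReadableSize(read_bytes) AS data_read
FROM system.query_log
WHERE type = 'QueryFinish'
ORDER BY event_time DESC
LIMIT 10;
```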
2. Exceptional Compression: Store More, Pay Less
Columnar storage enables dramatically better compression. When similar values are stored together (all timestamps, all service names, all log levels), compression algorithms can find patterns more effectively.
ClickHouse achieves 10-20x compression ratios on typical observability data:
| Data Type | Raw Size | ClickHouse Compressed | Ratio |
|---|---|---|---|
| Log messages | 100 GB | 8 GB | 12.5x |
| Metrics (numeric) | 100 GB | 5 GB | 20x |
| Trace spans | 100 GB | 7 GB | 14x |
Compare this to Elasticsearch's typical 1.5-3x compression, and you can see why ClickHouse-based solutions cost a fraction of traditional observability platforms.
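These ratios are easy to verify on your own data. The built-in system.parts table exposes compressed and uncompressed sizes per part, so a query along these lines reports the effective compression ratio for each table:

```sql
-- Effective compression ratio per table, from active data parts
SELECT
    table,
    formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed,
    formatReadableSize(sum(data_compressed_bytes))   AS compressed,
    round(sum(data_uncompressed_bytes) / sum(data_compressed_bytes), 1) AS ratio
FROM system.parts
WHERE active
GROUP BY table
ORDER BY ratio DESC;
```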
Specialized Compression Codecs
ClickHouse goes further with data-type-specific codecs:
- Delta encoding for timestamps: Instead of storing absolute values, store the difference between consecutive values
- DoubleDelta for monotonic sequences: Even better compression for regularly-spaced time series
- LZ4/ZSTD for string data: Fast decompression for log messages
- Gorilla for floating-point metrics: Specialized algorithm from Facebook's time-series database
```sql
CREATE TABLE metrics (
    timestamp   DateTime64(9) CODEC(DoubleDelta, LZ4),
    service     LowCardinality(String),
    metric_name LowCardinality(String),
    value       Float64 CODEC(Gorilla, LZ4)
) ENGINE = MergeTree()
ORDER BY (service, metric_name, timestamp);
```
3. Blazing-Fast Aggregations
Observability queries are almost always aggregations: "How many errors in the last hour?", "What's the 99th percentile latency?", "Which services have the most requests?"
ClickHouse is purpose-built for aggregations:
- Vectorized execution: Processes data in batches using SIMD instructions, scanning hundreds of millions of rows per second per core for simple aggregations
- Parallel processing: Automatically parallelizes queries across all CPU cores
- Approximate aggregations: Built-in functions like `uniq()` and `quantileTDigest()` trade a tiny amount of accuracy for massive speed gains
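As a quick illustration of what those functions look like in practice, here is a sketch that assumes a traces table with user_id and duration_ms columns (illustrative names, not a fixed schema):

```sql
-- Approximate distinct users and p99 latency per service over the last hour
SELECT
    service,
    uniq(user_id)                      AS approx_users,
    quantileTDigest(0.99)(duration_ms) AS approx_p99_ms
FROM traces
WHERE timestamp >= now() - INTERVAL 1 HOUR
GROUP BY service
ORDER BY approx_p99_ms DESC;
```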
Benchmark: Error Rate Query
Query: Count errors per service for the last 24 hours across 10 billion log entries.
| Database | Query Time | Data Scanned |
|---|---|---|
| Elasticsearch | 45 seconds | 2.1 TB |
| PostgreSQL | 12 minutes | 850 GB |
| ClickHouse | 0.8 seconds | 45 GB |
4. Real-Time Ingestion at Scale
Observability data doesn't wait. ClickHouse handles massive real-time ingestion without breaking a sweat:
- 1 million+ rows per second ingestion on modest hardware
- Async inserts: Batches incoming data automatically for optimal write performance
- Predictable I/O: ClickHouse's MergeTree engine writes immutable parts and merges them in the background, with simpler and more predictable write amplification than multi-level LSM-tree compaction
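For example, server-side batching can be requested per insert via the async_insert setting. The sketch below assumes the illustrative logs schema from earlier:

```sql
-- Let the server batch many small inserts into larger parts
INSERT INTO logs SETTINGS async_insert = 1, wait_for_async_insert = 0
VALUES (now64(9), 'api', 'INFO', 'Request started', map('route', '/users'));
```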
At Qorrelate, we ingest OpenTelemetry data (OTLP) directly into ClickHouse with sub-second latency from ingestion to queryability.
```sql
-- Logs are immediately queryable after ingestion
SELECT
    toStartOfMinute(timestamp) AS minute,
    count() AS log_count,
    countIf(level = 'ERROR') AS error_count
FROM logs
WHERE timestamp >= now() - INTERVAL 5 MINUTE
GROUP BY minute
ORDER BY minute;
```
5. Materialized Views for Pre-Aggregation
For dashboards that need instant response times, ClickHouse's materialized views pre-compute aggregations as data arrives:
```sql
-- Pre-aggregate error rates per minute
CREATE MATERIALIZED VIEW error_rates_mv
ENGINE = SummingMergeTree()
ORDER BY (service, minute)
AS SELECT
    service,
    toStartOfMinute(timestamp) AS minute,
    count() AS total_count,
    countIf(level = 'ERROR') AS error_count
FROM logs
GROUP BY service, minute;
```
Now your dashboard queries hit the pre-aggregated table, returning in milliseconds regardless of raw data volume.
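A dashboard query against that view might look like the sketch below. Note the sum() around the counters: SummingMergeTree collapses rows during background merges, so queries should still re-aggregate:

```sql
-- Error rate per service over the last hour, served from the pre-aggregated view
SELECT
    service,
    minute,
    sum(error_count) / sum(total_count) AS error_rate
FROM error_rates_mv
WHERE minute >= now() - INTERVAL 1 HOUR
GROUP BY service, minute
ORDER BY minute;
```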
6. Native Time-Series Functions
ClickHouse includes powerful functions specifically for time-series analysis:
```sql
-- Rate of change over time
SELECT
    toStartOfHour(timestamp) AS hour,
    service,
    count() AS requests,
    requests / 3600 AS rps
FROM traces
WHERE timestamp >= now() - INTERVAL 24 HOUR
GROUP BY hour, service;

-- Moving averages
SELECT
    timestamp,
    value,
    avg(value) OVER (ORDER BY timestamp ROWS BETWEEN 10 PRECEDING AND CURRENT ROW) AS moving_avg
FROM metrics
WHERE metric_name = 'cpu_usage';

-- Percentiles for latency analysis
SELECT
    service,
    quantile(0.50)(duration_ms) AS p50,
    quantile(0.95)(duration_ms) AS p95,
    quantile(0.99)(duration_ms) AS p99
FROM traces
GROUP BY service;
```
7. The Cost Advantage
Let's talk numbers. Here's a realistic comparison for a mid-sized observability deployment:
| Metric | Datadog/New Relic | Self-Hosted Elastic | Qorrelate (ClickHouse) |
|---|---|---|---|
| Monthly data volume | 10 TB | 10 TB | 10 TB |
| Storage required | N/A (SaaS) | 5 TB | 500 GB |
| Monthly cost | $25,000+ | $3,000 | $300 |
The 10-100x cost difference isn't marketing—it's the direct result of ClickHouse's architectural advantages in compression and query efficiency.
ClickHouse vs. Alternatives
vs. Elasticsearch
Elasticsearch is built around inverted indexes optimized for full-text search. While great for searching log messages, it's inefficient for:
- Aggregations (counts, sums, percentiles)
- Time-range filtering at scale
- Storage (indexes are typically 3-5x larger than equivalent ClickHouse tables)
vs. Prometheus/InfluxDB
Time-series databases excel at metrics but struggle with:
- Log storage and search
- Trace data with complex nested structures
- High cardinality (millions of unique label combinations)
vs. Loki
Grafana Loki is cost-effective but limited:
- Minimal indexing—slow for filtered queries
- No native metrics or traces support
- Query language (LogQL) is less powerful than SQL
Why ClickHouse Wins
ClickHouse provides a unified foundation for all three pillars of observability:
- Logs: Full-text search with `LIKE` and regex, plus fast aggregations (see the query sketch after this list)
- Metrics: Native time-series functions with excellent high-cardinality support
- Traces: Flexible schema for span attributes with fast correlation queries
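To give a flavor of the first point, here is a sketch of a filtered log search that combines ILIKE with a regular expression, again using the illustrative logs schema from earlier:

```sql
-- Recent logs mentioning timeouts, narrowed further with a regex on the message
SELECT timestamp, service, message
FROM logs
WHERE timestamp >= now() - INTERVAL 1 HOUR
  AND message ILIKE '%timeout%'
  AND match(message, 'connection (reset|refused|timed out)')
ORDER BY timestamp DESC
LIMIT 100;
```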
How Qorrelate Leverages ClickHouse
At Qorrelate, we've built our entire observability platform on ClickHouse:
- OpenTelemetry-native ingestion: OTLP data flows directly into optimized ClickHouse tables
- Unified data model: Logs, metrics, and traces in one database with consistent query patterns
- Real-time dashboards: Sub-second queries even across billions of events
- Cost transparency: ClickHouse efficiency means we can offer 10-100x lower prices than legacy vendors
Get started with Qorrelate in under 60 seconds. Install our CLI, point your OpenTelemetry data at us, and experience the ClickHouse difference.
```bash
# Install Qorrelate CLI
curl -sL https://install.qorrelate.io | sh

# Initialize and run with auto-instrumentation
qorrelate init --token YOUR_API_KEY
qorrelate run python app.py
```
Conclusion
ClickHouse isn't just another database—it's a paradigm shift for observability. Its columnar architecture, exceptional compression, and blazing-fast aggregations make it the ideal foundation for modern monitoring platforms.
If you're tired of:
- Expensive observability bills that grow faster than your infrastructure
- Slow queries that make debugging painful
- Siloed tools for logs, metrics, and traces
It's time to experience what a ClickHouse-powered observability platform can do. Get started with Qorrelate today.
Have questions about ClickHouse or observability? Reach out to us or check our documentation.