Features
- Columnar storage with high compression ratios
- Vectorized query execution for analytical workloads
- Real-time data ingestion at millions of rows per second
- Distributed queries across clusters
Pros
- Among the fastest analytical databases for large-scale queries
- Handles petabytes of data efficiently
- Real-time ingestion and querying simultaneously
Cons
- Not suited for transactional (OLTP) workloads
- UPDATE and DELETE operations are expensive
- Operational complexity for cluster management
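The cost of UPDATE and DELETE comes from how ClickHouse applies them: as mutations, which are background operations that rewrite the affected data parts instead of changing rows in place. A minimal sketch, assuming a MergeTree table named `events` with `user_id` and `event` columns like the one created under Getting Started; the filter values are made up for illustration:

```sql
-- Mutations are asynchronous by default: each statement returns quickly
-- while the affected data parts are rewritten in the background.
ALTER TABLE events DELETE WHERE user_id = 42;
ALTER TABLE events UPDATE event = 'signup' WHERE event = 'sign_up';

-- Track progress; is_done = 1 once every affected part has been rewritten.
SELECT mutation_id, command, parts_to_do, is_done
FROM system.mutations
WHERE table = 'events';
```

Because whole parts are rewritten, batching changes into a few large mutations tends to be cheaper than issuing many small ones.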
Overview
ClickHouse is an open-source, column-oriented database management system designed for online analytical processing (OLAP). Originally developed at Yandex for web analytics, it can scan billions of rows and gigabytes of data per second on a single server for queries that suit its columnar design.
ClickHouse achieves its performance through columnar storage (reading only the columns a query needs), vectorized execution (processing column values in batches rather than row by row), and aggressive compression. It supports distributed deployments for horizontal scaling and integrates with tools like Kafka for real-time data pipelines.
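To make the storage and compression points concrete, the sketch below declares per-column codecs and then compares compressed and uncompressed sizes column by column. The table name `events_codec_demo` and the codec choices are illustrative assumptions, not recommendations; the size columns only become meaningful once the table holds data.

```sql
-- Hypothetical demo table; the name and codec choices are assumptions.
CREATE TABLE events_codec_demo (
    timestamp DateTime CODEC(Delta, ZSTD),  -- delta-encode timestamps, then compress
    user_id   UInt32   CODEC(ZSTD),
    event     LowCardinality(String)        -- dictionary-encode repeated strings
) ENGINE = MergeTree()
ORDER BY timestamp;

-- Each column is stored separately, so its on-disk size can be inspected on its own.
SELECT
    name,
    formatReadableSize(data_compressed_bytes)   AS compressed,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed
FROM system.columns
WHERE database = currentDatabase()
  AND table = 'events_codec_demo';
```

A query that selects only `user_id` reads only that column's files, which is why narrow aggregations over wide tables stay cheap.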
When to Use
ClickHouse is a strong fit for real-time analytics dashboards, log and event analysis, time-series data at scale, and any workload dominated by aggregations over billions of rows. For transactional workloads, a row-oriented database such as PostgreSQL or MySQL is a better fit.
Getting Started
# Start a single-node server (8123: HTTP interface, 9000: native protocol)
docker run -d --name clickhouse \
  -p 8123:8123 -p 9000:9000 \
  clickhouse/clickhouse-server

# Query over the HTTP interface
curl 'http://localhost:8123/' --data "SELECT 1"
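
-- Create a MergeTree table sorted by timestamp
-- (run in a SQL client, e.g. docker exec -it clickhouse clickhouse-client)
CREATE TABLE events (
    timestamp DateTime,
    user_id UInt32,
    event String
) ENGINE = MergeTree()
ORDER BY timestamp;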
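
With the table in place, a natural next step is to load a few rows and run an aggregation of the kind ClickHouse is built for. This is a minimal sketch against the `events` table above; the sample rows and the result column names (`hour`, `event_count`, `users`) are made up for illustration.

```sql
-- Insert a few sample rows (values are illustrative only).
INSERT INTO events (timestamp, user_id, event) VALUES
    ('2024-01-01 10:00:00', 1, 'page_view'),
    ('2024-01-01 10:05:00', 2, 'page_view'),
    ('2024-01-01 11:30:00', 1, 'click');

-- Count events and distinct users per hour and event type.
SELECT
    toStartOfHour(timestamp) AS hour,
    event,
    count()                  AS event_count,
    uniqExact(user_id)       AS users
FROM events
GROUP BY hour, event
ORDER BY hour, event;
```

In practice, inserts are usually sent in large batches rather than a few rows at a time, since each insert creates a new data part that is merged later.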