Features
- In-process OLAP database — no server needed
- Directly queries CSV, Parquet, JSON, and S3 files
- Columnar storage with vectorized execution
- Full SQL support with window functions and CTEs
Pros
- Blazing fast analytical queries on local data
- Zero setup — embeds into any application
- Query files directly without importing
Cons
- Not designed for concurrent write-heavy OLTP workloads
- Single-node only — no distributed mode
- Newer project with a smaller, though growing, ecosystem
Overview
DuckDB is an in-process analytical database management system, often described as “SQLite for analytics.” It’s designed for fast analytical queries on local data, using columnar storage and vectorized execution to process large datasets efficiently without a separate server process.
DuckDB can directly query CSV, Parquet, JSON, and even remote files on S3 without importing them first. This makes it invaluable for data analysis, ETL pipelines, and any scenario where you need to quickly analyze large datasets locally.
When to Use
DuckDB is ideal for data analysis, ETL pipelines, local analytics dashboards, and any workload involving analytical queries on structured data. It’s perfect when you need SQLite-like simplicity but for analytical (OLAP) rather than transactional (OLTP) workloads.
Getting Started
```shell
npm install duckdb
```

```javascript
import duckdb from "duckdb";

// ":memory:" keeps the database in RAM; pass a file path to persist it.
const db = new duckdb.Database(":memory:");

// Query the CSV file in place -- DuckDB infers the schema automatically.
db.all(
  "SELECT * FROM read_csv_auto('data.csv') WHERE amount > 100",
  (err, rows) => {
    if (err) throw err;
    /* process rows */
  }
);
```