DuckDB

Fast in-process analytical database

Features

  • In-process OLAP database — no server needed
  • Directly queries CSV, Parquet, and JSON files, including remote files on S3
  • Columnar storage with vectorized execution
  • Full SQL support with window functions and CTEs

Pros

  • Fast analytical queries on local data, thanks to columnar storage and vectorized execution
  • Zero setup — embeds into any application
  • Query files directly without importing

Cons

  • Not designed for concurrent write-heavy OLTP workloads
  • Single-node only — no distributed mode
  • Newer project whose ecosystem is growing but still smaller than those of established databases

Overview

DuckDB is an in-process analytical database management system, often described as “SQLite for analytics.” It’s designed for fast analytical queries on local data, using columnar storage and vectorized execution to process large datasets efficiently without a separate server process.

DuckDB can directly query CSV, Parquet, JSON, and even remote files on S3 without importing them first. This makes it invaluable for data analysis, ETL pipelines, and any scenario where you need to quickly analyze large datasets locally.
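As a rough illustration of querying files in place, the sketch below maps a file path to the DuckDB table function that reads it. The helper name `scanExpression` and the file names are hypothetical and not part of DuckDB; `read_csv_auto`, `read_parquet`, and `read_json_auto` are DuckDB table functions. Reading `s3://` paths additionally assumes the `httpfs` extension is installed and loaded.

```javascript
// Hypothetical helper (not part of DuckDB): chooses a DuckDB table function
// from the file extension so a file can be queried without importing it.
const READERS = {
  ".csv": "read_csv_auto",
  ".parquet": "read_parquet",
  ".json": "read_json_auto",
};

function scanExpression(path) {
  const dot = path.lastIndexOf(".");
  const ext = dot === -1 ? "" : path.slice(dot).toLowerCase();
  const reader = READERS[ext];
  if (!reader) {
    throw new Error(`No reader known for "${path}"`);
  }
  // Double any single quotes so the path is a valid SQL string literal.
  const literal = path.replace(/'/g, "''");
  return `${reader}('${literal}')`;
}

// Example: build a query over a local Parquet file (file name is illustrative).
const sql = `SELECT COUNT(*) AS n FROM ${scanExpression("events.parquet")}`;
```

The resulting string can be passed to the database like any other query, so the same code path can serve CSV, Parquet, and JSON sources.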

When to Use

DuckDB suits data analysis, ETL pipelines, local analytics dashboards, and other workloads that run analytical queries over structured data. It fits best when you want SQLite-like simplicity for analytical (OLAP) rather than transactional (OLTP) workloads.
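To give a feel for the kind of analytical query this refers to, here is a sketch that combines a CTE with a window function to compute a per-customer running total. The file `orders.csv` and its columns (`customer_id`, `order_date`, `amount`) are assumptions made up for the example, and it assumes `order_date` is detected as a date.

```javascript
// Illustrative OLAP-style query: monthly totals per customer (CTE),
// then a running total over those months (window function).
// The file name and column names are hypothetical.
const RUNNING_TOTAL_SQL = `
WITH monthly AS (
  SELECT
    customer_id,
    date_trunc('month', order_date) AS month,
    SUM(amount) AS total
  FROM read_csv_auto('orders.csv')
  GROUP BY customer_id, date_trunc('month', order_date)
)
SELECT
  customer_id,
  month,
  total,
  SUM(total) OVER (PARTITION BY customer_id ORDER BY month) AS running_total
FROM monthly
ORDER BY customer_id, month
`;
```

Aggregations and window scans like this read a few columns across many rows, which is the access pattern columnar storage is built for.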

Getting Started

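npm install duckdb

import duckdb from "duckdb";

// Open an in-memory database; pass a file path instead to persist data.
const db = new duckdb.Database(":memory:");

// Query the CSV file in place; no import step is needed.
db.all(
  "SELECT * FROM read_csv_auto('data.csv') WHERE amount > 100",
  (err, rows) => {
    if (err) throw err;
    console.log(rows); // process rows
  }
);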
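The callback style above can be wrapped in a Promise for use with `async`/`await`. This is a minimal sketch, not part of the `duckdb` package: `queryAll` is a hypothetical helper that assumes the database object exposes `all(sql, ...params, callback)`, as the `Database` object in the snippet above does.

```javascript
// Hypothetical helper: wraps the callback-style `all` method in a Promise.
// Assumes `db.all(sql, ...params, callback)` calls back with (err, rows).
function queryAll(db, sql, ...params) {
  return new Promise((resolve, reject) => {
    db.all(sql, ...params, (err, rows) => {
      if (err) reject(err);
      else resolve(rows);
    });
  });
}

// Usage with the `db` created earlier (file and column names are illustrative):
// const rows = await queryAll(
//   db,
//   "SELECT * FROM read_csv_auto('data.csv') WHERE amount > ?",
//   100
// );
```

Passing values as parameters rather than concatenating them into the SQL string keeps user-supplied input out of the query text.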