DuckDB is the closest thing the analytics world has to SQLite. It runs in-process, needs no server, reads and writes a single file, and chews through columnar aggregate queries that would make a row-store sweat. PHP has shipped PDO_SQLite in core for twenty years. Until now it had no equivalent for DuckDB.
So I wrote one. pdo_duckdb is a native PDO driver. You open a DuckDB database with a DSN, prepare statements, bind parameters, and iterate results through the same PDO API you already use for SQLite, MySQL, and PostgreSQL.
$db = new PDO('duckdb:/path/to/analytics.duckdb');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$stmt = $db->prepare('SELECT region, SUM(amount) AS total FROM sales WHERE year = ? GROUP BY region');
$stmt->execute([2026]);
foreach ($stmt as $row) {
printf("%s: %s\n", $row['region'], $row['total']);
}
This is also the first PDO driver I've written in over fifteen years. I was one of the original authors of PDO back in the PHP 5.1 days, so coming back to that extension surface was equal parts muscle memory and culture shock.
You can already reach DuckDB from PHP today. It has a C API you can drive through FFI, and there's always shelling out to the CLI. Both work. Both also hand you a brand-new API surface, manual lifetime management, and a data layer that looks nothing like the rest of your application.
PDO is the abstraction PHP developers already have in their fingers. A PDO driver means prepared statements, parameter binding, the exception error mode, fetch modes, and transactions all behave the way they do for every other PDO backend. Your repository classes and query builders transfer unchanged. The driver is written in C against DuckDB's C API, so there is no FFI boundary at runtime and no per-call marshalling happening up in PHP.
To be fair to FFI, it has its place. If you need a DuckDB capability the PDO surface doesn't expose, FFI hands you the entire C API and gets out of the way. pdo_duckdb makes the opposite trade on purpose: less breadth, in exchange for fitting into code you've already written.
DuckDB is an in-process analytical (OLAP) database. The relationship to ClickHouse or Snowflake is roughly what SQLite is to Postgres: same SQL you know, no server to run, embedded directly in your process. It is columnar and vectorized, built for scans and aggregations over a lot of rows rather than single-row OLTP traffic. The DSN tells the driver which database to open:
duckdb:/path/to/database.duckdb # file-backed database
duckdb::memory: # in-memory database
duckdb: # in-memory database (empty path)
An in-memory database is scratch space that vanishes when the connection closes; a file-backed one persists to disk. DuckDB's own extensions load through ordinary SQL, with no special driver API to learn:
$db->exec('LOAD json'); // bundled extensions load offline
$db->exec('INSTALL httpfs; LOAD httpfs;'); // downloadable extensions
Reading Parquet, querying a CSV in place, or pulling a file over HTTP is a LOAD away, all through the same $db->exec() you would use for any statement. The driver doesn't wrap or re-expose any of it. DuckDB's SQL surface is the API.
Row-by-row INSERT is the wrong way to push a lot of data into a columnar store, and DuckDB knows it. The driver exposes DuckDB's native appender through PDO::duckdbAppender(), which streams rows into a table far faster than a loop of prepared INSERT statements.
$db->exec('CREATE TABLE events (id INTEGER, name VARCHAR, ts TIMESTAMP)');
$app = $db->duckdbAppender('events'); // optional 2nd arg: schema name
foreach ($rows as $r) {
$app->appendRow($r['id'], $r['name'], $r['ts']);
}
$app->flush(); // or $app->close() to finalize
appendRow(...$values) takes one argument per column, left to right, and returns the appender so calls can chain. PHP null, bool, int, float, and string map to DuckDB values and get cast to the target column types. This is the one place the driver steps outside the standard PDO surface, and it earns the exception. Bulk ingestion is exactly where the row-at-a-time PDO model leaves the most performance on the table, so it's the one spot worth a driver-specific method.
This is the honest section. DuckDB is not MySQL, and a faithful driver has to refuse a few things rather than fake them.
lastInsertId() is not supported. DuckDB has no implicit rowid, so there is nothing honest to return. If you need generated keys, use a sequence and currval().
setAttribute(PDO::ATTR_AUTOCOMMIT, false) is rejected. DuckDB is autocommit-by-default with no session-level toggle, so instead of pretending otherwise, the driver points you at beginTransaction() for explicit transactions. From there commit() and rollBack() map onto DuckDB's COMMIT and ROLLBACK.
Placeholders get rewritten. Positional ? and named :name parameters are both supported, and PDO rewrites them to DuckDB's native $N form. That creates one sharp edge: because : introduces a named placeholder, an inline STRUCT or MAP literal inside a prepared query needs a space after the colon. Write {'k': 1}, not {'k':1}, or the parser reads :1} as a parameter and gets confused.
Types come back predictably, with one rule worth internalizing. Integers up to 64-bit signed arrive as PHP int, FLOAT and DOUBLE as float, and BLOB as a binary string. Everything else, VARCHAR, the DATE and TIME and TIMESTAMP family, DECIMAL, the wide HUGEINT and UBIGINT integers, and DuckDB's nested types, comes back in its canonical string form. That keeps values lossless when they wouldn't fit a native PHP scalar, at the cost of a cast on your side when you want a number out of a HUGEINT column.
Results are materialized, for now. execute() buffers the full result set in memory before PDO begins fetching. For the analytical queries DuckDB is built for that is usually fine, but a SELECT that returns millions of rows is bounded by available memory rather than streamed row by row. True streaming through DuckDB's pending-result API is the next thing on the list.
One behavior goes the other way, in your favor. When open_basedir is set, the driver disables DuckDB's SQL-level external file access: read_csv, COPY, ATTACH, httpfs, and the rest. Without that, a single COPY ... TO '/somewhere/outside' would punch straight through the sandbox open_basedir is supposed to enforce. The restriction holds at the SQL layer, not just for the path of the database file itself.
The PDO driver model itself has barely moved since I last touched it. You fill in a handful of method tables, hang your connection state off the driver object, and PDO drives the rest. What changed is everything around it.
On PHP 8.4 and up, PDO::connect('duckdb:...') returns a Pdo\Duckdb subclass instance, and duckdbAppender() lives on that subclass. With new PDO('duckdb:...'), and on PHP 8.1 through 8.3, the method is reachable on the base object directly. PHP 8.5 starts emitting a deprecation when you call a driver-specific method on the base PDO class, so on 8.4 and newer you'll want PDO::connect(). The driver supports both paths; the subclass is just the cleaner one going forward.
The part that actually cost me a release was the link line. DuckDB ships as a C++ library with a C API bolted on top. The driver is C. The prebuilt Linux binary in 0.2.0 loaded fine on my machine and then died on a clean host with undefined symbol: _ZTVN10__cxxabiv120__function_type_infoE, the vtable for a C++ ABI type that was supposed to be baked into the binary. The bundled DuckDB C++ runtime was never getting statically linked, because the gcc C driver silently ignores -static-libstdc++, a flag it treats as g++-only. So the module quietly depended on a libstdc++ the target box didn't ship. 0.2.1 links the static libstdc++ and libgcc_eh in explicitly, and the Linux binary is genuinely self-contained now. macOS and Windows were never affected, which is its own special way to lose an afternoon chasing the wrong platform.
Distribution is the other thing that improved while I was away. On Linux x86_64 and arm64, macOS on Apple Silicon, and Windows x64, PIE downloads a self-contained prebuilt binary with no DuckDB install and no build toolchain required:
pie install iliaal/pdo_duckdb
On anything else it falls back to a source build that needs libduckdb and duckdb.h. The last time I shipped a PDO driver, this would have been a PECL package and a paragraph of phpize instructions for the reader to follow. Prebuilt binaries as the default install path is the single biggest quality-of-life change in PHP extension distribution since I started writing them.
pdo_duckdb is an early release, and I'd rather say so than oversell it. The query path, the appender, parameter binding, transactions, and type decoding all work and are covered by tests. Streaming is the main thing still on the bench, and the type-mapping surface will keep growing as people push real workloads through it. I haven't published benchmarks, because the speed here is DuckDB's and the driver's job is to stay out of its way.
If you've wanted DuckDB's analytical engine reachable from PHP without learning a new client API or hand-rolling FFI, this is that, sitting behind the PDO surface you already know.
pie install iliaal/pdo_duckdb