phpser: a fast, secure binary serializer for PHP cache workloads


I've reached for igbinary on nearly every PHP project I've shipped in the last decade. It's smaller and faster than PHP's native serialize(), it's stable, and it has been the obvious default for so long that reaching for it stopped being a decision.

So phpser started as curiosity, not a complaint. igbinary is good. Could a serializer built specifically for cache workloads do better?

I wanted two things from it. It should be fast on the shapes a cache actually holds, where a value is decoded far more often than it's encoded. And it should be safe to decode bytes from a store an attacker might reach, because unserialize() on untrusted input is one of PHP's oldest exploit primitives. igbinary gives you the speed; the safety you bolt on yourself. phpser builds in both.

On the shapes that matter for caches it encodes 10 to 70% faster than igbinary, and on every shape except the mixed rowsets it decodes 13 to 75% faster, with packed numeric data also about 65% smaller on the wire. Its signed mode refuses to decode any payload that wasn't produced with your key, so a poisoned cache entry never reaches the code that builds objects. The rest of this post is how it gets both.

Why a serializer built for caches?

Because igbinary optimizes for the general case, and a cache is not the general case, on two axes.

The first is the read/write asymmetry. A PHP cache pays decode cost on every single read. Encode happens once, when you write the value; decode happens every time anything reads it back. For a read-heavy cache that ratio is easily 100 to 1. igbinary, like most general serializers, balances the two sides. A cache serializer shouldn't.

Figure: encode runs once per write; decode runs on every read.

The second is trust. The thing reading those bytes back is often reading from redis, memcached, a file, or a cookie, any of which an attacker may be able to write to. A general serializer treats decode as a pure data operation. A cache serializer has to treat it as a trust boundary.

igbinary is still the right default for general use. I went looking for the specific shapes where a cache-focused design could pull ahead, and there are three that show up everywhere in real PHP backends:

Packed numeric arrays. range(0, 999), ID lists, analytics buckets, sensor readings.
Deep-nested structures. Trees, recursive config, nested document structures.
Same-class object batches. Laravel queue payloads, cached Eloquent models, any array of a few hundred identical-shape DTOs.

Designing the format for the reader

The performance half of phpser borrows an instinct from Rust's rkyv. rkyv's pitch is that deserialization should be nearly free, because the writer already laid the bytes out the way the reader needs them. You don't parse an rkyv archive so much as point at it.

phpser isn't zero-copy, and I want to be precise about that before the comparison runs away. PHP values are refcounted zvals with owned hashtables; you can't hand PHP a pointer into a cache buffer and call it an array. phpser does a real decode pass and builds real zvals. It's not rkyv.

What transferred is the instinct, not the mechanism. rkyv made me stop thinking about the wire format as a neutral container and start thinking about it as a set of instructions to the reader. If the writer knows something that saves the reader work, the writer should record it, even when that makes encoding a little more complex. Once you adopt that lens, a set of concrete decisions falls out of it.

A string dictionary, and an intern that survives decode

The honest starting point: a front-loaded string dictionary isn't novel. igbinary already deduplicates strings; it calls the feature compact_strings. Both serializers emit each distinct string once and reference it afterward, so the property name "created_at" repeated across a thousand cached rows costs one copy, not a thousand.

The dictionary isn't where the win is. The win is on the decode side, and it's the most direct application of the design-for-the-reader rule.

When phpser decodes a dictionary string the first time, it allocates a zend_string. Every later reference to that same dictionary index doesn't allocate; it bumps the refcount on the one already built. A thousand rows that all carry the key "user_id" produce exactly one string allocation and 999 refcount increments. PHP's own machinery is built for exactly this: zend_strings are shared by refcount throughout the engine, so phpser isn't fighting the runtime, it's leaning on it.

Figure: the dictionary is emitted once at the head; values reference it by varint index, and repeated strings reuse one shared zend_string by refcount.
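A rough userland sketch of the dictionary idea, under loud assumptions: phpser's real format is binary and implemented in C, and the function names (dict_encode, dict_decode) and the array-shaped "wire" value below are hypothetical stand-ins. It only illustrates the shape of the technique: each distinct key is stored once, rows refer to keys by index, and the decoder reuses the one string it holds per dictionary slot.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical illustration only. Encodes a list of associative rows as
 * a dictionary of distinct keys plus rows of [dictionary index, value].
 */
function dict_encode(array $rows): array
{
    $slots = []; // key string => dictionary index
    $dict  = []; // dictionary index => key string
    $body  = [];

    foreach ($rows as $row) {
        $encoded = [];
        foreach ($row as $key => $value) {
            $key = (string) $key;
            if (!isset($slots[$key])) {
                // First sighting: the key enters the dictionary once.
                $slots[$key] = count($dict);
                $dict[] = $key;
            }
            $encoded[] = [$slots[$key], $value];
        }
        $body[] = $encoded;
    }

    return ['dict' => $dict, 'body' => $body];
}

/** Rebuilds the rows; every row reuses the one key string held in the dictionary. */
function dict_decode(array $wire): array
{
    $dict = $wire['dict'];
    $rows = [];
    foreach ($wire['body'] as $encoded) {
        $row = [];
        foreach ($encoded as [$index, $value]) {
            $row[$dict[$index]] = $value;
        }
        $rows[] = $row;
    }
    return $rows;
}

// A thousand rows that all carry the same two keys.
$rows = [];
for ($i = 0; $i < 1000; $i++) {
    $rows[] = ['user_id' => $i, 'created_at' => 1700000000 + $i];
}

$wire = dict_encode($rows);
printf("distinct keys stored: %d\n", count($wire['dict']));
```

The dictionary ends up with two entries for two thousand key occurrences. The refcount sharing described above happens inside the engine and isn't visible from userland; the sketch shows only the indexing scheme.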

Fast to encode, too

Designing for the reader could have meant a slow writer. It doesn't, because of two encoder choices, and the result is that phpser encodes faster than igbinary on every shape I test.

The first is the intern cache. phpser keeps an open-addressed zend_string*-to-slot hash, grown without eviction. Before hashing a string's bytes, it checks pointer identity: PHP interns string literals, so the "id" in row 1 and the "id" in row 900 are usually the same pointer and resolve with no byte work at all. Just as important, a unique value string, a name, an email, a SKU, takes a single-probe miss instead of a linear scan. The per-value dedup lookup stays off the critical path even on payloads full of strings that never repeat.

The second is objects. Encoding a PHP object the obvious way calls get_properties, which materializes a properties hashtable even for a plain object whose layout is fixed and known. For a batch of a few hundred DTOs that's hundreds of throwaway hashtables. phpser serializes a plain object straight from its declared property slots and skips the hashtable, the way native serialize() does. PHP 8.4 lazy objects fall back to get_properties, because their initializer has to run first.

Tagged scalar runs, and building the array in place

Two more decisions, on the decode side, are what make the packed-numeric margins as large as they are.

The first is tagged scalar runs. igbinary encodes [1, 2, 3, ...] as a sequence of tagged values: a type tag and a varint, a thousand times over. phpser detects a uniform run and emits one PACKED_LONGS header plus the thousand integers as raw zigzag varints, no per-element tag. Decode becomes one tight loop with zero tag dispatch.

The second is building the hashtable in place. When the wire format says PACKED_LONGS of length N, the decoder knows the final size before it reads a single element. So it allocates the array once with zend_new_array(N) and writes the values directly into PHP 8's packed arPacked storage with ZVAL_* macros. That skips N calls to zend_hash_next_index_insert, and with them N hash computations, N capacity checks, and the incremental table growth that a naive decoder pays as it discovers the array's size one element at a time. The writer recorded the size so the reader could allocate once and fill, which is the rkyv instinct applied as far as a non-zero-copy format can take it.

Figure: a naive decoder hashes and grows the table per element; phpser allocates once from the header count and writes slots directly.
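A minimal sketch of a packed run of zigzag varints, written in userland PHP to make the byte arithmetic concrete. The tag value, the header layout (one tag byte followed by a varint count), and the function names are assumptions for illustration, not phpser's actual wire format. Userland PHP also can't preallocate packed array storage the way the C decoder does, so the decode side shows only the tag-free loop.

```php
<?php
declare(strict_types=1);

const PACKED_LONGS = 0x01; // hypothetical tag value

/** Zigzag maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ... so small magnitudes stay small. */
function zigzag_encode(int $n): int
{
    return ($n << 1) ^ ($n >> 63);
}

function zigzag_decode(int $z): int
{
    // Logical shift right by one (mask off the sign-extended bit), then undo the xor.
    return (($z >> 1) & PHP_INT_MAX) ^ -($z & 1);
}

/** LEB128-style varint: 7 payload bits per byte, high bit set on all but the last byte. */
function varint_encode(int $z): string
{
    $out = '';
    while (($z & ~0x7F) !== 0) {
        $out .= chr(($z & 0x7F) | 0x80);
        $z = ($z >> 7) & 0x01FFFFFFFFFFFFFF; // treat $z as unsigned
    }
    return $out . chr($z);
}

function varint_decode(string $buf, int &$pos): int
{
    $z = 0;
    $shift = 0;
    do {
        if ($pos >= strlen($buf)) {
            throw new UnexpectedValueException('truncated varint');
        }
        $byte = ord($buf[$pos++]);
        $z |= ($byte & 0x7F) << $shift;
        $shift += 7;
    } while ($byte & 0x80);
    return $z;
}

/** One header (tag + count), then the integers with no per-element tag. */
function pack_longs(array $values): string
{
    $out = chr(PACKED_LONGS) . varint_encode(count($values));
    foreach ($values as $v) {
        $out .= varint_encode(zigzag_encode($v));
    }
    return $out;
}

function unpack_longs(string $buf): array
{
    if ($buf === '' || ord($buf[0]) !== PACKED_LONGS) {
        throw new UnexpectedValueException('not a packed run');
    }
    $pos = 1;
    // The count is known before the first element is read, which is what
    // lets a C decoder size the array once instead of growing it.
    $count = varint_decode($buf, $pos);
    $values = [];
    for ($i = 0; $i < $count; $i++) {
        $values[] = zigzag_decode(varint_decode($buf, $pos));
    }
    return $values;
}

$buf = pack_longs(range(0, 999));
printf("range(0, 999): %d bytes\n", strlen($buf));
```

With this hypothetical 3-byte header the run comes to 1,939 bytes, of which the integers account for 1,936: values 0 to 63 take one byte each after zigzag, and 64 to 999 take two. That is in line with the 1,941 B reported for packed_1k in the benchmark table, assuming the remaining bytes are phpser's own framing.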

The benchmarks

Here is the full shape-by-shape comparison, run on my machine, against igbinary. The bench harness (bench.php in the repo) round-trips every shape for correctness first, then times encode and decode separately, because decode is the number that matters for a cache.

Methodology: phpser 0.1.2, PHP 8.4.22-dev NTS, release build (not a debug or ASan build, which would inflate everything 2 to 5x), igbinary 3.2.17RC1, Intel Core i9-13950HX. 1,000 iterations per shape, median of 9 runs after a discarded warm-up.
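For readers who want to reproduce the shape of that methodology without the repo, here is a minimal timing sketch. It is not the bench.php harness; the function names are hypothetical, and native serialize() stands in for the serializers under test so the script runs without any extension installed. It follows the same outline: round-trip for correctness, one discarded warm-up, then the median of 9 runs of 1,000 iterations, with encode and decode timed separately.

```php
<?php
declare(strict_types=1);

/** Median of a non-empty list of numbers. */
function median(array $xs): float
{
    sort($xs);
    $n   = count($xs);
    $mid = intdiv($n, 2);
    return $n % 2 ? (float) $xs[$mid] : ($xs[$mid - 1] + $xs[$mid]) / 2;
}

/**
 * Runs $fn in batches of $iterations calls. The first batch is a
 * discarded warm-up; the median per-call time of the remaining
 * $runs batches is returned in microseconds.
 */
function bench(callable $fn, int $iterations = 1000, int $runs = 9): float
{
    $samples = [];
    for ($run = -1; $run < $runs; $run++) {
        $start = hrtime(true); // monotonic clock, nanoseconds
        for ($i = 0; $i < $iterations; $i++) {
            $fn();
        }
        $elapsedNs = hrtime(true) - $start;
        if ($run >= 0) { // run -1 is the warm-up
            $samples[] = $elapsedNs / $iterations / 1000;
        }
    }
    return median($samples);
}

$value = range(0, 999);
$blob  = serialize($value);

// Correctness first: a fast serializer that doesn't round-trip isn't worth timing.
if (unserialize($blob, ['allowed_classes' => false]) !== $value) {
    throw new RuntimeException('round-trip mismatch');
}

$encodeUs = bench(fn () => serialize($value));
$decodeUs = bench(fn () => unserialize($blob, ['allowed_classes' => false]));
printf("encode: %.2f us, decode: %.2f us\n", $encodeUs, $decodeUs);
```

Swapping the two closures for another serializer's encode and decode calls gives a like-for-like comparison on the same machine; absolute numbers will differ from the table below.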

Shape	Size: igbinary → phpser	Encode: igbinary → phpser	Decode: igbinary → phpser
packed_1k	5,495 → 1,941 B (-65%)	4.6 → 1.4 µs (-70%)	7.3 → 1.8 µs (-75%)
packed_10k	59,495 → 21,749 B (-63%)	46.4 → 13.7 µs (-70%)	74.0 → 18.9 µs (-74%)
deep_50	419 → 424 B (+1%)	1.3 → 0.62 µs (-54%)	1.8 → 1.6 µs (-15%)
dto_100	7,083 → 6,362 B (-10%)	15.5 → 13.9 µs (-10%)	26.9 → 23.5 µs (-13%)
dto_1000	73,372 → 64,863 B (-12%)	194 → 165 µs (-15%)	275 → 227 µs (-18%)
rowset_100	4,570 → 4,771 B (+4%)	10.0 → 7.3 µs (-27%)	10.7 → 10.8 µs (+1%)
rowset_1000	47,459 → 47,972 B (+1%)	157 → 71 µs (-55%)	104 → 107 µs (+4%)
dto_mixed	21,644 → 17,927 B (-17%)	58.8 → 39.8 µs (-32%)	112 → 81 µs (-28%)

The packed rows are the ones that jump out: roughly two-thirds smaller and three-quarters faster to decode, on a real shape, not a synthetic micro-case. packed_1k is range(0, 999), which is what an ID list or an analytics bucket looks like.

The DTO rows are the relatable ones. dto_1000 is a thousand small typed objects of one class, the shape a Laravel queue batch or a page of cached models actually has. 12% smaller, 18% faster to decode, 15% faster to encode, from the dictionary dedup on property names and a class-entry lookup cache that amortizes zend_lookup_class_ex across the batch. Encode is faster than igbinary on every row; outside the packed arrays, the largest margins are on the mixed rowset_1000 (55% faster), deep_50 (54% faster), and the object-heavy dto_mixed (32% faster, 17% smaller).

Where it gives a little back

The one shape where phpser loses is the mixed associative rowset. rowset_1000 decodes about 4% slower than igbinary (rowset_100 about 1% slower), and the rowset payloads run 1 to 4% larger. That's the front-loaded dictionary showing its one downside: the decoder walks the dictionary header before it touches values, and on a heterogeneous rowset with few repeated strings that header walk doesn't buy back its cost. It's a small tax, and it's on the exact axis I chose to de-prioritize, but it's real and measured, so there it is.

The structural limit is the same decision seen from another angle: phpser isn't streamable. The dictionary lives at the head of the payload and values reference it by index, so you can't decode the stream incrementally as it arrives. The front-loaded dictionary is what makes the other decodes fast and what makes streaming impossible. You don't get to keep both. If you need a streaming parser, this is the wrong format.

I also cross-checked the whole suite on arm64 to make sure none of this was an x86 quirk. Same direction on every shape, with narrower encode margins on the object cases. The decode wins and the single rowset_1000 decode tax both reproduce.
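To weigh these rows against the read/write asymmetry from earlier, a small worked calculation helps. The sketch below takes the encode and decode times from the x86 table as given and assumes a simple cost model of one write followed by 100 reads; the model and the function names are illustrative assumptions, not something the bench harness computes.

```php
<?php
declare(strict_types=1);

/** Serializer time in microseconds for one write followed by $reads reads. */
function lifetime_cost(float $encodeUs, float $decodeUs, int $reads): float
{
    return $encodeUs + $reads * $decodeUs;
}

/** Relative change of $candidate against $baseline; negative means less time. */
function relative_change(float $baseline, float $candidate): float
{
    return ($candidate - $baseline) / $baseline;
}

// [encode us, decode us], copied from the benchmark table.
$shapes = [
    'dto_1000'    => ['igbinary' => [194.0, 275.0], 'phpser' => [165.0, 227.0]],
    'rowset_1000' => ['igbinary' => [157.0, 104.0], 'phpser' => [71.0, 107.0]],
];

$reads   = 100; // the 100:1 read/write ratio
$changes = [];
foreach ($shapes as $name => $row) {
    $igbinary = lifetime_cost($row['igbinary'][0], $row['igbinary'][1], $reads);
    $phpser   = lifetime_cost($row['phpser'][0], $row['phpser'][1], $reads);
    $changes[$name] = relative_change($igbinary, $phpser);
    printf("%-12s %+.1f%%\n", $name, 100 * $changes[$name]);
}
```

Under this model dto_1000 spends about 17% less time in the serializer, while rowset_1000 spends about 2% more: at 100 reads per write, the 55% encode win is outweighed by the small decode tax. With the table's rounded figures the rowset break-even sits near 29 reads per write (86 µs saved on encode against 3 µs lost per decode), so the tax only bites on entries that are read many times.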

Signed payloads: safe to decode from an untrusted cache

The performance half is only one reason a cache serializer is its own problem. The other is that decoding attacker-controlled bytes is dangerous. Native unserialize() on untrusted input lets a crafted payload instantiate any allowed class and drive its __wakeup, __destruct, or other magic methods into a state the code never anticipated. That's the mechanism behind object-injection and gadget-chain attacks, and a cache is the soft spot: a redis instance, a memcached pool, a file cache, or a cookie is exactly the kind of store an attacker reaches in a real incident, and whatever sits there gets decoded on the next read.

phpser's answer is a signed mode built on HMAC-SHA256. You serialize with a secret key, and you refuse to decode anything that wasn't signed with that same key. Verification is constant-time and runs before any decoding work, so a tampered or foreign-keyed payload never reaches the part of the decoder that builds values or constructs objects.

Figure: native unserialize decodes attacker bytes before checking anything; the signed path verifies the HMAC first and returns null on mismatch, so nothing is decoded.

$key = random_bytes(32);            // generate once, keep it in app config or a secrets manager

// on write
$blob = phpser_serialize_signed($cacheValue, $key);
$redis->set('user:42', $blob);

// on read
$blob  = $redis->get('user:42');
$value = phpser_unserialize_signed($blob, $key);

if ($value === null) {
    // tampered, truncated, or signed with a different key.
    // nothing was decoded; treat it as a cache miss and rebuild.
    $value = rebuild_user(42);
}

The contract is deliberately blunt. phpser_unserialize_signed() returns null on any signature failure rather than throwing, so a poisoned cache entry degrades to a miss instead of an exception in a hot read path. The decode only proceeds once the MAC matches. This is authentication, not encryption: the bytes are still readable, but they can't be forged without the key, and that's the property that keeps a crafted object graph out of your decoder.

Signed mode also refuses an empty key. An empty key would reduce HMAC-SHA256 to a fixed, keyless tag anyone can recompute, so a caller writing phpser_serialize_signed($v, getenv('SECRET') ?: '') with the variable unset would be shipping forgeable payloads without knowing it. Both signed entry points throw on an empty key before doing any work, so that misconfiguration fails loudly instead of silently defeating the signature.
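The verify-before-decode contract can be sketched in plain PHP with the standard hash_hmac() and hash_equals() functions. This is a conceptual model only: the function names are hypothetical, and putting the 32-byte tag in front of the payload is an assumption made for the sketch, not a description of phpser's actual byte layout.

```php
<?php
declare(strict_types=1);

const TAG_LEN = 32; // raw HMAC-SHA256 output size in bytes

/** Prepends an HMAC-SHA256 tag to $payload. Refuses an empty key. */
function sign_blob(string $payload, string $key): string
{
    if ($key === '') {
        throw new InvalidArgumentException('signing key must not be empty');
    }
    return hash_hmac('sha256', $payload, $key, true) . $payload;
}

/**
 * Returns the payload only if the tag matches; null otherwise.
 * Nothing downstream of this function sees unverified bytes.
 */
function verify_blob(string $blob, string $key): ?string
{
    if ($key === '') {
        throw new InvalidArgumentException('signing key must not be empty');
    }
    if (strlen($blob) < TAG_LEN) {
        return null; // too short to carry a tag at all
    }
    $tag      = substr($blob, 0, TAG_LEN);
    $payload  = substr($blob, TAG_LEN);
    $expected = hash_hmac('sha256', $payload, $key, true);

    // hash_equals() compares in constant time, so the comparison
    // doesn't leak how many leading bytes of a forged tag were right.
    return hash_equals($expected, $tag) ? $payload : null;
}

$key  = random_bytes(32);
$blob = sign_blob('{"id":42}', $key);

// Flip one bit in the last payload byte to simulate tampering.
$tampered = $blob;
$last = strlen($tampered) - 1;
$tampered[$last] = chr(ord($tampered[$last]) ^ 1);

var_dump(verify_blob($blob, $key));      // the original payload
var_dump(verify_blob($tampered, $key));  // NULL
```

The ordering is the point of the design: the MAC check sits in front of the decoder, so a mismatch ends the call before any value is built. The empty-key guard runs on both sides for the reason given above.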

If you genuinely can't sign, because you're decoding bytes from a source you don't control and can't key, the second line of defense is allowed_classes, with the same shape as PHP's native unserialize():

// reject every class: unknown objects decode as __PHP_Incomplete_Class, never instantiated
$value = phpser_unserialize($blob, ['allowed_classes' => false]);

// or allowlist only the classes you actually expect to read back
$value = phpser_unserialize($blob, ['allowed_classes' => [UserDto::class, OrderDto::class]]);

The same option works on phpser_unserialize_signed() too, so you can combine a valid signature with a class allowlist for defense in depth. Underneath both paths, the decoder is hardened on its own: a recursion depth cap of 512 bounds stack use against deliberately deep payloads (decode returns null, encode throws), and crafted payloads naming a missing enum case or a non-serializable class like Closure are rejected rather than crashing, matching what PHP's own unserialize() refuses.
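Since the option has the same shape as native unserialize(), its behavior can be demonstrated with native serialize() and unserialize() alone. The sketch below uses only built-in PHP; the two classes are hypothetical, and it assumes, as the text above states, that phpser treats disallowed classes the same way.

```php
<?php
declare(strict_types=1);

final class UserDto
{
    public function __construct(public int $id = 0) {}
}

/** Stand-in for a class whose magic methods an attacker would like to trigger. */
final class Gadget
{
    public function __destruct()
    {
        // imagine a file delete or a shell call here
    }
}

$blob = serialize([new UserDto(42), new Gadget()]);

// Reject every class: objects come back as inert __PHP_Incomplete_Class
// placeholders, so no constructor, __wakeup, or __destruct of the named class runs.
$none = unserialize($blob, ['allowed_classes' => false]);

// Allowlist: UserDto is rebuilt, Gadget stays a placeholder.
$some = unserialize($blob, ['allowed_classes' => [UserDto::class]]);

var_dump($none[0] instanceof __PHP_Incomplete_Class); // bool(true)
var_dump($some[0] instanceof UserDto);                // bool(true)
var_dump($some[1] instanceof __PHP_Incomplete_Class); // bool(true)
```

An explicit allowlist is usually the better of the two settings for a cache: the reader knows which DTOs it wrote, so anything else in the payload is by definition unexpected.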

What I took away

igbinary is still the serializer I'd reach for on a general-purpose workload, and I'll keep using it. It's mature, it's everywhere, and on a mixed rowset it's still a hair ahead on decode.

For a read-heavy cache, phpser gives me two things at once. It's faster on the shapes caches actually hold, encode and decode both, because the wire format is designed around the reader rather than balanced between reader and writer. And signed mode means I can decode from redis without treating every read as a potential injection. Speed and trust are the two things a cache serializer has to get right, and they're the two things a general serializer leaves half-finished. Building both in was the point.

pie install iliaal/phpser

Source, wire-format spec, and the bench harness: github.com/iliaal/phpser

phpser is the serialization slot in a set of native PHP extensions I maintain alongside it: php_excel, mdparser, php_clickhouse, fastchart, and fastjson.
