Quadratic Probing vs Separate Chaining
Two hash-collision strategies, one decision. Separate chaining wins on robustness; quadratic probing wins on cache locality at low load. Here's when each actually earns its place.
The short answer
Separate Chaining over Quadratic Probing for most cases. Separate chaining degrades gracefully and never needs a perfect-square table or a load factor below 0.5 to stay sane.
- Pick Quadratic Probing if control the load factor tightly (under ~0.5), keys are small/inline, and you want cache-friendly contiguous memory with no per-node pointer overhead
- Pick Separate Chaining if want a hash table that survives high load factors, deletions, and adversarial key distributions without rehashing gymnastics — i.e. almost always
- Also consider: Open addressing with robin-hood or hopscotch hashing beats both for raw speed if you'll invest the engineering; std::unordered_map (chaining) vs Abseil flat_hash_map (open addressing) is the real-world version of this fight.
— Nice Pick, opinionated tool recommendations
How they actually differ
Both resolve the moment two keys hash to the same bucket — they just disagree on where the loser goes. Separate chaining keeps a secondary structure (linked list, or a tree in Java 8+ HashMap once a bucket exceeds 8 entries) hanging off each slot; collisions append to the chain. Quadratic probing keeps everything in one flat array and, on collision, walks an offset sequence — i+1², i+2², i+3² — until it finds an empty slot. That single design choice cascades into everything else: memory layout, deletion semantics, load-factor tolerance, and worst-case behavior. Chaining tolerates load factors above 1.0 because chains just grow. Quadratic probing physically cannot exceed load factor 1.0, and degrades sharply well before that. If you remember one thing: chaining stores collisions elsewhere, probing stores them in the same array and prays for empty space.
Where quadratic probing earns its keep
Quadratic probing's whole pitch is cache locality. Everything lives in one contiguous array, so probing neighbors is mostly L1/L2 hits instead of chasing pointers across heap addresses like chaining does. At low load factors with small inline keys — integers, fixed structs — it's measurably faster, and it sidesteps the primary clustering that wrecks linear probing. No per-entry allocation, no pointer overhead, better prefetching. But the fine print is brutal: your table size should be prime (or a power of two with the right probe formula) or the probe sequence won't visit every slot, and insertion can fail even with free space. Deletion needs tombstones, which rot performance until you rehash. Push load factor past ~0.7 and probe chains explode. It's a scalpel that cuts you if you hold it wrong, which most people do.
Where separate chaining wins
Separate chaining is the strategy that refuses to fall over. Load factor above 1.0? Chains just get longer, performance degrades linearly, nothing breaks. Deletions are trivial — unlink a node, no tombstones, no rehash debt. Adversarial or clustered keys that would send a probe sequence into a death spiral merely produce a longer list, and modern implementations (Java's HashMap) convert pathological buckets to balanced trees for O(log n) worst case instead of O(n). The cost is real: every entry is a heap allocation with pointer overhead, and you pay cache misses traversing chains. That's why it loses microbenchmarks at low load. But it never surprises you in production, never demands a prime table size, never silently fails to insert. Boring, predictable, allocation-heavy, and correct. For 90% of code that just needs a working map, boring wins.
The decision rule
Pick separate chaining by default. It's the strategy that doesn't require you to be smart about load factors, table sizes, or deletion, and it's what the standard libraries you trust (Java HashMap, C++ std::unordered_map, Python dict's earlier designs) reached for precisely because it doesn't blow up. Reach for quadratic probing only when you've profiled, you control the load factor under 0.5, your keys are small and inline, and cache locality is the bottleneck — a hot inner-loop lookup table, not a general-purpose map. And honestly, if you've gone that far down the open-addressing road, skip quadratic probing's footguns and use robin-hood or a flat_hash_map. Quadratic probing lives in a narrow valley: fast enough to tempt you, fragile enough to punish you. Chaining is the answer until you can prove it isn't.
Quick Comparison
| Factor | Quadratic Probing | Separate Chaining |
|---|---|---|
| Cache locality | Excellent — single contiguous array, prefetch-friendly | Poor — pointer chasing across the heap |
| High load factor tolerance | Degrades sharply past ~0.7, fails at 1.0 | Survives load factor >1.0, degrades gracefully |
| Deletion handling | Needs tombstones; rots until rehash | Trivial unlink, no debt |
| Worst-case lookup | O(n) clustered probe chains | O(log n) with tree-bucket fallback (Java 8+) |
| Implementation footguns | Prime/size-sensitive probe sequence, insert can fail | Forgiving, hard to misimplement |
The Verdict
Use Quadratic Probing if: You control the load factor tightly (under ~0.5), keys are small/inline, and you want cache-friendly contiguous memory with no per-node pointer overhead.
Use Separate Chaining if: You want a hash table that survives high load factors, deletions, and adversarial key distributions without rehashing gymnastics — i.e. almost always.
Consider: Open addressing with robin-hood or hopscotch hashing beats both for raw speed if you'll invest the engineering; std::unordered_map (chaining) vs Abseil flat_hash_map (open addressing) is the real-world version of this fight.
Separate chaining degrades gracefully and never needs a perfect-square table or a load factor below 0.5 to stay sane. Quadratic probing buys cache locality but punishes you the moment the table fills or your modulus isn't prime — and most people misimplement the probe sequence. Chaining is the default that doesn't blow up in production.
Related Comparisons
Disagree? nice@nicepick.dev