Chapter 7 — Maps: Swiss Tables, Growth, and Permanent Memory
Go Black Belt, Part III: Data Structure Internals.
The map is Go's other everyday structure, and the one whose internals changed most recently: Go 1.24 replaced the built-in map with a Swiss Tables implementation — groups of eight slots, a control word probed in parallel, and no overflow buckets. This chapter is current to that runtime. We'll see how a lookup resolves, why a map never gives memory back no matter how much you delete, and what clear, pre-sizing, and delete actually do underneath. If you learned Go maps as "buckets with overflow chains," half of what you know is now wrong — and the retention lesson from Chapter 6 reappears, sharper.
You deleted a million entries and freed nothing
m := make(map[int]int)
for i := 0; i < 1_048_576; i++ { m[i] = i } // ~32 MiB of table
for k := range m { delete(m, k) } // delete every single key
// ... the ~32 MiB is still allocated
Run the retention test and watch it:
filled + 32 MiB
after delete + 32 MiB (retained)
after drop + 0 MiB
Every key is gone. The map's len is zero. And not one byte came back. delete removes entries; it does not shrink the table. Neither does clear. The only way to return a map's memory to the runtime is to stop referencing the whole map and let the garbage collector take it. This is the single most consequential fact about Go maps in long-lived services, and it's the same retention principle as Chapter 6's slices — a structure that grows with your traffic and never shrinks. The rest of the chapter is why, and what the new Swiss Tables runtime changed.
Before you start
What you need: Go 1.24 or newer — this chapter is about the Swiss Tables map introduced in 1.24 (captured against Go 1.25, linux/amd64). On Go ≤ 1.23 the internals described here do not apply. Benchmarks use testing.B.Loop. Confirm with go version.
The lab lives in lab/:
lab/
go.mod
map_lab.go # fill/delete/clear/lookup/iteration helpers
map_lab_test.go # retention + iteration assertions, benchmarks
Makefile # make iter | make retain | make bench | make versions | make golden
Reproduce everything:
cd lab
make iter # iteration start is randomized (go test -run TestIterationOrder -v)
make retain # delete/clear retain memory (go test -run TestMapRetention -v)
make bench # clear vs delete, pre-size, lookups (go test -bench=. -benchmem)
On the numbers. The iteration-randomization result is asserted (deterministic enough to gate). The retention figures are a runtime measurement — read the ratio (delete keeps ~all of it; dropping returns ~all of it), not the exact MiB. Timings and allocs/op are representative; the map's internal allocation counts in particular shift across Go versions.
The belief this chapter dismantles
Common belief: "A Go map is a hash table of buckets with overflow chains. delete removes an entry and frees its space; an empty map is a small map."
What actually happens (Go 1.24+): the map is a Swiss Table — an array of 8-slot groups fronted by a control word, with no overflow buckets at all. It grows by doubling and never shrinks. delete marks a slot empty; clear resets the control words; neither returns memory. An "empty" map that once held a million entries still owns the table for a million. The implementation changed in 1.24, but the retention behavior — grow, never shrink — did not.
The mental model: groups of eight, probed in parallel
A Swiss Table stores entries in groups of 8 slots. Each group is fronted by a 64-bit control word: one byte per slot, holding a 7-bit fingerprint of that slot's key hash plus a bit marking the slot empty, full, or deleted.
one group (8 slots)
control word (8 bytes, one per slot) slots (key/value pairs)
┌────┬────┬────┬────┬────┬────┬────┬────┐ ┌──────┬──────┬─────┬──────┐
│ h2 │ h2 │ EMP│ h2 │ h2 │ DEL│ h2 │ h2 │ │ k,v │ k,v │ ... │ k,v │
└────┴────┴────┴────┴────┴────┴────┴────┘ └──────┴──────┴─────┴──────┘
│ each byte = 7-bit fingerprint
│ + empty/full/deleted bit
▼
a single 64-bit compare matches the wanted fingerprint
against ALL 8 slots at once (SWAR / SIMD-style)
A lookup hashes the key, uses the high bits to pick a group, and compares the key's fingerprint against all eight control bytes in one machine operation. That yields a small mask of candidate slots; only those few are checked for a full key match. So a probe examines eight slots at the cost of roughly one comparison — that's where the speed comes from.
Four consequences define the structure:
No overflow buckets. The pre-1.24 map chained overflow buckets off full buckets; the Swiss Table has none. If a group is full, probing continues to the next group within the table. This removes a whole class of pointer-chasing.
Higher load factor. Swiss Tables run safely at ~87.5% (7/8) full, versus the old map's ~81.25% — the same entries fit in fewer slots, saving memory.
It grows, and never shrinks. When the table passes its load factor, the runtime allocates a larger table and rehashes into it (large maps use a directory of several tables so this can happen incrementally rather than in one stop-the-world copy). Capacity only ever increases. There is no "shrink" operation, which is exactly the cold open.
delete and clear only touch control bytes. delete flips a slot's control byte to empty/deleted; clear resets all the control words at once. Both leave the table — every group, every allocation — exactly as large as it was.
(For the authoritative description, see Go's own write-up: Faster Go maps with Swiss Tables.)
The tools
No compiler flag this chapter. The instruments are the runtime's memory accounting (as in Chapter 6) plus an optional cross-version comparison.
| Tool | Tells you |
|---|---|
runtime.ReadMemStats(&m) → m.HeapAlloc, after runtime.GC() |
how much memory a map retains before vs after delete/clear/drop |
runtime.KeepAlive(m) |
keeps the map reachable across a measurement so it isn't freed early |
go test -bench -benchmem |
ns/op and allocs/op for clear, delete, insert, lookup |
go install golang.org/dl/go1.23.12@latest then go1.23.12 test |
runs a benchmark on the pre-Swiss-Tables runtime, to compare (Experiment 3) |
The retention pattern is the one from Chapter 6: runtime.GC(), read HeapAlloc, do the operation, runtime.GC(), read again — the difference is what's retained.
Experiment 1 — delete on every key frees nothing
The cold open, asserted. TestMapRetention fills a map with N = 1<<20 entries, deletes them all, and measures HeapAlloc at each step:
go test -run TestMapRetention -v .
filled + 32 MiB
after delete + 32 MiB (retained)
after drop + 0 MiB
--- PASS: TestMapRetention
After deleting all N keys, the retained memory is unchanged — the test asserts it stayed above half the filled size, so this is a fact, not a fluke. delete set a million control bytes to "empty"; it did not deallocate a single group. A map's footprint tracks its historical high-water mark, not its current len. In a service whose maps grow during a traffic spike, that high-water mark is permanent until you rebuild.
Micro-claim 1 (proof type: trace). delete(m, k) on every key leaves the map's table memory fully allocated.
Pass condition: HeapAlloc after deleting all keys stays near the filled size; it does not drop.
Experiment 2 — clear is much faster than a delete loop, and just as permanent
If you must empty a map you intend to reuse, clear(m) beats a delete loop — it resets the control words in bulk instead of visiting every entry. But it frees nothing either.
go test -bench='Clear|DeleteLoop' -count=10 .
BenchmarkClear-8 24693 48.6 µs/op
BenchmarkDeleteLoop-8 788 1520 µs/op
clear is ~30× faster here — it touches O(groups) control words, while the delete loop does O(n) individual removals (each a hash, a probe, and a control-byte write). Both leave the table fully allocated; the benchmark refills the same map each round precisely because neither operation shrank it. So: use clear to empty-and-reuse a map cheaply, but understand it's a reuse optimization, not a memory-reclamation one.
Micro-claim 2 (proof type: bench). clear(m) is faster than a delete loop but also does not release table memory.
Pass condition: clear shows much lower ns/op than the delete loop; both leave HeapAlloc unchanged (Experiment 1's measurement applies to clear too).
Experiment 3 — Lookups resolve in a few parallel-probed groups
The Swiss Table's headline win is lookup speed: one control-word compare screens eight slots at once. BenchmarkLookup does 1000 lookups into the 1M-entry map.
go test -bench='Lookup' -count=10 .
BenchmarkLookup-8 81234 14.6 µs/op 0 B/op 0 allocs/op
That's ~14.6 ns per lookup into a million-entry table — a handful of nanoseconds of hashing plus a parallel group probe, with zero allocation. To see the improvement over the old runtime, run the same benchmark on Go 1.23 (the last pre-Swiss-Tables release) — this is a reader-run cross-version comparison, like Chapter 1's escape-drift experiment:
go install golang.org/dl/go1.23.12@latest && go1.23.12 download
go1.23.12 test -bench=Lookup -count=10 . # old bucket map
go test -bench=Lookup -count=10 . # Swiss Tables
Expect the Go 1.24+ build to be materially faster on lookup-heavy work (Go's own measurements report map operations up to ~60% faster than 1.23 on some workloads; your delta is workload-dependent). The win is purely in the runtime — your code is identical.
Micro-claim 3 (proof type: bench). Swiss Table lookups are fast because one control-word comparison screens eight slots at once; the Go 1.24+ map is measurably faster than 1.23 on lookups.
Pass condition: lookups show low ns/op at 0 allocs/op; the current build beats a Go 1.23 build of the same benchmark.
Experiment 4 — Pre-size to skip the rehashes
Inserting into a map that starts empty forces the table to grow and rehash several times as it crosses load-factor thresholds. Telling make the final size up front allocates a table big enough from the start.
go test -bench='Insert' -benchmem -count=10 .
BenchmarkInsertNoPresize-8 302 4150 µs/op 5.2 MB/op 62 allocs/op
BenchmarkInsertPresize-8 498 2360 µs/op 4.1 MB/op 9 allocs/op
The pre-sized version is faster and allocates far fewer times — each avoided growth is a table allocation plus a full rehash-copy of everything inserted so far. When you know (even roughly) how many entries a map will hold, make(map[K]V, n) is free speed. (The exact allocation counts are version-specific; the direction — pre-sizing removes the intermediate grows — is the durable lesson.)
Micro-claim 4 (proof type: bench). Pre-sizing with make(map[K]V, n) avoids intermediate rehashes and is measurably faster.
Pass condition: the pre-sized insert shows fewer allocs/op and lower ns/op than the nil-start insert.
Experiment 5 — Only dropping the map returns its memory
Back to TestMapRetention's third line. After delete left the table fully allocated (Experiment 1), the test sets m = nil and measures again:
after delete + 32 MiB (retained)
after drop + 0 MiB
Once nothing references the map, the garbage collector reclaims the whole table and the memory returns to ~baseline — the test asserts the drop fell below half the filled size. This is the only mechanism that shrinks a map's footprint. In practice that means: to bound a long-lived, traffic-sized map, periodically rebuild it — copy the live entries into a fresh make(map[K]V, liveCount) and drop the old one — rather than relying on delete or clear.
Micro-claim 5 (proof type: trace). Replacing (or dropping) the map is the only way to return its memory to the runtime.
Pass condition: HeapAlloc returns to ~baseline only after the map is dropped/rebuilt and GC runs — never after delete or clear.
Experiment 6 — Iteration order is randomized on purpose
Each range over a map starts at a randomized slot, so iteration order is not stable — not even across two ranges of the same map in the same run. TestIterationOrder ranges a 128-entry map sixteen times and records the first key each time:
go test -run TestIterationOrder -v .
first key took 13 distinct values across 16 ranges (randomized start)
--- PASS: TestIterationOrder
The first key is almost never the same twice — the test asserts it took at least two distinct values, which for a 128-entry map is a near-certainty. The runtime randomizes the start deliberately, so that code cannot accidentally depend on iteration order and then break when the order shifts (across versions, or as the table grows). If you need order, sort the keys; never rely on range order.
Micro-claim 6 (proof type: experiment). A map's iteration start position is randomized per range.
Pass condition: the first key from repeated ranges over a fixed map varies across calls.
Practical implications
- Treat any long-lived map that grows with traffic as a permanent allocation. It tracks its high-water mark forever (Experiments 1, 5). To bound it, rebuild periodically into a fresh, right-sized map;
delete/clearwill not help. This is the slice-retention lesson (Chapter 6) for maps. - Use
clearto empty-and-reuse, not to free. It's the fast way to reset a map you'll refill (Experiment 2) — but it keeps the table, which is often exactly what you want for a reused scratch map. - Pre-size when you can.
make(map[K]V, n)removes the grow/rehash steps and their allocations (Experiment 4); even a rough estimate helps. - Never depend on iteration order (Experiment 6). Sort keys if you need determinism.
- Don't carry pre-1.24 folklore. "Overflow buckets," fixed bucket counts, and
delete-frees-memory are wrong on the current runtime. When you read about map internals, check the Go version.
Rules (enforceable)
- Never expect
delete()orclear()to return memory; rebuild a map into a freshmaketo release its high-water allocation. - Pre-size with
make(map[K]V, n)whenever the approximate entry count is known. - In long-lived services, treat a map that grows with traffic as a retention risk and give it a rebuild/rotation policy.
- Never depend on map iteration order; sort keys when you need a stable order.
- Verify map-internals claims against your Go version — the implementation changed in 1.24 (Swiss Tables) and may change again.
Drills
Predict first, then run make retain / make bench / make iter.
Drill 1. A cache is map[string][]byte. Under a traffic spike it grows to 5 GB, then traffic drops and 99% of entries are deleted. Predict the process's memory afterward, and give the fix.
Answer
Memory stays at ~5 GB. delete returns no table memory (Experiment 1), and []byte values that were deleted are collectable but the table keeps its million-slot capacity. (Worse, if the []byte values are sub-slices of larger buffers, Chapter 6's retention compounds it.) Fix: periodically rebuild — m2 := make(map[string][]byte, len(m)); copy live entries; m = m2 — and drop the old map so GC reclaims the oversized table. Confirm with runtime.ReadMemStats or a heap profile (Chapter 16).
Drill 2. You benchmark inserting 1,000,000 entries and see ~60 allocations. Where do they come from, and how do you cut them to a handful?
Answer
They're table growth steps: starting empty, the map crosses its load factor repeatedly, each time allocating a larger table (and directory) and rehashing. ~60 reflects the doublings plus directory growth to reach a million entries. make(map[K]V, 1_000_000) sizes the table once up front, dropping it to a handful (Experiment 4). Confirm with -benchmem. This is the map analog of slice pre-allocation (Chapter 6, Experiment 4).
Drill 3 (capstone). Two engineers argue: one says "sync.Map would fix our map memory growth," the other says "switch the keys from string to int." For an unbounded-growth retention problem, evaluate both.
Answer
Neither addresses retention. sync.Map is about concurrent access, not memory reclamation — it also grows and doesn't shrink, and it's slower for write-heavy workloads (Chapter 14). Changing string keys to int lowers per-entry overhead (smaller keys, no separate string backing) — a real but constant-factor win, not a fix for unbounded growth. The actual fix is a rebuild/rotation policy that caps the high-water mark (Experiment 5). The lesson: retention is about lifetime and rebuild, not about the key type or the map flavor — an ownership decision (Chapter 17).
What you can now do
You can describe the Go 1.24+ map as it actually is — groups of eight slots, a control word screened in parallel, no overflow buckets, ~87.5% load — and explain why lookups are fast and why the table never shrinks. You know that delete and clear only reset control bytes, that pre-sizing removes the rehash steps, that only dropping or rebuilding the map returns memory, and that iteration order is randomized by design. And you can spot the long-lived-map retention bug — the same shape as Chapter 6's slice leak — and fix it with a rebuild policy.
Where this goes next
Chapter 8 — Strings, Bytes, and the Cost of Conversion closes Part III. You just used string keys; now we look at what a string ↔ []byte conversion costs, the handful of cases where the compiler elides the copy (including the very m[string(b)] map lookup you might reach for here), and how strings.Builder and unsafe.String change the arithmetic. After that, Part IV turns to the price of abstraction.
Course rule, restated: show it, or cut it. Everything in this chapter is in lab/. Run make retain; delete a million keys and watch nothing happen.