Chapter 1 — Escape Analysis: The Compiler's Fragile Decision
The first thing to learn about Go's performance is that you cannot see it in the source code. Whether a variable lives on the stack (free) or the heap (an allocation now, and garbage-collector work later) is decided by the compiler, by rules that stay invisible until you ask the compiler to show its work. This chapter teaches you to ask — and to predict the answer before you do.
The one-line surprise
Here are two functions. They differ by exactly one line.
//go:noinline
func SquareQuiet(n int) int {
	return n * n
}

//go:noinline
func SquareLogged(n int) int {
	x := n * n
	fmt.Fprint(io.Discard, x) // the only difference
	return x
}
The second one even throws its log away — it writes to io.Discard, which does nothing. Yet:
BenchmarkSquareQuiet-8 1000000000 1.10 ns/op 0 B/op 0 allocs/op
BenchmarkSquareLogged-8 82149147 15.0 ns/op 8 B/op 1 allocs/op
One line — a discarded log — turned a zero-allocation function into one that allocates on every single call, and runs an order of magnitude slower. Nothing about the result changed. No & appears anywhere.
This is not a bug. It is a direct, predictable consequence of how Go's escape analysis works. By the end of this chapter you will be able to look at that second function and know, before compiling, exactly which variable escapes and why — and you will be able to do it for code far less obvious than this.
Before you start
What you need: Go 1.24 or newer (outputs here are captured against Go 1.25, linux/amd64). The benchmarks use testing.B.Loop, which is Go 1.24+.
The lab: all code in this chapter is real and runnable. It lives next to this file in lab/:
lab/
  go.mod
  escape_lab.go        # every function a claim refers to
  escape_lab_test.go   # the benchmarks
  Makefile             # make escape | make bench | make golden
Reproduce everything:
cd lab
make escape # the compiler's escape decisions (go build -gcflags='-m=2')
make bench # the allocation benchmarks (go test -bench=. -benchmem)
On the numbers in this chapter. Every allocs/op figure and every escape-analysis message is a hard fact you will reproduce exactly. Every ns/op and B/op figure is representative — it depends on your CPU, your Go version, and your machine's load. Throughout Go Black Belt the rule is the same: the allocation count and the direction of the change are the lesson; the absolute nanoseconds are not. When in doubt, run make golden and trust your own machine.
The belief this chapter dismantles
Common belief: "Local variables live on the stack. The & operator is what moves something to the heap. So if I don't take addresses, I don't allocate."
What actually happens: Allocation is decided by lifetime, not by syntax. The compiler heap-allocates a value whenever it cannot prove the value's life ends with the function. That proof is fragile. It breaks across interfaces, closures, goroutines, and parameters of type any, frequently with no & in sight. The one-line surprise above is exactly this: x escapes because it is handed to an any, not because anyone took its address.
This is the most consequential thing the Go compiler does silently, and it is the foundation for the rest of Part I. Get this chapter into your fingers and the next ten chapters become "of course."
The mental model: frames and lifetimes
When a Go function is called, the runtime gives it a stack frame: a region of the goroutine's stack holding its locals. When the function returns, the frame is gone instantly and for free. No bookkeeping, no garbage collector, nothing. Stack memory is the cheapest memory in the language.
The catch: a frame's memory is only valid while the frame exists. So the compiler may only keep a value on the stack if it can prove the value does not outlive the frame. If a value's lifetime might extend past the return — because a pointer to it is returned, stored in a longer-lived place, or handed to code the compiler can't see through — the value must go somewhere that survives the frame. That somewhere is the heap, and putting it there is an allocation: a call into the runtime allocator now, and a unit of work for the garbage collector later.
That decision — stack or heap — is escape analysis. "The value escapes" means "its lifetime escapes the frame, so it goes to the heap."
 stack (free, dies with the frame)      heap (allocated, lives until GC frees it)
┌───────────────────────────┐          ┌──────────────────────────┐
│ frame: NewPointVal        │          │                          │
│ p = {1, 2}   ◄── stays    │          │ p = {1, 2}   ◄── moved   │
│ return copy of p          │          │ because &p is returned   │
└───────────────────────────┘          └──────────────────────────┘
  value semantics                         pointer that outlives
  → the compiler proves                   the frame → the compiler
    p dies here                           cannot prove p dies here
Two properties make escape analysis worth a whole chapter:
1. It is intraprocedural by default. The analysis reasons one function at a time. The moment a value crosses a boundary the compiler cannot see through (an interface method, an indirect function call, a goroutine), the analysis must assume the worst and let the value escape. The boundary is the function, not the whole call graph. (Inlining widens what the compiler can see, which is why inlining changes escape results; see Chapter 4. We will see this boundary directly in Experiment 5.)
2. It is invisible in source. There is no keyword, no annotation, no language guarantee about where a value lives. The only ground truth is the compiler's own diagnostic output. That is the tool we learn next.
The one tool that tells the truth: -gcflags=-m
The Go compiler will narrate its escape and inlining decisions if you ask. The flag is -m, passed through via -gcflags:
go build -gcflags='-m' ./... # the decisions
go build -gcflags='-m=2' ./... # the decisions, with reasons
-m lists what escaped and what didn't. -m=2 adds why — the flow chains, the inlining costs, the leaking-param analysis. You will live in this output for the rest of Part I. The vocabulary is small:
| Message | Meaning |
|---|---|
| moved to heap: x | x was going to be a stack local, but its lifetime escaped, so it is now a heap allocation. |
| &x escapes to heap | the address of x flows somewhere that outlives the frame. |
| x escapes to heap | x itself (often boxed into an interface) flows out. |
| ... does not escape | the compiler proved this value/pointer stays in the frame: stack, free. |
| leaking param: p | parameter p reaches the function's results or a global, so callers' arguments may escape because of it. |
| devirtualizing m to T | the compiler resolved an interface call to a concrete method T.m, turning an opaque call into a transparent one. |
That is essentially the whole language. Everything below is learning to read it fluently and predict it before you run it.
A note on reading -m output. The compiler reports a result for things it actually had to reason about — values whose address is taken, values boxed into interfaces, parameters. For a plain value returned by copy with no address taken, there is often nothing to report, and silence means "stack." Do not expect a reassuring does not escape line for every variable; expect it only where there was a decision to make.
Experiment 1 — A pointer to a local escapes; the value does not
This is the canonical case and the one to burn into memory. Two functions build the same Point. One returns its address; one returns a copy.
//go:noinline
func NewPointPtr(x, y int) *Point {
	p := Point{x, y}
	return &p // p's address outlives this frame -> heap
}

//go:noinline
func NewPointVal(x, y int) Point {
	p := Point{x, y}
	return p // copied into the caller -> stack
}
Why //go:noinline? It keeps the call boundary intact so the escape decision we observe is the function's own, not a side effect of the caller inlining it. In real code you would not write this; here it isolates the variable under study. That inlining changes this result is the subject of Chapter 4 — note the dependency and move on.
Ask the compiler:
go build -gcflags='-m=2' ./... 2>&1 | grep -E 'heap|does not escape'
Representative output (Go 1.25, linux/amd64):
./escape_lab.go:41:2: moved to heap: p
Read it carefully, because the absence of a line is half the lesson. NewPointPtr produces moved to heap: p — line 41 is p := Point{x, y}, and the compiler is telling you this local was promoted to the heap. NewPointVal produces nothing: no value escapes, so there is nothing to report. Silence here means "stack."
Now confirm the cost:
go test -bench='NewPoint' -benchmem -count=6 .
BenchmarkNewPointVal-8 1000000000 1.30 ns/op 0 B/op 0 allocs/op
BenchmarkNewPointPtr-8 86097531 13.9 ns/op 16 B/op 1 allocs/op
The value version: 0 allocs/op. The pointer version: 1 alloc/op of 16 bytes (the size of a Point), and an order of magnitude slower — not because copying 16 bytes is expensive (it is nearly free) but because the heap allocation and the eventual GC work are not.
Micro-claim 1 (proof type: compile). Returning a pointer to a local variable escapes it; returning the value does not.
Pass condition: -m=2 reports moved to heap for the pointer-return variant and reports nothing for the value-return variant; benchmem shows 1 vs 0 allocs/op.
The intuition to install: returning &local is the textbook way to force an allocation — not because of the &, but because the pointer outlives the frame.
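To check the allocation claim without a benchmark file, the standard library's testing.AllocsPerRun can be called from an ordinary program. The sketch below is a minimal, self-contained approximation of the lab's two functions, not the lab code itself; the sink variables and the helpers ptrAllocs and valAllocs are assumptions added for illustration.

```go
package main

import (
	"fmt"
	"testing"
)

// Point mirrors the chapter's two-int struct (16 bytes on 64-bit platforms).
type Point struct{ X, Y int }

//go:noinline
func NewPointPtr(x, y int) *Point {
	p := Point{x, y}
	return &p // the pointer outlives this frame
}

//go:noinline
func NewPointVal(x, y int) Point {
	p := Point{x, y}
	return p // copied into the caller
}

// Package-level sinks keep the results observably used.
var (
	sinkPtr *Point
	sinkVal Point
)

// ptrAllocs reports the average heap allocations per call of NewPointPtr.
func ptrAllocs() float64 {
	return testing.AllocsPerRun(1000, func() { sinkPtr = NewPointPtr(1, 2) })
}

// valAllocs reports the average heap allocations per call of NewPointVal.
func valAllocs() float64 {
	return testing.AllocsPerRun(1000, func() { sinkVal = NewPointVal(1, 2) })
}

func main() {
	fmt.Println("NewPointPtr allocs/call:", ptrAllocs())
	fmt.Println("NewPointVal allocs/call:", valAllocs())
}
```

If the chapter's claim holds on your toolchain, the pointer variant reports one allocation per call and the value variant none. Treat your own run as the authority.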
Experiment 2 — Taking an address is not what allocates
If you took Experiment 1 to mean "& causes heap allocation," this corrects you. LocalAddr takes an address and never allocates:
//go:noinline
func LocalAddr(x, y int) int {
	p := Point{x, y}
	q := &p          // address taken...
	return q.X + q.Y // ...but q never leaves the frame
}
go build -gcflags='-m=2' ./... 2>&1 | grep 'does not escape'
./escape_lab.go:61:7: &p does not escape
The compiler took the address, traced where it goes, proved it never leaves the frame, and kept p on the stack. & is everywhere in idiomatic Go and most of it is free. What costs you is a lifetime that escapes the frame — not the address operator. This is why "avoid pointers for performance" is folklore: the question is never "is there a &?" but always "does this outlive its frame?"
This experiment shares Micro-claim 1's pass condition (the does not escape half) and exists to kill the most common wrong inference from it.
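The same point can be pushed one step further with a hypothetical variant that is not in the lab: the address is handed to a helper that only reads through it. The names sum, LocalAddrCall, and localAddrCallAllocs are assumptions for this sketch, and testing.AllocsPerRun stands in for a benchmark so the program is self-contained.

```go
package main

import (
	"fmt"
	"testing"
)

type Point struct{ X, Y int }

// sum only reads through p and never stores the pointer anywhere.
//
//go:noinline
func sum(p *Point) int { return p.X + p.Y }

// LocalAddrCall takes p's address and passes it down the stack.
//
//go:noinline
func LocalAddrCall(x, y int) int {
	p := Point{x, y}
	return sum(&p) // the pointer is gone again before this frame returns
}

var sink int

// localAddrCallAllocs reports average heap allocations per call.
func localAddrCallAllocs() float64 {
	return testing.AllocsPerRun(1000, func() { sink = LocalAddrCall(3, 4) })
}

func main() {
	fmt.Println("result:", LocalAddrCall(3, 4))
	fmt.Println("allocs/call:", localAddrCallAllocs())
}
```

A pointer passed down to a callee that does not retain it has a lifetime bounded by the caller's frame, so there is nothing to move to the heap. Confirm with -m=2 on your own toolchain rather than taking the sketch's word for it.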
Experiment 3 — any always escapes, so the one-line surprise resolves
Now we dissect the function from the top of the chapter. Two functions do identical arithmetic; one boxes its result into an interface:
//go:noinline
func SquareQuiet(n int) int {
	return n * n
}

//go:noinline
func SquareLogged(n int) int {
	x := n * n
	fmt.Fprint(io.Discard, x) // x is boxed into `any` -> escapes
	return x
}
fmt.Fprint's signature ends in ...any. To pass x as an any, the compiler must put it behind an interface — and an interface value holds a pointer to its data. The compiler cannot see what fmt.Fprint does with that pointer (it could store it anywhere), so it must assume the value escapes. We write to io.Discard precisely so the benchmark performs no real I/O: the only thing left to measure is the boxing.
go build -gcflags='-m=2' ./... 2>&1 | grep 'escapes to heap'
./escape_lab.go:84:2: x escapes to heap
go test -bench='Square' -benchmem -count=6 .
BenchmarkSquareQuiet-8 1000000000 1.10 ns/op 0 B/op 0 allocs/op
BenchmarkSquareLogged-8 82149147 15.0 ns/op 8 B/op 1 allocs/op
That is the one-line surprise, explained: 8 bytes (a boxed int) allocated on every call, purely from handing a value to an any. This is the mechanism behind a whole genre of "harmless refactor, mysterious allocation regression" — add structured logging to a hot path, add a fmt.Sprintf for an error message that is usually nil, accept an interface{} parameter "for flexibility," and the allocator starts running where it didn't before. It is also why high-performance logging libraries expose typed methods (slog.Int, slog.String) instead of any: each any conversion on a hot path is a heap allocation.
Micro-claim 2 (proof type: bench). Passing a value to a variadic any forces heap allocation because the compiler must assume the interface's data pointer escapes.
Pass condition: the logged variant shows ≥1 allocs/op; the quiet variant shows 0 allocs/op.
A subtlety worth knowing (and the reason the benchmark feeds 1 << 20, not 5): the runtime keeps a static table of the boxed integers 0–255, so boxing a small int can reuse a static value and read as 0 allocs/op — hiding the effect entirely. Use a value above 255 when you want to see interface boxing in a benchmark. This is exactly why the course measures rather than reasons from rules.
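The small-integer effect is easy to observe directly. The following sketch reuses SquareLogged and measures it with testing.AllocsPerRun for two inputs, one whose square fits in the 0–255 range and one whose square does not. The helper allocsFor and the sink variable are assumptions for illustration, not part of the lab.

```go
package main

import (
	"fmt"
	"io"
	"testing"
)

//go:noinline
func SquareLogged(n int) int {
	x := n * n
	fmt.Fprint(io.Discard, x) // x is boxed into an interface value here
	return x
}

var sink int

// allocsFor reports average heap allocations per call of SquareLogged(n).
func allocsFor(n int) float64 {
	return testing.AllocsPerRun(1000, func() { sink = SquareLogged(n) })
}

func main() {
	// 5*5 = 25 is a single-byte value; (1<<20)*(1<<20) is far above 255.
	fmt.Println("n = 5     allocs/call:", allocsFor(5))
	fmt.Println("n = 1<<20 allocs/call:", allocsFor(1<<20))
}
```

If the two lines print different counts, the difference is the static table at work: same source line, same escape message, different runtime cost.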
Experiment 4 — An escaping closure drags its captures to the heap
A closure is a function value plus the variables it captured. Captures are by reference: the closure and the enclosing function share the same variable. So if the closure escapes, every variable it captured escapes with it.
//go:noinline
func MakeCounter() func() int {
	n := 0
	return func() int { // the closure escapes (it is returned)...
		n++
		return n // ...so n must outlive MakeCounter -> heap
	}
}

//go:noinline
func SumLocally(nums []int) int {
	total := 0
	add := func(v int) { total += v } // never leaves the frame
	for _, v := range nums {
		add(v)
	}
	return total
}
go build -gcflags='-m=2' ./... 2>&1 | grep -E 'heap|does not escape'
./escape_lab.go:99:2: moved to heap: n
./escape_lab.go:100:9: func literal escapes to heap
./escape_lab.go:112:9: func literal does not escape
Three lines, two stories. In MakeCounter, the returned func literal escapes to heap, and because it captures n, n is moved to heap alongside it. In SumLocally, the func literal does not escape — it is created, used, and discarded within the frame, so it and total stay on the stack.
go test -bench='Closure' -benchmem -count=6 .
BenchmarkClosureLocal-8 205847301 5.8 ns/op 0 B/op 0 allocs/op
BenchmarkClosureEscape-8 62514788 18.7 ns/op ≈24 B/op 1 allocs/op
Closures are not "slow." A closure that stays local is free. The cost appears only when the closure escapes and takes its captures with it. This returns in force in Chapter 11, and again the instant we touch goroutines: go func(){ ... x ... }() forces x to escape, because the goroutine's lifetime is unknowable to the compiler. Capturing-by-launching is one of the most common accidental allocation sources in concurrent Go.
Micro-claim 3 (proof type: compile). A closure that captures a local escapes that variable if the closure itself escapes.
Pass condition: -m=2 reports the captured variable moved to heap and the closure escapes to heap for the returned closure; does not escape for the local one.
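The "captures are by reference" rule is what makes the escape unavoidable, and it is visible in plain program behavior. The sketch below is illustrative only; sharedVariable is a hypothetical helper, and MakeCounter is reproduced without the noinline directive because only its semantics matter here.

```go
package main

import "fmt"

// MakeCounter returns a closure that keeps using n after this frame is gone.
func MakeCounter() func() int {
	n := 0
	return func() int {
		n++
		return n
	}
}

// sharedVariable shows that a closure and its enclosing function
// operate on the same variable, not on a copy.
func sharedVariable() int {
	total := 0
	add := func(v int) { total += v }
	add(2)
	add(3)
	return total // the writes made through add are visible here
}

func main() {
	a, b := MakeCounter(), MakeCounter()
	fmt.Println(a(), a(), b())    // prints: 1 2 1
	fmt.Println(sharedVariable()) // prints: 5
}
```

Each call to MakeCounter gets its own n, and each n must survive as long as its closure does. That is the lifetime the compiler is reporting when it moves n to the heap.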
Experiment 5 — The intraprocedural boundary, made visible
Mechanism property #1 said the analysis stops at boundaries it cannot see through. Here is that boundary, directly. Two functions call the same method; one knows the concrete type, one only sees an interface.
type Transformer interface{ Transform(int) int }

type Doubler struct{}

func (Doubler) Transform(x int) int { return x * 2 }

//go:noinline
func UseConcrete(x int) int {
	d := Doubler{}
	return d.Transform(x) // concrete type known
}

//go:noinline
func UseInterface(t Transformer, x int) int {
	return t.Transform(x) // concrete type hidden
}
go build -gcflags='-m=2' ./... 2>&1 | grep -E 'inlin|devirtualiz|leaking'
./escape_lab.go:154:1: can inline Doubler.Transform
./escape_lab.go:162:9: devirtualizing d.Transform to Doubler
./escape_lab.go:162:9: inlining call to Doubler.Transform
./escape_lab.go:169:19: leaking param: t
Two completely different outcomes for the same method call. In UseConcrete, the compiler knows the receiver is a Doubler, so it devirtualizes the call (resolves it to Doubler.Transform), inlines it, and can then reason about everything inside as if it were written inline — escape analysis sees straight through. In UseInterface, the receiver is an interface: the compiler cannot know which Transform runs, so the call is an opaque wall. It reports leaking param: t, meaning it must conservatively assume anything reaching that boundary may escape.
This is why interfaces and indirect calls force escapes — not magic, just a wall the analysis cannot see past. We do not benchmark an allocation here, because the cost of interface values (boxing, dispatch) is the subject of Chapter 9; Chapter 1's point is narrower and more fundamental: the interface is the boundary that ends the proof. It is also a preview of why generics (concrete types visible at each instantiation, Chapter 10) frequently beat interfaces — the compiler keeps its line of sight.
Micro-claim (proof type: compile). A concrete call is transparent to escape analysis (devirtualized and inlined); an interface call is an opaque boundary that leaks its receiver.
Pass condition: -m=2 reports devirtualizing/inlining call for the concrete path and leaking param for the interface path.
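To see the wall cost something, pass a pointer to a local through it. This is a hypothetical sketch rather than lab code: Filler, Zero, viaConcrete, and viaInterface are invented names, and testing.AllocsPerRun replaces a benchmark so the program stands alone. It measures the escape of the argument, not the cost of interface dispatch.

```go
package main

import (
	"fmt"
	"testing"
)

// Filler is a hypothetical interface whose method receives a pointer.
type Filler interface{ Fill(p *[64]byte) }

type Zero struct{}

func (Zero) Fill(p *[64]byte) { p[0] = 1 }

// viaConcrete calls Fill on a known type; the compiler can see what
// happens to &buf.
//
//go:noinline
func viaConcrete() byte {
	var buf [64]byte
	Zero{}.Fill(&buf)
	return buf[0]
}

// viaInterface calls Fill through the interface; the callee is unknown
// at compile time, so &buf goes to code the compiler cannot inspect.
//
//go:noinline
func viaInterface(f Filler) byte {
	var buf [64]byte
	f.Fill(&buf)
	return buf[0]
}

var (
	filler Filler = Zero{}
	sink   byte
)

func viaConcreteAllocs() float64 {
	return testing.AllocsPerRun(1000, func() { sink = viaConcrete() })
}

func viaInterfaceAllocs() float64 {
	return testing.AllocsPerRun(1000, func() { sink = viaInterface(filler) })
}

func main() {
	fmt.Println("concrete  allocs/call:", viaConcreteAllocs())
	fmt.Println("interface allocs/call:", viaInterfaceAllocs())
}
```

Both functions run the same method body. The difference in allocation count, if your toolchain shows one, comes from whether the compiler could follow the pointer past the call.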
Experiment 6 — Assigning to a package-level variable escapes the value
The last way a lifetime can escape a frame is the most obvious once stated: store it somewhere that lives forever.
var Retained *Point

//go:noinline
func StashGlobal(x, y int) {
	p := Point{x, y}
	Retained = &p // lifetime now exceeds the function -> heap
}
./escape_lab.go:134:2: moved to heap: p
BenchmarkStashGlobal-8 88471209 13.6 ns/op 16 B/op 1 allocs/op
A package-level variable outlives every call, so anything reachable from it must too. The same logic applies to any long-lived destination: a field on a struct that lives in a cache, an element appended to a package-level slice, a value sent on a channel a long-lived goroutine holds. Storing into something long-lived is storing into the heap. Chapters 6 and 7 (slices and maps) are, in large part, this idea applied to data structures that retain far more than you intended.
Note on value vs pointer here. If Retained were a Point (a value) rather than a *Point, assigning to it could just copy the struct into the global, and the local would not need the heap. The escape is forced specifically because the global holds a pointer into the local's memory. Try both with make escape and watch the message appear and disappear.
Micro-claim 5 (proof type: compile). Assigning a local to a package-level variable escapes it (when the destination holds a pointer to it).
Pass condition: -m=2 reports the local moved to heap due to the assignment.
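The value-versus-pointer note can be tried side by side. In this sketch the names RetainedPtr, RetainedVal, StashPointer, and StashValue are assumptions (the lab has a single Retained and StashGlobal), and testing.AllocsPerRun stands in for the benchmark.

```go
package main

import (
	"fmt"
	"testing"
)

type Point struct{ X, Y int }

var (
	RetainedPtr *Point // holds a pointer into whatever is stored
	RetainedVal Point  // holds its own copy of the struct
)

//go:noinline
func StashPointer(x, y int) {
	p := Point{x, y}
	RetainedPtr = &p // p must outlive this call
}

//go:noinline
func StashValue(x, y int) {
	p := Point{x, y}
	RetainedVal = p // the global receives a copy; p can die with the frame
}

func stashPointerAllocs() float64 {
	return testing.AllocsPerRun(1000, func() { StashPointer(1, 2) })
}

func stashValueAllocs() float64 {
	return testing.AllocsPerRun(1000, func() { StashValue(1, 2) })
}

func main() {
	fmt.Println("StashPointer allocs/call:", stashPointerAllocs())
	fmt.Println("StashValue   allocs/call:", stashValueAllocs())
}
```

Both functions make the data reachable from a package-level variable. Only the pointer version requires the local itself to survive the call.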
Experiment 7 (reader-run) — The decision is not stable across Go versions
The rules above are durable. The specific decision for a specific line is not: escape analysis improves (and occasionally regresses) release to release, and identical source can allocate on one version and not the next. This is why "I checked, it stays on the stack" is only ever true for a stated compiler version.
You need two toolchains to see it, which is why this is a guided exercise rather than a captured result. Go makes installing alternates easy:
# install an older toolchain alongside your current one
go install golang.org/dl/go1.21.13@latest
go1.21.13 download
# diff the escape decisions for the same source
go build -gcflags='-m=2' ./... 2>&1 | grep -E 'heap|escape' | sort > new.txt
go1.21.13 build -gcflags='-m=2' ./... 2>&1 | grep -E 'heap|escape' | sort > old.txt
diff old.txt new.txt
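On most files the diff is empty; the interesting cases are functions sitting near a decision boundary, where an analysis improvement flips the result. The Go issue tracker documents real instances of this moving allocations between stack and heap silently across releases; see golang/go#23109. One caveat: since Go 1.21 the go command may switch toolchains to satisfy the go line in go.mod, so run go1.21.13 version inside lab/ to confirm which compiler actually runs. If it reports the newer release, copy the functions under test into a scratch module whose go.mod declares an older go version.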
Micro-claim 4 (proof type: experiment). Escape decisions change between Go releases for the same source.
Pass condition: for a function chosen near a decision boundary, -m=2 reports a different escape status under two Go versions.
The takeaway is procedural: an escape result is a fact about a compiler version, and it belongs in CI, not in your memory. Pin the version; check allocs/op on every hot path; treat a change as a regression to investigate.
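One way to put that check into CI is an allocation budget enforced in code. The sketch below is a minimal, hypothetical gate built on testing.AllocsPerRun; checkAllocs, the budget value, and the exit-code convention are assumptions, not part of the lab's Makefile.

```go
package main

import (
	"fmt"
	"os"
	"testing"
)

//go:noinline
func SquareQuiet(n int) int { return n * n }

var sink int

// checkAllocs returns an error when f allocates more than budget
// times per call on average.
func checkAllocs(name string, budget float64, f func()) error {
	if got := testing.AllocsPerRun(1000, f); got > budget {
		return fmt.Errorf("%s: %v allocs/call exceeds budget %v", name, got, budget)
	}
	return nil
}

func main() {
	err := checkAllocs("SquareQuiet", 0, func() { sink = SquareQuiet(1 << 20) })
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1) // a non-zero exit fails the CI step
	}
	fmt.Println("allocation budgets hold")
}
```

In a real repository the same comparison would more naturally live in a TestXxx function that calls t.Fatal, run under the pinned toolchain, so that a compiler upgrade that changes an escape decision shows up as a failing test rather than a production surprise.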
Why this is the first chapter: the cascade
Escape analysis is not one cost among many. It is upstream of most of Part I, because losing it cascades:
- When a value escapes, you pay the allocation now and add GC work later. Allocation rate is the primary driver of garbage-collector cost (Chapter 16), so escape decisions in your hot path set your GC bill.
- Escape analysis is intraprocedural (Experiment 5), so its quality depends on what the compiler can see. Inlining (Chapter 4) widens that window: an inlined callee's internals become visible to the caller's escape analysis, which can then prove non-escape it otherwise couldn't. Lose inlining and you can gain allocations: a one-line refactor with no & in sight.
- Interfaces (Chapter 9) and generics over pointers (Chapter 10) are, from the allocator's point of view, escape stories: the indirection hides the concrete type, the compiler assumes the worst, and the value escapes.
Learn to read -m=2 now and the rest of the course is you applying it to ever-larger structures.
Practical implications
- Treat "harmless" additions as behavioral changes on hot paths. A log line, an interface{} parameter, a fmt.Sprintf, or a captured-by-goroutine variable can introduce per-call allocations. The diff looks cosmetic; the allocation profile does not.
- In code review, the allocation-risk signals are syntactic even though the cause is semantic: a new interface parameter, a returned closure, a go func capturing locals, a value stored into something long-lived. Each is a prompt to ask for -benchmem before and after.
- "Use pointers for speed" is exactly backwards as often as not. For small and medium structs, returning by value keeps data on the stack; returning by pointer frequently forces a heap allocation. Measure; do not assume.
Rules (enforceable)
- Never assume stack allocation without verifying via -gcflags='-m=2'. Escape decisions change between Go releases; a result is a fact about a version.
- Every hot-path function gets a benchmark with -benchmem. allocs/op is your regression signal; gate it in CI.
- Interface and any parameters can cause their arguments to escape. Use concrete types or generics on paths where allocation cost matters.
- Do not pass pointers to small structs across boundaries on hot paths without measuring; the copy is usually cheaper than heap + GC.
- Closures launched as goroutines escape every captured variable. Pass values as explicit arguments instead of capturing them.
Drills
Predict first, then run make escape / make bench to verify. A drill is "passed" only when your prediction matched the tool output and you can state the reason.
Drill 1. Here is a zero-allocation function. Add a single interface conversion that makes exactly one variable escape. Predict which variable, and why.
func Sum(nums []int) int {
	total := 0
	for _, v := range nums {
		total += v
	}
	return total
}
Answer
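Any edit that boxes a value into an interface and hands it to code the compiler cannot follow does it, e.g. fmt.Fprint(io.Discard, total) before the return. The boxed value is reported as escapes to heap because the interface holds a pointer that leaves the compiler's sight. A purely local conversion such as var x any = total; _ = x is usually not enough, since the compiler can prove that interface never leaves the frame. Verify: -m=2 prints ... escapes to heap for that line, and -benchmem goes from 0 to 1 allocs/op (for sums above 255). The loop and nums are untouched.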
Drill 2. Predict the -m=2 output and the allocs/op for each:
func A() *int { x := 1; return &x }
func B() int { x := 1; p := &x; return *p }
func C(w io.Writer) { fmt.Fprint(w, 1<<20) }
Answer
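A: moved to heap: x. The address is returned; 1 alloc/op.
B: &x does not escape. The pointer never leaves the frame; 0 allocs/op.
C: ... escapes to heap for the boxed argument. 1<<20 is above 255, so the static-int table does not mask it. One caveat: the argument here is a compile-time constant, which the compiler may place in read-only static data, so the benchmark can read 0 allocs/op despite the message. To see the expected 1 alloc/op, box a value computed at run time (as SquareLogged does), and verify on your toolchain.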
Drill 3. Run Experiment 5's lab functions. Then add var ConcreteSink Doubler at package scope and assign to it inside UseConcrete. Predict whether the devirtualization message survives, and whether anything now escapes.
Answer
Devirtualization survives: the receiver type is still statically known, so d.Transform is still resolved and inlined. Whether anything escapes depends on what you store: assigning the empty Doubler{} value to a Doubler global just copies it (no heap). The lesson reinforces Experiments 5 and 6 together: visibility (concrete vs interface) governs devirtualization; lifetime/destination governs escape. They are independent levers.
Drill 4 (capstone for this chapter). Take MakeCounter from the lab. Without changing what it computes, make the captured counter not escape. Is it possible? Explain using the frame-lifetime model.
Answer
Not possible while still returning the closure: the returned function must mutate state that lives after MakeCounter returns, so that state must outlive the frame — it must be on the heap. The lesson is the model, not a trick: escape is forced by the requirement that a lifetime exceed the frame. If you don't need the counter to survive the call (fold the loop inline, like SumLocally), nothing escapes. The allocation is a consequence of the API's lifetime contract, not of how cleverly you write the body — the seed of "ownership as a design primitive" (Chapter 17).
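One way to read that lifetime contract is to change who owns the state. The sketch below is a hypothetical alternative API, not a solution to the drill as posed: Counter, Next, and CountThree are invented names, and the caller keeps the counter in its own frame instead of receiving a closure.

```go
package main

import (
	"fmt"
	"testing"
)

// Counter holds the state that MakeCounter hid inside a closure.
type Counter struct{ n int }

// Next advances the counter in place.
//
//go:noinline
func (c *Counter) Next() int {
	c.n++
	return c.n
}

// CountThree owns a Counter for the duration of one call.
//
//go:noinline
func CountThree() int {
	var c Counter // the state lives in this frame
	c.Next()
	c.Next()
	return c.Next()
}

var sink int

func countThreeAllocs() float64 {
	return testing.AllocsPerRun(1000, func() { sink = CountThree() })
}

func main() {
	fmt.Println("CountThree:", CountThree())
	fmt.Println("allocs/call:", countThreeAllocs())
}
```

The counter's lifetime now ends with CountThree, so nothing has to outlive the frame. If a caller needs the counter to live longer, it decides where to put it, and the allocation moves to wherever that decision is made.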
What you can now do
You can ask the Go compiler whether any value lives on the stack or the heap, and read its answer — including the silences — fluently. You can name the ways a value's lifetime escapes its frame: a returned pointer, interface/any boxing, an escaping closure capture, an opaque call boundary, and storage into something long-lived; and you can predict each before you compile. You have seen why the analysis stops at interface boundaries, you know allocs/op is the durable truth and ns/op is weather, and you know an escape result is pinned to a compiler version.
Where this goes next
Chapter 2 — Value vs Pointer, Measured takes the value-vs-pointer choice from Experiment 1 and finds the crossover: the struct size above which a pointer finally beats a copy, and how receiver types quietly change the answer.
Chapter 3 — Memory Layout asks what a value costs once it is allocated: field alignment, padding, and the cache-line effects that make two correct programs differ several-fold in throughput.
Chapter 4 — Inlining explains the force that widened the compiler's view in Experiment 5 — and how the "clean" refactor that breaks inlining can put your allocations right back.
Course rule, restated: show it, or cut it. Everything in this chapter is in lab/. Run it.