Day 047: Benchmarks (testing.B)
Month 2 · Week 3
🎯 Learning Objective
Write trustworthy micro-benchmarks with `testing.B`: drive the `b.N` loop, exclude setup with `b.ResetTimer`, report allocations, and avoid dead-code elimination.
📚 Topics
- `func BenchmarkXxx(b *testing.B)` and the `b.N` loop
- `b.ResetTimer` / `b.StopTimer` / `b.StartTimer`
- `b.ReportAllocs` · sub-benchmarks with `b.Run` · `benchstat`
📖 Reading / Sources
- `testing` package docs: Benchmarks
- Go Blog: Using Subtests and Sub-benchmarks
- `benchstat`
- Dave Cheney: How to write benchmarks in Go
📝 Notes
- A benchmark loops the operation `b.N` times. The framework auto-tunes `b.N` (1 → 100 → ...) until the run lasts ~1s (the default `-benchtime`), then reports `ns/op` = total time / `b.N`. You never set `b.N`. → [[bench-b-n]]
- Run with `go test -bench=. -benchmem`. `-benchmem` (or `b.ReportAllocs()` in code) adds `B/op` and `allocs/op`, which are often more decision-relevant than `ns/op`.
- Exclude setup from timing: do expensive prep before the loop and call `b.ResetTimer()`. For per-iteration setup, bracket it with `b.StopTimer()` / `b.StartTimer()`.
- Dead-code elimination: if the result is unused, the optimizer may delete the work and you benchmark nothing. Assign the result to a package-level sink. `_ = f()` only discards the value, so it may not stop the compiler from removing a call it can inline. Don't include `fmt.Println` inside the loop.
- Sub-benchmarks: `b.Run("size=1k", func(b *testing.B){...})` sweeps input sizes; great for spotting O(n²) growth.
- Results are noisy. Use multiple runs (`-count=10`) and `benchstat` to compare old vs new with a confidence interval. A single run proves little.
- `-cpu=1,4,8` runs the benchmark at different `GOMAXPROCS` values to see parallel scaling; `b.RunParallel` benchmarks contended code through a callback that receives a `*testing.PB`.
- Benchmark realistic inputs; micro-optimizing an unrealistic case wastes effort.
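The `b.RunParallel` note can be sketched as follows. This is a minimal standalone program rather than a `_test.go` file: it assumes `testing.Benchmark` is used to drive the functions from `main`, and the names `benchMutexCounter`, `benchAtomicCounter`, and `iterations` are hypothetical. It compares a mutex-guarded counter with an atomic one under contention.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"testing"
)

// benchMutexCounter increments one shared counter under a mutex.
// RunParallel splits the b.N iterations across several goroutines.
func benchMutexCounter(b *testing.B) {
	var mu sync.Mutex
	var n int64
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() { // true until the shared pool of b.N iterations is used up
			mu.Lock()
			n++
			mu.Unlock()
		}
	})
	if n != int64(b.N) {
		b.Fatalf("ran %d iterations, want %d", n, b.N)
	}
}

// benchAtomicCounter does the same work with an atomic add.
func benchAtomicCounter(b *testing.B) {
	var n int64
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			atomic.AddInt64(&n, 1)
		}
	})
	if n != int64(b.N) {
		b.Fatalf("ran %d iterations, want %d", n, b.N)
	}
}

// iterations runs fn through the framework and returns the final b.N.
func iterations(fn func(*testing.B)) int {
	return testing.Benchmark(fn).N
}

func main() {
	fmt.Println("mutex: ", testing.Benchmark(benchMutexCounter))
	fmt.Println("atomic:", testing.Benchmark(benchAtomicCounter))
}
```

In a real `_test.go` file the two functions would be named `BenchmarkXxx` and run with `go test -bench=. -cpu=1,4,8` to see how each variant scales.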
💻 Code Examples
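```go
import (
	"fmt"
	"strings"
	"testing"
)

// sink is package-level so the compiler cannot prove the results are unused.
var sink string

func BenchmarkReverse(b *testing.B) {
	const s = "the quick brown fox — héllo 🚀"
	b.ReportAllocs() // also print B/op and allocs/op
	b.ResetTimer()   // costs nothing here (a const has no run-time setup); needed once real setup precedes the loop
	for i := 0; i < b.N; i++ {
		sink = Reverse(s) // store the result so the call isn't optimized away
	}
}

// Sweep sizes to expose super-linear growth:
func BenchmarkJoin(b *testing.B) {
	for _, n := range []int{10, 1000} {
		parts := make([]string, n)
		for i := range parts {
			parts[i] = "x" // non-empty parts, so Join has real bytes to copy
		}
		b.Run(fmt.Sprintf("n=%d", n), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				sink = strings.Join(parts, "")
			}
		})
	}
}
```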
- Runnable hand-rolled version of the `b.N` loop: `examples/month-02/bench-demo/` · Run: `go run ./examples/month-02/bench-demo`
- Real `BenchmarkReverse`: `exercises/month-02/week-3/reverse/` · Run: `go test -bench=. -benchmem ./exercises/month-02/week-3/reverse`
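To make the idea of a hand-rolled `b.N` loop concrete, here is a minimal sketch. It is not the contents of `examples/month-02/bench-demo`; the `measure` helper and its ×10 growth rule are assumptions chosen for brevity, and the real framework sizes the next `b.N` from the previous run's timing instead.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// maxN caps the iteration count so a near-zero-cost fn cannot loop forever.
const maxN = 1_000_000_000

// measure is a hypothetical, simplified stand-in for the testing framework:
// it grows n until one timed batch of n calls lasts at least target,
// then reports the average time per call for that batch.
func measure(fn func(), target time.Duration) (n int, perOp time.Duration) {
	n = 1
	for {
		start := time.Now()
		for i := 0; i < n; i++ {
			fn()
		}
		elapsed := time.Since(start)
		if elapsed >= target || n >= maxN {
			return n, elapsed / time.Duration(n)
		}
		n *= 10 // simplification: just multiply and retry
	}
}

// sink keeps the result observable so the call is not optimized away.
var sink string

func main() {
	parts := make([]string, 1000)
	for i := range parts {
		parts[i] = "x"
	}
	n, perOp := measure(func() { sink = strings.Join(parts, "") }, 100*time.Millisecond)
	fmt.Printf("n=%d  %v/op\n", n, perOp)
}
```

The point of the growth loop is the same as in `testing.B`: a single call is too short to time reliably, so the batch is enlarged until the stopwatch reading dwarfs timer noise.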
🏋️ Exercises / Practice
| Exercise | Status | Link |
|---|---|---|
| `BenchmarkReverse` with `ReportAllocs` | ✅ | `exercises/month-02/week-3/reverse` |
| Compare `+=` vs `strings.Builder` vs `Join` | ✅ | `examples/month-02/bench-demo` |
🐛 Mistakes Made
- Left `fmt.Sprintf` (and printing) inside the benchmark loop, so I measured the formatter, not the target. Moved it out.
- Didn't sink the result; the suspiciously fast `ns/op` was the optimizer deleting the call.
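The sink fix for the second mistake might look like the sketch below. `reverse` is a hypothetical stand-in for the exercise's `Reverse`, and `testing.Benchmark` is used only so the program runs on its own with `go run`.

```go
package main

import (
	"fmt"
	"testing"
)

// reverse is a hypothetical stand-in for the function under test.
func reverse(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

// sink is package-level, so the compiler has to assume the stored
// value can be read elsewhere and must keep the work that produced it.
var sink string

func benchReverse(b *testing.B) {
	const s = "the quick brown fox"
	var out string
	for i := 0; i < b.N; i++ {
		out = reverse(s) // keep the result in a local inside the loop
	}
	sink = out // publish once after the loop
}

func main() {
	fmt.Println(testing.Benchmark(benchReverse))
}
```

Writing to the local inside the loop and to `sink` once afterwards keeps the global store out of the measured hot path while still making the result observable.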
❓ Open Questions
- How many `-count` runs does `benchstat` need for a stable p-value? (Rule of thumb: ≥10; more for noisy machines.)
🧠 Active Recall (answer without looking)
1. Q: Who sets `b.N` and why?
   A: The testing framework auto-tunes `b.N` upward until the loop runs long enough (~1s) to time reliably; it reports total time / `b.N` as `ns/op`. You never set it.
2. Q: Two ways a benchmark can lie?
   A: (1) Setup counted in the timed region: fix with `b.ResetTimer`. (2) Dead-code elimination removing unused results: fix by sinking the result. Also: a single noisy run; use `-count` + `benchstat`.
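The first fix from the answer above can be sketched like this: one-time setup is excluded with `b.ResetTimer`, and per-iteration setup is bracketed with `b.StopTimer` / `b.StartTimer`. The sorting workload, the `makeInput` helper, and the fixed seed are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
	"testing"
)

// makeInput returns a deterministic pseudo-random permutation of 0..n-1.
func makeInput(n int) []int {
	return rand.New(rand.NewSource(1)).Perm(n)
}

func benchSort(b *testing.B) {
	master := makeInput(10000) // one-time setup
	work := make([]int, len(master))
	b.ResetTimer() // drop the setup time recorded so far

	for i := 0; i < b.N; i++ {
		b.StopTimer()
		copy(work, master) // per-iteration setup: sort.Ints mutates its input
		b.StartTimer()
		sort.Ints(work) // only this call is timed
	}
}

func main() {
	fmt.Println(testing.Benchmark(benchSort))
}
```

Stopping and starting the timer has a cost of its own, so this pattern fits iterations that do a meaningful amount of work; for very cheap operations it is usually better to restructure the benchmark so setup stays outside the loop.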
🪶 Feynman Reflection
A benchmark is a stopwatch wrapped around a loop the framework runs enough times to get a stable average. Your jobs: don't time the setup, don't let the compiler delete the work, and report allocations — because allocation count often predicts real-world cost better than raw nanoseconds.
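To see allocation counts directly, here is a small sketch comparing the three concatenation strategies from the exercise table. It swaps in `testing.AllocsPerRun` instead of `b.ReportAllocs` so it runs as a plain program; the helper names `concatPlus`, `concatBuilder`, and `allocs` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps each result observable so the work is not optimized away.
var sink string

// concatPlus builds the string with +=, producing a new string per step.
func concatPlus(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

// concatBuilder appends into a strings.Builder, which grows its buffer
// only occasionally.
func concatBuilder(parts []string) string {
	var sb strings.Builder
	for _, p := range parts {
		sb.WriteString(p)
	}
	return sb.String()
}

// allocs reports the average number of heap allocations per call of f.
func allocs(f func([]string) string, parts []string) float64 {
	return testing.AllocsPerRun(100, func() { sink = f(parts) })
}

func main() {
	parts := make([]string, 64)
	for i := range parts {
		parts[i] = "x"
	}
	join := func(p []string) string { return strings.Join(p, "") }

	fmt.Println("+=      allocs/op:", allocs(concatPlus, parts))
	fmt.Println("Builder allocs/op:", allocs(concatBuilder, parts))
	fmt.Println("Join    allocs/op:", allocs(join, parts))
}
```

In an actual benchmark the same comparison would come from `b.ReportAllocs()` or `-benchmem`, which report `allocs/op` alongside `ns/op`.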
🕳️ Knowledge Gaps
- Reading `pprof` CPU/heap profiles to explain why one variant is faster: the next layer down.
✅ Summary
I can write honest benchmarks: drive `b.N`, reset the timer, report allocs, sink results, sweep sizes, and compare runs with `benchstat`.
⏭️ Next Steps / Prep for Tomorrow
- Day 048: fuzzing with `testing.F` and runnable `Example` functions.
| Time spent | Difficulty | Confidence |
|---|---|---|
| 90 min | 🟦🟦⬜⬜⬜ | 🟦🟦🟦⬜⬜ |
Suggested commit: test(week-3): testing.B benchmarks and benchmem (day 047)