Day 065 — sync.WaitGroup in Depth

Month 3 · Week 2

🎯 Learning Objective

Use sync.WaitGroup to fan out work and join it correctly, collect results without a data race, and aggregate errors using only the standard library.

📚 Topics

  • Add / Done / Wait lifecycle and the counter invariant
  • Lock-free result collection via distinct slice indices
  • Error aggregation without golang.org/x/sync/errgroup

📖 Reading / Sources

📝 Notes

  • A [[waitgroup]] is a counter: Add(n) raises it, Done() lowers it by one, Wait() blocks until it hits zero. The rhythm is Add before go, defer Done, then Wait.
  • Add must run on the launching goroutine, before the go statement. Calling Add inside the goroutine races with Wait, which may already have returned on a zero counter.
  • The counter must never go negative: calling Done more often than Add accounted for panics with "sync: negative WaitGroup counter". Keep exactly one Done per Add; defer makes that one-to-one mapping reliable even on panic.
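  • Lock-free collection: if each goroutine writes a distinct index of a pre-sized slice (results := make([]T, n)), there is no shared write target, so no mutex is needed; a WaitGroup to join is enough. Each Done happens before the Wait it unblocks, so reading results after Wait returns is safe. This is the cheapest fan-out/collect pattern, since it needs no synchronization beyond the join.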
  • append is not goroutine-safe. If multiple goroutines append to one shared slice you must guard it with a Mutex, or write to per-index slots instead. → [[slice-aliasing]]
  • Error aggregation without errgroup: collect errors under a Mutex (or per-index), then stitch them with [[errors-join]] (errors.Join, Go 1.20+), which returns a single error whose Unwrap() []error works with errors.Is/errors.As.
  • A WaitGroup must not be copied after first use; pass *sync.WaitGroup to helper functions. Reuse is allowed only after a prior Wait returns.

💻 Code Examples

results := make([]int, len(inputs)) // each goroutine owns one index — no lock
var wg sync.WaitGroup
for i, x := range inputs {
    wg.Add(1)
    go func(i, x int) {
        defer wg.Done()
        results[i] = x * x // distinct slot → race-free without a mutex
    }(i, x)
}
wg.Wait()

Full code: examples/month-03/waitgroup/main.go · Run: go run ./examples/month-03/waitgroup
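
The note about not copying a WaitGroup matters as soon as the goroutine body moves into a named function. A minimal sketch, assuming hypothetical helpers square and squareAll that are not part of the example file referenced above:

```go
package main

import (
	"fmt"
	"sync"
)

// square is a hypothetical helper. It takes *sync.WaitGroup so that Done
// lowers the caller's counter; a sync.WaitGroup parameter passed by value
// would be a copy, and the caller's Wait would never see the Done.
func square(wg *sync.WaitGroup, out []int, i, x int) {
	defer wg.Done()
	out[i] = x * x // each call owns index i, so no lock is needed
}

// squareAll launches one helper goroutine per input and joins them.
func squareAll(inputs []int) []int {
	results := make([]int, len(inputs))
	var wg sync.WaitGroup
	for i, x := range inputs {
		wg.Add(1)
		go square(&wg, results, i, x) // pass the address, never the value
	}
	wg.Wait()
	return results
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3})) // prints [1 4 9]
}
```

go vet's copylocks check flags a WaitGroup passed by value, so running go vet is a cheap guard against this mistake.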

🏋️ Exercises / Practice

Exercise Status Link
Thread-safe counter (Mutex) under fan-out exercises/month-03/week-2/safecounter

🐛 Mistakes Made

  • Called wg.Add(1) inside the goroutine — go test -race reported a race between Add and Wait, and occasionally Wait returned early. Moved Add before go.
  • Had every goroutine append to one shared errs slice with no lock → sporadic lost/duplicated entries. Guarded the append with a sync.Mutex.
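
The shared-append mistake can also be avoided without a lock by giving each goroutine its own error slot, the per-index option mentioned in the notes. A minimal sketch under that assumption; parsePositive and validate are hypothetical names used only for illustration:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// parsePositive is a hypothetical worker that rejects non-positive inputs.
func parsePositive(n int) error {
	if n <= 0 {
		return fmt.Errorf("input %d is not positive", n)
	}
	return nil
}

// validate gives each goroutine its own slot in errs, so no mutex is needed.
func validate(inputs []int) error {
	errs := make([]error, len(inputs)) // a nil slot means "no error"
	var wg sync.WaitGroup
	for i, n := range inputs {
		wg.Add(1) // before the go statement, never inside the goroutine
		go func(i, n int) {
			defer wg.Done()
			errs[i] = parsePositive(n) // distinct index per goroutine
		}(i, n)
	}
	wg.Wait()
	return errors.Join(errs...) // Join discards the nil entries
}

func main() {
	fmt.Println(validate([]int{3, -1, 0}))
	fmt.Println(validate([]int{1, 2}) == nil)
}
```

A side effect of the per-index layout is that the joined messages come out in input order, whereas a mutex-guarded append records them in completion order.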

❓ Open Questions

  • When is the stdlib Mutex+WaitGroup+errors.Join combo enough vs reaching for x/sync/errgroup (cancellation + first-error)? (errgroup adds context cancellation on first error — worth it for early-abort fan-outs.)

🧠 Active Recall (answer without looking)

  1. Q: Why must wg.Add precede the go statement?
     A: Add running inside the goroutine races with Wait; Wait could observe a zero counter and return before the goroutine even started, so the work isn't waited for.
  2. Q: Two goroutines append to the same slice under a WaitGroup. Is that safe?
     A: No. append reads/writes the slice header and backing array; concurrent appends are a data race. Guard with a Mutex or write to distinct pre-sized indices.

🪶 Feynman Reflection

A WaitGroup is a tally of outstanding jobs. You bump the tally up by one before launching each job, each job knocks it down by one as it finishes, and Wait parks the boss until the tally is zero — at which point all jobs are provably done.

🕳️ Knowledge Gaps

  • errgroup's context cancellation semantics — to compare once I'm allowed third-party libs.

✅ Summary

I can fan out with a WaitGroup, collect results lock-free via distinct indices, and aggregate errors with a guarded slice + errors.Join, all stdlib-only.

⏭️ Next Steps / Prep for Tomorrow

  • Day 066: sync.Once for run-once init and sync.Pool for object reuse.

Time spent Difficulty Confidence
90 min 🟦🟦⬜⬜⬜ 🟦🟦🟦🟦⬜

Suggested commit: feat(examples): WaitGroup fan-out and error aggregation (day 065)