
Day 061 — Deadlocks & how to spot them

Month 3 · Week 1

🎯 Learning Objective

Recognise the common shapes of channel deadlocks and goroutine leaks, and use the available tools (the runtime's fatal error: all goroutines are asleep - deadlock!, goroutine stack dumps, -race, go vet) to find them.

📚 Topics

  • The runtime deadlock detector: fatal error: all goroutines are asleep - deadlock!
  • Classic deadlock shapes; the difference between a deadlock and a goroutine leak
  • Tools: -race, go vet (copylocks/lostcancel), GODEBUG, goroutine stack dumps

📖 Reading / Sources

📝 Notes

  • A deadlock is when every goroutine is blocked waiting on something none of them will provide. If the runtime sees all goroutines asleep it aborts with fatal error: all goroutines are asleep - deadlock! → [[deadlock]].
  • The detector only fires when every goroutine is blocked. If even one is busy-looping (or a time.Sleep/network read is pending), a partial deadlock just hangs — no panic.
  • Classic shapes:
      • Unbuffered send, no receiver: ch <- v with nobody receiving blocks forever.
      • Receive on an empty channel that's never sent to (and never closed).
      • Forgot to close a channel that a for range is draining: the consumer waits forever.
      • Buffered channel overfilled: a send on a full buffer (len == cap) blocks until someone receives, and forever if nobody does.
      • WaitGroup miscount: Wait() hangs if the total passed to Add exceeds the number of Done calls.
      • Self-deadlock: one goroutine both sends and receives on the same unbuffered channel, so the send blocks before the receive can run.
  • A goroutine leak is subtler than a deadlock: the program makes progress, but some goroutines are blocked forever (e.g. a generator whose consumer quit early). No panic — just growing goroutine count and memory → [[goroutine-leak]]. Fix with a done/context cancel signal (Day 062).
  • Tools → [[race-detector]]:
      • go test -race / go run -race instruments memory accesses to catch data races. A data race is a different bug from a deadlock and -race does not detect deadlocks, but the two tend to cluster in the same code.
      • go vet catches copylocks (copying a sync.Mutex/WaitGroup by value) and lostcancel (a context cancel func that is never called).
      • On a hang, send SIGQUIT (Ctrl-\): the runtime exits with a dump of every goroutine's stack, showing exactly where each is parked. GOTRACEBACK=all makes crash tracebacks include all user goroutines rather than only the failing one.

💻 Code Examples

// Deadlock: unbuffered send with no receiver. Runtime aborts:
//   fatal error: all goroutines are asleep - deadlock!
package main

import "fmt"

func main() {
    ch := make(chan int) // unbuffered
    ch <- 1              // blocks forever; nobody will ever receive
    fmt.Println(<-ch)    // unreachable
}

(No runnable example — a deadlock aborts the process. Reproduce it intentionally in the playground, then fix it by adding a receiving goroutine or buffering the channel.)
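The two fixes mentioned above can be sketched as follows. This is a minimal illustration, not the only way to repair the program; viaGoroutine and viaBuffer are hypothetical helper names.

```go
package main

import "fmt"

// viaGoroutine moves the send into its own goroutine,
// so the calling goroutine is free to receive.
func viaGoroutine() int {
	ch := make(chan int) // still unbuffered
	go func() { ch <- 1 }()
	return <-ch
}

// viaBuffer gives the channel room for one value,
// so the send completes without a waiting receiver.
func viaBuffer() int {
	ch := make(chan int, 1)
	ch <- 1
	return <-ch
}

func main() {
	fmt.Println(viaGoroutine(), viaBuffer()) // prints "1 1"
}
```

Buffering only postpones the problem if more values are sent than the buffer holds, so the goroutine version is usually the more general fix.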

🏋️ Exercises / Practice

Exercise | Status | Link
Run all Week 1 exercises under -race to confirm no data races (-race does not detect deadlocks) |  | exercises/month-03/week-1/

🐛 Mistakes Made

  • Ranged over a channel I never closed; the consumer blocked and the runtime reported a deadlock. Added defer close(out) on the sender.
  • A generator leaked because the consumer broke out of its loop early; no panic, just a climbing goroutine count. Fixed with a done channel (Day 062).
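The fix for the first mistake can be sketched like this, assuming a simple producer/consumer pair; produce and sum are hypothetical names, not the code from the original exercise.

```go
package main

import "fmt"

// produce sends 0..n-1 on a fresh channel and closes it when done.
// Without the deferred close, a for range over the channel never ends.
func produce(n int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out) // the sender owns the channel, so the sender closes it
		for i := 0; i < n; i++ {
			out <- i
		}
	}()
	return out
}

// sum drains the channel; the loop exits once the channel is closed and empty.
func sum(in <-chan int) int {
	total := 0
	for v := range in {
		total += v
	}
	return total
}

func main() {
	fmt.Println(sum(produce(4))) // 0+1+2+3, prints "6"
}
```

Closing from the sending side keeps the rule simple: only the goroutine that knows no more values are coming closes the channel.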

❓ Open Questions

  • Why doesn't the runtime detect a partial deadlock (some goroutines stuck, one alive)? (The detector only checks whether any goroutine can still run. It can't prove that a live goroutine will never unblock the others, which is undecidable in general.)

🧠 Active Recall (answer without looking)

  1. Q: What exact condition triggers Go's built-in deadlock error?

    A: When all goroutines are simultaneously blocked (asleep) with no possibility of progress, the runtime aborts with fatal error: all goroutines are asleep - deadlock!. A partial deadlock just hangs.

  2. Q: How does a goroutine leak differ from a deadlock?

    A: In a leak the program keeps running, but some goroutines are blocked forever (e.g. a generator with no consumer). There's no fatal error; it shows up as a growing goroutine count and memory use. Fix with cancellation (done/context).
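The "growing goroutine count" in the second answer can be made visible with runtime.NumGoroutine. The sketch below deliberately leaks generators to show the symptom; leakyGen and leakedAfter are hypothetical names, and the fix (a done channel) is left to Day 062.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// leakyGen is a generator with no cancellation: once the consumer stops
// receiving, its goroutine blocks on the send forever.
func leakyGen() <-chan int {
	out := make(chan int)
	go func() {
		for i := 0; ; i++ {
			out <- i
		}
	}()
	return out
}

// leakedAfter takes one value from each of n generators, abandons them,
// and reports how many extra goroutines are still alive afterwards.
func leakedAfter(n int) int {
	before := runtime.NumGoroutine()
	for i := 0; i < n; i++ {
		<-leakyGen() // read the first value, then drop the channel
	}
	time.Sleep(50 * time.Millisecond) // let the generators park on their next send
	return runtime.NumGoroutine() - before
}

func main() {
	// Each abandoned generator leaves one blocked goroutine behind.
	fmt.Println("leaked goroutines:", leakedAfter(10))
}
```

Comparing runtime.NumGoroutine before and after a piece of work is a cheap first check in tests; a count that keeps rising across repeated calls suggests a leak.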

🪶 Feynman Reflection

A deadlock is a four-way stop where every car waits for the others to go first — nobody moves, ever. Go's runtime is a traffic cop that, if it sees all cars frozen, throws up its hands and crashes the program with a clear message. A goroutine leak is sneakier: traffic flows, but one car is stuck in a dead-end nobody told it to leave — it just sits there burning fuel.

🕳️ Knowledge Gaps

  • Reading goroutine stack dumps fluently (which frame is the real culprit) — needs practice with real hangs.

✅ Summary

I can name the common deadlock shapes, distinguish a deadlock (whole program stuck, runtime aborts with a fatal error) from a goroutine leak (program runs, goroutines stranded), and reach for stack dumps, go vet, and -race (for the data races that often accompany them) to diagnose them.

⏭️ Next Steps / Prep for Tomorrow

  • Day 062: generators and done channels — the cancellation pattern that prevents leaks.

Time spent | Difficulty | Confidence
90 min | 🟦🟦🟦⬜⬜ | 🟦🟦🟦⬜⬜

Suggested commit: docs(journal): deadlocks and how to spot them (day 061)