Day 080 - Project: Context & Graceful Shutdown
Month 3 · Week 4
🎯 Learning Objective
Make the crawler stoppable: one cancellation signal (Ctrl-C, a deadline, or a first error) propagates to every worker so they stop pulling work, abandon in-flight fetches, and let main wait for a clean exit.
📚 Topics
- `context.WithCancel`/`WithTimeout`, `ctx.Done()`, `ctx.Err()`
- `signal.NotifyContext` for SIGINT/SIGTERM
- Drain-and-wait shutdown with `sync.WaitGroup`
📖 Reading / Sources
- `context` docs: `WithCancel`, `WithTimeout`, `Done`
- `os/signal`: `NotifyContext`
- Go blog: Contexts
📝 Notes
- A context is a cancellation tree. Cancelling a parent cancels every child; `ctx.Done()` is a channel closed on cancel, and `ctx.Err()` says why (`context.Canceled` or `context.DeadlineExceeded`) → [[context-cancellation]].
- Cancelling does NOT stop a goroutine. It only closes `Done()`. Each goroutine must `select` on `ctx.Done()` and return itself: context is cooperative, not preemptive → [[cooperative-cancellation]].
- `signal.NotifyContext(parent, os.Interrupt, syscall.SIGTERM)` returns a context cancelled on the first matching signal, plus a `stop()` you must `defer` to restore default signal handling. Far cleaner than a manual `signal.Notify` channel + goroutine → [[signal-handling]].
- Compose triggers freely. Wrap the signal context in `context.WithTimeout(ctx, d)` and you get "cancel on Ctrl-C or after d". The workers don't care which fired; they just see `Done()` close → [[context-composition]].
- Always `defer cancel()` (and `defer stop()`). Even when a timeout will fire, calling `cancel` releases the timer and child resources immediately; `go vet`'s `lostcancel` check flags a discarded cancel as a context leak → [[resource-cleanup]].
- Graceful shutdown = drain + wait. On cancel, the producer stops sending and closes the jobs channel; workers finish or abandon current work and return; `main` blocks on `wg.Wait()` so it never exits before its goroutines. Returning while workers run leaks them → [[goroutine-leak]] · [[waitgroup]].
- Any goroutine blocked on a channel send or receive must select on `Done()` too, or it blocks forever once the other side is gone. For the producer's send: `select { case jobs <- j: case <-ctx.Done(): return }` → [[select]].
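The first note (cancellation tree, `Done()`, `Err()`) can be checked with a small sketch. The helper names `describe`, `treeDemo`, and `timeoutDemo` are made up for illustration, and the 10 ms timeout is an arbitrary value.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// describe reports why a context ended, or "active" if it has not.
func describe(ctx context.Context) string {
	switch err := ctx.Err(); {
	case err == nil:
		return "active"
	case errors.Is(err, context.Canceled):
		return "canceled"
	case errors.Is(err, context.DeadlineExceeded):
		return "deadline exceeded"
	default:
		return err.Error()
	}
}

// treeDemo builds parent -> child, cancels only the parent, and returns
// the state of both contexts.
func treeDemo() (parent, child string) {
	p, cancelParent := context.WithCancel(context.Background())
	c, cancelChild := context.WithCancel(p)
	defer cancelChild()

	cancelParent()
	<-c.Done() // the child's Done closes because its parent was cancelled
	return describe(p), describe(c)
}

// timeoutDemo returns the state of a context once its timeout has passed.
func timeoutDemo() string {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond)
	defer cancel()
	<-ctx.Done()
	return describe(ctx)
}

func main() {
	p, c := treeDemo()
	fmt.Println("parent:", p, "child:", c)
	fmt.Println("timeout:", timeoutDemo())
}
```

Cancelling the parent alone is enough for the child to report `context.Canceled`, while the timeout context reports `context.DeadlineExceeded`.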
💻 Code Examples
A worker that honours cancellation while idle and mid-task (from the example):
    func worker(ctx context.Context, jobs <-chan int, wg *sync.WaitGroup) {
    	defer wg.Done()
    	for {
    		select {
    		case <-ctx.Done():
    			return // cancelled while waiting for the next job
    		case job, ok := <-jobs:
    			if !ok {
    				return // jobs closed: normal end
    			}
    			select {
    			case <-time.After(150 * time.Millisecond): // pretend work
    			case <-ctx.Done():
    				return // cancelled mid-task: drop it and leave
    			}
    			_ = job
    		}
    	}
    }
Full code: `examples/month-03/graceful-shutdown/` · Run: `go run ./examples/month-03/graceful-shutdown` (or press Ctrl-C early)
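For reference, here is one way the pieces might be wired together end to end. This is a sketch, not the code in the examples directory: `crawl` and `runWithTimeout` are hypothetical names, and the 20 ms fake fetch and the worker and job counts are arbitrary assumptions.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"sync"
	"sync/atomic"
	"syscall"
	"time"
)

// crawl fans nJobs fake jobs out to nWorkers workers and returns how many
// finished before ctx was cancelled. The 20ms "fetch" is a stand-in.
func crawl(ctx context.Context, nWorkers, nJobs int) int64 {
	jobs := make(chan int)
	var finished int64
	var wg sync.WaitGroup

	for i := 0; i < nWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				select {
				case <-ctx.Done():
					return // cancelled while idle
				case _, ok := <-jobs:
					if !ok {
						return // jobs closed: normal end
					}
					select {
					case <-time.After(20 * time.Millisecond):
						atomic.AddInt64(&finished, 1)
					case <-ctx.Done():
						return // cancelled mid-task
					}
				}
			}
		}()
	}

	wg.Add(1)
	go func() { // producer
		defer wg.Done()
		defer close(jobs) // lets idle workers exit on the normal path
		for j := 0; j < nJobs; j++ {
			select {
			case jobs <- j:
			case <-ctx.Done():
				return // stop sending once cancelled
			}
		}
	}()

	wg.Wait() // drain + wait: no goroutine outlives this call
	return atomic.LoadInt64(&finished)
}

// runWithTimeout combines the two triggers: Ctrl-C/SIGTERM or a timeout of
// timeoutMs milliseconds, whichever comes first.
func runWithTimeout(timeoutMs, nWorkers, nJobs int) int64 {
	sigCtx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop() // restore default signal handling

	ctx, cancel := context.WithTimeout(sigCtx, time.Duration(timeoutMs)*time.Millisecond)
	defer cancel() // release the timer even if the crawl finishes early

	return crawl(ctx, nWorkers, nJobs)
}

func main() {
	n := runWithTimeout(500, 4, 1000)
	fmt.Println("jobs finished before shutdown:", n)
}
```

With these numbers the timeout fires long before all 1000 jobs are done, so the run always ends through the cancellation path. Pressing Ctrl-C earlier takes the same path.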
🏋️ Exercises / Practice
| Exercise | Status | Link |
|---|---|---|
| Reuse the crawler/worker-pool exercises and add a `ctx` param that aborts the crawl | ✅ | `exercises/month-03/week-4/crawl/` |
🐛 Mistakes Made
- Wrote the producer as a plain `for { jobs <- j }`. After cancel the workers were gone, so the producer blocked forever on the send. Added a `select` with `<-ctx.Done()`.
- Forgot `defer stop()` from `NotifyContext`; `go vet` didn't catch that one, but the signal handler stayed installed. Added it.
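The first mistake is easy to reproduce in isolation. The sketch below assumes an unbuffered `jobs` channel and uses hypothetical names (`produce`, `abandonedProducer`). It simulates a consumer that takes a few jobs and then disappears, and checks that the producer returns instead of hanging.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// produce sends 0..n-1 on jobs and closes it when done. It returns how many
// jobs were handed off, plus ctx.Err() if cancellation cut the loop short.
func produce(ctx context.Context, jobs chan<- int, n int) (int, error) {
	defer close(jobs)
	for j := 0; j < n; j++ {
		select {
		case jobs <- j: // a worker took the job
		case <-ctx.Done(): // nobody is listening any more: give up
			return j, ctx.Err()
		}
	}
	return n, nil
}

type result struct {
	sent int
	err  error
}

// abandonedProducer simulates the bug scenario: the only consumer takes
// `taken` jobs (taken must be less than total), leaves, and the context is
// cancelled. It reports how many jobs were sent and whether the producer
// stopped because of the cancellation.
func abandonedProducer(taken, total int) (sent int, stopped bool) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	jobs := make(chan int) // unbuffered: every send needs a receiver

	resc := make(chan result, 1)
	go func() {
		n, err := produce(ctx, jobs, total)
		resc <- result{n, err}
	}()

	for i := 0; i < taken; i++ {
		<-jobs
	}
	cancel() // the workers are gone; tell the producer

	select {
	case r := <-resc:
		return r.sent, r.err != nil
	case <-time.After(time.Second):
		return -1, false // producer is stuck on the send
	}
}

func main() {
	sent, stopped := abandonedProducer(3, 10)
	fmt.Println("sent:", sent, "stopped by cancel:", stopped)
}
```

With a plain `jobs <- j` in place of the `select`, the goroutine running `produce` would stay blocked on the fourth send and `abandonedProducer` would hit its one-second fallback.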
❓ Open Questions
- Should an in-flight fetch be abandoned or allowed to finish on shutdown? (Depends: abandon for responsiveness, finish for data integrity. The context-aware HTTP request makes "abandon" the default.)
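To make the "abandon" option concrete, here is a sketch of a context-aware fetch built on `http.NewRequestWithContext`. The local `httptest` server, the helper names `fetch` and `abandonSlowFetch`, and the delays are assumptions for illustration; the real crawler's fetch code may look different.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetch issues a GET bound to ctx; cancelling ctx aborts the request.
func fetch(ctx context.Context, client *http.Client, url string) (int, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return 0, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// abandonSlowFetch starts a local test server that answers after
// serverDelayMs and fetches it with a budget of budgetMs. It returns "ok",
// "abandoned" (the deadline cut the request short), or the error text.
func abandonSlowFetch(serverDelayMs, budgetMs int) string {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(time.Duration(serverDelayMs) * time.Millisecond):
			w.WriteHeader(http.StatusOK)
		case <-r.Context().Done(): // the client gave up
		}
	}))
	defer srv.Close()

	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(budgetMs)*time.Millisecond)
	defer cancel()

	_, err := fetch(ctx, srv.Client(), srv.URL)
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, context.DeadlineExceeded):
		return "abandoned"
	default:
		return err.Error()
	}
}

func main() {
	fmt.Println("fast server:", abandonSlowFetch(10, 1000))
	fmt.Println("slow server:", abandonSlowFetch(500, 30))
}
```

To let in-flight fetches finish instead, one option is to give the request a context that is not tied to the shutdown signal, at the cost of a slower exit.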
🧠 Active Recall (answer without looking)
- Q: Does cancelling a context stop the goroutines using it?
  A: No. It only closes `ctx.Done()`. Each goroutine must select on `Done()` and return on its own; cancellation is cooperative, not preemptive.
- Q: What does `signal.NotifyContext` return, and why must you call its second value?
  A: It returns a context cancelled on the first matching signal, plus a `stop()` function. Deferring `stop()` releases the signal handler (and the context's resources) so signals revert to default handling.
🪶 Feynman Reflection
A context is a kill switch wired to every worker at once. Flipping it (signal, timeout, or error) doesn't yank anyone offstage; it turns on a light (`Done()`) that everyone is watching, and each worker walks off when they see it. `main` is the stage manager who waits (`wg.Wait()`) until the stage is empty before turning off the lights.
🕳️ Knowledge Gaps
- `context.AfterFunc` (Go 1.21) and `context.WithoutCancel`/`WithDeadlineCause`: newer helpers I haven't used yet.
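As a starting point for closing this gap, here is a tentative sketch that touches all three helpers. It needs Go 1.21 or later. The `errCrawlBudget` cause and the `newerHelpers` function are invented for illustration and are not part of the crawler.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errCrawlBudget is a hypothetical cause attached to the deadline.
var errCrawlBudget = errors.New("crawl budget exhausted")

// newerHelpers exercises WithDeadlineCause, AfterFunc and WithoutCancel and
// reports what each one did once the deadline passed.
func newerHelpers() (cause string, cleanupRan, detachedAlive bool) {
	// WithDeadlineCause: like WithDeadline, but records a custom cause.
	deadline := time.Now().Add(10 * time.Millisecond)
	ctx, cancel := context.WithDeadlineCause(context.Background(), deadline, errCrawlBudget)
	defer cancel()

	// AfterFunc: run a callback in its own goroutine once ctx is done.
	ran := make(chan struct{})
	stop := context.AfterFunc(ctx, func() { close(ran) })
	defer stop()

	// WithoutCancel: keeps ctx's values but ignores its cancellation.
	detached := context.WithoutCancel(ctx)

	<-ctx.Done()
	select {
	case <-ran:
		cleanupRan = true
	case <-time.After(time.Second):
	}
	return context.Cause(ctx).Error(), cleanupRan, detached.Err() == nil
}

func main() {
	cause, ran, alive := newerHelpers()
	fmt.Println("cause:", cause)
	fmt.Println("AfterFunc callback ran:", ran)
	fmt.Println("detached context still active:", alive)
}
```

`ctx.Err()` is still `context.DeadlineExceeded` here; the custom error is only visible through `context.Cause(ctx)`.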
✅ Summary
The crawler now shuts down gracefully: `signal.NotifyContext` + `WithTimeout` give one cancellation source, every worker selects on `Done()`, and `main` drains then `wg.Wait()`s for a leak-free exit.
⏭️ Next Steps / Prep for Tomorrow
- Day 081: add rate limiting so we don't hammer a host, using `time.Ticker` and a token bucket.
| Time spent | Difficulty | Confidence |
|---|---|---|
| 90 min | 🟦🟦⬜⬜⬜ | 🟦🟦🟦⬜⬜ |
Suggested commit: feat(examples): crawler context & graceful shutdown (day 080)