
Day 090 — Graceful Shutdown & Timeouts

Month 4 · Week 1

🎯 Learning Objective

Run an HTTP server like production code: bound every phase with timeouts and shut down gracefully on a signal, draining in-flight requests before exit.

📚 Topics

  • http.Server timeouts, Server.Shutdown, ErrServerClosed
  • signal.NotifyContext, context deadlines, second-signal force kill

📖 Reading / Sources

📝 Notes

  • A default http.Server has no timeouts — a slow/idle client can hold a connection forever (Slowloris). Set at least ReadHeaderTimeout. The full set: ReadHeaderTimeout, ReadTimeout, WriteTimeout, IdleTimeout → [[http-timeouts]].
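  • ListenAndServe blocks until the server stops; run it in a goroutine so main can wait on a signal. Once Shutdown or Close is called it returns http.ErrServerClosed immediately, which is the expected success signal rather than a failure, so check errors.Is(err, http.ErrServerClosed). Because it returns before draining finishes, main must wait for Shutdown itself to return before exiting → [[error-wrapping]].
  • Shutdown(ctx) closes the listeners so no new connections are accepted, closes idle connections, then waits for active connections to go idle, bounded by ctx. If the context deadline hits first, Shutdown returns the context's error; it does not close the remaining connections itself, so they are cut only when Close is called or the process exits. Hijacked connections (for example WebSockets) are not tracked or waited for.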
  • signal.NotifyContext gives a context cancelled on SIGINT/SIGTERM and a stop() to restore default handling. Calling stop() after the first signal means a second Ctrl-C force-kills if shutdown hangs → [[context]] / [[signals]].
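  • Handlers should select on r.Context().Done() for long work: the request context is cancelled when the client's connection closes, the request is cancelled (HTTP/2), or ServeHTTP returns, so slow handlers can bail early. Shutdown does not cancel it; to signal in-flight handlers on shutdown, derive request contexts from a cancellable parent via Server.BaseContext → [[context-cancel]].
  • Don't reach for Close as the normal exit path: it hard-cuts every connection, in-flight requests included. Reserve Close for "shutdown deadline blew past and I must exit now".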
  • Order: receive signal → Shutdown(timeoutCtx) → drain the ListenAndServe error (ignoring ErrServerClosed) → exit.

💻 Code Examples

srv := &http.Server{
    Addr:              ":8090",
    Handler:           mux,
    ReadHeaderTimeout: 5 * time.Second,
    ReadTimeout:       15 * time.Second,
    WriteTimeout:      30 * time.Second,
    IdleTimeout:       60 * time.Second,
}

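go func() {
    // ListenAndServe blocks; it returns ErrServerClosed once Shutdown is called.
    if err := srv.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
        log.Fatalf("listen: %v", err) // a real failure, e.g. the port is already in use
    }
}()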

ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
defer stop()
<-ctx.Done() // first signal
stop()       // a second Ctrl-C now force-kills

shutCtx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
if err := srv.Shutdown(shutCtx); err != nil {
    log.Printf("graceful shutdown failed: %v", err) // deadline blew past
}

Full code: examples/month-04/graceful/main.go · Run: go run ./examples/month-04/graceful

🏋️ Exercises / Practice

Exercise Status Link
Re-used from Month 3 graceful-shutdown drill (signals + context) examples/month-03/graceful-shutdown

🐛 Mistakes Made

  • Treated the http.ErrServerClosed from ListenAndServe as a fatal error and log.Fatal'd on every clean shutdown. Now I special-case it with errors.Is.
  • Called srv.Shutdown after ListenAndServe on the same goroutine, so it never ran and the process hung: ListenAndServe blocks until the server has already stopped. Shutdown must be called from a different goroutine than the blocking serve.

❓ Open Questions

  • Good default for the shutdown grace window? (Often tied to the platform's terminationGracePeriodSeconds; pick a value below it so the orchestrator doesn't SIGKILL mid-drain.)

🧠 Active Recall (answer without looking)

  1. Q: What does ListenAndServe return after a successful Shutdown, and how do you treat it?
     A: http.ErrServerClosed, the expected success signal. Check errors.Is(err, http.ErrServerClosed) and ignore it.
  2. Q: Why call stop() (from NotifyContext) right after the first signal?
     A: It restores default signal handling so a second Ctrl-C force-kills the process if graceful shutdown hangs.

🪶 Feynman Reflection

Timeouts are seat belts for each request phase; graceful shutdown is a fire drill that lets everyone already inside finish before the doors lock. The server keeps running on its own goroutine while main waits for the signal, then gives in-flight requests a bounded window to complete.

🕳️ Knowledge Gaps

  • Coordinating shutdown order across multiple servers (HTTP + metrics + workers) with an errgroup-style supervisor.

✅ Summary

I can configure server timeouts, run the listener on its own goroutine, and drain cleanly on SIGINT/SIGTERM with a bounded Shutdown, treating ErrServerClosed as success.

⏭️ Next Steps / Prep for Tomorrow

  • Day 091: week review + closed-book recall.

Time spent: 90 min · Difficulty: 🟦🟦⬜⬜⬜ · Confidence: 🟦🟦🟦⬜⬜

Suggested commit: feat(examples): graceful shutdown and server timeouts (day 090)