
Weekly Review — Month 3 · Week 4 (Days 078–084)


📅 The Week in One Line

Built the month's capstone — a bounded, cancellable, rate-limited concurrent crawler — then tested it under -race and profiled it.

✅ What I Completed

  • Day 078 — Crawler design & a single cancellable fetch (context, body close)
  • Day 079 — Worker pool + dedup (single-owner map; mutex Set); clean termination via outstanding-work counter
  • Day 080 — context cancellation & graceful shutdown (signal.NotifyContext, drain + WaitGroup)
  • Day 081 — Rate limiting: time.Ticker pacing vs. token-bucket channel; orthogonal to the pool
  • Day 082 — Tests under go test -race; deterministic via sorting, invariants, injected time
  • Day 083 — Profiling with runtime/pprof (CPU/heap/block/mutex/goroutine); go tool pprof
  • Day 084 — Month 3 review + tag v0.3.0
  • Mini-project: concurrent crawler (stdlib, in-memory graph) — pool, dedup, context, rate limit
  • Exercises solved: 3 (visited, crawl, tokenbucket), all -race-clean

💡 Lessons Learned

  • Design the traversal single-threaded first; parallelism is a layer you add to a correct algorithm, not a substitute for one.
  • The cleanest dedup is no shared state: one goroutine owns seen, workers communicate links back over a channel — share memory by communicating.
  • Termination is the subtle part of a crawler: an outstanding-work counter (n++ when a page is dispatched, n-- when its result comes back) tells you exactly when to close the work channel, namely when n is zero and nothing is left in the queue.
  • Cancellation is cooperative: cancelling a context only closes its Done() channel; every goroutine must select on it and return, and main must wg.Wait() so it does not exit while workers are still shutting down.
  • Rate limiting (how often) and the worker pool (how many) are independent dials; you usually want both.
  • A green -race is evidence, not proof — assert invariants (one winner, each page once, exactly capacity grants) and inject time so tests are deterministic.
  • Profile before optimising: CPU/heap for cost, block/mutex for contention, goroutine for leaks.
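
The single-owner dedup and the outstanding-work counter from the lessons above can be sketched together. This is a simplified illustration under stated assumptions: `graph` is a made-up in-memory link graph, `crawl` and its parameters are hypothetical names, and the loop is a variation on the TGPL pattern rather than a copy of it.

```go
package main

import (
	"fmt"
	"sort"
)

// graph is a hypothetical in-memory link graph standing in for real fetches.
var graph = map[string][]string{
	"a": {"b", "c"},
	"b": {"c", "d"},
	"c": {"a"},
	"d": {},
}

// crawl visits every page reachable from root exactly once using a fixed
// number of workers. Only this goroutine touches seen, so no mutex is needed.
func crawl(root string, workers int, links func(string) []string) []string {
	work := make(chan string)    // owner -> workers
	found := make(chan []string) // workers -> owner

	for i := 0; i < workers; i++ {
		go func() {
			for page := range work {
				found <- links(page)
			}
		}()
	}

	seen := map[string]bool{root: true}
	queue := []string{root}
	var visited []string
	outstanding := 0 // pages dispatched whose links have not come back yet

	for len(queue) > 0 || outstanding > 0 {
		// Offer work only when there is some: a send on a nil channel is
		// never ready, so that select case is disabled when the queue is empty.
		var send chan string
		var next string
		if len(queue) > 0 {
			send, next = work, queue[0]
		}
		select {
		case send <- next:
			queue = queue[1:]
			outstanding++
			visited = append(visited, next)
		case list := <-found:
			outstanding--
			for _, l := range list {
				if !seen[l] {
					seen[l] = true
					queue = append(queue, l)
				}
			}
		}
	}
	close(work) // nothing queued, nothing in flight: workers can exit
	sort.Strings(visited)
	return visited
}

func main() {
	fmt.Println(crawl("a", 3, func(p string) []string { return graph[p] }))
	// prints: [a b c d]
}
```

Sorting the result is what makes the output stable across runs; the order in which workers report back is not.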

💪 Strengths (what clicked)

  • The TGPL single-owner crawler pattern — dedup + bounded fetches + termination with no mutex.
  • Writing deterministic concurrent tests (sort outputs, atomic counters, injected time).

🧩 Weaknesses (what's still fuzzy)

  • Reading block/mutex profiles fluently and acting on contention findings.
  • Goroutine-leak detection inside the test suite.
  • Per-host rate limiting / politeness for a real crawler.

🔁 Spaced-Repetition Re-quiz (topics from earlier weeks)

  1. Q: Channel axioms: nil, closed, send-on-closed?
    A: nil send/recv block forever; recv on closed = zero value + ok=false (once any buffered values are drained); send on closed panics; double-close panics.
  2. Q: Value vs. pointer receiver for a type holding a sync.Mutex?
    A: Pointer receiver. A value receiver copies the mutex (and the data), which go vet flags and which breaks mutual exclusion.
  3. Q: What does errgroup.WithContext give you?
    A: A context cancelled on the first task error so siblings stop early; Wait() returns that first error. Pass that ctx to every task.
  4. Q: Buffered channel as a semaphore?
    A: Send = acquire (blocks when full at cap N), receive = release; ≤ N goroutines are past it at once.
  5. Q: Why defer cancel() after WithTimeout?
    A: To release the timer/child resources on every path; a lost cancel is a context leak (go vet flags it).
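
Answers 1 and 4 can be checked with a few lines of code. The helper names (`panics`, `wouldBlock`, `peakInFlight`) are made up for this sketch; `wouldBlock` only shows that a nil channel is not ready at the moment of the check, which is the observable side of "blocks forever".

```go
package main

import (
	"fmt"
	"sync"
)

// panics reports whether fn panicked.
func panics(fn func()) (p bool) {
	defer func() { p = recover() != nil }()
	fn()
	return false
}

// wouldBlock reports whether a receive on ch cannot proceed right now.
func wouldBlock(ch chan int) bool {
	select {
	case <-ch:
		return false
	default:
		return true
	}
}

// peakInFlight runs tasks goroutines behind a buffered-channel semaphore of
// size limit and returns the highest number seen inside it at once.
func peakInFlight(limit, tasks int) int {
	sem := make(chan struct{}, limit)
	var (
		mu        sync.Mutex
		cur, peak int
		wg        sync.WaitGroup
	)
	for i := 0; i < tasks; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem <- struct{}{} // acquire: blocks while limit goroutines hold a slot
			mu.Lock()
			cur++
			if cur > peak {
				peak = cur
			}
			mu.Unlock()

			mu.Lock()
			cur--
			mu.Unlock()
			<-sem // release
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	var nilCh chan int
	fmt.Println(wouldBlock(nilCh)) // true: a nil channel is never ready

	ch := make(chan int, 1)
	ch <- 7
	close(ch)
	v, ok := <-ch
	fmt.Println(v, ok) // 7 true: buffered value is still delivered after close
	v, ok = <-ch
	fmt.Println(v, ok) // 0 false: drained and closed

	fmt.Println(panics(func() { ch <- 1 }))   // true: send on closed
	fmt.Println(panics(func() { close(ch) })) // true: double close
	fmt.Println(peakInFlight(3, 20) <= 3)     // true
}
```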

🎯 Action Items

  • Add per-host rate limiting (map[host]*limiter under a mutex) to the crawler design notes.
  • Practise reading a real block/mutex profile; write a one-pager cheat-sheet.
  • Add a goroutine-leak check to one exercise's test as a pattern.

🚀 Next Week Goals

  • Start Month 4 (networked services): net/http server, routing, JSON encode/decode, middleware — built on this concurrency foundation.

📊 Metrics

Hours   Days hit   Exercises   Commits   Avg confidence
9       7/7        3           7         3.3/5

Suggested commit: docs(journal): month 3 week 4 review