
Day 082 — Project: Tests with -race

Month 3 · Week 4

🎯 Learning Objective

Show that the concurrent crawler is correct and race-free: write table-driven and stress tests, run them under the race detector, and make the inherently timing-dependent parts deterministic so the suite never flakes.

📚 Topics

  • go test -race; what the detector can and cannot catch
  • Deterministic concurrent tests (inject time, sort outputs, count winners)
  • t.Parallel, testing.Short, -count for flake-hunting

📖 Reading / Sources

📝 Notes

  • go test -race instruments memory accesses and reports when two goroutines touch the same location concurrently and at least one writes, without a happens-before edge. It finds real races that executed — it is not a static prover → [[race-detector]] · [[data-race]].
  • The detector is a sampler of behaviour, not a proof. A race is only reported if the racy interleaving actually ran. Add load (many goroutines), -count=N, and stress loops to make races likely to surface → [[flaky-tests]].
  • Cost: -race typically makes code 2–20× slower and raises memory use 5–10× (figures from the Go race detector documentation). Run it in CI and locally, but ship non-race binaries → [[build-modes]].
  • Make concurrent tests deterministic so they assert exact results:
      • Sort nondeterministic output before comparing (crawler returns a sorted set) → [[deterministic-tests]].
      • Count invariants instead of order: "exactly one MarkSeen winner", "exactly capacity Allow successes", "each page fetched once" (via an atomic counter in the fake fetcher) → [[invariants]].
      • Inject time rather than sleeping: the token bucket exposes Refill(n) so tests advance "time" explicitly, with no time.Sleep and no flakes → [[testable-time]].
  • Table-driven tests still apply to concurrency: a slice of {name, input, want} cases run under t.Run, plus a separate stress test for the racy path → [[table-driven-tests]].
  • Run the same logic at several worker counts (1, 2, 8) and assert identical results — proves correctness doesn't depend on the degree of parallelism.
  • Use atomic.AddInt64 (not a plain int) for counters inside the test's goroutines, or the test itself races and the detector flags your test, not the code → [[atomic]].

💻 Code Examples

A deterministic stress test: many goroutines race, exactly one wins (from the exercise):

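import (
    "sync"
    "sync/atomic"
    "testing"
)

// Set is the visited-set type from the exercise; MarkSeen reports whether
// this call was the first to record the URL.
func TestMarkSeenExactlyOneWinner(t *testing.T) {
    var s Set
    var winners int64
    var wg sync.WaitGroup
    for i := 0; i < 200; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            if s.MarkSeen("https://example.com") {
                atomic.AddInt64(&winners, 1) // atomic: the test must not race either
            }
        }()
    }
    wg.Wait() // all goroutines have finished before winners is read
    if got := atomic.LoadInt64(&winners); got != 1 {
        t.Fatalf("winners = %d; want exactly 1", got)
    }
}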

Run the whole week's suite under the detector: go test -race ./exercises/month-03/week-4/...
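The topics list mentions t.Parallel, testing.Short, and -count, so here is a minimal sketch of how a stress test might use them. The counter type and the hammer helper are hypothetical stand-ins for the code under test, and the iteration counts are arbitrary.

```go
package main

import (
	"sync"
	"testing"
)

// counter is a hypothetical mutex-guarded type standing in for the code under test.
type counter struct {
	mu sync.Mutex
	n  int
}

func (c *counter) Inc() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

func (c *counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// hammer calls Inc from `goroutines` goroutines, `perG` times each,
// and returns the final value once all of them have finished.
func hammer(c *counter, goroutines, perG int) int {
	var wg sync.WaitGroup
	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perG; j++ {
				c.Inc()
			}
		}()
	}
	wg.Wait()
	return c.Value()
}

func TestCounterStress(t *testing.T) {
	t.Parallel() // may run alongside other parallel tests in the package

	goroutines, perG := 200, 1000
	if testing.Short() { // go test -short: keep the edit loop fast
		goroutines, perG = 20, 100
	}

	var c counter
	if got, want := hammer(&c, goroutines, perG), goroutines*perG; got != want {
		t.Fatalf("count = %d; want %d", got, want)
	}
}
```

One possible workflow: go test -race -short ./... during development, then go test -race -count=20 -run TestCounterStress ./... to repeat a single suspect test when hunting a flake.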

🏋️ Exercises / Practice

Exercise Status Link
All Week 4 packages pass go test -race (visited, crawl, tokenbucket) exercises/month-03/week-4/

🐛 Mistakes Made

  • Counted winners with a plain int shared across goroutines, and -race flagged my test rather than the code. Switched to atomic.AddInt64.
  • A first version asserted crawl output in discovery order and flaked across runs. Sorting the result in Crawl made it deterministic and the test stable.

❓ Open Questions

  • Can I assert "no goroutine leaked" in a test? (Yes-ish: compare runtime.NumGoroutine() before/after with a settle delay, or use a leak-checker pattern. Fragile; noted.)

🧠 Active Recall (answer without looking)

  1. Q: Does a green go test -race prove the code is race-free?

    A: No. The detector only reports races on interleavings that actually executed during the run. It's strong evidence, not a proof; stress the racy paths (many goroutines, `-count`) to raise confidence.

  2. Q: How do you make an inherently timing-dependent limiter testable without time.Sleep?

    A: Inject time: expose an explicit `Refill(n)`/tick method the test calls to advance state deterministically, instead of relying on a real clock. Assert exact token counts and success counts.
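To make the second answer concrete, here is a minimal sketch of a limiter whose "time" is advanced only by an explicit Refill(n). The Bucket type, its fields, NewBucket, and countAllowed are assumptions for illustration; the exercise's token bucket may be shaped differently.

```go
package main

import (
	"sync"
	"sync/atomic"
	"testing"
)

// Bucket is a hypothetical token bucket. It never reads a clock:
// tokens only come back when Refill is called.
type Bucket struct {
	mu       sync.Mutex
	tokens   int
	capacity int
}

func NewBucket(capacity int) *Bucket {
	return &Bucket{tokens: capacity, capacity: capacity}
}

// Allow takes one token if available and reports whether it succeeded.
func (b *Bucket) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.tokens == 0 {
		return false
	}
	b.tokens--
	return true
}

// Refill adds n tokens, capped at capacity. Tests call it to advance "time".
func (b *Bucket) Refill(n int) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.tokens += n
	if b.tokens > b.capacity {
		b.tokens = b.capacity
	}
}

// countAllowed fires `callers` concurrent Allow calls and returns how many succeeded.
func countAllowed(b *Bucket, callers int) int64 {
	var ok int64
	var wg sync.WaitGroup
	for i := 0; i < callers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if b.Allow() {
				atomic.AddInt64(&ok, 1)
			}
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&ok)
}

func TestBucketGrantsExactlyCapacity(t *testing.T) {
	b := NewBucket(5)
	if got := countAllowed(b, 100); got != 5 {
		t.Fatalf("first burst: allowed = %d; want 5", got)
	}
	if got := countAllowed(b, 100); got != 0 {
		t.Fatalf("empty bucket: allowed = %d; want 0", got)
	}
	b.Refill(3) // advance "time" explicitly, no sleeping
	if got := countAllowed(b, 100); got != 3 {
		t.Fatalf("after Refill(3): allowed = %d; want 3", got)
	}
}
```

Because no real clock is involved, the success counts are exact on every run, however the scheduler interleaves the callers.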

🪶 Feynman Reflection

The race detector is a smoke alarm that only goes off if smoke actually drifts past it — so I deliberately light a lot of fires (hundreds of goroutines hitting the same spot) to make sure it would smell trouble. Then I make every assertion about invariants (one winner, each page once, exactly capacity grants) instead of timing or order, so the test is a fact, not a coin flip.

🕳️ Knowledge Gaps

  • Goroutine-leak detection in tests and go test -race -count tuning in CI — practical patterns to standardise.

✅ Summary

The crawler and its building blocks are covered by table-driven + stress tests that all pass go test -race. Determinism comes from sorting outputs, asserting invariants with atomic counters, and injecting time — so the suite is meaningful and never flaky.

⏭️ Next Steps / Prep for Tomorrow

  • Day 083: profile the concurrent code with runtime/pprof to find where time and allocations go.

Time spent: 90 min · Difficulty: 🟦🟦⬜⬜⬜ · Confidence: 🟦🟦🟦🟦⬜

Suggested commit: test(exercises): race-clean crawler tests (day 082)