
Day 103 — Rate Limiting

Month 4 · Week 3

🎯 Learning Objective

Build a per-client token-bucket rate limiter as HTTP middleware, return 429 with Retry-After, and keep the bucket map from leaking memory.

📚 Topics

  • Token bucket: rate (steady) vs burst (capacity); lazy refill
  • Per-client buckets behind a sync.Mutex; 429; Retry-After; eviction

📖 Reading / Sources

📝 Notes

  • Token bucket = burst + steady rate. A bucket holds up to burst tokens and refills at rate tokens per second; each request spends one. An empty bucket means the request is rejected with 429. This permits short bursts while bounding the long-run average. → [[token-bucket]]
  • Refill lazily, don't run a ticker per client. On each call compute tokens += elapsed * rate, clamp to burst. O(1), no goroutine per key. → [[lazy-refill]]
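  • One bucket per client, guarded by a mutex. Concurrent writes to a plain map are a data race; the runtime usually aborts with fatal error: concurrent map writes, which recover cannot catch. Both the map and each bucket's state need the lock. → [[map-not-concurrent]]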
  • 429 Too Many Requests + Retry-After is the contract: tell well-behaved clients exactly how long to back off. → [[http-status-codes]]
  • Evict idle buckets with a background janitor; otherwise every IP that ever connects keeps its bucket forever and memory grows without bound.
  • The client key matters. r.RemoteAddr is host:port, so split off the port with net.SplitHostPort. Behind a proxy, RemoteAddr is the proxy's address; use X-Forwarded-For only when your own edge sets or overwrites it, because clients can forge the header. → [[client-identity]]
  • In production use golang.org/x/time/rate (rate.Limiter with Allow, Wait, Reserve). Building the algorithm once teaches it; don't ship the hand-rolled version. → [[reach-for-stdlib-first]]
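To make the lazy-refill note concrete, here is a minimal sketch of one bucket. The names bucket, newBucket and allow are my own for illustration, and the caller passes the current time in so the logic can be exercised without sleeping. It is deliberately not concurrency-safe; the mutex from the notes would wrap calls to allow.

```go
package main

import "time"

// bucket is a hypothetical hand-rolled token bucket (illustrative names).
type bucket struct {
	tokens float64   // tokens currently available
	rate   float64   // tokens added per second
	burst  float64   // maximum tokens the bucket can hold
	last   time.Time // when tokens was last brought up to date
}

// newBucket starts full so a fresh client can spend its whole burst at once.
func newBucket(rate, burst float64, now time.Time) *bucket {
	return &bucket{tokens: burst, rate: rate, burst: burst, last: now}
}

// allow refills lazily from the elapsed time, clamps to burst,
// then tries to spend one token. No ticker or goroutine is involved.
func (b *bucket) allow(now time.Time) bool {
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * b.rate
		if b.tokens > b.burst {
			b.tokens = b.burst
		}
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}
```

With rate 1 and burst 2, two calls at the same instant pass, a third is rejected, and one more passes a second later.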

💻 Code Examples

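// Production approach with golang.org/x/time/rate (an x/ module maintained by
// the Go team but outside the standard library, so it lives here as a snippet,
// not a runnable stdlib example).
import (
    "net"
    "net/http"
    "sync"

    "golang.org/x/time/rate"
)

// ipLimiter keeps one rate.Limiter per client IP.
// Note: this snippet never evicts, so entries live until the process exits.
type ipLimiter struct {
    mu       sync.Mutex
    limiters map[string]*rate.Limiter // initialize with make before first use
    r        rate.Limit               // tokens/sec
    b        int                      // burst
}

// get returns the limiter for ip, creating it on first sight.
// The mutex guards the map; rate.Limiter itself is safe for concurrent use.
func (l *ipLimiter) get(ip string) *rate.Limiter {
    l.mu.Lock()
    defer l.mu.Unlock()
    lim, ok := l.limiters[ip]
    if !ok {
        lim = rate.NewLimiter(l.r, l.b) // e.g. rate.NewLimiter(2, 5)
        l.limiters[ip] = lim
    }
    return lim
}

// middleware rejects over-limit requests with 429 and a Retry-After hint.
func (l *ipLimiter) middleware(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        host, _, err := net.SplitHostPort(r.RemoteAddr)
        if err != nil {
            host = r.RemoteAddr // no port to split off; use the address as-is
        }
        if !l.get(host).Allow() { // Allow() = non-blocking token check
            w.Header().Set("Retry-After", "1") // fixed hint, in whole seconds
            http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
            return
        }
        next.ServeHTTP(w, r)
    })
}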

Stdlib-only token bucket (no x/ deps): examples/month-04/ratelimit/main.go · Run: go run ./examples/month-04/ratelimit
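The snippet above hard-codes Retry-After to 1. With a hand-rolled bucket the wait can be derived from the bucket state instead. A small sketch, assuming a positive refill rate; retryAfter is a hypothetical helper of mine, not something from x/time/rate:

```go
package main

import (
	"math"
	"strconv"
)

// retryAfter returns a Retry-After value in whole seconds, rounded up,
// for a bucket currently holding tokens and refilling at rate tokens/sec.
// Assumption: rate > 0. The caller is responsible for that.
func retryAfter(tokens, rate float64) string {
	if tokens >= 1 {
		return "0" // a token is already available
	}
	wait := (1 - tokens) / rate // seconds until one full token exists
	return strconv.Itoa(int(math.Ceil(wait)))
}
```

Rounding up matters: the header carries an integer number of seconds, so rounding down could tell a client to retry before a token exists.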

🏋️ Exercises / Practice

Exercise | Status | Link
Token-bucket limiter + 429 middleware (in example) | | examples/month-04/ratelimit
Reuse the week-3-review token-bucket recall | | journal/month-04/week-3-review.md

🐛 Mistakes Made

  • Used r.RemoteAddr directly as the key, so every new connection got a fresh bucket because its ephemeral port differs. Fix: split off the port with net.SplitHostPort.
  • Forgot to lock around the map and saw fatal error: concurrent map writes under load. Added the mutex around lookup and refill.
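The first mistake comes down to one small function. A sketch of the key extraction, where clientKey is a name I made up and the fallback to the raw address on a parse error is my own choice:

```go
package main

import "net"

// clientKey strips the port from a host:port address so every connection
// from the same IP maps to the same bucket. If the address has no port
// (or is otherwise unparseable) it is returned unchanged.
func clientKey(remoteAddr string) string {
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return remoteAddr
	}
	return host
}
```

net.SplitHostPort also handles bracketed IPv6 addresses, which is the main reason not to split on the last colon by hand.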

❓ Open Questions

  • Global vs per-route vs per-user limits, and how to compose them — likely a chain of limiter middlewares with different keys/rates.
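One way the chain idea might look, as an exploratory sketch rather than an answer. Everything here is hypothetical: limitBy, chain, newStack and statusOf are my own names, and quota is a stand-in limiter (a fixed budget per key that never refills) used only to keep the example short and deterministic. A real version would plug in a token bucket's allow function instead.

```go
package main

import (
	"net"
	"net/http"
	"net/http/httptest"
	"sync"
)

type middleware func(http.Handler) http.Handler

// quota is a stand-in limiter: a fixed budget per key, never refilled.
type quota struct {
	mu   sync.Mutex
	left map[string]int
	max  int
}

func newQuota(max int) *quota { return &quota{left: map[string]int{}, max: max} }

func (q *quota) allow(key string) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	if _, ok := q.left[key]; !ok {
		q.left[key] = q.max
	}
	if q.left[key] == 0 {
		return false
	}
	q.left[key]--
	return true
}

// limitBy builds a 429 middleware from a key extractor and an allow function.
func limitBy(key func(*http.Request) string, allow func(string) bool) middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			if !allow(key(r)) {
				w.Header().Set("Retry-After", "1")
				http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}

// chain wraps h so that the first middleware listed runs first.
func chain(h http.Handler, mws ...middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

func global(*http.Request) string { return "*" } // one shared key for everyone

func byIP(r *http.Request) string {
	host, _, err := net.SplitHostPort(r.RemoteAddr)
	if err != nil {
		return r.RemoteAddr
	}
	return host
}

// newStack checks the global limit first, then the per-IP limit.
func newStack(globalMax, perIPMax int) http.Handler {
	ok := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	return chain(ok,
		limitBy(global, newQuota(globalMax).allow),
		limitBy(byIP, newQuota(perIPMax).allow))
}

// statusOf sends one GET from remoteAddr and returns the response status.
func statusOf(h http.Handler, remoteAddr string) int {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.RemoteAddr = remoteAddr
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}
```

Trying newStack(3, 2) exposes a wrinkle worth remembering: a third request from the same IP is rejected by the per-IP limiter, but it has already spent a global token on the way in, so a request from a different IP right after it is rejected too. Order in the chain matters, and with rate.Limiter something like Reserve followed by Cancel might be a way to hand the outer token back.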

🧠 Active Recall (answer without looking)

  1. Q: In a token bucket, what do rate and burst each control?

    A: burst is the bucket capacity, the largest instantaneous spike allowed. rate is the refill speed, the sustained long-run requests/second once the burst is spent.

  2. Q: Why refill lazily on each request instead of a per-client ticker?

    A: A ticker per client costs a goroutine and timer per key (doesn't scale to many clients). Lazy refill computes tokens += elapsed*rate on access in O(1) with no background work per bucket.
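A quick worked bound ties the two answers together. This is my own arithmetic, assuming a bucket with burst b and rate r tokens/sec that starts full: in any window of T seconds, the number of admitted requests N(T) cannot exceed the initial burst plus what refills during the window.

```latex
N(T) \le b + rT,
\qquad \text{e.g. } r = 2,\ b = 5,\ T = 10\,\text{s}
\;\Rightarrow\; N \le 5 + 2 \cdot 10 = 25
```

So rate.NewLimiter(2, 5) admits at most 25 requests in any 10-second window, and the share contributed by the burst shrinks as the window grows.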

🪶 Feynman Reflection

Imagine each client holds a small bucket that drips full at a fixed rate. Every request scoops one token; if the bucket is dry you get a 429 and a note saying when to come back. The bucket's size lets a burst through; the drip rate caps the long-run average. I top the bucket up only when someone asks, and throw away buckets nobody has touched in a while so memory stays bounded.
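The "throw away buckets nobody has touched" part can be sketched as follows. The names store, entry, touch, evictIdle and janitor are hypothetical, and the bucket state itself is left out to keep the focus on eviction. Times are passed in so the sweep can be checked without waiting.

```go
package main

import (
	"sync"
	"time"
)

// entry tracks when a client was last seen; in a real limiter the
// token-bucket state for that client would live here as well.
type entry struct {
	lastSeen time.Time
}

type store struct {
	mu      sync.Mutex
	clients map[string]*entry
}

func newStore() *store { return &store{clients: make(map[string]*entry)} }

// touch records activity for key, creating the entry on first sight.
func (s *store) touch(key string, now time.Time) {
	s.mu.Lock()
	defer s.mu.Unlock()
	e, ok := s.clients[key]
	if !ok {
		e = &entry{}
		s.clients[key] = e
	}
	e.lastSeen = now
}

// evictIdle deletes entries idle for longer than ttl and reports how many.
func (s *store) evictIdle(now time.Time, ttl time.Duration) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	n := 0
	for k, e := range s.clients {
		if now.Sub(e.lastSeen) > ttl {
			delete(s.clients, k) // deleting while ranging over a map is safe in Go
			n++
		}
	}
	return n
}

// janitor sweeps every interval until stop is closed; run it in one goroutine.
func (s *store) janitor(interval, ttl time.Duration, stop <-chan struct{}) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case now := <-t.C:
			s.evictIdle(now, ttl)
		case <-stop:
			return
		}
	}
}
```

This is one goroutine for the whole map, not one per client, so it does not reintroduce the per-key ticker cost that lazy refill avoids. The sweep holds the lock for a full pass over the map, which is probably fine at this scale but would be worth measuring with many clients.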

🕳️ Knowledge Gaps

  • Distributed rate limiting (shared state across instances, e.g. Redis) — the in-memory map only limits per process.

✅ Summary

I can implement a per-client token-bucket limiter as middleware, lazily refill under a mutex, return 429+Retry-After, evict idle buckets, and reach for golang.org/x/time/rate in production.

⏭️ Next Steps / Prep for Tomorrow

  • Day 104: JWT authentication and role-based access control.

Time spent | Difficulty | Confidence
90 min | 🟦🟦🟦⬜⬜ | 🟦🟦🟦⬜⬜

Suggested commit: feat(examples): per-client token-bucket rate limiter middleware (day 103)