Weekly Review — Month 6 · Week 1 (Days 141–147)

📅 The Week in One Line

Made a service legible: structured logging with slog, Prometheus metrics (the RED method), OpenTelemetry tracing + W3C traceparent propagation, liveness/readiness probes with graceful draining, correlation IDs through context, and symptom/SLO-based alerting.

✅ What I Completed

Day 141 — Structured logging with log/slog: JSON/Text handlers, levels, typed attrs, With/WithGroup, SetDefault, ReplaceAttr redaction
Day 142 — Prometheus metrics: Counter/Gauge/Histogram/Summary, bounded labels, the RED method, /metrics exposition
Day 143 — OpenTelemetry tracing: trace/span model, sampling, W3C traceparent propagation, TracerProvider/exporter/propagator
Day 144 — Health probes: liveness (restart) vs readiness (drain) vs startup, per-check timeouts, graceful shutdown order
Day 145 — Correlation IDs: unexported context key type, extract-or-mint middleware, a context-reading slog.Handler
Day 146 — Dashboards & alerting: four golden signals, PromQL (rate / error ratio / histogram quantiles), symptom- and SLO-burn-rate alerts
Day 147 — Week review + recall
Stdlib examples: slog, tracecontext, healthcheck, correlation
Exercises solved: 3 (traceparent, health, redact) — all go test green

💡 Lessons Learned

slog is stdlib: prefer typed attrs over loose pairs (avoids !BADKEY, preserves JSON types); ReplaceAttr is the single choke point for redaction and time-pinning.
Counters are queried as rate(), never raw; Histograms (not Summaries) for latency because bucket counts can be summed across replicas before histogram_quantile(), whereas a Summary's precomputed quantiles cannot be meaningfully aggregated.
Keep metric label values bounded — never user IDs/raw paths — or cardinality explodes. Use the route template.
In a trace, the trace-id is constant; each hop mints a new span-id whose parent is the incoming one. Sampling is decided once at the root and carried in trace-flags.
Liveness must not check dependencies (failure ⇒ restart loop); readiness does, and failing it drains traffic with no restart.
Graceful shutdown order: flip readiness false → drain → srv.Shutdown. Reverse drops requests.
Context keys must be an unexported named type to avoid cross-package collisions; a context-reading slog.Handler stamps the correlation ID on every record via Handle(ctx, …) + InfoContext.
Alert on symptoms (errors/latency) and SLO burn rate, not causes (CPU); set a for: duration on the rule so a brief spike does not make the alert flap; every alert needs a runbook.
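
The trace-context lesson above (constant trace-id, a new span-id per hop, sampling carried in trace-flags) can be sketched with the standard library alone. This is a minimal sketch, not the OpenTelemetry API: TraceContext, Parse, and StartChild are illustrative names, and it assumes only version 00 headers need to be accepted.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// TraceContext mirrors the fields of a version-00 traceparent header.
// The type and function names here are illustrative, not a library API.
type TraceContext struct {
	TraceID string // 32 lowercase hex chars; constant for the whole trace
	SpanID  string // 16 lowercase hex chars; identifies the current hop
	Flags   byte   // bit 0 is the "sampled" flag
}

var errMalformed = errors.New("malformed traceparent")

// isLowerHex reports whether s is exactly n lowercase hex characters.
func isLowerHex(s string, n int) bool {
	if len(s) != n {
		return false
	}
	for i := 0; i < len(s); i++ {
		c := s[i]
		if !(c >= '0' && c <= '9') && !(c >= 'a' && c <= 'f') {
			return false
		}
	}
	return true
}

// Parse reads "00-<trace-id>-<parent-id>-<flags>" and rejects anything else.
func Parse(header string) (TraceContext, error) {
	p := strings.Split(header, "-")
	if len(p) != 4 || p[0] != "00" ||
		!isLowerHex(p[1], 32) || !isLowerHex(p[2], 16) || !isLowerHex(p[3], 2) {
		return TraceContext{}, errMalformed
	}
	// All-zero IDs are invalid.
	if p[1] == strings.Repeat("0", 32) || p[2] == strings.Repeat("0", 16) {
		return TraceContext{}, errMalformed
	}
	flags, err := hex.DecodeString(p[3])
	if err != nil {
		return TraceContext{}, errMalformed
	}
	return TraceContext{TraceID: p[1], SpanID: p[2], Flags: flags[0]}, nil
}

// Sampled reports the decision that was made at the root of the trace.
func (tc TraceContext) Sampled() bool { return tc.Flags&0x01 != 0 }

// String renders the header value to send on outbound calls.
func (tc TraceContext) String() string {
	return fmt.Sprintf("00-%s-%s-%02x", tc.TraceID, tc.SpanID, tc.Flags)
}

// StartChild keeps the trace-id and flags, mints a fresh span-id, and
// returns the incoming span-id as the new span's parent.
func StartChild(in TraceContext) (TraceContext, string, error) {
	var b [8]byte
	if _, err := rand.Read(b[:]); err != nil {
		return TraceContext{}, "", err
	}
	child := TraceContext{TraceID: in.TraceID, SpanID: hex.EncodeToString(b[:]), Flags: in.Flags}
	return child, in.SpanID, nil
}

func main() {
	in, err := Parse("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	child, parent, err := StartChild(in)
	if err != nil {
		fmt.Println("rand error:", err)
		return
	}
	fmt.Println("incoming:", in)
	fmt.Println("outgoing:", child, "parent span:", parent, "sampled:", child.Sampled())
}
```

The outgoing header differs from the incoming one only in its span-id field, which is why a single trace-id can join logs and spans across services.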

💪 Strengths (what clicked)

The "three pillars + one shared ID" mental model unified logs/metrics/traces fast.
Context propagation transferred straight from Month 5 (ctx-first, cancellation) into both tracing and correlation IDs.
The stdlib examples (counter registry from Month 5, traceparent, probes) demystified the third-party libraries.

🧩 Weaknesses (what's still fuzzy)

Writing a fully spec-compliant decorating slog.Handler (correct WithAttrs/WithGroup re-wrapping).
Choosing histogram buckets and SLO burn-rate windows/thresholds from real targets rather than guessing.
Head vs tail sampling cost trade-offs at high request volume.
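
For the first item above, here is a rough sketch of a decorating slog.Handler that re-wraps itself in WithAttrs and WithGroup. The names ctxKey, WithCorrelationID, corrHandler, and demo are hypothetical, and the sketch is knowingly incomplete: once a group is opened with WithGroup, the handler it wraps will nest correlation_id inside that group rather than keeping it at the top level.

```go
package main

import (
	"bytes"
	"context"
	"log/slog"
	"os"
)

// ctxKey is an unexported named type, so no other package can collide with it.
type ctxKey struct{}

// WithCorrelationID stores the ID on the context (hypothetical helper).
func WithCorrelationID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, ctxKey{}, id)
}

// corrHandler decorates another handler and stamps the correlation ID
// found on the context onto every record it handles.
type corrHandler struct{ inner slog.Handler }

func (h corrHandler) Enabled(ctx context.Context, l slog.Level) bool {
	return h.inner.Enabled(ctx, l)
}

func (h corrHandler) Handle(ctx context.Context, r slog.Record) error {
	if id, ok := ctx.Value(ctxKey{}).(string); ok && id != "" {
		r = r.Clone() // avoid mutating attr storage shared with the caller
		r.AddAttrs(slog.String("correlation_id", id))
	}
	return h.inner.Handle(ctx, r)
}

// WithAttrs and WithGroup must return the decorator again. Returning
// h.inner.WithAttrs(...) directly would silently drop the stamping.
func (h corrHandler) WithAttrs(attrs []slog.Attr) slog.Handler {
	return corrHandler{inner: h.inner.WithAttrs(attrs)}
}

func (h corrHandler) WithGroup(name string) slog.Handler {
	return corrHandler{inner: h.inner.WithGroup(name)}
}

// demo logs one line through a derived logger and returns the JSON output.
func demo() string {
	var buf bytes.Buffer
	base := slog.NewJSONHandler(&buf, &slog.HandlerOptions{
		// Drop the timestamp so the output is stable for comparison.
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
			if len(groups) == 0 && a.Key == slog.TimeKey {
				return slog.Attr{}
			}
			return a
		},
	})
	logger := slog.New(corrHandler{inner: base}).With("svc", "api")
	ctx := WithCorrelationID(context.Background(), "req-123")
	logger.InfoContext(ctx, "handled", slog.Int("status", 200))
	return buf.String()
}

func main() {
	os.Stdout.WriteString(demo())
}
```

The logger in demo is created through With, so the correlation ID only appears because WithAttrs hands back a corrHandler. The standard library's testing/slogtest package is the place to check how close a handler like this is to full compliance.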

🔁 Spaced-Repetition Re-quiz (topics from earlier weeks)

Q: (Day 138) Why guard a shared metrics map with a mutex, and what scales better?
A: Concurrent goroutines mutate it, so unsynchronized access is a data race. Per-series atomic.Int64 scales better than one global mutex; the mutex is just simpler/obviously correct.
Q: (Day 137) How do you keep exponential backoff from overflowing and from thundering-herd?
A: Double-and-cap the delay in-loop (don't compute 2^n directly), then add full jitter. Classify transient vs permanent before spending the retry budget.
Q: (Month 3) What does a receive from a closed channel return?
A: The element type's zero value, immediately and without blocking, once any buffered values have been drained; ok == false in the comma-ok form. Close is a broadcast to all receivers.
Q: (Month 1) How do you match a sentinel error through wrapping vs. extract a typed one?
A: errors.Is(err, ErrTarget) for sentinels, errors.As(err, &target) for typed errors; wrap with %w so the chain is walkable.
Q: (Day 135) What does var _ Port = (*Adapter)(nil) buy you?
A: A free compile-time assertion that *Adapter implements Port; it fails to build if the interface drifts.
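
The backoff answer above (Day 137) can be sketched as follows. This is a minimal illustration under assumptions, not the code from that day: nextCeiling, fullJitter, and Delays are hypothetical names, and the transient-vs-permanent classification is left out.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// nextCeiling doubles the current ceiling but never exceeds max.
// Comparing against max/2 first means the multiplication cannot overflow.
func nextCeiling(cur, max time.Duration) time.Duration {
	if cur > max/2 {
		return max
	}
	return cur * 2
}

// fullJitter picks a delay uniformly from [0, ceiling), so clients that
// failed together do not all retry at the same instant.
func fullJitter(ceiling time.Duration) time.Duration {
	if ceiling <= 0 {
		return 0
	}
	return time.Duration(rand.Int63n(int64(ceiling)))
}

// Delays returns the jittered sleep for each of the given attempts.
func Delays(base, max time.Duration, attempts int) []time.Duration {
	ceiling := base
	if ceiling > max {
		ceiling = max
	}
	out := make([]time.Duration, 0, attempts)
	for i := 0; i < attempts; i++ {
		out = append(out, fullJitter(ceiling))
		ceiling = nextCeiling(ceiling, max)
	}
	return out
}

func main() {
	// Values vary from run to run because of the jitter.
	for i, d := range Delays(100*time.Millisecond, 5*time.Second, 8) {
		fmt.Printf("attempt %d: sleep %v\n", i+1, d)
	}
}
```

Because the ceiling is doubled step by step and clamped each time, the loop stays safe for any number of attempts, which a direct base * 2^n computation does not.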

🎯 Action Items

Wire slog (JSON) + the correlation-ID middleware into the Month 5 capstone, defaulting via SetDefault.
Add /healthz and /readyz to the capstone with a real DB/Redis readiness check and graceful drain on SIGTERM.
Instrument the capstone's RED signals (request counter + latency histogram with bounded route labels).
Put the trace-id into every log line and propagate traceparent on outbound queue/RPC calls.
Draft one symptom-based SLO burn-rate alert with a runbook link.

🚀 Next Week Goals

Wire the full observability stack into the capstone end to end (logs + metrics + traces sharing one ID).
Containerize and run with a local collector; load-test and read the dashboards.
Continue Month 6: profiling/pprof, performance, and deployment.

📊 Metrics

Hours	Days hit	Exercises	Commits	Avg confidence
10.5	7/7	3	7	3.4/5

Suggested commit: docs(journal): month 6 week 1 review