Day 143: OpenTelemetry Tracing
Month 6 · Week 1
🎯 Learning Objective
Understand distributed tracing: spans, the trace/span ID model, context propagation via the W3C `traceparent` header, and how OpenTelemetry wires it in Go.
📚 Topics
- Trace · span · span context; parent/child; sampling
- W3C Trace Context propagation; OTel `TracerProvider`, exporters, instrumentation
📖 Reading / Sources
📝 Notes
- A trace is one request's journey across services; it's a tree of spans. Each span = a named, timed operation with attributes, events, and a status → [[tracing]].
- Span context (the wire-propagated part): a 16-byte trace-id (32 hex characters on the wire, same for the whole trace), an 8-byte span-id (16 hex characters, this operation), and trace-flags (low bit = sampled). All-zero IDs are invalid.
- Propagation: the caller serialises its span context into the `traceparent` header `00-<trace-id>-<span-id>-<flags>`; the callee parses it, keeps the trace-id, and creates a new span whose parent is the incoming span-id. That parent/child chain becomes the waterfall in Jaeger/Tempo → [[context-propagation]].
- Sampling is decided once at the root and carried in trace-flags, so all services agree (head sampling). Tail sampling decides after the fact in the collector.
- OTel pieces: a `TracerProvider` (configured with a sampler + resource), a `Tracer` (`tracer.Start(ctx, "name")` returns a new `ctx` + `span`), an exporter (OTLP → collector), and a propagator (`otel.SetTextMapPropagator(propagation.TraceContext{})`).
- Context first: spans live in `context.Context`. Always pass `ctx` down and call `tracer.Start(ctx, …)`; `defer span.End()`. Forgetting to thread `ctx` breaks the parent link and orphans spans.
- Record failures with `span.RecordError(err)` + `span.SetStatus(codes.Error, msg)`. Add cheap dimensions with `span.SetAttributes(attribute.String(...))`; high-cardinality detail is fine on spans (unlike metric labels).
- Tie it together: put the trace-id into your logs (Day 145) and as a metric exemplar, so one trace-id pivots across logs, metrics, and traces.
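The span-context layout and the `traceparent` format from the notes above can be sketched with the standard library alone. This is a minimal illustration, not the OTel SDK's implementation: the `SpanContext` type and the `Parse`/`Format` names are hypothetical, only version `00` is handled, and the header value in `main` is just a sample.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// SpanContext is a hypothetical stand-in for the wire-propagated fields;
// it is not the OTel SDK's trace.SpanContext.
type SpanContext struct {
	TraceID [16]byte // same for the whole trace
	SpanID  [8]byte  // this operation
	Flags   byte     // low bit = sampled
}

// Sampled reports whether the low bit of trace-flags is set.
func (sc SpanContext) Sampled() bool { return sc.Flags&0x01 != 0 }

// Format renders the version-00 header value.
func (sc SpanContext) Format() string {
	return fmt.Sprintf("00-%s-%s-%02x",
		hex.EncodeToString(sc.TraceID[:]), hex.EncodeToString(sc.SpanID[:]), sc.Flags)
}

// Parse accepts only version 00 with exact field lengths and lowercase hex,
// and rejects all-zero IDs.
func Parse(h string) (SpanContext, error) {
	var sc SpanContext
	p := strings.Split(h, "-")
	if len(p) != 4 || p[0] != "00" {
		return SpanContext{}, errors.New("traceparent: want 00-<trace-id>-<span-id>-<flags>")
	}
	if len(p[1]) != 32 || len(p[2]) != 16 || len(p[3]) != 2 || h != strings.ToLower(h) {
		return SpanContext{}, errors.New("traceparent: bad field length or uppercase hex")
	}
	if _, err := hex.Decode(sc.TraceID[:], []byte(p[1])); err != nil {
		return SpanContext{}, fmt.Errorf("traceparent: trace-id: %w", err)
	}
	if _, err := hex.Decode(sc.SpanID[:], []byte(p[2])); err != nil {
		return SpanContext{}, fmt.Errorf("traceparent: span-id: %w", err)
	}
	flags, err := hex.DecodeString(p[3])
	if err != nil {
		return SpanContext{}, fmt.Errorf("traceparent: flags: %w", err)
	}
	sc.Flags = flags[0]
	if sc.TraceID == ([16]byte{}) || sc.SpanID == ([8]byte{}) {
		return SpanContext{}, errors.New("traceparent: all-zero trace-id or span-id")
	}
	return sc, nil
}

func main() {
	const in = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
	sc, err := Parse(in)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("sampled:", sc.Sampled())          // sampled: true
	fmt.Println("round-trips:", sc.Format() == in) // round-trips: true
}
```

Fixed-size arrays make the all-zero check a plain `==` comparison and keep the 16-byte/8-byte sizes visible in the type.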
💻 Code Examples
The OTel SDK is third-party, so the propagation format — the load-bearing part — is rebuilt with the stdlib in the example below; the SDK wiring is shown as a snippet.
```go
// Real OTel: start a child span and propagate it over HTTP.
// Assumes tracer, ctx, id (int), and url are in scope, and that the
// enclosing function returns an error.
ctx, span := tracer.Start(ctx, "GetUser")
defer span.End()
span.SetAttributes(attribute.Int("user.id", id))
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
if err != nil {
	return err
}
// Inject the current span context into the outgoing headers as `traceparent`.
otel.GetTextMapPropagator().Inject(ctx, propagation.HeaderCarrier(req.Header))
```
Stdlib `traceparent` parse/propagate: `examples/month-06/tracecontext/main.go` · Run: `go run ./examples/month-06/tracecontext`
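For orientation, the per-hop propagation step might look roughly like the sketch below. It is not the contents of the file referenced above: `childTraceparent`, `newSpanID`, and `handler` are hypothetical names, validation is limited to version and field lengths, and the handler rejects a bad header where a real service would more likely start a new root trace.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// newSpanID returns 8 random bytes as 16 lowercase hex characters.
// An all-zero result is invalid but vanishingly unlikely; this sketch does not retry.
func newSpanID() string {
	var b [8]byte
	if _, err := rand.Read(b[:]); err != nil {
		panic(err) // no entropy source; nothing sensible to do in a sketch
	}
	return hex.EncodeToString(b[:])
}

// childTraceparent keeps the incoming trace-id and flags, mints a fresh
// span-id, and returns the incoming span-id as the new span's parent.
func childTraceparent(incoming string) (next, parentID string, ok bool) {
	p := strings.Split(incoming, "-")
	if len(p) != 4 || p[0] != "00" || len(p[1]) != 32 || len(p[2]) != 16 || len(p[3]) != 2 {
		return "", "", false
	}
	return "00-" + p[1] + "-" + newSpanID() + "-" + p[3], p[2], true
}

// handler plays the callee: it reads the caller's header and reports what
// it would send on its own outgoing calls.
func handler(w http.ResponseWriter, r *http.Request) {
	next, parentID, ok := childTraceparent(r.Header.Get("traceparent"))
	if !ok {
		http.Error(w, "missing or malformed traceparent", http.StatusBadRequest)
		return
	}
	fmt.Fprintf(w, "parent span-id: %s\noutgoing traceparent: %s\n", parentID, next)
}

func main() {
	// httptest drives the handler in-process, so no network is needed.
	req := httptest.NewRequest(http.MethodGet, "/user", nil)
	req.Header.Set("traceparent", "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	rec := httptest.NewRecorder()
	handler(rec, req)
	fmt.Print(rec.Body.String())
}
```

The outgoing header differs from the incoming one only in the span-id field, which is what lets a backend rebuild the parent/child chain.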
🏋️ Exercises / Practice
| Exercise | Status | Link |
|---|---|---|
| `traceparent`: parse/format the W3C header | ✅ | exercises/month-06/week-1/traceparent |
🐛 Mistakes Made
- Created a span from `context.Background()` inside a handler → it became a new root, orphaned from the request trace. Must start from `r.Context()`.
- Forgot `defer span.End()` → spans never closed, showed as still-running.
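The first mistake can be reproduced without the SDK. The sketch below uses a toy `span` type and a hypothetical `start` function that only mimics the shape of `tracer.Start` (parent looked up in the context, new context returned); it is an illustration of the mechanism, not OTel's code.

```go
package main

import (
	"context"
	"fmt"
)

// span is a toy stand-in for an OTel span; it is NOT the SDK type.
type span struct {
	name   string
	id     int
	parent int // 0 means "no parent", i.e. a root span
}

type ctxKey struct{}

var nextID int // not goroutine-safe; fine for a single-threaded sketch

// start looks for a parent span in ctx, creates a child (or a root if none
// is found), and returns a new context carrying the new span.
func start(ctx context.Context, name string) (context.Context, *span) {
	nextID++
	s := &span{name: name, id: nextID}
	if p, ok := ctx.Value(ctxKey{}).(*span); ok {
		s.parent = p.id
	}
	return context.WithValue(ctx, ctxKey{}, s), s
}

// demo starts a request span, then one child from the request context and
// one from a fresh context.Background().
func demo() (root, good, bad *span) {
	reqCtx, root := start(context.Background(), "HTTP GET /user") // plays the role of r.Context()
	_, good = start(reqCtx, "GetUser")                            // threaded ctx: linked to root
	_, bad = start(context.Background(), "GetUser")               // fresh ctx: becomes its own root
	return root, good, bad
}

func main() {
	root, good, bad := demo()
	fmt.Println("threaded ctx is child of request:", good.parent == root.id) // true
	fmt.Println("Background ctx is orphaned root:", bad.parent == 0)         // true
}
```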
❓ Open Questions
- Head vs tail sampling trade-offs at high volume — where does the cost actually land?
🧠 Active Recall (answer without looking)
1. Q: Across a 3-service request, which ID is constant and which changes per hop?
   A: The trace-id is constant for the entire request; each service mints a new span-id whose parent is the incoming span-id. That parent chain renders as the trace waterfall.
2. Q: Where does the sampling decision come from on a downstream service?
   A: From the incoming `traceparent` trace-flags (low bit), set once at the root. Downstream services honor it rather than re-deciding, so the whole trace is sampled consistently (head sampling).
🪶 Feynman Reflection
A trace is a stopwatch that follows one request everywhere it goes. Every service it touches starts its own little timer (a span) and tags it with the shared trace-id so they can all be stitched back into one timeline. The `traceparent` header is just that shared ID, plus the caller's span-id and the sampled flag, handed from caller to callee in a fixed-format string.
🕳️ Knowledge Gaps
- Span links and baggage (cross-cutting key/values) — not yet used.
✅ Summary
I understand the trace/span model, can parse and propagate the W3C traceparent header by hand, and know how OTel's TracerProvider/exporter/propagator fit together.
⏭️ Next Steps / Prep for Tomorrow
- Day 144: liveness vs readiness probes and graceful traffic draining.
| Time spent | Difficulty | Confidence |
|---|---|---|
| 90 min | 🟦🟦🟦⬜⬜ | 🟦🟦🟦⬜⬜ |
Suggested commit: feat(examples): W3C trace-context propagation (day 143)