06 — Redis Job Queue
A Redis-backed background job/worker system in Go: enqueue typed jobs, process them on a bounded worker pool, retry with exponential backoff, dead-letter the poison ones, and watch it all through Prometheus.
Note: This project is its own Go module (`github.com/nabin747/go-from-zero/projects/06-job-queue-redis`) and is intentionally excluded from the repo-root CI. Before building or testing, run `go mod tidy` to resolve dependencies. The `worker` and `producer` binaries need a running Redis; the unit tests do not (they run against an in-memory fake).
Overview
This is a simplified, build-it-yourself take on production task queues like
asynq, Sidekiq, and Celery. Producers
enqueue JSON-serialized jobs; a pool of worker goroutines pulls them off Redis,
dispatches each to a registered handler by type, and applies the policy that
makes a queue actually reliable: retry with exponential backoff + jitter, a
max-attempts cap, and a dead-letter queue (DLQ) for jobs that exhaust
their attempts. Everything is instrumented with Prometheus. See the full
SPEC for the larger design (scheduling, recovery, priorities, ADRs).
The interesting part of a job queue is never "push and pop" — it is what happens at the edges: handler failures, backoff, poison messages, and shutting down without dropping work. This implementation focuses on exactly those.
Demo

```sh
# 1. start Redis
make redis     # docker run ... redis:7-alpine

# 2. start the worker pool (separate terminal)
make worker    # serves /metrics on :2112

# 3. enqueue some jobs
go run ./cmd/producer --type email:welcome --payload '{"user_id":42}' --count 5
go run ./cmd/producer --type demo:flaky --count 20    # exercises retries + DLQ

# 4. watch the metrics
curl -s localhost:2112/metrics | grep jobqueue_
curl -s localhost:2112/stats    # {"pending":..,"scheduled":..,"dead":..}
```
Architecture

```mermaid
flowchart LR
    P[producer CLI / app] -->|Enqueue JSON| PEND[("Redis LIST: jq:pending")]
    subgraph Redis
        PEND
        SCHED[("ZSET: jq:scheduled\nscore = run-at ms")]
        DEAD[("LIST: jq:dead (DLQ)")]
    end
    subgraph Pool["Worker pool (bounded concurrency)"]
        direction TB
        DQ[BLPOP pending] --> DISP[dispatch by Type\nvia Registry]
        DISP --> H[Handler]
        H --> R{result?}
    end
    PEND -->|promote due| DQ
    SCHED -->|ZRANGEBYSCORE now| PEND
    R -->|nil| OK[ack: jobs_processed_total++]
    R -->|err & attempt < max| RETRY[backoff+jitter\nEnqueueIn -> ZSET\njobs_retried_total++]
    RETRY --> SCHED
    R -->|err & attempt >= max\nor SkipRetry| DLQ[DeadLetter\njobs_failed_total++]
    DLQ --> DEAD
    H -.observe.-> HIST[processing_duration histogram]
```
A small Queue port (internal/queue) hides the broker behind an
interface. The default implementation is Redis (RPUSH/BLPOP + a scheduled
ZSET for backoff); an in-memory fake implements the same interface so the
worker/retry logic is unit-tested without Redis. The worker pool depends on a
narrow, consumer-defined subset of that interface (Dequeue / EnqueueIn /
DeadLetter), which both implementations satisfy.
Tech Stack
Go 1.22 · go-redis/v9 · prometheus/client_golang · log/slog · Docker · docker compose
Getting Started
Prerequisites
- Go 1.22+
- Docker (for Redis, and optionally the compose stack)
- `go mod tidy` once, to resolve/pin dependencies (this module ships `go.mod` with the direct requires; `tidy` fills in `go.sum` and indirect deps)
Run

```sh
git clone https://github.com/nabin747/go-from-zero
cd go-from-zero/projects/06-job-queue-redis
go mod tidy

# Redis: either a throwaway container...
docker run --rm -d --name jobqueue-redis -p 6379:6379 redis:7-alpine
# ...or the full stack (redis + worker):
make compose-up

make worker    # JQ_REDIS_ADDR, JQ_CONCURRENCY configurable
go run ./cmd/producer --type demo:flaky --count 10
```
Configuration (flags or env):
| Flag | Env | Default | Description |
|---|---|---|---|
| `--addr` | `JQ_REDIS_ADDR` | `localhost:6379` | Redis address |
| `--concurrency` | `JQ_CONCURRENCY` | `10` | worker goroutines (worker only) |
| `--metrics-addr` | `JQ_METRICS_ADDR` | `:2112` | metrics/health listen addr |
Test
Unit tests run entirely against the in-memory Queue fake — no Redis
required.
Project Layout

```text
cmd/
  worker/           worker pool binary: registers handlers, exposes /metrics, Run()
  producer/         demo enqueuer CLI
internal/
  job/              Job type, (de)serialization, Handler, handler Registry
  queue/            Queue port + Redis impl (RPUSH/BLPOP/ZSET) + in-memory fake
  worker/           bounded worker pool: dequeue -> dispatch -> retry/backoff/DLQ
  metrics/          Prometheus counters + processing-duration histogram
Dockerfile          multi-stage build for worker + producer (distroless)
docker-compose.yml  redis + worker
Makefile            tidy / build / test / race / run / compose targets
```
API
HTTP surface exposed by the worker (default :2112):
| Method | Path | Description |
|---|---|---|
| GET | /metrics | Prometheus exposition |
| GET | /healthz | 200 if Redis is reachable |
| GET | /stats | JSON: pending / scheduled / dead counts |
Prometheus metrics:
| Metric | Type | Labels | Meaning |
|---|---|---|---|
| `jobqueue_jobs_processed_total` | counter | type | jobs completed successfully |
| `jobqueue_jobs_failed_total` | counter | type | jobs moved to the DLQ |
| `jobqueue_jobs_retried_total` | counter | type | retries scheduled |
| `jobqueue_job_processing_duration_seconds` | histogram | type | handler execution time |
Testing Strategy
- Unit (table-driven, stdlib `testing`) against the in-memory `Queue` fake: success-first-try, succeed-after-N-retries, exhaust-to-DLQ, `SkipRetry`-to-DLQ, and unknown-type-to-DLQ; plus a graceful-shutdown test that asserts an in-flight job is allowed to finish during the drain.
- Backoff bounds/jitter are property-checked across attempts.
- Run with `-race` to validate the concurrent dequeue/process paths.
- Real-Redis integration (`BLPOP` blocking, scheduler timing) is a natural next step via `testcontainers-go`; see the SPEC.
Lessons Learned
Prompts to write up (see the SPEC for the full list):
- Why is exactly-once delivery practically impossible here, and what does at-least-once force every handler author to do (idempotency)?
- Walk through what happens to an in-flight job when a worker is `kill -9`'d: which Redis key holds it, and how would a recovery janitor re-queue it?
- Why exponential backoff with jitter rather than a fixed delay? What problem does the jitter solve under load (thundering herd)?
- How does the bounded worker pool apply backpressure when handlers are slower than the enqueue rate, and how would you alert on a queue falling behind?
- Why must poison messages be quarantined in a DLQ rather than retried forever?
Future Improvements
- Reliable-queue dequeue (`BLMOVE pending -> processing`) + a recovery janitor for crashed workers (the SPEC's headline reliability test).
- Delayed/scheduled jobs and per-queue priorities exposed in the public API.
- Unique/idempotent enqueue via `SET NX` dedup keys.
- `testcontainers-go` integration tests for the blocking + scheduler paths.
- Grafana dashboard + Redis Streams alternative (ADR).
🎒 Portfolio
Résumé bullets:
- "Built a Redis-backed background job queue in Go (`go-redis/v9`) with a bounded worker pool, exponential-backoff-with-jitter retries, a max-attempts cap, and a dead-letter queue for poison messages."
- "Designed the broker behind a small `Queue` port with an in-memory fake, enabling table-driven unit tests of the retry/DLQ/shutdown logic with no Redis dependency, run under `-race`."
- "Instrumented processing throughput, failure/retry rates, and a processing-latency histogram with Prometheus; shipped a multi-stage Docker build and a docker-compose stack."
Interview talking points:
- The `Queue` port + consumer-defined interface in the worker, and how it makes the Redis impl swappable and the logic testable.
- Graceful shutdown using two contexts: a dequeue-loop context cancelled on `SIGTERM`, and a separate job context kept alive so in-flight handlers drain within a deadline.
- At-least-once vs exactly-once, and why handlers must be idempotent.
- Backoff + jitter to avoid thundering-herd retries; DLQ for poison messages.