Taskly — Multi-Tenant SaaS Backend (Capstone)
A production-grade, multi-tenant project & task management backend in Go. Hexagonal architecture, strict tenant isolation, JWT + RBAC, Redis cache & rate limiting, and the full observability triad — all wired for graceful, horizontally-scalable operation.
This is the flagship capstone of the Learn Go track. See the full SPEC for the complete brief.
> [!IMPORTANT]
> This is its own Go module with heavy third-party dependencies. Before building you must run `go mod tidy` (to download deps and generate `go.sum`), and to run it you need Docker for PostgreSQL + Redis. The module is deliberately excluded from the repo-root Go CI matrix.
Overview
Taskly lets many independent organizations (tenants) share one deployment
while keeping each tenant's data strictly isolated. A user signs up and creates
an organization (becoming its owner); organizations contain projects,
projects contain tasks, and people join organizations as members with a
role (owner / admin / member). Every tenant-scoped query is filtered by
org_id taken from the verified JWT — never from client input — so no endpoint
can ever return another tenant's rows.
The project exists to exercise, end to end, what a senior Go backend engineer is expected to design and defend: clean ports-and-adapters architecture, multi-tenancy, JWT auth + RBAC, PostgreSQL (pgx) + migrations, Redis cache-aside + token-bucket rate limiting, structured logs + Prometheus metrics + OpenTelemetry traces, graceful shutdown, Docker / compose, and an ADR log.
Architecture
The application core (domain + use-cases) depends only on ports
(interfaces). Inbound adapters drive the core; outbound adapters implement the
ports. The composition root (cmd/api) is the only place that knows about
concrete infrastructure.
flowchart TB
subgraph Clients
WEB[Web / SPA]
CLI[curl / API client]
end
subgraph Edge["Middleware chain (internal/adapter/http)"]
direction LR
RID[request-id] --> REC[recover] --> MET[metrics] --> LOG[logging] --> RL[rate limit] --> AUTH[JWT auth] --> TEN[tenant guard] --> RBAC[RBAC]
end
subgraph Core["Application core (the hexagon)"]
UC["Use-cases / services<br/>auth · tenant · project · task"]
PORTS{{"Ports (interfaces)<br/>Repos · Cache · TokenIssuer<br/>RateLimiter · TxManager · EventPublisher"}}
DOM["Domain entities + rules<br/>(pure Go, no I/O)"]
UC --> DOM
UC --> PORTS
end
subgraph Outbound["Outbound adapters"]
PG[Postgres repos<br/>pgxpool]
RC[Redis cache + limiter<br/>go-redis]
JWT[JWT issuer + bcrypt]
EV[Event publisher]
end
subgraph Infra
DB[(PostgreSQL)]
REDIS[(Redis)]
JAEGER[(Jaeger)]
PROM[(Prometheus)]
end
WEB & CLI --> Edge --> UC
PORTS -. implemented by .-> PG & RC & JWT & EV
PG --> DB
RC --> REDIS
Edge -. metrics .-> PROM
Edge -. spans .-> JAEGER
A second view — the lifecycle of an authenticated, tenant-scoped
GET .../tasks request, including the cache-aside path:
sequenceDiagram
autonumber
participant C as Client
participant M as Middleware
participant H as chi Handler
participant U as TaskService
participant K as Redis cache (port)
participant R as Postgres repo (port)
participant D as PostgreSQL
C->>M: GET /v1/orgs/{org}/projects/{p}/tasks (Bearer JWT)
M->>M: request-id · metrics · rate limit
M->>M: verify JWT → Actor{user, org_id, role}
M->>M: tenant guard: path org == token org_id
M->>H: ctx(Actor)
H->>U: List(ctx, actor, filter, page)
U->>U: RBAC: role.Can(task:view)?
U->>K: Get(cacheKey(org,p))
alt cache hit
K-->>U: []Task
else miss
U->>R: List(ctx, org_id, filter, page)
R->>D: SELECT ... WHERE org_id=$1 AND project_id=$2 ...
D-->>R: rows
R-->>U: Page[Task]
U->>K: Set(key, value, ttl)
end
U-->>H: Page[Task]
H-->>C: 200 {data, page} (span ended, metrics recorded)
Key decisions are recorded as ADRs: hexagonal architecture · multi-tenancy · JWT + RBAC · observability & persistence.
Features
- Multi-tenancy — shared DB + `org_id` row scoping; tenant id is taken from the verified JWT and asserted against the URL; repositories take `tenantID` explicitly so the `WHERE org_id = $1` predicate can never be forgotten. RLS is available as defense-in-depth (`0002_rls.up.sql`).
- Auth — signup/login/refresh/logout; short-lived access JWT (HS256) + opaque, rotating refresh tokens with reuse (theft) detection; bcrypt password hashing.
- RBAC — `owner`/`admin`/`member` with a single-source-of-truth permission matrix, enforced at the route (fast 403) and in each use-case (authoritative).
- CRUD — organizations & members, projects, and tasks with filtering and keyset (cursor) pagination.
- Caching — Redis cache-aside for hot listings with write-through invalidation.
- Rate limiting — Redis token-bucket (atomic Lua) shared across replicas, with an in-memory fallback; returns `429` + `Retry-After`.
- Observability — `slog` JSON logs, Prometheus RED metrics, OpenTelemetry traces (OTLP → Jaeger), all correlated by `request_id`.
- Operability — `/healthz` + `/readyz`, graceful shutdown (drain on SIGTERM via `signal.NotifyContext` + `http.Server.Shutdown`), timeouts on all I/O, 12-factor config.
- Delivery — multi-stage distroless Dockerfile, full `docker-compose` stack, Makefile, and a CI pipeline.
Tech Stack
Go 1.22 · chi (router) · pgx/pgxpool (Postgres) · golang-migrate
(migrations) · go-redis (cache + limiter) · golang-jwt · bcrypt · slog
(logs) · prometheus/client_golang (metrics) · OpenTelemetry + otelhttp
(traces) · Docker / docker-compose · testcontainers-go (integration).
Getting Started
Prerequisites
- Go 1.22+
- Docker + Docker Compose (for PostgreSQL, Redis, Jaeger, Prometheus)
- `golang-migrate` CLI (only if running migrations outside compose)
1. Resolve dependencies (required)
go mod tidy # download deps and generate go.sum
2. Run the whole stack
cp .env.example .env # then edit JWT_SIGNING_KEY
make up # docker compose up --build (api + postgres + redis + prometheus + jaeger)
- API: http://localhost:8080 · Metrics: http://localhost:9090/metrics
- Jaeger UI: http://localhost:16686 · Prometheus UI: http://localhost:9091
Migrations run automatically as a one-shot migrate service before the API
starts.
3. Run the API directly (Postgres + Redis still needed)
4. Demo flow
# Sign up → returns access + refresh tokens and the new org id
curl -s localhost:8080/v1/auth/signup -H 'content-type: application/json' -d '{
"email":"ada@example.com","password":"supersecret","display_name":"Ada","org_name":"Acme"
}'
TOKEN=... # access_token from the response
ORG=... # tenant_id from the response
# Create a project, then a task under it
curl -s localhost:8080/v1/orgs/$ORG/projects -H "authorization: Bearer $TOKEN" \
-H 'content-type: application/json' -d '{"name":"Launch"}'
PROJECT=...
curl -s localhost:8080/v1/orgs/$ORG/projects/$PROJECT/tasks -H "authorization: Bearer $TOKEN" \
-H 'content-type: application/json' -d '{"title":"Write the README","priority":2}'
Test
make test # go test -race -cover ./... (service + platform unit tests)
make cover # HTML coverage report for ./internal/...
API
Base path /v1. The active org is the JWT org_id claim; org-scoped paths also
carry {org_id}, which must equal the claim or the request is 403.
Full contract in api/openapi.yaml.
| Method & Path | Description | Min role |
|---|---|---|
| `POST /v1/auth/signup` | Create user + first org (creator = owner) | public |
| `POST /v1/auth/login` | Verify password, issue access + refresh | public |
| `POST /v1/auth/refresh` | Rotate refresh token, new access token | valid refresh |
| `POST /v1/auth/logout` | Revoke active refresh token | member |
| `GET /v1/orgs` | List my organizations | member |
| `POST /v1/orgs` | Create a new organization | member |
| `GET /v1/orgs/{org_id}` | Get organization | member |
| `PATCH /v1/orgs/{org_id}` | Update organization | admin |
| `DELETE /v1/orgs/{org_id}` | Soft-delete organization | owner |
| `GET /v1/orgs/{org_id}/members` | List members | member |
| `PATCH /v1/orgs/{org_id}/members/{user_id}` | Change member role | admin |
| `DELETE /v1/orgs/{org_id}/members/{user_id}` | Remove member | admin |
| `GET /v1/orgs/{org_id}/projects` | List projects (paged/filtered) | member |
| `POST /v1/orgs/{org_id}/projects` | Create project | member |
| `GET/PATCH/DELETE …/projects/{project_id}` | Read / update / archive | member / admin / admin |
| `GET …/projects/{project_id}/tasks` | List tasks (filter status/assignee) | member |
| `POST …/projects/{project_id}/tasks` | Create task | member |
| `GET/PATCH/DELETE …/tasks/{task_id}` | Read / update / delete | member (delete: own only) |
Every non-2xx response shares one envelope:
code ∈ unauthenticated · forbidden · not_found · validation_failed ·
conflict · rate_limited · internal.
Project Layout
07-capstone-saas-backend/
├── cmd/api/ # composition root: wiring + graceful shutdown
├── internal/
│ ├── domain/ # entities, RBAC, errors, and PORT interfaces (no I/O)
│ ├── service/ # use-cases (auth, tenant, project, task) + unit tests
│ ├── adapter/
│ │ ├── http/ # chi router, handlers, middleware (auth/RBAC/ratelimit/…)
│ │ ├── postgres/ # pgx repositories + TxManager + keyset cursor
│ │ ├── redis/ # cache-aside + token-bucket rate limiter
│ │ └── events/ # EventPublisher (logging; swap for a queue worker)
│ └── platform/ # connectors & setup (config, logging, metrics,
│ # tracing, pgxpool, redis client, jwt+bcrypt, ratelimit)
├── db/migrations/ # golang-migrate *.up.sql / *.down.sql (+ optional RLS)
├── api/openapi.yaml # OpenAPI 3 contract
├── deployments/prometheus/ # Prometheus scrape config
├── docs/adr/ # Architecture Decision Records (Nygard)
├── Dockerfile · docker-compose.yml · Makefile · .env.example · .golangci.yml
└── go.mod
Testing Strategy
- Unit tests (core, fast, no infra): the entire service layer is tested table-driven against hand-written in-memory fakes of every port (`internal/service/*_test.go`). They cover signup/login, refresh rotation & reuse detection, the RBAC matrix, tenant isolation (tenant A cannot read/write tenant B), cache-aside hit/miss/invalidation, and task delete-ownership rules. The JWT issuer and the in-memory rate limiter also have pure unit tests.
- Integration tests (require Docker): repositories, migrations, the cache-aside path, the Redis token-bucket limiter, and the refresh-token store are intended to be exercised against real Postgres + Redis via `testcontainers-go` (spun up per suite in `TestMain`, torn down after). These catch what mocks cannot: SQL correctness, the `23505` → `ErrConflict` mapping, index-backed pagination, and Lua atomicity. Docker is required to run them.
- Everything under `-race`, with a coverage gate on `internal/...` in CI.
Observability
- Logs — `slog` (JSON in prod). One structured line per request: method, route, status, duration, `request_id`, and `user_id`/`tenant_id`/`role` when authenticated.
- Metrics — `/metrics` (port `9090`) exposes RED metrics (`taskly_http_requests_total`, `taskly_http_request_duration_seconds`, `taskly_http_in_flight_requests`) labelled by the chi route template (low cardinality), plus Go/process collectors.
- Traces — every request is an `otelhttp` server span; spans export over OTLP/gRPC to Jaeger. Tracing is a no-op when `OTEL_EXPORTER_OTLP_ENDPOINT` is unset.
- Correlation — the `request_id` ties a log line to its trace; given only a `request_id` you can pivot logs → trace → the slow span.
Security Notes
- Tenant id is sourced from the verified token, never request input; the URL `{org_id}` is asserted to equal it. Optional RLS is a database-level backstop.
- Rotating refresh tokens (hashed at rest) with reuse detection; pinned `HS256`, validated `iss`/`aud`/`exp`; bcrypt (cost ≥ 12); vague auth errors to avoid user enumeration.
- Strict JSON decoding with `DisallowUnknownFields` and a 1 MiB body cap; parameterized SQL everywhere; panics recovered into `500`s.
- No secrets in code — 12-factor env config; `JWT_SIGNING_KEY` must be ≥ 32 bytes or the process refuses to start.
Lessons Learned
- Ports earned their keep where a second implementation existed (Redis vs in-memory limiter; cache vs no-cache) — the service layer never changed. Where there was exactly one implementation, the interface was ceremony, so I kept ports minimal.
- Passing the `Actor` explicitly (instead of pulling identity from context inside use-cases) made the service tests trivial and the tenant/RBAC rules obvious to read.
- Cache-aside is easy to get subtly wrong. Caching only the hot, unfiltered first page kept invalidation exact; caching every filter/cursor permutation would have made correct invalidation far harder than it was worth.
- `TxManager` via context kept atomic signup clean without leaking `pgx.Tx` into the use-case signatures.
Future Improvements
- gRPC surface sharing the same use-cases (interceptors mirroring the HTTP chain).
- Real async worker (`cmd/worker`) consuming a Redis/asynq queue for invitation emails, notifications, and webhooks (the `EventPublisher` seam already exists).
- Invitations (tokenized accept flow) and richer member management.
- Turn on RLS end-to-end (`SET LOCAL app.current_org` plumbing).
- `sqlc`-generated queries; transactional outbox for reliable domain events.
- k6 load script asserting p99 < 150 ms; Grafana dashboards.
🎒 Portfolio
Résumé bullets
- Built a production-grade multi-tenant SaaS backend in Go with a hexagonal architecture and strict `org_id` tenant isolation (verified-JWT scoping + optional Postgres RLS), validated by dedicated tenant-isolation tests.
- Implemented end-to-end auth & authorization: short-lived access JWTs + rotating refresh tokens with reuse detection, bcrypt hashing, and a single-source-of-truth RBAC matrix enforced at both the edge and the use-case layer.
- Delivered the full observability triad (slog + Prometheus RED metrics + OpenTelemetry traces, correlated by `request_id`), Redis cache-aside + a shared token-bucket rate limiter, graceful shutdown, and a multi-stage distroless image with `docker-compose` and CI.
Interview talking points
- Multi-tenancy — why shared-DB + `org_id` over schema/DB-per-tenant, how the predicate is made un-forgettable (explicit `tenantID` params), and where RLS fits (ADR-0002).
- Hexagonal + DI — domain at the center, consumer-defined ports, adapters at the edge, one composition root; what made the service layer unit-testable in milliseconds (ADR-0001).
- Auth — token design, rotation, reuse detection ("present a revoked refresh token → revoke the whole family"), and what a leaked signing key does and doesn't expose (ADR-0003).
- Observability — when a log vs a metric vs a trace answers the question, and how `request_id` correlation makes an incident debuggable (ADR-0004).
- Caching & rate limiting — cache-aside with precise write-through invalidation, and an atomic Lua token bucket correct across replicas.
- Operations — graceful drain on SIGTERM behind a load balancer, readiness flipping during shutdown, and zero-downtime expand/contract migrations.
System-design stories this unlocks
- Design a multi-tenant SaaS — isolation strategies and the trade-offs chosen.
- Design an auth system — access/refresh, rotation, RBAC, hashing.
- Make this service observable — the triad and incident debugging.
- Add caching & rate limiting — invalidation and the token bucket.
- Decouple a slow side-effect — the `EventPublisher` seam → a worker queue.