Go DevOps: Docker & Redis Interview Questions

A curated set of interview questions covering containerizing Go services with Docker, shrinking image size, Redis use cases, and deployment patterns. Difficulty is marked per question: 🟢 junior, 🟡 mid, 🔴 senior.

Table of Contents

Docker Multi-Stage Builds
Q1. What is a multi-stage Docker build and why use one for Go?
Q2. Why set CGO_ENABLED=0 when building a Go image?
Q3. How do you build a minimal Go image with scratch?
Q4. What do the -s -w ldflags do and why use them?
Image Size & Optimization
Q5. How do you leverage Docker layer caching for go.mod/go.sum?
Q6. What is a .dockerignore and what belongs in it for Go?
Q7. scratch vs alpine vs distroless: how do you choose?
Q8. How do you run a Go container as a non-root user?
Q9. Your scratch image fails TLS calls and prints wrong times. Why?
Redis Use Cases
Q10. What are common Redis use cases for a Go backend?
Q11. How do you build a distributed rate limiter with Redis in Go?
Q12. How do you implement a cache-aside pattern in Go with Redis?
Q13. Redis Pub/Sub vs Streams vs Lists for a job queue?
Q14. How do you implement a distributed lock with Redis safely?
Deployment
Q15. How do you wire health checks and graceful shutdown for a Go service?
Q16. How do you inject configuration and secrets into a Go container?
Q17. How do you do zero-downtime deploys and roll back safely?

Docker Multi-Stage Builds

🟢 What is a multi-stage Docker build and why use one for Go?

A multi-stage build uses multiple `FROM` statements in one Dockerfile, where an early "build" stage compiles the code and a later "runtime" stage copies only the resulting artifact. For Go this is ideal: the build stage needs the full toolchain (compiler, modules, source), but the runtime only needs a single self-contained binary. You compile in a `golang` image, then `COPY --from=build` the binary into a tiny base like `scratch` or distroless. This drops a ~1GB build environment down to a ~10-20MB (or smaller) final image and removes the compiler and source from production.

FROM golang:1.22 AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /app ./cmd/server

FROM gcr.io/distroless/static
COPY --from=build /app /app
ENTRYPOINT ["/app"]

🟢 Why set CGO_ENABLED=0 when building a Go image?

`CGO_ENABLED=0` disables cgo so the compiler produces a fully static binary with no dynamic links to the system libc. This matters because base images like `scratch` and distroless `static` ship no libc or dynamic loader for a dynamically linked binary to use at runtime, so a cgo-enabled binary fails to start with a misleading "no such file or directory" (the missing file is the loader, not your binary). With cgo disabled, Go also uses its pure-Go DNS resolver and standard-library crypto, so you don't need `libc`, `libnss`, or other shared objects. The tradeoff is that any package requiring cgo (some SQLite drivers, certain crypto/FFI bindings) won't build; for those you target a base that ships a libc, such as distroless `base` (glibc) or alpine (musl) with the right libs, and build against that same libc.

RUN CGO_ENABLED=0 GOOS=linux go build -o /app ./cmd/server

🟡 How do you build a minimal Go image with scratch?

`scratch` is an empty base image with no files at all, so you build a static binary and copy only it (plus a couple of supporting files). You must set `CGO_ENABLED=0`, and if the app makes TLS calls or formats timezones you also copy `ca-certificates` and `tzdata` from the build stage. The result can be just a few MB — essentially the size of your binary.

FROM golang:1.22 AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -ldflags="-s -w" -o /app ./cmd/server

FROM scratch
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=build /usr/share/zoneinfo /usr/share/zoneinfo
COPY --from=build /app /app
ENTRYPOINT ["/app"]

The downside of `scratch` is zero debugging surface: no shell, no `ls`, no package manager, so you can't `docker exec` into a shell and instead debug via logs or ephemeral debug containers.

🟡 What do the `-s -w` ldflags do and why use them?

`-ldflags="-s -w"` strips debugging information from the binary: `-s` removes the symbol table and `-w` removes DWARF debug info. This typically shaves 20-30% off the binary size, which directly reduces image size and pull time. The same `-ldflags` is also where you inject build metadata via `-X`, for example a version string into a package variable. The tradeoff is that stripped binaries produce less useful stack traces in some debuggers/profilers (Go panics still print symbolized stacks because that uses runtime tables, but tools like `delve` lose info).

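# Pass the version in from outside (.git is usually excluded from the build context):
#   docker build --build-arg VERSION=$(git describe --tags) .
ARG VERSION=dev
RUN CGO_ENABLED=0 go build \
  -ldflags="-s -w -X main.version=${VERSION}" \
  -o /app ./cmd/server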
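
On the Go side, `-X` can only set a package-level string variable. A minimal sketch of the receiving end, assuming the variable is called `version` in package `main` to match the `-X main.version=...` flag above; the `"dev"` default and the `versionLine` helper are illustrative, not part of any standard layout:

```go
package main

import "fmt"

// version is overwritten at link time by -ldflags "-X main.version=<value>".
// It must be a package-level string variable (not a constant); "dev" is the
// fallback for a plain `go build` or `go run`.
var version = "dev"

// versionLine formats the value for a --version flag or a startup log line.
// (Hypothetical helper for this example.)
func versionLine() string {
	return "server " + version
}

func main() {
	fmt.Println(versionLine())
}
```

Logging this line at startup makes it easy to confirm which build is actually running in a container.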

Image Size & Optimization

🟢 How do you leverage Docker layer caching for go.mod/go.sum?

Copy `go.mod` and `go.sum` and run `go mod download` *before* copying the rest of the source. Docker caches each layer, and a layer is only rebuilt when its inputs change. Since dependencies change far less often than application code, isolating the download step means edits to your `.go` files don't re-trigger a full dependency fetch — Docker reuses the cached download layer. Without this split, any source change invalidates the cache and forces re-downloading every module on each build.

COPY go.mod go.sum ./
RUN go mod download      # cached unless go.mod/go.sum change
COPY . .                 # changes often, but download layer is reused
RUN go build -o /app ./cmd/server

You can go further with BuildKit cache mounts (`RUN --mount=type=cache,target=/go/pkg/mod`) to persist the module cache across builds even when go.sum changes.

🟢 What is a .dockerignore and what belongs in it for Go?

`.dockerignore` lists paths excluded from the build context sent to the Docker daemon, similar to `.gitignore`. A smaller context means faster builds and avoids accidentally copying secrets or bloat into the image via `COPY . .`. For Go you typically exclude the `.git` directory, local build artifacts, test data, and editor/CI files. It also prevents cache busting: if `.git` is in the context, every commit changes the context hash.

.git
*.md
Dockerfile
bin/
dist/
# optional, if you don't test inside the build
**/*_test.go
.env

🔴 scratch vs alpine vs distroless: how do you choose?

All three are small bases; the choice trades off size against debuggability and compatibility. `scratch` is empty and smallest — best for a pure static binary where you accept zero debug tooling. `alpine` (~5MB) uses musl libc and ships a shell and `apk`, so you can `exec` in and install tools, but musl can cause subtle differences versus glibc (DNS, locale) and you generally still build with `CGO_ENABLED=0`. Distroless `static` sits between them: no shell or package manager (so a smaller attack surface than alpine), but it ships `ca-certificates`, `/etc/passwd` with a `nonroot` user, and tzdata, which removes the manual copying you need with scratch. A common senior default is `gcr.io/distroless/static:nonroot` for static Go binaries: tiny, has certs/tz, runs as non-root, no shell for attackers.

🟡 How do you run a Go container as a non-root user?

Running as root inside a container is a security risk: a container escape or a compromised process starts with more privileges than it needs. You can create an unprivileged user in the build stage and copy `/etc/passwd`, or simply set a numeric UID, or use distroless's built-in `nonroot` user/tag. With `scratch` there's no `useradd`, so use a numeric `USER`.

# Distroless: built-in nonroot user
FROM gcr.io/distroless/static:nonroot
COPY --from=build /app /app
USER nonroot:nonroot
ENTRYPOINT ["/app"]

# scratch: numeric UID (no /etc/passwd needed)
FROM scratch
COPY --from=build /app /app
USER 65532:65532
ENTRYPOINT ["/app"]

Note the app should then listen on an unprivileged port (1024 or higher, e.g. 8080), since binding ports below 1024 as a non-root user traditionally requires the `CAP_NET_BIND_SERVICE` capability.

🔴 Your scratch image fails TLS calls and prints wrong times. Why?

`scratch` is empty, so two things the standard library expects at runtime are missing. TLS verification fails because there's no CA bundle (`/etc/ssl/certs/ca-certificates.crt`), so `crypto/tls` can't validate server certificates and returns `x509: certificate signed by unknown authority`. Time formatting in non-UTC zones is wrong because the `time` package reads the IANA tzdata from `/usr/share/zoneinfo`, which isn't present: `time.LoadLocation("America/New_York")` returns an error, and a zone requested through the `TZ` environment variable silently falls back to UTC. The fix is to copy both from the build stage, or import the `time/tzdata` package to embed the timezone database into the binary.

COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=build /usr/share/zoneinfo /usr/share/zoneinfo

import _ "time/tzdata" // embeds the tz database; no file copy needed
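
To make the timezone half of the fix concrete, here is a small sketch that embeds the database and surfaces the `LoadLocation` error instead of quietly continuing in UTC. The `localHour` helper and the sample timestamp are assumptions made for this example:

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embeds the IANA database, so no /usr/share/zoneinfo is needed
)

// localHour converts a Unix timestamp to the wall-clock hour in the named zone.
// It returns the LoadLocation error rather than silently falling back to UTC.
// (Hypothetical helper for this example.)
func localHour(unixSec int64, zone string) (int, error) {
	loc, err := time.LoadLocation(zone)
	if err != nil {
		return 0, fmt.Errorf("load zone %q: %w", zone, err)
	}
	return time.Unix(unixSec, 0).In(loc).Hour(), nil
}

func main() {
	// 1705320000 is 2024-01-15 12:00:00 UTC.
	h, err := localHour(1705320000, "America/New_York")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(h) // 7, because New York is UTC-5 in January
}
```

Embedding the database grows the binary somewhat, so copying `/usr/share/zoneinfo` remains a reasonable choice when image size matters more than a single-file deploy.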

Redis Use Cases

🟢 What are common Redis use cases for a Go backend?

Redis is an in-memory key-value store used where low latency and shared state across instances matter. Common uses include caching (cache-aside in front of a database), session storage, rate limiting (`INCR`/`EXPIRE`), distributed locks, leaderboards via sorted sets, ephemeral queues via lists or streams, and pub/sub for fan-out messaging. In Go you typically use `github.com/redis/go-redis/v9`, which provides a connection pool and context-aware methods. Because Redis is single-threaded for command execution, individual commands are atomic, which is what makes counters and locks reliable.

rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
if err := rdb.Set(ctx, "user:42", payload, 10*time.Minute).Err(); err != nil {
    return err
}

🔴 How do you build a distributed rate limiter with Redis in Go?

For a single process you'd use `golang.org/x/time/rate` (a token bucket), but that state lives in memory and isn't shared across replicas. For a distributed limiter you keep the counter in Redis so all instances see it. A simple fixed-window limiter uses `INCR` plus `EXPIRE`, but doing those as two commands has a race: if the process dies between them the key never expires. The fix is a Lua script so the increment and the conditional expire run atomically on the server.

var script = redis.NewScript(`
  local c = redis.call("INCR", KEYS[1])
  if c == 1 then redis.call("EXPIRE", KEYS[1], ARGV[1]) end
  return c
`)

func allow(ctx context.Context, rdb *redis.Client, key string, limit int64, window time.Duration) (bool, error) {
    n, err := script.Run(ctx, rdb, []string{key}, int(window.Seconds())).Int64()
    if err != nil {
        return false, err
    }
    return n <= limit, nil
}

Fixed windows allow bursts at the boundary; for smoother behavior use a sliding-window log/counter (sorted set of timestamps) or a token-bucket implemented in Lua. The tradeoff is each request now costs a network round trip to Redis.
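
The sliding-window log can be sketched in-process to show the algorithm itself. This is an illustration under stated assumptions, not a distributed limiter: the `slidingLog` type is hypothetical, it is not safe for concurrent use, and in Redis the same timestamps would live in a sorted set that is trimmed and counted atomically (for example inside a Lua script). Timestamps are plain Unix milliseconds, mirroring numeric sorted-set scores.

```go
package main

import "fmt"

// slidingLog records the times of accepted requests for a single key.
type slidingLog struct {
	limit    int     // max accepted requests per window
	windowMs int64   // window length in milliseconds
	hits     []int64 // accepted request times in Unix ms, oldest first
}

// allow reports whether a request at nowMs fits within the limit, counting
// only hits inside the trailing window (nowMs-windowMs, nowMs].
func (s *slidingLog) allow(nowMs int64) bool {
	cutoff := nowMs - s.windowMs
	// Drop hits that have slid out of the window.
	i := 0
	for i < len(s.hits) && s.hits[i] <= cutoff {
		i++
	}
	s.hits = s.hits[i:]
	if len(s.hits) >= s.limit {
		return false
	}
	s.hits = append(s.hits, nowMs)
	return true
}

func main() {
	s := &slidingLog{limit: 2, windowMs: 1000}
	for _, ms := range []int64{900, 950, 1050, 1900} {
		fmt.Println(ms, s.allow(ms))
	}
	// 900 and 950 are accepted. 1050 is rejected because both earlier hits
	// are still inside the trailing second; a fixed window that reset at
	// 1000 would have accepted it. 1900 is accepted once 900 has expired.
}
```

The cost of this approach is memory proportional to the limit per key, which is why a sliding-window counter or token bucket is often preferred for high limits.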

🟡 How do you implement a cache-aside pattern in Go with Redis?

In cache-aside the application checks Redis first; on a miss it reads the source of truth (the DB), stores the result in Redis with a TTL, and returns it. This keeps Redis populated only with data that's actually requested and bounds staleness via the TTL. The main hazards are cache stampedes (many concurrent misses hammer the DB for the same key) and stale data after writes; mitigate stampedes with a short lock or `singleflight`, and handle writes by invalidating or updating the key.

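// rdb (*redis.Client) and db are assumed to be package-level dependencies.
func getUser(ctx context.Context, id string) (User, error) {
    key := "user:" + id
    if b, err := rdb.Get(ctx, key).Bytes(); err == nil {
        var u User
        return u, json.Unmarshal(b, &u) // cache hit
    } else if !errors.Is(err, redis.Nil) {
        return User{}, err // real Redis error, not a miss
    }
    u, err := db.LoadUser(ctx, id) // miss -> source of truth
    if err != nil {
        return User{}, err
    }
    // Populate the cache best-effort: a failed write only costs a future miss.
    if b, err := json.Marshal(u); err == nil {
        _ = rdb.Set(ctx, key, b, 5*time.Minute).Err()
    }
    return u, nil
}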
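
The stampede mitigation mentioned above can be sketched with the standard library alone. The `coalescer` type below is a simplified stand-in for `golang.org/x/sync/singleflight`, written only to show the idea; it omits panic handling, so a loader that panics would leave waiters blocked. All names here are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// call is one in-flight load that later callers for the same key wait on.
type call struct {
	done chan struct{}
	val  string
	err  error
}

// coalescer deduplicates concurrent loads per key.
type coalescer struct {
	mu       sync.Mutex
	inflight map[string]*call
}

func newCoalescer() *coalescer {
	return &coalescer{inflight: make(map[string]*call)}
}

// do runs load at most once per key at a time; callers that arrive while a
// load is in flight wait for it and share its result.
func (c *coalescer) do(key string, load func() (string, error)) (string, error) {
	c.mu.Lock()
	if cl, ok := c.inflight[key]; ok {
		c.mu.Unlock()
		<-cl.done // wait for the leader to finish
		return cl.val, cl.err
	}
	cl := &call{done: make(chan struct{})}
	c.inflight[key] = cl
	c.mu.Unlock()

	cl.val, cl.err = load()

	c.mu.Lock()
	delete(c.inflight, key)
	c.mu.Unlock()
	close(cl.done) // publishes val and err to the waiters
	return cl.val, cl.err
}

// countLoads fires n concurrent lookups for one key and reports how many
// times the slow loader actually ran.
func countLoads(n int) int64 {
	c := newCoalescer()
	var loads atomic.Int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.do("user:42", func() (string, error) {
				loads.Add(1)
				time.Sleep(200 * time.Millisecond) // simulated slow DB query
				return "alice", nil
			})
		}()
	}
	wg.Wait()
	return loads.Load()
}

func main() {
	// Typically 1: callers that arrive while the load is in flight share it.
	fmt.Println(countLoads(10))
}
```

This only coalesces misses within one process; with many replicas each one may still hit the database once, which is where a short Redis lock per key comes in.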

🔴 Redis Pub/Sub vs Streams vs Lists for a job queue?

Pub/Sub is fire-and-forget: messages go only to currently-connected subscribers and are lost if no one is listening, so it's wrong for durable jobs. Lists (`LPUSH`/`BRPOP`) give a simple durable FIFO queue with blocking pop, but they lack consumer groups, acknowledgements, and replay — a crashed worker that already popped a job loses it. Streams (`XADD`/`XREADGROUP`/`XACK`) are the purpose-built choice: they persist messages, support consumer groups for load balancing across workers, track pending (unacked) entries so you can detect and reclaim jobs from dead consumers (`XPENDING`/`XCLAIM`), and allow replay from an ID. For an at-least-once job queue, use Streams with consumer groups and `XACK` after successful processing; make handlers idempotent because redelivery can happen.

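// Worker side: read as part of a consumer group, ack on success.
res, err := rdb.XReadGroup(ctx, &redis.XReadGroupArgs{
    Group: "workers", Consumer: hostname, Streams: []string{"jobs", ">"}, Count: 10, Block: 5 * time.Second,
}).Result()
if err != nil {
    return err // redis.Nil here just means the block timed out with no messages
}
for _, msg := range res[0].Messages { // one stream requested, so one result entry
    // ... process msg.Values; skip the ack on failure so the entry stays pending ...
    rdb.XAck(ctx, "jobs", "workers", msg.ID)
}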

🔴 How do you implement a distributed lock with Redis safely?

Acquire the lock with `SET key token NX PX <milliseconds>`: `NX` makes it succeed only if the key is absent (mutual exclusion) and `PX` sets an expiry in milliseconds so a crashed holder can't hold the lock forever. The critical detail is the random `token` value: when releasing you must verify *you* still own the lock and delete it atomically with a Lua script, otherwise a slow process whose TTL already expired could delete a lock now held by someone else. For correctness-critical locking across nodes, the Redlock algorithm (locking a majority of independent masters) exists, but it's debated; many systems instead use a real consensus store (etcd/ZooKeeper) when correctness matters more than availability.

ok, err := rdb.SetNX(ctx, "lock:order:42", token, 30*time.Second).Result()
if err != nil { return err }  // Redis failure: lock not acquired
if !ok { return errLocked }   // held by someone else

release := redis.NewScript(`
  if redis.call("GET", KEYS[1]) == ARGV[1] then
    return redis.call("DEL", KEYS[1])
  end
  return 0`)
release.Run(ctx, rdb, []string{"lock:order:42"}, token)
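
The snippet above assumes `token` already exists. One way to produce it, sketched here with a hypothetical `newLockToken` helper, is to draw it from `crypto/rand` so that two holders practically never share a value:

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newLockToken returns 128 bits of randomness as a 32-character hex string.
// Generate a fresh token for every acquisition attempt.
func newLockToken() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", fmt.Errorf("generate lock token: %w", err)
	}
	return hex.EncodeToString(b), nil
}

func main() {
	token, err := newLockToken()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(token)) // 32
}
```

A hostname or PID alone is a weaker choice, because a restarted process could reuse the same value and release a lock it no longer owns.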

Deployment

🟡 How do you wire health checks and graceful shutdown for a Go service?

Expose a liveness endpoint (process is up) and a readiness endpoint (dependencies like DB/Redis are reachable and the service should receive traffic) so the orchestrator routes traffic correctly. For graceful shutdown, listen for `SIGTERM`, stop accepting new connections, and let in-flight requests drain within a deadline using `http.Server.Shutdown(ctx)`. This prevents dropped requests during a rolling deploy, where Kubernetes sends SIGTERM and removes the pod from the load balancer at roughly the same time, so the process must keep serving until in-flight requests finish.

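srv := &http.Server{Addr: ":8080", Handler: mux}
go func() {
    // ErrServerClosed is the normal result of Shutdown, not a failure.
    if err := srv.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
        log.Fatal(err)
    }
}()

stop := make(chan os.Signal, 1)
signal.Notify(stop, syscall.SIGINT, syscall.SIGTERM)
<-stop

ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second)
defer cancel()
if err := srv.Shutdown(ctx); err != nil { // drains in-flight requests
    log.Printf("shutdown did not complete cleanly: %v", err)
}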
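
The liveness/readiness split can be sketched as two handlers sharing one flag. The `/healthz` and `/readyz` paths, the `health` type, and the `probe` demo helper are assumptions for this example; a real readiness check would usually also ping the database or Redis.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// health serves liveness and readiness probes.
type health struct {
	ready atomic.Bool // set to true once dependencies are reachable
}

func (h *health) routes() http.Handler {
	mux := http.NewServeMux()
	// Liveness: the process is up and able to serve HTTP.
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	// Readiness: 503 until ready, and again once shutdown begins.
	mux.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
		if !h.ready.Load() {
			http.Error(w, "not ready", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	return mux
}

// probe issues an in-memory GET and returns the status code (demo helper).
func probe(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	h := &health{}
	routes := h.routes()
	fmt.Println(probe(routes, "/readyz")) // 503 before dependencies are up
	h.ready.Store(true)
	fmt.Println(probe(routes, "/readyz")) // 200 once ready
	h.ready.Store(false)                  // flip this on SIGTERM, before Shutdown
	fmt.Println(probe(routes, "/healthz")) // 200: still alive while draining
}
```

Flipping readiness to false on SIGTERM gives the orchestrator a chance to stop routing new traffic while liveness keeps the pod from being restarted mid-drain.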

🟢 How do you inject configuration and secrets into a Go container?

Follow the twelve-factor approach: read configuration from environment variables (and mounted files for larger or sensitive values) rather than baking it into the image. This keeps one image promotable across dev/staging/prod and keeps secrets out of the image layers and git history. In Kubernetes, non-secret config comes from ConfigMaps and secrets from Secrets (mounted as env vars or files), often integrated with a vault. In Go, read with `os.Getenv` or a loader library and validate required values at startup so the process fails fast on misconfiguration.

cfg := Config{
    Port:     getEnv("PORT", "8080"),
    RedisURL: mustEnv("REDIS_URL"), // panic/exit if missing
}
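
The `getEnv` and `mustEnv` helpers used above are not part of the standard library. One possible shape for them, as a sketch (treating an empty value as missing is a design assumption, not a rule):

```go
package main

import (
	"fmt"
	"os"
)

// getEnv returns the variable's value, or fallback when it is unset or empty.
func getEnv(key, fallback string) string {
	if v, ok := os.LookupEnv(key); ok && v != "" {
		return v
	}
	return fallback
}

// mustEnv returns the variable's value and panics when it is unset or empty,
// so a misconfigured container fails at startup rather than on first use.
func mustEnv(key string) string {
	v, ok := os.LookupEnv(key)
	if !ok || v == "" {
		panic(fmt.Sprintf("required environment variable %s is not set", key))
	}
	return v
}

func main() {
	// Prints the PORT value from the environment, or 8080 when it is not set.
	fmt.Println(getEnv("PORT", "8080"))
}
```

Services with many settings often collect all missing variables and report them in one error instead of panicking on the first, which shortens the fix-and-redeploy loop.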

🔴 How do you do zero-downtime deploys and roll back safely?

Use a rolling update (or blue-green/canary) so new pods come up and pass readiness checks before old pods are removed, and combine it with graceful shutdown so in-flight requests drain. Tag every image with an immutable version (a git SHA, not `latest`) so a rollback is just redeploying the previous known-good tag, and so the running version is unambiguous. Keep deployments backward compatible — especially database migrations, which should be expand-then-contract so the old and new code versions can both run against the schema during the overlap window. Watch error rate and latency during the rollout (canary first if possible) and automate rollback on a failed health/SLO check. Key practices: readiness gating, `preStop` hook + SIGTERM draining, immutable version tags, and decoupled, backward-compatible migrations.