A focused look at the techniques that shrink container images: which actually pay off, which are folklore, and the discipline that keeps images small over time.
Image size matters: it affects pull time, deploy speed, storage cost, and security surface. Most "make your image smaller" advice is correct in spirit and overstated in practice — the techniques that move the needle are a small subset. After applying these on production images, this is the version that actually works, with the techniques ranked by impact.
The benefits of smaller images:
Faster pulls during scaling events. When a node fails and 30 pods need to start on a new node, image pull becomes the bottleneck. A 1GB image takes roughly 5x longer to pull than a 200MB one.
Faster deploys. New version pushed; image pulls everywhere; pods restart. Smaller image = faster deploy.
Less storage cost. ECR / Docker Hub charge for stored images. Across many services and historical versions, this adds up.
Smaller security surface. Fewer packages = fewer CVEs to scan, fewer chances of a vulnerable dependency in the image. This is the underrated win.
For a single image, these are minor. Across a fleet of services × historical versions × deploys, they compound.
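To make the pull-time claim concrete, here is a rough back-of-the-envelope sketch. It assumes pull time scales linearly with image size at a fixed effective bandwidth, which ignores compression, layer reuse, and parallel layer downloads. The 100 MB/s figure is an illustrative assumption, not a measurement.

```python
def pull_seconds(image_mb: float, bandwidth_mb_per_s: float = 100.0) -> float:
    """Naive pull-time estimate: image size divided by effective bandwidth."""
    return image_mb / bandwidth_mb_per_s


large = pull_seconds(1000)  # 1 GB image
small = pull_seconds(200)   # 200 MB image
print(f"{large:.0f}s vs {small:.0f}s, ratio {large / small:.0f}x")
```

Real pulls rarely behave this cleanly, but the ratio is a reasonable first approximation when the node has none of the layers cached.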
Multi-stage builds are the biggest single lever. Build with the full toolchain in one stage; copy only the artifact to a smaller runtime stage:
# Build stage
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /app ./cmd/server
# Runtime stage
FROM gcr.io/distroless/static-debian12 AS runtime
COPY --from=build /app /app
ENTRYPOINT ["/app"]
Result for a typical Go service: ~15MB final image. The Go binary is essentially the entire image. Everything else (build tools, source, intermediate caches) is in the build stage and doesn't make it to runtime.
Without multi-stage, the same Go service built directly on golang:1.22 is ~900MB, so the multi-stage image is ~60x smaller.
For Python (harder because of runtime dependencies):
FROM python:3.12-slim AS build
WORKDIR /src
RUN apt-get update && apt-get install -y gcc g++ libffi-dev
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt
FROM python:3.12-slim AS runtime
WORKDIR /app
COPY --from=build /root/.local /root/.local
COPY . .
ENV PATH=/root/.local/bin:$PATH
CMD ["python", "-m", "myapp"]
This is smaller than baking everything into one stage, but Python images are still chunky (~300-500MB) because of the interpreter and dependencies.
The base image is most of your size:
| Base | Approximate size |
|---|---|
| scratch | 0 (empty) |
| gcr.io/distroless/static | ~2 MB |
| gcr.io/distroless/base | ~17 MB |
| alpine:3.19 | ~7 MB |
| debian:12-slim | ~74 MB |
| ubuntu:22.04 | ~78 MB |
| python:3.12-slim | ~120 MB |
| node:20-alpine | ~135 MB |
| node:20-slim | ~250 MB |
Switching from ubuntu:22.04 to gcr.io/distroless/base: 78MB → 17MB. Free win for any image that doesn't need a shell.
Distroless caveats:
Use the :debug variants if you need a shell during troubleshooting.
For most services, distroless or alpine is the right choice. Full distros only if you specifically need them.
Each filesystem-changing Dockerfile instruction (RUN, COPY, ADD) creates a layer. Layer caching speeds up builds; wrong ordering kills the cache.
The key rule: most-stable things first, most-changing things last.
FROM python:3.12-slim
WORKDIR /app
# Dependencies change rarely
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Source code changes every commit
COPY . .
CMD ["python", "-m", "myapp"]
When you change source code, only the last COPY and beyond rebuild. The pip install layer stays cached. Switch the order, and every commit reinstalls everything.
For most projects: copy lockfile, install deps, copy source. Three steps, in that order.
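The ordering rule can be illustrated with a toy model of the layer cache. This is a simplified sketch, not how Docker is implemented: it assumes a step rebuilds when any of its input files changed or when any earlier step rebuilt. The step names and the `layers_rebuilt` helper are hypothetical.

```python
def layers_rebuilt(steps, changed_inputs):
    """Return the names of steps that must rebuild.

    steps: ordered list of (name, inputs) pairs, where inputs is the set of
    files the step reads from the build context. Once one step is
    invalidated, every later step rebuilds as well.
    """
    rebuilt = []
    invalidated = False
    for name, inputs in steps:
        if invalidated or inputs & changed_inputs:
            invalidated = True
            rebuilt.append(name)
    return rebuilt


good_order = [
    ("COPY requirements.txt", {"requirements.txt"}),
    ("RUN pip install", set()),
    ("COPY source", {"requirements.txt", "src"}),
]
bad_order = [
    ("COPY source", {"requirements.txt", "src"}),
    ("RUN pip install", set()),
]

# A commit that only touches source code:
print(layers_rebuilt(good_order, {"src"}))  # ['COPY source']
print(layers_rebuilt(bad_order, {"src"}))   # ['COPY source', 'RUN pip install']
```

With the stable-first order, a source-only change leaves the dependency install cached. With the order reversed, the same change forces a reinstall.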
apt-get install adds files to the layer; cleanup has to happen in the same RUN, or the removed files still ship in the earlier layer:
# Wrong
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get clean
# Right
RUN apt-get update && \
apt-get install -y --no-install-recommends curl && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
Each RUN creates a layer. The wrong version has cleanup in a separate layer, but the dirty files are in earlier layers and persist. The right version cleans up in the same layer, so the layer doesn't include the deleted files.
--no-install-recommends is also worth knowing: it skips recommended (optional) packages. This often saves 50%+ of install size.
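The same-RUN cleanup rule can be sketched with a small model of layered storage. It assumes each RUN stores only the files it leaves behind when it finishes, and that a delete in a later RUN hides a file without reclaiming the bytes from the earlier layer. The sizes are illustrative, and whiteout bookkeeping is ignored.

```python
def layer_diff(ops):
    """Files a single RUN leaves behind; an add followed by rm in the same RUN cancels out."""
    files = {}
    for op, path, size_mb in ops:
        if op == "add":
            files[path] = size_mb
        elif op == "rm":
            files.pop(path, None)
    return files


def image_size(runs):
    """Total image size: the sum of what every layer stores."""
    return sum(sum(layer_diff(ops).values()) for ops in runs)


# Three separate RUN instructions: the apt lists stay in the first layer.
wrong = [
    [("add", "/var/lib/apt/lists", 40)],
    [("add", "/usr/bin/curl", 10)],
    [("rm", "/var/lib/apt/lists", 0)],
]
# One RUN: the lists are created and removed before the layer is committed.
right = [
    [
        ("add", "/var/lib/apt/lists", 40),
        ("add", "/usr/bin/curl", 10),
        ("rm", "/var/lib/apt/lists", 0),
    ],
]

print(image_size(wrong), image_size(right))  # 50 10
```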
Some languages compile to a single binary you can run from scratch:
Go: with CGO_ENABLED=0, compiles to a fully static binary. Run from scratch.
Rust: build with --target=x86_64-unknown-linux-musl for a fully static binary.
For a Go service, the runtime image is just the binary. ~15MB total.
scratch is the smallest possible base, but it gives you nothing: no CA certs, no shell, no /etc/passwd. Distroless static adds the essentials (CA certs, /etc/passwd, timezone data) in ~2MB, still without a shell.
For Node services, node_modules is huge:
npm install: includes dev dependencies (~200-500MB extra)
npm install --production: production only (~100-300MB)
Bundling for production cuts node_modules out of the runtime image entirely. The trade is build complexity (esbuild config, ensuring runtime imports work).
We bundle for a few high-volume Node services where image size matters. For most, npm prune --production is enough.
A few "best practices" that aren't worth bothering with:
Squashing layers. docker build --squash (an experimental flag on the classic builder) collapses all layers into one. Saves a tiny bit (no per-layer overhead) but breaks layer sharing and caching for downstream consumers. Not worth it.
Aggressive base-image swapping. Switching from python:3.12 to python:3.12-slim saves ~600MB. Switching from slim to alpine saves another 50MB. Switching from alpine to manually-built minimal Python: maybe 20MB more, with significant ongoing maintenance burden. Diminishing returns.
Removing locale data. Saves ~20MB. Breaks some apps. Not worth it unless you're really pushing.
Custom-stripped base images. Building from scratch with only what you need — possible but the maintenance burden outweighs the savings for most teams.
We track image size as a CI metric. Standard checks:
10% growth = warning in PR
25% growth = require justification in PR description
Without tracking, image sizes grow silently. Someone adds a dependency for one feature; the image is now 100MB bigger; nobody notices.
The metrics dashboard shows image sizes per service over time. Spikes are visible.
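The thresholds above could be enforced with a check along these lines. This is a minimal sketch: the function name, the return labels, and treating the thresholds as inclusive are assumptions, and how the previous and current sizes are obtained in CI is left out.

```python
def classify_growth(previous_mb: float, current_mb: float) -> str:
    """Map image size growth between two builds to a PR action."""
    if previous_mb <= 0:
        raise ValueError("previous size must be positive")
    growth = (current_mb - previous_mb) / previous_mb
    if growth >= 0.25:
        return "require-justification"
    if growth >= 0.10:
        return "warn"
    return "ok"


print(classify_growth(200, 215))  # ok
print(classify_growth(200, 230))  # warn
print(classify_growth(200, 300))  # require-justification
```

Comparing against the size of the last build on the main branch, rather than the previous commit in the PR, keeps a series of small increases from slipping through unnoticed.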
Old images accumulate in registries. ECR doesn't auto-clean by default.
We have a lifecycle policy:
Without this, ECR storage costs creep up. Our cleanup saves ~$80/month on storage.
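The rules of that policy are not listed here, so the following is only a hypothetical example of what an ECR lifecycle policy can look like, generated as JSON. The 14-day and 50-image values are illustrative assumptions, not the settings described in this article.

```python
import json

# Hypothetical rules; the retention numbers are placeholders.
lifecycle_policy = {
    "rules": [
        {
            "rulePriority": 1,
            "description": "Expire untagged images after 14 days (illustrative)",
            "selection": {
                "tagStatus": "untagged",
                "countType": "sinceImagePushed",
                "countUnit": "days",
                "countNumber": 14,
            },
            "action": {"type": "expire"},
        },
        {
            "rulePriority": 2,
            "description": "Keep only the 50 most recent images (illustrative)",
            "selection": {
                "tagStatus": "any",
                "countType": "imageCountMoreThan",
                "countNumber": 50,
            },
            "action": {"type": "expire"},
        },
    ]
}

policy_json = json.dumps(lifecycle_policy, indent=2)
print(policy_json)
```

The resulting JSON can be saved to a file and applied with `aws ecr put-lifecycle-policy --repository-name <repo> --lifecycle-policy-text file://policy.json`.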
A few of our images, before and after optimization:
| Service | Before | After | Technique |
|---|---|---|---|
| Go API service | 920 MB | 15 MB | Multi-stage + distroless static |
| Python API (FastAPI) | 1.1 GB | 380 MB | Multi-stage + python-slim |
| Node API (Express) | 800 MB | 280 MB | Multi-stage + distroless-node |
| Internal data tool (Python + ML) | 4.2 GB | 1.8 GB | CPU-only torch + multi-stage |
The Go service dropped to 15 MB; the Python and Node services went from roughly 1 GB to about 280-380 MB. The ML tool is the exception: it stays large because it is dominated by the libraries themselves (numpy, pandas, torch).
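Expressed as percentage reductions, the table works out as follows. The sizes are copied from the table, with 1 GB treated as 1000 MB for simplicity.

```python
# (before_mb, after_mb) per service, taken from the table.
results = {
    "Go API service": (920, 15),
    "Python API (FastAPI)": (1100, 380),
    "Node API (Express)": (800, 280),
    "Internal data tool (Python + ML)": (4200, 1800),
}


def reduction_pct(before_mb: float, after_mb: float) -> float:
    """Size reduction as a percentage of the original size, to one decimal."""
    return round(100 * (before_mb - after_mb) / before_mb, 1)


for service, (before, after) in results.items():
    print(f"{service}: {reduction_pct(before, after)}% smaller")
```

This prints 98.4%, 65.5%, 65.0%, and 57.1% respectively.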
Multi-stage from day one. Don't ship single-stage Dockerfiles.
Distroless or slim/alpine bases. Full distros only if specifically needed.
Layer ordering: stable first. Cache dependencies; rebuild source last.
Track image size in CI. Catch growth before it ships.
Rebuild nightly. Keeps base patches fresh.
Don't chase the last 10MB. Diminishing returns; maintenance cost grows.
Image size is one of those metrics that pays off in operational quality even though it's not visible in features. Smaller images = faster deploys = better incident response. The patterns above are mature; the discipline is in applying them consistently across every service. Once your team is in the habit, images stay small without active effort.