Writing a Dockerfile
A Dockerfile is a text recipe that tells Docker how to assemble an image. Instructions that change the filesystem, such as RUN and COPY, each create a layer, while the rest record image metadata. Because the file is plain text, the whole build can be kept in version control and repeated on any machine. Mastering the Dockerfile is what turns “I can run containers” into “I can ship them.”
Core instructions
A handful of instructions cover the vast majority of real Dockerfiles.
| Instruction | Purpose |
|---|---|
| FROM | Sets the base image every build starts from. |
| WORKDIR | Sets the working directory for subsequent instructions. |
| COPY | Copies files from the build context into the image. |
| RUN | Executes a command at build time, producing a new layer. |
| ENV | Defines environment variables available at build and run time. |
| EXPOSE | Documents the port the container listens on. |
| CMD | Default command run when the container starts (overridable). |
| ENTRYPOINT | Fixed executable for the container; CMD supplies its arguments. |
CMD vs ENTRYPOINT: use ENTRYPOINT for the program that always runs (e.g. ["node"]) and CMD for default arguments (e.g. ["server.js"]). Together they make a container that behaves like a single executable.
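As a minimal sketch of that pairing, the fragment below fixes node as the executable and leaves server.js as the default argument. The base image and file names are borrowed from the Node.js example in this article; the image tag mentioned afterwards is hypothetical.

```dockerfile
FROM node:20-alpine
WORKDIR /app
COPY . .
# ENTRYPOINT fixes the executable that always runs
ENTRYPOINT ["node"]
# CMD supplies the default arguments appended to ENTRYPOINT
CMD ["server.js"]
```

If this were built and tagged as myapp:entry (an assumed name), running docker run myapp:entry would execute node server.js, while docker run myapp:entry --version would replace only the CMD part and execute node --version.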
A real Node.js example
FROM node:20-alpine
WORKDIR /app
# Copy manifests first so dependency install is cached
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
# Then copy the rest of the source
COPY . .
ENV NODE_ENV=production
EXPOSE 3000
CMD ["node", "server.js"]
Build and run it:
docker build -t myapp:1.0 .
docker run -d -p 3000:3000 myapp:1.0
Layer caching
Docker caches each layer and reuses it on the next build if the instruction and its inputs are unchanged. The moment one layer’s input changes, that layer and every layer after it are rebuilt.
This is why the example copies package.json before the application source: dependencies change rarely, so the expensive npm ci layer stays cached across most code edits. Reversing the order would reinstall dependencies on every source change.
Order Dockerfile instructions from least-frequently-changed to most-frequently-changed to maximize cache hits and shrink build times.
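For contrast, here is a sketch of the cache-unfriendly ordering described above. It is the same Node.js example with the two COPY steps collapsed into one that runs before the install. It builds correctly, but it is shown only to illustrate what to avoid.

```dockerfile
# Cache-unfriendly ordering, shown for contrast only
FROM node:20-alpine
WORKDIR /app
# Copying everything first ties this layer's cache to every source file
COPY . .
# Any source edit invalidates the COPY above, so this install reruns each time
RUN npm ci --omit=dev
ENV NODE_ENV=production
EXPOSE 3000
CMD ["node", "server.js"]
```

With this ordering, editing a single line of application code changes the input of the COPY layer, so the npm ci layer after it can no longer be served from cache.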
Multi-stage builds
Multi-stage builds let you compile in one stage and copy only the artifacts into a lean final image, leaving out build tools and source code and reducing the attack surface. Here is a Java/Maven example:
# --- Build stage ---
FROM maven:3.9-eclipse-temurin-21 AS build
WORKDIR /src
COPY pom.xml .
RUN mvn -B dependency:go-offline
COPY src ./src
RUN mvn -B package -DskipTests
# --- Runtime stage ---
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
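# Assumes the Maven build names its artifact app.jar (e.g. via finalName); adjust the path otherwise
COPY --from=build /src/target/app.jar app.jar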
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]
The final image contains only a JRE and the compiled JAR, often shrinking the result from over 600 MB to under 200 MB.
Trimming the build context with .dockerignore
When you run docker build, the entire directory (the build context) is sent to the daemon. A .dockerignore file excludes files you do not need, speeding builds and avoiding leaked secrets:
node_modules
.git
*.log
.env
dist
Dockerfile
Best practices
- Use small, official base images (-alpine or -slim variants) to cut size and CVEs.
- Order instructions to maximize layer cache reuse; copy dependency manifests first.
- Use multi-stage builds to keep build tooling out of the final image.
- Combine related RUN commands with && to reduce layer count, and clean caches in the same layer.
- Always include a .dockerignore to shrink context and prevent secret leakage.
- Run as a non-root user (USER) in production images for defense in depth.
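Two of the practices above can be sketched together: combining RUN commands with cleanup in the same layer, and switching to a non-root user. This is a minimal illustration rather than a hardened image. The debian:bookworm-slim base, the curl package, and the appuser account name are assumptions chosen for the example.

```dockerfile
FROM debian:bookworm-slim
# Install and clean up in one RUN so the apt package lists never persist in a layer
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl \
 && rm -rf /var/lib/apt/lists/*
# Create an unprivileged account (hypothetical name) with its own home directory
RUN useradd --create-home appuser
# Every later instruction, and the container itself, runs as this user
USER appuser
WORKDIR /home/appuser
CMD ["curl", "--version"]
```

Splitting the cleanup into a separate RUN would not shrink the image, because the package lists would already be stored in the earlier layer.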