What are the main instructions used in Dockerfile?
A Dockerfile consists of instructions, each of which creates a new layer in the image. Here are the main ones:
Junior Level
Main Instructions
A Dockerfile consists of instructions, each of which creates a new layer in the image. Here are the main ones:
Environment Definition Instructions
| Instruction | What it does | Example |
|---|---|---|
FROM |
Sets the base image | FROM openjdk:17-slim |
WORKDIR |
Sets the working directory | WORKDIR /app |
ENV |
Sets environment variables | ENV APP_PORT=8080 |
File Operation Instructions
| Instruction | What it does |
|---|---|
COPY |
Copies files from host into the image |
ADD |
Like COPY, but can unpack archives |
Command Execution Instructions
| Instruction | What it does |
|---|---|
RUN |
Executes a command during image build |
CMD |
Default command when container starts |
ENTRYPOINT |
Main container launch command |
Configuration Instructions
| Instruction | What it does |
|---|---|
EXPOSE |
Documents the application port |
USER |
Specifies the user for running |
Simple Example
FROM openjdk:17-jdk-slim # Base image
WORKDIR /app # Working directory
COPY myapp.jar app.jar # Copy jar file
EXPOSE 8080 # Port (documentation)
ENTRYPOINT ["java", "-jar", "app.jar"] # Launch command
COPY vs ADD
- COPY — preferred option (95% of cases).
- ADD — if you need auto-extraction of tar archives or downloading from URL (but for URLs, RUN curl is better).
Exec form ["cmd", "arg"] — command runs directly. Shell form cmd arg — through shell /bin/sh -c.
ENV — variable is available both during build and in the running container. ARG — only during build.
What to Remember
FROMis always the first instructionRUN— executes during build,CMD/ENTRYPOINT— at launchCOPY— preferred option (95% of cases).ADD— if you need auto-extraction of tar archives or URL download.EXPOSEdoesn’t open the port, only documents it- Use
WORKDIRinstead ofRUN cd ...
Middle Level
Instruction Classification
1. Environment Definition Instructions
FROM — sets the base image. Any Dockerfile starts with this. Best Practice: specify a concrete version (openjdk:17-slim), not latest.
ENV — sets environment variables. Available both during build and in the running container.
ENV JAVA_OPTS="-Xmx512m"
ENV APP_ENV=production
ARG — defines variables available only during the build process.
ARG APP_VERSION=1.0.0
RUN echo "Building version $APP_VERSION"
WORKDIR — sets the working directory. All subsequent commands execute relative to it.
WORKDIR /app # better than RUN cd /app
2. File Operation Instructions
COPY — copies files from host into the image.
COPY pom.xml /app/
COPY src/ /app/src/
ADD — extended version of COPY. Can unpack archives (.tar.gz) and download files from URLs.
ADD app.tar.gz /app/ # automatically unpacks
3. Command Execution Instructions
RUN — executes a command during build and records the result in a new layer.
# Combine commands to reduce layers
RUN apt-get update && \
apt-get install -y git curl && \
rm -rf /var/lib/apt/lists/*
// Deletion in the same RUN is critical: if you delete in a separate RUN,
// files remain in the lower layer and will be in the image.
CMD — sets the default command on container start. Easy to override.
CMD ["--server.port=8080"]
ENTRYPOINT — defines the main launch command. Harder to override.
ENTRYPOINT ["java", "-jar", "/app.jar"]
4. Access Configuration Instructions
EXPOSE — documents the port. Does not actually publish it (you need -p at launch).
USER — specifies the user for running.
RUN useradd -r appuser
USER appuser
VOLUME — creates a mount point for persistent data.
VOLUME ["/data"]
Typical Mistakes
| Mistake | Consequence | How to avoid |
|---|---|---|
RUN apt-get update in a separate layer |
Cache goes stale, packages not found | Combine update && install in one RUN |
Using shell form CMD java -jar |
Signals don’t reach the application | Use exec form ["java", "-jar"] |
| Passing secrets via ARG | Passwords visible in docker history |
Use BuildKit secrets (--mount=type=secret) |
| Deleting files in a separate RUN | Files remain in lower layers | Delete in the same RUN where you created them |
| Multiple CMD/ENTRYPOINT | Only the last one counts | One CMD, one ENTRYPOINT per file |
CMD vs ENTRYPOINT Comparison
| Instruction | Can override? | Main purpose |
|---|---|---|
| ENTRYPOINT | With difficulty (--entrypoint) |
Fixed command |
| CMD | Very easily | Default parameters |
Best Practices
- Combine
RUNcommands via&&to reduce layers - Clean package cache in the same
RUNlayer - Use
WORKDIRinstead ofRUN cdchains - Always use Multi-stage build to separate build and runtime
What to Remember
- Each
RUN,COPY,ADDcreates a new layer - Instruction order affects caching
- Use Exec form
["cmd", "arg"]instead of Shell form - Clean cache in the same layer as installation
ENTRYPOINT+CMDtogether — best practice
Senior Level
Instruction Architecture and Image Impact
Understanding instruction nuances is critical for creating secure, compact, and fast-to-build images.
Deep Analysis: Layered Model
Each RUN, COPY, ADD instruction creates a new layer. Layers are read-only filesystems combined through UnionFS (Overlay2).
Critical consequence: deleting a file in a new layer doesn’t remove it from the image — only a “whiteout” entry is created. File size in image = sum of all layers where it appears.
# BAD: file remains in lower layer
RUN apt-get update && apt-get install -y package
RUN rm -rf /var/lib/apt/lists/*
# GOOD: one layer, cache cleaned immediately
RUN apt-get update && \
apt-get install -y package && \
rm -rf /var/lib/apt/lists/*
Trade-offs
| Decision | Plus | Minus |
|---|---|---|
| Shell form | Convenience (pipes, variables) | PID 1 problem, signals don’t reach |
| Exec form | Proper signal handling | No shell functionality |
| ARG for config | Simple | Visible in docker history, not runtime |
| ENV for config | Available at runtime | Visible in docker inspect |
| ADD for URL | No RUN curl/wget needed | Unpredictable cache, no retry |
| RUN curl/wget | Control, retry, checksum | Additional layer |
ARG vs ENV: Subtleties
| Characteristic | ARG | ENV |
|---|---|---|
| Available during build | Yes | Yes |
| Available in container | No | Yes |
Visible in docker inspect |
No | Yes |
Visible in docker history |
Yes (value!) | Yes |
Security warning: ARG values are visible in docker history. Don’t pass secrets through ARG!
# BAD: password visible in docker history
ARG DB_PASSWORD=secret123
# GOOD: BuildKit secrets
RUN --mount=type=secret,id=db_pass cat /run/secrets/db_pass
Shell form vs Exec form: Critical Nuance
Exec form (recommended):
ENTRYPOINT ["java", "-jar", "/app.jar"]
Runs directly as a process with PID 1. Correctly handles signals (SIGTERM, SIGKILL). Critical for graceful shutdown in Kubernetes.
Shell form:
ENTRYPOINT java -jar /app.jar
Runs as a subprocess of /bin/sh -c. OS signals arrive at the sh shell, not the application. The application may be “killed” hard without completing transactions.
PID 1 problem: in Linux, process with PID 1 has special behavior — it doesn’t receive SIGTERM by default. Solution: exec form, tini, or docker run --init.
Edge Cases
- ONBUILD in multi-stage:
ONBUILDinstructions execute when the image is used as a base. In multi-stage this can lead to unexpected side effects. - Glob patterns in COPY:
COPY target/*.jar— if no files exist, build fails. If multiple files, all are copied to the specified directory. - Symbolic links:
COPYfollows symlinks on the host. This may include unexpected files. - Timestamps:
COPYpreserves file mtime. This affects build determinism. BuildKit--metadata-filehelps track. - ENV and escaping:
ENV FOO=bar\ baz— space in value.ENV FOO="bar baz"— quotes are included in the value.
HEALTHCHECK: Production Obligation
HEALTHCHECK --interval=30s --timeout=3s --retries=3 --start-period=60s \
CMD curl -f http://localhost:8080/actuator/health || exit 1
Without HEALTHCHECK the orchestrator doesn’t know if the application is alive. The container may be Running, but the application inside — dead (deadlock, out of memory).
BuildKit: Advanced Features
# BuildKit syntax
# syntax=docker/dockerfile:1
# Secrets
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc npm install
# SSH forwarding (for private git repos)
RUN --mount=type=ssh git clone git@github.com:org/private-repo.git
# Cache mount (for package managers)
RUN --mount=type=cache,target=/root/.m2 mvn package
# TMPFS mount
RUN --mount=type=tmpfs,target=/tmp make
ONBUILD: Instruction for Parent Images
# In base image
ONBUILD COPY . /app
ONBUILD RUN mvn package
Used for creating parent images that automate steps for descendants. Dangerous in multi-stage: ONBUILD doesn’t execute for subsequent stages.
LABEL: Metadata for CI/CD
LABEL maintainer="team@example.com"
LABEL version="1.0.0"
LABEL org.opencontainers.image.source="https://github.com/org/repo"
LABEL org.opencontainers.image.revision="abc123"
Useful for tracking images in registry, automation, compliance.
Performance
| Optimization | Impact |
|---|---|
| Separate pom.xml and src | Cache hit rate > 90% |
| Combine RUN commands | 20-40% fewer layers |
| Alpine instead of full | 60-80% smaller size |
| Multi-stage build | 50-70% smaller final image |
| BuildKit cache mount | 50% faster repeated builds |
Summary
- Each
RUN,COPY,ADDcommand creates a new layer — delete temporary files in the same layer. - Use Exec form for
CMDandENTRYPOINT— critical for signal handling. - ARG is visible in
docker history— don’t pass secrets. HEALTHCHECKis a required element of production images.- BuildKit (
--mount=type=secret/cache/ssh) — modern build standard. - Layer optimization = registry space savings + faster deployment.
Interview Cheat Sheet
Must know:
- RUN executes during build, CMD/ENTRYPOINT — at container launch
- Each RUN/COPY/ADD creates a new layer; deleting in a separate RUN doesn’t remove the file from the image
- Exec form
["cmd", "arg"]is critical for signal handling (graceful shutdown) - ARG is visible in
docker history— don’t pass secrets; use BuildKit secrets - ENV is available at runtime, ARG — only during build
- HEALTHCHECK is required for production images
- BuildKit:
--mount=type=secret/cache/ssh— modern build standard
Frequent follow-up questions:
- “Why is shell form bad?” — Runs through
/bin/sh -c, OS signals don’t reach the application (PID 1 problem) - “Why combine RUN commands with &&?” — Each instruction = layer; combining reduces layer count
- “What does ONBUILD do?” — Instructions execute when the image is used as a base (for parent images)
- “Does EXPOSE open a port?” — No, only documents; actual mapping via
-pat launch
Red flags (DO NOT say):
- “EXPOSE makes the port accessible from outside” (only documents, need
-p) - “I pass passwords through ARG” (visible in
docker history) - “I delete files in a separate RUN layer” (files remain in lower layer)
- “I use shell form for ENTRYPOINT” (PID 1 problem, signals are lost)
Related topics:
- [[What is Dockerfile]] — Dockerfile basics
- [[What is the difference between CMD and ENTRYPOINT]] — details about launching
- [[What is multi-stage build]] — image optimization