What are the main instructions used in a Dockerfile?

Junior Level

Main Instructions

A Dockerfile is a sequence of instructions. RUN, COPY, and ADD add filesystem layers to the image; the other instructions only change image metadata. Here are the main ones:

Environment Definition Instructions

Instruction	What it does	Example
`FROM`	Sets the base image	`FROM openjdk:17-slim`
`WORKDIR`	Sets the working directory	`WORKDIR /app`
`ENV`	Sets environment variables	`ENV APP_PORT=8080`

File Operation Instructions

Instruction	What it does
`COPY`	Copies files from host into the image
`ADD`	Like COPY, but can unpack archives

Command Execution Instructions

Instruction	What it does
`RUN`	Executes a command during image build
`CMD`	Default command when container starts
`ENTRYPOINT`	Main container launch command

Configuration Instructions

Instruction	What it does
`EXPOSE`	Documents the application port
`USER`	Specifies the user for running

Simple Example

# Base image
FROM openjdk:17-jdk-slim
# Working directory
WORKDIR /app
# Copy the jar file
COPY myapp.jar app.jar
# Port (documentation only)
EXPOSE 8080
# Launch command
ENTRYPOINT ["java", "-jar", "app.jar"]

COPY vs ADD

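COPY is the preferred option in almost all cases: it only copies files from the build context.
ADD is for the cases where you need automatic extraction of a local tar archive or a download from a URL (for URLs, RUN curl is usually better, because you can verify a checksum and clean up in the same layer).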
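
A minimal sketch contrasting the three options. The file names, the URL, and the checksum placeholder are hypothetical; the build context is assumed to contain config.yml and vendor.tar.gz.

```dockerfile
FROM debian:bookworm-slim
WORKDIR /opt/app

# COPY: plain copy from the build context, no side effects
COPY config.yml ./config.yml

# ADD: a local tar archive is unpacked into the destination directory
ADD vendor.tar.gz ./vendor/

# RUN curl: download, verify the checksum, and clean up in one layer
# (hypothetical URL; replace <sha256> with the real checksum of the file)
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl ca-certificates && \
    curl -fsSL -o tool.tar.gz https://example.com/tool.tar.gz && \
    echo "<sha256>  tool.tar.gz" | sha256sum -c - && \
    tar -xzf tool.tar.gz && \
    rm tool.tar.gz && \
    rm -rf /var/lib/apt/lists/*
```

With the placeholder left in, the checksum step fails the build. That is the point of the pattern: an unexpected download never reaches the image.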

Exec form ["cmd", "arg"]: the command runs directly, without a shell.
Shell form cmd arg: the command runs through the shell as /bin/sh -c.

ENV: the variable is available both during the build and in the running container.
ARG: the variable is available only during the build.

What to Remember

FROM must be the first instruction (only ARG, comments, and parser directives may come before it)
RUN — executes during build, CMD/ENTRYPOINT — at launch
COPY is the default choice. Use ADD only for auto-extraction of local tar archives or a URL download.
EXPOSE doesn’t open the port, only documents it
Use WORKDIR instead of RUN cd ...

Middle Level

Instruction Classification

1. Environment Definition Instructions

FROM sets the base image. Every Dockerfile starts with it (only ARG, comments, and parser directives may precede it). Best practice: pin a concrete version (openjdk:17-slim), not latest.

ENV — sets environment variables. Available both during build and in the running container.

ENV JAVA_OPTS="-Xmx512m"
ENV APP_ENV=production

ARG — defines variables available only during the build process.

ARG APP_VERSION=1.0.0
RUN echo "Building version $APP_VERSION"
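
A short sketch of one ARG scoping subtlety: an ARG declared before FROM can be used only in FROM lines, and it has to be declared again inside the stage to be visible there. The argument name JAVA_VERSION is chosen for illustration.

```dockerfile
# Declared before FROM: usable only in FROM lines
ARG JAVA_VERSION=17
FROM openjdk:${JAVA_VERSION}-slim

# Declare it again (without a value) to bring the default into this stage
ARG JAVA_VERSION
RUN echo "Base Java version: ${JAVA_VERSION}"

# Override at build time:
#   docker build --build-arg JAVA_VERSION=21 .
```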

WORKDIR — sets the working directory. All subsequent commands execute relative to it.

# Preferred over RUN cd /app, which affects only that single RUN step
WORKDIR /app

2. File Operation Instructions

COPY — copies files from host into the image.

COPY pom.xml /app/
COPY src/ /app/src/

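ADD is an extended version of COPY. It can unpack local tar archives (for example .tar.gz) into the destination and download files from URLs. Archives downloaded from a URL are not unpacked.

# The local archive is unpacked into /app/ automatically
ADD app.tar.gz /app/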

3. Command Execution Instructions

RUN — executes a command during build and records the result in a new layer.

# Combine commands to reduce layers
RUN apt-get update && \
    apt-get install -y git curl && \
    rm -rf /var/lib/apt/lists/*
# Deletion in the same RUN is critical: if you delete in a separate RUN,
# the files remain in the lower layer and still count toward the image size.

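CMD sets the default command for container start or, when an exec-form ENTRYPOINT is defined, the default arguments passed to it. It is easy to override: any arguments after the image name in docker run replace it.

# Default arguments; they only make sense together with an exec-form ENTRYPOINT
CMD ["--server.port=8080"]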

ENTRYPOINT defines the main launch command. Overriding it requires the --entrypoint flag of docker run.

ENTRYPOINT ["java", "-jar", "/app.jar"]

4. Access Configuration Instructions

EXPOSE — documents the port. Does not actually publish it (you need -p at launch).
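
A minimal sketch of the difference between documenting and publishing a port. The image name myapp and the jar name are hypothetical.

```dockerfile
FROM openjdk:17-slim
COPY app.jar /app.jar
# Documents the port only; nothing is published by this line
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "/app.jar"]

# Publishing happens at run time:
#   docker run -p 8080:8080 myapp   (host port 8080 -> container port 8080)
#   docker run -P myapp             (publish every EXPOSEd port on a random host port)
```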

USER sets the user (and optionally the group) for the following RUN, CMD, and ENTRYPOINT instructions and for the container process.

RUN useradd -r appuser
USER appuser
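
A slightly fuller sketch of the same idea, assuming a Debian-based image and a jar named app.jar in the build context. The user and group names are arbitrary.

```dockerfile
FROM openjdk:17-slim
# Create a system group and user without a login shell
RUN groupadd -r appuser && \
    useradd -r -g appuser -s /usr/sbin/nologin appuser
WORKDIR /app
# Give the unprivileged user ownership of the files it needs
COPY --chown=appuser:appuser app.jar ./app.jar
# Everything from here on, including the container process, runs as appuser
USER appuser
ENTRYPOINT ["java", "-jar", "app.jar"]
```

Placing USER after the steps that need root (package installation, user creation) keeps the build working while the running container stays unprivileged.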

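VOLUME declares a mount point. At run time Docker mounts a volume there (an anonymous one unless you specify a named volume or a bind mount), so data written to that path lives outside the container's writable layer.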

VOLUME ["/data"]

Typical Mistakes

Mistake	Consequence	How to avoid
`RUN apt-get update` in a separate layer	Cache goes stale, packages not found	Combine `update && install` in one RUN
Using shell form `CMD java -jar`	Signals don’t reach the application	Use exec form `["java", "-jar"]`
Passing secrets via ARG	Passwords visible in `docker history`	Use BuildKit secrets (`--mount=type=secret`)
Deleting files in a separate RUN	Files remain in lower layers	Delete in the same RUN where you created them
Multiple CMD/ENTRYPOINT	Only the last one counts	One CMD, one ENTRYPOINT per file

CMD vs ENTRYPOINT Comparison

Instruction	Can override?	Main purpose
ENTRYPOINT	Only with the `--entrypoint` flag	Fixed command
CMD	Yes, with arguments after the image name	Default command or default parameters
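
The interaction of the two instructions can be sketched as follows. The image name myapp and the jar name are hypothetical.

```dockerfile
FROM openjdk:17-slim
COPY app.jar /app.jar
# Fixed part of the command
ENTRYPOINT ["java", "-jar", "/app.jar"]
# Default arguments, appended to the ENTRYPOINT
CMD ["--server.port=8080"]

# Resulting commands:
#   docker run myapp
#     -> java -jar /app.jar --server.port=8080
#   docker run myapp --server.port=9090
#     -> java -jar /app.jar --server.port=9090
#   docker run -it --entrypoint sh myapp
#     -> sh replaces java as the entrypoint
```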

Best Practices

Combine RUN commands via && to reduce layers
Clean package cache in the same RUN layer
Use WORKDIR instead of RUN cd chains
Use multi-stage builds to separate the build environment from the runtime image
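
A minimal multi-stage sketch for a Maven project. The image tags, the /build path, and the artifact name target/myapp.jar are assumptions; adjust them to the actual project.

```dockerfile
# Stage 1: build with Maven and a full JDK
FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /build
COPY pom.xml .
COPY src/ ./src/
RUN mvn -q package -DskipTests

# Stage 2: runtime image with only the JRE and the jar
FROM eclipse-temurin:17-jre
WORKDIR /app
# Only the artifact is copied; Maven, sources, and the local repository stay behind
COPY --from=build /build/target/myapp.jar ./app.jar
ENTRYPOINT ["java", "-jar", "app.jar"]
```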

What to Remember

Each RUN, COPY, ADD creates a new layer
Instruction order affects caching
Use exec form ["cmd", "arg"] instead of shell form for CMD and ENTRYPOINT
Clean cache in the same layer as installation
Combining ENTRYPOINT (fixed command) with CMD (default arguments) is a common and recommended pattern
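
How instruction order affects caching can be sketched for a Maven build. The image tag is an assumption, and dependency:go-offline is only one way to pre-fetch dependencies.

```dockerfile
FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /build

# Changes rarely: this layer and the dependency download stay cached
COPY pom.xml .
RUN mvn -q dependency:go-offline

# Changes often: only the layers from here down are rebuilt
COPY src/ ./src/
RUN mvn -q package -DskipTests
```

If COPY src/ came before the dependency step, every source change would invalidate the cache for the download as well.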

Senior Level

Instruction Architecture and Image Impact

Understanding instruction nuances is critical for creating secure, compact, and fast-to-build images.

Deep Analysis: Layered Model

Each RUN, COPY, ADD instruction creates a new layer. Layers are read-only filesystem diffs stacked by a union filesystem (typically OverlayFS via the overlay2 storage driver on Linux).

Critical consequence: deleting a file in a later layer does not remove it from the image. The new layer only records a "whiteout" entry that hides the file, and the file still occupies space in every layer where it was added or modified.

# BAD: file remains in lower layer
RUN apt-get update && apt-get install -y package
RUN rm -rf /var/lib/apt/lists/*

# GOOD: one layer, cache cleaned immediately
RUN apt-get update && \
    apt-get install -y package && \
    rm -rf /var/lib/apt/lists/*

Trade-offs

Decision	Plus	Minus
Shell form	Convenience (pipes, variables)	The shell becomes PID 1; signals do not reach the application
Exec form	Proper signal handling	No shell functionality
ARG for config	Simple	Visible in `docker history`, not runtime
ENV for config	Available at runtime	Visible in `docker inspect`
ADD for URL	No RUN curl/wget needed	Unpredictable cache, no retry
RUN curl/wget	Control, retry, checksum	Additional layer

ARG vs ENV: Subtleties

Characteristic	ARG	ENV
Available during build	Yes	Yes
Available in container	No	Yes
Visible in `docker inspect`	No	Yes
Visible in `docker history`	Yes (value!)	Yes

Security warning: ARG values are visible in docker history. Don’t pass secrets through ARG!

# BAD: password visible in docker history
ARG DB_PASSWORD=secret123

# GOOD: BuildKit secrets
RUN --mount=type=secret,id=db_pass cat /run/secrets/db_pass
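
A slightly fuller sketch of the secret mount, including how the secret is supplied at build time. The secret id db_pass matches the snippet above; the source file path is hypothetical.

```dockerfile
# syntax=docker/dockerfile:1
FROM debian:bookworm-slim
# The secret is mounted only for this RUN step and is not stored in a layer.
# It is read into a shell variable here instead of being printed to the build log.
RUN --mount=type=secret,id=db_pass \
    DB_PASSWORD="$(cat /run/secrets/db_pass)" && \
    test -n "$DB_PASSWORD"

# Build command:
#   docker build --secret id=db_pass,src=./db_pass.txt .
```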

Shell form vs Exec form: Critical Nuance

Exec form (recommended):

ENTRYPOINT ["java", "-jar", "/app.jar"]

Runs directly as the process with PID 1 and receives signals such as SIGTERM and SIGINT from docker stop or the orchestrator (SIGKILL cannot be caught by any process). Critical for graceful shutdown in Kubernetes.

Shell form:

ENTRYPOINT java -jar /app.jar

Runs as a child process of /bin/sh -c. Signals are delivered to the shell, which usually does not forward them, so the application never sees SIGTERM. After the stop timeout it is killed with SIGKILL and may not finish in-flight transactions.

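PID 1 problem: in Linux, the kernel treats the process with PID 1 specially. Signals whose default action would terminate the process, such as SIGTERM, are ignored unless the process has installed a handler for them. PID 1 is also responsible for reaping orphaned child processes. Solutions: exec form with an application that handles SIGTERM, tini, or docker run --init.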
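
Two common ways to apply this, sketched under the assumption of a Debian-based image and a jar at /app.jar. The JAVA_OPTS value is illustrative.

```dockerfile
FROM openjdk:17-slim
COPY app.jar /app.jar
ENV JAVA_OPTS="-Xmx512m"

# Option 1: shell form with exec. The shell expands $JAVA_OPTS, then exec
# replaces the shell with java, so java runs as PID 1 and receives SIGTERM.
ENTRYPOINT exec java $JAVA_OPTS -jar /app.jar

# Option 2 (alternative, commented out): tini as a minimal init process that
# forwards signals to java and reaps zombie processes.
# RUN apt-get update && \
#     apt-get install -y --no-install-recommends tini && \
#     rm -rf /var/lib/apt/lists/*
# ENTRYPOINT ["/usr/bin/tini", "--", "java", "-jar", "/app.jar"]
```

Option 1 keeps variable expansion, which pure exec form lacks. Note that a shell-form ENTRYPOINT ignores CMD and docker run arguments.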

Edge Cases

ONBUILD in multi-stage: ONBUILD triggers execute when the image is used as a base in a FROM instruction. In a multi-stage build they run in whichever stage names that image, which can lead to unexpected side effects.
Glob patterns in COPY: COPY target/*.jar fails the build if no file matches. If several files match, all of them are copied, and the destination must be a directory path ending with /.
Symbolic links: COPY follows symlinks on the host. This may include unexpected files.
Timestamps: COPY preserves file mtime. This affects build reproducibility. BuildKit supports the SOURCE_DATE_EPOCH build argument for reproducible builds; the --metadata-file flag only records build results such as the image digest.
ENV and escaping: ENV FOO=bar\ baz and ENV FOO="bar baz" both set the value bar baz. The quotes delimit the value and are not part of it.
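
One way to check how ENV parses quotes and backslashes is to assert the resulting values during the build. The base image is an arbitrary choice.

```dockerfile
FROM debian:bookworm-slim
# Quoted form and backslash-escaped form of a value containing a space
ENV FOO="bar baz"
ENV BAR=bar\ baz
# The build fails at this step if either value differs from: bar baz
RUN test "$FOO" = "bar baz" && test "$BAR" = "bar baz"
```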

HEALTHCHECK: Production Obligation

HEALTHCHECK --interval=30s --timeout=3s --retries=3 --start-period=60s \
  CMD curl -f http://localhost:8080/actuator/health || exit 1

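Without a health check, the runtime knows only whether the process is running, not whether the application works: the container may be Running while the application inside is dead (deadlock, out of memory). Docker and Docker Swarm use HEALTHCHECK; Kubernetes ignores it and relies on its own liveness and readiness probes. The check command (curl here) must be present in the image.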

BuildKit: Advanced Features

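# syntax=docker/dockerfile:1
# The syntax directive above must be the very first line of the Dockerfile,
# before any other comment or instruction. The RUN lines below are independent
# fragments; each belongs after a FROM instruction.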

# Secrets
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc npm install

# SSH forwarding (for private git repos)
RUN --mount=type=ssh git clone git@github.com:org/private-repo.git

# Cache mount (for package managers)
RUN --mount=type=cache,target=/root/.m2 mvn package

# TMPFS mount
RUN --mount=type=tmpfs,target=/tmp make

ONBUILD: Instruction for Parent Images

# In base image
ONBUILD COPY . /app
ONBUILD RUN mvn package

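Used for creating parent images that automate steps for their descendants. The triggers run once, immediately after the FROM instruction of the child build, and are not inherited by images built from the child. Be careful in multi-stage builds: the triggers run only in the stage whose FROM names the parent image, not in later stages.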
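
A sketch of a child Dockerfile that consumes such a parent image. The image name my-maven-base:1.0 and the artifact path are hypothetical; the parent is assumed to provide Maven and a JDK.

```dockerfile
# Child Dockerfile built on top of the parent image with ONBUILD triggers
FROM my-maven-base:1.0
# The parent's triggers run right here, as if these lines were written
# immediately after FROM:
#   COPY . /app
#   RUN mvn package
ENTRYPOINT ["java", "-jar", "/app/target/myapp.jar"]
```

Nothing in the child Dockerfile shows that the copy and the build happen, which is why ONBUILD behavior is easy to overlook.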

LABEL: Metadata for CI/CD

LABEL maintainer="team@example.com"
LABEL version="1.0.0"
LABEL org.opencontainers.image.source="https://github.com/org/repo"
LABEL org.opencontainers.image.revision="abc123"

Useful for tracking images in registry, automation, compliance.
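
In CI, label values are usually injected at build time rather than hard-coded. A minimal sketch, with the build argument names GIT_SHA and APP_VERSION chosen for illustration:

```dockerfile
FROM openjdk:17-slim
# Values supplied by the CI pipeline, for example:
#   docker build --build-arg GIT_SHA="$(git rev-parse HEAD)" \
#                --build-arg APP_VERSION=1.0.0 .
ARG GIT_SHA=unknown
ARG APP_VERSION=dev
LABEL org.opencontainers.image.revision="${GIT_SHA}" \
      org.opencontainers.image.version="${APP_VERSION}"
```

The labels can then be read back with docker inspect, for example with the format string {{ json .Config.Labels }}.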

Performance

Optimization	Typical impact (approximate, depends on the project)
Separate pom.xml and src	Cache hit rate > 90%
Combine RUN commands	20-40% fewer layers
Alpine instead of full	60-80% smaller size
Multi-stage build	50-70% smaller final image
BuildKit cache mount	50% faster repeated builds

Summary

Each RUN, COPY, ADD command creates a new layer — delete temporary files in the same layer.
Use Exec form for CMD and ENTRYPOINT — critical for signal handling.
ARG is visible in docker history — don’t pass secrets.
A health check is needed for production images: HEALTHCHECK for Docker and Swarm, liveness and readiness probes for Kubernetes.
BuildKit (--mount=type=secret/cache/ssh) — modern build standard.
Layer optimization = registry space savings + faster deployment.

Interview Cheat Sheet

Must know:

RUN executes during build, CMD/ENTRYPOINT — at container launch
Each RUN/COPY/ADD creates a new layer; deleting in a separate RUN doesn’t remove the file from the image
Exec form ["cmd", "arg"] is critical for signal handling (graceful shutdown)
ARG is visible in docker history — don’t pass secrets; use BuildKit secrets
ENV is available at runtime, ARG — only during build
A health check is needed for production images (HEALTHCHECK in Docker and Swarm; Kubernetes uses its own probes)
BuildKit: --mount=type=secret/cache/ssh — modern build standard

Frequent follow-up questions:

“Why is shell form bad?” — Runs through /bin/sh -c, OS signals don’t reach the application (PID 1 problem)
“Why combine RUN commands with &&?” — Each instruction = layer; combining reduces layer count
“What does ONBUILD do?” — Instructions execute when the image is used as a base (for parent images)
“Does EXPOSE open a port?” — No, only documents; actual mapping via -p at launch

Red flags (DO NOT say):

“EXPOSE makes the port accessible from outside” (only documents, need -p)
“I pass passwords through ARG” (visible in docker history)
“I delete files in a separate RUN layer” (files remain in lower layer)
“I use shell form for ENTRYPOINT” (PID 1 problem, signals are lost)

Related topics:

[[What is Dockerfile]] — Dockerfile basics
[[What is the difference between CMD and ENTRYPOINT]] — details about launching
[[What is multi-stage build]] — image optimization