Docker Multi-Stage Builds

Docker multi-stage builds let you produce small, production-ready container images by separating the build environment from the runtime environment. One stage compiles or bundles the application, and a later stage copies only the resulting artifacts, which avoids the common problem of bloated images.

Why Multi-Stage Builds?

When building applications, you often need tools and dependencies at build time that aren’t required in production. Without multi-stage builds, all of them end up in your final image, making it larger and potentially less secure.

Problems with Single-Stage Builds

# Traditional single-stage Dockerfile (bad practice)
FROM node:18-alpine

WORKDIR /app

# Copy package files
COPY package*.json ./
RUN npm install

# Copy source code
COPY . .

# Build the application
RUN npm run build

# Start the application
CMD ["npm", "start"]

Problems:

  • Build tools remain in the final image
  • Source code is exposed in production
  • Larger attack surface
  • Increased image size (hundreds of MB)

Basic Multi-Stage Build

Let’s convert the above to a multi-stage build:

# Stage 1: Build stage
FROM node:18-alpine AS builder

WORKDIR /app

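# Copy package files and install all dependencies
# (the build step typically needs devDependencies)
COPY package*.json ./
RUN npm ci

# Copy source code and build
COPY . .
RUN npm run build

# Drop devDependencies so only runtime packages reach the next stage
RUN npm prune --omit=dev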

# Stage 2: Production stage
FROM node:18-alpine AS production

WORKDIR /app

# Copy only the built application and dependencies
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./package.json

# Create non-root user and add it to its own group
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nodejs -u 1001 -G nodejs
USER nodejs

# Expose port and start application
EXPOSE 3000
CMD ["node", "dist/index.js"]

Benefits:

  • Smaller final image
  • No build tools in production
  • Source code not included
  • Better security

Real-World Examples

1. React Application

# Build stage
FROM node:18-alpine AS builder

WORKDIR /app

# Copy package files and install dependencies
COPY package*.json ./
RUN npm ci

# Copy source and build
COPY . .
RUN npm run build

# Production stage with nginx
FROM nginx:alpine AS production

# Copy built React app to nginx
COPY --from=builder /app/build /usr/share/nginx/html

# Copy nginx configuration
COPY nginx.conf /etc/nginx/nginx.conf

EXPOSE 80

CMD ["nginx", "-g", "daemon off;"]

Nginx configuration (nginx.conf):

events {
    worker_connections 1024;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;
    
    sendfile        on;
    keepalive_timeout  65;
    
    server {
        listen 80;
        server_name localhost;
        root /usr/share/nginx/html;
        index index.html;
        
        location / {
            try_files $uri $uri/ /index.html;
        }
        
        # Cache static assets
        location ~* \.(js|css|png|jpg|jpeg|gif|ico|svg)$ {
            expires 1y;
            add_header Cache-Control "public, immutable";
        }
        
        # Security headers
        add_header X-Frame-Options "SAMEORIGIN" always;
        add_header X-XSS-Protection "1; mode=block" always;
        add_header X-Content-Type-Options "nosniff" always;
    }
}

2. Go Application

# Build stage
FROM golang:1.21-alpine AS builder

WORKDIR /app

# Install git (required for some Go modules)
RUN apk add --no-cache git

# Copy go mod files and download dependencies
COPY go.mod go.sum ./
RUN go mod download

# Copy source code
COPY . .

# Build the application
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o main .

# Production stage
FROM scratch AS production

# Copy ca-certificates for HTTPS requests
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/

# Copy the binary
COPY --from=builder /app/main /main

# Expose port
EXPOSE 8080

# Run the binary
CMD ["/main"]

3. Python Application with Poetry

# Build stage
FROM python:3.11-slim AS builder

WORKDIR /app

# Install Poetry
RUN pip install poetry

# Copy poetry files
COPY pyproject.toml poetry.lock ./

# Configure Poetry
RUN poetry config virtualenvs.create false

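# Install runtime dependencies only. Poetry 1.2+ replaces --no-dev with --only main;
# --no-root skips installing the project itself, which has not been copied yet.
RUN poetry install --only main --no-root --no-interaction --no-ansi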

# Production stage
FROM python:3.11-slim AS production

WORKDIR /app

# Copy installed dependencies
COPY --from=builder /usr/local/lib/python3.11/site-packages /usr/local/lib/python3.11/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin

# Copy application code
COPY . .

# Create non-root user
RUN useradd --create-home --shell /bin/bash app
USER app

# Expose port and start application
EXPOSE 8000
CMD ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

Advanced Multi-Stage Patterns

1. With Development Stage

# Base stage
FROM node:18-alpine AS base
WORKDIR /app
COPY package*.json ./

# Development stage
FROM base AS development
RUN npm ci
COPY . .
EXPOSE 3000
CMD ["npm", "run", "dev"]

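# Build stage
FROM base AS builder
# Install all dependencies: the build step typically needs devDependencies
RUN npm ci
COPY . .
RUN npm run build
# Remove devDependencies before node_modules is copied to production
RUN npm prune --omit=dev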

# Production stage
FROM node:18-alpine AS production
WORKDIR /app

COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./package.json

EXPOSE 3000
CMD ["node", "dist/index.js"]

Build commands:

# Development build
docker build --target development -t myapp:dev .

# Production build
docker build --target production -t myapp:latest .

# Builder stage (useful for testing)
docker build --target builder -t myapp:builder .
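
If these commands are run from scripts, a small wrapper can keep the stage names in one place. The following is a minimal sketch, assuming the stage names from the Dockerfile above; the function name `build_command` and the short environment names (`dev`, `prod`) are hypothetical and only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: map a short environment name to a stage name
# from the Dockerfile above and print the matching build command.
build_command() {
  env_name="$1"
  image="${2:-myapp}"   # image name defaults to the one used in the tutorial
  case "$env_name" in
    dev)     target=development; tag=dev ;;
    prod)    target=production;  tag=latest ;;
    builder) target=builder;     tag=builder ;;
    *) echo "unknown environment: $env_name" >&2; return 1 ;;
  esac
  echo "docker build --target $target -t $image:$tag ."
}

build_command dev
build_command prod
```

The helper only prints the command, so it can be reviewed before being executed, for example with `eval "$(build_command dev)"`.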

2. With Test Stage

# Dependencies stage
FROM node:18-alpine AS dependencies
WORKDIR /app
COPY package*.json ./
RUN npm ci

# Test stage
FROM dependencies AS test
COPY . .
RUN npm run test
RUN npm run lint

# Build stage
FROM dependencies AS builder
COPY . .
RUN npm run build

# Production stage
FROM node:18-alpine AS production
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=dependencies /app/node_modules ./node_modules
COPY package*.json ./

EXPOSE 3000
CMD ["node", "dist/index.js"]
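
One caveat with this layout: BuildKit only builds the stages that the requested target depends on, and the production stage above does not depend on the test stage. Building `--target production` alone would therefore skip the tests. A minimal sketch of building both targets in order follows; the function name `build_with_tests` and the `DOCKER` override variable are assumptions made for illustration.

```shell
#!/bin/sh
# DOCKER can be overridden (for example with "echo") to dry-run the sequence.
DOCKER="${DOCKER:-docker}"

# Build the test stage first; build production only if the tests pass.
build_with_tests() {
  image="$1"
  "$DOCKER" build --target test -t "$image:test" . &&
    "$DOCKER" build --target production -t "$image:latest" .
}
```

Calling `build_with_tests myapp` stops after the first build if `npm run test` or `npm run lint` fails, because a failing RUN instruction makes `docker build` exit with a non-zero status.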

3. Multi-Architecture Builds

# Build stage
FROM --platform=$BUILDPLATFORM golang:1.21-alpine AS builder
# TARGETOS and TARGETARCH are set automatically by BuildKit (e.g. linux, arm64)
ARG TARGETOS
ARG TARGETARCH

WORKDIR /app

# Install build dependencies
RUN apk add --no-cache git

# Copy go files
COPY go.mod go.sum ./
RUN go mod download

# Copy source and cross-compile for the target platform
COPY . .
RUN CGO_ENABLED=0 GOOS=$TARGETOS GOARCH=$TARGETARCH go build -o main .

# Production stage
FROM --platform=$TARGETPLATFORM scratch AS production

# Copy binary and certificates
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /app/main /main

EXPOSE 8080
CMD ["/main"]

Build for multiple architectures:

# Build for AMD64, ARM64, and ARMv7
docker buildx build --platform linux/amd64,linux/arm64,linux/arm/v7 -t myapp:latest .

4. With Security Scanning

# Build stage
FROM node:18-alpine AS builder
WORKDIR /app

COPY package*.json ./
RUN npm ci

COPY . .
RUN npm run build

# Security scan stage
FROM node:18-alpine AS security
WORKDIR /app
RUN npm install -g audit-ci
COPY --from=builder /app/package*.json ./
RUN audit-ci --moderate

# Production stage
FROM node:18-alpine AS production
WORKDIR /app

COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./package.json

RUN addgroup -g 1001 -S nodejs && \
    adduser -S nodejs -u 1001 -G nodejs
USER nodejs

EXPOSE 3000
CMD ["node", "dist/index.js"]

Optimization Techniques

1. Layer Caching Strategy

# Good: Copy dependencies first
FROM node:18-alpine AS builder
WORKDIR /app

# Copy package files first (changes less frequently)
COPY package*.json ./
RUN npm ci

# Copy source code (changes frequently), build, then drop devDependencies
COPY . .
RUN npm run build && npm prune --omit=dev

# Production stage
FROM node:18-alpine AS production
WORKDIR /app

# Copy built application
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./package.json

EXPOSE 3000
CMD ["node", "dist/index.js"]

2. Using .dockerignore

Create a .dockerignore file to exclude unnecessary files:

# Version control
.git
.gitignore

# Dependencies
node_modules

# Build artifacts
dist
build
*.log

# Development files
.env
.env.local
.env.development.local

# Testing
coverage
.nyc_output

# IDE
.vscode
.idea

# OS
.DS_Store
Thumbs.db

# Temporary files
*.tmp
*.temp
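
A quick sanity check can catch a .dockerignore that is missing important entries. The following is a minimal sketch under simple assumptions: it only looks for exact line matches and does not implement Docker's pattern-matching rules. The function name `missing_ignore_entries` is hypothetical.

```shell
#!/bin/sh
# Print each expected entry that does not appear as an exact line in the file.
missing_ignore_entries() {
  file="$1"
  shift
  for entry in "$@"; do
    # -x: match the whole line, -F: treat the entry as a fixed string
    grep -qxF -- "$entry" "$file" 2>/dev/null || echo "$entry"
  done
}

# Example: check the .dockerignore in the current directory, if present.
if [ -f .dockerignore ]; then
  missing_ignore_entries .dockerignore .git node_modules .env
fi
```

Any entry printed by the check is absent from the file and would be sent to the Docker daemon as part of the build context.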

3. Minimal Base Images

# The FROM lines below are alternatives; pick one final stage per Dockerfile

# For statically linked binaries, use scratch or distroless
FROM scratch AS production
# or
FROM gcr.io/distroless/static-debian11 AS production

# For interpreted languages, use alpine
FROM python:3.11-alpine AS production
# or
FROM node:18-alpine AS production

CI/CD Integration

GitHub Actions Example

name: Build and Deploy

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v3
    
    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v2
    
    - name: Test
      run: |
        docker build --target test -t myapp:test .
        docker run --rm myapp:test        
  
  build:
    needs: test
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v2
    
    - name: Login to Docker Hub
      uses: docker/login-action@v2
      with:
        username: ${{ secrets.DOCKER_USERNAME }}
        password: ${{ secrets.DOCKER_PASSWORD }}
    
    - name: Build and push
      uses: docker/build-push-action@v4
      with:
        context: .
        target: production
        push: true
        tags: |
          username/myapp:latest
          username/myapp:${{ github.sha }}          
        cache-from: type=gha
        cache-to: type=gha,mode=max

GitLab CI Example

stages:
  - test
  - build
  - deploy

test:
  stage: test
  image: docker:latest
  services:
    - docker:dind
  script:
    - docker build --target test -t myapp:test .
    - docker run --rm myapp:test

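build:
  stage: build
  image: docker:latest
  services:
    - docker:dind
  script:
    # Assumes GitLab's built-in container registry (predefined CI_REGISTRY_* variables)
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - docker build --target production -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
  only:
    - main

deploy:
  stage: deploy
  image: docker:latest
  services:
    - docker:dind
  script:
    # Each job starts with a fresh Docker daemon, so pull the image pushed by the build job
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - docker pull "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
    - docker tag "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA" "$CI_REGISTRY_IMAGE:latest"
    - docker push "$CI_REGISTRY_IMAGE:latest"
  only:
    - main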

Best Practices

1. Security

# Use non-root user
FROM node:18-alpine AS production
WORKDIR /app

# Create group and user, with the user in that group
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nodejs -u 1001 -G nodejs

# Set correct ownership
COPY --from=builder --chown=nodejs:nodejs /app/dist ./dist
COPY --from=builder --chown=nodejs:nodejs /app/node_modules ./node_modules

USER nodejs
CMD ["node", "dist/index.js"]

2. Health Checks

FROM node:18-alpine AS production
WORKDIR /app

COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package.json ./package.json

# Health check (node:18-alpine ships BusyBox wget but not curl)
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD wget -q --spider http://localhost:3000/health || exit 1

EXPOSE 3000
CMD ["node", "dist/index.js"]
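
Once a container defines a HEALTHCHECK, Docker records its status, and `docker inspect` can read it back. A deployment script can use that to wait for a container to become healthy. The following is a minimal sketch; the function names `wait_until` and `container_is_healthy` are hypothetical.

```shell
#!/bin/sh
# Retry a command up to $1 times, sleeping $2 seconds after each failure.
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Succeeds when Docker reports the named container as healthy.
container_is_healthy() {
  [ "$(docker inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null)" = "healthy" ]
}
```

For a container named `myapp` (an assumed name), `wait_until 10 3 container_is_healthy myapp` polls up to ten times, three seconds apart, and returns non-zero if the container never reports healthy.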

3. Labels and Metadata

FROM node:18-alpine AS production

# Add labels
LABEL maintainer="[email protected]"
LABEL version="1.0.0"
LABEL description="My awesome Node.js application"

# Use build args for dynamic labels
ARG BUILD_DATE
ARG VCS_REF
LABEL org.label-schema.build-date=$BUILD_DATE
LABEL org.label-schema.vcs-ref=$VCS_REF
LABEL org.label-schema.schema-version="1.0"
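
The `BUILD_DATE` and `VCS_REF` arguments stay empty unless they are passed at build time. A minimal sketch of producing the matching `--build-arg` flags follows; the function names `label_build_args` and `now_utc` are hypothetical.

```shell
#!/bin/sh
# Print the --build-arg flags for the BUILD_DATE and VCS_REF args declared above.
label_build_args() {
  build_date="$1"
  vcs_ref="$2"
  echo "--build-arg BUILD_DATE=$build_date --build-arg VCS_REF=$vcs_ref"
}

# Current UTC time in ISO 8601 form, e.g. 2024-01-31T12:00:00Z
now_utc() {
  date -u +%Y-%m-%dT%H:%M:%SZ
}
```

Inside a Git checkout, this could be used as `docker build $(label_build_args "$(now_utc)" "$(git rev-parse --short HEAD)") -t myapp:latest .`, leaving the command substitution unquoted so that the flags are split into separate arguments.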

Troubleshooting Common Issues

1. Permission Issues

# Problem: Permission denied when copying files
COPY --from=builder /app/dist ./dist

# Solution: Set correct permissions
COPY --from=builder --chown=1001:1001 /app/dist ./dist

2. Missing Dependencies

# Problem: Runtime dependencies missing in production
COPY --from=builder /app/node_modules ./node_modules

# Solution: Make sure the stage you copy from has the runtime dependencies installed
# (--omit=dev is the current spelling of --only=production)
RUN npm ci --omit=dev

3. Environment Variables

# Problem: Build-time vs runtime environment
FROM node:18-alpine AS builder
ARG NODE_ENV
ENV NODE_ENV=$NODE_ENV
RUN npm ci

FROM node:18-alpine AS production
ENV NODE_ENV=production
# Production-specific setup

Performance Comparison

Single-Stage vs Multi-Stage

# Single-stage image
docker images single-stage-app
# REPOSITORY          SIZE
# single-stage-app    845MB

# Multi-stage image
docker images multi-stage-app  
# REPOSITORY          SIZE
# multi-stage-app     125MB

# Size reduction: 85% smaller!
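
The percentage follows directly from the two sizes in the listing. A minimal sketch of the arithmetic, using a hypothetical function name `size_reduction_pct`:

```shell
#!/bin/sh
# Percentage by which "after" is smaller than "before" (integer, rounded down).
size_reduction_pct() {
  before="$1"
  after="$2"
  echo $(( (before - after) * 100 / before ))
}

size_reduction_pct 845 125   # sizes in MB from the listing above; prints 85
```

For real images, exact sizes in bytes are available from `docker image inspect --format '{{.Size}}' IMAGE` and can be passed to the same function.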

Build Time Comparison

# First build (no cache)
docker build --target production -t myapp:latest .
# Time: 2m 30s

# Subsequent build (with cache)
docker build --target production -t myapp:latest .
# Time: 45s (layer caching working)
