Docker Swarm: Complete Deployment Guide for NestJS & Next.js Apps
By Rakesh Tembhurne (@tembhurnerakesh)
A Docker Swarm deployment guide for a NestJS 11 backend and a Next.js 15 frontend, using one stack file for both development and production environments.
Architecture Overview
Tech Stack:
- Frontend: Next.js 15
- Backend: NestJS 11
- Database: PostgreSQL 18
- Cache: Redis 8
- Message Queue: Kafka (Confluent Platform 7.9)
- Reverse Proxy: Nginx (production only)
Server Setup:
- Development: 3 servers (10.0.0.20, 10.0.0.21, 10.0.0.22)
- Production: 3 servers (same IPs, different domain/SSL)
All 3 nodes are managers for high availability and fault tolerance.
Part 1: Initial Server Setup (All 3 Servers)
What You'll Do
- Install Docker on all 3 servers
- Configure firewalls
- Test network connectivity
Where to Run: ON EACH SERVER (10.0.0.20, 10.0.0.21, 10.0.0.22)
# SSH into each server and run these commands
# Update system
sudo apt update && sudo apt upgrade -y
# Install Docker (latest version)
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
# Add your user to docker group (avoid using sudo)
sudo usermod -aG docker $USER
# IMPORTANT: Log out and log back in, or run:
newgrp docker
# Verify Docker installation
docker --version # Should show 24.0+
# Enable Docker to start on boot
sudo systemctl enable docker
sudo systemctl start docker
# Configure firewall (if UFW is enabled)
sudo ufw allow 2377/tcp # Swarm management
sudo ufw allow 7946/tcp # Container network discovery
sudo ufw allow 7946/udp # Container network discovery
sudo ufw allow 4789/udp # Overlay network traffic
sudo ufw allow 80/tcp # HTTP
sudo ufw allow 443/tcp # HTTPS
sudo ufw allow 22/tcp # SSH (if not already allowed)
# Verify UFW status
sudo ufw status
# Test network connectivity from this server to others
ping -c 3 10.0.0.20
ping -c 3 10.0.0.21
ping -c 3 10.0.0.22
✅ Checkpoint: You should be able to run docker ps without sudo on all 3 servers.
Part 2: Initialize Docker Swarm Cluster
What You'll Do
- Initialize swarm on first server (becomes Leader)
- Join other servers as managers
- Verify all nodes are connected
Step 1: Initialize Swarm (Only on Server 1)
Where to Run: 10.0.0.20 (SSH into this server first)
# Initialize swarm and make this node the leader
docker swarm init --advertise-addr 10.0.0.20
# You'll see output like:
# Swarm initialized: current node (abc123xyz) is now a manager.
#
# To add a manager to this swarm, run the following command:
# docker swarm join --token SWMTKN-1-xxxMANAGERTOKENxxx 10.0.0.20:2377
IMPORTANT: Copy the entire docker swarm join --token... command from the output. You'll need it in Step 2.
If you didn't copy it, retrieve the manager token:
# Run this on 10.0.0.20 to get the token again
docker swarm join-token manager
Verify swarm is initialized:
# Run on 10.0.0.20
docker node ls
# Output should show:
# ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
# abc123xyz * server1 Ready Active Leader 24.0.x
Step 2: Join Server 2 as Manager
Where to Run: 10.0.0.21 (SSH into this server)
# Use the EXACT command from Step 1 output (replace with your actual token)
docker swarm join --token SWMTKN-1-xxxMANAGERTOKENxxx 10.0.0.20:2377
# You'll see:
# This node joined a swarm as a manager.
Step 3: Join Server 3 as Manager
Where to Run: 10.0.0.22 (SSH into this server)
# Use the EXACT same command from Step 1
docker swarm join --token SWMTKN-1-xxxMANAGERTOKENxxx 10.0.0.20:2377
# You'll see:
# This node joined a swarm as a manager.
Step 4: Verify Cluster
Where to Run: Any manager (10.0.0.20 or 10.0.0.21 or 10.0.0.22)
# Check all nodes are connected
docker node ls
# Output should show all 3 nodes:
# ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
# abc123xyz * server1 Ready Active Leader 24.0.x
# def456uvw server2 Ready Active Reachable 24.0.x
# ghi789rst server3 Ready Active Reachable 24.0.x
✅ Checkpoint: All 3 nodes should show STATUS: Ready and MANAGER STATUS: Leader/Reachable.
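If you want a bit more detail than docker node ls, you can also query the swarm state directly with standard CLI formatting; the format strings below are just a convenience:
# Show node/manager counts known to the swarm
docker info --format 'Nodes: {{.Swarm.Nodes}}  Managers: {{.Swarm.Managers}}'
# Show reachability of the current manager and whether it is the Leader
docker node inspect self --format 'Reachability: {{.ManagerStatus.Reachability}}  Leader: {{.ManagerStatus.Leader}}'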
Part 3: Prepare Deployment Files
What You'll Do
- Create project directory
- Create single stack file for both dev and production
- Create environment files
- Create Nginx configs
- Create Dockerfiles
Where to Run: 10.0.0.20 (primary manager)
Step 1: Create Project Structure
# Create project directory on primary manager
mkdir -p ~/swarm-app
cd ~/swarm-app
# Create directory structure
mkdir -p nginx backend frontend
Step 2: Create Single Stack File
Create docker-stack.yml (works for both dev and production):
version: '3.8'

services:
  # PostgreSQL Database
  postgres:
    image: postgres:18-alpine
    environment:
      POSTGRES_USER: ${POSTGRES_USER:-devuser}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-devpassword}
      POSTGRES_DB: ${POSTGRES_DB:-myapp_dev}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    networks:
      - backend
    ports:
      - target: 5432
        published: ${POSTGRES_PORT:-5432}
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.database == true
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
    healthcheck:
      test: ['CMD-SHELL', 'pg_isready -U ${POSTGRES_USER:-devuser}']
      interval: 10s
      timeout: 5s
      retries: 5

  # Redis Cache
  redis:
    image: redis:8-alpine
    command: redis-server --appendonly yes ${REDIS_PASSWORD_CMD}
    volumes:
      - redis_data:/data
    networks:
      - backend
    ports:
      - target: 6379
        published: ${REDIS_PORT:-6379}
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.database == true
      restart_policy:
        condition: on-failure
    healthcheck:
      test: ['CMD', 'redis-cli', 'ping']
      interval: 10s
      timeout: 3s
      retries: 5

  # Zookeeper (required for Kafka)
  zookeeper:
    image: confluentinc/cp-zookeeper:7.9.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    volumes:
      - zookeeper_data:/var/lib/zookeeper/data
      - zookeeper_logs:/var/lib/zookeeper/log
    networks:
      - backend
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.database == true
      restart_policy:
        condition: on-failure

  # Kafka Message Queue
  kafka:
    image: confluentinc/cp-kafka:7.9.0
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_LOG_RETENTION_HOURS: 168
    volumes:
      - kafka_data:/var/lib/kafka/data
    networks:
      - backend
    ports:
      - target: 9092
        published: ${KAFKA_PORT:-9092}
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.database == true
      restart_policy:
        condition: on-failure
      resources:
        limits:
          cpus: '1'
          memory: 2G
        reservations:
          cpus: '0.5'
          memory: 1G

  # NestJS Backend
  backend:
    image: ${BACKEND_IMAGE:-myregistry/nestjs-backend:dev}
    environment:
      NODE_ENV: ${NODE_ENV:-development}
      DATABASE_HOST: postgres
      DATABASE_PORT: 5432
      DATABASE_USER: ${POSTGRES_USER:-devuser}
      DATABASE_PASSWORD: ${POSTGRES_PASSWORD:-devpassword}
      DATABASE_NAME: ${POSTGRES_DB:-myapp_dev}
      REDIS_HOST: redis
      REDIS_PORT: 6379
      KAFKA_BROKERS: kafka:9092
      PORT: 3000
    networks:
      - backend
      - frontend
    ports:
      - target: 3000
        published: ${BACKEND_PORT:-3000}
        mode: ${PORT_MODE:-host}
    deploy:
      replicas: ${BACKEND_REPLICAS:-2}
      placement:
        constraints:
          - node.labels.application == true
        preferences:
          - spread: node.id
      update_config:
        parallelism: 1
        delay: 10s
        failure_action: rollback
        order: start-first
      rollback_config:
        parallelism: 1
        delay: 5s
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
      resources:
        limits:
          cpus: '1'
          memory: 1G
        reservations:
          cpus: '0.5'
          memory: 512M

  # Next.js Frontend
  frontend:
    image: ${FRONTEND_IMAGE:-myregistry/nextjs-frontend:dev}
    environment:
      NODE_ENV: ${NODE_ENV:-development}
      NEXT_PUBLIC_API_URL: ${NEXT_PUBLIC_API_URL:-http://10.0.0.20:3000}
    networks:
      - frontend
    ports:
      - target: 3000
        published: ${FRONTEND_PORT:-3001}
        mode: ${PORT_MODE:-host}
    deploy:
      replicas: ${FRONTEND_REPLICAS:-1}
      placement:
        constraints:
          - node.labels.application == true
        preferences:
          - spread: node.id
      update_config:
        parallelism: 1
        delay: 10s
        failure_action: rollback
        order: start-first
      restart_policy:
        condition: on-failure
      resources:
        limits:
          cpus: '0.5'
          memory: 512M
        reservations:
          cpus: '0.25'
          memory: 256M

  # Nginx Reverse Proxy (production only)
  nginx:
    image: nginx:1.27-alpine
    ports:
      - target: 80
        published: 80
      - target: 443
        published: 443
    volumes:
      - ssl_certs:/etc/nginx/ssl:ro
      - nginx_cache:/var/cache/nginx
    configs:
      - source: nginx_config
        target: /etc/nginx/conf.d/default.conf
    networks:
      - frontend
    deploy:
      replicas: ${NGINX_REPLICAS:-0}
      placement:
        constraints:
          - node.role == manager
        preferences:
          - spread: node.id
      update_config:
        parallelism: 1
        delay: 10s
      restart_policy:
        condition: on-failure
    healthcheck:
      test: ['CMD', 'wget', '--quiet', '--tries=1', '--spider', 'http://localhost/health']
      interval: 30s
      timeout: 10s
      retries: 3

networks:
  frontend:
    driver: overlay
    attachable: true
  backend:
    driver: overlay
    attachable: true

volumes:
  postgres_data:
  redis_data:
  kafka_data:
  zookeeper_data:
  zookeeper_logs:
  ssl_certs:
  nginx_cache:

configs:
  nginx_config:
    external: true
Step 3: Create Environment Files
Development Environment (dev.env):
# Create dev.env file
cat > ~/swarm-app/dev.env <<'EOF'
# Node labels will be set by deployment script
NODE_ENV=development
POSTGRES_USER=devuser
POSTGRES_PASSWORD=devpassword
POSTGRES_DB=myapp_dev
POSTGRES_PORT=5432
REDIS_PORT=6379
REDIS_PASSWORD_CMD=
KAFKA_PORT=9092
BACKEND_IMAGE=myregistry/nestjs-backend:dev
BACKEND_PORT=3000
BACKEND_REPLICAS=2
FRONTEND_IMAGE=myregistry/nextjs-frontend:dev
FRONTEND_PORT=3001
FRONTEND_REPLICAS=1
NEXT_PUBLIC_API_URL=http://10.0.0.20:3000
NGINX_REPLICAS=0
PORT_MODE=host
EOF
Production Environment (prod.env):
# Create prod.env file
cat > ~/swarm-app/prod.env <<'EOF'
NODE_ENV=production
POSTGRES_USER=produser
POSTGRES_PASSWORD=CHANGE_THIS_IN_ACTUAL_PRODUCTION
POSTGRES_DB=myapp_prod
POSTGRES_PORT=
REDIS_PORT=
REDIS_PASSWORD_CMD=--requirepass YOUR_REDIS_PASSWORD
KAFKA_PORT=
BACKEND_IMAGE=myregistry/nestjs-backend:latest
BACKEND_PORT=3000
BACKEND_REPLICAS=3
FRONTEND_IMAGE=myregistry/nextjs-frontend:latest
FRONTEND_PORT=3000
FRONTEND_REPLICAS=3
NEXT_PUBLIC_API_URL=https://api.yourdomain.com
NGINX_REPLICAS=2
PORT_MODE=ingress
EOF
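Optional: with the stack file and env files in place, you can preview variable substitution before deploying. This assumes Docker Compose v2 is available on the manager; docker stack deploy performs the same substitution at deploy time, so this is only a sanity check and may print a warning about the obsolete version key:
# Render docker-stack.yml with dev.env values substituted and check that it parses
cd ~/swarm-app
export $(cat dev.env | xargs)
docker compose -f docker-stack.yml config | head -50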
Step 4: Create Nginx Configuration
Where to Run: 10.0.0.20
# Create nginx config for production
cat > ~/swarm-app/nginx/production.conf <<'EOF'
upstream backend {
    least_conn;
    server backend:3000 max_fails=3 fail_timeout=30s;
}

upstream frontend {
    least_conn;
    server frontend:3000 max_fails=3 fail_timeout=30s;
}

limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;
limit_req_zone $binary_remote_addr zone=web_limit:10m rate=50r/s;

server {
    listen 80;
    server_name yourdomain.com api.yourdomain.com;

    # Plain-HTTP health endpoint so the container healthcheck
    # (wget http://localhost/health) is not caught by the HTTPS redirect
    location /health {
        return 200 "healthy\n";
        add_header Content-Type text/plain;
    }

    location /.well-known/acme-challenge/ {
        root /var/www/certbot;
    }

    location / {
        return 301 https://$host$request_uri;
    }
}

server {
    listen 443 ssl http2;
    server_name yourdomain.com;

    ssl_certificate /etc/nginx/ssl/fullchain.pem;
    ssl_certificate_key /etc/nginx/ssl/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;

    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header Strict-Transport-Security "max-age=31536000" always;

    limit_req zone=web_limit burst=20 nodelay;

    location / {
        proxy_pass http://frontend;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    location /health {
        return 200 "healthy\n";
        add_header Content-Type text/plain;
    }
}

server {
    listen 443 ssl http2;
    server_name api.yourdomain.com;

    ssl_certificate /etc/nginx/ssl/fullchain.pem;
    ssl_certificate_key /etc/nginx/ssl/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;

    add_header X-Frame-Options "DENY" always;
    add_header X-Content-Type-Options "nosniff" always;

    limit_req zone=api_limit burst=20 nodelay;
    client_max_body_size 10M;

    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    location /health {
        proxy_pass http://backend/health;
    }
}
EOF
Step 5: Create Dockerfiles
NestJS Backend Dockerfile (note: the HEALTHCHECK assumes your NestJS app exposes a GET /health endpoint):
cat > ~/swarm-app/backend/Dockerfile <<'EOF'
FROM node:22-alpine AS builder
WORKDIR /app
COPY package*.json ./
COPY yarn.lock ./
RUN yarn install --frozen-lockfile
COPY . .
RUN yarn build
FROM node:22-alpine
WORKDIR /app
COPY package*.json ./
COPY yarn.lock ./
RUN yarn install --frozen-lockfile --production
COPY --from=builder /app/dist ./dist
RUN addgroup -g 1001 -S nodejs && adduser -S nestjs -u 1001
USER nestjs
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=3s --start-period=40s \
CMD node -e "require('http').get('http://localhost:3000/health', (r) => process.exit(r.statusCode === 200 ? 0 : 1))"
CMD ["node", "dist/main.js"]
EOF
Next.js Frontend Dockerfile:
cat > ~/swarm-app/frontend/Dockerfile <<'EOF'
FROM node:22-alpine AS deps
WORKDIR /app
COPY package*.json ./
COPY yarn.lock ./
RUN yarn install --frozen-lockfile
FROM node:22-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
ENV NEXT_TELEMETRY_DISABLED=1
RUN yarn build
FROM node:22-alpine
WORKDIR /app
ENV NODE_ENV=production
ENV NEXT_TELEMETRY_DISABLED=1
RUN addgroup -g 1001 -S nodejs && adduser -S nextjs -u 1001
COPY --from=builder /app/public ./public
COPY --from=builder --chown=nextjs:nodejs /app/.next/standalone ./
COPY --from=builder --chown=nextjs:nodejs /app/.next/static ./.next/static
USER nextjs
EXPOSE 3000
ENV PORT=3000
HEALTHCHECK --interval=30s --timeout=3s --start-period=40s \
CMD node -e "require('http').get('http://localhost:3000', (r) => process.exit(r.statusCode === 200 ? 0 : 1))"
CMD ["node", "server.js"]
EOF
Important for Next.js: Add to next.config.js:
module.exports = {
  output: 'standalone',
  // ... other config
}
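If output: 'standalone' is missing, the COPY --from=builder /app/.next/standalone step in the Dockerfile above fails at build time. A quick smoke test on your local machine catches this before pushing; the image tag and project path below are just placeholders for your own layout:
# Build and run the frontend image locally
docker build -t nextjs-frontend:smoke ~/projects/myapp/frontend
docker run --rm -p 3000:3000 nextjs-frontend:smoke
# In another terminal:
# curl -I http://localhost:3000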
Part 4: Setup Private Docker Registry (Optional but Recommended)
What You'll Do
- Deploy Docker Registry as a swarm service
- Configure insecure registry on all nodes (for development)
- Push/pull images from your local machine to the registry
Why Use Private Registry?
- Images built on your local machine can be pushed to the registry
- All swarm nodes automatically pull images from the registry
- No need to build images on each server
- Version control for your Docker images
- Works offline (no external dependencies)
Where to Run: Commands will specify each location
Step 1: Deploy Docker Registry Service
Where to Run: 10.0.0.20 (primary manager)
Note: the node.labels.database==true constraint below requires that a node already carries the database=true label (see Part 5, Step 1). Label a node first, otherwise the registry service will sit in Pending state.
# Create registry directory for persistent storage
sudo mkdir -p /mnt/registry
# Deploy registry as a swarm service
docker service create \
--name registry \
--publish published=5000,target=5000 \
--constraint 'node.labels.database==true' \
--mount type=bind,source=/mnt/registry,target=/var/lib/registry \
registry:2
# Verify registry is running
docker service ls | grep registry
docker service ps registry
# Test registry
curl http://10.0.0.20:5000/v2/_catalog
# Should return: {"repositories":[]}
Step 2: Configure Insecure Registry on All Nodes
Where to Run: ON EACH SERVER (10.0.0.20, 10.0.0.21, 10.0.0.22)
# Edit Docker daemon config
sudo tee /etc/docker/daemon.json <<EOF
{
"insecure-registries": ["10.0.0.20:5000"]
}
EOF
# Restart Docker daemon
sudo systemctl restart docker
# Verify configuration
docker info | grep -A 5 "Insecure Registries"
# Should show: 10.0.0.20:5000
Important: After restarting Docker on all nodes, you may need to rejoin the swarm:
# On 10.0.0.20 - get join token
docker swarm join-token manager
# On 10.0.0.21 and 10.0.0.22 - rejoin if needed
# docker swarm join --token SWMTKN-1-xxx... 10.0.0.20:2377
# Verify all nodes are back
docker node ls
Step 3: Configure Your Local Machine
Where to Run: YOUR LOCAL MACHINE (not the servers)
# Edit Docker daemon config on your local machine
sudo tee /etc/docker/daemon.json <<EOF
{
"insecure-registries": ["10.0.0.20:5000"]
}
EOF
# Restart Docker
sudo systemctl restart docker # Linux
# or restart Docker Desktop (Mac/Windows)
# Test connectivity from local machine
curl http://10.0.0.20:5000/v2/_catalog
Step 4: Build and Push Images from Local Machine
Where to Run: YOUR LOCAL MACHINE
# Clone your application code to local machine
cd ~/projects/myapp
# Build backend image
cd backend
docker build -t 10.0.0.20:5000/nestjs-backend:dev .
# Build frontend image
cd ../frontend
docker build -t 10.0.0.20:5000/nextjs-frontend:dev .
# Push images to private registry
docker push 10.0.0.20:5000/nestjs-backend:dev
docker push 10.0.0.20:5000/nextjs-frontend:dev
# Verify images are in registry
curl http://10.0.0.20:5000/v2/_catalog
# Should return: {"repositories":["nestjs-backend","nextjs-frontend"]}
# Check tags
curl http://10.0.0.20:5000/v2/nestjs-backend/tags/list
curl http://10.0.0.20:5000/v2/nextjs-frontend/tags/list
Step 5: Update Stack File to Use Registry
Where to Run: 10.0.0.20
Edit environment files to use registry URL:
# Edit dev.env
nano ~/swarm-app/dev.env
Change image references:
BACKEND_IMAGE=10.0.0.20:5000/nestjs-backend:dev
FRONTEND_IMAGE=10.0.0.20:5000/nextjs-frontend:dev
# Edit prod.env
nano ~/swarm-app/prod.env
Change image references:
BACKEND_IMAGE=10.0.0.20:5000/nestjs-backend:latest
FRONTEND_IMAGE=10.0.0.20:5000/nextjs-frontend:latest
Step 6: Production Images Workflow
Where to Run: YOUR LOCAL MACHINE
# Build production images locally
cd ~/projects/myapp/backend
docker build -t 10.0.0.20:5000/nestjs-backend:latest .
cd ~/projects/myapp/frontend
docker build -t 10.0.0.20:5000/nextjs-frontend:latest .
# Tag with version (good practice)
docker tag 10.0.0.20:5000/nestjs-backend:latest 10.0.0.20:5000/nestjs-backend:v1.0.0
docker tag 10.0.0.20:5000/nextjs-frontend:latest 10.0.0.20:5000/nextjs-frontend:v1.0.0
# Push to registry
docker push 10.0.0.20:5000/nestjs-backend:latest
docker push 10.0.0.20:5000/nestjs-backend:v1.0.0
docker push 10.0.0.20:5000/nextjs-frontend:latest
docker push 10.0.0.20:5000/nextjs-frontend:v1.0.0
Then on swarm cluster:
Where to Run: 10.0.0.20
# Deploy/update stack - images will be pulled from registry automatically
export $(cat prod.env | xargs)
docker stack deploy -c docker-stack.yml myapp
# All nodes will pull from 10.0.0.20:5000 automatically
Step 7: Secure Registry for Production (Optional)
For production, you should secure the registry with TLS and authentication:
Where to Run: 10.0.0.20
# Generate htpasswd file for authentication
sudo apt install -y apache2-utils
mkdir -p ~/registry-auth
# Create user (replace 'admin' and 'password')
htpasswd -Bc ~/registry-auth/htpasswd admin
# Create certificates directory
mkdir -p ~/registry-certs
# Generate self-signed certificate (or use Let's Encrypt)
openssl req -newkey rsa:4096 -nodes -sha256 \
-keyout ~/registry-certs/domain.key \
-x509 -days 365 \
-out ~/registry-certs/domain.crt \
-subj "/CN=10.0.0.20"
# Update registry service with auth and TLS
docker service rm registry
docker service create \
--name registry \
--publish published=5000,target=5000 \
--constraint 'node.labels.database==true' \
--mount type=bind,source=/mnt/registry,target=/var/lib/registry \
--mount type=bind,source=$HOME/registry-auth,target=/auth \
--mount type=bind,source=$HOME/registry-certs,target=/certs \
-e REGISTRY_AUTH=htpasswd \
-e REGISTRY_AUTH_HTPASSWD_REALM="Registry Realm" \
-e REGISTRY_AUTH_HTPASSWD_PATH=/auth/htpasswd \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt \
-e REGISTRY_HTTP_TLS_KEY=/certs/domain.key \
registry:2
# Login from local machine
docker login 10.0.0.20:5000
# Username: admin
# Password: <your-password>
Registry Management Commands
# List all images in registry
curl http://10.0.0.20:5000/v2/_catalog
# List tags for an image
curl http://10.0.0.20:5000/v2/nestjs-backend/tags/list
# Delete an image (requires registry with DELETE enabled)
# First, get the digest
curl -I -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
http://10.0.0.20:5000/v2/nestjs-backend/manifests/v1.0.0
# Then delete using digest
# curl -X DELETE http://10.0.0.20:5000/v2/nestjs-backend/manifests/<digest>
# View registry logs
docker service logs registry -f
# Check registry disk usage
du -sh /mnt/registry
Complete Workflow Summary
Development Workflow:
- Local Machine: Write code
- Local Machine: Build images → docker build -t 10.0.0.20:5000/app:dev .
- Local Machine: Push to registry → docker push 10.0.0.20:5000/app:dev
- Swarm (10.0.0.20): Deploy → docker stack deploy -c docker-stack.yml myapp
- All Swarm Nodes: Automatically pull from registry
Production Workflow:
- Local Machine: Build production image → docker build -t 10.0.0.20:5000/app:v1.0.0 .
- Local Machine: Tag as latest → docker tag 10.0.0.20:5000/app:v1.0.0 10.0.0.20:5000/app:latest
- Local Machine: Push → docker push 10.0.0.20:5000/app:latest
- Swarm: Update service → docker service update --image 10.0.0.20:5000/app:latest myapp_backend
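If you repeat this cycle often, a small helper script on your local machine keeps the build/push step consistent. A rough sketch only; the script name, registry address, and ~/projects/myapp layout follow the examples above and should be adjusted to your setup:
#!/usr/bin/env bash
# build-and-push.sh (hypothetical helper) -- usage: ./build-and-push.sh <tag>   e.g. dev or v1.0.0
set -euo pipefail

REGISTRY="10.0.0.20:5000"
TAG="${1:-dev}"

# Build both images with the registry-prefixed tags used in the stack file
docker build -t "$REGISTRY/nestjs-backend:$TAG"  ~/projects/myapp/backend
docker build -t "$REGISTRY/nextjs-frontend:$TAG" ~/projects/myapp/frontend

# Push to the private registry so all swarm nodes can pull them
docker push "$REGISTRY/nestjs-backend:$TAG"
docker push "$REGISTRY/nextjs-frontend:$TAG"

# Confirm the images are listed in the registry catalog
curl -s "http://$REGISTRY/v2/_catalog"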
Part 5: Development Deployment
What You'll Do
- Label nodes for workload placement
- Deploy stack in development mode
- Verify deployment
Where to Run: 10.0.0.20 (primary manager)
Step 1: Label Nodes
# Get node names
docker node ls
# Label all nodes (replace server1, server2, server3 with actual hostnames)
docker node update --label-add database=true server1
docker node update --label-add application=true server1
docker node update --label-add application=true server2
docker node update --label-add application=true server3
# Verify labels
docker node inspect server1 --format '{{ .Spec.Labels }}'
docker node inspect server2 --format '{{ .Spec.Labels }}'
docker node inspect server3 --format '{{ .Spec.Labels }}'
Step 2: Deploy Development Stack
Important: Make sure you've already pushed images to the registry (see Part 4)
# Go to project directory
cd ~/swarm-app
# Export the environment variables, then deploy
# (docker stack deploy does not read env files by itself, so export them into the shell first)
export $(cat dev.env | xargs)
docker stack deploy -c docker-stack.yml myapp
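For repeatable deployments, the export-then-deploy steps can be wrapped in a small script. A minimal sketch; deploy.sh is my own naming, and the set -a sourcing is used because it handles env values containing spaces more safely than export $(cat ... | xargs):
#!/usr/bin/env bash
# deploy.sh (hypothetical helper) -- usage: ./deploy.sh dev.env [stack-name]
set -euo pipefail

ENV_FILE="${1:?usage: ./deploy.sh <env-file> [stack-name]}"
STACK_NAME="${2:-myapp}"

# Export every variable from the env file so docker stack deploy can substitute them
set -a
source "$ENV_FILE"
set +a

docker stack deploy -c docker-stack.yml "$STACK_NAME"
docker stack services "$STACK_NAME"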
Step 3: Verify Deployment
# Watch deployment progress
watch -n 2 'docker service ls'
# Check which node is running what
docker stack ps myapp --format "table {{.Name}}\t{{.Node}}\t{{.CurrentState}}"
# Check logs
docker service logs myapp_backend -f
docker service logs myapp_frontend -f
Step 4: Access Development Services
- Frontend: http://10.0.0.20:3001
- Backend: http://10.0.0.20:3000
- PostgreSQL: 10.0.0.20:5432
- Redis: 10.0.0.20:6379
- Kafka: 10.0.0.20:9092
Part 6: Production Deployment
What You'll Do
- Set up SSL certificates
- Create Docker configs
- Update environment variables
- Deploy production stack
Where to Run: 10.0.0.20 (primary manager)
Step 1: Setup SSL Certificates
# Install Certbot
sudo apt update
sudo apt install -y certbot
# Generate SSL certificate (standalone mode)
sudo certbot certonly --standalone \
-d yourdomain.com \
-d api.yourdomain.com \
--email your@email.com \
--agree-tos \
--non-interactive
# Copy certificates to Docker volume
sudo mkdir -p /var/lib/docker/volumes/myapp_ssl_certs/_data
sudo cp /etc/letsencrypt/live/yourdomain.com/fullchain.pem \
/var/lib/docker/volumes/myapp_ssl_certs/_data/
sudo cp /etc/letsencrypt/live/yourdomain.com/privkey.pem \
/var/lib/docker/volumes/myapp_ssl_certs/_data/
# Set permissions
sudo chmod 644 /var/lib/docker/volumes/myapp_ssl_certs/_data/fullchain.pem
sudo chmod 600 /var/lib/docker/volumes/myapp_ssl_certs/_data/privkey.pem
Step 2: Create Nginx Config as Docker Config
# Create nginx config in Docker
docker config create nginx_config ~/swarm-app/nginx/production.conf
# Verify
docker config ls
Step 3: Update Production Environment Variables
# Edit prod.env to set actual passwords
nano ~/swarm-app/prod.env
# Change these values:
# POSTGRES_PASSWORD=<your-strong-password>
# REDIS_PASSWORD_CMD=--requirepass <your-redis-password>
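To generate strong values for these secrets, one option is openssl (any password manager or generator works just as well):
# Generate random secrets for prod.env
openssl rand -base64 32   # use for POSTGRES_PASSWORD
openssl rand -base64 32   # use for the Redis --requirepass value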
Step 4: Deploy Production Stack
Important: Make sure you've pushed production images to the registry from your local machine (see Part 4, Step 6)
# Remove development stack first (if running)
docker stack rm myapp
# Wait for cleanup (about 10-15 seconds)
sleep 15
# Deploy production stack
cd ~/swarm-app
export $(cat prod.env | xargs)
docker stack deploy -c docker-stack.yml myapp
# Watch deployment
watch -n 2 'docker service ls'
Step 5: Verify Production Deployment
# Check all services
docker service ls
# Should show:
# - myapp_backend: 3/3 replicas
# - myapp_frontend: 3/3 replicas
# - myapp_nginx: 2/2 replicas
# - myapp_postgres: 1/1 replica
# - myapp_redis: 1/1 replica
# - myapp_kafka: 1/1 replica
# - myapp_zookeeper: 1/1 replica
# Test endpoints
curl https://yourdomain.com/health
curl https://api.yourdomain.com/health
# View logs
docker service logs myapp_nginx -f
Step 6: SSL Auto-Renewal
# Create renewal script
sudo tee /usr/local/bin/renew-certs.sh <<'EOF'
#!/bin/bash
certbot renew --quiet
# Copy renewed certificates
cp /etc/letsencrypt/live/yourdomain.com/fullchain.pem /var/lib/docker/volumes/myapp_ssl_certs/_data/
cp /etc/letsencrypt/live/yourdomain.com/privkey.pem /var/lib/docker/volumes/myapp_ssl_certs/_data/
# Reload nginx
docker service update --force myapp_nginx
EOF
# Make executable
sudo chmod +x /usr/local/bin/renew-certs.sh
# Add to crontab (runs daily at 2am)
echo "0 2 * * * root /usr/local/bin/renew-certs.sh" | sudo tee -a /etc/crontab
Part 7: Day-to-Day Operations
Common Commands (Run on Any Manager)
Service Management
# List services
docker service ls
# Scale services
docker service scale myapp_backend=5
docker service scale myapp_frontend=4
# Update service image
docker service update --image myregistry/nestjs-backend:v2.0.0 myapp_backend
# View logs
docker service logs myapp_backend -f --tail 100
docker service logs --since 1h myapp_backend
# Restart service
docker service update --force myapp_backend
# Rollback service
docker service rollback myapp_backend
Stack Management
# View stack services
docker stack services myapp
# View stack tasks (shows which node runs what)
docker stack ps myapp
# Update stack (after changing docker-stack.yml or .env)
export $(cat prod.env | xargs)
docker stack deploy -c docker-stack.yml myapp
# Remove stack
docker stack rm myapp
Node Management
# List nodes
docker node ls
# Drain node for maintenance (stop scheduling new tasks)
docker node update --availability drain server2
# Activate node after maintenance
docker node update --availability active server2
# Remove node (must be drained or down first)
docker node rm server3
Monitoring
# Resource usage
docker stats --no-stream
# Health check
docker service ps myapp_backend --filter "desired-state=running"
# Failed tasks
docker service ps myapp_backend --filter "desired-state=shutdown"
# Inspect network
docker network inspect myapp_frontend
Part 8: Troubleshooting
Service Won't Start
Where to Run: Any manager
# Check service errors
docker service ps myapp_backend --no-trunc
# View detailed logs
docker service logs myapp_backend --tail 100
# Inspect service
docker service inspect myapp_backend --pretty
# Check image
docker service inspect --format='{{.Spec.TaskTemplate.ContainerSpec.Image}}' myapp_backend
Test Connectivity Between Services
# Test from backend to postgres
docker run --rm --network myapp_backend alpine ping postgres
# Test from backend to redis
docker run --rm --network myapp_backend alpine ping redis
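Pinging a service VIP is not always reliable on overlay networks, so a more direct check is to hit the service ports themselves, reusing images the stack already pulls (these commands are a sketch and assume the dev credentials and network name myapp_backend):
# DNS resolution on the overlay network
docker run --rm --network myapp_backend alpine nslookup postgres
# PostgreSQL reachable?
docker run --rm --network myapp_backend postgres:18-alpine pg_isready -h postgres
# Redis reachable?
docker run --rm --network myapp_backend redis:8-alpine redis-cli -h redis ping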
Backup Database
# Backup PostgreSQL
docker exec $(docker ps -q -f name=myapp_postgres) \
pg_dump -U produser myapp_prod > backup_$(date +%Y%m%d).sql
# Restore
cat backup_20251015.sql | \
docker exec -i $(docker ps -q -f name=myapp_postgres) \
psql -U produser myapp_prod
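The commands above must run on the node that currently hosts the postgres task. For unattended backups you could adapt something like the following sketch; the script name, backup path, and 14-day retention are my own choices, not part of the original setup:
#!/bin/bash
# backup-postgres.sh (hypothetical) -- run on the node where the postgres task is scheduled
set -euo pipefail

BACKUP_DIR=/var/backups/postgres
mkdir -p "$BACKUP_DIR"

# Find the running postgres container for the stack and dump the database compressed
CONTAINER=$(docker ps -q -f name=myapp_postgres)
docker exec "$CONTAINER" pg_dump -U produser myapp_prod \
  | gzip > "$BACKUP_DIR/myapp_prod_$(date +%Y%m%d_%H%M%S).sql.gz"

# Keep only the 14 most recent backups
ls -1t "$BACKUP_DIR"/*.sql.gz | tail -n +15 | xargs -r rm --

# Example crontab entry (daily at 3am):
# 0 3 * * * root /usr/local/bin/backup-postgres.sh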
Failover Testing
# On server2, simulate failure
docker swarm leave
# On any remaining manager, check status
docker node ls
# server2 will show "Down"
# Services redistribute automatically
docker stack ps myapp
# Rejoin server2
docker swarm join --token SWMTKN-1-xxx... 10.0.0.20:2377
docker node promote server2
Part 9: Scaling Beyond 3 Servers
Overview
The examples use 3 servers for simplicity, but production deployments often need more nodes. This section covers scaling to 5, 10, or even 50+ nodes.
Typical Scenarios:
- Development: 3 servers (10.0.0.20, 10.0.0.21, 10.0.0.22)
- Staging: 5 servers (10.0.0.20-24)
- Production: 10+ servers (could be same IPs or different)
Manager vs Worker Node Strategy
Recommended Configurations
| Cluster Size | Managers | Workers | Quorum | Failure Tolerance |
|---|---|---|---|---|
| 3 nodes | 3 | 0 | 2 | 1 manager down |
| 5 nodes | 3 | 2 | 2 | 1 manager down |
| 7 nodes | 3 | 4 | 2 | 1 manager down |
| 10 nodes | 5 | 5 | 3 | 2 managers down |
| 20 nodes | 5 | 15 | 3 | 2 managers down |
| 50 nodes | 7 | 43 | 4 | 3 managers down |
Key Rules:
- Always use odd number of managers (3, 5, 7, 9)
- More than 7 managers is NOT recommended
- Workers handle application workloads
- Managers handle orchestration + can run workloads
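To see where your cluster currently sits in this table, count the managers from any manager node; quorum is floor(managers/2) + 1:
# List current managers and count them
docker node ls --filter "role=manager" --format '{{.Hostname}} {{.ManagerStatus}}'
docker node ls --filter "role=manager" -q | wc -l
# Example: 3 managers -> quorum 2 (tolerates 1 down), 5 -> 3 (tolerates 2), 7 -> 4 (tolerates 3)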
Example: Scaling from 3 to 10 Servers
Current Setup (3 Servers)
10.0.0.20 - Manager (Leader) + Worker
10.0.0.21 - Manager (Reachable) + Worker
10.0.0.22 - Manager (Reachable) + Worker
Target Setup (10 Servers - 5 Managers + 5 Workers)
10.0.0.20 - Manager (Leader)
10.0.0.21 - Manager (Reachable)
10.0.0.22 - Manager (Reachable)
10.0.0.23 - Manager (Reachable) [NEW]
10.0.0.24 - Manager (Reachable) [NEW]
10.0.0.25 - Worker [NEW]
10.0.0.26 - Worker [NEW]
10.0.0.27 - Worker [NEW]
10.0.0.28 - Worker [NEW]
10.0.0.29 - Worker [NEW]
Step 1: Prepare New Servers
Where to Run: ON EACH NEW SERVER (10.0.0.23-29)
# SSH into each new server and run:
sudo apt update && sudo apt upgrade -y
# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
# Add user to docker group
sudo usermod -aG docker $USER
newgrp docker
# Enable Docker
sudo systemctl enable docker
sudo systemctl start docker
# Configure firewall
sudo ufw allow 2377/tcp
sudo ufw allow 7946/tcp
sudo ufw allow 7946/udp
sudo ufw allow 4789/udp
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
# Configure insecure registry (if using private registry)
sudo tee /etc/docker/daemon.json <<EOF
{
"insecure-registries": ["10.0.0.20:5000"]
}
EOF
sudo systemctl restart docker
# Test connectivity to existing cluster
ping -c 3 10.0.0.20
Step 2: Get Join Tokens
Where to Run: 10.0.0.20 (any existing manager)
# Get manager join token
docker swarm join-token manager
# Get worker join token
docker swarm join-token worker
# Save both tokens - you'll need them for new nodes
Step 3: Add New Manager Nodes
Where to Run: 10.0.0.23 and 10.0.0.24
# Join as manager
docker swarm join --token SWMTKN-1-xxxMANAGERTOKENxxx 10.0.0.20:2377
Verify on any manager:
docker node ls
# Should show 5 managers now:
# - 1 Leader
# - 4 Reachable
Step 4: Add Worker Nodes
Where to Run: 10.0.0.25, 10.0.0.26, 10.0.0.27, 10.0.0.28, 10.0.0.29
# Join as worker
docker swarm join --token SWMTKN-1-xxxWORKERTOKENxxx 10.0.0.20:2377
Verify on any manager:
docker node ls
# Should show 10 nodes total:
# - 5 Managers (1 Leader, 4 Reachable)
# - 5 Workers
Step 5: Label New Nodes
Where to Run: 10.0.0.20 (any manager)
# Label new managers
docker node update --label-add env=production --label-add application=true server4
docker node update --label-add env=production --label-add application=true server5
# Label new workers
docker node update --label-add env=production --label-add application=true server6
docker node update --label-add env=production --label-add application=true server7
docker node update --label-add env=production --label-add application=true server8
docker node update --label-add env=production --label-add application=true server9
docker node update --label-add env=production --label-add application=true server10
# Verify labels
docker node ls --format "table {{.Hostname}}\t{{.ManagerStatus}}\t{{.Availability}}"
Step 6: Scale Services Automatically
Where to Run: 10.0.0.20
Services will automatically redistribute across all nodes based on labels and constraints.
# Scale backend to use more nodes
docker service scale myapp_backend=10
# Scale frontend to use more nodes
docker service scale myapp_frontend=10
# Check distribution
docker service ps myapp_backend --format "table {{.Name}}\t{{.Node}}\t{{.CurrentState}}"
docker service ps myapp_frontend --format "table {{.Name}}\t{{.Node}}\t{{.CurrentState}}"
# Example output showing distribution:
# NAME NODE CURRENT STATE
# myapp_backend.1 server1 Running
# myapp_backend.2 server2 Running
# myapp_backend.3 server3 Running
# myapp_backend.4 server4 Running
# myapp_backend.5 server5 Running
# myapp_backend.6 server6 Running
# myapp_backend.7 server7 Running
# myapp_backend.8 server8 Running
# myapp_backend.9 server9 Running
# myapp_backend.10 server10 Running
Handling Different IP Addresses
Scenario: Production Servers with Different IPs
Example:
Development:
10.0.0.20, 10.0.0.21, 10.0.0.22
Production:
192.168.1.10, 192.168.1.11, 192.168.1.12, ... 192.168.1.20
Step 1: Initialize Production Swarm
Where to Run: 192.168.1.10 (first production server)
# Initialize with production IP
docker swarm init --advertise-addr 192.168.1.10
# Get tokens
docker swarm join-token manager
docker swarm join-token worker
Step 2: Update Registry Configuration
On All Production Servers:
# Update insecure registry to production IP
sudo tee /etc/docker/daemon.json <<EOF
{
"insecure-registries": ["192.168.1.10:5000"]
}
EOF
sudo systemctl restart docker
Deploy Registry on Production:
# Deploy registry on production cluster
docker service create \
--name registry \
--publish published=5000,target=5000 \
--constraint 'node.labels.database==true' \
--mount type=bind,source=/mnt/registry,target=/var/lib/registry \
registry:2
Step 3: Update Environment Files
On Production Primary Manager (192.168.1.10):
Edit prod.env to use production IPs:
# prod.env
BACKEND_IMAGE=192.168.1.10:5000/nestjs-backend:latest
FRONTEND_IMAGE=192.168.1.10:5000/nextjs-frontend:latest
NEXT_PUBLIC_API_URL=https://api.yourproductiondomain.com
Step 4: Push Images to Production Registry
From Your Local Machine:
# Configure local machine to push to production registry
# (overwrite daemon.json rather than appending -- appending a second JSON object
#  would make the file invalid; list both registries if you still push to dev)
sudo tee /etc/docker/daemon.json <<EOF
{
  "insecure-registries": ["10.0.0.20:5000", "192.168.1.10:5000"]
}
EOF
sudo systemctl restart docker
# Tag and push to production registry
docker tag 10.0.0.20:5000/nestjs-backend:latest 192.168.1.10:5000/nestjs-backend:latest
docker push 192.168.1.10:5000/nestjs-backend:latest
docker tag 10.0.0.20:5000/nextjs-frontend:latest 192.168.1.10:5000/nextjs-frontend:latest
docker push 192.168.1.10:5000/nextjs-frontend:latest
Advanced Node Management
Dedicated Node Roles
Label nodes for specific workloads:
# Database nodes (SSD storage, high memory)
docker node update --label-add role=database --label-add storage=ssd server1
docker node update --label-add role=database --label-add storage=ssd server2
# Application nodes (general purpose)
docker node update --label-add role=application server3
docker node update --label-add role=application server4
docker node update --label-add role=application server5
# Message queue nodes (high CPU)
docker node update --label-add role=messaging --label-add cpu=high server6
# Frontend CDN edge nodes
docker node update --label-add role=edge --label-add region=us-east server7
docker node update --label-add role=edge --label-add region=eu-west server8
Update Stack for Dedicated Roles
Edit docker-stack.yml:
services:
  postgres:
    deploy:
      placement:
        constraints:
          - node.labels.role == database
          - node.labels.storage == ssd

  kafka:
    deploy:
      placement:
        constraints:
          - node.labels.role == messaging

  backend:
    deploy:
      replicas: 10
      placement:
        constraints:
          - node.labels.role == application
        preferences:
          - spread: node.id

  frontend:
    deploy:
      replicas: 10
      placement:
        constraints:
          - node.labels.role == edge
        preferences:
          - spread: node.labels.region
Load Balancer Configuration for Multiple Nodes
Using External Load Balancer (Recommended for Production)
When you have multiple nodes, use an external load balancer (HAProxy, AWS ELB, Nginx) to distribute traffic:
[External Load Balancer]
HAProxy / AWS ELB
IP: 203.0.113.100
|
----------------------------------------
| | |
10.0.0.20:80 10.0.0.21:80 10.0.0.22:80
(nginx replica 1) (nginx replica 2) (nginx replica 3)
HAProxy Configuration Example:
# /etc/haproxy/haproxy.cfg
frontend http_front
    bind *:80
    bind *:443 ssl crt /etc/ssl/certs/yourdomain.pem
    default_backend swarm_nodes

backend swarm_nodes
    balance roundrobin
    option httpchk GET /health
    server node1 10.0.0.20:80 check
    server node2 10.0.0.21:80 check
    server node3 10.0.0.22:80 check
    server node4 10.0.0.23:80 check
    server node5 10.0.0.24:80 check
Using Docker Swarm Ingress (Built-in)
Docker Swarm has built-in ingress load balancing. When you publish a port in ingress mode (default), you can access the service via ANY node IP:
services:
  nginx:
    ports:
      - target: 80
        published: 80
        mode: ingress # Default - accessible via ANY node
This means:
- http://10.0.0.20:80 → routes to nginx
- http://10.0.0.21:80 → routes to nginx
- http://10.0.0.22:80 → routes to nginx
- All node IPs work! Swarm handles routing internally (see the quick check below)
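A quick way to confirm the routing mesh from any machine that can reach the nodes; this assumes the /health endpoint from the nginx config above:
# Every node IP should answer, even nodes that are not running an nginx task
for ip in 10.0.0.20 10.0.0.21 10.0.0.22; do
  curl -s -o /dev/null -w "$ip -> HTTP %{http_code}\n" "http://$ip/health"
done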
DNS Round-Robin Setup:
# Set up DNS with multiple A records
yourdomain.com. IN A 10.0.0.20
yourdomain.com. IN A 10.0.0.21
yourdomain.com. IN A 10.0.0.22
yourdomain.com. IN A 10.0.0.23
yourdomain.com. IN A 10.0.0.24
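After the records propagate, you can confirm the round-robin answers (requires dig from the dnsutils/bind-utils package):
# Should print all A records; clients rotate between them
dig +short yourdomain.com A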
Maintenance: Draining Nodes
When you need to update or restart a node:
# Drain node (move all tasks away)
docker node update --availability drain server5
# Perform maintenance
ssh user@server5
sudo apt update && sudo apt upgrade -y
sudo reboot
# Reactivate node
docker node update --availability active server5
# Verify tasks are redistributed
docker node ps server5
Removing Nodes from Cluster
# On the node to remove
docker swarm leave
# On manager, verify it's down
docker node ls
# Remove from cluster (on manager)
docker node rm server10
# For manager nodes, demote first
docker node demote server5
# Then on server5
docker swarm leave
# Then remove
docker node rm server5
Quick Commands for Multi-Node Operations
# View node distribution
docker node ls --format "table {{.Hostname}}\t{{.Status}}\t{{.ManagerStatus}}\t{{.Availability}}"
# View service distribution across all nodes
docker service ps myapp_backend --format "table {{.Name}}\t{{.Node}}\t{{.CurrentState}}\t{{.Error}}"
# Check which node is Leader
docker node ls | grep Leader
# Rebalance services across nodes
docker service update --force myapp_backend
docker service update --force myapp_frontend
# Check resource usage across all nodes (run on each node)
for node in server{1..10}; do
ssh user@$node "hostname && docker stats --no-stream"
done
Summary
Key Points
- One Stack File: Use docker-stack.yml with environment variables for both dev and production
- Environment Files: dev.env for development, prod.env for production
- All Commands Run on Managers: Any manager node can run management commands
- Scalable Architecture: Start with 3 nodes, scale to 10+ as needed
- Flexible IPs: Works with same or different IP ranges for dev/prod
Quick Reference
Initialize Swarm:
- Run on 10.0.0.20: docker swarm init --advertise-addr 10.0.0.20
- Run on 10.0.0.21 & 10.0.0.22: docker swarm join --token ...
Deploy Development:
export $(cat dev.env | xargs)
docker stack deploy -c docker-stack.yml myapp
Deploy Production:
export $(cat prod.env | xargs)
docker stack deploy -c docker-stack.yml myapp
Update Stack:
# Edit .env file or docker-stack.yml
export $(cat prod.env | xargs)
docker stack deploy -c docker-stack.yml myapp
Access Points
Development:
- Frontend: http://10.0.0.20:3001
- Backend: http://10.0.0.20:3000
Production:
- Frontend: https://yourdomain.com
- Backend: https://api.yourdomain.com