What tools are used for microservices orchestration?
🟢 Junior Level
Orchestration is managing the lifecycle of microservices: starting, stopping, scaling, updating.
Kubernetes is the de facto standard thanks to: openness (CNCF), support by all cloud providers (GKE, EKS, AKS), huge community, standard API.
Main tools
| Tool | What it does |
|---|---|
| Kubernetes (K8s) | Container orchestration (de facto standard) |
| Docker Compose | Local orchestration for development |
| Docker Swarm | Simple container orchestration |
| Apache Mesos | Cluster orchestration |
| HashiCorp Nomad | Simple Kubernetes alternative |
Docker Compose (for development)
```yaml
version: '3.8'
services:
  api-gateway:
    build: ./gateway
    ports:
      - "8080:8080"
    depends_on:
      - user-service
      - order-service
  user-service:
    build: ./user-service
    environment:
      - DB_HOST=postgres
      - KAFKA_BROKERS=kafka:9092
  order-service:
    build: ./order-service
    environment:
      - DB_HOST=postgres
      - KAFKA_BROKERS=kafka:9092
  postgres:
    image: postgres:15
    # Image versions (cp-kafka:7.5.0, postgres:15) are examples at time of writing.
    # In production, use the latest stable versions.
    environment:
      POSTGRES_PASSWORD: secret
  kafka:
    image: confluentinc/cp-kafka:7.5.0
    ports:
      - "9092:9092"
```
Kubernetes: basic Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
    spec:
      containers:
        - name: user-service
          image: registry.example.com/user-service:1.2.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
```
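For slow-starting services (a Spring Boot app behind `/actuator` is a typical case) a `startupProbe` is often a better fit than a large `initialDelaySeconds`; a minimal sketch under that assumption, with illustrative values:

```yaml
# Goes in the same container spec as the probes above.
# While the startupProbe runs, liveness and readiness checks are held off,
# so a slow JVM start does not trigger restarts.
startupProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  periodSeconds: 10
  failureThreshold: 30   # allows up to 30 * 10s = 5 min to start
```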
When NOT to use Kubernetes
- Small team (1-3 devs): the operational complexity is unjustified
- Simple applications: Docker Compose is sufficient
- Tight budget: a highly available K8s cluster needs at least 3 control-plane nodes
🟡 Middle Level
Kubernetes: main resources
```yaml
# Service: network access to pods
apiVersion: v1
kind: Service
metadata:
  name: user-service
spec:
  selector:
    app: user-service
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP  # internal access
---
# Ingress: external access via API Gateway
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-gateway
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /users
            pathType: Prefix
            backend:
              service:
                name: user-service
                port:
                  number: 80
          - path: /orders
            pathType: Prefix
            backend:
              service:
                name: order-service
                port:
                  number: 80
```
Helm: package manager for K8s
```yaml
# Chart.yaml
apiVersion: v2
name: microservices-stack
version: 1.0.0
dependencies:
  # Dependencies without a repository field are assumed to be local charts
  # (vendored in charts/ or referenced via a file:// repository).
  - name: user-service
    version: 1.0.0
  - name: order-service
    version: 1.0.0
  - name: kafka
    version: 22.0.0
    repository: https://charts.bitnami.com/bitnami
```
Rolling Update: update without downtime
```bash
# Update version
kubectl set image deployment/user-service \
  user-service=registry.example.com/user-service:1.3.0

# Rollback on problems
kubectl rollout undo deployment/user-service

# Check status
kubectl rollout status deployment/user-service
```
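How "without downtime" is enforced is a property of the Deployment's update strategy; a minimal sketch with illustrative values:

```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1         # at most 1 pod above the desired replica count
      maxUnavailable: 0   # never drop below the desired replica count
```

With `maxUnavailable: 0`, an old pod is only terminated after a replacement pod passes its readiness probe.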
Service Mesh: Istio
```yaml
# VirtualService: traffic management
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: user-service
spec:
  hosts:
    - user-service
  http:
    - route:
        - destination:
            host: user-service
            subset: v1
          weight: 90
        - destination:
            host: user-service
            subset: v2
          weight: 10  # Canary deployment: 10% of traffic to v2
---
# DestinationRule: subsets
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: user-service
spec:
  host: user-service
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2
```
Tool comparison
| Tool | Complexity | Scale | Use Case |
|---|---|---|---|
| Docker Compose | Low | 1 host | Local development |
| Docker Swarm | Medium | Multiple hosts | Small clusters |
| Kubernetes | High | Any | Production (de facto standard) |
| Nomad | Medium | Any | Simple K8s alternative |
| OpenShift | High | Enterprise | Kubernetes + additional tooling |
🔴 Senior Level
Orchestration architecture
```
┌──────────────────────────────────────────────────┐
│              API Gateway / Ingress               │
│             (Nginx, Traefik, Istio)              │
└────────────────────────┬─────────────────────────┘
                         │
         ┌───────────────┼────────────────┐
         │               │                │
 ┌───────▼──────┐ ┌──────▼───────┐ ┌──────▼───────┐
 │ User Service │ │Order Service │ │Notify Service│
 │ (3 replicas) │ │ (5 replicas) │ │ (2 replicas) │
 └───────┬──────┘ └──────┬───────┘ └──────┬───────┘
         │               │                │
 ┌───────▼──────┐ ┌──────▼───────┐ ┌──────▼───────┐
 │  PostgreSQL  │ │  PostgreSQL  │ │    Kafka     │
 │  (Primary +  │ │  (Primary +  │ │   Cluster    │
 │   Replica)   │ │   Replica)   │ │              │
 └──────────────┘ └──────────────┘ └──────────────┘
```
HPA: Horizontal Pod Autoscaler
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: user-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: user-service
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "1000"
  behavior:
    # stabilizationWindowSeconds: cooldown period after scaling.
    # scaleUp: 60s for fast scale-up (reacts to spikes).
    # scaleDown: 300s for slow scale-down (avoids flapping).
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Pods
          value: 4
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300  # don't scale down aggressively
      policies:
        - type: Pods
          value: 1
          periodSeconds: 120
```
PodDisruptionBudget: keeping pods available during voluntary disruptions (node drains, upgrades)
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: user-service-pdb
spec:
  minAvailable: 2  # minimum 2 pods always running
  selector:
    matchLabels:
      app: user-service
```
GitOps: ArgoCD / Flux
```yaml
# ArgoCD Application
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: user-service
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/company/k8s-manifests
    targetRevision: HEAD
    path: k8s/user-service
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
```
Kustomize: environment-specific configuration management
```yaml
# kustomization.yaml (base)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
  - ingress.yaml
```

```yaml
# overlays/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
patches:
  - target:
      kind: Deployment
      name: user-service
    patch: |-
      - op: replace
        path: /spec/replicas
        value: 5
```
Service Mesh Patterns
```yaml
# Fault Injection: resilience testing
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: user-service-fault
spec:
  hosts:
    - user-service
  http:
    - fault:
        delay:
          percentage:
            value: 10
          fixedDelay: 5s
        abort:
          percentage:
            value: 5
          httpStatus: 503
      route:
        - destination:
            host: user-service
```

```yaml
# Rate Limiting
apiVersion: networking.istio.io/v1beta1
kind: EnvoyFilter
metadata:
  name: rate-limit
spec:
  configPatches:
    - applyTo: HTTP_FILTER
      match:
        context: SIDECAR_INBOUND
      patch:
        operation: INSERT_BEFORE
        value:
          name: envoy.filters.http.local_ratelimit
          # typed_config with the token-bucket settings omitted for brevity
```
Production Checklist
- ✅ Helm Charts for packaging
- ✅ GitOps (ArgoCD/Flux) for deployment
- ✅ HPA for autoscaling
- ✅ PDB to keep pods available during voluntary disruptions
- ✅ Pod Anti-Affinity to spread replicas across nodes
- ✅ Resource Limits to prevent noisy neighbors
- ✅ Liveness/Readiness/Startup Probes
- ✅ Service Mesh (Istio/Linkerd) for observability
- ✅ Canary/Blue-Green deployments
- ✅ Network Policies for isolation
- ✅ Secrets Management (Vault, Sealed Secrets)
- ✅ Pod Security Standards
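Pod Anti-Affinity is the one checklist item not shown elsewhere in this note; a hedged sketch that spreads user-service replicas across nodes (it goes under the pod template's spec in the Deployment above):

```yaml
affinity:
  podAntiAffinity:
    # "preferred" spreads replicas across nodes on a best-effort basis;
    # requiredDuringSchedulingIgnoredDuringExecution makes it a hard rule.
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: user-service
          topologyKey: kubernetes.io/hostname
```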
🎯 Interview Cheat Sheet
Must know:
- Kubernetes: de facto standard for production orchestration (CNCF, all cloud providers)
- Docker Compose: local development, not production
- Main K8s resources: Deployment, Service, Ingress, HPA, PDB
- HPA: automatic scaling by CPU/memory/custom metrics
- Helm: package manager for K8s, dependency management
- Rolling Update: zero-downtime updates, rollback with one command
- Service Mesh (Istio): canary deployments, fault injection, rate limiting
- GitOps (ArgoCD/Flux): automatic deployment from a git repository
- Do NOT use K8s for small teams (1-3 devs) or simple applications
Frequent follow-up questions:
- HPA stabilizationWindowSeconds? Cooldown period after a scaling event. scaleUp: 60s (fast reaction), scaleDown: 300s (slow, avoids flapping).
- Why PodDisruptionBudget? Guarantees a minimum number of running pods during maintenance, e.g. minAvailable: 2.
- Istio fault injection? Intentionally adds delays/aborts for resilience testing (chaos engineering).
- GitOps advantages? Audit trail (git history), rollback = git revert, self-healing (ArgoCD re-syncs drift).
Red flags (NOT to say):
- "Docker Compose for production" – no, it is for development only
- "K8s is needed for every project" – no, the operational complexity is unjustified for small teams
- "An HPA target of 95% CPU is efficient" – no, pods won't scale out in time under load spikes
- "Istio replaces Kubernetes" – no, Istio runs on top of K8s (service mesh)
Related topics:
- [[12. How to implement horizontal scaling of microservices]]
- [[7. What is Service Discovery and why is it needed]]
- [[9. What is API Gateway and what problems does it solve]]
- [[21. How to monitor a distributed microservices system]]
- [[17. How to ensure fault tolerance of microservices]]