Question 7 · Section 17

What is Service Discovery and why is it needed?

Junior Level

Service Discovery is a mechanism that allows services to find each other in a dynamic environment where IP addresses constantly change.

In a containerized environment (Docker, Kubernetes), pods and containers are created and destroyed dynamically, and each new container gets a new IP address, so hardcoded addresses stop working as soon as an instance is replaced.

Problem: in a microservice system, instances start and stop all the time (especially in Kubernetes), and each restart can bring a new IP address. How does Service A find Service B?

Solution: Service Discovery — like a “phone book” for services.

Service A -> Service Discovery: "Where is service B?"
Service Discovery -> Service A: "Service B is at 10.0.0.5:8080"
Service A -> 10.0.0.5:8080 -> Service B

Middle Level

Types of Service Discovery

1. Client-side discovery: the client queries the Service Registry and picks the instance to call itself.

Service A -> Service Registry -> gets IP -> calls Service B

2. Server-side discovery: the client calls a load balancer, which queries the Service Registry and forwards the request.

Service A -> Load Balancer -> Service Registry -> Service B
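
To make the difference between the two flows concrete, here is a minimal Python sketch. The in-memory `REGISTRY`, the `ClientSideResolver` and `LoadBalancer` classes, and the round-robin policy are all assumptions made for illustration; a real setup would talk to Consul, Eureka or a similar registry over the network.

```python
import itertools
from typing import Dict, List

# Hypothetical in-memory registry: service name -> list of "host:port" addresses.
REGISTRY: Dict[str, List[str]] = {
    "service-b": ["10.0.0.5:8080", "10.0.0.6:8080"],
}


class ClientSideResolver:
    """Client-side discovery: the caller queries the registry and balances itself."""

    def __init__(self, registry: Dict[str, List[str]]) -> None:
        self._registry = registry
        self._counters: Dict[str, itertools.count] = {}

    def resolve(self, service: str) -> str:
        instances = self._registry.get(service, [])
        if not instances:
            raise LookupError(f"no instances registered for {service!r}")
        counter = self._counters.setdefault(service, itertools.count())
        # Round-robin over the instances returned by the registry.
        return instances[next(counter) % len(instances)]


class LoadBalancer:
    """Server-side discovery: the balancer owns the registry lookup."""

    def __init__(self, registry: Dict[str, List[str]]) -> None:
        self._resolver = ClientSideResolver(registry)

    def route(self, service: str) -> str:
        # The caller only knows the balancer; the instance choice happens here.
        return self._resolver.resolve(service)


# Client-side: Service A picks the instance itself.
client = ClientSideResolver(REGISTRY)
client_side_targets = [client.resolve("service-b") for _ in range(3)]

# Server-side: Service A calls one fixed entry point and never sees the registry.
lb = LoadBalancer(REGISTRY)
server_side_target = lb.route("service-b")
```

In the client-side variant every caller needs the lookup and balancing logic; in the server-side variant that logic lives in one place and callers stay simple.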

Components

Service Registry:

  • Stores a list of services and their addresses
  • Examples: Consul, Eureka, ZooKeeper, etcd

Service Registration:

  • A service registers itself with the Registry on startup
  • On shutdown, it deregisters
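
The register-on-startup, deregister-on-shutdown lifecycle can be sketched as follows. This is a toy in-memory registry written for illustration only: `ServiceRegistry`, `registered` and the addresses are hypothetical names, not the API of Consul, Eureka, ZooKeeper or etcd.

```python
from contextlib import contextmanager
from typing import Dict, Iterator, List, Set


class ServiceRegistry:
    """Toy in-memory registry: service name -> set of "host:port" addresses."""

    def __init__(self) -> None:
        self._services: Dict[str, Set[str]] = {}

    def register(self, name: str, address: str) -> None:
        self._services.setdefault(name, set()).add(address)

    def deregister(self, name: str, address: str) -> None:
        self._services.get(name, set()).discard(address)

    def lookup(self, name: str) -> List[str]:
        return sorted(self._services.get(name, set()))


@contextmanager
def registered(registry: ServiceRegistry, name: str, address: str) -> Iterator[None]:
    """Register on startup and deregister on a clean shutdown."""
    registry.register(name, address)
    try:
        yield
    finally:
        registry.deregister(name, address)


registry = ServiceRegistry()
with registered(registry, "service-b", "10.0.0.5:8080"):
    during = registry.lookup("service-b")  # the instance is visible while running
after = registry.lookup("service-b")       # and gone after a clean shutdown
```

The `finally` block only runs on an orderly exit. If the process is killed outright, the entry stays in the registry, which is exactly the stale-entry problem described under Common mistakes.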

Common mistakes

  1. Stale entries:

Service crashed but didn’t have time to deregister
Registry still returns the old IP
Solution: health checks + TTL

TTL (Time-To-Live): a record in the registry “expires” after N seconds. If the service doesn’t refresh it in time, it’s automatically removed.

When Service Discovery is NOT needed

  • 2-3 services with static IPs: DNS is simpler
  • Monolith on the path to microservices: no dynamic scaling yet
  • Single-server deployment: everything on one machine

Senior Level

Architectural Trade-offs

| Client-side                      | Server-side             |
| -------------------------------- | ----------------------- |
| More control                     | Simpler for clients     |
| Need to implement load balancing | LB handles it           |
| Depends on Registry client       | Clients are independent |

Production Experience

Kubernetes:

K8s has built-in Service Discovery:

  • Service -> DNS name
  • kube-proxy -> load balancing
  • Endpoints -> list of Pod IPs

Service A -> my-service.default.svc.cluster.local -> kube-proxy -> Pod B

Best Practices

  ✅ Health checks for each service
  ✅ TTL for entries
  ✅ Client-side caching with expiry
  ✅ Graceful shutdown -> deregister

  ❌ Without health checks
  ❌ Long TTL (stale data)
  ❌ Without fallback when Registry is unavailable
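
The TTL mechanism behind the stale-entry fix can be illustrated with a small Python sketch. Everything here is an assumption made for the example: the `TtlRegistry` class, the `heartbeat` method name, the 10-second TTL and the injectable clock (used so expiry can be shown without sleeping). Real registries implement this differently.

```python
import time
from typing import Callable, Dict, List, Tuple


class TtlRegistry:
    """Registry whose entries expire unless the service keeps refreshing them."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be shown without sleeping
        # (service name, address) -> time of the last heartbeat
        self._last_seen: Dict[Tuple[str, str], float] = {}

    def heartbeat(self, name: str, address: str) -> None:
        """Register the instance or refresh its TTL."""
        self._last_seen[(name, address)] = self._clock()

    def lookup(self, name: str) -> List[str]:
        """Return only instances whose last heartbeat is within the TTL."""
        now = self._clock()
        # Drop expired entries so a crashed instance disappears on its own.
        expired = [key for key, seen in self._last_seen.items()
                   if now - seen > self._ttl]
        for key in expired:
            del self._last_seen[key]
        return sorted(addr for (svc, addr) in self._last_seen if svc == name)


# Simulated clock: a mutable "now" instead of real time.
now = [0.0]
registry = TtlRegistry(ttl_seconds=10, clock=lambda: now[0])

registry.heartbeat("service-b", "10.0.0.5:8080")
registry.heartbeat("service-b", "10.0.0.6:8080")

now[0] = 8.0
registry.heartbeat("service-b", "10.0.0.5:8080")  # .5 is alive and refreshes
# .6 has crashed and sends nothing more

now[0] = 12.0
alive = registry.lookup("service-b")  # only the instance that kept refreshing
```

The TTL is a trade-off: a long value keeps dead instances visible for longer, while a very short one means more heartbeat traffic and a higher risk of dropping a healthy instance after one missed refresh.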


Interview Cheat Sheet

Must know:

  • Service Discovery — “phone book” for microservices in a dynamic environment
  • Two types: client-side (the client queries the registry itself) and server-side (a load balancer queries it)
  • Service Registry stores a list of services and their addresses (Consul, Eureka, etcd)
  • Health checks + TTL solve the stale entries problem
  • Built into Kubernetes: Service -> DNS name -> kube-proxy -> Pod
  • Graceful shutdown -> deregister from Registry
  • NOT needed for 2-3 services with static IPs

Common follow-up questions:

  • What are stale entries? A service crashed, but the Registry entry is still alive — solution: health checks + TTL.
  • Client-side vs server-side? Client-side = more control, server-side = simpler for clients.
  • Why TTL? A record “expires” after N seconds; if the service doesn’t refresh it in time, it is removed automatically.
  • How does K8s Service Discovery work? Service -> DNS name -> kube-proxy -> Endpoints (list of Pod IPs).
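
From the caller's side, Kubernetes discovery is just a DNS lookup on the Service name. The sketch below assumes the default cluster domain `cluster.local`; the helper names (`service_fqdn`, `system_resolver`, `discover`) and the IP `10.96.0.12` are made up for illustration. For a regular ClusterIP Service the name typically resolves to a single virtual IP that kube-proxy then routes to a Pod.

```python
import socket
from typing import Callable, List


def service_fqdn(service: str, namespace: str = "default",
                 cluster_domain: str = "cluster.local") -> str:
    """Build the cluster-internal DNS name of a Kubernetes Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def system_resolver(host: str, port: int) -> List[str]:
    """Resolve a host name through the operating system's DNS resolver."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def discover(service: str, port: int, namespace: str = "default",
             resolver: Callable[[str, int], List[str]] = system_resolver) -> List[str]:
    """Return the IPs the Service name resolves to."""
    return resolver(service_fqdn(service, namespace), port)


# Outside a cluster the name will not resolve, so this demo injects a fake resolver.
def fake_resolver(host: str, port: int) -> List[str]:
    return ["10.96.0.12"] if host == "my-service.default.svc.cluster.local" else []


ips = discover("my-service", 8080, resolver=fake_resolver)
```

Inside a Pod, calling `discover("my-service", 8080)` without the `resolver` argument would go through the cluster DNS instead of the fake.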

Red flags (DO NOT say):

  • “Service Discovery is always needed” — no, for 2-3 services DNS is enough
  • “TTL is not important, services are stable” — in a containerized environment IPs change constantly
  • “Client-side is not needed, only server-side”: client-side discovery is widely used, for example Netflix Ribbon and Spring Cloud LoadBalancer
  • “Registry = single point of failure, so not needed” — you need fallback and clustering
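
The fallback named in the last red flag, combined with client-side caching with expiry, might look like the sketch below. `CachingDiscoveryClient`, the `fetch` callable, the 5-second cache TTL and the simulated outage are all assumptions for illustration; a real client would also cover timeouts and other error types.

```python
import time
from typing import Callable, Dict, List, Tuple


class CachingDiscoveryClient:
    """Caches registry answers briefly and falls back to them on failure."""

    def __init__(self, fetch: Callable[[str], List[str]], cache_ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch  # hypothetical call into the registry
        self._cache_ttl = cache_ttl
        self._clock = clock
        self._cache: Dict[str, Tuple[float, List[str]]] = {}

    def lookup(self, service: str) -> List[str]:
        now = self._clock()
        cached = self._cache.get(service)
        if cached and now - cached[0] < self._cache_ttl:
            return cached[1]  # fresh enough, skip the registry round trip
        try:
            instances = self._fetch(service)
        except ConnectionError:
            if cached:
                # Registry is down: serve the last known answer rather than fail.
                return cached[1]
            raise
        self._cache[service] = (now, instances)
        return instances


# Simulated registry that can be switched off, plus a manual clock.
registry_up = [True]
now = [0.0]


def fetch(service: str) -> List[str]:
    if not registry_up[0]:
        raise ConnectionError("registry unavailable")
    return ["10.0.0.5:8080"]


client = CachingDiscoveryClient(fetch, cache_ttl=5, clock=lambda: now[0])
first = client.lookup("service-b")     # goes to the registry
registry_up[0] = False
now[0] = 60.0                          # the cache entry has expired
fallback = client.lookup("service-b")  # registry down, last known answer is reused
```

Serving an expired answer can point callers at instances that no longer exist, so this kind of fallback is usually paired with retries or health checks on the caller's side.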

Related topics:

  • [[8. What is the difference between client-side and server-side discovery]]
  • [[9. What is API Gateway and what problems does it solve]]
  • [[12. How to implement horizontal scaling of microservices]]
  • [[26. What tools are used for microservice orchestration]]
  • [[15. How to organize communication between microservices]]