Keep traffic local: Topology-aware routing and EndpointSlices made simple
Kubernetes networking can feel like a crowded festival: pods are musicians on separate stages (zones), services are the festival promoters trying to route fans (traffic) to the right stage, and the network wants to avoid sending fans across the whole venue when a closer stage is available. Two relatively recent pieces of the Kubernetes networking puzzle — EndpointSlices and Topology Aware Routing — work together to keep traffic local, reduce latency and cut cross-zone costs. This article explains what they do, why they matter, and how they behave in the real world.
Why locality matters
When a client talks to a Service, the packets may travel between availability zones (AZs) if the chosen backend pod lives in a different zone. That extra hop adds latency, consumes cross-zone bandwidth, and in cloud environments can have a direct cost. Preferring same-zone backends improves performance and reliability for multi-zone clusters, especially for chatty or latency-sensitive services. The Kubernetes Topology Aware Routing feature exists to encourage that behavior. (kubernetes.io)
The pieces: EndpointSlices and topology hints
EndpointSlices are the modern way Kubernetes represents the set of addresses (endpoints) backing a Service. They scale better than the older Endpoints object and carry richer metadata — including topology information and the new “hints” that suggest which zones an endpoint should serve. EndpointSlices are now the default mechanism for service discovery and are replacing the legacy Endpoints API. (kubernetes.io)
Topology Aware Routing (formerly called Topology Aware Hints) uses those hints: the EndpointSlice controller calculates a proportional allocation of endpoints across zones and sets the hints on EndpointSlice objects. Cluster components such as kube-proxy (or other service proxies) can read those hints and prefer local endpoints when routing traffic. (kubernetes.io)
How it actually works (at a high level)
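- The EndpointSlice controller looks at where pods run and how much capacity each zone has (it currently uses the allocatable CPU of the nodes in each zone as the measure). It then populates a hints field on each endpoint to allocate endpoints across zones in proportion to that capacity. For example, if one zone has 2 allocatable CPU cores and another has 1, the controller allocates twice as many endpoints to the first zone. (kubernetes.io)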
- Each node’s kube-proxy (or another dataplane consumer of EndpointSlices) can filter the endpoints it will route to using those hints, preferring endpoints assigned to the node’s zone. The result: fewer cross-zone hops when the hints are available and usable. (kubernetes.io)
The built-in heuristic is called “Auto.” It tries to distribute endpoints per zone proportionally; it’s useful when you have many backends per service and traffic originates roughly evenly across zones. (kubernetes.io)
Enabling topology-aware routing on a Service
Opting a Service into topology-aware routing takes a single annotation, which asks the controller to populate hints for that Service. On current clusters, set the annotation service.kubernetes.io/topology-mode to “Auto”. (Before Kubernetes 1.27 the annotation was service.kubernetes.io/topology-aware-hints; the docs now direct you to the newer one.) Example:
apiVersion: v1
kind: Service
metadata:
  name: web
  annotations:
    service.kubernetes.io/topology-mode: "Auto"
spec:
  selector:
    app: web
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
When the annotation is present and the cluster conditions are suitable, each endpoint in this Service's EndpointSlices will carry a hints.forZones list with an entry such as name: "zone-a", and kube-proxy will use those hints to favor same-zone endpoints. (kubernetes.io)
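To make the hints concrete, here is a sketch of what an EndpointSlice for the web Service might look like once the controller has populated them. The field names follow the discovery.k8s.io/v1 API; the slice name, IP addresses, node names, and zone names are hypothetical placeholders, and a real cluster will generate its own values.

```yaml
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: web-abc12                      # hypothetical; the controller generates the suffix
  labels:
    kubernetes.io/service-name: web    # ties the slice to the Service above
addressType: IPv4
ports:
  - name: http
    protocol: TCP
    port: 8080
endpoints:
  - addresses:
      - "10.1.2.3"                     # hypothetical pod IP
    conditions:
      ready: true
    nodeName: node-1                   # hypothetical node name
    zone: zone-a                       # where the pod actually runs
    hints:
      forZones:
        - name: "zone-a"               # zone whose clients should use this endpoint
  - addresses:
      - "10.1.3.4"                     # hypothetical pod IP
    conditions:
      ready: true
    nodeName: node-2
    zone: zone-b
    hints:
      forZones:
        - name: "zone-b"
```

Note that zone records where an endpoint lives, while hints.forZones records which zone's traffic it is meant to serve. The two usually match, but the controller may assign an endpoint to a different zone to keep the allocation proportional.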
Safeguards and common caveats
Topology-aware routing is helpful, but it’s not a magical switch that fixes every topology problem. The control plane and kube-proxy apply multiple safeguards and will fall back to cluster-wide routing if conditions aren’t safe:
- Not enough endpoints: If a Service has fewer endpoints than there are zones in the cluster, the controller won’t assign any hints. Even above that minimum, the heuristic needs enough endpoints to produce a balanced distribution, so Services with only one or two pods per zone often see little benefit. (kubernetes.io)
- Missing zone labels or resource reporting: If nodes lack topology.kubernetes.io/zone labels or don’t report allocatable CPU, the controller won’t generate hints and the dataplane won’t filter. That makes correct node labeling and healthy node resource reporting essential. (kubernetes.io)
- Transition states: If some EndpointSlices are missing hints (e.g., during upgrades or partial reconciliation), kube-proxy may treat the service as “in transition” and route to all endpoints to avoid accidental blackholing. (kubernetes.io)
- Specific features don’t mix: Topology-aware hints are not used when internalTrafficPolicy is set to Local on a Service. You can use both features in the same cluster, but on different Services, not on the same one. (kubernetes.io)
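The zone-label and resource-reporting safeguard in the list above depends on two pieces of Node data. As a rough illustration, this is an excerpt of the parts of a Node object that matter for hint generation; the node name, zone name, and resource values are hypothetical, and a real Node carries many more fields.

```yaml
apiVersion: v1
kind: Node
metadata:
  name: node-1                           # hypothetical node name
  labels:
    topology.kubernetes.io/zone: zone-a  # must be present for hints to be generated
status:
  allocatable:
    cpu: "4"                             # allocatable CPU feeds the per-zone proportion
    memory: 16Gi
```

A quick way to check the labels across a cluster is kubectl get nodes -L topology.kubernetes.io/zone, which adds a column showing each node's zone; an empty value there is a likely reason for missing hints.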
Real-world operators have also observed interactions between topology hints and higher-level controllers or ingress implementations. For example, users of some ingress setups have reported 503s when topology-aware hints were present but the ingress controller or dataplane had no matching endpoints in a zone, or when the controller and the ingress dataplane made different assumptions about which endpoints to include. That’s a practical reminder that your ingress/load-balancer layer, service proxies, and EndpointSlice consumers all need to be compatible with hints to get the intended behavior. (github.com)
When topology-aware routing shines — and when it doesn’t
It works best when:
- Traffic originates roughly evenly across zones (no single-zone hot-spot).
- Services have a significant number of endpoints (the docs recommend ~3+ endpoints per zone to make the heuristic effective).
- Node topology labels and resource reporting are intact, and your dataplane honors EndpointSlice hints. (kubernetes.io)
It’s less useful when:
- A service has very few pods or very uneven traffic origin.
- You rely on a dataplane implementation that doesn’t consume EndpointSlice hints.
- Your control plane and data plane are out of sync during upgrades, causing temporary fallback behavior.
A simple analogy
Imagine a food truck festival where each food truck can serve certain areas. EndpointSlices are the festival map that lists which trucks are where; topology hints are the little signs that say “best served to Zone A.” If the festival organizers are confident that there are enough trucks in each zone, they hand out signs so people nearby head to the nearby truck. But if a zone has only one truck, or the map is outdated, the organizers stop using signs and tell people to choose any truck — safer than sending a crowd to a non-existent truck. (kubernetes.io)
The bigger picture
EndpointSlices are now the canonical representation of service backends and offer richer metadata (hints, conditions, terminating endpoints) than the legacy Endpoints API. Kubernetes is continuing the transition away from Endpoints toward EndpointSlices as the primary discovery mechanism, which unlocks features like topology-aware routing. That transition affects many parts of the ecosystem — controllers, proxies, and cloud integrations — so expect evolving behavior during upgrades. (kubernetes.io)
Final note
Topology-aware routing is a pragmatic, Kubernetes-native way to reduce cross-zone chatter and improve locality without re-architecting your services. It relies on EndpointSlices, correct cluster topology metadata, and a dataplane that respects the hints. Like any orchestration feature, it pays to understand the safeguards and failure modes so you don’t get surprised when the cluster falls back to global routing during transitions or edge cases. (kubernetes.io)
References:
- Kubernetes: Topology Aware Routing (concepts doc). (kubernetes.io)
- Kubernetes: EndpointSlices (concepts doc). (kubernetes.io)
- Kubernetes blog: continuing the transition from Endpoints to EndpointSlices. (v1-34.docs.kubernetes.io)
- Ingress NGINX issue discussing 503s and topology-aware hints interactions. (github.com)