Keep traffic local: Topology-aware routing and EndpointSlices made simple

Kubernetes networking can feel like a crowded festival: pods are musicians on separate stages (zones), services are the festival promoters trying to route fans (traffic) to the right stage, and the network wants to avoid sending fans across the whole venue when a closer stage is available. Two relatively recent pieces of the Kubernetes networking puzzle — EndpointSlices and Topology Aware Routing — work together to keep traffic local, reduce latency and cut cross-zone costs. This article explains what they do, why they matter, and how they behave in the real world.

Why locality matters

When a client talks to a Service, the packets may travel between availability zones (AZs) if the chosen backend pod lives in a different zone. That extra hop adds latency, consumes cross-zone bandwidth, and in cloud environments can have a direct cost. Preferring same-zone backends improves performance and reliability for multi-zone clusters, especially for chatty or latency-sensitive services. The Kubernetes Topology Aware Routing feature exists to encourage that behavior. (kubernetes.io)

The pieces: EndpointSlices and topology hints

EndpointSlices are the modern way Kubernetes represents the set of addresses (endpoints) backing a Service. They scale better than the older Endpoints object and carry richer metadata — including topology information and the new “hints” that suggest which zones an endpoint should serve. EndpointSlices are now the default mechanism for service discovery and are replacing the legacy Endpoints API. (kubernetes.io)

Topology Aware Routing (formerly called Topology Aware Hints) uses those hints: the EndpointSlice controller allocates endpoints to zones in proportion to the allocatable CPU of the nodes in each zone, and records the result as hints on the EndpointSlice objects. Cluster components such as kube-proxy (or other service proxies) can read those hints and prefer endpoints hinted for their own zone when routing traffic. (kubernetes.io)

How it actually works (at a high level)

The built-in heuristic is called “Auto.” It tries to distribute endpoints per zone proportionally; it’s useful when you have many backends per service and traffic originates roughly evenly across zones. (kubernetes.io)
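As a rough worked example of that proportional split, assume the proportion is based on allocatable CPU per zone, as the Kubernetes documentation describes. The zone names, core counts, and endpoint count below are made up for illustration, and the real controller also applies rounding and balance checks that this sketch leaves out.

```latex
% Expected number of endpoints hinted for zone z, out of N endpoints in total
\[
  \text{expected}_z \;=\; N \cdot \frac{\text{cpu}_z}{\sum_{z'} \text{cpu}_{z'}}
\]

% Hypothetical cluster: 8, 4 and 4 allocatable cores in zones a, b and c; N = 12
\[
  \text{expected}_a = 12 \cdot \tfrac{8}{16} = 6, \qquad
  \text{expected}_b = 12 \cdot \tfrac{4}{16} = 3, \qquad
  \text{expected}_c = 12 \cdot \tfrac{4}{16} = 3
\]
```

In this hypothetical cluster, zone a would be hinted twice as many endpoints as either of the other zones because it has twice the CPU capacity.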

Enabling topology-aware routing on a Service

Kubernetes provides a straightforward way to opt a Service into topology-aware routing: annotate the Service to ask the controller to populate hints for it. On Kubernetes 1.27 and later, set the annotation service.kubernetes.io/topology-mode to “Auto”. (Earlier releases used the service.kubernetes.io/topology-aware-hints annotation; the docs now direct you to the newer one.) Example:

apiVersion: v1
kind: Service
metadata:
  name: web
  annotations:
    # Opt this Service in to Topology Aware Routing
    service.kubernetes.io/topology-mode: "Auto"
spec:
  selector:
    app: web
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080

When the annotation is present and the cluster conditions are suitable, each endpoint in the EndpointSlices for this Service carries a hints field whose forZones list names the zones it should serve (for example, an entry with name: "zone-a"), and kube-proxy uses those hints to favor same-zone endpoints. (kubernetes.io)
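For orientation, a populated EndpointSlice for the web Service might look roughly like the sketch below. The object name, pod IP, node name, and zone name are hypothetical; the controller generates the real values, and you would normally only read these objects (for instance with kubectl get endpointslices -o yaml), not write them by hand.

```yaml
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: web-abc12                      # hypothetical generated name
  labels:
    kubernetes.io/service-name: web    # ties the slice to the Service
addressType: IPv4
ports:
  - protocol: TCP
    port: 8080                         # the Service's targetPort
endpoints:
  - addresses:
      - "10.1.2.3"                     # hypothetical pod IP
    conditions:
      ready: true
    nodeName: node-1                   # hypothetical node
    zone: zone-a                       # zone the pod actually runs in
    hints:
      forZones:
        - name: "zone-a"               # zone this endpoint should serve
```

The zone field describes where the endpoint lives, while hints.forZones describes which zone's clients should be sent to it; the two usually match but do not have to.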

Safeguards and common caveats

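Topology-aware routing is helpful, but it’s not a magical switch that fixes every topology problem. The control plane and kube-proxy apply several safeguards and fall back to cluster-wide routing when conditions aren’t safe. According to the Kubernetes documentation, that happens when a Service has fewer endpoints than there are zones, when a balanced allocation across zones cannot be achieved, when one or more nodes lack the topology.kubernetes.io/zone label, when any endpoint is missing a zone hint, or when kube-proxy finds no endpoint hinted for the zone it runs in. (kubernetes.io)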
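One precondition worth checking first is that every node reports its zone, since hints cannot be computed without that information. The fragment below is a sketch of the relevant part of a Node object; the node, zone, and region names are hypothetical, and on managed clusters the cloud provider integration normally sets these labels for you.

```yaml
apiVersion: v1
kind: Node
metadata:
  name: node-1                               # hypothetical node name
  labels:
    topology.kubernetes.io/zone: zone-a      # zone label the hints depend on
    topology.kubernetes.io/region: region-1  # hypothetical region
```

Listing nodes with kubectl get nodes -L topology.kubernetes.io/zone is a quick way to spot nodes where the label is missing.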

Real-world operators have also observed interactions between topology hints and higher-level controllers or ingress implementations. For example, some ingress setups have reported 503s when topology-aware hints were present but the controller / dataplane didn’t have matching endpoints in a zone, or when the controller and ingress dataplane had mismatched assumptions about which endpoints to include. That’s a practical reminder that your ingress/load-balancer layer, service proxies, and EndpointSlice consumers all need to be compatible with hints to get the intended behavior. (github.com)

When topology-aware routing shines — and when it doesn’t

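It works best when incoming traffic is spread roughly evenly across zones and the Service has enough endpoints in every zone; the Kubernetes documentation suggests three or more per zone.

It’s less useful when most traffic originates from a subset of zones, when a Service has only a handful of endpoints, or when autoscaling adds and removes pods without regard to zone. Hints are also not used for a Service whose internalTrafficPolicy is set to Local. (kubernetes.io)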

A simple analogy

Imagine a food truck festival where each food truck can serve certain areas. EndpointSlices are the festival map that lists which trucks are where; topology hints are the little signs that say “best served to Zone A.” If the festival organizers are confident that there are enough trucks in each zone, they hand out signs so people nearby head to the nearby truck. But if a zone has only one truck, or the map is outdated, the organizers stop using signs and tell people to choose any truck — safer than sending a crowd to a non-existent truck. (kubernetes.io)

The bigger picture

EndpointSlices are now the canonical representation of service backends and offer richer metadata (hints, conditions, terminating endpoints) than the legacy Endpoints API. Kubernetes is continuing the transition away from Endpoints toward EndpointSlices as the primary discovery mechanism, which unlocks features like topology-aware routing. That transition affects many parts of the ecosystem — controllers, proxies, and cloud integrations — so expect evolving behavior during upgrades. (kubernetes.io)

Final note

Topology-aware routing is a pragmatic, Kubernetes-native way to reduce cross-zone chatter and improve locality without re-architecting your services. It relies on EndpointSlices, correct cluster topology metadata, and a dataplane that respects the hints. Like any orchestration feature, it pays to understand the safeguards and failure modes so you don’t get surprised when the cluster falls back to global routing during transitions or edge cases. (kubernetes.io)

References: