We expanded from one Kubernetes cluster to four across two regions. The traffic-routing layer was the hardest piece. Here's what we tried, what worked, and what we'd do again.

Multi-Cluster Traffic Routing Strategies: Deep Dive

A year ago we ran one Kubernetes cluster — a comfortable mid-size EKS in us-east-1, hosting our entire production stack. Today we run four: two in us-east-1 (one for general workloads, one isolated for our heaviest data-intensive services), and two in us-west-2 (the same split, mirrored). The motivation was a mix of compliance, regional latency, and a hard separation we needed for one specific workload.

The compute side of going multi-cluster is well-trodden territory. The hard part, the part that took most of our planning, was traffic routing. This post is about that layer specifically — what we evaluated, what we shipped, what surprised us.

The shape of the problem

A request enters the system. It's destined for some service. We need to:

Route it to a cluster (which one?)
Route within that cluster to the right service
Survive a cluster-level failure without dropping the request

For a single cluster, "route to cluster" is trivial. With multiple clusters, you've added a layer of decision-making. Where that decision lives — DNS, an external load balancer, a service mesh, an application — defines your operational model.

Approach 1: DNS-based routing (we considered, declined)

Simplest model: each cluster has its own ingress and its own subdomain. Route53 with weighted records or latency-based routing decides which cluster the user hits.

Pros: zero new infrastructure. Easy to reason about. Survives a cluster failure cleanly (remove records, traffic shifts).

Cons: TTL granularity. Even at 30s TTL (we wouldn't want lower), failover takes 30s+ to propagate. Per-request routing decisions aren't possible — DNS is sticky per resolution. Cross-cluster failover during a partial degradation is awkward.

We didn't choose this for our user-facing traffic but kept it for some internal admin-tier traffic where the simplicity wins.
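To make the TTL limitation concrete, here is a toy Python sketch of a caching resolver. It is an illustration only, not a real DNS client: the class name, the cluster labels, and the fake clock passed in as `now` are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CachingResolver:
    """Toy DNS resolver cache: answers are reused until their TTL expires."""
    ttl_seconds: int
    records: dict  # authoritative view: hostname -> target (hypothetical labels)
    _cache: dict = field(default_factory=dict)  # hostname -> (target, expires_at)

    def resolve(self, hostname: str, now: float) -> str:
        cached = self._cache.get(hostname)
        if cached and now < cached[1]:
            # Still within the TTL: a change at the authority is invisible here.
            return cached[0]
        target = self.records[hostname]
        self._cache[hostname] = (target, now + self.ttl_seconds)
        return target


resolver = CachingResolver(ttl_seconds=30, records={"api.example.com": "cluster-east"})
first = resolver.resolve("api.example.com", now=0)

# The record is repointed at t=5, but the cached answer lives until t=30.
resolver.records["api.example.com"] = "cluster-west"
during_ttl = resolver.resolve("api.example.com", now=20)
after_ttl = resolver.resolve("api.example.com", now=31)
```

The client keeps hitting the old cluster until its cached answer expires, which is why the TTL sets a floor on DNS failover time.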

Approach 2: External load balancer with cluster targets

Put an L7 load balancer (AWS ALB or GCP HTTP LB) in front of all clusters, with the clusters as backends. The load balancer decides which cluster gets the request based on health checks and configured rules.

For us, this meant a single ALB in us-east-1 with target groups pointing at NLBs in front of each cluster. Cross-region targets are possible but trickier: they typically require Global Accelerator or a separate routing layer.

Pros: per-request routing. Failover within seconds of a cluster health change (about 10s in our setup). Familiar AWS-native model.

Cons: ALB doesn't natively span regions. We'd need either Global Accelerator (works but adds cost and a layer) or duplicate ALBs with DNS as the cross-region fallback (back to DNS).

We use this model for routing within a region (one ALB → multi-cluster backends in same region). Cross-region failover sits at DNS.
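The per-request behaviour of a regional load balancer can be sketched in a few lines of Python. This is a simplified model, assuming plain round-robin and a boolean health flag per cluster backend. `RegionalBalancer` and the cluster names are hypothetical, and a real ALB's algorithm and health state machine are more involved.

```python
from itertools import cycle


class RegionalBalancer:
    """Toy L7 balancer: round-robins across cluster backends that pass health checks."""

    def __init__(self, backends):
        self.healthy = {name: True for name in backends}
        self._ring = cycle(backends)
        self._size = len(backends)

    def mark(self, backend, healthy):
        """Record the latest health-check verdict for one backend."""
        self.healthy[backend] = healthy

    def route(self):
        # Try each backend at most once per request.
        for _ in range(self._size):
            candidate = next(self._ring)
            if self.healthy[candidate]:
                return candidate
        raise RuntimeError("no healthy backend in this region")


lb = RegionalBalancer(["cluster-1", "cluster-2"])
before = [lb.route() for _ in range(4)]

lb.mark("cluster-1", healthy=False)
after = [lb.route() for _ in range(3)]
```

Unlike DNS, the decision is made again for every request, so an unhealthy cluster stops receiving traffic as soon as the health flag flips.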

Approach 3: Service mesh (Istio or Linkerd, multi-cluster)

A service mesh that spans clusters can route at the application layer with full awareness of every cluster. Both Istio and Linkerd offer multi-cluster modes, and both work.

Pros: per-request decisions including intelligent routing (e.g., "route to nearest cluster unless it's saturated, then spill to the next"). Automatic mTLS across clusters. Failover and circuit breaking baked in.

Cons: enormous operational complexity. Cross-cluster control plane sync. Certificate management across clusters. The blast radius of a mesh misconfiguration is the entire mesh.

We trialed Istio multi-cluster for two months. We didn't stick with it. The complexity was too high for our team size, and the operations of "the mesh itself" became its own product. Larger organizations make it work; for us it was too much.
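The "nearest cluster unless it's saturated, then spill to the next" policy from the pros above can be expressed as a small function. This Python sketch only illustrates the decision logic. It is not how Istio or Linkerd implement it (meshes configure this declaratively), and `ClusterState`, `pick_cluster`, and the 0.8 threshold are assumptions for the example.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    name: str
    latency_ms: float   # measured from the caller
    utilization: float  # 0.0 to 1.0, fraction of capacity in use


def pick_cluster(clusters, saturation_threshold=0.8):
    """Prefer the nearest cluster; spill to the next nearest when it is saturated."""
    by_distance = sorted(clusters, key=lambda c: c.latency_ms)
    for cluster in by_distance:
        if cluster.utilization < saturation_threshold:
            return cluster.name
    # Everything is saturated: fall back to the nearest rather than failing.
    return by_distance[0].name


near_ok = pick_cluster([
    ClusterState("east", latency_ms=12, utilization=0.55),
    ClusterState("west", latency_ms=70, utilization=0.20),
])
near_saturated = pick_cluster([
    ClusterState("east", latency_ms=12, utilization=0.93),
    ClusterState("west", latency_ms=70, utilization=0.20),
])
```

The logic itself is short. The operational cost discussed above comes from feeding it accurate, current latency and utilization data across clusters.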

What we actually use

A pragmatic mix:

                           [ Route53 latency-based ]
                                     │
                ┌────────────────────┴────────────────────┐
                ▼                                         ▼
     [ ALB us-east-1 ]                          [ ALB us-west-2 ]
       │           │                               │           │
       ▼           ▼                               ▼           ▼
  [cluster 1] [cluster 2]                    [cluster 3] [cluster 4]
   (general)   (data)                         (general)   (data)

DNS (Route53 latency-based) picks the region. ~50s to fail over a region.
Per-region ALB routes within the region. ~10s to fail over a cluster.
Inside each cluster, regular Kubernetes Service / Ingress routing.

No service mesh spanning clusters. Some clusters internally use Istio for east-west service-to-service traffic, but the cross-cluster layer is plain ALB + DNS.
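The top tier of the diagram, latency-based region selection gated by health checks, reduces to "lowest latency among healthy regions". A minimal Python sketch, assuming a per-user latency table and a health map (both hypothetical inputs; Route53 derives these from its own measurements and health checks):

```python
def pick_region(latency_ms, healthy):
    """Latency-based choice restricted to regions that pass health checks."""
    candidates = [region for region in latency_ms if healthy.get(region, False)]
    if not candidates:
        raise RuntimeError("no healthy region")
    return min(candidates, key=lambda region: latency_ms[region])


# Hypothetical latencies for a user on the US east coast.
latency = {"us-east-1": 18, "us-west-2": 74}

normal = pick_region(latency, healthy={"us-east-1": True, "us-west-2": True})
east_down = pick_region(latency, healthy={"us-east-1": False, "us-west-2": True})
```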

How requests get routed

A user hits api.example.com:

DNS resolves to either the us-east-1 or us-west-2 ALB based on latency from the user.
The ALB checks its target groups. Each target group corresponds to a backend in one of that region's clusters.
Listener rules on the ALB route by path or host header to the right target group.
Within the cluster, the kube-proxy / kube-dns / Ingress flow takes over.
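The listener-rule step in the flow above is first-match-wins over an ordered rule list. The Python sketch below models that with exact host matching and path prefixes, which is a simplification (ALB path and host conditions use wildcard patterns). The rule contents and target group names are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ListenerRule:
    target_group: str
    host: Optional[str] = None         # exact host header match, if set
    path_prefix: Optional[str] = None  # path prefix match, if set

    def matches(self, host: str, path: str) -> bool:
        if self.host is not None and host != self.host:
            return False
        if self.path_prefix is not None and not path.startswith(self.path_prefix):
            return False
        return True


def select_target_group(rules, host, path, default):
    """Return the target group of the first matching rule, in priority order."""
    for rule in rules:
        if rule.matches(host, path):
            return rule.target_group
    return default


# Hypothetical rules: export traffic goes to the data cluster, the rest to general.
RULES = [
    ListenerRule("tg-data-cluster", path_prefix="/export/"),
    ListenerRule("tg-general-cluster", host="api.example.com"),
]

export_tg = select_target_group(RULES, "api.example.com", "/export/jobs", "tg-default")
users_tg = select_target_group(RULES, "api.example.com", "/users", "tg-default")
```

Rule order matters: the more specific path rule sits ahead of the catch-all host rule, otherwise it would never match.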

Failover scenarios:

One cluster fails health checks: the ALB removes that cluster's targets within ~10s. The other cluster in the region absorbs the traffic. No DNS change needed.
Whole region fails: the global health monitor's checks against that region's ALB fail for all clusters in the region. Route53 health-check-driven failover removes that region from latency-based DNS, and traffic shifts to the other region within ~50s.
One service in one cluster degrades: per-target ALB health checks start failing for the affected replicas. Other targets in that cluster (more replicas of the same service) absorb the load. If all replicas in cluster A degrade but cluster B's are fine, the ALB sends that rule's traffic to cluster B.
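The failover times quoted above can be sanity-checked with simple arithmetic: detection takes roughly the health-check interval times the number of consecutive failures required, and DNS failover adds the record TTL on top. The intervals and thresholds in this Python sketch are assumptions chosen to reproduce the ~10s and ~50s figures; the actual settings are not stated in this post. The 30s TTL is the value mentioned in the DNS section.

```python
def detection_seconds(check_interval_s, unhealthy_threshold):
    """Time for a health checker to declare a target unhealthy."""
    return check_interval_s * unhealthy_threshold


def dns_failover_seconds(check_interval_s, unhealthy_threshold, record_ttl_s):
    """Worst case: detect the failure, then wait for cached answers to expire."""
    return detection_seconds(check_interval_s, unhealthy_threshold) + record_ttl_s


# Assumed settings, not taken from the post.
cluster_failover = detection_seconds(check_interval_s=5, unhealthy_threshold=2)
region_failover = dns_failover_seconds(
    check_interval_s=10, unhealthy_threshold=2, record_ttl_s=30
)
```

With these inputs the cluster tier lands at 10 seconds and the region tier at 50. Resolvers that ignore TTLs or clients that hold connections open can stretch the region number further.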

Cross-cluster service discovery

Most services are cluster-local — they only talk to other services in the same cluster. A few need to talk across clusters (e.g., a data-export service in cluster A needs to read from a database that lives in cluster B).

We use ExternalName Kubernetes Services that resolve to the destination cluster's external load balancer:

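# ExternalName: cluster DNS answers with a CNAME to the host below; no traffic is proxied.
apiVersion: v1
kind: Service
metadata:
  name: shared-database
  namespace: app
spec:
  type: ExternalName
  # Address of the destination cluster's external load balancer.
  externalName: db.cluster-b.internal.example.com
  ports:
  - port: 5432  # database port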

The application connects to shared-database.app.svc.cluster.local, which resolves via ExternalName to the destination cluster's address. Network ACLs allow only specific source clusters to reach the destination port.
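The name resolution described above can be modelled as following a CNAME. This Python sketch uses an in-memory table instead of real DNS. `cluster_local_fqdn`, `resolve`, and `ZONE` are illustrative names, and the sketch assumes the default `cluster.local` cluster domain.

```python
def cluster_local_fqdn(service: str, namespace: str) -> str:
    """Build the in-cluster DNS name for a Service (default cluster domain assumed)."""
    return f"{service}.{namespace}.svc.cluster.local"


def resolve(name: str, cnames: dict, max_hops: int = 8) -> str:
    """Follow CNAME records in a toy zone until a name has no further alias."""
    for _ in range(max_hops):
        if name not in cnames:
            return name
        name = cnames[name]
    raise RuntimeError("CNAME chain too long")


# Toy zone: the ExternalName Service is published as a CNAME.
ZONE = {
    cluster_local_fqdn("shared-database", "app"): "db.cluster-b.internal.example.com",
}

destination = resolve(cluster_local_fqdn("shared-database", "app"), ZONE)
```

The application only ever sees the cluster-local name, so the destination can be repointed by editing the Service rather than redeploying callers.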

This is operationally simple. It does mean cross-cluster traffic crosses the cluster boundary (and the AZ/region boundary) at the network level, which has cost and latency implications. We minimize these connections deliberately — most workloads run within a single cluster.

What surprised us

Health check granularity. ALB health checks are configured per target group and evaluated per target, not per listener rule. We had a case where one rule's backend was failing but the same backend was fine for another rule. Because both rules shared a target group, the health check marked the whole target unhealthy. Workaround: separate target groups per use case, accept the duplication.
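The difference between a shared target group and split target groups is easy to see in a toy model. In this Python sketch, health is decided by one check path per target group. The paths, rule patterns, and group names are all hypothetical.

```python
def healthy_target_groups(target_groups, endpoint_ok):
    """Return target groups whose health-check path currently passes.

    target_groups: name -> health-check path
    endpoint_ok:   path -> whether the backend answers that path successfully
    """
    return {name for name, path in target_groups.items() if endpoint_ok[path]}


def routable_rules(rules, target_groups, endpoint_ok):
    """Return listener rules that still have a healthy target group behind them."""
    healthy = healthy_target_groups(target_groups, endpoint_ok)
    return {rule for rule, group in rules.items() if group in healthy}


# The same backend serves two features; only the reports feature is broken.
ENDPOINT_OK = {"/reports/health": False, "/orders/health": True}

# Shared target group: one health check gates both rules.
shared = routable_rules(
    rules={"/reports/*": "tg-shared", "/orders/*": "tg-shared"},
    target_groups={"tg-shared": "/reports/health"},
    endpoint_ok=ENDPOINT_OK,
)

# Separate target groups: each rule is gated by its own health check.
split = routable_rules(
    rules={"/reports/*": "tg-reports", "/orders/*": "tg-orders"},
    target_groups={"tg-reports": "/reports/health", "tg-orders": "/orders/health"},
    endpoint_ok=ENDPOINT_OK,
)
```

With the shared group, the failing reports check takes the working orders rule down with it. With split groups, only the broken rule loses its backend.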

Cross-region replication lag. Our database is replicated us-east-1 → us-west-2 with async replication. During a region failover from east to west, requests can briefly see stale data from the replica. Customers occasionally noticed (our app surfaces "your last action might take a moment to reflect"). We added a banner during failover events.

Cost from inter-AZ data transfer. When workloads in cluster 1 talked to workloads in cluster 2 (same region but different AZs), AWS charged inter-AZ data transfer. We didn't notice for a month, then the bill caught us. Solution: keep noisy inter-service traffic intra-cluster; use cluster 2 for genuinely separate workloads.
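A back-of-the-envelope estimate helps spot this before the bill does. The Python sketch below assumes a rate of $0.01 per GB charged in each direction, which is a commonly cited figure for inter-AZ transfer. Treat the rate as an assumption and check current AWS pricing for your region. The 50 TB volume is a made-up example.

```python
def monthly_inter_az_cost(gb_per_month, rate_per_gb_each_direction=0.01):
    """Estimate inter-AZ transfer cost, billed on both the sending and receiving side."""
    return gb_per_month * rate_per_gb_each_direction * 2


# Hypothetical volume: 50 TB per month of chatty cross-cluster traffic.
estimate_usd = monthly_inter_az_cost(gb_per_month=50_000)
```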

What we decided not to do

Active-active database. We considered it. The complexity of multi-master conflict resolution (or a CRDT layer) wasn't worth it for our use case. We use active-passive: writes to the primary region, replication to the secondary. Failover involves promoting the replica.

Cluster federation (KCP, Karmada, etc.). Promising but immature for our needs. The operational story for "which cluster does this resource actually live in" was unclear. We may revisit.

Mesh-managed cross-cluster traffic. Istio multi-cluster, as mentioned. Theoretically powerful; in practice, more complexity than we needed for the routing requirements we had.

Recommended sequencing

If I were doing this for a new team:

Start with DNS-based routing. It's free and works. Don't optimise prematurely.
Add a per-region L7 load balancer when you outgrow DNS-based granularity. This was the move that made the most impact for us.
Add cross-region DNS failover with health checks once you have multiple regions.
Add a service mesh only when you've identified specific problems it solves (mTLS, fine-grained routing, ZTNA-style policy). Not before.
Reconsider cluster federation every 12 months as the ecosystem matures. We've kept tabs but haven't committed.

The temptation with multi-cluster is to reach for the most sophisticated tool first (a mesh, federation, etc.). The simpler tools are 80% of the value at 10% of the operational cost.

What I'd watch for in your numbers

If you're considering this kind of architecture, look at:

Single-region failure cost. If your business can't survive an hour of regional outage, you need cross-region. If a half-day outage is survivable (as it was for us, briefly, when we started), DNS-based failover gets you there.
Cross-cluster traffic volume. If most service-to-service calls cross cluster boundaries, you have a clustering problem, not a routing problem.
Team size that operates the cluster fleet. Each additional cluster adds operational overhead. Multi-cluster is good for blast-radius reduction; it's not free.
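The cross-cluster traffic check from the list above is straightforward to compute if you have call data (for example from tracing) and know where each service runs. A minimal Python sketch, with hypothetical service names and placement:

```python
from collections import Counter


def cross_cluster_ratio(calls, placement):
    """Fraction of service-to-service calls whose caller and callee sit in different clusters.

    calls:     iterable of (caller_service, callee_service) pairs
    placement: service name -> cluster name
    """
    counts = Counter(placement[src] != placement[dst] for src, dst in calls)
    total = counts[True] + counts[False]
    return counts[True] / total if total else 0.0


# Hypothetical placement and call sample.
PLACEMENT = {"api": "general", "auth": "general", "export": "data", "warehouse": "data"}
CALLS = [("api", "auth"), ("api", "auth"), ("export", "warehouse"), ("api", "export")]

ratio = cross_cluster_ratio(CALLS, PLACEMENT)
```

A ratio that stays low suggests the cluster boundaries match the service boundaries. A high one points at the clustering problem described above.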

Closing thought

Routing across clusters is one of those problems where the obviously-elegant solution (a service mesh) is operationally heavy, and the obviously-crude solution (DNS) is operationally light but loses fidelity. The interesting answer for most teams is something boring in the middle: regional load balancers with health checks, DNS for cross-region, simple service discovery patterns within. Boring is what scales.

About Kiril Urbonas