How a small team moved from single-region risk to a simple active/passive multi-region setup without doubling complexity.
Multi-region can easily become a science project. This is what worked for a five-person platform team supporting a SaaS product.
We began with everything in one AWS region: RDS, EKS, S3, and a shared VPC.
Instead of cloning the entire stack, we:
```hcl module "vpc" { source = "./modules/vpc" region = var.region primary = var.is_primary } ```
/healthz endpoint.We didn’t solve every theoretical edge case, but we can now lose a region and recover in under an hour with a plan the team has actually rehearsed.
Get the latest tutorials, guides, and insights on AI, DevOps, Cloud, and Infrastructure delivered directly to your inbox.
VPCs, subnets, route tables, gateways. The mental model that finally made cloud networking click after I stopped trying to map it 1:1 to physical networks.
We've executed real disaster recoveries twice. The plan that survived contact with reality, and what was wrong about the plans we had before that.
Explore more articles in this category
Three caching patterns, three failure modes. The one we use most, the one that bit us, and the rule that decides which pattern fits which workload.
Bad resource requests waste money or trigger OOMs. The methodology we use to right-size requests based on actual usage, and the gotchas the autoscalers don't fix.
Edge compute is useless without an edge data layer. Three serverless databases that put data within ms of your edge functions, with the tradeoffs that aren't on the marketing pages.