Practical articles on AI, DevOps, Cloud, Linux, and infrastructure engineering.
A practical Linux performance tuning playbook for production servers. The kernel parameters and the disk and network tweaks that earn their place, and the ones that turned out to be folklore.
A practical guide to writing and managing systemd services for production. The unit file features that earn their place, plus the operational workflows.
We use CloudFront + Lambda@Edge for specific patterns. The wins, the production gotchas, and where we hit Lambda@Edge's limits.
Postgres, DynamoDB, Redis, Elasticsearch, Snowflake. We use all five for different workloads. The decision criteria, not the marketing comparison.
We've executed real disaster recoveries twice. The plan that survived contact with reality, and what was wrong with the plans we had before.
VPCs, subnets, route tables, gateways. The mental model that finally made cloud networking click after I stopped trying to map it 1:1 to physical networks.
We run both ECS and EKS in production. Which one we use for which workloads, and the actual decision criteria rather than the marketing comparison.
A working AWS security baseline, derived from the actual incidents we've had and the audit findings we've cleared.
We use serverless for specific patterns, not as a default. The patterns where it shines, the ones where it doesn't, and the gotchas at production scale.
Building visibility into cloud costs that actually drives action. The dashboards we look at, the alerts that fire, and the queries we run.
We run our app in two AWS regions for failover. The hard parts aren't the deployment — they're data consistency, traffic shifting, and the assumptions that break when "primary" is suddenly the wrong region.
We run ~200 Lambda functions. Cold starts, memory tuning, and the cost-vs-latency trade-offs that actually move the bill.