Practical articles on AI, DevOps, Cloud, Linux, and infrastructure engineering.
Practical game day scenarios for CI/CD: broken rollbacks, permission issues, and slow feedback loops—and how we fixed them.
Unify traces, metrics, and logs with OpenTelemetry. Instrumentation, sampling, and backend-agnostic pipelines.
We upgraded a 60-node EKS cluster from 1.27 to 1.31 over six months. Four minor versions, one bad surprise, zero customer impact. Here's the playbook.
Declarative, Git-centric deployments with Argo CD. Directory layout, sync policies, and security.
Practical ways to cut Kubernetes spend: rightsizing, spot/preemptible nodes, and FinOps practices.