Practical articles on AI, DevOps, Cloud, Linux, and infrastructure engineering.
We've shipped three end-to-end ML systems. Here are the pieces that look obvious in slides and turn out to be the actual work.
How a small team moved from single-region risk to a simple active/passive multi-region setup without doubling complexity.
A real story of removing console-only changes, adding drift detection, and getting Terraform back in charge.
Standard APM doesn't tell you when your LLM-powered features are silently degrading. Here are the signals we track and the dashboards that catch the regressions those tools miss.
How we deploy LLM-powered features. The deployment patterns are mostly conventional; validation is where the process differs.