Blog

Practical articles on AI, DevOps, Cloud, Linux, and infrastructure engineering.

Blameless Postmortems: The Template and Facilitation That Works

Our early postmortems quietly assigned blame and taught people to hide mistakes. Here's the template and the facilitation rules that finally made them honest and useful.

Kiril Urbonas

22 hours ago

Feature Flags for Safe Deploys: Decoupling Release From Deploy

We used to ship code and turn it on in the same breath, so every deploy was a bet. Feature flags split those two events apart and made rollbacks a config toggle.

Kiril Urbonas

22 hours ago

On-Call Without Burnout: Rotations, Runbooks, and Escalation

Our best engineer quit citing on-call. We rebuilt the whole thing: saner rotations, runbooks that actually help at 3am, and escalation that doesn't punish asking for help.

Kiril Urbonas

yesterday

Four Signals That Matter: Choosing SLIs Users Actually Feel

Most SLI dashboards track things nobody notices. Here's how we picked the handful of signals that map to real user pain, and dropped the vanity metrics.

Kiril Urbonas

yesterday

Error Budgets to Roadmap: Turning Reliability Into Prioritization

Reliability arguments used to be shouting matches between SRE and product. An error budget turned them into arithmetic. Here's how we made the number drive the roadmap.

Kiril Urbonas

2 days ago

GitHub Actions Reusable Workflows: DRY Pipelines at Org Scale

We had the same 180-line build workflow copy-pasted into 60 repos. Fixing one bug meant 60 PRs. Here's the reusable-workflow setup that made it one.

Kiril Urbonas

3 days ago

Kubernetes Ingress vs Gateway API: Migrating Without Downtime

We moved 40 services off the nginx Ingress controller onto Gateway API without a single dropped connection. Here's the routing overlap trick that made it boring.

Kiril Urbonas

3 days ago

Kustomize Overlays That Scale Across Environments

Our overlay tree grew to seven environments and started copy-pasting the same patch into each. Here's the component-based layout that stopped the drift.

Kiril Urbonas

4 days ago

Docker Compose in Production: When It Fits and When It Doesn't

Everyone says Compose is for dev only. We ran it in production for two years on a single node and it was the right call, until the day it very much wasn't.

Kiril Urbonas

5 days ago

Distroless Docker Images: Smaller, Safer Production Containers

Our node image shipped 240 CVEs, most from OS packages we never called. Moving to distroless dropped the count to single digits and cut image size by 70%.

Kiril Urbonas

6 days ago

Multi-Arch Docker Builds with Buildx: One Image, Every Platform

Our M-series laptops built arm64, our CI built amd64, and prod pulled whichever tag won the race. Buildx and a manifest list ended the chaos.

Kiril Urbonas

last week

Flux vs Argo CD: Picking a GitOps Engine in 2026

After running both in production across a dozen clusters, here's where Flux and Argo CD actually differ and which one we'd reach for now.

Kiril Urbonas

Page 1 of 11 · 130 posts