Blog

Practical articles on AI, DevOps, Cloud, Linux, and infrastructure engineering.

Embedding Model Upgrades Without Search Chaos: A Safer RAG Rollout Pattern

A practical embedding model upgrade guide for RAG systems, built from a real support-search migration that initially reduced answer quality instead of improving it.

Kiril Urbonas·51

3 months ago

Multi-Cluster Traffic Routing Strategies: A Pragmatic Rollout Pattern for Growing SaaS Teams

A real-world multi-cluster traffic routing guide for SaaS teams that have outgrown a single Kubernetes cluster and need safer rollout control without a service-mesh science project.

Kiril Urbonas·11

3 months ago

Systemd Service Reliability Patterns: What We Changed After Repeated Restart Loops

A practical systemd reliability guide for Linux services, built around repeated restart-loop incidents and the unit-file patterns that finally made those services boring.

Kiril Urbonas·6

3 months ago

Cloud Disaster Recovery Runbook Design: How Small Teams Rehearse Multi-Region Failover

A practical disaster recovery runbook guide for small cloud teams that need realistic failover steps, clear ownership, and repeatable rehearsals instead of shelfware documents.

Kiril Urbonas·11

4 months ago

How We Stopped Terraform Drift from Surprising On-Call

A real story of removing console-only changes, adding drift detection, and getting Terraform back in charge.

Kiril Urbonas·8

4 months ago

A Pragmatic Multi-Region Strategy for Small Teams

How a small team moved from single-region risk to a simple active/passive multi-region setup without doubling complexity.

Kiril Urbonas·4

4 months ago

End-of-Week Engineering: Why Smart Tech Teams Don’t Ship Major Changes on Friday

A practical risk-management framework for release timing, Friday deployment policies, progressive delivery, and how elite teams protect reliability and people.

Kiril Urbonas·28
