Blog

Practical articles on AI, DevOps, Cloud, Linux, and infrastructure engineering.

Real-World RAG Incidents: Lessons from a Production Rollout

A field report from rolling out retrieval-augmented generation in production, including cache bugs, bad embeddings, and how we fixed them.

Kiril Urbonas·10

4 months ago

How We Stopped Terraform Drift from Surprising On-Call

A real story of removing console-only changes, adding drift detection, and getting Terraform back in charge.

Kiril Urbonas·3

4 months ago

Systemd Tricks We Use to Keep Services Boring

Concrete systemd unit patterns that reduced flakiness: restart policies, resource limits, and structured logs.

Kiril Urbonas·7

4 months ago

A Pragmatic Multi-Region Strategy for Small Teams

How a small team moved from single-region risk to a simple active/passive multi-region setup without doubling complexity.

Kiril Urbonas·4

4 months ago

End-of-Week Engineering: Why Smart Tech Teams Don’t Ship Major Changes on Friday

A practical risk-management framework for release timing, Friday deployment policies, progressive delivery, and how elite teams protect reliability and people.

Kiril Urbonas·28

4 months ago

Kubernetes Cost Optimization for Teams: FinOps Tactics That Actually Work

Cut Kubernetes spend without hurting reliability using a practical FinOps playbook for rightsizing, autoscaling guardrails, showback, and weekly waste cleanup.

Kiril Urbonas·18

4 months ago

SRE Error Budgets in Practice: Shipping Fast Without Burning Reliability

A practical way to define SLOs and error budgets, connect them to release decisions, and avoid reliability debates that have no data behind them.

Kiril Urbonas·21

4 months ago

Platform Engineering with Backstage: Build a Useful Developer Portal

How to implement Backstage with real templates, scorecards, and golden paths so internal platform work reduces delivery friction.

Kiril Urbonas·15

4 months ago

GitHub Actions for Monorepos: Fast CI Without Pipeline Chaos

A practical pattern for monorepo CI with path filters, matrix builds, caching, and deployment guards that keep feedback fast as teams scale.

Kiril Urbonas·13

4 months ago

Azure DevOps Best Practices in 2026: Build Pipelines You Can Trust

A production-focused guide to Azure DevOps: standardized YAML templates, secure service connections, rollout safety, and measurable delivery reliability.

Kiril Urbonas·45

4 months ago

AI Best Practices in 2026: Shipping Reliable Systems, Not Demo Magic

A practical production playbook for AI systems: evaluation gates, guardrails, observability, cost control, and reliable release management.

Kiril Urbonas·31

4 months ago

AI Best Practices for Engineering Teams: From Prompt Experiments to Platform Discipline

A practical field manual for engineering teams who want AI features that survive real users, incidents, and budgets — not just demo day.

Kiril Urbonas·31
