Archive
Browse all 274 articles organized by date
2026
31 articles
January
- 29 Disaster Recovery Planning: Building Resilient Infrastructure
- 28 Operational Checklist: Blue-Green Deployment Guardrails
- 25 Infrastructure Monitoring: Observability for IaC
- 24 Operational Checklist: Infrastructure Drift Detection Workflow
- 22 Ansible Playbook Optimization: Writing Efficient Playbooks
- 20 Operational Checklist: Multi-Cluster Traffic Routing Strategies
- 18 Pulumi vs Terraform Deep Dive: Choosing the Right IaC Tool
- 15 Operational Checklist: Kubernetes Secrets and External Vault Integration
- 14 Infrastructure Testing Strategies: Validating Your IaC
- 12 Operational Checklist: Python Worker Queue Scaling Patterns
- 11 Terraform Modules Best Practices: Building Reusable Infrastructure
- 8 Operational Checklist: Model Serving Observability Stack
- 7 Linux Container Internals: Understanding How Containers Work
- 4 Shell Scripting Best Practices: Writing Maintainable Scripts
- 4 Operational Checklist: RAG Retrieval Quality Evaluation
February
- 28 End-of-Week Engineering: Why Smart Tech Teams Don’t Ship Major Changes on Friday
- 27 Kubernetes Cost Optimization for Teams: FinOps Tactics That Actually Work
- 26 SRE Error Budgets in Practice: Shipping Fast Without Burning Reliability
- 25 Platform Engineering with Backstage: Build a Useful Developer Portal
- 24 GitHub Actions for Monorepos: Fast CI Without Pipeline Chaos
- 23 Azure DevOps Best Practices in 2026: Build Pipelines You Can Trust
- 22 AI Best Practices in 2026: Shipping Reliable Systems, Not Demo Magic
- 21
2025
134 articles
January
- 29 Troubleshooting: Kubernetes Cluster Upgrade Strategy
- 26 Field Notes: AI Inference Cost Optimization
- 22 Field Notes: SLO-Based Monitoring for APIs
- 18 Field Notes: Secure Container Supply Chain Controls
- 14 Field Notes: Infrastructure Documentation as Code
- 9 Field Notes: Cloud Networking Segmentation Patterns
- 6 Field Notes: Incident Response for Platform Teams
2024
105 articles
January
- 28 Practical Guide: Cloud Disaster Recovery Runbook Design
- 24 Practical Guide: AWS Cost Control with Tagging and Budgets
- 21 Practical Guide: Ansible Role Design for Large Teams
- 17 Practical Guide: Terraform State Isolation by Environment
- 15 Orchestrating AI Agents on Kubernetes
- 13 Practical Guide: GitHub Actions Pipeline Reliability
- 10 eBPF: The Future of Kernel Observability