devops/ness
Featured Article

Real-World RAG Incidents: Lessons from a Production Rollout

A field report from rolling out retrieval-augmented generation in production, including cache bugs, bad embeddings, and how we fixed them.

AI, LLM, GPT, Python
Kiril Urbonas, DevOps Engineer and AI Enthusiast | Mar 10, 2026

Topics

Monitoring 280, Terraform 207, AWS 166, Kubernetes 124, Python 111, Security 107, CI/CD 103, LLM 97, Ansible 95, Linux 95

Latest Articles

Practical Guide: Infrastructure Documentation as Code
March 27, 2024

Infrastructure Documentation as Code. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas · 4 min read
Practical Guide: Cloud Networking Segmentation Patterns
March 23, 2024

Cloud Networking Segmentation Patterns. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas · 4 min read

Practical Guide: Incident Response for Platform Teams
March 20, 2024

Incident Response for Platform Teams. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas · 4 min read
Practical Guide: Blue-Green Deployment Guardrails
March 16, 2024

Blue-Green Deployment Guardrails. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas · 3 min read
Practical Guide: Infrastructure Drift Detection Workflow
March 11, 2024

Infrastructure Drift Detection Workflow. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas · 4 min read
Practical Guide: Multi-Cluster Traffic Routing Strategies
March 7, 2024

Multi-Cluster Traffic Routing Strategies. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas · 4 min read
Practical Guide: Kubernetes Secrets and External Vault Integration
March 3, 2024

Kubernetes Secrets and External Vault Integration. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas · 4 min read
Practical Guide: Python Worker Queue Scaling Patterns
February 29, 2024

Python Worker Queue Scaling Patterns. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas · 4 min read
Practical Guide: Model Serving Observability Stack
February 25, 2024

Model Serving Observability Stack. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas · 4 min read
Practical Guide: RAG Retrieval Quality Evaluation
February 21, 2024

RAG Retrieval Quality Evaluation. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas · 4 min read
Practical Guide: Prompt Versioning and Regression Testing
February 17, 2024

Prompt Versioning and Regression Testing. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas · 4 min read
Practical Guide: LLM Gateway Design for Multi-Provider Inference
February 13, 2024

LLM Gateway Design for Multi-Provider Inference. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas · 4 min read
© 2026 DevOpsNess. By Kiril Urbonas.