devops/ness
Featured Article

Real-World RAG Incidents: Lessons from a Production Rollout

A field report from rolling out retrieval-augmented generation in production, including cache bugs, bad embeddings, and how we fixed them.

AI · LLM · GPT · Python
Kiril Urbonas, DevOps Engineer and AI Enthusiast | Mar 10, 2026

Topics

Monitoring (280) · Terraform (207) · AWS (166) · Kubernetes (124) · Python (111) · Security (107) · CI/CD (103) · LLM (97) · Ansible (95) · Linux (95)

Latest Articles

Fine-tuning Large Language Models: A Practical Guide
February 12, 2024

Learn how to fine-tune LLMs like Llama 2, Mistral, and GPT models for your specific use case. Includes LoRA, QLoRA, and full fine-tuning techniques.

Kiril Urbonas · 5 min read

Practical Guide: Kernel and Package Patch Management
February 10, 2024

Kernel and Package Patch Management. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas · 4 min read


Infrastructure as Code: Terraform vs Pulumi vs Ansible
February 10, 2024

Compare Terraform, Pulumi, and Ansible for Infrastructure as Code. Learn when to use each tool and how they complement each other in modern DevOps workflows.

Kiril Urbonas · 5 min read

Linux System Monitoring with Prometheus and Grafana
February 7, 2024

Set up comprehensive Linux system monitoring using Prometheus and Grafana. Monitor CPU, memory, disk, network, and application metrics with beautiful dashboards.

Kiril Urbonas · 6 min read

Practical Guide: Systemd Service Reliability Patterns
February 5, 2024

Systemd Service Reliability Patterns. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas · 4 min read

AWS Cost Optimization: 10 Strategies to Reduce Your Cloud Bill
February 5, 2024

Discover proven strategies to reduce AWS costs by up to 50%. Learn about Reserved Instances, Spot Instances, right-sizing, and automated cost management.

Kiril Urbonas · 4 min read

Building Production-Ready AI Applications with LangChain and Docker
February 3, 2024

Learn how to containerize and deploy LangChain applications in production. Best practices for scaling, monitoring, and maintaining AI-powered services.

Kiril Urbonas · 5 min read

Practical Guide: Linux Performance Baseline Methodology
February 1, 2024

Linux Performance Baseline Methodology. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas · 4 min read

Kubernetes Autoscaling: HPA vs VPA vs Cluster Autoscaler
February 1, 2024

Master Kubernetes resource management with Horizontal Pod Autoscaler, Vertical Pod Autoscaler, and Cluster Autoscaler. Learn when to use each and how to configure them for optimal performance.

Kiril Urbonas · 4 min read

Practical Guide: Cloud Disaster Recovery Runbook Design
January 28, 2024

Cloud Disaster Recovery Runbook Design. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas · 4 min read

Practical Guide: AWS Cost Control with Tagging and Budgets
January 24, 2024

AWS Cost Control with Tagging and Budgets. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas · 4 min read

Practical Guide: Ansible Role Design for Large Teams
January 21, 2024

Ansible Role Design for Large Teams. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas · 4 min read

© 2026 DevOpsNess. By Kiril Urbonas.
