devops/ness

Blog

Practical articles on AI, DevOps, Cloud, Linux, and infrastructure engineering.

Category: ai
Real-World RAG Incidents: Lessons from a Production Rollout
March 22, 2025

A field report from rolling out retrieval-augmented generation in production, including cache bugs, bad embeddings, and how we fixed them.

Kiril Urbonas
Troubleshooting: Model Serving Observability Stack
March 21, 2025

Model Serving Observability Stack. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas
Troubleshooting: RAG Retrieval Quality Evaluation
March 17, 2025

RAG Retrieval Quality Evaluation. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas
Real-World RAG Incidents: Lessons from a Production Rollout
March 15, 2025

A field report from rolling out retrieval-augmented generation in production, including cache bugs, bad embeddings, and how we fixed them.

Kiril Urbonas
Troubleshooting: Prompt Versioning and Regression Testing
March 13, 2025

Prompt Versioning and Regression Testing. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas
Troubleshooting: LLM Gateway Design for Multi-Provider Inference
March 9, 2025

LLM Gateway Design for Multi-Provider Inference. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas
Real-World RAG Incidents: Lessons from a Production Rollout
March 8, 2025

A field report from rolling out retrieval-augmented generation in production, including cache bugs, bad embeddings, and how we fixed them.

Kiril Urbonas
Real-World RAG Incidents: Lessons from a Production Rollout
March 1, 2025

A field report from rolling out retrieval-augmented generation in production, including cache bugs, bad embeddings, and how we fixed them.

Kiril Urbonas
AI Agents in DevOps: From Copilots to Autonomous Automation in 2025
February 28, 2025

How AI agents are moving from read-only copilots to autonomous automation with guardrails. Best practices for approval gates and rollback.

Kiril Urbonas
Field Notes: AI Inference Cost Optimization
January 26, 2025

AI Inference Cost Optimization. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas
Field Notes: Python Worker Queue Scaling Patterns
December 18, 2024

Python Worker Queue Scaling Patterns. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas
Field Notes: Model Serving Observability Stack
December 14, 2024

Model Serving Observability Stack. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas
Page 8 of 11 · 121 posts