Practical articles on AI, DevOps, Cloud, Linux, and infrastructure engineering.
We've shipped four production RAG applications, and each one taught us something. Here is the end-to-end pattern that works.
Run retrieval-augmented generation at scale: chunking, caching, and observability.
A field report from rolling out retrieval-augmented generation in production, including cache bugs, bad embeddings, and how we fixed them.