Practical articles on AI, DevOps, Cloud, Linux, and infrastructure engineering.
Tracking experiments and shipping models are different problems. The MLOps tooling assumes one solution; production splits them. The patterns we use.
Most LLM eval suites correlate poorly with what real users experience. The eval patterns we run that move with prod metrics — and the ones that lied to us.