devops/ness
Blog

Practical articles on AI, DevOps, Cloud, Linux, and infrastructure engineering.

Category: ai
Prompt Engineering Best Practices: Maximizing LLM Performance
7 months ago

Master prompt engineering techniques to get better results from LLMs. Learn about few-shot learning, chain-of-thought, and advanced prompting strategies.

Kiril Urbonas

Real-World RAG Incidents: Lessons from a Production Rollout
7 months ago

A field report from rolling out retrieval-augmented generation in production, including cache bugs, bad embeddings, and how we fixed them.

Kiril Urbonas

AI Model Deployment Strategies: From Development to Production
7 months ago

Complete guide to deploying AI models in production. Learn about model serving, containerization, scaling, and monitoring strategies.

Kiril Urbonas

Model Quantization Techniques: Reducing LLM Size and Cost
7 months ago

Learn how to reduce LLM size and inference costs using quantization techniques like Q4, Q8, and GPTQ. A practical guide with benchmarks.

Kiril Urbonas

Vector Databases for AI: Comparing Pinecone, Weaviate, and ChromaDB
7 months ago

Compare the top vector databases for AI applications. Learn when to use Pinecone, Weaviate, or ChromaDB based on your requirements.

Kiril Urbonas

Building RAG Applications: A Complete Guide to Retrieval Augmented Generation
7 months ago

Learn how to build production-ready RAG applications using vector databases, embedding models, and LLMs. Complete guide with code examples and best practices.

Kiril Urbonas

RAG in Production: Reliability, Latency, and Cost for LLM Apps
8 months ago

Run retrieval-augmented generation at scale: chunking, caching, and observability.

Kiril Urbonas

Best Practices: AI Inference Cost Optimization
8 months ago

Practical guidance on AI inference cost optimization for reliable, scalable platform operations.

Kiril Urbonas

Real-World RAG Incidents: Lessons from a Production Rollout
8 months ago

A field report from rolling out retrieval-augmented generation in production, including cache bugs, bad embeddings, and how we fixed them.

Kiril Urbonas

Page 5 of 11 · 121 posts