Tag: rag
8 articles filed under this tag. Newest first below; start with the highlighted pick if you are new here.
Featured
Secure RAG Systems and Prompt Injection Prevention
How untrusted documents and web pages become indirect injection channels into retrieval pipelines, and how engineers harden ingest, retrieval, and tool boundaries without pretending RAG eliminates adversarial text.
· 6 min read
- Evaluation Frameworks for LLM Applications at Scale
Golden datasets, regression suites, LLM-as-judge patterns, and offline versus online evaluation loops—emphasizing measurement discipline over benchmark theater.
· 6 min read
- Memory Systems for LLM Agents — Short-Term vs Long-Term Memory
Episodic buffers, summarization, retrieval-augmented memory, and persistence patterns for agents—separating conversation state from durable knowledge stores.
· 6 min read
- Contextual Grounding and Hallucination Reduction in LLM Systems
How retrieval, verification loops, and constrained generation patterns reduce unsupported answers—without claiming any pipeline eliminates model confabulation entirely.
· 6 min read
- Context Window Engineering for LLM Systems
Token budgets, truncation, summarization layers, and context packing—how production teams fit prompts, tools, and RAG evidence into finite windows without silent information loss.
· 6 min read
- Retrieval Strategies in RAG — Dense, Sparse, and Hybrid Search
When embedding-based ANN search wins, when lexical BM25-style retrieval wins, and how hybrid fusion behaves at scale—without pretending one algorithm fits every corpus.
· 6 min read
- Architecture of Production-Grade RAG Systems
How chunking, embeddings, retrieval, reranking, grounding, and latency budgets fit together in retrieval-augmented generation systems that survive real traffic—not demos.
· 6 min read
- Building Production RAG Pipelines with LangChain
How retrieval-augmented generation combines vector search over embeddings with LLM context injection to ground responses in real data — and what it takes to run that in production.
· 9 min read