RAG (Retrieval-Augmented Generation)

By The Codegen Team · Updated March 26, 2026 · AI Fundamentals

A technique that enhances LLM responses by retrieving relevant documents from an external knowledge base before generating output.

What is RAG (Retrieval-Augmented Generation)?

Retrieval-augmented generation (RAG) is a technique that enhances LLM responses by retrieving relevant documents or data from an external knowledge base and supplying them to the model as context before it generates a response. Instead of relying solely on what the model learned during training, a RAG system fetches current, specific information at query time.
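
The retrieve-then-generate flow can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the knowledge base contents, the word-overlap scoring, and the names KNOWLEDGE_BASE, retrieve, and build_prompt are all hypothetical, and the final call to an LLM is left out. Real systems typically replace the overlap score with embedding-based search.

```python
import re

# Hypothetical knowledge base; in practice this would be an external document store.
KNOWLEDGE_BASE = [
    "The billing API requires an X-Team-Token header on every request.",
    "Deployments run every weekday at 14:00 UTC from the main branch.",
    "All public functions must have type hints and docstrings.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Return up to top_k documents that share the most words with the query."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc)), doc) for doc in documents]
    # Drop documents with no overlap so unrelated text never reaches the prompt.
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query: str, documents: list[str]) -> str:
    """Assemble the prompt an LLM would receive: retrieved context first, then the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


question = "Which header does the billing API require?"
prompt = build_prompt(question, KNOWLEDGE_BASE)
print(prompt)
```

The point of the sketch is the ordering: retrieval happens at query time, and the model only sees the documents the retriever selected.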

In coding contexts, RAG is used to give agents access to documentation, codebase knowledge, and API references that may not be in the model’s training data. A RAG-enabled coding agent can look up your team’s internal API documentation, coding standards, or architecture decisions before writing code.

The quality of RAG depends heavily on the retrieval layer: how documents are indexed, chunked, and ranked for relevance. Poor retrieval fills the model’s context with irrelevant material, which can degrade output quality even when the model itself is capable.
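
To make chunking and ranking concrete, here is a small Python sketch. It assumes fixed-size word windows with overlap for chunking and cosine similarity over raw word counts for ranking; both are illustrative choices rather than a recommended configuration, and the function names chunk_words, vectorize, cosine, and rank are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_words(text: str, size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into windows of `size` words; each window repeats the last
    `overlap` words of the previous one so sentences cut at a boundary keep context."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def vectorize(text: str) -> Counter:
    """Represent text as a bag of lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank(query: str, chunks: list[str]) -> list[tuple[float, str]]:
    """Score every chunk against the query and return them best-first."""
    query_vector = vectorize(query)
    scored = [(cosine(query_vector, vectorize(chunk)), chunk) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


windows = chunk_words("a b c d e f g h i j", size=4, overlap=1)
sample_chunks = [
    "retries use exponential backoff",
    "logging uses the shared logger",
    "secrets load from the vault",
]
ranked = rank("exponential backoff for retries", sample_chunks)
```

Chunk size and overlap are the knobs to watch here: windows that are too small split related sentences apart, while windows that are too large dilute the score of the one relevant sentence inside them.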

Frequently Asked Questions