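RAG (Retrieval-Augmented Generation) combines an LLM with a retrieval system: it fetches relevant information from a knowledge base and supplies it to the LLM as context, so the LLM can generate accurate, grounded answers. It is a key technique for building LLM applications on custom data.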
What RAG does
RAG → augment an LLM's generation with RETRIEVED relevant information:
1. RETRIEVE → search a knowledge base (your documents/data) for info relevant to the query
2. AUGMENT → add the retrieved info to the LLM's prompt as CONTEXT
3. GENERATE → the LLM answers using the provided context (grounded in your data)
→ gives the LLM relevant, up-to-date, specific knowledge it wasn't trained on
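The three steps above can be sketched as three small functions. This is a minimal illustration, not a production design: the knowledge base is made up, `retrieve` uses plain word overlap as a stand-in for real search, and `echo_llm` is a hypothetical placeholder for whatever model client you would actually call.

```python
import re
from typing import Callable, List

# Made-up knowledge base, for illustration only.
KNOWLEDGE_BASE = [
    "The office is closed on public holidays.",
    "Support tickets are answered within two business days.",
    "Refunds are available within 30 days of purchase.",
]


def tokens(text: str) -> set:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, docs: List[str], k: int = 1) -> List[str]:
    """Step 1 (RETRIEVE): rank docs by word overlap with the query.

    Word overlap is an assumption made to keep the sketch dependency-free;
    real systems usually use embeddings or a search engine here.
    """
    q = tokens(query)
    ranked = sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:k]


def augment(query: str, context: List[str]) -> str:
    """Step 2 (AUGMENT): put the retrieved text into the prompt as context."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {query}"


def rag_answer(query: str, llm: Callable[[str], str]) -> str:
    """Step 3 (GENERATE): hand the augmented prompt to the model."""
    return llm(augment(query, retrieve(query, KNOWLEDGE_BASE)))


def echo_llm(prompt: str) -> str:
    """Hypothetical placeholder model: returns the prompt unchanged."""
    return prompt


if __name__ == "__main__":
    print(rag_answer("Are refunds available after purchase?", echo_llm))
```

Passing the model in as a callable keeps retrieval and prompt construction testable without any API access.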
How RAG typically works
→ index your data: split documents into chunks → create EMBEDDINGS → store in a VECTOR DATABASE
→ at query time: embed the query → find the most SIMILAR chunks (semantic search) → retrieve them
→ build a prompt: 'Using this context: [retrieved chunks], answer: [query]'
→ the LLM generates an answer grounded in the retrieved context
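The pipeline above (chunk, embed, store, search, build the prompt) can be sketched end to end in plain Python. Everything here is an assumption made for illustration: `embed` is a toy bag-of-words vector, which matches shared words and not meaning, so it only stands in for a real embedding model; `VectorStore` is a hypothetical in-memory list, not a real vector database; and the final LLM call is left out.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def chunk(text: str, max_words: int = 8) -> List[str]:
    """Split a document into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Dict[str, float]:
    """Toy embedding: L2-normalised word counts (stand-in for a real model)."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity; both vectors are already unit length."""
    return sum(v * b.get(word, 0.0) for word, v in a.items())


class VectorStore:
    """Hypothetical in-memory store of (chunk text, vector) pairs."""

    def __init__(self) -> None:
        self.items: List[Tuple[str, Dict[str, float]]] = []

    def add(self, document: str) -> None:
        # Indexing: split into chunks, embed each chunk, store it.
        for piece in chunk(document):
            self.items.append((piece, embed(piece)))

    def search(self, query: str, k: int = 2) -> List[str]:
        # Query time: embed the query, return the k most similar chunks.
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(query: str, chunks: List[str]) -> str:
    """Fill the prompt template from the notes with the retrieved chunks."""
    context = " ".join(chunks)
    return f"Using this context: {context}, answer: {query}"


if __name__ == "__main__":
    store = VectorStore()
    store.add("Paris is the capital of France.")
    store.add("The Nile is a river in Africa.")
    question = "What is the capital of France?"
    print(build_prompt(question, store.search(question, k=1)))
```

Swapping `embed` for a real embedding model and `VectorStore` for a real vector database would leave the rest of the flow unchanged, which is why the two are kept behind small interfaces.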
