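Contents
- RAG Retrieval Enhancement Optimization Guide
- 1. Comparing the Three Generations of RAG Architecture
  - 1.1 Architecture Evolution Timeline
  - 1.2 Detailed Comparison of the Three Generations
  - 1.3 Naive RAG – Basic Architecture
    - Python implementation example
  - 1.4 Advanced RAG – Enhanced Architecture
  - 1.5 Modular RAG – Modular Architecture
  - 1.6 End-to-End RAG Enhancement Roadmap
- 2. Pre-Retrieval Enhancement
  - 2.1 HyDE (Hypothetical Document Embeddings)
  - 2.2 Query2Doc
  - 2.3 Query Rewriting
  - 2.4 Query Decomposition
  - 2.5 Take Step Back
- 3. Retrieval Enhancement
  - 3.1 Core Concepts
  - 3.2 Hybrid Search
    - How BM25 works
    - How vector retrieval works
    - The RRF fusion algorithm
  - 3.3 Sentence Window Retriever
  - 3.4 Parent Document Retriever
  - 3.5 Summary Retriever
  - 3.6 Hypothetical Questions Retriever
  - 3.7 Multi-Index Fusion Strategies
- 4. Post-Retrieval Enhancement
  - 4.1 How Re-ranking Works
  - 4.2 Comparison of Mainstream Re-rankers
  - 4.3 Implementing Re-ranking
  - 4.4 Context Compression
- 5. Complete Production-Grade Examples
  - 5.1 Production-Grade Implementation with LangChain
  - 5.2 Production-Grade Implementation with LlamaIndex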
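RAG Retrieval Enhancement Optimization Guide
Core concept: RAG (Retrieval-Augmented Generation) is an architecture that combines a retrieval system with a generative large language model. It works like giving the model an external, plug-in knowledge base, so that the system can:
- Fetch information from an external knowledge base at query time
- Generate answers grounded in retrieved facts, which reduces hallucination
- Update the knowledge base at any time without retraining the model
- Cite the sources of an answer, which improves explainability
1. Comparing the Three Generations of RAG Architecture
RAG has gone through three generations, each addressing pain points of the one before it.
1.1 Architecture Evolution Timeline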
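Timeline: the evolution of RAG architectures
- 2022: Naive RAG (basic): simple retrieval + generation
- 2023: Advanced RAG (enhanced): query enhancement + hybrid retrieval + reranking
- 2024: Modular RAG (modular): dynamic routing + self-evaluation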
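1.2 Detailed Comparison of the Three Generations
| Dimension | Naive RAG | Advanced RAG | Modular RAG |
| --- | --- | --- | --- |
| Complexity | Simple | Medium | Complex |
| Retrieval precision | Basic | High (hybrid + reranking) | Very high (dynamic optimization) |
| Flexibility | Fixed pipeline | Configurable strategies | Fully modular |
| Typical use | Simple Q&A | Enterprise applications | Complex reasoning tasks |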
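1.3 Naive RAG – Basic Architecture
How it works: this is the simplest RAG implementation, comparable to looking a word up in a dictionary and then translating it: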
User question → Vector retrieval → Top-K relevant documents → Build prompt → LLM generates answer → Return result
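Terminology:
- Vector retrieval: convert text into high-dimensional vectors and find relevant documents by computing vector similarity
- Top-K: retrieve the K most relevant documents; K is typically between 3 and 10
- Prompt engineering: carefully designing the prompt given to the LLM in order to get better output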
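The vector retrieval and Top-K terms above can be sketched without any external library. This is a minimal illustration, not how a vector database works internally: the three-dimensional vectors and document ids are made up for the example, whereas real embedding models produce hundreds or thousands of dimensions and real stores use approximate nearest-neighbor indexes.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=3):
    """Return (doc_id, score) pairs for the k most similar documents."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical embeddings, invented for illustration only.
doc_vecs = {
    "rag_intro": [0.9, 0.1, 0.0],
    "vector_db": [0.2, 0.9, 0.1],
    "embedding": [0.1, 0.3, 0.9],
}
query_vec = [1.0, 0.2, 0.0]

results = top_k(query_vec, doc_vecs, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

The query vector points in nearly the same direction as `rag_intro`, so that document ranks first; `k` controls how many candidates are passed on to the prompt.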
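Python implementation example
# Legacy LangChain import paths; newer releases provide these classes in
# langchain_openai and langchain_community.
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

# Step 1: initialize the embedding model, which converts text into vectors
embeddings = OpenAIEmbeddings()

# Step 2: create the vector store, which stores and retrieves document vectors
vectorstore = Chroma(
    collection_name="knowledge_base",
    embedding_function=embeddings,
)

# Add knowledge documents. These are plain strings, so use add_texts();
# add_documents() expects Document objects.
documents = [
    "RAG is short for Retrieval-Augmented Generation…",
    "A vector database is a database built to store and retrieve vectors…",
    "An embedding model converts text into high-dimensional vectors…",
]
vectorstore.add_texts(documents)

# Step 3: build the RAG chain
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
qa_chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-4", temperature=0),
    retriever=retriever,
    return_source_documents=True,
)

# Run a query
result = qa_chain({"query": "What is RAG?"})
print(result["result"])            # the generated answer
print(result["source_documents"])  # the retrieved chunks used as context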
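1.4 Advanced RAG – Enhanced Architecture
Core pain points: three problems with Naive RAG
- Semantic gap: the user's question and the documents use different wording, so retrieval fails
- Insufficient precision: vector similarity is not the same as relevance to the question
- Single dimension: relying only on semantic retrieval misses exact keyword matches
How Advanced RAG addresses them: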
User question → Query enhancement → Hybrid retrieval (sparse + dense) → Reranking → Build prompt → LLM generation
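Terminology:
- Query enhancement: rewrite the user's question with techniques such as HyDE so that it is easier to match against the index
- Hybrid retrieval: combine sparse retrieval (BM25) with dense retrieval (vectors) to cover both keyword and semantic matches
- Reranking: rescore the retrieved candidates with a dedicated model to improve the quality of the top results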
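One common way to merge a sparse ranking and a dense ranking is Reciprocal Rank Fusion (RRF), which this guide's table of contents lists under hybrid search. The sketch below is an illustration under assumptions: the document ids and both input rankings are invented, and `k=60` is the conventional default constant rather than a value taken from this guide.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc ids: each doc scores sum(1 / (k + rank))."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)

# Hypothetical outputs of the two retrievers for the same query.
sparse_ranking = ["doc_a", "doc_b", "doc_c"]  # e.g. from BM25
dense_ranking = ["doc_b", "doc_d", "doc_a"]   # e.g. from vector search

fused = reciprocal_rank_fusion([sparse_ranking, dense_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Documents that appear in both lists (`doc_a`, `doc_b`) rise above documents found by only one retriever. RRF uses ranks only, so the raw BM25 and cosine scores never need to be put on a common scale.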
1.5 Modular RAG – Modular Architecture
Design philosophy: break RAG into pluggable modules that can be combined freely, like building blocks
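Flowchart: User question → Intelligent routing → one of Direct retrieval (simple question), Question decomposition (complex question), or Iterative retrieval (multi-step reasoning) → Reranking / Multi-path retrieval → Quality evaluation → Return answer (satisfied) or retry (not satisfied)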
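Terminology:
- Intelligent routing: automatically choose the best retrieval strategy for the type of question
- Question decomposition: split a complex question into sub-questions and retrieve for each one separately
- Iterative retrieval: use the results of one retrieval round to improve the next
- Self-evaluation: the system assesses its own retrieval quality and adjusts accordingly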
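The routing, decomposition, iterative retrieval, and self-evaluation modules described above can be sketched as plain functions behind a dispatch table. Everything here is a hypothetical stand-in: the two-sentence corpus, the keyword retriever, the string-matching router, and the "non-empty result" quality check are placeholders for the real index, classifier, and evaluator a production system would plug in.

```python
from typing import Callable, List, Tuple

Retrieve = Callable[[str], List[str]]

# Hypothetical corpus and keyword retriever, standing in for a real index.
CORPUS = [
    "BM25 ranks documents by keyword overlap",
    "Rerankers rescore retrieved passages",
]

def keyword_retrieve(query: str) -> List[str]:
    words = set(query.lower().split())
    return [doc for doc in CORPUS if words & set(doc.lower().split())]

def route(question: str) -> str:
    """Toy router: choose a strategy from surface features of the question."""
    if " and " in question:
        return "decompose"
    if question.lower().startswith(("why", "how")):
        return "iterative"
    return "direct"

def direct(question: str, retrieve: Retrieve) -> List[str]:
    return retrieve(question)

def decompose(question: str, retrieve: Retrieve) -> List[str]:
    docs: List[str] = []
    for sub_question in question.split(" and "):
        for doc in retrieve(sub_question):
            if doc not in docs:
                docs.append(doc)
    return docs

def iterative(question: str, retrieve: Retrieve, rounds: int = 2) -> List[str]:
    docs: List[str] = []
    query = question
    for _ in range(rounds):
        hits = retrieve(query)
        docs.extend(doc for doc in hits if doc not in docs)
        if hits:
            # Feed the top hit of this round into the next round's query.
            query = question + " " + hits[0]
    return docs

STRATEGIES = {"direct": direct, "decompose": decompose, "iterative": iterative}

def is_good_enough(docs: List[str]) -> bool:
    """Placeholder self-evaluation: accept any non-empty result."""
    return len(docs) > 0

def modular_retrieve(question: str, retrieve: Retrieve) -> Tuple[str, List[str]]:
    strategy = route(question)
    docs = STRATEGIES[strategy](question, retrieve)
    if not is_good_enough(docs) and strategy != "iterative":
        # Fall back to another module when the evaluation fails.
        strategy, docs = "iterative", iterative(question, retrieve)
    return strategy, docs

print(modular_retrieve("What is BM25", keyword_retrieve))
print(modular_retrieve("What is BM25 and what are rerankers", keyword_retrieve))
```

Because each strategy shares the same `(question, retrieve)` shape, a module can be swapped or added by editing `STRATEGIES` alone, which is the building-block property the design philosophy describes.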
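1.6 End-to-End RAG Enhancement Roadmap

2. Pre-Retrieval Enhancement
Pre-retrieval enhancement aims to improve the user's query so that the retrieval system can match it more easily.
2.1 HyDE (Hypothetical Document Embeddings)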
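[Diagram: user question → LLM generates a hypothetical answer → hypothetical document → vector encoding → vector retrieval → relevant documents → returned results]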
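Core idea: user questions are often short or loosely worded, so their embeddings may not land near the passages that answer them. HyDE has the LLM write a hypothetical answer first and retrieves with that answer instead of the raw question.
How it works
- Original question: "Why does my phone battery run down so fast?"
- Generated hypothetical answer: "The main reasons a phone battery runs down quickly include background apps consuming power, screen brightness set too high, battery aging…"
- Retrieve with the hypothetical answer: it shares vocabulary and phrasing with real answer passages, so matches are usually more precise than with the question alone.
Terminology:
- Hypothetical document: a plausible answer generated by the LLM; it contains the keywords and semantics related to the question
- Vector representation: the embedding of the hypothetical answer usually lies closer to the embeddings of real answers than the embedding of the question does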
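from langchain.prompts import PromptTemplate

# HyDE prompt
hyde_prompt = PromptTemplate(
    input_variables=["question"],
    template="""Write a plausible answer to this question: {question}
Requirement: the answer should read like a passage from an encyclopedia or professional documentation."""
)

def hyde_retrieval(question, vectorstore, top_k=3):
    # Assumes `llm` is an already-initialized LangChain LLM defined elsewhere.
    # Generate the hypothetical document
    hypothetical_doc = llm(hyde_prompt.format(question=question))
    # Run vector retrieval with the hypothetical document instead of the raw question
    retriever = vectorstore.as_retriever(search_kwargs={"k": top_k})
    return retriever.get_relevant_documents(hypothetical_doc)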
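To make the intuition concrete, the sketch below compares a question and a hypothetical answer against a stored passage. It is only an illustration: `bow_cosine` is a toy bag-of-words similarity standing in for a real embedding model, and all three texts are made up for this example.

```python
import math
import re
from collections import Counter


def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity between word-count vectors (toy stand-in for embeddings)."""
    va = Counter(re.findall(r"\w+", a.lower()))
    vb = Counter(re.findall(r"\w+", b.lower()))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


# Illustrative texts (assumed for this example).
question = "why does my phone die so fast"
hypothetical_answer = (
    "Phone battery drains quickly because of background apps, "
    "high screen brightness, and battery aging."
)
stored_passage = (
    "Common causes of fast battery drain include background apps, "
    "screen brightness set too high, and an aging battery."
)

sim_question = bow_cosine(question, stored_passage)
sim_hyde = bow_cosine(hypothetical_answer, stored_passage)
print(f"question vs passage:            {sim_question:.3f}")
print(f"hypothetical answer vs passage: {sim_hyde:.3f}")
```

In this toy setup the hypothetical answer scores higher because it reuses the passage's vocabulary. A learned embedding model captures more than word overlap, so the size of the gap in a real system will differ.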
2.2 Query2Doc
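[Diagram: user question → LLM generates a document fragment → document fragment (50-100 characters) → vector encoding → vector retrieval → relevant documents → returned results]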
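Core idea: similar to HyDE, but Query2Doc focuses on generating a shorter document fragment to improve retrieval.
Query2Doc vs HyDE
| Aspect | HyDE | Query2Doc |
| --- | --- | --- |
| Output length | Longer (a complete answer) | Shorter (a key fragment) |
| Focus of generation | Answering the question fully | Capturing the key concepts |
| Typical use | Question answering systems | Document retrieval |
How it works: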
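# Query2Doc prompt
query2doc_prompt = PromptTemplate(
    input_variables=["question"],
    template="""Based on the question below, write a short document fragment (50-100 characters).
Requirements:
1. Include the key concepts from the question
2. Use domain terminology
3. Keep it concise
Question: {question}
Document fragment:"""
)

def query2doc_retrieval(question, vectorstore):
    # Generate the document fragment (`llm` is assumed to be an initialized LangChain LLM)
    doc_fragment = llm(query2doc_prompt.format(question=question))
    # Retrieve with the fragment
    retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
    return retriever.get_relevant_documents(doc_fragment)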
2.3 Query Rewriting
Core idea: rewrite a colloquial or vague user question into a precise form that uses domain terminology.
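# Query rewriting prompt
rewrite_prompt = PromptTemplate(
    template="""Rewrite the following user question into a form better suited to search.
Requirements: keep the core intent, use more precise terminology, and add likely synonyms.
Original question: {question}
Rewritten question:""",
    input_variables=["question"]
)
# Example
# Original:  "my phone keeps dying, what do I do"
# Rewritten: "Phone battery life is short; how to fix excessive battery drain"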
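The prompt alone does not show how the rewritten query reaches the retriever. The sketch below wires the two together under stated assumptions: `llm_call` and `search` are plain callables supplied by the caller, and `fake_llm` and `fake_search` are hypothetical stand-ins so the example runs without an API key or a vector store.

```python
from typing import Callable, List

REWRITE_TEMPLATE = (
    "Rewrite the following user question into a form better suited to search.\n"
    "Requirements: keep the core intent, use more precise terminology, "
    "and add likely synonyms.\n"
    "Original question: {question}\n"
    "Rewritten question:"
)


def rewrite_query(question: str, llm_call: Callable[[str], str]) -> str:
    """Return the rewritten query, or the original question if the LLM returns nothing."""
    rewritten = llm_call(REWRITE_TEMPLATE.format(question=question)).strip()
    return rewritten or question


def rewrite_and_retrieve(
    question: str,
    llm_call: Callable[[str], str],
    search: Callable[[str], List[str]],
) -> List[str]:
    """Rewrite the question first, then search with the rewritten form."""
    return search(rewrite_query(question, llm_call))


# Hypothetical stand-ins for a real LLM and a real retriever.
def fake_llm(prompt: str) -> str:
    return "phone battery drains quickly, how to extend battery life"


def fake_search(query: str) -> List[str]:
    corpus = ["battery life tips", "screen repair guide"]
    words = query.replace(",", " ").split()
    return [doc for doc in corpus if any(word in doc.split() for word in words)]


hits = rewrite_and_retrieve("my phone keeps dying, what do I do", fake_llm, fake_search)
print(hits)
```

The fallback to the original question is a defensive choice for empty LLM output, not something the rewriting technique itself requires.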
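2.4 Query Decomposition
Core idea: break a complex, multi-part question into several simple sub-questions, retrieve for each one separately, then merge the results.
Why decompose a query?
User question: "What are the advantages of RAG? How is it applied in real projects? What are the challenges?"
This question contains three sub-questions. A single retrieval pass may not find any document that covers all of them.
Solution: retrieve for each sub-question on its own and merge the results.
Terminology:
- Compound question: a question that contains several sub-questions or covers several aspects
- Independent sub-question retrieval: each sub-question is retrieved separately, which improves recall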
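from langchain.prompts import PromptTemplate

# Query decomposition prompt
decompose_prompt = PromptTemplate(
    template="""Decompose the following complex question into several independent sub-questions.
Requirements:
1. Each sub-question must be retrievable on its own
2. Sub-questions should overlap as little as possible
3. Produce 3-5 sub-questions
4. Put each sub-question on its own line
Original question: {question}
Sub-questions (one per line):""",
    input_variables=["question"]
)

def decompose_query(question, vectorstore):
    """Decompose the query, then retrieve for each sub-question."""
    # 1. Decompose the question (`llm` is assumed to be an initialized LangChain LLM)
    sub_questions_str = llm(decompose_prompt.format(question=question))
    sub_questions = [q.strip() for q in sub_questions_str.split('\n') if q.strip()]
    print(f"Decomposed into {len(sub_questions)} sub-questions:")
    for i, sq in enumerate(sub_questions, 1):
        print(f"  {i}. {sq}")
    # 2. Retrieve for each sub-question
    all_docs = []
    retriever = vectorstore.as_retriever(search_kwargs={"k": 2})
    for sq in sub_questions:
        docs = retriever.get_relevant_documents(sq)
        all_docs.extend(docs)
    # 3. Deduplicate by content (first-seen order is kept) and return the top N
    unique_docs = list({doc.page_content: doc for doc in all_docs}.values())
    return unique_docs[:5]

# Usage example
complex_question = "What are the advantages of RAG? How is it applied in real projects?"
docs = decompose_query(complex_question, vectorstore)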
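LLMs often return the sub-question list with numbering, bullets, or repeated lines, which a plain split on newlines passes straight to the retriever. The helper below is one possible way to clean that output; `parse_sub_questions` and its `max_questions` parameter are names introduced here for illustration.

```python
import re
from typing import List


def parse_sub_questions(llm_output: str, max_questions: int = 5) -> List[str]:
    """Split LLM output into clean, de-duplicated sub-questions (one per line)."""
    cleaned: List[str] = []
    for line in llm_output.splitlines():
        # Drop leading list markers such as "1.", "2)", "3、", "-", "*" or "•".
        text = re.sub(r"^\s*(?:\d+[.)、]|[-*•])\s*", "", line).strip()
        if text and text not in cleaned:
            cleaned.append(text)
    return cleaned[:max_questions]


raw = """1. What are the advantages of RAG?
2. How is RAG applied in real projects?

- What challenges does RAG face?
1. What are the advantages of RAG?"""

sub_questions = parse_sub_questions(raw)
for sq in sub_questions:
    print(sq)
```

Capping the list at `max_questions` mirrors the "3-5 sub-questions" instruction in the prompt and bounds the number of retrieval calls.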
2.5 Take Step Back
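Core idea: have the model "take a step back" and restate the question in terms of higher-level concepts before retrieving (a technique commonly called step-back prompting).
Take Step Back example
User question: "How do decorators work in Python? How are they used in asynchronous programming?"
After stepping back: "The decorator pattern and asynchronous programming concepts in Python"
Benefit: retrieving on the higher-level concepts surfaces broader background material than the specific question alone.
Terminology:
- Level of abstraction: moving from a specific question up to a more general concept
- Concept association: higher-level concepts connect the query to more related documents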
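# Take Step Back prompt
step_back_prompt = PromptTemplate(
    template="""You are an AI assistant skilled at conceptual abstraction.
Abstract the following specific question into a higher-level conceptual question.
Requirements:
1. Extract the core concepts in the question
2. Restate the question at a higher level
3. Preserve the semantic scope of the question
Original question: {question}
Abstracted question:""",
    input_variables=["question"]
)

def take_step_back_retrieval(question, vectorstore):
    """Step-back retrieval."""
    # 1. Generate the abstracted question (`llm` is assumed to be an initialized LangChain LLM)
    abstract_question = llm(step_back_prompt.format(question=question))
    print(f"Original question: {question}")
    print(f"Abstracted question: {abstract_question}")
    # 2. Retrieve with the abstracted question
    retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
    abstract_docs = retriever.get_relevant_documents(abstract_question)
    # 3. Retrieve with the original question
    original_docs = retriever.get_relevant_documents(question)
    # 4. Merge and deduplicate by content
    all_docs = abstract_docs + original_docs
    unique_docs = list({doc.page_content: doc for doc in all_docs}.values())
    return unique_docs[:5]

# Usage example
question = "How do decorators work in Python? How are they used in asynchronous programming?"
docs = take_step_back_retrieval(question, vectorstore)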
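Take Step Back vs Query Decomposition:
| Aspect | Take Step Back | Query Decomposition |
| --- | --- | --- |
| Approach | Abstract to a higher level | Split into multiple sub-questions |
| Retrieval strategy | Abstracted question plus original question | Retrieve each sub-question separately |
| Best suited to | Questions that need a global view | Multi-faceted compound questions |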
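3. Retrieval Enhancement
Retrieval enhancement aims to raise both recall and precision by combining multiple indexes and retrieval strategies.
3.1 Core Concepts
The core idea of multi-index retrieval: each index dimension captures a different kind of information about the documents.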
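[Diagram: original documents → sentence index (precise matching), summary index (global view), question index (Q&A matching), parent document index (complete context) → multi-path recall → re-ranking → final results]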
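3.2 Hybrid Search
Core idea: combine the strengths of sparse retrieval and dense retrieval.
Terminology:
| Method | Principle | Strength | Weakness |
| --- | --- | --- | --- |
| Sparse retrieval (BM25) | Term-frequency statistics; matches keywords exactly | Exact keyword matching | Does not capture semantics |
| Dense retrieval (vectors) | Semantic similarity; captures meaning | Recognizes semantically similar text | Can miss exact keywords |
| Hybrid search | Fuses both result lists | Balances the two | Higher compute cost |
How BM25 Works
BM25 is one of the most widely used sparse retrieval algorithms. Its core idea is to saturate term frequency, so that repeated occurrences of a term add progressively less to the score: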
$$
\text{score}(D, Q) = \sum_{i=1}^{n} \text{IDF}(q_i) \times \frac{f(q_i, D) \times (k_1 + 1)}{f(q_i, D) + k_1 \times \left(1 - b + b \times \frac{|D|}{\text{avgdl}}\right)}
$$
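Reading the formula:
- $\text{IDF}(q_i)$: inverse document frequency of query term $q_i$; rarer terms receive a higher weight
- $f(q_i, D)$: the number of times term $q_i$ occurs in document $D$
- $|D|$: the length of document $D$ in terms
- $\text{avgdl}$: the average document length across the collection
- $k_1$: saturation parameter (recommended 1.2-2.0); controls how much weight term frequency carries
- $b$: length normalization parameter (recommended 0.75); controls how strongly document length affects the score
BM25 vs TF-IDF: BM25 improves on TF-IDF through term-frequency saturation and document-length normalization, which keeps long documents from gaining an unfair advantage.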
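The formula can be turned into a few lines of plain Python. The sketch below is a minimal, unoptimized version on whitespace-tokenized text. The guide does not define IDF, so the smoothed variant used here is an assumption, and `bm25_scores` is a name introduced for this example.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: List[str], docs: List[List[str]],
                k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score every tokenized document against a tokenized query with BM25."""
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF (assumed variant); always positive.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            f = tf[term]
            length_norm = 1 - b + b * len(doc) / avgdl
            score += idf * f * (k1 + 1) / (f + k1 * length_norm)
        scores.append(score)
    return scores


docs = [
    "rag combines retrieval with generation".split(),
    "bm25 is a sparse retrieval algorithm".split(),
    "cats sleep most of the day".split(),
]
scores = bm25_scores("sparse retrieval".split(), docs)
for doc, s in zip(docs, scores):
    print(f"{s:.3f}  {' '.join(doc)}")
```

The second document matches both query terms and ranks first; the third matches none and scores zero. Production systems use an inverted index rather than scanning every document.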
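How Vector Retrieval Works
Embedding model: converts text into a high-dimensional vector (commonly 768 to 3072 dimensions); similar texts lie closer together in the vector space.
Similarity computation: cosine similarity is the usual choice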
$$
\text{similarity} = \cos(\theta) = \frac{A \cdot B}{\|A\| \times \|B\|}
$$
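As a quick check of the formula, here is a dependency-free sketch. The zero-vector guard is an addition for robustness, and the example vectors are arbitrary.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """cos(theta) = (A . B) / (||A|| * ||B||); returns 0.0 if either vector is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Vectors pointing the same way score about 1; perpendicular vectors score 0.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```

Because the score depends only on direction, scaling a vector leaves it unchanged, which is why embeddings are often compared this way regardless of their magnitude.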
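RRF Fusion Algorithm
RRF (Reciprocal Rank Fusion): merges the results of several retrieval systems using only each document's rank in each list. Core formula: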
$$
\text{final\_score}(d) = \sum_{i=1}^{n} \frac{1}{k + \text{rank}_i(d)}
$$
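Where:
- $k$ is a constant (typically 60) that controls how quickly a document's contribution decays with rank
- $\text{rank}_i(d)$ is the rank of document $d$ in the $i$-th retrieval system (counted from 0 here, so the top document has rank 0). Some formulations count ranks from 1 instead, which shifts every term by one position.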
Worked example:
- Document ranked 1st: $\frac{1}{60+0} \approx 0.0167$
- Document ranked 2nd: $\frac{1}{60+1} \approx 0.0164$
- Document ranked 10th: $\frac{1}{60+9} \approx 0.0145$
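The fusion step itself is only a few lines. The sketch below follows the 0-based rank convention used in this section with k = 60; `rrf_fuse` and the document IDs are illustrative names, not part of any library.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion (0-based ranks)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):  # rank 0 is the top result
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, highest first.
    return dict(sorted(scores.items(), key=lambda item: item[1], reverse=True))


bm25_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_d", "doc_a"]
fused = rrf_fuse([bm25_ranking, vector_ranking])
for doc_id, score in fused.items():
    print(f"{doc_id}: {score:.4f}")
```

Documents that appear near the top of both lists accumulate the highest score, so `doc_b` (ranks 1 and 0) edges out `doc_a` (ranks 0 and 2). Only ranks are used, so the raw BM25 and cosine scores never need to be put on a common scale.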
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever  # requires the `rank_bm25` package

# `documents` and `vectorstore` are assumed to have been created earlier.
# Sparse retriever (BM25)
bm25_retriever = BM25Retriever.from_documents(documents, k=5)
# Dense retriever (vectors)
vector_retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
# Ensemble retriever: fuses the two result lists with weighted RRF
ensemble_retriever = EnsembleRetriever(
    retrievers=[bm25_retriever, vector_retriever],
    weights=[0.5, 0.5]  # adjustable weights
)
results = ensemble_retriever.get_relevant_documents("What is RAG?")
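3.3 Sentence Window Retriever
Core concept: sentence window retrieval balances matching precision against context.
How it works: each sentence is embedded and indexed on its own, while a window of neighboring sentences is stored in its metadata; at query time the matched sentence is replaced by its window before it is passed to the LLM.
Advantages:
- Matching is done on single sentences, so the embeddings are focused and precision is high
- The returned text is the larger window, so the context is complete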
# Import paths follow llama-index releases before 0.10;
# from 0.10 onward the same classes live under `llama_index.core`.
from llama_index.node_parser import SentenceWindowNodeParser
from llama_index import VectorStoreIndex, Document
from llama_index.postprocessor import MetadataReplacementPostProcessor

# Create the sentence window parser
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=3,  # window size: 3 sentences on each side
    window_metadata_key="window",
    original_text_metadata_key="original_text"
)
# Parse the documents into nodes (`documents` is assumed to be loaded already)
nodes = node_parser.get_nodes_from_documents(documents)
# Build the index
index = VectorStoreIndex(nodes)
# Create the query engine (restores the window after retrieval)
query_engine = index.as_query_engine(
    similarity_top_k=2,
    node_postprocessors=[
        MetadataReplacementPostProcessor(target_metadata_key="window")
    ]
)
response = query_engine.query("What are the ways to optimize RAG?")
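To show what the parser and post-processor do conceptually, here is a library-free sketch. It assumes a naive regex sentence splitter and plain dictionaries; `build_sentence_windows` is a hypothetical helper, not the LlamaIndex implementation.

```python
import re
from typing import Dict, List


def build_sentence_windows(text: str, window_size: int = 1) -> List[Dict[str, str]]:
    """Split text into sentences and attach each sentence's surrounding window."""
    # Naive splitter (assumption): break on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    nodes = []
    for i, sentence in enumerate(sentences):
        start = max(0, i - window_size)
        end = min(len(sentences), i + window_size + 1)
        nodes.append({
            "original_text": sentence,                 # what gets embedded and matched
            "window": " ".join(sentences[start:end]),  # what gets passed to the LLM
        })
    return nodes


text = ("RAG retrieves documents. It then builds a prompt. "
        "The LLM answers. Sources are cited.")
nodes = build_sentence_windows(text, window_size=1)
print(nodes[1]["original_text"])
print(nodes[1]["window"])
```

A match on the second sentence returns that sentence plus one neighbor on each side, which is the same trade the `window_size` parameter makes in the LlamaIndex example.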
3.4 Parent Document Retriever
Core concept: retrieve with small chunks, return large chunks.
Design
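[Diagram: large document (1000 characters) → split → child document 1 (200 characters), child document 2 (200 characters), child document 3 (200 characters) → vector retrieval → child document 2 matched → full parent document (1000 characters) returned]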
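How it works: the source is split into parent chunks and each parent into smaller child chunks; only the children are embedded, and a matching child returns its parent.
Terminology:
- Parent document: a larger chunk (for example 1000 characters) that carries the complete context
- Child document: a smaller chunk (for example 200 characters) used for precise matching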
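from langchain.retrievers import ParentDocumentRetriever
from langchain.storage import InMemoryStore
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Child splitter (chunks used for retrieval)
child_splitter = RecursiveCharacterTextSplitter(
    chunk_size=200,    # 200 characters per chunk
    chunk_overlap=50,  # 50 characters of overlap
    separators=["\n\n", "\n", "。", "!", "?", ",", " ", ""]
)
# Parent splitter (chunks that are returned)
parent_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,   # 1000 characters per chunk
    chunk_overlap=100,
    separators=["\n\n", "\n", "。", "!", "?"]
)
# In-memory docstore that holds the parent chunks
store = InMemoryStore()
# Create the parent document retriever (`vectorstore` is assumed to exist)
retriever = ParentDocumentRetriever(
    vectorstore=vectorstore,
    docstore=store,
    child_splitter=child_splitter,
    parent_splitter=parent_splitter
)
# Index the documents: children go to the vector store, parents to the docstore
retriever.add_documents(documents)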
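The mapping from child chunks back to their parent is the heart of this technique. The sketch below shows it without any library: fixed-size splitting and substring matching are deliberate simplifications standing in for the recursive splitter and vector search, and all function names are illustrative.

```python
from typing import Dict, List, Tuple


def split_fixed(text: str, size: int) -> List[str]:
    """Split text into consecutive chunks of at most `size` characters (no overlap)."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def build_parent_child_index(
    text: str, parent_size: int = 1000, child_size: int = 200
) -> Tuple[Dict[int, str], List[Tuple[str, int]]]:
    """Return (parents by id, list of (child_text, parent_id))."""
    parents = dict(enumerate(split_fixed(text, parent_size)))
    children = [
        (child, parent_id)
        for parent_id, parent in parents.items()
        for child in split_fixed(parent, child_size)
    ]
    return parents, children


def retrieve_parent(query: str, parents: Dict[int, str],
                    children: List[Tuple[str, int]]) -> str:
    """Match the query against children (toy substring match), return the parent chunk."""
    for child_text, parent_id in children:
        if query in child_text:
            return parents[parent_id]
    return ""


# 2000 characters of filler with one distinctive term in the second parent.
text = "A" * 1000 + "B" * 400 + "needle" + "C" * 594
parents, children = build_parent_child_index(text)
result = retrieve_parent("needle", parents, children)
print(len(parents), len(children), len(result))
```

The query hits a 200-character child, but the caller receives the full 1000-character parent, which is the behavior `ParentDocumentRetriever` provides with real embeddings.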
3.5 Summary Retriever
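[Diagram: original document (5000 characters) → generate summary (200 characters) → summary index (vector store); user question → vector retrieval → matched summary → full document returned]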
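Core concept: retrieve against document summaries, return the full documents.
Why summary retrieval?
Scenario: a user asks about "applications of RAG in the enterprise"
Problem:
- The original document is long (5000 characters), so vector retrieval tends to focus on details and miss the overall topic
- A summary (about 200 characters) condenses the document's core content
Solution:
- Retrieve against the summary vectors (to match the topic)
- Return the full document (to supply the details)
How it works: each document is summarized once at indexing time, the summaries are embedded, and every summary keeps a pointer back to its source document.
Terminology:
- Summary index: a vector index that stores document summaries and is used for topic-level retrieval
- Global view: a summary gives an overall view of the document and avoids getting lost in details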
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.schema import Document
from langchain.vectorstores import Chroma

# 1. Summary generation prompt
summary_prompt = PromptTemplate(
    template="""Write a concise summary (100 characters or fewer) of the following document:
Document:
{context}
Summary:""",
    input_variables=["context"]
)

# 2. Generate a summary for each document
def generate_summary(documents):
    llm = OpenAI(temperature=0)
    summaries = []
    for i, doc in enumerate(documents):
        print(f"Processing document {i+1}/{len(documents)}…")
        # Generate the summary
        summary = llm(summary_prompt.format(context=doc.page_content))
        summaries.append({
            "summary": summary,
            "original_doc": doc
        })
    return summaries

# 3. Summarize the knowledge base (`documents` is assumed to be loaded already)
summaries = generate_summary(documents)

# 4. Build the index from the summaries (`embeddings` is assumed to be an initialized embedding model)
summary_docs = [
    Document(
        page_content=s["summary"],
        metadata={"doc_id": i}
    )
    for i, s in enumerate(summaries)
]
summary_index = Chroma.from_documents(
    summary_docs,
    embeddings,
    collection_name="summary_index"
)

# 5. Retrieval function
def retrieve_by_summary(query):
    """Retrieve by summary, return the original document."""
    # Find the most similar summary
    similar_summaries = summary_index.similarity_search(query, k=1)
    if not similar_summaries:
        return []
    # Look up the corresponding original document
    doc_id = similar_summaries[0].metadata["doc_id"]
    return [summaries[doc_id]["original_doc"]]

# Usage example
results = retrieve_by_summary("Application scenarios for RAG in the enterprise")
3.6 Hypothetical Questions Retriever
原始文档
LLM生成假设问题
问题1Python是什么时候创建的?
问题2Python的创始人是谁?
问题3Python有什么特点?
问题索引向量存储
用户问题Python的历史
向量检索
匹配问题
返回原始文档
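Core concept: generate in advance the questions that each document can answer, then retrieve against those questions instead of the raw document text.
Use case
Document: "Python is a high-level programming language, created by Guido van Rossum in 1991…"
Generated hypothetical questions:
- "When was Python created?"
- "Who is the creator of Python?"
- "What are Python's features?"
At retrieval time: the user asks about "the history of Python" → this matches "When was Python created?" → the original document is returned
How it works: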
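from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.schema import Document
from langchain.vectorstores import Chroma

# Prompt for question generation
question_gen_prompt = PromptTemplate(
    template="""Read the following document and generate 3 questions it could answer.
Requirements:
1. The questions should cover the core content of the document
2. The questions should be phrased in varied ways
3. Put each question on its own line
Document:
{context}
Questions:""",
    input_variables=["context"]
)

def generate_questions(documents):
    """Generate hypothetical questions for each document."""
    llm = OpenAI(temperature=0.7)
    all_questions = []
    for i, doc in enumerate(documents, 1):
        # Generate the questions
        response = llm(question_gen_prompt.format(context=doc.page_content))
        # Split on real newlines (the original split on a literal backslash-n)
        questions = [q.strip() for q in response.split('\n') if q.strip()]
        # Store the mapping from each question to its source document
        for q in questions:
            all_questions.append({
                "question": q,
                "source_doc": doc,
                "source_id": i
            })
    return all_questions

# Usage example (`documents` and `embeddings` are assumed to be defined as in the earlier sections)
questions = generate_questions(documents)
# Build the index from the questions
question_docs = [
    Document(page_content=q["question"], metadata={"source_id": q["source_id"]})
    for q in questions
]
question_index = Chroma.from_documents(question_docs, embeddings)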
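The question index stores only the generated questions, so a matched question still has to be mapped back to its source document through `source_id`. A minimal, dependency-free sketch of that lookup step follows. The word-overlap scorer `overlap_score`, the helper `retrieve_sources`, and the sample data are hypothetical stand-ins for the vector search and are not part of LangChain or Chroma.

```python
import re


def overlap_score(query: str, question: str) -> float:
    """Toy similarity: Jaccard overlap of word sets (a stand-in for embedding similarity)."""
    q_words = set(re.findall(r"\w+", query.lower()))
    d_words = set(re.findall(r"\w+", question.lower()))
    if not q_words or not d_words:
        return 0.0
    return len(q_words & d_words) / len(q_words | d_words)


def retrieve_sources(query, question_records, source_docs, top_k=2):
    """Match the query against generated questions, then return the source documents.

    question_records: list of {"question": str, "source_id": int}
    source_docs: dict mapping source_id -> original document text
    """
    ranked = sorted(
        question_records,
        key=lambda rec: overlap_score(query, rec["question"]),
        reverse=True,
    )
    seen, results = set(), []
    for rec in ranked:
        sid = rec["source_id"]
        if sid in seen:  # several questions can point at the same document
            continue
        seen.add(sid)
        results.append(source_docs[sid])
        if len(results) == top_k:
            break
    return results


source_docs = {
    1: "Python is a high-level programming language created by Guido van Rossum in 1991.",
    2: "Rust is a systems programming language focused on memory safety.",
}
question_records = [
    {"question": "When was Python created?", "source_id": 1},
    {"question": "Who is the creator of Python?", "source_id": 1},
    {"question": "What is Rust designed for?", "source_id": 2},
]

top = retrieve_sources("when was python first created", question_records, source_docs, top_k=1)
```

With a real vector store the same idea applies: read `source_id` from the metadata of each matched question and return the corresponding original document, skipping ids that were already returned.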
3.7 Multi-Index Fusion Strategy
[Diagram: user question → sentence index retrieval, summary index retrieval, question index retrieval, parent document index retrieval → multi-route recall (merge results) → deduplication (remove duplicate documents) → re-ranking → Top-N results]
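Core concept: query several index types at the same time and fuse all of their results.
Terminology:
- Multi-route recall: retrieving documents from several index channels at the same time
- Result fusion: merging, deduplicating, and ranking the results of multiple retrievals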
def multi_index_retrieval(query):
    """Multi-index fusion retrieval."""
    # The three retrievers and `reranker` are assumed to be built as in the earlier sections
    all_docs = []
    # 1. Sentence window retrieval
    sentence_results = sentence_retriever.get_relevant_documents(query)
    all_docs.extend(sentence_results)
    # 2. Summary retrieval
    summary_results = summary_retriever.get_relevant_documents(query)
    all_docs.extend(summary_results)
    # 3. Hypothetical question retrieval
    question_results = question_retriever.get_relevant_documents(query)
    all_docs.extend(question_results)
    # 4. Deduplicate by text content
    unique_docs = list({doc.page_content: doc for doc in all_docs}.values())
    # 5. Re-rank
    reranked_docs = reranker(query, unique_docs)
    return reranked_docs[:5]
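Concatenating the result lists and deduplicating them discards the rank each retriever assigned. One alternative, sketched here under assumptions, is to merge the lists with Reciprocal Rank Fusion (the RRF algorithm covered in section 3.2) before re-ranking. The function name `rrf_merge`, the constant `k=60`, and the use of plain strings as documents are illustrative choices and not part of the original code.

```python
from collections import defaultdict


def rrf_merge(result_lists, k=60, top_n=5):
    """Fuse several ranked lists with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Duplicates collapse into a single entry.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, doc in enumerate(results, start=1):
            scores[doc] += 1.0 / (k + rank)
    # Sort by fused score, highest first
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [doc for doc, _ in ranked[:top_n]]


# Hypothetical hits from three index channels (plain strings for illustration)
sentence_hits = ["doc_a", "doc_b", "doc_c"]
summary_hits = ["doc_b", "doc_d"]
question_hits = ["doc_b", "doc_a"]

merged = rrf_merge([sentence_hits, summary_hits, question_hits])
```

A document found by several channels accumulates score from each of them, so it rises above documents found by only one. The dictionary key must be hashable, so with document objects a text field or a stable document id would serve as the key.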
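4. Post-Retrieval Enhancement
Post-retrieval enhancement refines the retrieved results to improve the quality of the final answer.
4.1 Re-ranking Principles
Core problem: vector retrieval ranks by cosine similarity, but a high similarity score does not guarantee that a document actually answers the question.
Example
The user asks: "How do I lose weight?"
- Document A (vector similarity 0.85): "Losing weight requires controlling your diet and exercising" – directly relevant
- Document B (vector similarity 0.87): "This document discusses various health issues, including how to gain weight, how to lose weight…" – relevant but verbose
After re-ranking: Document A ranks first because it answers the question directly
Terminology:
- Bi-encoder (two-tower model): the model used for vector retrieval; the query and the document are encoded independently
- Cross-encoder: the model used for re-ranking; the query and the document are encoded together, which is more accurate but slower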
[Diagram: vector retrieval returns the Top-10 documents (coarse ranking) → re-rank model scores them precisely → Top-3 high-quality documents]
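The coarse-then-fine flow, in which a cheap first stage narrows the corpus to a handful of candidates and a slower scorer reorders only those, can be sketched without any particular model. In this minimal illustration the function `two_stage_retrieve` and both toy scorers are hypothetical: `keyword_overlap` stands in for the bi-encoder and `overlap_density` for the cross-encoder. Neither is a real relevance model.

```python
from typing import Callable, List

Scorer = Callable[[str, str], float]


def two_stage_retrieve(
    query: str,
    corpus: List[str],
    coarse_score: Scorer,
    fine_score: Scorer,
    coarse_k: int = 10,
    final_k: int = 3,
) -> List[str]:
    """Stage 1 keeps coarse_k candidates cheaply; stage 2 re-scores only those."""
    candidates = sorted(corpus, key=lambda d: coarse_score(query, d), reverse=True)[:coarse_k]
    reranked = sorted(candidates, key=lambda d: fine_score(query, d), reverse=True)
    return reranked[:final_k]


def keyword_overlap(query: str, doc: str) -> float:
    """Toy coarse scorer: number of distinct words shared by query and document."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def overlap_density(query: str, doc: str) -> float:
    """Toy fine scorer: shared words divided by document length, favouring concise answers."""
    words = doc.lower().split()
    return keyword_overlap(query, doc) / len(words) if words else 0.0


corpus = [
    "to lose weight control diet and exercise",
    "this document covers many health topics including how to gain weight and how to lose weight",
    "a history of the bicycle",
]
top = two_stage_retrieve(
    "how to lose weight", corpus, keyword_overlap, overlap_density, coarse_k=2, final_k=1
)
```

The long document wins the coarse stage because it shares more words with the query, while the second stage promotes the concise one, mirroring the Document A and Document B example. The expensive scorer runs only on `coarse_k` documents, which is what keeps a cross-encoder affordable.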
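4.2 Comparison of Mainstream Re-rankers
| Model | Type | Strengths | Speed | Accuracy |
| --- | --- | --- | --- | --- |
| Cohere Rerank-v3.5 | Commercial API | Multilingual, best quality | Medium | Highest |
| BGE Reranker-v2 | Open source | Chinese-friendly, free | Fairly fast | High |
| FlashRank | Open source | Extremely fast, lightweight | Extremely fast | Medium |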
4.3 Re-ranking Implementation
# Option 1: Cohere Rerank
import cohere

def cohere_rerank(query, documents, top_n=3):
    co = cohere.Client('your-api-key')  # placeholder; supply a real key
    results = co.rerank(
        model="rerank-v3.5",
        query=query,
        documents=[doc.page_content for doc in documents],
        top_n=top_n
    )
    # Each result carries the index of the document in the input list
    return [documents[r.index] for r in results.results]

# Option 2: BGE Reranker
from sentence_transformers import CrossEncoder

def bge_rerank(query, documents, top_k=3):
    # Loading the model on every call is slow; in practice load it once and reuse it
    model = CrossEncoder('BAAI/bge-reranker-v2-m3')
    # Build the query-document pairs
    pairs = [[query, doc.page_content] for doc in documents]
    # Compute the relevance scores
    scores = model.predict(pairs)
    # Sort by score, highest first
    scored_docs = sorted(zip(documents, scores), key=lambda x: x[1], reverse=True)
    return [doc for doc, score in scored_docs[:top_k]]
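4.4 Context Compression
Core concept: compress the retrieved documents by removing irrelevant content and keeping the key information.
Terminology:
- Context compression: extracting the passages of a document that are most relevant to the question
- Information density: the amount of useful information per unit of length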
from langchain.chat_models import ChatOpenAI
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor

# Create the compressor
compressor = LLMChainExtractor.from_llm(ChatOpenAI(model="gpt-4"))
# Create the compression retriever
# (`vectorstore` is assumed to be the vector store built in the earlier sections)
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=vectorstore.as_retriever()
)
# Retrieve and compress
compressed_docs = compression_retriever.get_relevant_documents("What is RAG?")
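`LLMChainExtractor` delegates the extraction to an LLM call. To show the underlying idea without a model, here is a minimal extractive sketch that keeps only the sentences sharing words with the query. The function `compress_context` and its word-overlap rule are hypothetical simplifications and do not reflect how LangChain implements compression.

```python
import re


def compress_context(query: str, document: str, min_overlap: int = 1) -> str:
    """Keep only sentences that share at least min_overlap words with the query."""
    query_words = set(re.findall(r"\w+", query.lower()))
    # Split on sentence-ending punctuation followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", document.strip())
    kept = [
        s for s in sentences
        if len(query_words & set(re.findall(r"\w+", s.lower()))) >= min_overlap
    ]
    return " ".join(kept)


doc = (
    "RAG combines retrieval with generation. "
    "The office cafeteria opens at nine. "
    "Retrieval supplies external facts to the model."
)
compressed = compress_context("How does retrieval help RAG?", doc)
```

Dropping the unrelated sentence shortens the context while keeping the content that matters for the question, which is the sense in which compression raises information density. A word-overlap rule misses paraphrases, which is one reason an LLM or a re-ranker is used for this step in practice.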
5. Complete Production-Grade Examples
5.1 Production-Grade LangChain Implementation
import os
from typing import List, Dict
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA
from langchain.schema import Document
from langchain.retrievers import ContextualCompressionRetriever
from langchain_community.document_compressors.cohere_rerank import CohereRerank
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever

# OPENAI_API_KEY and COHERE_API_KEY are expected in the environment

class ProductionRAG:
    """Production-grade RAG system."""

    def __init__(
        self,
        knowledge_base: List[str],
        model: str = "gpt-4",
        top_k: int = 20,
        rerank_top_n: int = 5
    ):
        # Initialize the embedding model
        self.embeddings = OpenAIEmbeddings()
        # Create the vector database
        self.vectorstore = Chroma(
            collection_name="production_kb",
            embedding_function=self.embeddings
        )
        # Wrap the raw texts as documents; BM25 needs them even when
        # the collection is already populated
        docs = [Document(page_content=text) for text in knowledge_base]
        # Add the documents only if the collection is still empty
        if self.vectorstore._collection.count() == 0:
            self.vectorstore.add_documents(docs)
        # Hybrid retrieval: keyword (BM25) plus vector search
        bm25_retriever = BM25Retriever.from_documents(docs, k=top_k)
        vector_retriever = self.vectorstore.as_retriever(search_kwargs={"k": top_k})
        # Ensemble retriever that fuses both result lists
        self.retriever = EnsembleRetriever(
            retrievers=[bm25_retriever, vector_retriever],
            weights=[0.4, 0.6]
        )
        # Re-ranking
        compressor = CohereRerank(top_n=rerank_top_n)
        self.compression_retriever = ContextualCompressionRetriever(
            base_compressor=compressor,
            base_retriever=self.retriever
        )
        # Create the RAG chain
        self.qa_chain = RetrievalQA.from_chain_type(
            llm=ChatOpenAI(model=model, temperature=0),
            retriever=self.compression_retriever,
            return_source_documents=True
        )

    def query(self, question: str) -> Dict:
        """Run a query."""
        result = self.qa_chain({"query": question})
        return {
            "question": question,
            "answer": result["result"],
            "sources": result.get("source_documents", [])
        }
5.2 Production-Grade LlamaIndex Implementation
from llama_index import SimpleDirectoryReader, VectorStoreIndex, ServiceContext
from llama_index.llms import OpenAI
from llama_index.embeddings import OpenAIEmbedding
from llama_index.node_parser import SentenceWindowNodeParser
from llama_index.postprocessor import MetadataReplacementPostProcessor

class ProductionRAGLlama:
    """Production-grade RAG system (LlamaIndex version)."""

    def __init__(self, data_dir: str = "./data"):
        # Configure the service context
        self.service_context = ServiceContext.from_defaults(
            llm=OpenAI(model="gpt-4", temperature=0),
            embed_model=OpenAIEmbedding(model="text-embedding-ada-002"),
            chunk_size=512,
            chunk_overlap=50
        )
        # Load the documents
        documents = SimpleDirectoryReader(data_dir).load_data()
        # Sentence window parser: one node per sentence, with the
        # surrounding sentences stored under the "window" metadata key
        node_parser = SentenceWindowNodeParser.from_defaults(window_size=3)
        nodes = node_parser.get_nodes_from_documents(documents)
        # Create the index
        self.index = VectorStoreIndex(nodes, service_context=self.service_context)
        # Create the query engine; the postprocessor swaps each matched
        # sentence for its stored window before generation
        self.query_engine = self.index.as_query_engine(
            similarity_top_k=10,
            node_postprocessors=[
                MetadataReplacementPostProcessor(target_metadata_key="window")
            ],
            response_mode="compact"
        )

    def query(self, question: str) -> str:
        """Run a query."""
        response = self.query_engine.query(question)
        return str(response)