Cloud Computing Encyclopedia
An encyclopedia platform for professional cloud computing knowledge

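RAG Retrieval Enhancement Optimization Guide

Contents

  • RAG Retrieval Enhancement Optimization Guide
    • 1. Three Generations of RAG Architecture Compared
      • 1.1 Architecture Evolution Timeline
      • 1.2 Detailed Comparison of the Three Generations
      • 1.3 Naive RAG – Basic Architecture
        • Python Implementation Example
      • 1.4 Advanced RAG – Enhanced Architecture
      • 1.5 Modular RAG – Modular Architecture
      • 1.6 End-to-End RAG Enhancement Roadmap
    • 2. Pre-Retrieval Enhancement
      • 2.1 HyDE (Hypothetical Document Embeddings)
      • 2.2 Query2Doc
      • 2.3 Query Rewriting
      • 2.4 Query Decomposition
      • 2.5 Take Step Back
    • 3. Retrieval Enhancement
      • 3.1 Core Concepts
      • 3.2 Hybrid Search
        • How BM25 Works
        • How Vector Retrieval Works
        • RRF Fusion Algorithm
      • 3.3 Sentence Window Retriever
      • 3.4 Parent Document Retriever
      • 3.5 Summary Retriever
      • 3.6 Hypothetical Questions Retriever
      • 3.7 Multi-Index Fusion Strategy
    • 4. Post-Retrieval Enhancement
      • 4.1 How Re-ranking Works
      • 4.2 Comparison of Mainstream Re-rankers
      • 4.3 Implementing Re-ranking
      • 4.4 Context Compression
    • 5. Complete Production-Grade Examples
      • 5.1 Production-Grade Implementation with LangChain
      • 5.2 Production-Grade Implementation with LlamaIndex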


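Core concept: RAG (Retrieval-Augmented Generation) is an architecture that combines a retrieval system with a generative large language model (LLM). It works like giving the model a plug-in external knowledge base, which lets the system:

  • Fetch information from an external knowledge base at query time
  • Ground answers in retrieved facts, which reduces hallucinations
  • Update the knowledge base at any time without retraining the model
  • Cite the sources behind an answer, which improves explainability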

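1. Three Generations of RAG Architecture Compared

RAG has evolved through three generations, each addressing pain points of the one before it.

1.1 Architecture Evolution Timeline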


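[Figure: History of RAG architecture evolution]

  • 2022: Naive RAG (basic): simple retrieval + generation
  • 2023: Advanced RAG (enhanced): query enhancement + hybrid retrieval + re-ranking
  • 2024: Modular RAG (modular): dynamic routing + self-evaluation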

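1.2 Detailed Comparison of the Three Generations

Dimension          | Naive RAG  | Advanced RAG               | Modular RAG
Complexity         | Simple     | Medium                     | Complex
Retrieval accuracy | Basic      | High (hybrid + re-ranking) | Very high (dynamic optimization)
Flexibility        | Fixed flow | Configurable strategies    | Fully modular
Typical use cases  | Simple Q&A | Enterprise applications    | Complex reasoning tasks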

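1.3 Naive RAG – Basic Architecture

How it works: this is the simplest RAG implementation, similar to looking a word up in a dictionary and then translating:

  • Retrieve: find documents relevant to the question in a vector database
  • Build: combine the retrieved documents and the question into a prompt
  • Generate: have the LLM produce an answer based on that information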

[Figure: Naive RAG flow: user question → vector retrieval → Top-K relevant documents → build prompt → LLM generates answer → return result]

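Terminology:

  • Vector retrieval: converts text into high-dimensional vectors and finds relevant documents by computing vector similarity
  • Top-K: returns the K most relevant documents; K is typically 3 to 10
  • Prompt engineering: carefully designing the prompt given to the LLM to obtain better output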
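The retrieve and build steps can be sketched without any framework. This is a minimal illustration under stated assumptions, not a production design: `embed` is a toy bag-of-words stand-in for a real embedding model, and the function names, sample documents, and prompt wording are all made up for the example.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, docs: list[str], k: int = 3) -> list[str]:
    """Retrieve step: return the k documents most similar to the question (Top-K)."""
    q = embed(question)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Build step: put the retrieved documents and the question into one prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\nQuestion: {question}"
    )


# Hypothetical sample data for illustration only
docs = [
    "RAG combines retrieval with text generation",
    "A vector database stores and searches embedding vectors",
    "BM25 is a keyword based ranking function",
]
question = "what is a vector database"
top_docs = retrieve(question, docs, k=2)
prompt = build_prompt(question, top_docs)
print(prompt)  # the generate step would send this prompt to an LLM
```

With a real embedding model, only `embed` changes; the Top-K ranking and prompt assembly keep the same shape.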
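Python implementation example

    # These import paths match legacy LangChain releases; newer releases
    # moved these classes to the langchain_openai and langchain_community packages.
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.vectorstores import Chroma
    from langchain.chat_models import ChatOpenAI
    from langchain.chains import RetrievalQA

    # Step 1: initialize the embedding model, which converts text into vectors
    embeddings = OpenAIEmbeddings()

    # Step 2: create the vector database, which stores and retrieves document vectors
    vectorstore = Chroma(
        collection_name="knowledge_base",
        embedding_function=embeddings,
    )

    # Add knowledge documents. These are plain strings, so add_texts() is the
    # right call; add_documents() expects Document objects.
    documents = [
        "RAG is short for Retrieval-Augmented Generation…",
        "A vector database is a database built to store and retrieve vectors…",
        "An embedding model converts text into high-dimensional vectors…",
    ]
    vectorstore.add_texts(documents)

    # Step 3: build the RAG chain
    retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
    qa_chain = RetrievalQA.from_chain_type(
        llm=ChatOpenAI(model="gpt-4", temperature=0),
        retriever=retriever,
        return_source_documents=True,
    )

    # Run a query; the answer text is stored under the "result" key
    result = qa_chain({"query": "What is RAG?"})
    print(result["result"])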

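1.4 Advanced RAG – Enhanced Architecture

Core pain points: three problems with Naive RAG

  • Semantic gap: the user's question and the documents use different wording, so retrieval fails
  • Insufficient precision: vector similarity is not the same as relevance to the question
  • Single dimension: relying only on semantic retrieval misses exact keyword matches

How Advanced RAG addresses them: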


[Figure: Advanced RAG flow: user question → query enhancement → hybrid retrieval (sparse + dense) → re-ranking → build prompt → LLM generation]

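Terminology:

  • Query enhancement: rewrites the user's question with techniques such as HyDE so that it is easier to retrieve against
  • Hybrid retrieval: combines sparse retrieval (BM25) with dense retrieval (vectors) to cover both keywords and semantics
  • Re-ranking: uses a dedicated model to re-score the retrieved results and improve the quality of the top results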
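One common way to merge the sparse and dense result lists in hybrid retrieval is Reciprocal Rank Fusion (RRF), which this guide's outline lists under hybrid search. The sketch below assumes each retriever has already returned a ranked list of document ids; the ids, the function name `rrf_fuse`, and the constant `k=60` (a widely used default) are assumptions for illustration. A re-ranking model would then re-score the fused list, which is not shown here.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for determinism
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists: one from a sparse (BM25) retriever, one from a dense retriever
sparse_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_b", "doc_d", "doc_a"]

fused = rrf_fuse([sparse_hits, dense_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

RRF uses only ranks, so the BM25 scores and vector similarities never need to be normalized onto a shared scale.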

1.5 Modular RAG – Modular Architecture

Design philosophy: break RAG into pluggable modules that can be combined freely, like building blocks


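[Figure: Modular RAG flow. User question → smart routing, which branches by question type (simple question → direct retrieval; complex question → question decomposition; multi-step reasoning → iterative retrieval). Other nodes: re-ranking, multi-path retrieval, quality evaluation (outgoing edges labeled "satisfied" and "unsatisfied"), return answer.]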

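Terminology:

  • Smart routing: automatically selects the best retrieval strategy for the question type
  • Question decomposition: splits a complex question into several sub-questions that are retrieved separately
  • Iterative retrieval: uses the results of the previous retrieval round to improve the next one
  • Self-evaluation: the system automatically assesses retrieval quality and optimizes accordingly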
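The four modules above can be wired together roughly as follows. This is a toy sketch, not a reference design: every name (`route`, `decompose`, `is_good_enough`, `retrieve_stub`, `modular_rag`) is a hypothetical placeholder, the routing and evaluation rules are deliberately naive heuristics, and a real system would typically use an LLM or a classifier for each of them.

```python
from typing import Callable

Retriever = Callable[[str], list[str]]


def route(question: str) -> str:
    """Toy router: pick a strategy from surface features of the question."""
    if " and " in question:
        return "decompose"
    if question.lower().startswith("why"):
        return "iterative"
    return "direct"


def decompose(question: str) -> list[str]:
    """Toy decomposition: split a compound question on ' and '."""
    return [part.strip() for part in question.rstrip("?").split(" and ")]


def is_good_enough(docs: list[str], min_docs: int = 2) -> bool:
    """Toy self-evaluation: accept once enough documents were found."""
    return len(docs) >= min_docs


def modular_rag(question: str, retriever: Retriever, max_rounds: int = 3) -> list[str]:
    strategy = route(question)
    if strategy == "decompose":
        # Retrieve per sub-question and merge, dropping duplicates in order
        merged: list[str] = []
        for sub in decompose(question):
            merged += [d for d in retriever(sub) if d not in merged]
        return merged
    docs = list(retriever(question))
    rounds = 1
    # Iterative retrieval: feed the best hit back into the next query
    while strategy == "iterative" and docs and not is_good_enough(docs) and rounds < max_rounds:
        extra = retriever(f"{question} {docs[0]}")
        docs += [d for d in extra if d not in docs]
        rounds += 1
    return docs


def retrieve_stub(query: str) -> list[str]:
    """Stand-in retriever: return documents whose keyword appears in the query."""
    index = {
        "bm25": "BM25 scores keyword overlap; often paired with rerank models",
        "hyde": "HyDE embeds a hypothetical answer",
        "rerank": "A reranker re-scores candidates",
    }
    return [doc for key, doc in index.items() if key in query.lower()]


print(modular_rag("What is BM25 and what is HyDE?", retrieve_stub))
print(modular_rag("Why use bm25?", retrieve_stub))
```

Because each module is an ordinary function behind a small interface, any one of them can be swapped out without touching the others, which is the point of the building-block design.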

1.6 End-to-End RAG Enhancement Roadmap


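2. Pre-Retrieval Enhancement

The goal of pre-retrieval enhancement is to improve the user's query so that the retrieval system can interpret it more easily.

2.1 HyDE (Hypothetical Document Embeddings)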

    [Flowchart: user question → LLM generates a hypothetical answer → hypothetical document → vector encoding → vector search → relevant documents → returned results]

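    Core idea: user questions are often short and imprecise, so their embeddings can sit far from the documents that actually answer them. HyDE first asks an LLM to write a hypothetical answer and then retrieves with that answer instead of the raw question.

    How it works

    • Original question: "Why does my phone battery drain so quickly?"
    • Generated hypothetical answer: "The main reasons a phone battery drains quickly include power-hungry background apps, high screen brightness, battery aging…"
    • Retrieval with the hypothetical answer: the query now resembles the target documents in wording and length, which usually improves retrieval precision.

    Terminology:

    • Hypothetical document: a plausible answer generated by the LLM. It contains keywords and semantics related to the question, even if some details are wrong.
    • Vector representation: the embedding of the hypothetical answer tends to lie closer to the embedding of the real answer than the embedding of the question does.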

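    from langchain.prompts import PromptTemplate

    # NOTE: `llm` is assumed to be a LangChain LLM instance created earlier in this guide.

    # HyDE prompt
    hyde_prompt = PromptTemplate(
        input_variables=["question"],
        template="""Write a plausible answer to this question: {question}

    Requirement: the answer should read as if it came from an encyclopedia or professional documentation."""
    )

    def hyde_retrieval(question, vectorstore, top_k=3):
        """Retrieve documents using an LLM-generated hypothetical answer."""
        # Generate the hypothetical document
        hypothetical_doc = llm(hyde_prompt.format(question=question))

        # Run vector search with the hypothetical document instead of the question
        retriever = vectorstore.as_retriever(search_kwargs={"k": top_k})
        return retriever.get_relevant_documents(hypothetical_doc)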
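
    The function above embeds a single hypothetical answer. A common variation, sketched here as an assumption and not something this guide prescribes, generates several hypothetical answers and averages their embeddings (optionally together with the question's embedding), so that one poor generation does not dominate the query vector. `generate` and `embed` are hypothetical callables standing in for an LLM and an embedding model.

```python
from typing import Callable, List

def hyde_query_vector(
    question: str,
    generate: Callable[[str], str],
    embed: Callable[[str], List[float]],
    n_samples: int = 3,
    include_question: bool = True,
) -> List[float]:
    """Average the embeddings of several hypothetical answers."""
    texts = [generate(question) for _ in range(n_samples)]
    if include_question:
        texts.append(question)
    vectors = [embed(t) for t in texts]
    dim = len(vectors[0])
    # Component-wise mean of all vectors
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

# Toy stand-ins so the sketch runs offline (not real models).
def toy_generate(question: str) -> str:
    return "aa"

def toy_embed(text: str) -> List[float]:
    return [float(len(text)), 1.0]

print(hyde_query_vector("abcd", toy_generate, toy_embed))
```

    The resulting vector would be passed to the vector store's search-by-vector method in place of a text query.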

    2.2 Query2Doc

    [Flowchart: user question → LLM generates a document fragment → document fragment (50-100 characters) → vector encoding → vector search → relevant documents → returned results]

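    Core idea: similar to HyDE, but Query2Doc generates a shorter document fragment to improve retrieval.

    Query2Doc vs HyDE

    | Aspect | HyDE | Query2Doc |
    |---|---|---|
    | Output length | Longer (a complete answer) | Shorter (a key fragment) |
    | Generation focus | Answering the question in full | Capturing the key concepts |
    | Typical use | Question answering systems | Document retrieval |

    How it works:

    1. Convert the user question into a richer, document-like representation
    2. The generated fragment contains keywords and concepts related to the question
    3. Use that fragment as the query for vector search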
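    # Query2Doc prompt
    query2doc_prompt = PromptTemplate(
        input_variables=["question"],
        template="""Based on the question below, write a short document fragment (50-100 characters).

    Requirements:
    1. Include the key concepts from the question
    2. Use domain terminology
    3. Keep it concise

    Question: {question}

    Document fragment:"""
    )

    def query2doc_retrieval(question, vectorstore):
        """Retrieve documents using an LLM-generated document fragment."""
        # Generate the document fragment (`llm` is assumed to be defined earlier)
        doc_fragment = llm(query2doc_prompt.format(question=question))

        # Run vector search with the fragment
        retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
        return retriever.get_relevant_documents(doc_fragment)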

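    2.3 Query Rewriting

    Core idea: rewrite a colloquial or vague user question into a precise, professionally worded one before retrieval.

    # Query rewriting prompt
    rewrite_prompt = PromptTemplate(
        template="""Rewrite the following user question into a form better suited to search.
    Requirements: keep the core intent, use more precise terminology, and add likely synonyms.

    Original question: {question}

    Rewritten question:""",
        input_variables=["question"]
    )

    # Example
    # Original:  "my phone keeps dying, what do I do"
    # Rewritten: "short phone battery life: how to fix a battery that drains too quickly"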
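
    The prompt alone does not show how the rewrite is applied. The sketch below wraps it in a small helper with a fallback to the original question when the model returns nothing usable. `rewrite_query`, `REWRITE_TEMPLATE`, and `fake_llm` are illustrative names, and `fake_llm` is a stand-in so the example runs without a model.

```python
from typing import Callable

REWRITE_TEMPLATE = (
    "Rewrite the following user question into a form better suited to search.\n"
    "Keep the core intent, use more precise terminology, and add likely synonyms.\n\n"
    "Original question: {question}\n\n"
    "Rewritten question:"
)

def rewrite_query(question: str, llm: Callable[[str], str]) -> str:
    """Return a search-friendly rewrite, falling back to the original question."""
    rewritten = llm(REWRITE_TEMPLATE.format(question=question)).strip()
    # Keep only the first non-empty line in case the model adds commentary.
    first_line = next((ln.strip() for ln in rewritten.splitlines() if ln.strip()), "")
    return first_line or question

# Stand-in for a real model so the sketch runs offline (hypothetical).
def fake_llm(prompt: str) -> str:
    return "short smartphone battery life: causes and fixes for fast battery drain\n"

print(rewrite_query("my phone keeps dying, what do I do", fake_llm))
```

    Falling back to the original question keeps retrieval working even when the rewrite step fails or returns an empty string.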

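    2.4 Query Decomposition

    Core idea: split a complex, multi-part question into several simple sub-questions, retrieve for each one separately, and merge the results.

    Why decompose a query?

    User question: "What are the advantages of RAG? How is it applied in real projects? What are the challenges?"

    This question contains three sub-questions. A single retrieval pass may not find any one document that covers all of them.

    Solution:

    1. Decompose the question into three sub-questions
    2. Retrieve relevant documents for each sub-question separately
    3. Merge all retrieval results

    Terminology:

    • Compound question: a question that contains several sub-questions or touches several aspects
    • Independent sub-question retrieval: each sub-question is retrieved on its own, which improves recall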

    from langchain.prompts import PromptTemplate

    # Query decomposition prompt
    decompose_prompt = PromptTemplate(
        template="""Break the following complex question into several independent sub-questions.

    Requirements:
    1. Each sub-question must be retrievable on its own
    2. Sub-questions should overlap as little as possible
    3. Produce 3-5 sub-questions
    4. Put each sub-question on its own line

    Original question: {question}

    Sub-questions (one per line):""",
        input_variables=["question"]
    )

    def decompose_query(question, vectorstore):
        """Decompose the question, then retrieve for each sub-question."""
        # 1. Decompose the question (`llm` is assumed to be defined earlier)
        sub_questions_str = llm(decompose_prompt.format(question=question))
        sub_questions = [q.strip() for q in sub_questions_str.split('\n') if q.strip()]

        print(f"Decomposed into {len(sub_questions)} sub-questions:")
        for i, sq in enumerate(sub_questions, 1):
            print(f" {i}. {sq}")

        # 2. Retrieve for each sub-question
        all_docs = []
        retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

        for sq in sub_questions:
            docs = retriever.get_relevant_documents(sq)
            all_docs.extend(docs)

        # 3. De-duplicate by content and return the top N
        unique_docs = list({doc.page_content: doc for doc in all_docs}.values())
        return unique_docs[:5]

    # Usage example
    complex_question = "What are the advantages of RAG? How is it applied in real projects?"
    docs = decompose_query(complex_question, vectorstore)
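
    Splitting the model output on newlines works only when the model returns bare lines. In practice the output often carries list markers such as "1." or "-". The helper below is a minimal sketch of a more tolerant parser. `parse_sub_questions` is an illustrative name, and the set of markers it strips is an assumption about typical LLM output.

```python
import re

# Leading list markers such as "1.", "2)", "-", "*", "•".
_LIST_MARKER = re.compile(r"^\s*(?:\d+[.)、]|[-*•])\s*")

def parse_sub_questions(llm_output: str, max_questions: int = 5) -> list[str]:
    """Split LLM output into clean, de-duplicated sub-questions."""
    seen = set()
    result = []
    for line in llm_output.splitlines():
        cleaned = _LIST_MARKER.sub("", line).strip()
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result[:max_questions]

raw = (
    "1. What are the advantages of RAG?\n"
    "2) How is RAG applied in real projects?\n"
    "\n"
    "- What are the advantages of RAG?\n"
)
print(parse_sub_questions(raw))
```

    Removing duplicates at this stage avoids running the same retrieval twice.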

    2.5 Take Step Back

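    Core idea: have the model "take a step back" and restate the question in terms of higher-level concepts, then retrieve with that broader formulation.

    Take Step Back example

    User question: "How do decorators work in Python? How are they used in asynchronous programming?"

    After stepping back: "The decorator pattern and asynchronous programming concepts in Python"

    Benefit: the broader formulation captures the core concepts behind the question, so retrieval covers more of the relevant background.

    Terminology:

    • Abstraction level: moving from a specific question up to a higher-level concept
    • Concept association: higher-level concepts link to a wider set of relevant documents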

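    # Take Step Back prompt
    step_back_prompt = PromptTemplate(
        template="""You are an AI assistant skilled at conceptual abstraction.

    Abstract the following specific question into a higher-level conceptual question.

    Requirements:
    1. Extract the core concepts in the question
    2. Restate the question at a higher level
    3. Preserve the semantic scope of the question

    Original question: {question}

    Abstracted question:""",
        input_variables=["question"]
    )

    def take_step_back_retrieval(question, vectorstore):
        """Step-back retrieval: search with both the abstract and the original question."""
        # 1. Generate the abstract question (`llm` is assumed to be defined earlier)
        abstract_question = llm(step_back_prompt.format(question=question))
        print(f"Original question: {question}")
        print(f"Abstract question: {abstract_question}")

        # 2. Retrieve with the abstract question
        retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
        abstract_docs = retriever.get_relevant_documents(abstract_question)

        # 3. Retrieve with the original question
        original_docs = retriever.get_relevant_documents(question)

        # 4. Merge and de-duplicate by content
        all_docs = abstract_docs + original_docs
        unique_docs = list({doc.page_content: doc for doc in all_docs}.values())

        return unique_docs[:5]

    # Usage example
    question = "How do decorators work in Python? How are they used in asynchronous programming?"
    docs = take_step_back_retrieval(question, vectorstore)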

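    Take Step Back vs Query Decomposition:

    | Aspect | Take Step Back | Query Decomposition |
    |---|---|---|
    | Approach | Abstract to a higher level | Split into several sub-questions |
    | Retrieval strategy | Abstract question plus original question | Retrieve for each sub-question separately |
    | Typical use | Questions that need a global view | Compound questions with several aspects |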
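
    Both techniques end by merging several result lists and truncating. Simple concatenation places one list entirely ahead of the other, so truncation can cut off the tail of the second list. The sketch below shows one alternative under that assumption: a round-robin merge that takes results from each list in turn. `interleave_unique` is an illustrative helper that operates on plain strings (for example, each document's text).

```python
from itertools import zip_longest

def interleave_unique(*result_lists, limit=5):
    """Merge ranked lists round-robin, dropping duplicates, then truncate."""
    seen = set()
    merged = []
    # zip_longest pads shorter lists with None, which is skipped below.
    for group in zip_longest(*result_lists):
        for item in group:
            if item is not None and item not in seen:
                seen.add(item)
                merged.append(item)
    return merged[:limit]

abstract_hits = ["a", "b", "c"]
original_hits = ["b", "d", "e"]
print(interleave_unique(abstract_hits, original_hits, limit=5))
```

    With this ordering, the top results of every list survive truncation regardless of the order in which the lists are passed.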

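    3. Retrieval Enhancement

    The goal of retrieval enhancement is to raise both recall and precision by combining several indexes and retrieval strategies.

    3.1 Core Concepts

    The central idea of multi-index retrieval is that indexes built along different dimensions capture different features of the information.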

    [Flowchart: original documents → four indexes (sentence index: precise matching; summary index: global view; question index: question-answer matching; parent document index: complete context) → multi-path recall → reranking → final results]

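    3.2 Hybrid Search

    Core idea: combine the strengths of sparse retrieval and dense retrieval.

    Terminology:

    | Retrieval method | Core principle | Strength | Weakness |
    |---|---|---|---|
    | Sparse retrieval (BM25) | Term-frequency statistics; exact keyword matching | Precise keyword matching | No semantic understanding |
    | Dense retrieval (vectors) | Semantic similarity; captures meaning | Matches semantically similar text | Can miss exact keywords |
    | Hybrid retrieval | Fuses both result lists | Combines both strengths | Higher compute cost |

    BM25 Algorithm

    BM25 is one of the most widely used sparse retrieval algorithms. Its core idea is to saturate term frequency, so that each additional occurrence of a term adds less to the score: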

    $$\text{score}(D, Q) = \sum_{i=1}^{n} \text{IDF}(q_i) \times \frac{f(q_i, D) \times (k_1 + 1)}{f(q_i, D) + k_1 \times \left(1 - b + b \times \frac{|D|}{\text{avgdl}}\right)}$$

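    How to read the formula:

    • $q_1, \dots, q_n$: the terms of query $Q$
    • $\text{IDF}(q_i)$: the inverse document frequency of $q_i$; rarer terms receive a higher weight
    • $f(q_i, D)$: the number of times term $q_i$ occurs in document $D$
    • $|D|$: the length of document $D$ in words
    • $\text{avgdl}$: the average document length across the collection
    • $k_1$: saturation parameter (typically 1.2 to 2.0); it controls how quickly additional occurrences of a term stop adding to the score
    • $b$: length-normalization parameter (typically 0.75); it controls how strongly document length affects the score

    BM25 vs TF-IDF: BM25 adds term-frequency saturation and document-length normalization, so long documents do not win simply because they repeat a term more often.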
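
    To make the formula concrete, here is a minimal pure-Python sketch that scores tokenized documents against a query. The text does not define IDF, so the IDF variant used here (a common non-negative form) is an assumption, as are the function name `bm25_scores` and the toy corpus. The sketch assumes a non-empty corpus.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`."""
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for q in query_terms:
            f = tf[q]
            if f == 0:
                continue
            # Assumed IDF variant (always non-negative).
            idf = math.log(1 + (n_docs - df[q] + 0.5) / (df[q] + 0.5))
            # Length normalization: 1 - b + b * |D| / avgdl
            norm = 1 - b + b * len(d) / avgdl
            score += idf * f * (k1 + 1) / (f + k1 * norm)
        scores.append(score)
    return scores

docs = [
    "rag combines retrieval with generation".split(),
    "bm25 is a sparse retrieval algorithm".split(),
    "dense vectors capture semantic similarity".split(),
]
scores = bm25_scores(["bm25", "retrieval"], docs)
print(scores)
```

    The second document matches both query terms and scores highest. The third matches neither and scores zero.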

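    Vector Retrieval

    Embedding model: converts text into a high-dimensional vector (commonly 768 to 3072 dimensions). Semantically similar texts lie closer together in the vector space.

    Similarity measure: cosine similarity is the usual choice.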

    $$\text{similarity} = \cos(\theta) = \frac{A \cdot B}{\|A\| \times \|B\|}$$
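
    The cosine formula translates directly into code. The sketch below uses plain Python lists. Returning 0.0 for a zero vector is a convention chosen here, since the formula is undefined in that case.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # parallel vectors
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))            # orthogonal vectors
```

    Parallel vectors score close to 1 regardless of their length, which is why cosine similarity compares direction and ignores magnitude.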

    RRF Fusion Algorithm

    RRF (Reciprocal Rank Fusion) merges the ranked results of several retrieval systems. The core formula is:

    $$\text{final\_score}(d) = \sum_{i=1}^{n} \frac{1}{k + \text{rank}_i(d)}$$

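    where:

    • $k$ is a constant (commonly 60) that controls how quickly a document's contribution decays with rank
    • $\text{rank}_i(d)$ is the rank of document $d$ in the $i$-th retrieval system. This guide counts ranks from 0, so the top document has rank 0. Some implementations count from 1 instead, which is equivalent to using $k + 1$ here.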

    Worked example (contribution from a single retriever):

    • Document ranked 1st: $\frac{1}{60+0} \approx 0.0167$
    • Document ranked 2nd: $\frac{1}{60+1} \approx 0.0164$
    • Document ranked 10th: $\frac{1}{60+9} \approx 0.0145$

    from langchain.retrievers import EnsembleRetriever
    from langchain_community.retrievers import BM25Retriever

    # `documents` and `vectorstore` are assumed to be defined earlier in this guide.

    # Sparse retriever (BM25); requires the `rank_bm25` package
    bm25_retriever = BM25Retriever.from_documents(documents, k=5)

    # Dense retriever (vectors)
    vector_retriever = vectorstore.as_retriever(search_kwargs={"k": 5})

    # Ensemble retriever: fuses both result lists with weighted RRF
    ensemble_retriever = EnsembleRetriever(
        retrievers=[bm25_retriever, vector_retriever],
        weights=[0.5, 0.5]  # adjust to favor one retriever
    )

    results = ensemble_retriever.get_relevant_documents("What is RAG?")
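
    To show what the fusion step does internally, here is a minimal standalone sketch of unweighted RRF over lists of document IDs. It uses 0-based ranks as in the text above. `rrf_fuse` and the sample rankings are illustrative and are not taken from any library.

```python
from collections import defaultdict

def rrf_fuse(ranked_lists, k=60):
    """Fuse ranked lists of document IDs; ranks are 0-based as in the text."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking):  # rank 0 = top result
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

bm25_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_d", "doc_a"]
fused = rrf_fuse([bm25_ranking, vector_ranking])
for doc_id, score in fused:
    print(doc_id, round(score, 4))
```

    Documents that appear in both lists accumulate two contributions, so they rise above documents found by only one retriever.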

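    3.3 Sentence Window Retriever

    Core concept: sentence window retrieval balances retrieval precision against context completeness.

    How it works:

    1. Small-window search: split the document into single sentences and run vector search over the individual sentences
    2. Large-window return: once a relevant sentence is found, return a window made of that sentence and several sentences before and after it

    Advantages:

    • Searching over small units gives higher precision, because each embedding represents one focused idea
    • Returning a larger window gives the LLM complete context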

    # Import paths for llama-index >= 0.10; older releases import from `llama_index` directly.
    from llama_index.core.node_parser import SentenceWindowNodeParser
    from llama_index.core import VectorStoreIndex, Document
    from llama_index.core.postprocessor import MetadataReplacementPostProcessor

    # Create the sentence window parser
    node_parser = SentenceWindowNodeParser.from_defaults(
        window_size=3,  # window size: 3 sentences on each side
        window_metadata_key="window",
        original_text_metadata_key="original_text"
    )

    # Parse documents into nodes (`documents` is assumed to be a list of Document objects)
    nodes = node_parser.get_nodes_from_documents(documents)

    # Build the index
    index = VectorStoreIndex(nodes)

    # Create a query engine that swaps each hit for its surrounding window
    query_engine = index.as_query_engine(
        similarity_top_k=2,
        node_postprocessors=[
            MetadataReplacementPostProcessor(target_metadata_key="window")
        ]
    )

    response = query_engine.query("What are the ways to optimize RAG?")
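
    Independent of any framework, the window-expansion step can be sketched in a few lines. The splitter below is deliberately naive (it would also split decimals such as "3.5"), and `split_sentences` and `window_for` are illustrative names. The index of the matching sentence is assumed to come from a vector search.

```python
import re

def split_sentences(text):
    """Naive splitter on ., !, ? and the Chinese full stop (an assumption)."""
    parts = re.split(r"(?<=[.!?。])\s*", text)
    return [p.strip() for p in parts if p.strip()]

def window_for(sentences, hit_index, window_size=1):
    """Return the hit sentence plus `window_size` neighbours on each side."""
    start = max(0, hit_index - window_size)
    end = min(len(sentences), hit_index + window_size + 1)
    return " ".join(sentences[start:end])

text = (
    "RAG retrieves documents. Rerankers reorder them. "
    "Compression trims context. The LLM answers."
)
sentences = split_sentences(text)
print(window_for(sentences, hit_index=1, window_size=1))
```

    Clamping `start` and `end` keeps the window valid when the hit is the first or last sentence of the document.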

    3.4 Parent Document Retriever

    Core concept: search over small chunks, return large chunks.

    Design

    [Flowchart: large document (1000 characters) → split → child document 1 (200 characters), child document 2 (200 characters), child document 3 (200 characters) → vector search → match on child document 2 → return the full 1000-character parent document]

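    How it works:

    1. Split each large document into two levels: parent chunks and child chunks
    2. Run vector search over the child chunks (higher precision)
    3. Return the parent chunk that contains the matching child (complete context)

    Terminology:

    • Parent document: a larger chunk (for example 1000 characters) that holds the complete context
    • Child document: a smaller chunk (for example 200 characters) used for precise retrieval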

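    from langchain.retrievers import ParentDocumentRetriever
    from langchain.storage import InMemoryStore
    from langchain.text_splitter import RecursiveCharacterTextSplitter

    # Child splitter (chunks used for search)
    child_splitter = RecursiveCharacterTextSplitter(
        chunk_size=200,    # 200 characters per chunk
        chunk_overlap=50,  # 50 characters of overlap
        separators=["\n\n", "\n", "。", "!", "?", ",", " ", ""]
    )

    # Parent splitter (chunks that are returned)
    parent_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000,  # 1000 characters per chunk
        chunk_overlap=100,
        separators=["\n\n", "\n", "。", "!", "?"]
    )

    # Storage layer that holds the parent chunks
    store = InMemoryStore()

    # Create the parent document retriever
    # (`vectorstore` and `documents` are assumed to be defined earlier)
    retriever = ParentDocumentRetriever(
        vectorstore=vectorstore,
        docstore=store,
        child_splitter=child_splitter,
        parent_splitter=parent_splitter
    )

    # Index the documents: children go to the vector store, parents to the docstore
    retriever.add_documents(documents)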
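
    The child-to-parent mapping behind this retriever can be sketched without any framework. In the toy version below, word overlap stands in for vector similarity purely to keep the example self-contained. `build_child_index` and `retrieve_parent` are illustrative names, and the fixed-size character split is a simplification of a real text splitter.

```python
def build_child_index(parents, child_size=40):
    """Split each parent into fixed-size children, remembering the parent id."""
    children = []
    for parent_id, text in enumerate(parents):
        for start in range(0, len(text), child_size):
            children.append((parent_id, text[start:start + child_size]))
    return children

def retrieve_parent(query, parents, children):
    """Match the query against children, then return the owning parent."""
    query_terms = set(query.lower().split())

    def overlap(child):
        # Stand-in for vector similarity: count of shared words.
        return len(query_terms & set(child[1].lower().split()))

    best_parent_id, _ = max(children, key=overlap)
    return parents[best_parent_id]

parents = [
    "Hybrid search combines BM25 with dense vectors. "
    "Reciprocal rank fusion merges both rankings.",
    "Parent document retrieval indexes small chunks. "
    "It returns the larger parent block for context.",
]
children = build_child_index(parents)
print(retrieve_parent("reciprocal rank fusion", parents, children))
```

    The search only ever sees the small chunks, while the caller always receives the full parent text.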

    3.5 Summary Retriever

    #mermaid-svg-dsknaoLZTw9N2qX6{font-family:\”trebuchet ms\”,verdana,arial,sans-serif;font-size:16px;fill:#333;}@keyframes edge-animation-frame{from{stroke-dashoffset:0;}}@keyframes dash{to{stroke-dashoffset:0;}}#mermaid-svg-dsknaoLZTw9N2qX6 .edge-animation-slow{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 50s linear infinite;stroke-linecap:round;}#mermaid-svg-dsknaoLZTw9N2qX6 .edge-animation-fast{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 20s linear infinite;stroke-linecap:round;}#mermaid-svg-dsknaoLZTw9N2qX6 .error-icon{fill:#552222;}#mermaid-svg-dsknaoLZTw9N2qX6 .error-text{fill:#552222;stroke:#552222;}#mermaid-svg-dsknaoLZTw9N2qX6 .edge-thickness-normal{stroke-width:1px;}#mermaid-svg-dsknaoLZTw9N2qX6 .edge-thickness-thick{stroke-width:3.5px;}#mermaid-svg-dsknaoLZTw9N2qX6 .edge-pattern-solid{stroke-dasharray:0;}#mermaid-svg-dsknaoLZTw9N2qX6 .edge-thickness-invisible{stroke-width:0;fill:none;}#mermaid-svg-dsknaoLZTw9N2qX6 .edge-pattern-dashed{stroke-dasharray:3;}#mermaid-svg-dsknaoLZTw9N2qX6 .edge-pattern-dotted{stroke-dasharray:2;}#mermaid-svg-dsknaoLZTw9N2qX6 .marker{fill:#333333;stroke:#333333;}#mermaid-svg-dsknaoLZTw9N2qX6 .marker.cross{stroke:#333333;}#mermaid-svg-dsknaoLZTw9N2qX6 svg{font-family:\”trebuchet ms\”,verdana,arial,sans-serif;font-size:16px;}#mermaid-svg-dsknaoLZTw9N2qX6 p{margin:0;}#mermaid-svg-dsknaoLZTw9N2qX6 .label{font-family:\”trebuchet ms\”,verdana,arial,sans-serif;color:#333;}#mermaid-svg-dsknaoLZTw9N2qX6 .cluster-label text{fill:#333;}#mermaid-svg-dsknaoLZTw9N2qX6 .cluster-label span{color:#333;}#mermaid-svg-dsknaoLZTw9N2qX6 .cluster-label span p{background-color:transparent;}#mermaid-svg-dsknaoLZTw9N2qX6 .label text,#mermaid-svg-dsknaoLZTw9N2qX6 span{fill:#333;color:#333;}#mermaid-svg-dsknaoLZTw9N2qX6 .node rect,#mermaid-svg-dsknaoLZTw9N2qX6 .node circle,#mermaid-svg-dsknaoLZTw9N2qX6 .node ellipse,#mermaid-svg-dsknaoLZTw9N2qX6 .node polygon,#mermaid-svg-dsknaoLZTw9N2qX6 .node 
path{fill:#ECECFF;stroke:#9370DB;stroke-width:1px;}#mermaid-svg-dsknaoLZTw9N2qX6 .rough-node .label text,#mermaid-svg-dsknaoLZTw9N2qX6 .node .label text,#mermaid-svg-dsknaoLZTw9N2qX6 .image-shape .label,#mermaid-svg-dsknaoLZTw9N2qX6 .icon-shape .label{text-anchor:middle;}#mermaid-svg-dsknaoLZTw9N2qX6 .node .katex path{fill:#000;stroke:#000;stroke-width:1px;}#mermaid-svg-dsknaoLZTw9N2qX6 .rough-node .label,#mermaid-svg-dsknaoLZTw9N2qX6 .node .label,#mermaid-svg-dsknaoLZTw9N2qX6 .image-shape .label,#mermaid-svg-dsknaoLZTw9N2qX6 .icon-shape .label{text-align:center;}#mermaid-svg-dsknaoLZTw9N2qX6 .node.clickable{cursor:pointer;}#mermaid-svg-dsknaoLZTw9N2qX6 .root .anchor path{fill:#333333!important;stroke-width:0;stroke:#333333;}#mermaid-svg-dsknaoLZTw9N2qX6 .arrowheadPath{fill:#333333;}#mermaid-svg-dsknaoLZTw9N2qX6 .edgePath .path{stroke:#333333;stroke-width:2.0px;}#mermaid-svg-dsknaoLZTw9N2qX6 .flowchart-link{stroke:#333333;fill:none;}#mermaid-svg-dsknaoLZTw9N2qX6 .edgeLabel{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-dsknaoLZTw9N2qX6 .edgeLabel p{background-color:rgba(232,232,232, 0.8);}#mermaid-svg-dsknaoLZTw9N2qX6 .edgeLabel rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-dsknaoLZTw9N2qX6 .labelBkg{background-color:rgba(232, 232, 232, 0.5);}#mermaid-svg-dsknaoLZTw9N2qX6 .cluster rect{fill:#ffffde;stroke:#aaaa33;stroke-width:1px;}#mermaid-svg-dsknaoLZTw9N2qX6 .cluster text{fill:#333;}#mermaid-svg-dsknaoLZTw9N2qX6 .cluster span{color:#333;}#mermaid-svg-dsknaoLZTw9N2qX6 div.mermaidTooltip{position:absolute;text-align:center;max-width:200px;padding:2px;font-family:\”trebuchet ms\”,verdana,arial,sans-serif;font-size:12px;background:hsl(80, 100%, 96.2745098039%);border:1px solid #aaaa33;border-radius:2px;pointer-events:none;z-index:100;}#mermaid-svg-dsknaoLZTw9N2qX6 .flowchartTitleText{text-anchor:middle;font-size:18px;fill:#333;}#mermaid-svg-dsknaoLZTw9N2qX6 
rect.text{fill:none;stroke-width:0;}#mermaid-svg-dsknaoLZTw9N2qX6 .icon-shape,#mermaid-svg-dsknaoLZTw9N2qX6 .image-shape{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-dsknaoLZTw9N2qX6 .icon-shape p,#mermaid-svg-dsknaoLZTw9N2qX6 .image-shape p{background-color:rgba(232,232,232, 0.8);padding:2px;}#mermaid-svg-dsknaoLZTw9N2qX6 .icon-shape rect,#mermaid-svg-dsknaoLZTw9N2qX6 .image-shape rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-dsknaoLZTw9N2qX6 .label-icon{display:inline-block;height:1em;overflow:visible;vertical-align:-0.125em;}#mermaid-svg-dsknaoLZTw9N2qX6 .node .label-icon path{fill:currentColor;stroke:revert;stroke-width:revert;}#mermaid-svg-dsknaoLZTw9N2qX6 :root{–mermaid-font-family:\”trebuchet ms\”,verdana,arial,sans-serif;}

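    [Diagram: original document (5,000 characters) → generate summary (200 characters) → summary index (vector store); user question → vector retrieval → match summary → return full document]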

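    Core idea: retrieve against document summaries, but return the full documents.

    Why summary retrieval?

    Scenario: a user asks about "applications of RAG in the enterprise".

    Problem:

    • The source document is long (5,000 characters), so its embedding tends to reflect scattered details rather than the overall topic
    • A summary condenses the document's core content into about 200 characters

    Solution:

    • Search over summary vectors (to match on topic)
    • Return the full document (to supply the details)

    How it works:

  • Generate a summary for each document
  • Build a vector index over the summaries
  • At query time, match against the summaries and return the corresponding full documents

    Terminology:

    • Summary index: a vector index that stores document summaries and serves topic-level retrieval
    • Global view: a summary describes the document as a whole, so matching is not dominated by local details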

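    # Imports follow the legacy `langchain` package layout used throughout this guide;
    # newer releases expose the same classes from langchain_community / langchain_openai.
    from langchain.prompts import PromptTemplate
    from langchain.llms import OpenAI
    from langchain.schema import Document
    from langchain.vectorstores import Chroma

    # `documents` (a list of Document) and `embeddings` are assumed to be
    # defined as in the earlier examples.

    # 1. Prompt for summary generation
    summary_prompt = PromptTemplate(
        template="""Write a concise summary (no more than 100 words) of the following document:

    Document:
    {context}

    Summary:""",
        input_variables=["context"]
    )

    # 2. Generate a summary for each document
    def generate_summary(documents):
        llm = OpenAI(temperature=0)
        summaries = []

        for i, doc in enumerate(documents):
            print(f"Processing document {i + 1}/{len(documents)}...")

            # Ask the LLM for a summary of this document
            summary = llm.invoke(summary_prompt.format(context=doc.page_content))

            summaries.append({
                "summary": summary.strip(),
                "original_doc": doc
            })

        return summaries

    # 3. Summarize the knowledge base
    summaries = generate_summary(documents)

    # 4. Build the index from the summaries; doc_id is the position in `summaries`
    summary_docs = [
        Document(
            page_content=s["summary"],
            metadata={"doc_id": i}
        )
        for i, s in enumerate(summaries)
    ]

    summary_index = Chroma.from_documents(
        summary_docs,
        embeddings,
        collection_name="summary_index"
    )

    # 5. Retrieval function
    def retrieve_by_summary(query, k=1):
        """Search the summaries and return the matching original documents."""
        # Find the k most similar summaries
        similar_summaries = summary_index.similarity_search(query, k=k)

        # Map each hit back to its original document via doc_id
        return [
            summaries[hit.metadata["doc_id"]]["original_doc"]
            for hit in similar_summaries
        ]

    # Usage example
    results = retrieve_by_summary("Application scenarios of RAG in the enterprise")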

    3.6 Hypothetical Questions Retriever


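    [Diagram: original document → LLM generates hypothetical questions (Q1: When was Python created? Q2: Who created Python? Q3: What are Python's features?) → question index (vector store); user question "the history of Python" → vector retrieval → match question → return original document]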

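    Core idea: generate in advance the questions each document can answer, and retrieve against those questions. Because user queries are themselves phrased as questions, they tend to match these generated questions more closely than they match the document text.

    Use case

    Document: "Python is a high-level programming language created by Guido van Rossum in 1991…"

    Generated hypothetical questions:

    • "When was Python created?"
    • "Who created Python?"
    • "What are Python's features?"

    At query time: the user asks about "the history of Python" → this matches "When was Python created?" → the original document is returned

    How it works:

  • Generate 3-5 candidate questions for each document ahead of time
  • Build a vector index over those questions
  • At query time, match the user query against the questions and return the corresponding document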
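    from langchain.prompts import PromptTemplate
    from langchain.llms import OpenAI
    from langchain.schema import Document
    from langchain.vectorstores import Chroma

    # `documents` and `embeddings` are assumed to be defined as in the earlier examples.

    # Prompt for question generation
    question_gen_prompt = PromptTemplate(
        template="""Read the following document and generate 3 questions it can answer.

    Requirements:
    1. The questions should cover the core content of the document
    2. The questions should be phrased in varied ways
    3. Put each question on its own line

    Document:
    {context}

    Questions:""",
        input_variables=["context"]
    )

    def generate_questions(documents):
        """Generate hypothetical questions for each document."""
        llm = OpenAI(temperature=0.7)
        all_questions = []

        for i, doc in enumerate(documents, 1):
            # Generate the questions, one per line
            response = llm.invoke(question_gen_prompt.format(context=doc.page_content))
            questions = [q.strip() for q in response.split('\n') if q.strip()]

            # Record which document each question came from (source_id is 1-based)
            for q in questions:
                all_questions.append({
                    "question": q,
                    "source_doc": doc,
                    "source_id": i
                })

        return all_questions

    # Usage example
    questions = generate_questions(documents)

    # Build the index from the questions
    question_docs = [
        Document(page_content=q["question"], metadata={"source_id": q["source_id"]})
        for q in questions
    ]

    question_index = Chroma.from_documents(question_docs, embeddings)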
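
The listing above stops once the index is built. The last step of the workflow, mapping a matched question back to its source document, might look like the sketch below. To stay runnable without an API key it swaps the embedding search for a toy word-overlap score and uses a plain list in place of Chroma. The names `overlap_score` and `retrieve_by_question`, and the sample data, are hypothetical and not part of the original example.

```python
import re
from typing import Dict, List, Tuple


def overlap_score(a: str, b: str) -> float:
    """Toy similarity: Jaccard overlap of lowercase word sets.

    This is only a stand-in for embedding cosine similarity (an assumption
    made so the sketch runs without any external service).
    """
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def retrieve_by_question(
    query: str,
    question_index: List[Tuple[str, int]],  # (generated question, source_id)
    docs_by_id: Dict[int, str],
    k: int = 2,
) -> List[str]:
    """Match the query against generated questions and return source documents."""
    ranked = sorted(
        question_index,
        key=lambda item: overlap_score(query, item[0]),
        reverse=True,
    )
    seen, results = set(), []
    for _question, source_id in ranked:
        # Several questions can point at the same document: return it once.
        if source_id in seen:
            continue
        seen.add(source_id)
        results.append(docs_by_id[source_id])
        if len(results) == k:
            break
    return results


docs_by_id = {
    1: "Python is a high-level language created by Guido van Rossum in 1991.",
    2: "Rust is a systems language focused on memory safety.",
}
question_index = [
    ("When was Python created?", 1),
    ("Who created Python?", 1),
    ("What is Rust designed for?", 2),
]

hits = retrieve_by_question("who created python", question_index, docs_by_id, k=1)
print(hits[0])
```

Deduplicating by `source_id` matters here because each document contributes several questions, so the top hits often point at the same source.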

    3.7 Multi-Index Fusion Strategy


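    [Diagram: user question → sentence index retrieval / summary index retrieval / question index retrieval / parent-document index retrieval → multi-path recall (merge results) → deduplication (remove duplicate documents) → re-rank → top-N results]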

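    Core idea: query several index types at once and fuse all of their results.

    Terminology:

    • Multi-path recall: retrieving documents from several index channels at the same time
    • Result fusion: merging, deduplicating, and ranking the results of multiple retrievers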

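    # The retrievers are assumed to be built as in the previous sections, and
    # `reranker(query, docs)` is assumed to return `docs` sorted by relevance.
    # A parent-document retriever can be added as a fourth channel in the same way.
    def multi_index_retrieval(query, top_n=5):
        """Multi-index fusion retrieval."""
        all_docs = []

        # 1. Sentence-window retrieval
        sentence_results = sentence_retriever.get_relevant_documents(query)
        all_docs.extend(sentence_results)

        # 2. Summary retrieval
        summary_results = summary_retriever.get_relevant_documents(query)
        all_docs.extend(summary_results)

        # 3. Hypothetical-question retrieval
        question_results = question_retriever.get_relevant_documents(query)
        all_docs.extend(question_results)

        # 4. Deduplicate by content
        unique_docs = list({doc.page_content: doc for doc in all_docs}.values())

        # 5. Re-rank the merged candidates
        reranked_docs = reranker(query, unique_docs)

        return reranked_docs[:top_n]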
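
The function above concatenates the results and leaves all ordering to the re-ranker. When no re-ranker is available, or to pre-sort candidates before re-ranking, the rank positions from each index can be fused with Reciprocal Rank Fusion (RRF, introduced in 3.2). A minimal sketch, assuming each retriever returns unique document IDs in ranked order. The name `rrf_fuse` and the constant `k=60` (a commonly used default) are illustrative choices, not taken from the original.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_fuse(ranked_lists: List[List[str]], k: int = 60, top_n: int = 5) -> List[str]:
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1 for the best hit of each list.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    fused = sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)
    return fused[:top_n]


# Hypothetical results from three index channels
sentence_hits = ["d1", "d2", "d3"]
summary_hits = ["d2", "d4"]
question_hits = ["d2", "d1"]

fused = rrf_fuse([sentence_hits, summary_hits, question_hits])
print(fused)
```

RRF uses only rank positions, so it sidesteps the problem that similarity scores from different indexes are not on a comparable scale.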


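    4. Post-Retrieval Enhancement

    Post-retrieval enhancement refines the retrieved results to improve the quality of the final answer.

    4.1 How Re-ranking Works

    Core problem: vector retrieval ranks by embedding similarity (typically cosine similarity), and a high similarity score does not guarantee that a document actually answers the question.

    Example

    The user asks: "How do I lose weight?"

    • Document A (vector similarity 0.85): "Losing weight requires controlling your diet and exercising" – directly relevant
    • Document B (vector similarity 0.87): "This document discusses various health issues, including how to gain weight, how to lose weight…" – relevant but unfocused

    After re-ranking: Document A comes first because it answers the question directly.

    Terminology:

    • Bi-encoder (two-tower model): the model used for vector retrieval; the query and the document are encoded independently
    • Cross-encoder: the model used for re-ranking; the query and the document are encoded together, which is more accurate but slower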


    [Diagram: vector retrieval (top-10 documents) → coarse-ranked results → re-rank model (fine-grained scoring) → top-3 high-quality documents]
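
The two-stage flow above, a cheap first pass over many candidates followed by an expensive second pass over a few, can be sketched independently of any particular model. In the sketch below, `coarse_score` and `fine_score` are hypothetical callables standing in for a bi-encoder similarity and a cross-encoder score. The coarse scores reuse the weight-loss example from 4.1; the fine scores are made up purely for illustration.

```python
from typing import Callable, List


def two_stage_retrieve(
    query: str,
    corpus: List[str],
    coarse_score: Callable[[str, str], float],
    fine_score: Callable[[str, str], float],
    coarse_k: int = 10,
    final_k: int = 3,
) -> List[str]:
    """Rank the whole corpus cheaply, then re-rank only the top coarse_k."""
    candidates = sorted(
        corpus, key=lambda doc: coarse_score(query, doc), reverse=True
    )[:coarse_k]
    reranked = sorted(
        candidates, key=lambda doc: fine_score(query, doc), reverse=True
    )
    return reranked[:final_k]


doc_a = "Losing weight requires controlling your diet and exercising."
doc_b = "This document discusses various health issues, including weight gain and weight loss."
doc_c = "A short history of the bicycle."

# Made-up scores for illustration only.
vector_sim = {doc_a: 0.85, doc_b: 0.87, doc_c: 0.10}
cross_encoder = {doc_a: 0.95, doc_b: 0.60, doc_c: 0.01}

top = two_stage_retrieve(
    "How do I lose weight?",
    [doc_a, doc_b, doc_c],
    coarse_score=lambda q, d: vector_sim[d],
    fine_score=lambda q, d: cross_encoder[d],
    coarse_k=2,
    final_k=1,
)
print(top)
```

The expensive scorer runs on only `coarse_k` documents, which is why a cross-encoder is affordable at this stage even though it would be too slow to apply to the full corpus.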

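    4.2 Comparison of Popular Re-rankers

    Model               | Type           | Characteristics                    | Speed       | Accuracy
    Cohere Rerank-v3.5  | Commercial API | Multilingual, best overall quality | Medium      | Highest
    BGE Reranker-v2     | Open source    | Chinese-friendly, free             | Fairly fast |
    FlashRank           | Open source    | Very fast, lightweight             | Very fast   | Medium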

    4.3 Implementing Re-ranking

    # Option 1: Cohere Rerank
    import cohere

    def cohere_rerank(query, documents, top_n=3):
        co = cohere.Client('your-api-key')
        results = co.rerank(
            model="rerank-v3.5",
            query=query,
            documents=[doc.page_content for doc in documents],
            top_n=top_n
        )
        # Each result carries the index of the document in the input list
        return [documents[r.index] for r in results.results]

    # Option 2: BGE Reranker
    from sentence_transformers import CrossEncoder

    # Load the model once; reloading it on every call would dominate latency
    bge_model = CrossEncoder('BAAI/bge-reranker-v2-m3')

    def bge_rerank(query, documents, top_k=3):
        # Build (query, document) pairs
        pairs = [[query, doc.page_content] for doc in documents]

        # Score each pair with the cross-encoder
        scores = bge_model.predict(pairs)

        # Sort by score, highest first
        scored_docs = sorted(zip(documents, scores), key=lambda x: x[1], reverse=True)
        return [doc for doc, score in scored_docs[:top_k]]

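    4.4 Context Compression

    Core idea: compress the retrieved documents by removing irrelevant content and keeping the key information.

    Terminology:

    • Context compression: extracting the passages of a document that are most relevant to the question
    • Information density: the amount of useful information per unit of length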

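    from langchain.chat_models import ChatOpenAI
    from langchain.retrievers import ContextualCompressionRetriever
    from langchain.retrievers.document_compressors import LLMChainExtractor

    # `vectorstore` is assumed to be an existing vector store, such as a Chroma index built earlier.

    # Create the compressor: an LLM extracts the passages relevant to the query
    compressor = LLMChainExtractor.from_llm(ChatOpenAI(model="gpt-4"))

    # Wrap the base retriever so that its results are compressed
    compression_retriever = ContextualCompressionRetriever(
        base_compressor=compressor,
        base_retriever=vectorstore.as_retriever()
    )

    # Retrieve and compress
    compressed_docs = compression_retriever.get_relevant_documents("What is RAG?")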
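
`LLMChainExtractor` relies on LLM calls to do the extraction. To illustrate the idea of raising information density without that cost, here is a dependency-free sketch that keeps only the sentences sharing words with the query. It is a crude keyword heuristic for English text, not what LangChain does internally; `compress_context` and its parameters are hypothetical.

```python
import re
from typing import List, Tuple


def compress_context(query: str, document: str, max_sentences: int = 2) -> str:
    """Keep the sentences that share the most words with the query.

    Sentences with no overlap are dropped; the kept sentences are returned
    in their original document order.
    """
    query_terms = set(re.findall(r"\w+", query.lower()))
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]

    scored: List[Tuple[int, int, str]] = []
    for position, sentence in enumerate(sentences):
        terms = set(re.findall(r"\w+", sentence.lower()))
        overlap = len(query_terms & terms)
        if overlap > 0:
            scored.append((overlap, position, sentence))

    # Pick the highest-overlap sentences, then restore document order.
    best = sorted(scored, key=lambda item: (-item[0], item[1]))[:max_sentences]
    return " ".join(sentence for _, _, sentence in sorted(best, key=lambda item: item[1]))


doc = (
    "RAG combines retrieval with generation. "
    "The weather was pleasant that day. "
    "Retrieval supplies external facts to the model."
)
compressed = compress_context("What is RAG retrieval?", doc)
print(compressed)
```

A real system would at least filter stop words or score sentences with embeddings, but the shape is the same: score the fragments against the query, then keep only the best ones.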


    5. Complete Production-Grade Examples

    5.1 Production-Grade Implementation with LangChain

    from typing import List, Dict
    from langchain.schema import Document
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.vectorstores import Chroma
    from langchain.chat_models import ChatOpenAI
    from langchain.chains import RetrievalQA
    from langchain.retrievers import ContextualCompressionRetriever
    from langchain_community.document_compressors.cohere_rerank import CohereRerank
    from langchain.retrievers import EnsembleRetriever
    from langchain_community.retrievers import BM25Retriever

    class ProductionRAG:
        """Production-grade RAG system"""

        def __init__(
            self,
            knowledge_base: List[str],
            model: str = "gpt-4",
            top_k: int = 20,
            rerank_top_n: int = 5
        ):
            # Initialize the embedding model
            self.embeddings = OpenAIEmbeddings()

            # Create the vector store
            self.vectorstore = Chroma(
                collection_name="production_kb",
                embedding_function=self.embeddings
            )

            # Build the documents up front: BM25 needs them even when the
            # vector store is already populated
            docs = [Document(page_content=text) for text in knowledge_base]

            # Add the documents only if the collection is empty
            if self.vectorstore._collection.count() == 0:
                self.vectorstore.add_documents(docs)

            # Hybrid retrieval: keyword search (BM25) plus vector search
            bm25_retriever = BM25Retriever.from_documents(docs, k=top_k)
            vector_retriever = self.vectorstore.as_retriever(search_kwargs={"k": top_k})

            # Ensemble retriever: fuses both result lists using the given weights
            self.retriever = EnsembleRetriever(
                retrievers=[bm25_retriever, vector_retriever],
                weights=[0.4, 0.6]
            )

            # Re-ranking (CohereRerank reads the API key from COHERE_API_KEY)
            compressor = CohereRerank(top_n=rerank_top_n)
            self.compression_retriever = ContextualCompressionRetriever(
                base_compressor=compressor,
                base_retriever=self.retriever
            )

            # Create the RAG chain
            self.qa_chain = RetrievalQA.from_chain_type(
                llm=ChatOpenAI(model=model, temperature=0),
                retriever=self.compression_retriever,
                return_source_documents=True
            )

        def query(self, question: str) -> Dict:
            """Run a query and return the answer with its sources."""
            result = self.qa_chain.invoke({"query": question})
            return {
                "question": question,
                "answer": result["result"],
                "sources": result.get("source_documents", [])
            }

    5.2 Production-Grade Implementation with LlamaIndex

    # Written against the pre-0.10 llama_index package layout; from 0.10 on the core
    # classes moved to llama_index.core and ServiceContext was deprecated in favor of Settings.
    from llama_index import VectorStoreIndex, ServiceContext, SimpleDirectoryReader
    from llama_index.llms import OpenAI
    from llama_index.embeddings import OpenAIEmbedding
    from llama_index.node_parser import SentenceWindowNodeParser
    from llama_index.postprocessor import MetadataReplacementPostProcessor

    class ProductionRAGLlama:
        """Production-grade RAG system (LlamaIndex version)"""

        def __init__(self, data_dir: str = "./data"):
            # Configure the service context
            self.service_context = ServiceContext.from_defaults(
                llm=OpenAI(model="gpt-4", temperature=0),
                embed_model=OpenAIEmbedding(model="text-embedding-ada-002"),
                chunk_size=512,
                chunk_overlap=50
            )

            # Load the documents
            documents = SimpleDirectoryReader(data_dir).load_data()

            # Sentence-window parser: one node per sentence, with the
            # surrounding sentences stored in the "window" metadata field
            node_parser = SentenceWindowNodeParser.from_defaults(window_size=3)
            nodes = node_parser.get_nodes_from_documents(documents)

            # Create the index
            self.index = VectorStoreIndex(nodes, service_context=self.service_context)

            # Create the query engine; the postprocessor swaps each matched
            # sentence for its stored window before the answer is generated
            self.query_engine = self.index.as_query_engine(
                similarity_top_k=10,
                node_postprocessors=[
                    MetadataReplacementPostProcessor(target_metadata_key="window")
                ],
                response_mode="compact"
            )

        def query(self, question: str) -> str:
            """Run a query and return the answer text."""
            response = self.query_engine.query(question)
            return str(response)
