Self-reflective RAG with LangGraph: Self-RAG and CRAG

LangChain
7 Feb 2024 · 17:38

Summary

TLDR: Lance from LangChain explains how to use LangGraph to build diverse and sophisticated retrieval-augmented generation (RAG) flows. He first outlines the basic RAG flow: retrieve documents relevant to a question and pass them to a large language model (LLM) to generate an answer. He then walks through problems that come up in practice: when to retrieve given the context of a question, whether the retrieved documents are actually good, and what to do when they are not. Lance introduces the idea of active RAG, in which the LLM decides when and where to retrieve based on existing retrievals or generations. He discusses the different levels of control over the LLM in a RAG application and introduces the concept of a state machine, which lets the LLM choose between steps while all available transitions are specified up front. Lance shows how to implement a state machine with LangGraph by building a graph, enabling more complex and diverse RAG flows. Using the CRAG (Corrective RAG) paper as an example, he implements a flow that retrieves documents, grades them, and either generates an answer or falls back to retrieval from an external source depending on the grades. He also shows how to inspect the resulting traces in LangSmith, which gives a clear view of the whole flow. Finally, he encourages viewers to experiment with flow engineering in LangGraph and to watch for an upcoming blog post on implementing Self-RAG and CRAG.

Takeaways

  • 📚 The basic RAG (Retrieval-Augmented Generation) flow starts by retrieving documents relevant to the question, then generates an answer with an LLM (Large Language Model).
  • 🤔 In practice, several kinds of questions come up: when to retrieve based on the context of the question, whether the retrieved documents are good enough, and if not, whether to discard them, improve the question, and retry retrieval.
  • 🔄 Active RAG is the idea that the LLM decides when and where to retrieve based on existing retrievals or generations.
  • 🎛️ A RAG application can exercise different levels of control over the LLM: choosing a single step's output, making routing decisions, or building more sophisticated logical flows.
  • 🤖 A state machine lets the LLM choose between steps in the RAG flow while all available transitions are specified up front.
  • 📈 LangGraph, recently released by LangChain, provides a good way to build state machines for RAG and other applications.
  • 🔍 The CRAG (Corrective Retrieval-Augmented Generation) paper presents an active RAG method combining several ideas: document retrieval, grading, answer generation, and knowledge refinement.
  • 📝 The video demonstrates implementing the state machine with LangGraph, including simplifying the flow, supplementing the output with web search, and query rewriting.
  • 📈 LangSmith makes every step of the RAG flow observable: retrieval, grading, document filtering, and answer generation.
  • 🛠️ "Flow engineering" matters: when building complex logical flows, think carefully about how the state changes at each stage.
  • 📝 LangGraph can implement complex RAG logic flows and shines at encoding logical reasoning workflows.
  • 🔗 An upcoming blog post will cover two different active RAG approaches, Self-RAG and CRAG, implemented with state machines and LangGraph.

Q & A

  • What is the basic RAG flow?

    - The basic RAG flow retrieves documents relevant to the question from an index and passes them into the context window of a large language model (LLM) to generate an answer grounded in the retrieved documents.
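The linear flow above can be sketched in a few lines of Python. This is an illustrative stub, not the video's code: `INDEX`, `retrieve`, and `llm` stand in for a real vector store, retriever, and LLM call.

```python
# Minimal sketch of basic RAG: retrieve -> stuff docs into context -> generate.
# The index and the LLM are stubbed out purely for illustration.

INDEX = {
    "agent memory": "Agents use short-term and long-term memory stores.",
    "planning": "Agents decompose tasks into subgoals.",
}

def retrieve(question: str) -> list[str]:
    """Return documents whose key appears in the question (toy retriever)."""
    return [doc for key, doc in INDEX.items() if key in question.lower()]

def llm(prompt: str) -> str:
    """Stand-in for an LLM call; a real system would call a model here."""
    return f"Answer grounded in: {prompt}"

def basic_rag(question: str) -> str:
    docs = retrieve(question)                       # 1. retrieval from the index
    context = "\n".join(docs)                       # 2. pack docs into the context window
    return llm(f"{context}\nQuestion: {question}")  # 3. grounded generation

print(basic_rag("How does agent memory work?"))
```

Note how the path is strictly linear: no step ever reconsiders whether retrieval was worth doing or whether the documents were any good, which is exactly what the rest of the video addresses.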

  • How do we handle the different kinds of questions that come up in practice?

    - In practice we run into several kinds of questions: when to retrieve given the context of the question, whether the retrieved documents are good enough, whether to discard them if they are not, and how to retry retrieval with an improved question.

  • What is active RAG?

    - Active RAG is a process in which the large language model (LLM) decides when and where to retrieve based on existing retrievals or generations.

  • How do we control the LLM in a RAG application?

    - There are several levels of control: using the LLM to choose a single step's output; using routing to decide whether a question should go to a vector store or a graph database; or building more sophisticated logical flows in which the LLM chooses between steps while all available transitions are specified up front — this is known as a state machine.
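The middle level, routing, can be illustrated with a tiny function. A real router would ask an LLM to classify the question; the keyword heuristic below is a hypothetical stand-in used only to show the shape of the decision.

```python
# Illustrative router: send a question either to a vector store or a graph DB.
# A real router would use an LLM classification call; this keyword heuristic
# is a stand-in for demonstration only.

def route(question: str) -> str:
    relational_cues = ("related to", "connected", "relationship")
    if any(cue in question.lower() for cue in relational_cues):
        return "graph_db"      # questions about relationships suit a graph DB
    return "vector_store"      # default: semantic lookup in the vector store

print(route("How is Alice connected to Bob?"))   # graph_db
print(route("What is agent memory?"))            # vector_store
```

A state machine generalizes this: instead of one routing decision at the start, every transition in the flow can be conditioned on the current state.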

  • What is LangGraph, and how does it help build state machines?

    - LangGraph is a tool that provides a good way to build state machines for RAG and other applications. It lets you lay out more diverse and complex RAG flows and implement them as graphs, which motivates the broader idea of flow engineering: thinking through the workflow you want and then implementing it.

  • What is CRAG (Corrective RAG), and how does it implement active RAG?

    - CRAG is an active RAG method that combines several ideas. Documents are first retrieved, then graded. If at least one document exceeds a relevance threshold, generation proceeds. If no document meets the bar, it retrieves from an external source using web search and passes the search results as context for answer generation.
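The grade-then-branch logic reads naturally as code. This is a sketch under stated assumptions: `grade` is a toy word-overlap score standing in for the paper's LLM grader, and web search is stubbed with a placeholder string.

```python
# Sketch of the CRAG control flow: grade every retrieved document; if at least
# one passes the relevance threshold, generate from those; otherwise fall back
# to web search. Grader and web search are stubbed for illustration.

def grade(doc: str, question: str) -> float:
    """Toy relevance score: fraction of question words found in the doc."""
    words = question.lower().split()
    return sum(w in doc.lower() for w in words) / len(words)

def crag_answer(question: str, docs: list[str], threshold: float = 0.5) -> str:
    relevant = [d for d in docs if grade(d, question) >= threshold]
    if relevant:
        context = relevant                              # a doc passed: generate
    else:
        context = [f"web results for: {question}"]      # fallback: external source
    return "generated from: " + "; ".join(context)

print(crag_answer("agent memory", ["notes on agent memory systems"]))
print(crag_answer("agent memory", ["unrelated cooking recipe"]))
```

The paper's extra knowledge-refinement stage is deliberately omitted here, matching the simplification made in the video.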

  • How do you implement CRAG with LangGraph?

    - Implementing CRAG with LangGraph involves defining a state and creating nodes and conditional edges. First decide whether any document is irrelevant, and if so run a web search to supplement the output. Then define a state — the core object passed around and modified in the graph — and define a function for each node that modifies the state, implementing the retrieval, grading, conditional decision, query transformation, and generation steps.
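The wiring can be mimicked with a hand-rolled runner. The real implementation uses LangGraph's `StateGraph`; the stdlib sketch below only mirrors the shape — nodes are functions over a state dict, edges are a lookup table, and one entry is a callable conditional edge. All node bodies are stubs.

```python
# Hand-rolled sketch of the CRAG graph wiring. Node and edge names mirror the
# flow in the video; the bodies are stubs, and a real build would use
# langgraph's StateGraph instead of this toy runner.

def retrieve(state):
    state["documents"] = ["doc-a", "doc-b"]          # stubbed vector-store hit
    return state

def grade_documents(state):
    state["documents"] = [d for d in state["documents"] if d != "doc-b"]
    state["run_web_search"] = "Yes"                  # one doc was irrelevant
    return state

def transform_query(state):
    state["question"] += " (rewritten)"              # stubbed query rewrite
    return state

def web_search(state):
    state["documents"].append("web-doc")             # stubbed search result
    return state

def generate(state):
    state["generation"] = "answer from " + ", ".join(state["documents"])
    return state

def decide_to_generate(state):                       # the conditional edge
    return "transform_query" if state["run_web_search"] == "Yes" else "generate"

NODES = {f.__name__: f for f in
         (retrieve, grade_documents, transform_query, web_search, generate)}
EDGES = {"retrieve": "grade_documents",
         "grade_documents": decide_to_generate,      # conditional edge
         "transform_query": "web_search",
         "web_search": "generate",
         "generate": None}                           # END

def run(state, entry="retrieve"):
    node = entry
    while node is not None:
        state = NODES[node](state)
        nxt = EDGES[node]
        node = nxt(state) if callable(nxt) else nxt  # follow plain or conditional edge
    return state

final = run({"question": "how does agent memory work"})
print(final["generation"])   # answer from doc-a, web-doc
```

LangGraph replaces the `run` loop and the `EDGES` table with `add_edge`, `add_conditional_edges`, `set_entry_point`, and `compile`, but the mental model is the same.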

  • How is state modified in LangGraph?

    - In LangGraph the state is a dictionary containing things relevant to RAG, such as the question, documents, and generation. At each node of the graph a function modifies the state; for example, at the retrieve node a retrieval function adds the retrieved documents to the state.
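A minimal sketch of that pattern, assuming a stubbed retriever: the state is a plain dict (the `TypedDict` is documentation only), and the node extracts what it needs and writes back an updated copy.

```python
# Sketch of state-as-dictionary and one node function. The retriever is a
# stub; in the video it is a Chroma vector store over three blog posts.

from typing import TypedDict

class GraphState(TypedDict, total=False):
    question: str
    documents: list[str]
    generation: str

def fake_retriever(question: str) -> list[str]:
    return [f"doc about {question}"]                 # stand-in for a vector store

def retrieve(state: GraphState) -> GraphState:
    question = state["question"]                     # extract the one key we need
    documents = fake_retriever(question)             # run retrieval
    return {**state, "documents": documents}         # write state back, docs added

state: GraphState = {"question": "agent memory"}
state = retrieve(state)
print(state["documents"])   # ['doc about agent memory']
```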

  • What is LangSmith, and how does it help?

    - LangSmith is a platform for logging and inspecting LangGraph runs. With an API key set, all generations are logged to LangSmith, where you can view the nodes, grading results, and the output of every step — a clear way to inspect and understand the whole RAG flow.
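Enabling tracing is a matter of environment variables. A hedged sketch — these are the commonly documented LangSmith variables, and the project name is an assumed example; check the current LangSmith docs for your version:

```shell
# Log all LangChain/LangGraph runs to LangSmith (verify variable names against
# current LangSmith documentation; the project name below is a made-up example).
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY="<your-langsmith-api-key>"
export LANGCHAIN_PROJECT="crag-demo"
```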

  • How do you do flow engineering with LangGraph?

    - Flow engineering with LangGraph means thinking carefully through the whole workflow and then implementing it: defining the state, creating nodes and conditional edges, and writing a function for each node that performs the required state modification. This yields a clean, well-engineered graph that encodes a more sophisticated logical reasoning workflow.

  • What are LangGraph's advantages for building complex logical workflows?

    - LangGraph lets you specify exactly the transitions you want to execute, and every node is clearly enumerated — which is not always possible with other, more complex reasoning approaches. It is also very intuitive to use, which helps in understanding and building logical workflows.

  • How does the flow decide when to run a web search?

    - A search value is kept in the state and set based on the document grades. In the simplified implementation shown in the video, if any document is graded as not relevant, the web search is triggered and the search results are added to the context used to generate the answer.
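The flag-setting step can be sketched as a single node function. The binary grader is stubbed here; in the video it is an LLM chain constrained by a Pydantic model to answer yes or no.

```python
# Sketch of the grading-plus-flag node: grade each document, keep the relevant
# ones, and flip the search flag if any document fails. The grader is a stub
# for the video's yes/no LLM grading chain.

def fake_grader(question: str, doc: str) -> str:
    return "yes" if question.lower() in doc.lower() else "no"

def grade_documents(state: dict) -> dict:
    question = state["question"]
    filtered, search = [], "No"
    for doc in state["documents"]:
        if fake_grader(question, doc) == "yes":
            filtered.append(doc)            # keep relevant docs
        else:
            search = "Yes"                  # any miss triggers web search later
    return {**state, "documents": filtered, "run_web_search": search}

out = grade_documents({"question": "memory",
                       "documents": ["notes on memory", "cooking recipe"]})
print(out["run_web_search"], out["documents"])   # Yes ['notes on memory']
```

Downstream, the conditional edge reads `run_web_search` to choose between the transform-query path and direct generation.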

Outlines

00:00

📚 Introducing LangGraph and the RAG flow

Lance from LangChain introduces how to build diverse and complex RAG (Retrieval-Augmented Generation) flows using LangGraph. He first outlines the basic RAG flow: retrieve documents relevant to the question from an index, then generate an answer with an LLM (large language model). He then discusses several problems that come up in practice, such as when to retrieve, how good the retrieved documents are, and what to do when they are not good enough. Lance introduces the concept of active RAG, in which the LLM decides when and where to retrieve based on existing retrievals or generations. He also covers the different levels of control over the LLM in a RAG application: the base case, routing, and state machines for building more sophisticated logical flows. Lance highlights LangGraph's role in building state machines and RAG flows, and proposes the idea of flow engineering: plan the workflow first, then implement it.

05:01

🔍 Implementing the RAG flow and defining state

Lance shows how to implement the CRAG (Corrective Retrieval-Augmented Generation) flow with LangGraph. This active RAG method first retrieves documents and then grades them. If at least one document exceeds a relevance threshold, generation proceeds; if none qualify, it retrieves from an external source, using web search to supplement the context for answer generation. Lance details the definition of the state — the core object passed around and modified in LangGraph — a dictionary holding the question, documents, generation, and so on. He walks through how the state is modified at each node: retrieval, document grading, the logical decision (based on the grades, whether to run a web search or generate directly), and query transformation.

10:03

🔗 State machine logic and implementation

Lance continues with how to implement the state machine in LangGraph, emphasizing the logical decisions and showing how state is modified by defining functions. Each node gets a function that receives the state and modifies it accordingly: retrieval, answer generation, document grading, and so on. He also covers conditional edges, which decide the next step based on the grading results — for example, if a document is not relevant, transform the query and run a web search. Finally, he shows how to inspect the LangGraph run in LangSmith, including each node's output and the grades.

15:04

🚀 LangGraph's role and advantages in RAG

Lance wraps up with LangGraph's applications in RAG and the benefits of flow engineering with LangGraph. With LangGraph you can encode more sophisticated logical reasoning workflows in a clean, well-engineered way and explicitly specify all the transitions you want to execute. He finds this way of building logical workflows very intuitive, and trace inspection with LangGraph equally intuitive, because every node is clearly enumerated. He encourages viewers to try LangGraph and mentions an upcoming blog post covering two different active RAG approaches — Self-RAG and CRAG — implemented with state machines and LangGraph.

Keywords

💡LangGraph

LangGraph is a tool for building state machines. It lets users define complex logical flows and implement them as graphs. In the video, LangGraph is used to build a flow for document retrieval and answer generation. Its core value is that it makes complex workflows easy to plan and execute, keeping state transitions and logical decisions explicit and manageable.

💡Retrieval

Retrieval is the process of fetching relevant documents from an index. In the context of the video, retrieval is the first step: in response to a question, relevant documents are found in a vector store. The retrieval results directly affect the subsequent steps, such as document grading and answer generation.

💡Context Window

The context window is the input space available to a language model when generating an answer. In the video, retrieved documents are passed into the context window of the large language model (LLM) so that it can generate an answer grounded in those documents.

💡Large Language Model (LLM)

A large language model (LLM) is an AI model that can process and generate natural-language text. In the video, the LLM is used to understand the retrieved documents and generate answers based on them. The LLM plays the key role in active retrieval, deciding when and how documents should be retrieved.

💡Active Retrieval

Active retrieval is a strategy in which the LLM decides when and how to perform the next retrieval based on existing retrieval results or generated content. This contrasts with passive retrieval, a linear document-retrieval process with no decisions. Active retrieval lets the system respond to complex queries more flexibly.

💡State Machine

A state machine is a computational model that decides the next state based on the input and the current state. In the context of the video, a state machine controls the document-retrieval and answer-generation flow. It allows complex decision paths to be defined, so the flow can adapt dynamically based on conditional logic.

💡Vector Store

A vector store is a data structure for storing and retrieving vector representations of documents. In the video, the vector store is used to quickly retrieve documents relevant to the user's question. Converting documents to vectors enables efficient similarity search, surfacing the most relevant documents.

💡Document Grading

Document grading is the process of assessing how relevant the retrieved documents are to the user's question. In the video's flow, every retrieved document is graded against a standard to determine whether it is relevant enough, which decides whether to generate an answer directly or retrieve further.

💡Conditional Edge

A conditional edge is an edge in the state machine whose transition depends on a condition. In the video's LangGraph implementation, a conditional edge decides, after document grading, whether to run a web search or generate the answer directly, depending on the grading results.

💡Flow Engineering

Flow engineering is the practice of designing and optimizing flows so they run as effectively as possible. In the context of the video, flow engineering refers to using LangGraph to design complex document-retrieval and answer-generation flows. It emphasizes planning and then implementing the logical flow so that the system can handle a range of queries and produce accurate answers.

💡LangSmith

LangSmith is a tool for tracing and observing LangGraph runs. In the video it is used to log and inspect every step of the document-retrieval and answer-generation flow. With LangSmith you can clearly see how each node executed, including document retrieval, grading, query transformation, and answer generation.

Highlights

Lance introduces the basics of building complex RAG flows with LangGraph.

The basic RAG flow covers question-driven retrieval, fetching documents, and generating answers grounded in those documents.

Introduces active RAG: letting the LLM decide when and where to retrieve based on existing retrievals or generations.

Presents a more sophisticated way to build logical flows — state machines — to manage the steps of a RAG pipeline.

LangGraph supports building state machines for RAG, enabling complex flows.

Discusses implementing a RAG refinement method called CRAG with LangGraph.

In CRAG, documents are retrieved first and graded for relevance; if relevant, an answer is generated.

If the retrieved documents are not relevant, new documents are retrieved from an external source for answer generation.

Demonstrates the steps to build the simplified RAG state machine with LangGraph.

Shows in detail how to define the function and logic for each node in LangGraph.

Discusses the concept of flow engineering: systematically designing and implementing complex workflows.

Inspects the actual run and resulting traces of the RAG pipeline in LangSmith.

Previews an upcoming blog post detailing different RAG methods implemented with LangGraph.

Emphasizes how intuitive and efficient trace inspection is with LangGraph.

Encourages flow engineering with LangGraph to streamline complex logical operations.

Looks ahead to LangGraph's potential and future applications in RAG innovation.

Transcripts

00:01

Hi, this is Lance from LangChain. I'm going to be talking about using LangGraph to build diverse and sophisticated RAG flows.

So just to set the stage, the basic RAG flow you can see here starts with a question, retrieval of relevant documents from an index, which are passed into the context window of an LLM for generation of an answer grounded in the retrieved documents. That's the basic outline, and we can see it's a very linear path.

00:31

In practice, though, you often encounter a few different types of questions: when do we actually want to retrieve, based upon the context of the question? Are the retrieved documents actually good or not? And if they're not good, should we discard them, and then how do we loop back and retry retrieval with, for example, an improved question?

00:56

These types of questions motivate an idea of active RAG, which is a process where an LLM actually decides when and where to retrieve based upon existing retrievals or existing generations. Now, when you think about this, there's a few different levels of control that you have over an LLM in a RAG application. The base case, like we saw with our chain, is you just use an LLM to choose a single step's output. So for example, in traditional RAG you feed it documents and it decides the generation, so it's just one step. Now, a lot of RAG workflows will use the idea of routing: given a question, should I route it to a vector store or a graph DB? We have seen this quite a bit.

01:48

Now, this newer idea that I want to introduce is: how do we build more sophisticated logical flows in a RAG pipeline, where you let the LLM choose between different steps but specify all the transitions that are available? This is known as a state machine. There's a few different architectures that have emerged to build different types of RAG chains, and of course chains are traditionally used for very basic flows, but the state machine idea is a bit newer, and LangGraph, which we recently released, provides a really nice way to build state machines for RAG and for other things. The general idea here is that you can lay out more diverse and complicated RAG flows and then implement them as graphs. It kind of motivates this broader idea of flow engineering: thinking through the actual workflow that you want and then implementing it. And we're going to actually do that right now.

03:06

So I'm going to pick a recent paper called CRAG, Corrective RAG, which is really a nice method for active RAG that incorporates a few different ideas. First you retrieve documents, and then you grade them. Now, if at least one document exceeds the threshold for relevance, you go to generation: you generate your answer. It does a knowledge refinement stage after that, but let's not worry about that for right now; it's not essential for understanding the basic flow here. So again, you do a grade for relevance for every document, and if any is relevant, you generate. Now, if they're all ambiguous or incorrect based upon your grader, you retrieve from an external source — they use web search — and then you pass that as the context for answer generation. So it's a really neat workflow where you're doing retrieval just like with basic RAG, but then you're reasoning about the documents: if at least one is relevant, go ahead and generate; if they're not, retrieve from an alternative source, pack that into the context, and generate your answer.

04:31

So let's see how we would implement this as a state machine using LangGraph. We'll make a few simplifications. We're going to first decide if any documents are not relevant, and if so we'll go ahead and do the web search to supplement the output — that's just one minor modification. We'll use Tavily for web search, and we use query rewriting to optimize the web search, but it follows a lot of the intuitions of the main paper. A small note here: we set the Tavily API key. And another small note: I've already set my LangSmith API key, which we'll see is useful a bit later for observing the resulting traces.

05:26

Now I'm going to index three blog posts that I like. I'm going to use Chroma DB, I'm going to use OpenAI embeddings, and I'm going to run this right now; this will create a vector store for me from these three blog posts. Then what I'm going to do is define a state. This is the core object that's going to be passed around my graph and that I'm going to modify, and right here is where I define it. The key point to note right now is that it's just a dictionary, and it can contain things that are relevant for RAG, like question, documents, generation. We'll see how we update that in a little bit, but the first thing to note is that we define our state, and this is what's going to be modified in every node of our graph.

06:18

Now here's really the crux of it, and this is the thing I want to zoom in on a little bit. When you move from just thinking about prompts to thinking about overall flows, it's a fun and interesting exercise — I think about this, as it's been mentioned on Twitter a bit, more like flow engineering. So let's think through what was actually done in the paper and what modifications to our state are going to happen in each stage. We start with a question — you can see that on the far left — and this state is represented as a dictionary. We start with a question, we perform retrieval from our vector store, which we just created, and that's going to give us documents. So that's one node: we made an adjustment to our state by adding documents. That's step one. Now we have a second node where we're going to grade the documents, and in this node we might filter some out, so we are making a modification to state, which is why it's a node. So we're going to have a grader. Then we're going to have what we call a conditional edge: we went from question to retrieval, retrieval always goes to grading, and now we have a decision. If any document is irrelevant, we're going to go ahead and do web search to supplement, and if they're all relevant, we'll go to generation. It's a minor logical decision that we're going to make: if any are not relevant, we'll transform the query and we'll do web search, and we'll use that for generation. So that's really it, and that's how we can think about our flow and how our state can be modified throughout this flow.

08:08

Now, all we then need to do — and I found that spending ten minutes thinking carefully through your flow engineering is really valuable, because from here it's really just implementation details, and it's pretty easy, as you'll see. Basically I'm going to run this code block, and then we can walk through some of it. I won't show you everything, so it doesn't get too boring, but really all we're doing is defining functions for every node that take in the state and modify it in some way. That's all that's going on. So, thinking about retrieval: we run retrieval, we take in state — remember, it's a dict — we get our state dict like this, we extract one key, question, from our dict, we pass that to a retriever, we get documents, and we write back out state, now with a documents key added. That's all. Generate is going to be similar: we take in state — now we have our question and documents — we pull in a prompt, we define an LLM, we do minor post-processing on the documents, we set up a chain for generation, which is just going to take our prompt, pipe that to an LLM, and parse the output to a string, and we run it right here, invoking our documents and our question to get our answer. We write that back to state. That's it. And you can follow here: for every node, we just define a function that performs the state modification that we want to do on that node.

09:43

Grading documents is going to be the same. In this case I do a little extra thing here, because I actually define a Pydantic data model for my grader, so that the output of that particular grading chain is a binary yes or no. You can look at the code — it's all shared — and that just makes sure that our output is very deterministic, so that we can then perform logical filtering down here. What you can see here is that we define this search value, "No", and we iterate through our documents and grade them. If any document is graded as not relevant, we flip the search flag to "Yes", which means we're going to perform web search. We then add that to our state dict at the end: run web search, and now that value is true. That's it. And you can see we go through some other nodes here — there's a web search node. Now here is our one conditional edge, which we define right here: this is where we decide to generate or not, based on that search key. So we again get our state, we extract the various values, so we have the search value, and if search is "Yes", we return the next node that we want to go to — in this case it'll be transform query, which will then go to web search — else we go to generate.

11:15

So what we can see is that we laid out our graph, which you can see up here, and now we've defined functions for all those nodes as well as the conditional edge. And when we scroll down, all we have to do is lay that out here again as our flow. This is what you might think of as flow engineering, where you're just laying out the graph as you drew it: we have set our entry point as retrieve; we're adding an edge between retrieve and grade documents, so we go retrieval, grade documents; we add our conditional edge — depending on the grade, either transform the query and go to web search, or just go to generate; we create an edge between transform query and web search, then web search to generate; and then we also have an edge from generate to end. And that's our whole graph. That's it. So we can just run this.

12:12

And now I'm going to ask a question. Let's just say "How does agent memory work?", for example — let's just try that. What this is going to do is print out what's going on as we run through this graph. First we're going to see output from retrieve: this is going to be all of the documents that we retrieved, and that's fine, this is from our retriever. Then you can see that we're doing a relevance check across our documents, and this is kind of interesting: you can see we're grading them here, and one is graded as not relevant. And you can see the documents are now filtered, because we've removed the one that's not relevant, and because one is not relevant, we decide we're going to transform the query and run web search. You can see that after query transformation we rewrite the question slightly; we then run web search, and you can see that it searched some additional sources, which you can actually see here: a new document has been appended from web search, about memory and knowledge requirements. So it basically looked up some AI-architecture web results related to memory. That's fine — that's exactly what we want to do. And then we generate a response. So that's great, and this is just showing you everything in gory detail.

13:46

But I'm going to show you one other thing that's really nice about this. If I go to LangSmith — I have my API key set, so all my generations are just logged to LangSmith — I can see my LangGraph run here. Now, what's really cool is that this shows me all of my nodes. Remember, we had retrieve and grade; we evaluated the grade; because one was irrelevant, we then went ahead and transformed the query; we did a web search; we appended that to our context. You can see all those steps laid out here. In fact, you can even look at every single grader and its output — I'll move this up slightly so you can see the different scores for grades. So this particular retrieval was graded as not relevant; that's fine, that can happen in some cases, and because of that we did a query transformation, so we modified the question slightly: "How does the memory system in artificial agents function?" It's just a minor rephrasing of the question. We did the Tavily web search — this is where it queried from, this particular blog post on Medium — so it's a single web query, and we can sanity-check it. And then what's neat is that we can go to our generate step, look at OpenAI, and here's our full prompt — "How does the memory system in artificial agents function?" — and then here's all of our documents: the web search result as well as the relevant chunks that were retrieved from our blog posts. And then here's our answer.

15:29

So that's really it. You can see how — I'll actually go back to the original diagram and try to open this up a little bit — the transition from laying out simple chains to flows is a really interesting and helpful way of thinking about why graphs are really interesting: you can encode more sophisticated logical reasoning workflows, but in a very clean and well-engineered way, where you can specify all the transitions that you actually want to have executed. I actually find this way of thinking about and building logical workflows really intuitive. We have a blog post coming out tomorrow that discusses implementing both Self-RAG as well as CRAG, two different active RAG approaches, using this idea of state machines and LangGraph. So I encourage you to play with it — I found it really intuitive to work with. I also found inspection of traces to be quite intuitive using LangGraph, because every node is enumerated pretty clearly for you, which is not always the case when you're using other types of more complex reasoning approaches, for example agents. So in any case, I hope this was helpful, and I definitely encourage you to check out this notion of flow engineering using LangGraph — in the context of RAG it can be really powerful, as hopefully you've seen here. Thank you.


Related Tags
LangGraph · State Machines · Document Retrieval · Answer Generation · Logical Flows · Flow Engineering · Active RAG · Knowledge Refinement · Query Optimization · AI Architecture