Part 1: Advanced RAG Pipeline (Chinese and English subtitles)

AI大模型之美
24 Dec 2023 · 15:18

Summary

TL;DR: This lesson walks through how to build both a basic and an advanced retrieval-augmented generation (RAG) pipeline with LlamaIndex. It first explains how the basic RAG pipeline works, covering its three stages: ingestion, retrieval, and synthesis. Then, by defining a set of evaluation metrics with TruLens, we can benchmark advanced RAG techniques against the basic pipeline. The lesson also shows how advanced techniques such as sentence-window retrieval and auto-merging retrieval improve performance, and how to evaluate and benchmark them with TruLens.

Takeaways

  • 📚 Shows how to build both basic and advanced retrieval-augmented generation (RAG) pipelines with LlamaIndex.
  • 🔍 A RAG pipeline consists of three components: ingestion, retrieval, and synthesis.
  • 📈 Defines a set of metrics with TruLens to benchmark advanced RAG techniques against the basic pipeline.
  • 📊 Demonstrates building a simple RAG application with LlamaIndex and an OpenAI LLM.
  • 📝 Discusses how documents are chunked, embedded, and indexed.
  • 🔑 Uses GPT-3.5 Turbo and the Hugging Face BGE-small model for embedding and retrieval.
  • 💡 Initializes feedback functions via TruLens to create the RAG evaluation triad: answer relevance, context relevance, and groundedness.
  • 📌 Stresses the importance of automated evaluation (e.g. with TruLens) for assessing generative AI applications at scale.
  • 🌟 Compares the basic RAG pipeline against advanced retrieval techniques (sentence-window retrieval and auto-merging retrieval).
  • 🚀 Shows how to set up and evaluate the two advanced retrieval techniques: sentence-window retrieval and auto-merging retrieval.
  • 📈 Presents a comprehensive leaderboard comparing the retrieval techniques on the evaluation metrics and their efficiency.

Q & A

  • What is a RAG (Retrieval-Augmented Generation) pipeline?

    -A RAG pipeline combines information retrieval with text generation. It has three stages: ingestion, retrieval, and synthesis. Together they let the system produce richer, more accurate answers to user queries.

  • What happens in the 'ingestion' stage of a RAG pipeline?

    -In the ingestion stage, a set of documents is loaded, each document is split into text chunks, an embedding vector is generated for each chunk, and the embeddings are stored in an index.
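The ingestion steps described in this answer (load, split, embed, index) can be sketched end to end in plain Python. This is a library-free illustration, not the lesson's LlamaIndex code: the fixed-size character splitter and the hash-based `toy_embed` are stand-ins for a real text splitter and a real embedding model such as BGE-small.

```python
import hashlib

def split_into_chunks(text: str, chunk_size: int = 50) -> list[str]:
    # Fixed-size character splitter; real splitters work on sentence or
    # token boundaries and usually add overlap between chunks.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

def toy_embed(chunk: str, dim: int = 8) -> list[float]:
    # Deterministic stand-in for a real embedding model.
    digest = hashlib.sha256(chunk.encode()).digest()
    return [b / 255 for b in digest[:dim]]

def ingest(documents: list[str]) -> list[dict]:
    # The "index" here is just a list of {text, embedding} records;
    # production systems store these in a vector database.
    index = []
    for doc in documents:
        for chunk in split_into_chunks(doc):
            index.append({"text": chunk, "embedding": toy_embed(chunk)})
    return index

index = ingest(["How to build a career in AI. Start small and iterate."])
print(len(index), len(index[0]["embedding"]))  # number of chunks, embedding dim
```

Each record pairs a chunk with its vector, mirroring what a vector store keeps per entry.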

  • How do you build a RAG pipeline with LlamaIndex?

    -With LlamaIndex we can create a simple LLM application that internally uses an OpenAI LLM. First we create a service context object specifying the LLM and the embedding model to use, then we index the documents with LlamaIndex's VectorStoreIndex.

  • How is the performance of a RAG pipeline evaluated?

    -We initialize feedback functions with TruLens to create the RAG evaluation triad: pairwise comparisons between the query, the response, and the context. This yields three evaluation modules: answer relevance, context relevance, and groundedness.
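The triad can be pictured as three scoring functions over one (query, context, response) record. The word-overlap scorer below is only a crude lexical placeholder for the LLM-based judges TruLens actually uses; the function names are illustrative.

```python
def overlap_score(a: str, b: str) -> float:
    # Fraction of a's words that also appear in b: a rough stand-in
    # for an LLM judge scoring relevance or support on a 0-1 scale.
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    return len(words_a & words_b) / max(len(words_a), 1)

def rag_triad(query: str, context: str, response: str) -> dict:
    return {
        # Is the response relevant to the query?
        "answer_relevance": overlap_score(query, response),
        # Is the retrieved context relevant to the query?
        "context_relevance": overlap_score(query, context),
        # Is the response supported by the retrieved context?
        "groundedness": overlap_score(response, context),
    }

scores = rag_triad(
    query="how do I start an AI project",
    context="start an AI project by picking a small scope",
    response="pick a small scope to start an AI project",
)
print(scores)
```

The three keys match the triad's modules; in TruLens each score would come from a model-graded evaluation rather than word overlap.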

  • What is sentence-window retrieval?

    -Sentence-window retrieval is an advanced retrieval technique that embeds and retrieves single sentences; after retrieval, each sentence is replaced with a larger window of sentences surrounding the originally retrieved one. This gives the LLM more context for answering the query.
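A minimal sketch of the window-expansion step, assuming the document has already been split into sentences and one sentence was retrieved as the best match (the function name and default window size are illustrative, not LlamaIndex's API):

```python
def sentence_window_retrieve(sentences: list[str], hit_index: int,
                             window: int = 1) -> str:
    # After retrieving the single best-matching sentence (hit_index),
    # hand the LLM the surrounding window of sentences instead, so
    # synthesis sees more context than was matched.
    lo = max(0, hit_index - window)
    hi = min(len(sentences), hit_index + window + 1)
    return " ".join(sentences[lo:hi])

sentences = ["AI careers need projects.",
             "Start with a small scope.",
             "Grow complexity over time.",
             "Share results publicly."]
# Suppose sentence 1 was the single retrieved hit:
print(sentence_window_retrieve(sentences, hit_index=1))
# → AI careers need projects. Start with a small scope. Grow complexity over time.
```

Matching stays fine-grained (one sentence) while synthesis gets the wider window.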

  • How does the auto-merging retriever work?

    -The auto-merging retriever builds a hierarchy in which larger parent nodes contain smaller child nodes that reference their parent. During retrieval, if a majority of a parent's children are retrieved, the children are replaced by the parent, hierarchically merging the retrieved nodes.
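The merge rule can be sketched as follows. The 0.5 threshold stands in for "a majority of the children"; the node names and the flat dict hierarchy are invented for illustration.

```python
def auto_merge(retrieved: set[str], children: dict[str, list[str]],
               threshold: float = 0.5) -> set[str]:
    # If more than `threshold` of a parent's children were retrieved,
    # swap those children out for the parent node itself.
    merged = set(retrieved)
    for parent, kids in children.items():
        hits = [k for k in kids if k in merged]
        if len(hits) / len(kids) > threshold:
            merged -= set(hits)
            merged.add(parent)
    return merged

# A 512-token parent with four 128-token children, as in the lesson's example:
hierarchy = {"parent-512": ["c1", "c2", "c3", "c4"]}
print(auto_merge({"c1", "c2", "c3"}, hierarchy))  # → {'parent-512'}
```

Three of four children hit, so they collapse into the parent; with only one or two hits the children would be returned unchanged.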

  • How do you set up and use the auto-merging retriever?

    -Using the auto-merging retriever requires building a hierarchical index and obtaining a query engine from it. A helper function constructs the index, and later lessons dive into how it works under the hood.

  • What role does TruLens play in evaluating RAG pipelines?

    -TruLens provides a standard mechanism for evaluating generative AI applications at scale. It lets us evaluate an application in a way that is custom to our domain and dynamic to its changing demands, without relying on expensive human evaluation or fixed benchmarks.

  • How can context relevance be improved in a RAG pipeline?

    -More advanced retrieval techniques, such as sentence-window retrieval and auto-merging retrieval, improve the relevance of the retrieved context. They supply more contextual information, which makes the synthesized final answer more relevant and accurate.

  • What is the 'synthesis' stage of a RAG pipeline?

    -In the synthesis stage, the retrieved context chunks are combined with the user query and placed in the LLM's prompt window to generate the final response.
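A minimal sketch of this prompt-assembly step; the template wording is invented for illustration, and real frameworks also handle context-window limits and response parsing.

```python
def build_prompt(query: str, context_chunks: list[str]) -> str:
    # Stuff the retrieved chunks plus the user query into one prompt
    # for the LLM to synthesize a final answer from.
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

prompt = build_prompt(
    "How do I gain AI experience?",
    ["Start with small projects.", "Increase scope gradually."],
)
print(prompt)
```

The assembled string is what would be sent to the LLM; the model's completion after "Answer:" becomes the final response.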

  • How can total cost be optimized in a RAG pipeline?

    -Improving retrieval and synthesis performance can lower total cost while keeping relevance high. For example, sentence-window retrieval and auto-merging retrieval improve groundedness and context relevance without increasing total cost.

Outlines

00:00

📚 Building Basic and Advanced RAG Pipelines

This segment introduces how to set up both basic and advanced retrieval-augmented generation (RAG) pipelines. Documents are loaded and chunked, embeddings are created with a text splitter and an embedding model, and the embeddings are stored in an index. A user query then drives retrieval, fetching the document chunks most similar to the query, which are combined with the query in the language model's synthesis stage to generate the final response. The segment also discusses setting up an evaluation benchmark with TruLens and getting started quickly with LlamaIndex and OpenAI's language model. Finally, users are encouraged to upload their own PDF for practice, and some basic sanity checks are run on the document.

05:01

🔍 Evaluation and Benchmarking

This part describes in detail how to initialize feedback functions with TruLens and create the RAG evaluation triad: answer relevance, context relevance, and groundedness. With a pre-written list of questions and the TruLens modules initialized, the application can be evaluated. TruLens provides an evaluation mechanism whose criteria can be customized to the needs of a specific domain and to changes in the application. The part also shows how to run evaluations with the TruLens recorder and how to inspect the results in the UI, including the answer relevance, context relevance, and groundedness metrics.

10:04

🚀 Advanced Retrieval Techniques

This segment explores two advanced retrieval techniques: sentence-window retrieval and auto-merging retrieval. Sentence-window retrieval embeds and retrieves single sentences, then replaces each retrieved sentence with a larger window of surrounding sentences to provide more context. Auto-merging retrieval builds a hierarchy and merges retrieved nodes into larger parent nodes. The segment also shows how to set up these techniques, runs example queries, and evaluates their performance with TruLens.

15:06

📈 Results and Comparison of Techniques

This part compares and analyzes the evaluation results of the basic RAG pipeline and the two advanced retrieval techniques (sentence-window retrieval and auto-merging retrieval). Comparing metrics such as groundedness, answer relevance, and context relevance, along with total cost and latency, shows which technique is more effective. The results indicate that auto-merging retrieval performs better on groundedness and context relevance at a lower total cost. Finally, the UI is used to compare the performance of the different techniques.

📖 A Closer Look at the Evaluation Modules

The final segment previews the next lesson's deep dive into the RAG triad of evaluation modules: answer relevance, context relevance, and groundedness. This will help users better understand how to use these modules and what each one requires.

Keywords

💡RAG Pipeline

A RAG pipeline (retrieval-augmented generation pipeline) is an AI system that combines retrieval and generation to improve the accuracy of retrieved information and the relevance of generated content. The video covers setting up basic and advanced RAG pipelines built from three main components: ingestion, retrieval, and synthesis. For example, the basic RAG pipeline chunks documents, generates embedding vectors, and stores them in an index, then retrieves the chunks most similar to a user query and combines them with the query to generate the final response.

💡LlamaIndex

LlamaIndex is a framework for storing and retrieving vector representations of documents, and it plays a central role in building the RAG pipeline. In the video, LlamaIndex stores the embedding vectors of document chunks and performs retrieval over them.

💡embedding model

An embedding model is an AI model that converts text or other data into numeric vectors that capture the semantics of the original data. In the video, the embedding model turns text chunks into embedding vectors so that LlamaIndex can perform efficient similarity retrieval over them.
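Similarity between embedding vectors is usually measured with cosine similarity. A toy sketch with made-up 3-dimensional vectors (real models such as BGE-small produce vectors with hundreds of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Cosine of the angle between two vectors: 1.0 means same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

query_vec = [0.9, 0.1, 0.0]
chunk_vecs = {"chunk-a": [0.8, 0.2, 0.1], "chunk-b": [0.0, 0.1, 0.9]}
best = max(chunk_vecs, key=lambda c: cosine_similarity(query_vec, chunk_vecs[c]))
print(best)  # → chunk-a
```

Retrieval is essentially this argmax over all stored chunk vectors, which vector databases accelerate with approximate nearest-neighbor indexes.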

💡user query

A user query is the question or request a user submits to the system. In a RAG pipeline, the user query is the starting point that triggers retrieval and generation: the system retrieves relevant information from the index based on the query and combines that information to generate a response.

💡evaluation benchmark

An evaluation benchmark is a set of standards or metrics used to measure and compare system performance. In the video, an evaluation benchmark is set up to assess the effectiveness of different RAG techniques, such as comparing the basic pipeline against the advanced ones.

💡TruLens

TruLens is a tool for evaluating and providing feedback on generative AI applications. It defines a series of evaluation modules that automatically score an application's performance, such as answer relevance and context relevance. In the video, TruLens is used to initialize feedback functions and perform automated evaluation.

💡sentence-window retrieval

Sentence-window retrieval is an advanced retrieval technique that embeds and retrieves single sentences; after retrieval, each sentence is replaced with a larger window of sentences surrounding the originally retrieved one. This lets the system retrieve finer-grained information while supplying more context for the generated response, ideally improving both retrieval and synthesis performance.

💡auto-merging retriever

The auto-merging retriever is a technique that builds hierarchically structured data for retrieval and works by merging retrieved nodes into larger parent nodes. If a majority of a parent node's children are retrieved, the children are replaced by the parent in the retrieval results. This organizes and merges information more effectively during retrieval.

💡context relevance

Context relevance measures how closely the retrieved context relates to the user query. It is one of the key metrics for evaluating a RAG pipeline, gauging whether the retrieved context is tightly related to what the user asked.

💡groundedness

Groundedness measures the extent to which the generated response is supported by the retrieved context. In a RAG pipeline, a highly grounded response is not only relevant to the user's query but also backed by the exact information that was retrieved.

Highlights

This lesson provides a comprehensive overview of how to set up basic and advanced RAG (retrieval-augmented generation) pipelines with LlamaIndex.

A RAG pipeline consists of three distinct components: ingestion, retrieval, and synthesis.

The index is a view over a storage system such as a vector database, which supports accurate retrieval.

TruLens is used to define a set of metrics for benchmarking advanced RAG techniques against the basic pipeline.

In the ingestion stage, a set of documents is loaded and each document is split into text chunks.

For each text chunk, an embedding model generates an embedding, and the embedded chunk is offloaded to an index.

The retrieval stage launches a user query against the index and fetches the top-K chunks most similar to the query.

The synthesis stage combines the retrieved chunks with the user query in the language model's prompt window to generate the final response.

The lesson dives into setting up an evaluation benchmark with LlamaIndex and TruLens.

A simple LLM application is created with LlamaIndex, which internally uses an OpenAI LLM.

Feedback functions are initialized with TruLens to create the RAG evaluation triad: pairwise comparisons between the query, response, and context.

LLMs are emerging as a standard mechanism for evaluating generative AI applications, allowing evaluation to be customized to the application.

The TruLens dashboard shows each query's inputs and outputs, record IDs, tags, and other details.

Advanced retrieval techniques, such as sentence-window retrieval and auto-merging retrieval, can improve retrieval and synthesis performance.

Sentence-window retrieval embeds and retrieves single sentences, then replaces each retrieved sentence with a larger window of surrounding sentences.

The auto-merging retriever merges retrieved nodes into larger parent nodes, achieving hierarchical merging during retrieval.

Comparing the results of the different retrieval techniques yields a comprehensive leaderboard of how each technique performs.

Transcripts

00:00

In this lesson, you'll get a full overview of how to set up both a basic and an advanced RAG pipeline with LlamaIndex. We'll load in an evaluation benchmark and use TruLens to define a set of metrics so that we can benchmark advanced RAG techniques against the baseline, or basic, pipeline. In the next few lessons, we'll explore each of these a little bit more in depth.

00:18

Let's first walk through how a basic retrieval-augmented generation pipeline, or RAG pipeline, works. It consists of three different components: ingestion, retrieval, and synthesis. Going through the ingestion phase, we first load in a set of documents. For each document, we split it into a set of text chunks using a text splitter. Then, for each chunk, we generate an embedding using an embedding model, and for each chunk with its embedding, we offload it to an index, which is a view of a storage system such as a vector database. Once the data is stored within an index, we then perform retrieval against that index. First we launch a user query against the index, then we fetch the top-K most similar chunks to the user query. Afterwards, we take these relevant chunks, combine them with the user query, and put them into the prompt window of the LLM in the synthesis phase, and this allows us to generate a final response.

01:16

This notebook will walk you through how to set up a basic and an advanced RAG pipeline with LlamaIndex. We will also use TruLens to help set up an evaluation benchmark so that we can measure improvements against the baseline. For this quick start, you will need an OpenAI API key. Note that for this lesson we'll use a set of helper functions to get you set up and running quickly, and we'll do a deep dive into some of these sections in future lessons.

01:44

Next, we'll create a simple LLM application using LlamaIndex, which internally uses an OpenAI LLM. In terms of the data source, we'll use the "How to Build a Career in AI" PDF written by Andrew Ng. Note that you can also upload your own PDF file if you wish, and for this lesson we encourage you to do so. Let's do some basic sanity checking of what the document consists of, as well as the length of the document. We see that we have a list of documents; there are 41 elements in there. Each item in that list is a document object, and we'll also show a snippet of the text for a given document. Next, we'll merge these into a single document, because it helps with overall text splitting accuracy when using more advanced retrieval methods such as sentence-window retrieval as well as auto-merging retrieval.

02:34

The next step here is to index these documents, and we can do this with the VectorStoreIndex within LlamaIndex. Next, we define a service context object, which contains both the LLM we're going to use as well as the embedding model. The LLM is GPT-3.5 Turbo from OpenAI, and the embedding model is the Hugging Face BGE-small model. These steps show the ingestion process right here: we've loaded in documents, and then, in one line, VectorStoreIndex.from_documents does the chunking, embedding, and indexing under the hood with the embedding model that you specified.

03:28

Next, we obtain a query engine from this index that allows us to send user queries that do retrieval and synthesis against this data. Let's try out our first request. The query is "What are steps to take when finding projects to build your experience?" And the response: start small and gradually increase the scope and complexity of your projects. Great, so it's working.

04:04

So now you've set up the basic RAG pipeline. The next step is to set up some evaluations against this pipeline to understand how well it performs; this will also provide the basis for defining our advanced retrieval methods, the sentence-window retriever as well as the auto-merging retriever. In this section, we use TruLens to initialize feedback functions. We initialize a helper function, get_feedbacks, to return a list of feedback functions to evaluate our app. Here we've created a RAG evaluation triad, which consists of pairwise comparisons between the query, response, and context, and so this really creates three different evaluation modules: answer relevance, context relevance, and groundedness. Answer relevance: is the response relevant to the query? Context relevance: is the retrieved context relevant to the query? And groundedness: is the response supported by the context? We'll walk through how to set this up yourself in the next few notebooks.

05:02

The first thing we need to do is to create a set of questions on which to test our application. Here we've pre-written the first ten, and we encourage you to add to the list. Now we have some evaluation questions: "What are the keys to building a career in AI?", "How can teamwork contribute to success in AI?", etc. Here we specify a fun new question, "What is the right AI job for me?", and we add it to the eval questions list. Now we can initialize the TruLens modules to begin our evaluation process.

05:51

We've initialized the TruLens module, and now we've reset the database. We can now initialize our evaluation modules. LLMs are growing as a standard mechanism for evaluating generative AI applications at scale. Rather than relying on expensive human evaluation or set benchmarks, LLMs allow us to evaluate our applications in a way that is custom to the domain in which we operate and dynamic to the changing demands for our application.

06:19

Here we've pre-built a TruLens recorder to use for this example. In the recorder, we've included the standard triad of evaluations for evaluating RAGs: groundedness, context relevance, and answer relevance. We'll also specify an ID so that we can track this version of our app; as we experiment, we can track new versions by simply changing the app ID. Now we can run the query engine again with the TruLens context. What's happening here is that we're sending each query to our query engine, and in the background the TruLens recorder is evaluating each of our queries against these three metrics. If you see some warning messages, don't worry about them.

07:13

Here we can see a list of queries as well as their associated responses. You can see the input, the output, the record ID, tags, and more. You can also see the answer relevance, context relevance, and groundedness for each row. In this dashboard, you can see your evaluation metrics, like context relevance, answer relevance, and groundedness, as well as average latency, total cost, and more, in a UI. Here we see that the answer relevance and groundedness are decently high, but context relevance is pretty low. Now let's see if we can improve these metrics with more advanced retrieval techniques like sentence-window retrieval as well as auto-merging retrieval.

07:59

The first advanced technique we'll talk about is sentence-window retrieval. This works by embedding and retrieving single sentences, so more granular chunks, but after retrieval the sentences are replaced with a larger window of sentences around the original retrieved sentence. The intuition is that this allows the LLM to have more context for the information retrieved in order to better answer queries, while still retrieving on more granular pieces of information, so ideally improving both retrieval as well as synthesis performance. Now let's take a look at how to set it up.

08:34

First, we'll use OpenAI's GPT-3.5 Turbo. Next, we'll construct our sentence-window index over the given document. Just a reminder that we have a helper function for constructing the sentence-window index, and we'll do a deep dive into how this works under the hood in the next few lessons. Similar to before, we'll get a query engine from the sentence-window index, and now that we've set this up, we can try running an example query. Here the question is "How do I get started on a personal project in AI?", and we get back a response: to get started on a personal project in AI, it is first important to identify and scope the project. Great.

09:21

Similarly to before, let's get the TruLens evaluation context and try benchmarking the results. Here we import the pre-built TruLens recorder for the sentence-window index, and now we'll run the sentence-window retriever on top of these evaluation questions and then compare performance on the RAG triad of evaluation modules. Here we can see the responses come in as they're being run. Some examples of questions and responses: "How can teamwork contribute to success in AI?" Teamwork can contribute to success in AI by allowing individuals to leverage the expertise and insights of their colleagues. "What's the importance of networking in AI?" Networking is important in AI because it allows individuals to connect with others who have experience and knowledge in the field.

10:24

Great. Now that we've run evaluations for two techniques, the basic RAG pipeline as well as the sentence-window retrieval pipeline, let's get a leaderboard of the results and see what's going on. Here we see that groundedness is eight percentage points better than the baseline RAG pipeline, answer relevance is more or less the same, context relevance is also better for the sentence-window query engine, latency is more or less the same, and the total cost is lower. Since groundedness and context relevance are higher but the total cost is lower, we can intuit that the sentence-window retriever is actually giving us more relevant context, and more efficiently as well. When we go back into the UI, we can see that we now have a comparison between the direct query engine, the baseline, as well as the sentence window, and we can see the metrics that we just saw in the notebook displayed in the UI as well.

11:30

The next advanced retrieval technique we'll talk about is the auto-merging retriever. Here we construct a hierarchy of larger parent nodes with smaller child nodes that reference the parent node. So, for instance, we might have a parent node of chunk size 512 tokens, and underneath it four child nodes of chunk size 128 tokens that link to this parent node. The auto-merging retriever works by merging retrieved nodes into larger parent nodes, which means that during retrieval, if a parent actually has a majority of its children nodes retrieved, then we'll replace the children nodes with the parent node. So this allows us to hierarchically merge our retrieved nodes. The combination of all the child nodes is the same text as the parent node. Similarly to the sentence-window retriever, in the next few lessons we'll do a bit more of a deep dive on how it works. Here we'll show you how to set it up with our helper functions.

12:32

Here we've built the auto-merging index, again using GPT-3.5 Turbo for the LLM as well as the BGE model for the embedding model. We got the query engine from the auto-merging retriever, and let's try running an example query: "How do I build a portfolio of AI projects?" In the logs here, you actually see the merging process going on: we're merging nodes into a parent node, to basically retrieve the parent node as opposed to the child nodes. To build a portfolio of AI projects, it is important to start with simple undertakings and gradually progress to more complex ones. Great, so we see that it's working.

13:18

Now let's benchmark the results with TruLens. We've got a pre-built TruLens recorder on top of our auto-merging retriever. We then run the auto-merging retriever with TruLens on top of our evaluation questions. Here, for each question, you actually see the merging process going on, such as merging three nodes into the parent node for the first question. If we scroll down just a little bit, we see that for some of these other questions we're also performing the merging process: merging three nodes into the parent node, merging one node into the parent node. An example question-response pair: "What is the importance of networking in AI?" Networking is important in AI because it helps in building a strong professional network and community.

14:08

Now that we've run all three retrieval techniques, the basic RAG pipeline as well as the two advanced retrieval methods, we can view a comprehensive leaderboard to see how all three techniques stack up. We get pretty nice results for the auto-merging query engine on top of the evaluation questions: 100% in terms of groundedness, 94% in terms of answer relevance, and 43% in terms of context relevance, which is higher than both the sentence window and the baseline RAG pipeline. And we get roughly equivalent total cost to the sentence-window query engine, implying that the retrieval here is more efficient, with equivalent latency. At the end, you can view this in the dashboard as well.

14:53

This lesson gives you a comprehensive overview of how to set up a basic and an advanced RAG pipeline, and also how to set up evaluation modules to measure performance. In the next lesson, we'll do a deep dive into these evaluation modules, specifically the RAG triad of groundedness, answer relevance, and context relevance, and you'll learn a bit more about how to use these modules and what each module needs.


Related Tags
AI, RAG, LlamaIndex, TruLens, performance evaluation, text retrieval, generative models, career development, AI industry, tutorial