A Comprehensive Cookbook for Claude 3

LlamaIndex
7 Mar 2024 · 22:52

Summary

TLDR: Jerry from LlamaIndex walks through a Claude 3 cookbook, showing how to plug Anthropic's new models (Haiku, Sonnet, and especially Opus) into LLM applications: basic RAG over a toy article, a router query engine, sub-question query decomposition, text-to-SQL over the Chinook database, structured data extraction with Pydantic schemas, and a ReAct agent.

Takeaways

  • 🚀 Claude 3 has just been released; Anthropic shipped three models: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus.
  • 🏆 Claude 3 Opus is the strongest of the three and outperforms all other models, including GPT-4.
  • 📈 Claude 3 Opus shows excellent results on a variety of benchmarks, and users' hands-on impressions are also striking.
  • 🤖 The models have a 200,000-token context window, which Anthropic plans to extend to 1 million.
  • 🔌 Claude 3 is accessible via API, and subscribing to Claude Pro gives access to Opus.
  • 📚 The LlamaIndex Python library makes the Anthropic integration straightforward.
  • 📄 A simple article is used to demonstrate data indexing.
  • 🔍 A Vector Store Index and a Summary Index are used to index the document's knowledge and retrieve from it.
  • 🛠️ A RAG (Retrieval-Augmented Generation) pipeline generates answers to specific questions.
  • 🔄 The Router Query Engine routes a question to one of several tools.
  • 🔢 The SQL Query Engine runs text-to-SQL over a structured database.
  • 🌐 The ReAct Agent prompts the LLM directly to decide which actions to take to solve a task.

Q & A

  • When was Claude 3 released?

    -Claude 3 was released on March 4th, 2024.

  • What are the three new models Anthropic released?

    -The three new models Anthropic released are Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus.

  • How is Claude 3 Opus rated against other models?

    -Claude 3 Opus outperforms other models, including GPT-4, across a variety of benchmarks. Its superiority shows not only in the numbers but also in the impressions of people trying Claude out on Twitter.

  • What is Claude 3 Opus's context window?

    -Claude 3 Opus has a 200,000-token context window, which Anthropic plans to extend to 1 million.

  • How can you access Claude 3 Opus?

    -You can access Claude 3 Opus through the API. You can also subscribe to Claude Pro, which gives you access to Opus by default.

  • How do the LlamaIndex Python library and Anthropic integrate?

    -You can integrate Anthropic using the LlamaIndex Python library. Anthropic doesn't natively provide an embedding model, so Hugging Face's BGE embedding model is used here instead. You also pip install the LlamaIndex Anthropic integration package, which gives you a wrapper you can use directly with the rest of the abstractions.
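A minimal sketch of that setup, assuming the llama-index 0.10-style package layout; the model and embedding names follow the video, but check your installed versions:

```python
# pip install llama-index llama-index-llms-anthropic llama-index-embeddings-huggingface
from llama_index.core import Settings
from llama_index.llms.anthropic import Anthropic

# LlamaIndex wrapper around the Anthropic client SDK
# (exposes completion, chat, async, and streaming methods).
llm = Anthropic(model="claude-3-opus-20240229", temperature=0.0)

# Configure once instead of passing the LLM into every module.
Settings.llm = llm
# "local:" prefix = run a local Hugging Face embedding model (BGE small).
Settings.embed_model = "local:BAAI/bge-small-en-v1.5"

print(llm.complete("Say hello in one sentence."))
```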

  • How is Claude 3 Opus demonstrated on a toy dataset?

    -To demonstrate Claude 3 Opus on a toy dataset, a simple article is loaded from the web and Beautiful Soup is used to clean the HTML and format the text. You then enter your Anthropic API key and set up the necessary imports and settings. Finally, a vector store index and a summary index are built and query engines are run over them to show off what Claude 3 Opus can do.
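A sketch of that loading-and-indexing step; BeautifulSoupWebReader ships in the llama-index web readers package (assumed installed), and the URL placeholder stands in for the Verge article used in the video:

```python
# pip install llama-index-readers-web
import os

from llama_index.core import SummaryIndex, VectorStoreIndex
from llama_index.readers.web import BeautifulSoupWebReader

os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."  # enter your own key

# The reader cleans out the HTML and formats the text for you.
url = "<the-verge-article-url>"  # "The synthetic social network is coming"
documents = BeautifulSoupWebReader().load_data(urls=[url])

# Two indexes: embeddings + top-k retrieval vs. a return-everything summary.
vector_index = VectorStoreIndex.from_documents(documents)
summary_index = SummaryIndex.from_documents(documents)
```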

  • How do you set up a SQL query engine with Claude 3 Opus?

    -To set up a SQL query engine with Claude 3 Opus, first download the Chinook SQLite database and connect to it using SQLAlchemy. Then pass the SQL database and the tables you want to query to the NL SQL table query engine. This engine translates natural language into SQL, executes it against the database, and returns the results.
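A sketch of that text-to-SQL setup, assuming the Chinook SQLite file was saved as chinook.db and that the LLM has already been configured via Settings; the lowercase table names are an assumption, so check your copy of the schema:

```python
from sqlalchemy import create_engine

from llama_index.core import SQLDatabase
from llama_index.core.query_engine import NLSQLTableQueryEngine

# Connect to the downloaded Chinook SQLite database via SQLAlchemy.
engine = create_engine("sqlite:///chinook.db")
sql_database = SQLDatabase(engine)

# Natural language -> SQL -> execution -> synthesized answer.
query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=["albums", "tracks"],  # assumed table names
)

response = query_engine.query(
    "What are some tracks from the artist AC/DC? Limit it to three."
)
print(response)
print(response.metadata["sql_query"])  # inspect the generated SQL
```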

  • How does structured data extraction work with Claude 3 Opus?

    -Structured data extraction with Claude 3 Opus uses LlamaIndex's structured data extraction programs, which combine a prompt, an LLM, and the desired output format as a Pydantic schema. Using the LLM text completion program, or a Pydantic program that integrates with OpenAI function calling, you can get the LLM to produce well-formed JSON output.
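A sketch of the text-completion program with the Song/Album schema the video describes, reusing the llm defined in the earlier sketch; the exact field names are a guess:

```python
from typing import List

from pydantic import BaseModel

from llama_index.core.program import LLMTextCompletionProgram

class Song(BaseModel):
    title: str
    length_seconds: int

class Album(BaseModel):
    # Nested schema: an album holds a list of songs.
    name: str
    artist: str
    songs: List[Song]

program = LLMTextCompletionProgram.from_defaults(
    output_cls=Album,  # desired output format
    prompt_template_str=(
        "Generate an example album with an artist and a list of songs. "
        "Use the movie {movie_name} as inspiration."
    ),
    llm=llm,  # the Claude 3 Opus wrapper
)

album = program(movie_name="The Shining")  # returns an Album instance
print(album.name, album.artist, len(album.songs))
```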

  • How do you build a ReAct agent with Claude 3 Opus?

    -To build a ReAct agent with Claude 3 Opus, you drive the LLM through direct prompting. The ReAct agent is a general-purpose agent that works with any LLM: it takes an input and a set of tools, prompts the LLM, and outputs actions. It uses a framework combining Chain-of-Thought reasoning with tool use to decide which actions will solve the problem.
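A sketch of the agent setup; the tool names and descriptions here are illustrative, and vector_index/summary_index are the indexes built in the earlier sketch:

```python
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import QueryEngineTool

# Wrap the two query engines as tools the agent can choose between.
vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_index.as_query_engine(),
    name="vector_search",
    description="Useful for answering specific questions about the article.",
)
summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_index.as_query_engine(),
    name="summary",
    description="Useful for summarizing the entire article.",
)

agent = ReActAgent.from_tools([vector_tool, summary_tool], llm=llm, verbose=True)

print(agent.chat("Hello!"))  # small talk: no tool call
print(agent.chat("What was mentioned about Meta, and how does it differ from OpenAI?"))
```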

  • Of the Claude 3 Opus use cases described in this script, which is the most interesting?

    -The most interesting use case is the sub-question query engine, which decomposes a question into sub-questions and decides which tool should handle each one. Because it breaks a complex question into finer pieces and answers them with different tools, it can yield deeper understanding and insight.
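A sketch of that engine, reusing vector_tool and summary_tool from the agent sketch above; nest_asyncio is needed in notebooks because the sub-questions run on parallel async tasks:

```python
import nest_asyncio

from llama_index.core.query_engine import SubQuestionQueryEngine

nest_asyncio.apply()  # allow nested event loops inside a notebook

sub_question_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=[vector_tool, summary_tool],
    verbose=True,  # print each generated sub-question and its answer
)

response = sub_question_engine.query(
    "What was mentioned about Meta, and how does that differ from "
    "how OpenAI is talked about?"
)
print(response)
```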

Outlines

00:00

📚 Introduction and overview of Claude 3

In this section, Jerry from LlamaIndex introduces how to use Claude 3. Claude 3 was released on March 4th, 2024 and is currently the best set of models available, with Claude Opus in particular far ahead of other models. The three new models Anthropic released, Haiku, Sonnet, and Opus, are also covered: Opus is the strongest and beats other models on a variety of benchmarks. Claude 3 is accessible via API, and Opus is available through a Claude Pro subscription. This section covers the basics of using Claude 3 and how to apply it within LlamaIndex.

05:01

🛠️ Using Claude 3 with LlamaIndex

This section explains how to use Claude 3 from LlamaIndex. First, you install the LlamaIndex Python library and set up RAG (Retrieval-Augmented Generation) with Hugging Face's BGE embedding model. Next, you load a short article from The Verge as the data to index, using the Beautiful Soup web reader to clean up the HTML. Finally, you set the Anthropic API key and install the LlamaIndex Anthropic LLM integration package to hook up Claude 3.

10:01

🔍 Using the RAG pipeline and query engines

This section shows how to use the RAG pipeline and query engines. First, the documents are indexed into a vector store index and a summary index. Next, a query engine is set up over the vector index with the compact response mode and used to ask about OpenAI's and Meta's AI tools and retrieve responses. The router query engine, which routes a question to one of several tools, is also introduced, as sketched below.
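Reusing the vector_tool and summary_tool wrappers from the Q&A sketches above, a minimal router might look like this; whether extra options such as verbose logging are available through from_defaults depends on your version:

```python
from llama_index.core.query_engine import RouterQueryEngine

# The router picks exactly one tool per query (select_multi=False).
router_engine = RouterQueryEngine.from_defaults(
    query_engine_tools=[vector_tool, summary_tool],
    select_multi=False,
)

# Per the video, Opus can refuse to pick a tool for terse queries,
# so nudging it with "use a tool" helps.
response = router_engine.query("What was mentioned about Meta? Use a tool.")
print(response)
print(len(response.source_nodes))  # 2 => the top-k vector engine was picked
```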

15:02

📊 Query decomposition and the SQL query engine

This section explains query decomposition and the SQL query engine. Query decomposition splits a single question into multiple sub-queries and selects the right tool for each. The SQL query engine is then used to connect to a structured database and run text-to-SQL over it, with examples querying the Chinook SQLite database for information about music artists and albums.

20:05

🤖 Structured data extraction and the ReAct agent

This section covers structured data extraction and the ReAct agent. Structured data extraction uses LlamaIndex "programs" to get the LLM to generate structured data, and the ReAct agent drives the LLM through direct prompting to decide which actions to take to solve a problem. An album inspired by the movie "The Shining" is used to walk through the extraction process, and the section then shows how the ReAct agent operates.

🚀 Overview of Claude 3's many uses

The final section recaps the many ways to use Claude 3. Jerry shares the results of his experiments and explains how Claude 3 handles a variety of tasks, and he predicts that more advanced agents with similar capabilities will appear in the future. The section closes by inviting viewers to leave comments and looking forward to the next video.


Keywords

💡Claude 3

Claude 3 is the new set of models released by Anthropic, considered state of the art. Claude Opus in particular stands out, outperforming other models across a variety of benchmarks.

💡API

An API is an interface through which software applications communicate. Claude 3 is accessible via API, letting developers build its capabilities into their own applications.

💡LLM

LLM stands for Large Language Model, a kind of natural language processing technology. These models learn patterns from text data and can generate natural prose and answer questions.

💡embedding model

An embedding model is a natural language processing technique that converts text into numeric vector representations, which lets a computer compare texts and retrieve the most relevant information.

💡text-to-SQL

Text-to-SQL is a technique for translating natural language into SQL queries, letting users query a database in plain English.

💡retrieval

Retrieval is the process of finding specific information within a large body of data. The video shows how to retrieve the most relevant pieces of text and feed them to a language model.

💡Chain of Thought

Chain of Thought is a sequence of logical reasoning steps toward solving a problem. The video shows how Claude 3 breaks a question down and generates an answer step by step.

💡structured data extraction

Structured data extraction is the process of pulling structured information out of text. The video shows how Claude 3 parses natural language and converts it into a structured, database-like format.

💡query engine

A query engine is a software component for querying a database or other information source. The video shows how to set up query engines that use Claude 3 to understand natural language and generate appropriate answers.

💡ReAct agent

A ReAct agent combines reasoning with acting: it maintains the conversation history and works through a chain of thought, calling tools along the way. The video shows how Claude 3 drives this loop to generate answers.

Highlights

Introduction to Claude 3 and its models, including Claude Opus, which is considered the best model available today.

Claude 3 was released on March 4th, 2024, and has outperformed other models, including GPT-4, on various benchmarks.

Anthropic released three new models: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus.

Claude Opus has a 200,000-token context window, and Anthropic plans to extend it to 1 million soon.

Claude 3 is accessible via API, and subscribing to Claude Pro gives access to Opus by default.

Demonstration of using Claude 3 in LLM applications through a cookbook approach.

Explanation of how to install the LlamaIndex Python library and the Hugging Face BGE embedding model for the RAG setup.

Showcase of a simple article from The Verge used for data indexing, small enough to fit within a context window and still yield reasonable responses.

Instructions on installing the LlamaIndex Anthropic LLM package to integrate with the rest of the abstractions.

Importing Anthropic from llama_index.llms.anthropic to access the underlying methods, including completion, chat, and streaming.

Definition of a Settings object in LlamaIndex for one-time configuration, including the embedding model.

Creation of two indexes: a vector store index for top-k retrieval and a summary index that returns all of its context at retrieval time.

Running the first RAG pipeline with the vector index as the query engine and the response mode set to compact.

Introduction to the router query engine, which decides which choice a given query should be routed to, based on a set of choices.

Example of joint question answering and summarization using vector and summary tools with the router query engine.

Explanation of query decomposition as an advanced RAG concept, allowing for more complex question answering.

Demonstration of the sub-question query engine, which decomposes questions into sub-questions and picks the relevant tools to answer them.

Overview of the SQL query engine, showing how to connect to a structured database and run text-to-SQL over it using Claude 3.

Example of structured data extraction using LlamaIndex's program abstraction.

Introduction to the ReAct agent, a general-purpose agent that prompts the LLM to output actions to solve tasks.

Illustration of how the ReAct agent works by maintaining conversation history and using Chain-of-Thought reasoning with tool use.

Transcripts

[00:01] Hey everyone, Jerry here from LlamaIndex, and today we'll be going through a cookbook showing you how to use Claude 3 in your LLM applications. Claude 3 just came out two days ago, on Monday, March 4th, 2024, and it is probably the best set of models out there today, especially Claude Opus. Anthropic released three new models: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus. Opus is by far the best model and pretty much outperforms all other models, including GPT-4, on a variety of different benchmarks. You see this in terms of the numbers, but you also see it in terms of people personally playing around with Claude on Twitter, and by all means the results seem quite impressive. Claude also has a 200,000-token context window, and Anthropic has said they'll extend it to 1 million soon, but right now what users can use is 200k. The other piece here is that Claude 3 is accessible via API, and of course you can also subscribe to Claude Pro to get access to Opus by default (the standard chat UI runs Claude 3 Sonnet). We've been playing around with it, and it is indeed quite impressive. The goal here is really just to show you a notebook of all the different use cases you can plug Claude 3 into, from basic RAG to text-to-SQL to agents, and to walk through how you can use Claude within LlamaIndex. So let's go through the Claude 3 cookbook.

[01:36] To start with, you're just going to want to pip install the LlamaIndex Python library. Anthropic doesn't natively offer embedding models, so in this case we're just going to use an off-the-shelf Hugging Face BGE embedding model for the RAG setup. We're also going to load in a simple article from the web, just for data indexing; it's a toy article. To use Anthropic, you'll want to install the LlamaIndex Anthropic LLM package, which lets you pip install the integration and then directly use the wrapper with the rest of our abstractions. You run these pip installs and get back the output; just to save some time, we've already run them for you. The next piece is to load in some data. Here we just load in a simple article from The Verge titled "The synthetic social network is coming." It's a fairly simple article; there's really not that much in it, and you could actually fit the entire thing easily within a context window and get back a reasonable response, but this is just a showcase of how you can build something over a toy dataset. We use the Beautiful Soup web reader to load in this web page; it's a nice reader that cleans out the HTML and formats the text for you. The next piece is to enter your Anthropic API key. Here we've already pre-entered it, but make sure to do it when you follow this notebook.

[03:05] Now we get to the interesting stuff. We want to import Anthropic from llama_index.llms.anthropic. This is the LlamaIndex wrapper that wraps the Anthropic client SDK and lets you access all the underlying methods, including completion, chat, async, and streaming. You just need to specify a model name and, optionally, the temperature. Here we set the temperature to zero to enforce some determinism in the notebook runs, and for the model we're currently going to test with Opus (by default it's actually Claude 2, but you can also use Sonnet if you want). The next item is to define a Settings object. This is a convenience wrapper within LlamaIndex where you can define some configuration settings once instead of passing the LLM through all the other modules we have, so you can just call settings.llm = the LLM you defined here. For the embedding model, we do have proper abstractions: you can import the embedding modules and set the embedding model that way. Here, though, it's actually just a string, which is some syntactic sugar: instead of importing the module, you define a prefix. We use "local", which means it's a local Hugging Face model, followed by the Hugging Face model ID; here we just use the BAAI BGE small model.

[04:32] So let's run this. As a next step, we're going to create two indexes. One is a vector store index, a classic index in LlamaIndex that generates embeddings for all the chunks within the documents you feed it and then lets you interact with it via top-k retrieval. The other is the summary index, which just indexes everything, and during retrieval it returns everything; it's basically a very simple index that returns all the context it has. We'll first call the vector store index to build an index over these documents; the next step is to call SummaryIndex.from_documents. We also define some helpful imports for logging.

[05:27] Now that we've built the vector index, we can run our first RAG pipeline. The first thing we'll show is a very basic RAG pipeline, defined by vector_index.as_query_engine with the response mode set to compact. A query engine here is just a retrieval query engine: it first runs retrieval and then runs response synthesis. Setting the response mode to compact means that during response synthesis, we take the chunks we retrieved and try to compact them as much as possible into the prompt window of the LLM; this is probably the default setting you should use whenever you put a query engine on top of an index. Once you get back a query engine, it's just an object you can call to ask questions and get back a response, triggering the RAG pipeline. Given the contents of this article, we're going to ask: how do OpenAI and Meta differ on AI tools? Let's run this. It does embedding-based retrieval and then synthesis via Claude 3. It probably takes a little bit of time, because the embedding model needs a moment to run, but it gives you back a response: based on the information provided, OpenAI and Meta are taking somewhat different approaches. Of course there are other response modes too, such as refine and tree_summarize; we won't go into detail here, but you can check out the docs to see how these response modes work.

[07:13] To go a little bit beyond the simple RAG pipeline, the next step is the router query engine. What is a router? A router is basically a simple module that, given a query and a set of choices, decides which choice the query should be routed to; it then calls that choice and passes the query over to it. It's very simple, and in some of our other talks we paint it as one of the simplest agent abstractions you can build, because it uses an LLM for reasoning: it's basically just a prompt that does some dynamic choice selection and picks where to route the query. One of the use cases here is joint question answering and summarization. As an example, we can define a vector tool, which is a wrapper around the query engine that does the top-k RAG setup, and a summary tool, which is a wrapper around the summary index query engine that retrieves everything and can handle summarization-like queries over the data. Given these two tools, as well as the metadata attached to each tool, we can now define a router over these choices. Let's first instantiate them; both are defined as query engine tools, and each has a name and a description. As a next step, we'll define a router query engine, which just takes in both tools. You can actually do multiple-choice selection if you want, meaning it can choose more than one choice, but here we only have two tools, so we set select_multi to false. Then let's run a question over it. One interesting quirk we found with Opus is that if you just ask a question like "what was mentioned about Meta," it actually ends up throwing an error in our router, because it ends up not picking either of the two choices. That's a slight quirk we'll work out, but if you say "what was mentioned about Meta, use a tool," it will actually try using one of the two tools to answer the question. So let's run this.

[09:39] You see it takes just a little bit of time, because it's using the model to, one, make a choice, and two, do retrieval and synthesis, but we get back the final answer. If you ask "what was mentioned about Meta," it's able to give you back the answer: Meta is doing the following related to AI. If you take a look at the number of source nodes, response.source_nodes, you see it's equal to two, which is the k value set for top-k retrieval on the vector query engine, so it's effectively picking the vector query engine. If you really want to see verbose output, all you have to do is toggle verbose=True, and you'll be able to see which choice it made in the log outputs.

[10:29] Okay, we're going to skip multi-selection for now; let's go on to another advanced RAG concept, which is query decomposition. This is what we call the sub-question query engine. It's a layer, again, on top of a set of tools that you define: given a question, it decomposes that question into a set of sub-questions and decides which tools correspond to those sub-questions. In that sense it's a little more complicated than routing. Routing takes the original question and chooses the tool or subset of tools it should route to, while the sub-question query engine also decomposes the question into sub-questions and picks the tools needed to answer each one, so it does the extra step of query decomposition.

[11:24] Similarly to before, we define both the vector tool and the summary tool; the vector tool is better for answering specific questions, and the summary tool is better for summarizing an entire document. We'll import nest_asyncio, because this engine spins off the sub-questions on separate async tasks. Then we define the sub-question query engine by calling SubQuestionQueryEngine.from_defaults with both of these tools, and we set verbose=True so we can see the log outputs. Now let's ask a multi-part question: what was mentioned about Meta, and how does that differ from how OpenAI is talked about? As this runs, you see it actually generated five sub-questions; Opus is pretty eager about breaking the question down into a bunch of things that could be answered by the different tools. The first sub-question, using vector search, is "what was mentioned about Meta in the document." One for the summary tool is "summarize the key points about how Meta is discussed in the document," which goes through all the context. Another, using vector search, is "what was mentioned about OpenAI in the document," and for the summary tool, "summarize the key points about how OpenAI is discussed in the document." The last is to compare and contrast the key differences in how they're discussed, based on the summaries. All the questions are launched in parallel, so the answers get streamed back maybe a little bit out of order, but they all come in from these different tool executions. At the end, it combines all the responses and gives you back a final result: according to the article, Meta is taking a different approach to GenAI compared to OpenAI. So again, this is another step toward an agent that can do query planning and execution; we'll see what a full ReAct agent looks like later on, but this is basically a one-shot query decomposition question-answering tool.

[13:45] The next item here is our SQL query engine, where we show how to connect to a structured database and run text-to-SQL over it, using Claude 3 with our text-to-SQL abstractions. We'll download the Chinook SQLite database, a very popular test database containing information about music artists and albums, and ask some questions over it. We put all the data in SQLite and connect to it via SQLAlchemy, so engine = create_engine. We then wrap it with our own SQLDatabase abstraction, which lets you plug it into our NLSQLTableQueryEngine; slight mouthful, but it's basically our text-to-SQL query engine. It takes in the SQL database along with the tables you want to query over. You define this query engine, and when you call query_engine.query("what are some albums"), it's able to give you back a response. It's running right now, and under the hood what's happening is that it takes the natural language, translates it to SQL using the LLM, executes the SQL against the database, and gives you back the response; there's also another LLM call at the end to synthesize a final response for you. Let's take a look at an example query: what are some tracks from the artist AC/DC, limited to three? In this example we'll actually be able to see the SQL query that was created. First we run it (we'll just hide this for now), it produces an answer, and then we can see the SQL query under the hood: something like "SELECT tracks.name AS title ..." and so on.

[16:03] The next application use case is structured data extraction. This is a very popular use case for LLMs, and more and more LLMs are coming out with built-in support for function calling; it started with OpenAI, and Claude 3 actually has built-in support for function calling as well. We're still working on function calling support for Claude 3, but in general, for any LLM, even one that doesn't support function calling, you can prompt the LLM to try to output correct JSON. In this setting we have what we call programs within LlamaIndex, our main abstraction for structured data extraction: a program combines a prompt, an LLM, and your desired output format as a Pydantic schema. You can plug in any LLM and try to generate an answer that conforms to that schema; of course, certain models will work better than others. The LLM text completion program relies on direct prompting; we also have a Pydantic program that integrates with, for instance, OpenAI function calling, and we're working on an integration with Claude as well. In this setting we're using the LLM text completion program. We define the desired Pydantic schema, Song and Album: a Song contains a title as well as the length in seconds, and an Album has a name, an artist, and a list of songs, so there's some nesting going on. After you define these Pydantic classes, you just import our LLM text completion program and call its from_defaults, passing in three things: the desired output format, which is our Album class schema; the prompt template string, which is the input you want to feed to the LLM; and the LLM itself. So first let's run this, and then let's run this. The input to this program is all the template variables that exist in the prompt template string. Here you see "use the movie {movie_name} as inspiration," which is the input string you fill in, so if you call program(movie_name="The Shining"), it'll generate an example album inspired by The Shining. You see it's complete, and the final output looks like this: "Overlook Hotel," the artist, and a list of songs. The type of the output is the Album class defined here, and the output correctly contains the name and artist, and actually extracts out a set of songs as well.

[19:03] Last but not least, we'll set up a ReAct agent with Anthropic Claude 3. Similarly to the structured data extraction example, the ReAct agent relies on direct prompting of the LLM; we don't integrate with function calling yet, and an agent that directly leverages function calling to help make the next decision is something that's coming soon. The ReAct agent is a general-purpose agent that takes in any LLM and, given a set of tools, prompts the LLM into outputting the actions to take in order to solve the task. It takes an input set of tools; we'll use the tools we defined above, the vector tool and the summary tool (again, the vector tool is for question answering, and the summary tool is for summarization). We pass in the Anthropic Claude 3 LLM and initialize the agent. Now we can chat with the agent via agent.chat, and it will maintain the conversation history over time. If we just say hello, it won't use a tool; it will just give you back a response. The ReAct agent uses the ReAct framework, which is just Chain-of-Thought reasoning combined with tool use: given a question, it breaks it down step by step, and within each step it decides whether to call a tool or to finish execution. So here we get "hello, how can I assist you today," and next let's ask the exact same multi-part question we asked the sub-question query engine: what was mentioned about Meta, and how does it differ from OpenAI? Let's see how the agent answers it. You can see it going through the Chain-of-Thought loop. First it says: to answer this question, I will need to search the provided documents for mentions of Meta and OpenAI and how they are discussed differently. So the first step is a vector search on Meta; the vector RAG pipeline gives back an answer, and given this it generates the next thought, which is that this provides useful information, but it still hasn't searched for OpenAI. So it does a vector search with input "OpenAI" and gets back another observation. After this, it sees in the conversation history that it has results for both Meta and OpenAI, and when that's passed to the LLM, it's able to give you back the final response. The final thought is: I have all this information, and therefore I am done. And this is the final answer that you get.

[22:14] There are of course more interesting agent approaches out there, everything from plan-and-solve (basically doing query planning like the sub-question query engine, but in a loop), to async parallel function-calling execution like LLMCompiler, to Monte Carlo tree search, and a lot more to come. But I hope this was a general overview of how to use Claude in a variety of different use cases. We'll do a more in-depth deep dive into a lot of this, especially to better explore the capabilities of Claude, but in any case, thanks, feel free to leave your comments below, and see you soon.


Related Tags
Claude 3 · Anthropic · Opus · models · LLM applications · AI · Hugging Face · data indexing · natural language processing · web scraping · SQL queries · structured data extraction