Building open source LLM agents with Llama 3

LangChain
7 Jun 2024 · 17:39

Summary

TLDR: This video script explains how to build open-source LLM agents. Lance from LangChain walks through the core components of an agent, namely tool use, planning, and memory, and shows how to implement them with Llama 3. Using Groq, Tavily, and LangSmith, he explains how an LLM can be made aware of external tools and return the payload needed to invoke a function. He then walks through building an agent that combines an LLM with tools using LangChain, demonstrating how a variety of tools lets the agent carry out complex tasks.

Takeaways

  • 🧠 Key point: Lance from LangChain notes high interest in building agents with open-source LLMs and shows how to do it.
  • 🛠️ Concept of tool use: the central components of an agent are planning, memory, and tool use. Tool use means giving the LLM awareness of external tools and having it return the payload needed to invoke them.
  • 🔧 Binding tools to the LLM: any function can be bound to the LLM as a tool, and the LLM returns the payload required to use that tool.
  • 📚 LangChain's tool decorator: a mechanism for turning any function into a tool and binding it to the LLM.
  • 🔍 Web-search tool example: a web-search capability is implemented as a tool the LLM can use.
  • 🤖 The LLM's response prompt: the LLM acts as a helpful assistant, answering via web search or the custom function.
  • 🔗 Tool-calling flow: how the LLM decides whether to call a tool, and how the response flows from the result.
  • 🌐 Introduction to LangGraph: LangGraph is used to lay out agent flows, including complex flows with cycles.
  • 🔄 State and nodes: in LangGraph, state persists across the whole graph and is accessible to the nodes.
  • 🎨 Adding diverse tools: Replicate is used to add tools such as text-to-image, image-to-text, and text-to-speech to the LLM.
  • 🌟 Agent flexibility: agents are general-purpose and can combine many different kinds of tools.
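The tool-binding idea in these takeaways can be sketched in plain Python. This is a hypothetical stand-in for LangChain's `@tool` decorator, not its real implementation (the real decorator returns a `StructuredTool` with a Pydantic argument schema); `as_tool` and `magic_function` are illustrative names:

```python
import inspect

def as_tool(fn):
    """Illustrative stand-in for LangChain's @tool decorator: capture the
    function's name, docstring, and argument names as a tool spec."""
    sig = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "args": list(sig.parameters),  # argument names the LLM must fill in
        "func": fn,
    }

def magic_function(input: int) -> int:
    """Add two to the input."""
    return input + 2

magic_tool = as_tool(magic_function)
```

The spec (name, description, argument names) is what gets passed to the LLM so it knows the tool exists; the function itself is only run later by the tool node.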

Q & A

  • What kind of project is Lance from LangChain interested in?

    - Lance from LangChain is highly interested in building LLM agents using open-source LLMs (Large Language Models).

  • What are the central components of an agent?

    - The central components of an agent are planning, memory, and tool use.

  • What does tool use mean?

    - Tool use means giving the LLM awareness of external tools that exist, and having the LLM return the payload needed to invoke those tools.

  • What does the Magic Function example do?

    - The Magic Function is a simple function that adds 2 to its input. The LLM is made aware of the function, decides based on the user's input whether it needs to be run, and returns the required arguments.

  • How do you bind tools to the LLM when using Groq?

    - Using Groq, any function can be turned into a tool and bound to the LLM; given a natural-language question, the LLM then returns the tool's name and arguments.

  • What does the payload the LLM returns for a tool call contain?

    - The payload the LLM returns for a tool call contains the name of the tool to call and the arguments to pass to that tool.
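The answer above can be made concrete with a small dispatch sketch. The `TOOLS` registry and the literal payload shape are illustrative assumptions, not LangChain's exact internal representation (LangChain's `tool_calls` entries also carry an `id` field):

```python
def magic_function(input: int) -> int:
    """Add two to the input."""
    return input + 2

# registry mapping tool names to the actual functions
TOOLS = {"magic_function": magic_function}

# the shape of a tool call as described above: a name plus arguments
tool_call = {"name": "magic_function", "args": {"input": 3}}

# the LLM only produces the payload; the caller looks up and runs the tool
result = TOOLS[tool_call["name"]](**tool_call["args"])
```

This is the whole trick: the LLM never executes anything itself; it emits a name and arguments, and surrounding code performs the lookup and call.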

  • What is LangGraph, and how is it used to build agents?

    - LangGraph is a way to define flows that include feedback (cycles), and it is used to build agents. State persists across the whole graph and can be accessed from every node in the graph.

  • What is the conditional edge that decides whether the LLM calls a tool?

    - The conditional edge inspects the LLM's result to determine whether a tool was called; if so, it routes to the tool node, and if not, it ends.
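A minimal sketch of that routing decision, assuming the LLM's reply is a plain dict with an optional `tool_calls` key. LangGraph ships a prebuilt `tools_condition`; this is just the idea, not its implementation:

```python
def tools_condition(ai_message: dict) -> str:
    """Route to the tool node if the LLM requested a tool, otherwise end."""
    return "tools" if ai_message.get("tool_calls") else "end"
```

Wired into a graph, the "tools" branch leads to the tool node (which then loops back to the assistant), while "end" terminates with the natural-language answer.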

  • What capabilities can the Replicate service add to the LLM?

    - Using the Replicate service, a variety of models can be added to the LLM, such as text-to-image, image-to-text, and text-to-speech.

  • How do you add a new tool to the LLM with Replicate?

    - To add a new tool with Replicate, you use the tool decorator with a function definition that calls the model, and then simply append the new function to the tools list.
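That "just append to the tools list" step can be sketched as follows. All three functions here are stubs with hypothetical return values; a real `text_to_image` would call `replicate.run(...)` with a model identifier:

```python
def magic_function(input: int) -> int:
    """Existing tool: add two to the input."""
    return input + 2

def web_search(query: str) -> str:
    """Existing tool (stubbed): pretend to search the web."""
    return f"results for: {query}"

def text_to_image(prompt: str) -> str:
    """New tool (stubbed): a real version would invoke a Replicate-hosted
    text-to-image model and return the generated image URL."""
    return "https://example.com/image?prompt=" + prompt.replace(" ", "+")

tools = [magic_function, web_search]  # the tools the agent already had
tools.append(text_to_image)           # adding the new tool is just this
# llm_with_tools = llm.bind_tools(tools)  # then rebind, as before
```

Because tools are just decorated functions in a list, extending the agent is a one-line change followed by rebinding.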

  • How does this script demonstrate the flexibility and extensibility of agents?

    - The script demonstrates the flexibility and extensibility of agents not only through a simple function and web search, but also by attaching a variety of models to the LLM via Replicate and extending its capabilities.

Outlines

00:00

🤖 Introduction to tool use for LLM agents

Lance from LangChain explains how to build LLM agents. He introduces the concept of tool use based on Lilian Weng's blog post and covers the three core components: planning, memory, and tool use. Using Llama 3, he walks step by step through how to use tools, explaining how to bind tools and give the LLM access to external ones. A demonstration with Groq shows how the LLM answers natural-language queries via web search and a custom function.

05:01

🔄 Building an agent with LangChain and LangGraph

Explains how to use LangChain's tool decorator to turn any function into a tool and bind it to the LLM. Introduces LangGraph for laying out an agent flow that includes tool use. Shows how a conditional edge routes execution when a tool is called, and how the LLM processes the tool's response. As a simple example, a demonstration with the "magic function" traces the process through to the natural-language answer.
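The flow in this outline can be sketched as a plain-Python loop. `run_agent` and `fake_llm` are hypothetical illustrations of the cycle (assistant, conditional edge, tool node, back to assistant), not LangGraph's actual API:

```python
def run_agent(question, llm, tools, max_steps=5):
    """Schematic agent loop: call the LLM; if it requests a tool, run it,
    feed the result back as a tool message, and repeat until the LLM
    answers in natural language."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = llm(messages)
        if not reply.get("tool_calls"):      # conditional edge: no call -> end
            return reply["content"]
        for call in reply["tool_calls"]:     # tool node: run each requested tool
            output = tools[call["name"]](**call["args"])
            messages.append({"role": "tool", "content": str(output)})
    raise RuntimeError("agent did not produce a final answer")

def fake_llm(messages):
    """Toy stand-in for the bound LLM: requests magic_function once,
    then answers in natural language using the tool result."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_calls": [{"name": "magic_function", "args": {"input": 3}}]}
    return {"content": f"The result of magic function is {messages[-1]['content']}."}

answer = run_agent("what is magic function of 3?", fake_llm,
                   {"magic_function": lambda input: input + 2})
```

The loop structure is the whole point of using a graph with cycles: the tool's output flows back into the model until a plain answer ends the run.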

10:01

🌐 Extending the agent with more complex tools

Explains how to add multimodal capabilities to Llama 3 using the Replicate service. Various models are added as tools, including text-to-image, image-to-text, and text-to-speech. Shows how binding these tools to the LLM agent extends its ability to respond to complex queries.

15:02

🎨 Agent versatility and tool flexibility

As an example of agent versatility and tool flexibility, walks through using the text-to-image tool to turn a specific prompt into an image. Shows how the various models accessible through Replicate extend the LLM agent's capabilities and let it handle complex tasks.

Keywords

💡LLM agent

An LLM agent is software built on an open-source LLM (Large Language Model) to automate specific tasks. The video explains how to build LLM agents, whose core elements are planning, memory, and tool use.

💡Tool use

Tool use means the LLM recognizes external tools and returns the payload needed to invoke them. The video walks through the process in which the LLM recognizes a specific function, returns its name and arguments, and those are used to run the function.

💡Groq

Groq is the LLM inference provider used in the video, said to offer prompting and tuning well suited to tool use. Groq helps the LLM recognize tools and invoke them correctly.

💡Tavily

Tavily is a tool that enables web search, used by the LLM agent to retrieve information from the internet. The video shows an example of using Tavily as a tool to fetch up-to-date information.

💡Llama 3

Llama 3 is the open-source LLM used in the video to build the agent. Adapted for tool use, it can automate complex tasks.

💡LangChain

LangChain is the framework used to build the LLM agent; the video explains its features and usage. With LangChain, tools can be bound to the LLM so it can carry out complex tasks.

💡Conditional edge

A conditional edge is a feature of a LangGraph graph that controls the flow based on a condition. In the video, a conditional edge routes to a specific node only when a tool has been called.

💡LangGraph

LangGraph is part of the LangChain ecosystem and manages a flow through multiple nodes while maintaining state. The video uses LangGraph to lay out and visualize the LLM agent's execution.

💡Replicate

Replicate is a service introduced in the video that provides access to a wide range of models. Using Replicate, multimodal capabilities can be added to the LLM agent.

💡Text-to-image

Text-to-image is one of the tools used in the video; it generates an image from text. Using a model accessed through Replicate, the LLM agent can turn text into images.

💡Text-to-speech

Text-to-speech is a tool that converts text into audio, enabling the LLM agent to produce voice messages. The video demonstrates converting text content to speech with this tool.

Highlights

Discussion of building agents with open-source large language models (LLMs), specifically using LangChain and Llama 3.

Lilian Weng's blog post outlines the three core components of agents: planning, memory, and tool use.

Shows how LangChain turns arbitrary functions into tools an agent can use.

Uses Groq as the language model (LM), Tavily as the web-search tool, and LangSmith for tracing.

Explains how to make the LLM aware of external tools and return the payload needed to invoke them.

Demonstrates binding a custom function and web-search capability to the LLM.

Shows how the LLM decides whether to call a tool based on user input and returns the corresponding function name and arguments.

Discusses the importance of the LLM being fine-tuned or prompted to enable tool use.

Introduces the concept of LangGraph, a way of laying out flows, especially flows with cycles.

Shows how to define an agent in LangGraph with two nodes: an assistant node and a tool node.

Explains the concept of conditional edges, which decide whether to call a tool based on the LLM's output.

Demonstrates building a complete agent flow with LangGraph and visualizing it as a graph.

Tests the agent flow, showing how the custom function and web-search tool are invoked.

Discusses using the Replicate service to add multimodal capabilities to Llama 3.

Shows how to integrate tools such as text-to-image, image-to-text, and text-to-speech into the agent.

Demonstrates extending Llama 3's functionality by integrating different models via Replicate.

Finally, demonstrates building a complex agent with multiple tools using LangChain and LangGraph.

Transcripts

00:00

Hey, this is Lance from LangChain. We've seen very high interest in building LLM agents using open-source LLMs, so we wanted to talk through how to do that from scratch using Llama 3. First, what is an agent? Lilian Weng has a very nice blog post that laid out the central components of agents: planning, memory, and tool use. I want to walk through these components individually and show how you can use them with Llama 3.

00:25

First, let's talk about tool use. I'm going to copy over some code and we'll walk through it. In this notebook I've done a few pip installs and set a few API keys. We'll use Groq as our LM, Tavily for web search as one of our tools, and LangSmith for tracing. That's all I've done here. I'm going to keep this image side by side so we can look at it.

00:45

So first, tool use. What's the big idea here? The big idea is simply this: I want to take an LLM, give it awareness of some external tool that exists, and have the LLM return the payload necessary to invoke that tool. That's really all that's going on. This is often confused, so I want to zoom in and explain it exactly. Let's say I have a function called magic_function, which takes an input and adds two to it. I want to give an LLM the ability to recognize whether or not to invoke this function, and to return the payload necessary to run the function given the user input. Here's exactly what I want to happen: I take that function, somehow bind it to my LLM, and give the LLM an input; it then returns both the function name itself and the arguments necessary to run the function. Remember, LLMs are just string to string; they don't have the magic ability to call that function natively. What an LLM can do is say: I've seen this function, I know it exists, and I'm going to give you exactly the input format, the payload, needed to run the function, as well as the name of the function. That's really all that's going on.

01:51

First, this tool decorator in LangChain allows you to take any arbitrary function and turn it into a tool. Let's kick this off. Here's my magic function and here's a web search function; these are the two things I want to turn into tools, and I can do that right here. Now if I look at the magic function, it's a structured tool: it has a name, it has a description, and it has its input, or arguments, captured as a Pydantic schema. All this information can be passed directly to our LLM; that's the key point. This allows us to go from arbitrary functions to tools that can be bound to an LLM. That's step one.

02:35

Now step two, and this is where things get interesting. I'm going to use Groq, and I'm going to use a prompt that basically says: you're a helpful assistant with two tools, web search and a custom function; use web search for current events, use the magic function if the user directly asks for it, and otherwise just answer directly. Let's test this in two cases to explain exactly how this works. All I'm doing is using ChatGroq, setting Llama 3 70B, and creating this runnable, which is a LangChain primitive for invoking the LLM. Now here's what's interesting: I'm piping the prompt to the LLM, and I've bound my tools to the LM. This automatically takes the tools we defined and gives them to the LM so that it's aware of them. That's what's represented in this red box here: you take external tools and bind them to the LM so the LM is aware they exist. That's step one.

03:33

Here's step two. I can take a question, "what is magic function of 3", invoke my runnable, or my chain, with it, and see what happens. I run this, and here's what's interesting: the payload contains an object, tool_calls, which contains the name of the function and the arguments. That's it; that's the key thing. I can look at the raw payload as well. The raw payload is simply this AI message; it contains a bunch of information, but the main thing is that it contains the name of the function to call and the arguments to pass to the function. Again, that's exactly what's represented here. All that's happening is: I've taken a function, turned it into a tool, and bound it to my LLM; I can ask a question in natural language, and the LLM can respond directly with the function to call, or the tool to use, and the input arguments, based on the user input. That's the key point, and that's really all that's happening in function calling. That's all I need you to know.

04:35

Okay, here's the other key thing. What if I just ask a question about the United States? Based on my prompt, it should not try to invoke either of these tools. Let's test that. I run this; good. In this payload, tool_calls is empty. I can look at the raw payload, and now it's just a chat response: the capital of the US is Washington, DC. Great.

04:56

So that's it; hopefully you now understand how tool use works. Remember, this requires an LM that has actually been fine-tuned or prompted, or is otherwise compatible with tool use, and this is a very important point. We talked to the folks at Groq; they have a proprietary implementation for how they do this, which we don't know fully, but it is reported to work very well with Llama 3 70B, and in my experience I've seen it indeed work quite well. In any case, the key point is this: I can take any arbitrary functions I want, turn them into tools, pass those tools to an LLM, and bind them. Then, as you can see right here, when I invoke my LLM with a question, the LM makes a decision to use one of the tools, and if it does, it returns the name of the tool it wants to use and the input arguments. That's the key point, and that is really what you need to know about tool use.

05:55

Now we get to the fun stuff: we're going to build the agent, and for this I'm going to use LangGraph. I'll explain how this works as we go, but first, the way to think about LangGraph is that it's basically a way to lay out flows, and flows in LangGraph in particular are often characterized by cycles, the ability to do feedback, which is really relevant for agents; we'll explain why shortly. LangGraph basically takes a state, which can live over the course of your graph, or flow, and can be accessed by all of what we're going to call the nodes in your graph. So first, as state, I'm just going to define a set of messages. Don't worry too much about this for now; it will all make sense in about a minute.

06:34

Now here's where things get interesting: I'm going to define an agent that contains two nodes. First, we take our input, again a human message, and pass it to our LM, which has the bound tools. The LLM makes a decision to use a tool or not; we just walked through that, so that's step one, the thing we've already seen. Now, what we're going to do in LangGraph is add what we're going to call a conditional edge. All this edge does is ask: was there a tool call or not? If there was a tool call, it routes over to a separate node that runs the tool. Let's walk through our example, "what is magic function of 3". The LLM made the decision to invoke the magic function and gave us the payload; we just saw that: arguments input is 3, name is magic_function. Those get plumbed over to what we're going to call the tool node, which actually invokes the necessary tool. It takes in the name magic_function, looks up the magic function itself, runs that function with the input payload, and returns the result as a tool message to the LLM. The LLM sees that tool message, makes a decision about what to do next, and this keeps running until there's a natural-language response. In this toy example, the tool message returns the result of 5; that's returned to the LM, the LM sees it and says, okay, the result is 5, and then you exit. That's the toy example we want to see. Now, we can implement all of this in LangGraph really easily.

08:16

Let's talk through that quickly; I've copied over the code here. All we've defined is this assistant, which basically just wraps the chain we defined up here, the assistant runnable; all we're doing is adding a retry. If a tool is called, we're good, that's valid; if the response has meaningful text, we're good; otherwise we re-prompt. That's all we're doing here: making sure the LLM actually returns a valid response. We're also creating this tool node, which will try to invoke the tool, and we add a little utility to handle errors in the feedback; these are just utility functions, so don't worry too much about them. Now here's the interesting bit: we're going to build the graph, and it's going to look exactly like we show here. We add a node for our assistant, we add a node for our tool node, and then we add this conditional edge, a tools condition, which is just this piece: it takes the result from the LM and asks, was a tool called? If yes, go to the tool node; if no, end. We can implement that right here. This tools condition just returns either the tool node or end. Then we go from tools back to the assistant. Now let's run all this. What's nice about LangGraph is that it will automatically lay this out as a graph for us, and we can visualize it here. What's going to happen is: we start, we invoke our assistant, the assistant will in some cases ask to use a tool, it then goes to the tool node, the tool is invoked, that returns to the assistant, and that continues until there's a natural-language response, and then we end. Nice and easy.

10:14

So let's actually test this out. I'm going to ask a super simple question. I have two questions: "what is magic function of 3" and "what's the weather in SF". Let's ask the first question, "what's magic function 3". Boom, we run this. Now I'd like to go over to LangSmith and look at the result, so let's walk through it. We started, we went to our assistant, and these are the functions available to our assistant: we gave it magic_function and we gave it web search. Here's the prompt, "what's magic function 3", and what we get as output is, again, the function to use and the payload to pass to the function. Remember, this is always a little bit confusing: an LLM can't magically call functions; it's typed string to string. It can return strings and it ingests strings, and that's fine. All it returns in this particular case is the payload to run the function, as well as the function name. That's it; that's all the LM is responsible for. Then our tools node, see, that's here, invokes our function, and you can see the input is just the argument and the output is 3 + 2 = 5. Great. Now this goes back to our LLM, and the LLM simply sees this tool message, that the function was called with an output of 5, and it returns natural language: the result of magic function is 5. Then we end. Nice and simple. We can also see that laid out here: here's our human message; this is the AI message, where the AI makes a decision to invoke the tool and gives you the input payload; then here's the output tool message, saying "I ran the tool, here's the output"; the LLM gets that back and gives you natural language. Then, based on our condition here, this tools condition, if it's natural language it ends; if it's a tool invocation it goes back to the tool node. In this particular case it went back to the assistant, and now it's a natural-language response, which means we just end. That's it.

12:21

That's a nice and simple example. Now let's try something slightly more complicated and try our other tool. Let's try "what's the weather in SF right now". We run that, and cool, we can see that it's going to call our web search endpoint. That's great; it gets this raw tool message back from the endpoint, and then the AI synthesizes that into, you know, the weather is 60° right now with mist. Okay.

12:48

So that's really it. This explains how you can lay out arbitrary agents with Llama 3, an open-source LLM. We used ChatGroq to do that. Groq has been adapted for tool use, and that's the main important thing you need to recognize: you need an LM that actually has tool use enabled, via prompting, fine-tuning, or otherwise. And what you can see, if we go back to the diagram, is that we're using LangGraph to orchestrate this process. You take a question in; our LM makes the decision, based on the question, to invoke a tool; then this conditional edge determines that if a tool is invoked, we go to the tool node and actually execute the tool. The tool is executed, you get a tool message back with the tool output, and you send that back to the LM. The LM reasons again, and it could make a decision to call another tool, but in our particular case, in both cases, the tool message output was returned to the LM, the LM then responded in natural language with the solution, and because of that we end. That's it; that's how to build an agent from scratch using an open-source LLM, Llama 3, with LangGraph to orchestrate it, from very simple components and first principles. Again, the key thing here really is the ability for an LM to reliably invoke tools.

14:16

We talked through the case of adding two tools, magic function and web search, to our agent. Now let's say we wanted to make this a little bit more complicated and try some additional tools. Replicate is a service that allows you to access many different models, which is really convenient, and I'm going to use it to augment Llama 3 with a few multimodal capabilities. All I've done is set my Replicate API key (I've actually already done that) and imported replicate, and I'm going to use a few different things here. I'm going to make a text-to-image tool, which is going to call a particular model, basically an open DALL·E-style model, that goes from text to image. I'm going to create another tool, image-to-text; in this case it takes an image in and uses a version of LLaVA to produce text from the image. And text-to-speech, this is another option. Really, all you need to do here is, very simply, again use this tool decorator with a function definition that invokes the model of choice.

15:12

So now the question is: how do we add these as tools to our agent? Again, it's just like before: all we need to do is update our tools list to include some of our new functions. That's it, pretty simple. Now, that tools list is already bound to our agent here, so let's just go ahead and rerun everything to make sure this all works. All I'm going to do here is update my question list to include a few new questions related to my new tools, and let's go ahead and try one. Let's say I want to try my index-2 question, question two, which is going to be my question related to text-to-image. I'll kick this off and then go back and show you. This should invoke the text-to-image tool based on this prompt: a yellow puppy running free with wild flowers in the mountains behind. That's our prompt; we're going to pass it to our text-to-image tool, and it looks like that has been called correctly, so that's great.

16:25

Now we can also go over to LangSmith. I can check my projects here; cool, here's my agent, here it is running, so we can also look at the trace to confirm that everything's working. Cool, it looks like it is calling the text-to-image tool, so that's fantastic; it's running right now. Great, our tool ran. Now we can check our image here, and look at that; very nice.

16:49

So again, this is just showing you the ability to create agents that have many different types of tools. Previously we had only covered two very simple tools, a magic function and web search, but we can actually do pretty interesting things. This shows how you can take Replicate, for example, and invoke many different models hosted by Replicate, and not just LLMs but different types of models: a text-to-image model, image-to-text, text-to-speech, and so forth, basically to augment Llama 3 and give it multimodal capabilities. In any case, it's a really nice illustration of the fact that agents are very general, and tools can be composed of many different kinds of things; in this particular case, different models through Replicate, which we can attach to Llama 3 to augment its capabilities. Thanks.
