Building open source LLM agents with Llama 3

LangChain
7 Jun 2024 · 17:39

Transcripts

Hey, this is Lance from LangChain. We've seen very high interest in building LLM agents using open source LLMs, so we wanted to talk through how to do that from scratch using Llama 3. First, what is an agent? Lilian Weng has a very nice blog post that laid out the central components of agents as planning, memory, and tool use. I want to walk through these components individually and show how you can use them with Llama 3.

First, let's talk about tool use. I'm going to copy over some code and we're going to walk through it. In this notebook I've done a few pip installs and set a few API keys: we'll use Groq as our LLM, we'll use Tavily for web search as one of our tools, and we'll use LangSmith for tracing. That's all I've done here.
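The setup cell itself isn't shown on screen, so here is a plausible sketch of it; the environment variable names are the ones Groq, Tavily, and LangSmith conventionally document, but treat the exact list as an assumption:

```python
import os

# Turn on LangSmith tracing (the env var LangSmith reads).
os.environ.setdefault("LANGCHAIN_TRACING_V2", "true")

# Keys assumed by the tools mentioned in the video.
required = ["GROQ_API_KEY", "TAVILY_API_KEY", "LANGCHAIN_API_KEY"]
missing = [name for name in required if name not in os.environ]
if missing:
    print("Set these keys before running the notebook:", ", ".join(missing))
```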

Okay, I'm going to keep this image side by side so we can look at it. So first, tool use: what's the big idea here? The big idea is simply this: I want to take an LLM, give it awareness of some external tool that exists, and have the LLM return the payload necessary to invoke that tool. That's really all that's going on. This is often confused, so I want to zoom in and explain it exactly.

Let's say I have a function called magic_function, which takes an input and adds two to it. I want to give an LLM the ability to recognize whether or not to invoke this function, and to return the payload necessary to run the function, given the user input. Here's exactly what I want to have happen: I take that function, somehow bind it to my LLM, and give the LLM an input; it then returns both the function name itself and the arguments necessary to run the function. Remember, LLMs are just string to string; an LLM doesn't have the magic ability to call that function natively. But what it can do is say: okay, I've seen this function, I know it exists, and I'm going to give you exactly the input format, or the payload, necessary to run the function, as well as the name of the function. That's really all that's going on.
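Stripped of any framework, the contract just described can be sketched in plain Python. This is an illustrative sketch of the division of labor, not LangChain's actual wire format:

```python
def magic_function(input: int) -> int:
    """The example tool: add two to the input."""
    return input + 2

# The model never executes anything. Given "what is magic function(3)?",
# a tool-calling LLM just emits a structured payload like this:
tool_call = {"name": "magic_function", "args": {"input": 3}}

# It is the surrounding application that looks the function up by name
# and actually invokes it with the returned arguments:
registry = {"magic_function": magic_function}
result = registry[tool_call["name"]](**tool_call["args"])
print(result)  # 5
```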

First, this tool decorator in LangChain allows you to take any arbitrary function and turn it into a tool. Let's kick this off: here's my magic function and here's a web search function; these are the two things I want to turn into tools, and I can do that right here, so we can run this. Now if I look at magic_function, it's a structured tool: it has a name, it has a description, and it also has the input, or arguments, captured as a Pydantic schema. All of this information can be passed directly to our LLM; that's the key point. This allows us to go from arbitrary functions to tools that can be bound to an LLM.
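What the decorator derives from a function can be approximated with the standard library alone. This is a rough stand-in for illustration only; LangChain's real `@tool` builds a Pydantic schema rather than the plain dict used here:

```python
import inspect
from dataclasses import dataclass, field

@dataclass
class ToolSpec:
    """Minimal stand-in for a structured tool: name, description, args."""
    name: str
    description: str
    args_schema: dict = field(default_factory=dict)

def as_tool(fn) -> ToolSpec:
    """Derive a tool spec from a function's name, docstring, and signature."""
    params = inspect.signature(fn).parameters.values()
    schema = {p.name: getattr(p.annotation, "__name__", "any") for p in params}
    return ToolSpec(fn.__name__, inspect.getdoc(fn) or "", schema)

def magic_function(input: int) -> int:
    """Apply a magic function to an input."""
    return input + 2

spec = as_tool(magic_function)
print(spec.name)         # magic_function
print(spec.args_schema)  # {'input': 'int'}
```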

Okay, that's step one. Now step two; this is where things get interesting. I'm going to use Groq, and I'm going to use a prompt that basically says: you're a helpful assistant with two tools, web search and a custom function; use web search for current events, use the magic function if the user directly asks for it, and otherwise just answer directly. That's my prompt, and let's test it in two cases to explain exactly how this works.

All I'm doing is using ChatGroq, setting Llama 3 70B, and creating this runnable, which is a LangChain primitive for invoking the LLM. Now here's what's interesting: this pipes the prompt to the LLM, and I've bound my tools to the LLM. This automatically takes the tools we defined and gives them to the LLM so that it's aware of them. That's what's represented in this red box here: you take external tools and bind them to the LLM, so the LLM is aware that they exist. That's step one.
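Mechanically, "binding" just means the tool specifications ride along with every subsequent model call. A library-free sketch of the idea follows; the real bind_tools serializes each tool's schema into the provider's function-calling format, and the class below is a stand-in, not LangChain code:

```python
class SketchLLM:
    """Stand-in for a chat model that supports tool binding."""

    def __init__(self, tools=None):
        self.tools = tools or []

    def bind_tools(self, tools):
        # Return a new model handle that remembers the tool specs.
        return SketchLLM(list(tools))

    def invoke(self, text):
        # A real model would decide from `text` whether to call a tool;
        # here we only show that the bound specs are visible at call time.
        return {"prompt": text, "tools_visible": [t["name"] for t in self.tools]}

llm = SketchLLM().bind_tools([{"name": "magic_function"}, {"name": "web_search"}])
print(llm.invoke("what is magic function(3)?")["tools_visible"])
# ['magic_function', 'web_search']
```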

Now here's step two. I can take a question, say "what is magic function(3)?", invoke my runnable, or my chain, with it, and see what happens. I'm going to run this. Here's what's interesting: the payload contains an object, tool_calls, which contains the name of the function and the arguments. That's it; that's the key thing. I can look at the raw payload as well: it's simply this AIMessage. It contains a bunch of information, but here's the main thing: it contains the name of the function to call and the arguments to pass to the function. Again, that's exactly what's represented here. All that's happening is: I've taken a function, turned it into a tool, and bound it to my LLM; I can ask a question in natural language, and the LLM responds directly with the function to call, or the tool to use, and the input arguments to use, based upon the user input. That's the key point, and that's really all that's happening in function calling. That's all I need you to know.

Okay, here's the other key thing: what if I just ask a question about the United States? Based on my prompt, it should not try to invoke any of these tools. Let's test that. I run this; good. This payload's tool_calls is empty, and if I look at the raw payload, now it's just a chat response: the capital of the US is Washington, D.C. Great. Okay, so that's it.
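The two outcomes can be checked mechanically. In LangChain, `tool_calls` on the returned message is a list of dicts with `name`, `args`, and `id` keys, and it is empty when the model answers directly; a small stdlib sketch of the branch an application takes on it, with hand-built stand-ins for the message objects:

```python
def route(message: dict) -> str:
    """Decide what to do with a model response: run a tool or return text."""
    if message.get("tool_calls"):
        call = message["tool_calls"][0]
        return f"invoke {call['name']} with {call['args']}"
    return f"answer: {message['content']}"

# Tool-use case: the model returned a payload, not an answer.
tool_msg = {
    "content": "",
    "tool_calls": [{"name": "magic_function", "args": {"input": 3}, "id": "call_1"}],
}
# Direct-answer case: tool_calls is empty.
chat_msg = {"content": "The capital of the US is Washington, D.C.", "tool_calls": []}

print(route(tool_msg))  # invoke magic_function with {'input': 3}
print(route(chat_msg))  # answer: The capital of the US is Washington, D.C.
```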

Hopefully now you understand how tool use works. Now remember, this requires an LLM that's actually been fine-tuned, prompted, or otherwise made compatible with tool use, and this is a very important point. We talked to the folks at Groq; they have a proprietary implementation for how they do this, which we don't know fully, but it's reported to work very well with Llama 3 70B, and in my experience I've indeed seen it work quite well. In any case, the key point is this: I can take any arbitrary functions I want, turn them into tools, and then pass those tools to an LLM by binding them. Then, as you can see right here, when I invoke my LLM with a question, the LLM makes a decision to use one of the tools, and if it does, it returns to you the name of the tool it wants to use and the input arguments. That's the key point, and that's really what you need to know about tool use.

play05:52

tool use now we get to the fun stuff

play05:55

we're going to build the agent and for

play05:57

this I'm going to use Lang graph and I'm

play05:58

going to explain kind of how this works

play06:00

over time but first the way to think

play06:01

about L graph is basically it's a way to

play06:03

lay out

play06:04

flows and flows in particular with L

play06:07

graph are often characterized by Cycles

play06:09

so the ability to kind of do feedback

play06:11

and that's really relevant for agents

play06:13

and we'll explain why here shortly so L

play06:15

graph basically takes a state which can

play06:17

live over the course of your graph or

play06:19

flow and it can be accessed by all kind

play06:21

of what we're going to call nodes in

play06:23

your graph okay so first as state I'm

play06:26

just going to find a set of messages and

play06:28

don't worry too much about this for now

play06:29

this will all make sense in about a

play06:31

minute okay now here's where things are
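The state being defined is LangGraph's usual messages pattern: a list that every node reads and appends to. A stdlib sketch of that accumulation; in LangGraph this is done with an `add_messages` reducer on the state key, and the classes below are stand-ins:

```python
from dataclasses import dataclass, field

@dataclass
class State:
    """Shared graph state: a growing conversation history."""
    messages: list = field(default_factory=list)

def add_messages(state: State, new: list) -> State:
    """Reducer: each node's output is appended, never overwritten."""
    return State(messages=state.messages + new)

state = State([("human", "what is magic function(3)?")])
state = add_messages(state, [("ai", "tool_call: magic_function(input=3)")])
state = add_messages(state, [("tool", "5")])
print(len(state.messages))  # 3
```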

Now here's where things are going to get interesting: I'm going to define an agent that contains two nodes. First, we take our input, again a human message, and pass it to our LLM, which has the bound tools; the LLM makes a decision to use a tool or not, and we just walked through that. That's step one, the thing we've already seen. Now, what we're going to do in LangGraph is add what we're going to call a conditional edge. All this edge does is ask: was there a tool call or not? If there was a tool call, I route over to a separate node that actually runs the tool. Let's walk through it with the example we just did, "what is magic function(3)?". The LLM made the decision to invoke the magic function and gave us the payload; we just saw that. The arguments are input: 3, and the name is magic_function. Those get plumbed over to what we're going to call the tool node, which actually invokes the necessary tool: it takes the name magic_function, looks up the magic function itself, runs that function with the input payload, and then returns the result as a tool message to the LLM. That's all that's going on. The LLM sees that tool message, makes a decision about what to do next, and this keeps running until there's a natural language response.
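The whole loop just described fits in a few lines without any framework. This is a minimal sketch with a scripted stand-in model; a real agent would substitute an actual tool-calling LLM for `fake_llm`:

```python
def magic_function(input: int) -> int:
    """The example tool: add two."""
    return input + 2

TOOLS = {"magic_function": magic_function}

def fake_llm(messages):
    """Scripted stand-in: ask for the tool once, then answer in plain language."""
    if not any(role == "tool" for role, _ in messages):
        return {"tool_calls": [{"name": "magic_function", "args": {"input": 3}}]}
    result = [content for role, content in messages if role == "tool"][-1]
    return {"content": f"The result of magic_function(3) is {result}."}

def run_agent(question: str) -> str:
    messages = [("human", question)]
    while True:
        response = fake_llm(messages)
        if not response.get("tool_calls"):      # conditional edge: no call -> end
            return response["content"]
        for call in response["tool_calls"]:     # tool node: execute, report back
            output = TOOLS[call["name"]](**call["args"])
            messages.append(("tool", str(output)))

print(run_agent("what is magic function(3)?"))
# The result of magic_function(3) is 5.
```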

In this toy example, the tool message would come back with the result of five; that would be returned to the LLM, the LLM would see it and say okay, the result is five, and then you would exit. That's the toy example we want to see. Now, we can implement this all in LangGraph really easily, so let's talk through that quickly; I've copied the code over here.

All we've defined here is this assistant, which is basically just wrapping the chain we defined up above, the assistant runnable; we wrap it, and all we're doing is adding a retry. Basically, if a tool is called, we're good, that's valid; if the response has meaningful text, we're good; but otherwise we re-prompt. That's all that's happening here: we're just making sure that the LLM actually returned a valid response. We're also creating this tool node, which will try to invoke the tool, and we add a little wrapper to handle errors in the feedback. These are just utility functions, so don't worry too much about them.
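The validity check behind that retry is simple. Here is a sketch of the idea; the actual helper in the video isn't shown in full, so the names below are illustrative:

```python
def is_valid(response: dict) -> bool:
    """A response is usable if it calls a tool or contains real text."""
    return bool(response.get("tool_calls")) or bool(response.get("content", "").strip())

def invoke_with_retry(model, messages, max_attempts=3):
    """Re-prompt the model until it produces a valid response."""
    for _ in range(max_attempts):
        response = model(messages)
        if is_valid(response):
            return response
        # Feed the failure back as a re-prompt, as the assistant wrapper does.
        messages = messages + [("human", "Respond with a real output or a tool call.")]
    raise RuntimeError("model never returned a valid response")

# Stand-in model that returns an empty message once, then answers.
attempts = []
def flaky_model(messages):
    attempts.append(1)
    return {"content": ""} if len(attempts) == 1 else {"content": "done"}

resp = invoke_with_retry(flaky_model, [("human", "hi")])
print(resp["content"])  # done
```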

Now here's the interesting bit: we're going to build the graph, and it's going to look exactly like what we show here. We're going to add a node for our assistant, and we're going to add a node for our tools; that's this piece. Then we're going to add the conditional edge, the tools condition, which is this piece: it takes the result from the LLM and asks, is a tool called? If yes, go to the tool node; if no, end. We can implement that right here; this tools_condition will return either the tool node or end. And then we go from the tools node back to the assistant.
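The wiring just described can be expressed as plain data. This is a library-free sketch; in LangGraph the same shape is built with `StateGraph.add_node`, `add_edge`, and `add_conditional_edges`:

```python
def tools_condition(response: dict) -> str:
    """Conditional edge: route on whether the model asked for a tool."""
    return "tools" if response.get("tool_calls") else "END"

# The graph as adjacency: the assistant branches, tools always loops back.
graph = {
    "START": lambda _resp: "assistant",
    "assistant": tools_condition,
    "tools": lambda _resp: "assistant",
}

def trace(responses):
    """Walk the graph for a scripted sequence of model responses."""
    node, path, i = "START", [], 0
    while node != "END":
        path.append(node)
        resp = responses[min(i, len(responses) - 1)]
        if node == "assistant":
            i += 1  # each assistant visit consumes one model response
        node = graph[node](resp)
    return path + ["END"]

# One tool call, then a natural-language answer:
print(trace([{"tool_calls": [{"name": "magic_function"}]}, {"content": "5"}]))
# ['START', 'assistant', 'tools', 'assistant', 'END']
```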

Now let's run all this. What's nice about LangGraph is that it will automatically lay this out as a graph for us, and we can visualize it here. What's going to happen is: we start, we invoke our assistant, and in some cases our assistant will ask to use a tool; it then goes to the tool node, the tool is invoked, that returns to the assistant, and that continues until there's a natural language response, and then we end. That's it, nice and easy. So let's actually test this out.

I'm going to go ahead and ask a super simple question. I have two questions: what is magic function(3), and what's the weather in SF? Let's ask the first question, "what's magic function(3)?". Boom, we run this. Now I'd like to go over to LangSmith and look at the result, so let's walk through it.

The trace basically lets us see that we started, we went to our assistant, and these are the functions available to our assistant; we gave it magic_function and we gave it web_search. Here's the prompt, "what's magic function(3)?", and what we get as output is, again, the function to use and the payload to pass to the function. Remember, this is always a slightly confusing point: an LLM can't magically call functions. An LLM is typed string to string; it ingests strings and it can return strings, and that's fine. All it returns in this particular case is the payload to run the function, as well as the function name, but that's it; that's all the LLM is responsible for. Then we have this tools node, which you can see here, and it invokes our function: the input is just the argument, and the output is 3 + 2 = 5. Great. Now this goes back to our LLM, and the LLM simply sees the tool message saying the function was called, with its output of five, and it returns natural language: the result of magic function is five. And then we end. Nice and simple.

We can also see this laid out here. Here's our human message; this is the AI message, where the AI makes a decision to invoke the tool and gives you the input payload. Then here's the output tool message, saying I ran the tool and here's the output; the LLM gets that back and gives you natural language. Then, based upon our condition here, the tools condition: if it's natural language, we end; if it's a tool invocation, it goes back to the tool node. In this particular case it went back to the assistant, and now it's a natural language response, which means we just end. That's it.

So that's a nice and simple example. Now let's try something slightly more complicated with our other tool: what's the weather in SF right now? We run that, and cool, we can see that it's going to call our web search endpoint. It gets this raw tool message back from the endpoint, and then the AI synthesizes that into: the weather is 60° right now with mist.

play12:50

you can lay out arbitrary agents with

play12:53

llama 3 open source llm uh we use chat

play12:57

grock to do that grock has been uh

play13:00

adapted for Tool use and that's the kind

play13:01

of main important thing you need to

play13:03

recognize that you need an LM that

play13:05

actually has tool use enabled via

play13:06

prompting or fine tuning or

play13:08

otherwise um and what you can see is if

play13:12

we kind of go back to the

play13:14

diagram what we've done here is we're

play13:17

using linecraft to kind of orchestrate

play13:18

this process and what's going to happen

play13:20

is you take a question in our L makes

play13:23

the decision based on the question to

play13:25

invoke a tool and then this conditional

play13:28

Ed Ed will determine hey if a tool is is

play13:31

kind of invoked then go to the tool node

play13:33

and actually execute the tool the tool

play13:36

is executed you get a tool message back

play13:38

with the tool output send that back to

play13:40

the LM LM reasons again and it could

play13:43

make a decision to call another tool but

play13:45

in our particular case in both cases the

play13:47

tool message output was returned to the

play13:50

LM the LM then responds in natural

play13:53

language here is the solution and

play13:55

because of that we end and that's it

play13:57

that's kind of how to build an agent

play13:58

from scratch using an open source llm

play14:01

llama 3 with Lang Lang graph to

play14:03

orchestrate it hopefully um from kind of

play14:06

kind of very simple components and first

play14:07

principles and again the key thing here

play14:10

really is the ability or the ability for

play14:13

an LM to reliably invoke tool so we

We talked through the case of adding two tools, magic function and web search, to our agent. Now let's say we wanted to make this a little bit more complicated and try some additional tools. Replicate is a service that allows you to access many different models, which is really convenient, and I'm going to use it to augment Llama 3 with a few multimodal capabilities. All I've done is set my Replicate API key, which I'd already done, and imported replicate, and I'm going to use a few different things here. I'm going to make a text-to-image tool, which calls a particular model, an open DALL·E-style model, that goes from text to image. I'm going to create another tool, image-to-text, which in this case takes an image in and uses a version of LLaVA to produce text from the image. And text-to-speech is another option. Really, all you need to do here is, very simply, use the tool decorator with a function definition that invokes the model of your choice.
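The shape of such a wrapper can be sketched as follows. This is illustrative, not the video's code: the model identifier is an assumption, the real functions are wrapped with LangChain's tool decorator, and a live replicate.run call needs a Replicate API key, so an offline stub stands in for the network call here:

```python
def text2image(prompt: str, run=None) -> str:
    """Tool body: generate an image from a text prompt and return its URL.

    `run` defaults to an offline stub; in the notebook it would be
    replicate.run with a real text-to-image model identifier.
    """
    if run is None:
        run = lambda model, input: [
            "https://example.invalid/" + input["prompt"][:10].replace(" ", "-") + ".png"
        ]
    # Image models on Replicate return a list of output URLs.
    output = run("stability-ai/sdxl", input={"prompt": prompt})
    return output[0]

url = text2image("a yellow puppy running free with wildflowers")
print(url.endswith(".png"))  # True
```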

So now the question is: how do we add these as tools to our agent? Again, it's just like before; all we need to do is update our tools list to include some of our new functions. That's it, pretty simple. That tools list is already bound to our agent here, so let's go ahead and rerun everything just to make sure it all works. All I'm going to do here is update my question list to include a few new questions related to my new tools, and let's go ahead and try one.

Let's say I want to try my index-2 question, questions[2]; this is going to be my question related to text-to-image. I'll kick this off and show you: this should invoke the text-to-image tool based on this prompt, "a yellow puppy running free with wildflowers in the mountains behind". That's our prompt; we pass it to our text-to-image tool, and it looks like it has been called correctly, which is great. Now we can also go over to LangSmith and check my projects. Cool, here's my agent; here it is running. We can also look at the trace to confirm that everything's working, and it looks like it is calling the text-to-image tool, so that's fantastic; it's running right now.

Great, our tool ran, and now we can check our image here; look at that, very nice. Again, this is just showing you the ability to create agents that have many different types of tools. Previously we had only covered two very simple tools, a magic function and web search, but we can actually do pretty interesting things. This shows how you can take Replicate, for example, and invoke many different models it hosts, not just LLMs but different types of models: a text-to-image model, image-to-text, text-to-speech, and so forth, basically to augment Llama 3 and give it multimodal capabilities. In any case, it's a really nice illustration of the fact that agents are very general, and tools can be composed of many different kinds of things; in this particular case, different models through Replicate, which we can attach to Llama 3 to augment its capabilities. Thanks.
