What's next for AI agentic workflows ft. Andrew Ng of AI Fund

Sequoia Capital
26 Mar 2024 · 13:40

Summary

TL;DR: The talk discusses the evolution of AI agents and their impact on how applications are built with large language models, highlighting how iterative, agentic workflows improve AI performance. It emphasizes the potential of design patterns such as reflection, tool use, planning, and multi-agent collaboration to boost productivity and deliver remarkable results, and underscores the importance of adapting to these agentic workflows and the role of fast token generation in making rapid iteration practical.

Takeaways

  • The importance of neural networks and GPUs in AI development, with Andrew Ng's significant contributions through Coursera, deeplearning.ai, and Google Brain.
  • The contrast between non-agentic and agentic workflows in AI, where the latter involves iterative processes similar to human drafting and revision.
  • The concept of AI agents and their potential to transform AI applications, shifting from one-shot prompting to more interactive and iterative usage.
  • The iterative nature of agentic workflows — planning, execution, revision, and testing — leading to better outcomes than non-agentic approaches.
  • The case study showing that GPT-3.5 wrapped in an agentic workflow outperformed zero-shot GPT-4 on the HumanEval coding benchmark.
  • The four broad design patterns observed in AI agents — reflection, tool use, planning, and multi-agent collaboration — each with varying degrees of maturity and reliability.
  • The use of self-reflection in AI coding agents to identify and correct errors in their own generated code.
  • The potential of multi-agent systems, where different AI agents take on various roles and collaborate effectively on complex problem-solving and decision-making.
  • The anticipation of rapid expansion in AI capabilities due to agentic workflows, suggesting a significant shift in how AI applications are designed and used.
  • The need for patience and delegation when working with AI agents, as the iterative process may take minutes or hours to deliver optimized results.
  • Closing thoughts on the journey toward AGI (Artificial General Intelligence) and how agentic reasoning design patterns could contribute to this long-term goal.

Q & A

  • What is the main focus of the discussion in the transcript?

    -The main focus of the discussion is the development and application of AI agents using various design patterns, particularly in the context of large language models (LMs).

  • Who is Andrew and what is his notable contribution to the field of AI?

    -Andrew Ng is a renowned computer science professor at Stanford, known for his early work on training neural networks with GPUs. He is also a co-founder of Coursera, the creator of popular courses such as those from deeplearning.ai, and the founder and early lead of Google Brain.

  • What are the two different workflows for using LMs as mentioned in the transcript?

    -The two workflows mentioned are non-agentic and agentic. The non-agentic workflow involves typing a prompt and generating an answer, similar to asking a person to write an essay without using backspace. The agentic workflow is more iterative, involving multiple interactions with the LM, such as writing an outline, conducting web research, drafting, revising, and iterating until the desired outcome is achieved.

  • How does the agentic workflow improve results compared to the non-agentic workflow?

    -The agentic workflow delivers remarkably better results as it allows for a more iterative and reflective process. This includes the ability for the LM to self-evaluate its own code, revise it based on feedback, and continue to improve through several rounds of thinking and revising.

  • What is the significance of the study using the HumanEval benchmark?

    -The study using the HumanEval coding benchmark demonstrated that GPT-3.5 wrapped in an agentic workflow outperformed even zero-shot GPT-4 on certain coding tasks. This highlights the effectiveness of the agentic approach and its potential to enhance the performance of AI systems.

  • What are the four broad design patterns mentioned in the transcript?

    -The four broad design patterns mentioned are reflection, tool use (having the LM call external tools for analysis, information gathering, and taking action), planning, and multi-agent collaboration. A brief tool-use sketch follows this Q&A section.

  • How does self-reflection work in the context of an LM coding agent?

    -Self-reflection involves prompting the LM to review and evaluate the code it generated, identify any issues, and suggest improvements. This process can lead to the LM creating a better version of the code based on its own feedback.

  • What is the concept of multi-agent collaboration?

    -Multi-agent collaboration involves using multiple LMs, each prompted to act in different roles (e.g., coder, critic, CEO, designer), to work together on a task. These agents can have extended conversations and collaborate to achieve complex outcomes, such as developing a software program.

  • Why is fast token generation important in agentic workflows?

    -Fast token generation is crucial in agentic workflows because it allows for quicker iterations. The LM can generate tokens at a pace much faster than a human can read, which facilitates the rapid exchange and refinement of ideas, leading to more efficient and potentially higher-quality outcomes.

  • What is the significance of the trend towards AGI (Artificial General Intelligence) as mentioned in the transcript?

    -The path towards AGI is viewed as a journey rather than a destination. The use of agentic workflows and design patterns is seen as a way to make progress towards achieving AGI, with the potential to improve AI systems' ability to perform a wide range of tasks autonomously and effectively.

  • What is the advice given for effectively using AI agents in one's workflow?

    -The advice given is to be patient and allow AI agents time to process and respond, even if it takes minutes or hours. Just like delegating tasks to a team member and checking in later, it's important to give AI agents the time they need to provide the best possible outcomes.
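As a rough illustration of the tool-use pattern referenced in the design-patterns answer above, the sketch below routes a question either to a web search or to code generation. The routing prompt and the `call_llm` / `web_search` helpers are illustrative placeholders, not any specific product's API.

```python
# Hedged sketch of the tool-use pattern: the model decides whether a query
# needs a web search or code execution, and the application routes it.

def call_llm(prompt: str) -> str:
    raise NotImplementedError  # wire to your chat-completion API

def web_search(query: str) -> str:
    raise NotImplementedError  # plug in your search API

def answer_with_tools(question: str) -> str:
    choice = call_llm(
        "Answer with exactly one word - SEARCH, CODE, or NONE - for the best "
        f"way to handle this question:\n{question}"
    ).strip().upper()
    if choice == "SEARCH":
        results = web_search(question)
        return call_llm(f"Answer using these search results:\n{results}\n\nQ: {question}")
    if choice == "CODE":
        # In a real system the generated snippet would be run in a sandbox.
        return call_llm(f"Write a Python snippet that computes the answer to:\n{question}")
    return call_llm(question)
```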

Outlines

00:00

Introduction to AI Agents and Their Impact

The session opens by acknowledging Andrew Ng's contributions to computer science, particularly in neural networks and AI, along with an anecdote about a problem set from a Stanford course. Ng then introduces the concept of AI agents, emphasizing their potential as a significant trend in AI development. He contrasts non-agentic workflows, where tasks are completed in one pass without revision, with agentic workflows that involve multiple iterations and refinements, leading to better outcomes. He also discusses how wrapping an agentic workflow around GPT-3.5 can outperform newer models like GPT-4 used zero-shot.

05:01

Design Patterns in AI Agents

The speaker delves into the design patterns observed in AI agents, noting the chaotic yet promising landscape of AI research and development. He outlines four key design patterns: reflection, tool use, planning, and multi-agent collaboration. Reflection involves self-assessment and improvement of generated output; tool use expands what a language model can do by calling external functions; planning lets agents adapt and recover from failures; and multi-agent collaboration — including multi-agent debate — shows how assigning different roles and perspectives to agents leads to better outcomes. The speaker emphasizes the importance of these patterns in boosting productivity and expanding AI capabilities.

10:04

Future Trends and the Path to AGI

In the final paragraph, the speaker discusses the future trends in AI, predicting a significant expansion of AI capabilities due to agentic workflows. He challenges the conventional expectation of immediate responses from AI, advocating for patience and the understanding that AI agents may require more time to deliver high-quality outcomes. The speaker also emphasizes the importance of fast token generation for iterative processes in agentic workflows. He expresses excitement for upcoming AI models and believes that these advancements could bring us closer to achieving AGI, viewing it as a journey rather than a destination. The speaker concludes with a hopeful outlook on the role of agent workflows in propelling AI forward.


Keywords

Neural Networks with GPUs

Neural Networks with GPUs refer to the use of graphical processing units to accelerate the training and deployment of artificial neural networks, which are computational models inspired by the human brain and are a key component of deep learning. In the context of the video, this technology was pivotal in the early development of AI and machine learning, as it enabled faster and more efficient processing of large datasets, leading to significant advancements in the field.

Coursera

Coursera is an online learning platform that offers a wide range of courses, including those related to computer science and AI, from top universities and institutions worldwide. It has played a significant role in democratizing education by making high-quality learning resources accessible to a global audience. In the video, Coursera is mentioned as one of the platforms where Andrew has contributed to the dissemination of knowledge in AI and deep learning.

Deeplearning.ai

Deeplearning.ai is a website founded by Andrew Ng that provides educational content and resources focused on deep learning, a subset of machine learning that involves the use of neural networks with many layers to learn representations of data. The platform is known for its high-quality courses and materials that cater to both beginners and experienced professionals in the field of AI. In the context of the video, it is one of the initiatives through which Andrew has contributed to the AI community.

Google Brain

Google Brain is a research project by Google that focuses on advancing the field of deep learning and artificial intelligence. It involves the collaboration of researchers and engineers working on various AI-related problems, from improving the performance of neural networks to developing new machine learning algorithms. In the video, Andrew's role as a founder and early lead of Google Brain is highlighted, emphasizing his significant contributions to the company's AI initiatives.

AI Agents

AI Agents refer to autonomous systems or software entities that can perceive their environment, make decisions, and take actions to achieve specific goals. In the context of the video, AI agents are discussed as a transformative trend in AI development, where the focus shifts from non-agentic workflows to more interactive and iterative processes that mimic human problem-solving strategies.

Iterative Workflow

An iterative workflow is a process of repeated cycles of development and refinement, where the output of each cycle is reviewed, revised, and improved upon. This approach is contrasted with a non-iterative, one-shot process in the video, where tasks like writing an essay or coding are completed in a single attempt without revision. Iterative workflows are emphasized as more effective in achieving higher quality results, as they allow for continuous improvement and refinement.
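A minimal sketch of such an iterative workflow, assuming a generic `call_llm` helper to be wired to whatever chat-completion API you use; the prompt wording and number of rounds are illustrative only, not taken from the talk.

```python
# Outline -> draft -> critique -> revise loop, as described above.

def call_llm(prompt: str) -> str:
    """Stub: replace with a real chat-completion call."""
    raise NotImplementedError

def write_essay_iteratively(topic: str, rounds: int = 3) -> str:
    outline = call_llm(f"Write a short outline for an essay on: {topic}")
    draft = call_llm(f"Using this outline, write a first draft:\n{outline}")
    for _ in range(rounds):
        critique = call_llm(
            "Read the draft below and list the parts that need revision:\n" + draft
        )
        draft = call_llm(
            f"Revise the draft to address this feedback.\nFeedback:\n{critique}\n\nDraft:\n{draft}"
        )
    return draft
```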

HumanEval Benchmark

HumanEval is a benchmark of hand-written Python programming problems released by OpenAI, used to measure how well a model can generate correct code; a completion counts as correct only if it passes the problem's unit tests. In the video, the speaker uses HumanEval results to show that GPT-3.5 in an iterative, agentic workflow can outperform zero-shot GPT-4.
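A hedged sketch of how a HumanEval-style benchmark scores a completion: the candidate code counts as correct only if the problem's unit tests pass. The problem and tests below are made-up stand-ins, not actual benchmark items.

```python
# Score a candidate solution by executing it and then running assertions.

def passes_tests(candidate_code: str, test_code: str) -> bool:
    namespace: dict = {}
    try:
        exec(candidate_code, namespace)   # define the candidate function
        exec(test_code, namespace)        # run assertions against it
        return True
    except Exception:
        return False

tests = (
    "assert sum_even_positions([1, 2, 3, 4]) == 4\n"
    "assert sum_even_positions([5]) == 5\n"
)
candidate = (
    "def sum_even_positions(xs):\n"
    "    return sum(x for i, x in enumerate(xs) if i % 2 == 0)\n"
)
print(passes_tests(candidate, tests))  # True for this hand-written example
```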

Planning Algorithms

Planning algorithms are computational methods used in AI to determine a sequence of actions that a system should take to achieve a specific goal or set of goals. These algorithms are crucial in creating AI agents capable of autonomous decision-making and problem-solving, as they simulate and evaluate different possible courses of action to select the most effective one. In the video, planning algorithms are discussed as a key component in enabling AI agents to handle complex tasks and recover from failures.
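A simplified sketch of the planning pattern, loosely in the spirit of the HuggingGPT example mentioned in the talk: the model decomposes a request into steps and each step is dispatched to a tool. The tool registry, the JSON step format, and the `call_llm` helper are assumptions for illustration.

```python
import json

def call_llm(prompt: str) -> str:
    raise NotImplementedError  # wire to your model provider

# Toy tool registry; real tools would call vision/speech models.
TOOLS = {
    "pose_detection": lambda args: f"pose extracted from {args['image']}",
    "pose_to_image": lambda args: f"new image drawn matching {args['pose']}",
    "image_to_text": lambda args: f"caption for {args['image']}",
    "text_to_speech": lambda args: f"audio for: {args['text']}",
}

def run_with_planning(request: str):
    plan = call_llm(
        "Break this request into JSON steps, each with a 'tool' from "
        f"{list(TOOLS)} and an 'args' object:\n{request}"
    )
    results = []
    for step in json.loads(plan):   # expect a JSON list of steps
        tool = TOOLS[step["tool"]]
        results.append(tool(step["args"]))
    return results
```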

Multi-Agent Collaboration

Multi-Agent Collaboration refers to the interaction and cooperation of multiple AI agents to achieve a common goal. This approach leverages the strengths of different AI systems or models to solve complex problems that may be beyond the capabilities of a single agent. In the context of the video, multi-agent collaboration is presented as an emerging and powerful design pattern that can lead to surprising and effective outcomes, such as collaboratively developing a complex program or game.
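A minimal two-agent sketch of this pattern: the same base model prompted as a coder and as a critic, exchanging messages for a few rounds. Role prompts, round count, and `call_llm` are illustrative assumptions, not the ChatDev implementation.

```python
def call_llm(system: str, user: str) -> str:
    raise NotImplementedError  # wire to your chat API (system + user message)

def coder_critic_loop(task: str, rounds: int = 2) -> str:
    # Coder agent writes an initial attempt.
    code = call_llm("You are an expert Python coder.", f"Write code for: {task}")
    for _ in range(rounds):
        # Critic agent reviews; coder agent revises based on the review.
        review = call_llm(
            "You are an expert code reviewer. Point out bugs, style and efficiency issues.",
            code,
        )
        code = call_llm(
            "You are an expert Python coder.",
            f"Revise your code to address this review.\nReview:\n{review}\n\nCode:\n{code}",
        )
    return code
```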

Agentic Reasoning

Agentic Reasoning refers to the ability of AI systems to simulate human-like thought processes and decision-making, including planning, problem-solving, and learning from experience. It involves creating AI agents that can interact with their environment in a purposeful and goal-oriented manner. In the video, the speaker emphasizes the importance of agentic reasoning as a design pattern that can significantly boost productivity and lead to more advanced AI capabilities.
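One concrete form of agentic reasoning described in the talk is reflection combined with running unit tests: generate code, run the tests, and feed any failure back for a revision. The sketch below assumes a placeholder `call_llm` and made-up prompts.

```python
import traceback

def call_llm(prompt: str) -> str:
    raise NotImplementedError  # wire to your model provider

def generate_with_reflection(task: str, test_code: str, max_attempts: int = 3) -> str:
    code = call_llm(f"Write Python code for this task:\n{task}")
    for _ in range(max_attempts):
        namespace: dict = {}
        try:
            exec(code, namespace)        # define the generated function(s)
            exec(test_code, namespace)   # run the unit tests
            return code                  # tests passed
        except Exception:
            error = traceback.format_exc()
            code = call_llm(
                "Your code failed its unit tests. Check it carefully for "
                f"correctness and fix it.\nError:\n{error}\n\nCode:\n{code}"
            )
    return code                          # best effort after max_attempts
```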

Fast Token Generation

Fast Token Generation is the ability of AI models, particularly language models, to quickly produce a large number of tokens or text elements during the iterative process of agentic workflows. This speed is crucial for maintaining the efficiency of the workflow, as it allows for more iterations and refinements within a shorter period. In the context of the video, fast token generation is highlighted as an important trend that can enhance the performance of AI agents, even when using slightly lower quality models.
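A back-of-envelope illustration of the point: within a fixed latency budget, a faster (possibly weaker) model can afford more draft-critique-revise rounds than a slower one. All numbers below are made up.

```python
# How many full iteration rounds fit in a latency budget?

def rounds_in_budget(tokens_per_round: int, tokens_per_second: float,
                     budget_seconds: float) -> int:
    return int(budget_seconds // (tokens_per_round / tokens_per_second))

budget = 60.0                                  # one-minute budget
print(rounds_in_budget(1500, 100.0, budget))   # fast model: 4 rounds
print(rounds_in_budget(1500, 30.0, budget))    # slower model: 1 round
```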

Highlights

Andrew Ng's early contributions to neural networks with GPUs and his role in creating Coursera and deeplearning.ai.

The importance of iterative processes in AI workflows, compared to non-agentic, one-shot methods.

The surprising effectiveness of agentic workflows in improving AI performance, with GPT-3.5 in an agentic workflow even surpassing zero-shot GPT-4.

The concept of reflection as a powerful tool in AI design patterns, allowing systems to self-evaluate and improve their output.

The potential of multi-agent collaboration, where different AI agents can work together, each playing different roles.

The use of tool use in expanding the capabilities of language models, with much of the early work originating in the computer vision community.

The impact of planning algorithms on AI's ability to autonomously handle failures and reroute processes.

The increasing role of AI agents in personal workflows, such as research, where they can assist in gathering and analyzing information.

The significance of fast token generation in agentic workflows, allowing for quicker iterations and potentially better results.

The anticipation of future AI advancements due to agentic reasoning design patterns.

The need for patience and dedication when working with AI agents, as they may require more time to process and respond effectively.

The potential of lower quality models with faster token generation to outperform slightly higher quality models with slower token generation.

The ongoing journey towards AGI and how agent workflows might contribute to this progression.

The excitement around upcoming AI models such as the next Claude releases, GPT-5, and Gemini 2.0.

Transcript

00:03

Host: All of you know Andrew Ng as a famous computer science professor at Stanford who was really early on in the development of neural networks with GPUs, of course a creator of Coursera and popular courses like deeplearning.ai, and also the founder and early lead of Google Brain. But one thing I've always wanted to ask you before I hand it over, Andrew, while you're on stage — a question I think would be relevant to the whole audience — ten years ago, on problem set number two of CS229, you gave me a B, and I was wondering, I looked it over, what you saw that I did incorrectly. So anyway, Andrew, thank you.

00:47

Andrew Ng: Thank you, Hansen. I'm looking forward to sharing with all of you what I'm seeing with AI agents, which I think is the exciting trend that everyone building in AI should pay attention to — and I'm also excited about all the other presentations. So, AI agents. Today, the way most of us use large language models is like this, with a non-agentic workflow: you type a prompt and it generates an answer. That's a bit like asking a person to write an essay on a topic, and I say, please sit down at the keyboard and just type the essay from start to finish without ever using backspace. And despite how hard this is, LLMs do it remarkably well.

01:27

In contrast, with an agentic workflow, this is what it may look like: have an LLM write an essay outline; ask, do you need to do any web research? If so, let's do that. Then write the first draft, then read your own first draft and think about what parts need revision, then revise your draft, and you go on and on. This workflow is much more iterative, where you may have the LLM do some thinking, then revise the article, then do some more thinking, and iterate through this a number of times. What not many people appreciate is that this delivers remarkably better results. I've actually been really surprised myself, working with these agent workflows, at how well they work.

02:14

Let me do one case study. My team analyzed some data using a coding benchmark called HumanEval, released by OpenAI a few years ago. It has coding problems like: given a non-empty list of integers, return the sum of all the elements at even positions — and the answer is a code snippet like that. Today, a lot of us use zero-shot prompting, meaning we tell the AI to write the code and run it on the first pass. Who codes like that? No human codes like that — just type out the code and run it. Maybe you do; I can't do that. It turns out that if you use GPT-3.5 with zero-shot prompting, it gets 48% right. GPT-4 is way better: 67% right. But if you take an agentic workflow and wrap it around GPT-3.5, it actually does better than even GPT-4. And if you were to wrap this type of workflow around GPT-4, it also does very well. Notice that GPT-3.5 with an agentic workflow actually outperforms GPT-4. I think this has significant consequences for how we all approach building applications.

03:26

So "agents" is a term that gets tossed around a lot — there are a lot of consultant reports talking about agents, the future of AI, blah blah blah. I want to be a bit concrete and share with you the broad design patterns I'm seeing in agents. It's a very messy, chaotic space, tons of research, tons of open source — there's a lot going on — but I'll try to categorize a bit more concretely what's going on with agents. Reflection is a tool that I think many of us should just use; it just works. Tool use, I think, is more widely appreciated, and it actually works pretty well. I think of these as pretty robust technologies: when I use them, I can almost always get them to work well. Planning and multi-agent collaboration, I think, are more emerging: when I use them, sometimes my mind is blown by how well they work, but at least at this moment in time I don't feel like I can always get them to work reliably. So let me walk through these four design patterns in the next few slides, and if some of you go back and ask your engineers to use these, I think you'll get a productivity boost quite quickly.

04:26

So, reflection — here's an example. Let's say I ask a system, please write code for me for a given task. Then we have a coder agent, just an LLM that you prompt to write code — "def do_task", write a function like that. An example of self-reflection would be if you then prompt the LLM with something like: here's code intended for a task — and just give it back the exact same code that it just generated — and then say, check the code carefully for correctness, style, and efficiency, and give constructive criticism. Just write a prompt like that. It turns out the same LLM that you prompted to write the code may be able to spot problems, like "there's a bug on line 5, you can fix it by..." and so on. And if you now take its own feedback, give it back, and reprompt it, it may come up with a version two of the code that could well work better than the first version. Not guaranteed, but it works often enough to be worth trying for a lot of applications. To foreshadow tool use: if you let it run unit tests and it fails a unit test, you can ask why it failed the unit test, have that conversation, and it may figure out it should try changing something and come up with a v3.

05:35

By the way, for those of you that want to learn more about these technologies — I'm very excited about them — for each of the four sections I have a little recommended-reading section at the bottom that hopefully gives more references. And again, just to foreshadow multi-agent systems: I've described a single coder agent that you prompt to have this conversation with itself. One natural evolution of this idea is, instead of a single coder agent, you can have two agents, where one is a coder agent and the second is a critic agent. These could be the same base LLM, but you prompt them in different ways: you tell one, "you're an expert coder, write code," and the other, "you're an expert code reviewer, review this code." This type of workflow is actually pretty easy to implement, and I think it's such a general-purpose technology for a lot of workflows. This would give you a significant boost in the performance of LLMs.

06:25

The second design pattern is tool use. Many of you will already have seen LLM-based systems using tools. On the left is a screenshot from Copilot; on the right is something I extracted from GPT-4. An LLM today, if you ask it "what's the best coffee maker," will do a web search for some problems; for others it will generate code and run the code. It turns out there are a lot of different tools that many different people are using — for analysis, for gathering information, for taking action, for personal productivity. A lot of the early work in tool use turned out to be in the computer vision community, because before large multimodal models, LLMs couldn't do anything with images, so the only option was for the LLM to generate a function call that could manipulate an image — generate an image, or do object detection, or whatever. If you look at the literature, it's been interesting how much of the work in tool use seems to have originated from vision, because LLMs were blind to images before GPT-4 and LLaVA and so on. So that's tool use, and it expands what an LLM can do.

07:38

And then planning. For those of you that have not yet played a lot with planning algorithms — a lot of people talk about the ChatGPT moment, where you go "wow, I've never seen anything like this." If you've not used planning algorithms, many people will have a kind of AI-agent "wow" moment, where you couldn't imagine an AI agent doing this. I've run live demos where something failed and the AI agent rerouted around the failures. I've actually had quite a few of those moments: wow, I can't believe my AI system just did that autonomously. One example that I adapted from the HuggingGPT paper: you say, please generate an image where a girl is reading a book, and her pose is the same as the boy in the image example.jpg, then please describe the new image with your voice. Given an example like this, today we have AI agents that can kind of decide: the first thing I need to do is determine the pose of the boy; then find the right model, maybe on Hugging Face, to extract the pose; then find a pose-to-image model to synthesize a picture of a girl following the instructions; then use image-to-text; and finally use text-to-speech. Today we actually have agents that — I don't want to say they work reliably, they're kind of finicky, they don't always work — but when they work, it's actually pretty amazing. And with agentic loops, sometimes you can recover from earlier failures as well. So I find myself already using research agents for some of my work: when I want a piece of research but don't feel like Googling it myself and spending a long time, I send it to the research agent, come back in a few minutes, and see what it's come up with. It sometimes works, sometimes it doesn't, but it's already part of my personal workflow.

09:20

The final design pattern: multi-agent collaboration. This is one of those funny things, but it works much better than you might think. On the left is a screenshot from a paper called ChatDev, which is completely open — actually open source. Many of you saw the flashy social media announcements of the Devin demo; ChatDev is open source and runs on my laptop. What ChatDev does is an example of a multi-agent system, where you prompt one LLM to sometimes act like the CEO of a software engineering company, sometimes act like a designer, sometimes a product manager, sometimes a tester. And this flock of agents that you build by prompting an LLM — telling it "you're now the CEO," "you're now a software engineer" — they collaborate and have an extended conversation, so that if you tell it, please develop a game, develop a Gomoku game, they'll actually spend a few minutes writing code, testing it, iterating, and then generate surprisingly complex programs. It doesn't always work — I've used it, sometimes it doesn't work, sometimes it's amazing — but this technology is really getting better. And just one more design pattern: it turns out that multi-agent debate, where you have different agents — for example, you could have ChatGPT and Gemini debate each other — actually results in better performance as well. So having multiple simulated AI agents work together has been a powerful design pattern as well.

10:53

So just to summarize: I think these are the patterns I've seen, and if we were to use these patterns in our work, a lot of us could get a productivity boost quite quickly. I think agentic reasoning design patterns are going to be important. This is my small slide: I expect the set of tasks AI can do will expand dramatically this year because of agentic workflows. And one thing that's actually difficult for people to get used to: when we prompt an LLM, we want a response right away. In fact, a decade ago, when I was having discussions at Google on what we called "big-box search" — where you type a long prompt — one of the reasons I failed to push successfully for that was that when you do a web search, you want a response back in half a second. That's just human nature; we like that instant feedback. But for a lot of the agent workflows, I think we'll need to learn to delegate a task to an AI agent and patiently wait minutes, maybe even hours, for a response. Just like I've seen a lot of novice managers delegate something to someone and then check in five minutes later — and that's not productive — I think it will be difficult, but we'll need to do that with some of our AI agents as well. I thought I heard some laughs.

12:14

And then one other important trend: fast token generation is important, because with these agentic workflows we're iterating over and over, so the LLM is generating tokens for the LLM to read. Being able to generate tokens way faster than any human could read them is fantastic. And I think that generating more tokens really quickly, even from a slightly lower-quality LLM, might give good results compared to slower tokens from a better LLM — maybe it's a little bit controversial — because it may let you go around this loop a lot more times, kind of like the results I showed with GPT-3.5 and an agentic architecture on the earlier slide. And candidly, I'm really looking forward to Claude 5 and Claude 4 and GPT-5 and Gemini 2.0 and all these other wonderful models that many of you are building. Part of me feels like, if you're looking forward to running your thing on GPT-5 zero-shot, you may be able to get closer to that level of performance on some applications than you might think by using agentic reasoning on an earlier model. I think this is an important trend. And honestly, the path to AGI feels like a journey rather than a destination, but I think this type of agent workflow could help us take a small step forward on this very long journey. Thank you.

13:35

[Applause]


Related Tags
AI Workflows, Neural Networks, Machine Learning, Productivity Boost, Innovation, Andrew Ng, Stanford, Deep Learning, AI Agents, Multi-Agent Systems