AI Pioneer Shows The Power of AI AGENTS - "The Future Is Agentic"
Summary
TL;DR: Dr. Andrew Ng, a prominent figure in AI and co-founder of Google Brain, delivered a talk at Sequoia Capital emphasizing the transformative potential of AI agents. Ng highlighted the importance of an 'agentic workflow' in AI, where multiple agents with distinct roles collaborate and iterate on tasks, leading to superior outcomes over traditional single-model approaches. He demonstrated the effectiveness of this method using benchmarks like the HumanEval coding test, where an agentic workflow built on GPT-3.5 outperformed GPT-4 with zero-shot prompting. Ng also covered key agentic design patterns, including reflection, tool use, planning, and multi-agent collaboration, which are set to expand the scope of AI capabilities. He concluded by advocating for patience with AI agents, as their iterative processes may require longer wait times for more refined results, and expressed optimism about the future of AI with the advent of faster inference speeds.
Takeaways
- 🧠 Dr. Andrew Ng, a leading AI expert, is very optimistic about the future of agents in AI, emphasizing their iterative and collaborative nature.
- 📈 Agents, when compared to non-agentic workflows, can produce better results through a process of planning, iteration, and collaboration.
- 🤖 The concept of 'reflection' in agents allows language models to review and improve their own outputs, leading to higher quality results.
- 🛠️ 'Tool use' is a significant feature where agents can utilize predefined tools to perform specific tasks, enhancing their capabilities.
- 📈 Sequoia, a renowned venture capital firm, has a portfolio representing over 25% of the NASDAQ's total value, highlighting their successful tech investments.
- 🔍 'Planning' and 'multi-agent collaboration' are emerging as robust technologies that can lead to surprising and effective outcomes.
- 📝 An agentic workflow built around GPT-3.5 can outperform even newer models like GPT-4 used with zero-shot prompting.
- 🚀 The potential of agentic workflows is expected to expand the scope of tasks AI can perform, possibly leading to significant productivity boosts.
- ⏱️ Fast token generation is crucial for agentic workflows, as it allows for more iterations and quicker responses in complex tasks.
- 🔗 The use of multi-agent systems, where different agents play different roles, can lead to better performance and more reliable outcomes.
- 🌟 The path to Artificial General Intelligence (AGI) is viewed as a journey, and agentic workflows are seen as a step forward in this long-term goal.
Q & A
Who is Dr. Andrew Ng and why is he considered a leading mind in artificial intelligence?
-Dr. Andrew Ng is a computer scientist known for co-founding and heading Google Brain, being the former Chief Scientist at Baidu, and his significant contributions to the field of AI. He has studied at prestigious institutions like UC Berkeley, MIT, and Carnegie Mellon, and he co-founded Coursera, an online learning platform offering a wide range of courses in computer science and other subjects.
What is the significance of Sequoia in the context of Silicon Valley venture capital firms?
-Sequoia is one of the most legendary venture capital firms in Silicon Valley, known for its ability to pick technological winners. Their portfolio of companies represents more than 25% of the total value of the NASDAQ, which is an incredible statistic considering the vast number of companies listed on the exchange.
What is the difference between a non-agentic workflow and an agentic workflow in the context of using language models?
-A non-agentic workflow involves using a language model to generate an answer to a prompt in one go, without any back-and-forth interaction. An agentic workflow, on the other hand, is iterative and involves multiple agents with different roles working together, revising, and iterating on a task to achieve the best possible outcome.
How does the agentic workflow improve the results of tasks compared to a non-agentic approach?
-The agentic workflow improves results by allowing multiple agents, each with different roles and tools, to work together and iterate on a task. This collaborative and iterative process is more aligned with how humans work, leading to higher quality outcomes.
What is the 'reflection' tool in the context of agentic workflows?
-Reflection is a tool used in agentic workflows where a large language model is prompted to review and find ways to improve its own output. This self-evaluation and iterative improvement process can significantly enhance the performance of the language model.
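The reflection loop described here can be sketched in a few lines of Python. Note that `call_llm` is a stub standing in for a real model API call, and the prompt wording is illustrative, not taken from the talk:

```python
def call_llm(prompt):
    # Stub standing in for a real model API call (e.g. an HTTP request
    # to a hosted LLM). It echoes the draft with a marker so the loop
    # below is runnable; a real model would return a revised draft.
    return prompt.split("DRAFT:")[-1].strip() + " [revised]"

def reflect(task, draft, rounds=2):
    """Feed the model's own output back to it for critique and revision."""
    for _ in range(rounds):
        critique_prompt = (
            f"Task: {task}\n"
            "Check the following draft carefully for correctness and "
            "style, then return an improved version.\n"
            f"DRAFT: {draft}"
        )
        draft = call_llm(critique_prompt)
    return draft

print(reflect("write a haiku about rain", "rain taps the window"))
```

Each pass through the loop costs one extra model call, which is why reflection trades latency for output quality.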
How does tool use enhance the capabilities of language models?
-Tool use allows language models to leverage pre-existing, hardcoded code for specific functions, such as web scraping or stock information retrieval. By providing these tools to the language model, it can use them as needed, enhancing its capabilities without the need to generate new code from scratch.
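Tool use as described here can be sketched as a small registry of named Python functions plus a dispatcher. The tool names, the canned stock data, and the `tool_name(argument)` calling convention are all illustrative assumptions, not a real function-calling API:

```python
# Hypothetical tool registry: each entry maps a name the model can ask
# for to a plain Python function plus a one-line description the model
# would be shown in its prompt.
TOOLS = {
    "stock_price": (lambda ticker: {"AAPL": 189.50}.get(ticker, 0.0),
                    "Look up the latest price for a ticker symbol."),
    "word_count": (lambda text: len(text.split()),
                   "Count the words in a piece of text."),
}

def run_tool_call(model_output):
    """Parse a 'tool_name(argument)' string the model emitted and run it."""
    name, _, rest = model_output.partition("(")
    fn, _desc = TOOLS[name.strip()]
    return fn(rest.rstrip(")").strip("'\" "))

# In a real system the model decides which tool to call based on the
# descriptions; here we just execute a canned tool-call string.
print(run_tool_call("stock_price('AAPL')"))
```

Because each tool is ordinary tested code, its output is deterministic, unlike asking the model itself to compute the answer.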
What is the concept of planning in the context of AI agents?
-Planning in AI agents involves giving the language model the ability to think through steps more slowly and methodically, often by explaining its reasoning step by step. This forced planning can lead to more thoughtful and accurate results.
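The "explain your reasoning step by step" technique can be packaged as a prompt wrapper; the exact wording below is an illustrative sketch, not a prompt from the talk:

```python
def planning_prompt(task):
    """Wrap a task so the model must lay out a plan before answering."""
    return (
        f"Task: {task}\n"
        "Before answering, list the steps you will take, numbered 1 to n.\n"
        "Then work through each step, explaining your reasoning.\n"
        "Only give the final answer once every step is complete."
    )

print(planning_prompt("sum the odd elements at even positions of [5, 8, 7, 1]"))
```

The extra instructions force the model to spend tokens on intermediate reasoning before committing to an answer, which is the "thinking more slowly" the section describes.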
What are the benefits of multi-agent collaboration in AI workflows?
-Multi-agent collaboration allows different agents, potentially powered by different models, to work together, each contributing their specialized skills or perspectives. This collaboration can lead to more robust and higher quality outcomes compared to a single-agent approach.
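A minimal coder/critic pair illustrates the pattern. Both agents below are stubs; in a real system each would be a separate (possibly different) model given its own persona prompt:

```python
# Two agents sharing a conversation log, passing work back and forth
# until the critic approves or a round limit is hit.
def coder(history):
    # Stub: a real coder agent would generate code from the history.
    return "def add(a, b):\n    return a + b"

def critic(history, code):
    # Stub: a real critic agent would review the code and explain flaws.
    return "APPROVE" if code.startswith("def ") else "REVISE"

def collaborate(task, max_rounds=3):
    """Iterate between the two agents until the critic approves."""
    history = [("user", task)]
    code = ""
    for _ in range(max_rounds):
        code = coder(history)
        history.append(("coder", code))
        verdict = critic(history, code)
        history.append(("critic", verdict))
        if verdict == "APPROVE":
            break
    return code, history

code, history = collaborate("write an add function")
print(code)
```

The round limit matters in practice: agents can loop indefinitely criticizing each other, so real frameworks cap the number of exchanges.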
How does the concept of 'fast token generation' relate to agentic workflows?
-Fast token generation is important for agentic workflows because these workflows often involve multiple iterations. The ability to generate more tokens quickly, even from a slightly lower quality language model, can lead to better results due to the increased number of iterations possible.
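Back-of-envelope arithmetic shows why generation speed matters. The ~20 and ~800 tokens/second figures echo the video's hosted-API vs. Groq comparison; the 1,000-token cost per iteration is an assumed number for illustration:

```python
def steps_per_minute(tokens_per_second, tokens_per_step=1_000):
    """How many full agent iterations fit in a minute of pure generation.

    tokens_per_step is an assumed cost for one draft-plus-critique pass.
    """
    return int(60 * tokens_per_second / tokens_per_step)

# ~20 tok/s is typical of a hosted API; ~800 tok/s is the Groq-class
# inference speed mentioned later in the video.
print(steps_per_minute(20), steps_per_minute(800))  # → 1 48
```

At 40x the generation speed, a workflow that could afford one reflection pass per minute can afford dozens, which is where the quality gains of iteration come from.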
What are some of the challenges in implementing agentic workflows?
-One of the challenges is that agents can be finicky and may not always work as expected. However, the iterative nature of agentic workflows allows for recovery from earlier failures. Another challenge is adjusting to the slower pace of response, as agentic workflows may require waiting for minutes or even hours for the best results.
How does Dr. Andrew Ng's talk relate to the future of AI and the concept of AGI (Artificial General Intelligence)?
-Dr. Ng's talk emphasizes the potential of agentic workflows in pushing the boundaries of what AI can do, which aligns with the pursuit of AGI. While AGI is a long-term goal, the advancements in agentic workflows, tool use, and multi-agent collaboration could contribute to incremental progress towards achieving more general intelligence in AI systems.
What are some of the key takeaways from Dr. Ng's talk regarding the future applications of AI?
-Dr. Ng suggests that the set of tasks AI can perform will expand dramatically due to agentic workflows. He also highlights the importance of fast token generation and the potential for productivity boosts in various workflows. Additionally, he emphasizes the need for patience and a shift in expectations regarding the speed of AI responses, especially when leveraging agentic reasoning.
Outlines
🚀 Dr. Andrew Ng's Optimism on AI Agents
Dr. Andrew Ng, a prominent computer scientist and co-founder of Google Brain, shares his enthusiasm for AI agents at Sequoia, a prestigious Silicon Valley venture capital firm. Ng discusses the potential of models like GPT-3.5 to reason at the level of GPT-4 when wrapped in an agentic workflow. He emphasizes the iterative and collaborative nature of agentic workflows, which allow multiple agents to work together, improving tasks through continuous iteration. Ng's talk is significant as it outlines the future of AI, which he believes lies in agents, and provides insights into the powerful combination of different agents with specialized roles.
📈 Agentic Workflows Surpass Zero-Shot Performance
This paragraph delves into the effectiveness of agentic workflows, comparing them to the zero-shot approach where AI is given a task without any examples or opportunities for reflection. It highlights a case study in which an agentic workflow with GPT-3.5 outperformed GPT-4's zero-shot prompting. The summary outlines the broad design patterns seen in agents, such as reflection, tool use, planning, and multi-agent collaboration. These patterns are seen as robust technologies that can significantly enhance the productivity and performance of AI applications.
🔍 Reflection and Tool Use in Agentic Design
This section focuses on the concept of reflection, where a large language model (LLM) is asked to review and improve its own output. It also discusses tool use, which allows LLMs to utilize custom-coded tools or existing libraries to enhance their capabilities. The paragraph explains how these tools can be hardcoded for predictable outcomes and how they can be integrated into LLMs to improve their functionality and efficiency.
🤖 Multi-Agent Collaboration and Planning
The paragraph explores the idea of multi-agent collaboration, where different agents with distinct roles work together to achieve a task. It also touches on planning, which enables LLMs to think through steps more deliberately. The speaker shares his experience with research agents and how they can be integrated into personal workflows. The potential of multi-agent systems like crew AI and autogen is highlighted, emphasizing their ability to produce complex and high-quality outcomes when agents collaborate effectively.
⚡ Fast Token Generation and the Future of Agents
The final paragraph discusses the importance of fast token generation for agentic workflows, which rely on rapid iteration. It suggests that even a slightly lower quality LLM can produce good results if it generates tokens quickly, allowing for more iterations in the workflow. The speaker expresses excitement about upcoming models like GPT-5 and the potential for agents to take a step forward on the journey towards artificial general intelligence (AGI). The paragraph concludes with a call to wait patiently for AI agents to complete their tasks, comparing it to the way managers delegate work and expect results.
Keywords
💡Agents
💡Dr. Andrew Ng
💡Sequoia
💡GPT (Generative Pre-trained Transformer)
💡Iterative Workflow
💡Tool Use
💡Reflection
💡Multi-Agent Collaboration
💡HumanEval Benchmark
💡Zero-Shot Prompting
💡Planning
Highlights
Dr. Andrew Ng, a leading mind in AI and co-founder of Google Brain, is very optimistic about the future of agents in AI.
Agents powered by models like GPT-3.5 can reason at the level of GPT-4, indicating significant advancements in AI capabilities.
Sequoia, a renowned Silicon Valley venture capital firm, has a portfolio representing over 25% of the NASDAQ's total value, demonstrating their success in identifying technological winners.
Non-agentic workflows are compared to asking a person to write an essay without revision, whereas agentic workflows involve iterative processes similar to human planning and revision.
The power of agentic workflows lies in the ability to have multiple agents with different roles working together and iterating on a task.
Case studies show that an agentic workflow with GPT 3.5 outperforms zero-shot GPT 4 on coding benchmarks, highlighting the effectiveness of iterative agentic processes.
Reflection, a tool that allows large language models to review and improve their own outputs, significantly enhances performance.
Tool use in AI involves providing agents with custom-coded tools, APIs, and libraries, expanding their capabilities beyond their initial programming.
Planning and multi-agent collaboration are emerging technologies that, despite being finicky, can produce phenomenal results when agents work together.
Different models can power different agents, providing diverse perspectives and enhancing the quality of the final outcome.
Self-reflection in coding involves an agent reviewing and improving its own code, which can lead to more efficient and error-free results.
Automating the coding process through agents can lead to significant productivity boosts and improved code quality.
The use of tools by AI agents allows them to leverage existing, tested code and functionalities, making them more reliable and efficient.
Planning algorithms enable AI agents to autonomously find solutions to problems, circumventing failures, and adapting to new requirements.
Multi-agent collaboration, where different agents play different roles, can lead to complex problem-solving and innovative solutions.
The future of AI is expected to expand dramatically with agentic workflows, potentially changing how we interact with and utilize AI systems.
As AI models become more commoditized, the cost of using these advanced AI functionalities is expected to decrease, making them more accessible.
Fast token generation is crucial for agentic workflows, allowing for more iterations and faster response times, which can lead to better results.
The journey towards AGI (Artificial General Intelligence) is incremental, and agentic workflows could represent a step forward in this long-term progression.
Transcripts
Dr. Andrew Ng just did a talk at
Sequoia and it's all about agents and he
is incredibly bullish on agents he said
things like GPT 3.5 powering agents can
actually reason to the level of GPT 4
and a lot of other really interesting
tidbits so we're going to watch his talk
together and I'm going to walk you
through step by step what he's saying
and why it's so important I am
incredibly bullish on agents myself
that's why I make so many videos about
them and I truly believe the future of
artificial intelligence is going to be
agentic so first who is Dr. Andrew Ng he
is a computer scientist he was the
co-founder and head of Google Brain the
former Chief Scientist of Baidu and a
leading mind in artificial intelligence
he went to UC Berkeley MIT and Carnegie
Mellon so smart smart dude and he
co-founded this company Coursera where
you can learn a ton about computer
science about math a bunch of different
topics absolutely free and so what he's
doing is truly incredible and so when he
talks about AI you should listen so
let's get to this talk this is at Sequoia
and if you're not familiar with Sequoia
they are one of the most legendary
Silicon Valley venture capital firms
ever now here's an interesting stat
about Sequoia that just shows how
incredible they are at picking
technological winners their portfolio of
companies represents more than 25% of
today's total value of the NASDAQ so
the total value of all the companies
that are listed on the NASDAQ 25% of
that market capitalization are companies
that are owned or have been owned or
invested in by Sequoia incredible stat
let's look at some of their companies
Reddit Instacart DoorDash Airbnb a
little company called Apple Block
Snowflake Vanta Zoom Stripe WhatsApp
Okta Instagram this list is absolutely
absurd all right enough of the preface
let me get into the talk itself so
agents you know today the way most of us
use large language models is like this with a
non-agentic workflow where you type a
prompt and generates an answer and
that's a bit like if you ask a person to
write an essay on a topic and I say
please sit down to the keyboard and just
type the essay from start to finish
without ever using backspace um and
despite how hard this is LLMs do it
remarkably well in contrast with an
agentic workflow this is what it may
look like have an AI have an LLM say
write an essay outline do you need to do
any web research if so let's do that
then write the first draft and then read
your own first draft and think about
what parts need revision and then revise
your draft and you go on and on and so
this workflow is much more iterative
where you may have the LLM do some
thinking um and then revise this article
and then do some more thinking and
iterate this
through a number of times so I want to
pause it there and talk about this
because this is the best explanation for
why agents are so powerful I've heard a
lot of people say well agents are just
llms right and yeah technically that's
true but the power of an agentic
workflow is the fact that you can have
multiple agents all with different roles
different backgrounds different personas
different tools working together and
iterating that's the important word
iterating on a task so in this example
he said okay write an essay and yeah an
llm can do that and usually it's pretty
darn good but now let's say you have one
agent who is the writer another agent
who is the reviewer another for the
spell checker another for the grammar
checker another for the fact Checker and
they're all working together and they
iterate over and over again passing the
essay back and forth making sure that it
finally ends up to be the best possible
outcome and so this is how humans work
humans as he said do not just do
everything in one take without thinking
through and planning we plan we iterate
and then we find the best solution so
let's keep listening what not many
people appreciate is this delivers
remarkably better results um I've
actually really surprised myself working
these agent workflows how well how well
they work let's do one case study
that my team analyzed some data using a
coding benchmark called the HumanEval
benchmark released by OpenAI a few years
ago um and this has coding problems
like given a non-empty list of integers
return the sum of all the odd elements
at even positions and it turns out
the answer is a code snippet like that
so today lot of us will use zero shot
prompting meaning we tell the AI write
the code and have it run on the first
spot like who codes like that no human
codes like that just type out the code
and run it maybe you do I can't do that
um so it turns out that if you use GPT-3.5
with zero-shot prompting it gets it
48% right uh GPT-4 way better 67% right
but if you take an agentic workflow and
wrap it around GPT-3.5 it actually
does better than even
GPT-4 um and if you were to wrap this
type of workflow around GPT-4 you know
it also um does very well all right
let's pause here and think about what he
just said over here we have the zero
shot which basically means you're simply
telling the large language model do this
thing not giving it any example not
giving it any chance to think or to
iterate or any fancy prompting just do
this thing and it got the HumanEval
benchmark 48% correct then GPT-4 67%
which is you know a huge Improvement and
we're going to continue to see
Improvement when GPT 5 comes out and so
on however look at this GPT-3.5 wrapped
in an agentic workflow any of these all
perform better than the zero-shot GPT-4
using only GPT-3.5 and this LDB plus
reflection it's actually nearly 100%
it's over 95% then of course if we wrap
GPT-4 in the agentic workflow MetaGPT
for example we all know about it
performs incredibly well across the
board and AgentCoder kind of at the top
here so it's really just showing the
power of agentic workflows and you
notice that GPT-3.5 with an agentic
workflow actually outperforms
GPT-4 um and this means it has
significant consequences I think for
how we all approach building
applications so agents is the term has
been tossed around a lot there's a lot
of consultant reports about how agents are
the future of AI blah blah blah I want
to be a bit concrete and share with you um
the broad design patterns I'm seeing in
agents it's a very messy chaotic space
tons of research tons of Open Source
there's a lot going on but I try to
categorize um bit more concretely what's
going on in agents reflection is a tool
that I think many of us should just use it
just works tool use I think it's more
widely appreciated but actually works
pretty well I think of these as pretty
robust technologies when I all right
let's stop there and talk about what
these things are so reflection is as
obvious as it sounds you are literally
saying to the large language model
reflect on the output you just gave me
find a way to improve it then return
another result or just return the
improvements so very straightforward and
it seems so obvious but this actually
causes large language models to perform
a lot better and then we have tool use
and we learned all about tool use with
projects like autogen and crew AI tool
use just means that you can give them
tools to use you can custom code tools
it's like function calling so you could
say Okay I want a web scraping tool and
I want an SEC lookup tool so you can get
stock information about ticker symbols
you can even plug in complex math
libraries to it I mean the possibilities
are literally endless so you can give a
bunch of tools that the large language
model didn't previously have you just
describe what the tool does and the
large language model can actually choose
when to use the tool it's really cool
use them I can you know almost always
get them to work well um planning and
multi-agent collaboration I think is
more emerging when I use them sometimes
my mind is blown for how well they work
but at least at this moment in time I
don't feel like I can always get them to
work reliably so let me walk through
these four design patterns
all right so he's going to walk through
it but I just want to touch on what
planning and multi-agent collaboration
is so planning we're basically saying
giving the large language model the
ability to think more slowly to plan
steps and that's usually by the way why
in all of my llm tests I say explain
your reasoning step by step because that
kind of forces them to plan and to think
through each step which usually produces
better results and then multi-agent
collaboration that is autogen and crew
AI that is a very emergent technology
I am extremely bullish on it
it is sometimes difficult to get the
agents to behave like you need them to
but with enough QA and enough testing
and iteration you usually can and the
results are phenomenal and not only do
you get the benefit of having the large
language model essentially reflect with
different personalities or different
roles but you can actually have
different models powering different
agents and so you're getting the benefit
of the reflection based on the quality
of each model so you're basically
getting really different opinions as
these agents are working together so
let's keep listening and if some of you
go back and yourself will ask your
engineers to use these I think you get a
productivity boost quite quickly so
reflection here's an example let's say I
ask a system please write code for me for
a given task then we have a coder agent
just an LLM that you prompt to write code
you say def do_task write a function
like that um an example of
self-reflection would be if you then
prompt the LLM with something like this
here's code intended for a task and just
give it back the exact same code that
it just generated and then say check
the code carefully for correctness style
efficiency and give constructive criticism just
write a prompt like that it turns out
the same LLM that you prompted to write
the code may be able to spot problems
like this bug in line five and fix it by
blah blah blah and if you now take its
own feedback and give it to it and
reprompt it it may come up with a
version two of the code that could well
work better than the first version not
guaranteed but it works you know often
enough that it's worth trying for a
lot of applications so what you usually see me
doing in my llm test videos is for
example let's say I say write the Game
snake in Python and it gives me the game
Snake it's that is zero shot I'm just
saying write it all out in one go then I
take it I put it in my VS Code I play
it I get the error or I look for any
bugs and then I paste that back in to
the large language model to fix now
that's essentially me acting as an agent
and what we can do is use an agent to
automate me so basically look at the
code look for any potential errors and
even agents that can run the code get
the error and pass it back into the
large language model now it's completely
automated coding to foreshadow tool use if
you let it run unit tests and if it fails a
unit test then ask why did you fail the unit
test have that conversation and it may be able
to figure out it failed the unit test so
it should try changing something and
come up with V3 by the way for those of
you that want to learn more about these
Technologies I'm very excited about them
for each of the four sections I have a
little recommended reading section in
the bottom that you know hopefully gives
more references and again just the
foreshadow of multi-agent systems I've
described a single coder agent that
you prompt to have it you know have this
conversation with itself um one natural
Evolution of this idea is instead of a
single code agent you can have two
agents where one is a code agent and the
second is a critic agent and these could
be the same base LLM model but that you
prompt in different ways where you say to
one you're an expert coder write code and to the
other one you say you're an expert code reviewer
to review this code and this type of
workflow is actually pretty easy to
implement I think such a very general
purpose technology for a lot of
workflows this will give you a
significant boost in in the performance
of LLMs um the second design pattern is
tool use many of you will already have
seen you know LLM-based systems uh uh using
tools on the left is a screenshot from
um Copilot on the right is something
that I kind of extracted from uh GPT-4
but you know an LLM today if you ask it
what's the best coffee maker it can do web
search for some problems LLMs will
generate code and run code um and it
turns out that there are a lot of
different tools that many different
people are using for analysis for
gathering information for taking action
personal productivity um it turns out a
lot of the early work on tool use turned
out to be in the computer vision
community because before large language
models LLMs you know they couldn't do
anything with images so the only option
was that the LLM generate a function call
that could manipulate an image like
generate an image or do object detection
or whatever so if you actually look at the
literature it's been interesting how
much of the work on tool use seems
like it originated from vision because
LLMs were blind to images before you
know GPT-4V and LLaVA and so on
um so that's tool use all right so
tool use incredibly incredibly important
because you're basically giving the
large language model code to use it is
hardcoded code so you always know the
result it's not another large language
model that might produce something a
little different each time this is
hardcoded and always is going to produce
the same output so these tools are very
valuable and the cool thing about tools
is we don't have to rewrite them right
we don't have to write them from scratch
these are tools that programmers already
tested and use in their code so whether
it's external libraries API calls all of
these things can now be used by large
language models and that is really
exciting we're not going to have to
rewrite all of this tooling and then
planning you know for those of you that
have not yet played a lot with planning
algorithms I feel like a lot of people
talk about the ChatGPT moment where
you're like wow never seen anything like this
I think if you've not used planning algorithms many
people will have a kind of AI-agent
wow moment like I couldn't imagine an AI agent
doing this so I've run live demos where
something failed and the AI agent
rerouted around the failure I've
actually had quite a few of them like
wow you can't believe my AI system just
did that autonomously but um one example
that I adapted from the HuggingGPT paper
and by the way I made a video about HuggingGPT
it is an amazing paper I'll link that in
the description below you know you say
please generate an image where a girl is reading a
book and her pose is the same as the boy in
the image example.jpg and please
describe the new image with your voice so given
an example like this um today we have AI
agents who can kind of decide first
thing I need to do is determine the pose
of the boy um then you know find the
right model maybe on Hugging Face to
extract the pose then next I need to find
a pose-to-image model to synthesize a
picture of a girl following
the instructions then use uh image to
text and then finally use text to speech
and today we actually have agents that
I don't want to say they work reliably
you know they're kind of finicky they
don't always work but when it works is
actually pretty amazing but with agentic
Loop sometimes you can recover from
earlier failures as well so yeah and
that's a really important Point agents
are a little bit finicky but since you
can iterate and the Agents can usually
recover from their issues that makes
them a lot more powerful and as we
continue to evolve agents as we get
better agentic models better tooling
better frameworks like Crew AI and
autogen all of these kind of finicky
aspects of agents are going to start to
get reduced tremendously I find myself
already using research agents in some of
my work when I want a piece of research but
I don't feel like you know Googling it
myself and spending a long time I send it
to the research agent come back in a few
minutes and see what it's come up with
and and it it sometimes works sometimes
doesn't right but that's already a part
of my personal
workflow the final design pattern multi-agent
collaboration this is one of
those funny things but uh um it works
much better than you might think uh uh
but on the left is a screenshot from a
paper called um ChatDev I made a video
about this it'll be in the description
below as well uh which is completely
open which is actually open source many of
you saw the you know flashy social media
announcement demo of Devin uh uh
ChatDev is open source it runs on my
laptop and what ChatDev does is an example
of a multi-agent system where you prompt
one LLM to sometimes act like the CEO of
a software engineering company sometimes act as
a designer sometimes a product manager
sometimes act as a tester and this flock of
agents that you build by prompting an LLM
to tell them you're now CEO you're now a
software engineer they collaborate have
an extended conversation so that if you
tell it please develop a game develop a
gomoku game they'll actually spend you know
a few minutes writing code testing it
iterating and then generate a like
surprisingly complex programs doesn't
always work I've used it sometimes it
doesn't work sometimes is amazing but
this technology is really um getting
better and and just one of design
pattern it turns out that multi-agent
debate where you have different agents
you know for example you could have Chat
GPT and Gemini debate each other that
actually results in better performance
as well all right so he said the
important part right there when you have
different agents and each of them are
are powered by different models maybe
even fine-tuned models fine-tuned
specifically for their task and their
role you get really good performance and
that is exactly what a project like Crew
AI or AutoGen is made for so having
multiple simulated AI agents work
together has been a powerful design
pattern as well um so just to summarize
I think these are the
uh patterns I've seen and I think
that if we were to um use these uh uh
patterns you know in our work a lot of
us can get a productivity boost quite quickly
and I think that um agentic reasoning
design patterns are going to be
important uh this is my last slide I
expect that the set of tasks AI can do
will expand dramatically this year uh
because of agentic workflows and one
thing that it's actually difficult for
people to get used to is when we prompt
an LLM we want a response right away um
in fact a decade ago when was you know
having discussions around at at at
Google on um called a big box search
type in Long prompt one of the reasons
you know I failed to push successfully
for that was because when you do a web
search you one have responds back in
half a second right that's just human
nature we like that instant gra instant
feedback but for a lot of the agent
workflows um I think we'll need to learn
to dedicate the toss and AI agent and
patiently wait minutes maybe even hours
uh to for response but just like us I've
seen a lot of novice managers delegate
something to someone and then check in
five minutes later right and that's not
productive um I think we need to it be
difficult we need to do that with some
of our AI agents as well all right so
This is actually a point where I want to propose a different way of thinking about it. Think about Groq: you get 500, 700, 850 tokens per second with Groq and their architecture. Now consider the agents you usually expect to take a few minutes for a semi-complex task, all the way up to 10, 15, 20 minutes depending on what the task is. A lot of that task-completion time is the inference running, assuming you're getting 10, 15, 20 tokens per second with OpenAI; but if you're able to get 800 tokens per second, it's essentially instant. A lot of people, when they first saw Groq, thought: what's the point of 800 tokens per second, since humans can't read that fast? This is the best use case for it: agents running at that inference speed and reading each other's responses are the best way to leverage it, because humans don't actually need to read the intermediate output. So this is a perfect example. If all of a sudden that part of your agent workflow is extremely fast, and then let's say we get an embeddings model to be that fast, the slowest part of the entire agent workflow becomes searching the web or hitting a third-party API; it's no longer the inference and the embeddings. And that is really exciting. Let's keep watching to the end.
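To make the speed point concrete, here's the back-of-the-envelope arithmetic. The total token count is a made-up illustration; the speeds are the rough figures mentioned above (10-20 tokens/s for a typical hosted API versus roughly 800 tokens/s reported for Groq).

```python
# Illustrative comparison of agent-workflow completion time at two
# inference speeds. The 50,000-token workload is an assumption.
def completion_time_s(total_tokens: int, tokens_per_second: float) -> float:
    """Seconds spent purely on token generation."""
    return total_tokens / tokens_per_second

tokens = 50_000                          # tokens generated across all iterations
slow = completion_time_s(tokens, 15)     # ~typical hosted-API speed
fast = completion_time_s(tokens, 800)    # ~Groq-class speed

print(f"slow: {slow / 60:.1f} min, fast: {fast:.1f} s")
```

At these numbers the slow path takes close to an hour of pure generation while the fast path finishes in about a minute, which is why fast inference changes what agentic workflows feel like in practice.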
And then one other important trend: fast token generation matters, because with these agentic workflows we're iterating over and over, so the LLM is generating tokens for an LLM to read. I think that generating more tokens really quickly, even from a slightly lower-quality LLM, might give good results compared to slower tokens from a better LLM. Maybe that's a little bit controversial, but it's because it may let you go around this loop a lot more times, kind of like the results I showed with GPT-3.5 and an agentic architecture on the first slide. And candidly, I'm really looking forward to Claude 5 and Claude 4 and GPT-5 and Gemini 2.0 and all these other wonderful models that many of you are building. Part of me feels that if you're looking forward to running your thing on GPT-5 zero-shot, you may be able to get closer to that level of performance on some applications than you might think with agentic reasoning on an earlier model. I think this is an important trend. And honestly, the path to AGI feels like a journey rather than a destination, but I think this type of agentic workflow could help us take a small step forward on this very long journey. Thank you.
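The "more trips around the loop" idea is essentially reflection: draft, critique, revise, repeat. A minimal sketch, with hypothetical stand-in callables where real generator and critic models would go:

```python
# Sketch of an iterative reflection loop: a fast generator paired
# with a critic, cycling draft -> critique -> revision. The
# generate/critique callables are toy stand-ins for LLM calls.
from typing import Callable

def refine(generate: Callable[[str], str],
           critique: Callable[[str], str],
           task: str, iterations: int = 3) -> str:
    draft = generate(task)
    for _ in range(iterations):
        feedback = critique(draft)
        if feedback == "LGTM":   # critic is satisfied, stop early
            break
        # Feed the critique back in and regenerate.
        draft = generate(f"{task}\nPrevious draft:\n{draft}\nFeedback:\n{feedback}")
    return draft

# Toy stand-ins: the "critic" demands that the answer mention tests.
gen = lambda p: ("def add(a, b): return a + b  # with tests"
                 if "Feedback" in p else "def add(a, b): return a + b")
crit = lambda d: "LGTM" if "tests" in d else "Please add tests."

result = refine(gen, crit, "Write an add function")
```

A faster model lets you afford more of these critique/revise cycles per unit time, which is exactly Ng's argument for why raw tokens-per-second can beat raw model quality on some tasks.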
Okay, so he said a lot of important things at the end there. One thing he said is that if you're already looking forward to GPT-5 or Claude 4, basically the next generation of the cutting-edge models, you might be able to achieve something close to that level of performance today by running agentic workflows on current models. As for the cost of all these tokens, I think that's going to get sorted out as models become more and more commoditized. So I'm super excited about agents, I'm super excited about inference speed improvements, and I hope you liked Andrew Ng's talk. If you liked this video, please consider giving it a like and subscribing, and I'll see you in the next one.