AI Leader Reveals The Future of AI AGENTS (LangChain CEO)
Summary
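TLDR: The video discusses the current state and future of AI agents. Harrison Chase, CEO and co-founder of LangChain, describes agents as systems that use a language model to interact with the external world through tool use, memory, planning, and taking actions; the host adds that this makes them more than complex prompts. The video notes that giving agents short-term and long-term memory, and letting them plan and act, can noticeably improve their performance. Harrison also discusses user experience (UX), arguing that a human in the loop is still necessary, while the host describes how agent frameworks can help reduce hallucinations and improve reliability. Finally, Harrison covers two kinds of agent memory, procedural memory and personalized memory, and why they matter for the next generation of agents. Throughout, the discussion highlights the open questions developers face when building production-ready, real-world agents.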
Takeaways
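- 🤖 **Agents are more than prompts**: Agents can use tools, keep memory, and carry out plans, which goes well beyond a single language model prompt.
- 🧠 **Memory matters**: Agents have short-term and long-term memory. Long-term memory, for example via retrieval-augmented generation (RAG), stores information for later use.
- 🛠️ **Tool use**: Agents can be given tools such as a calendar, a calculator, web access, and a code interpreter, which greatly extends what they can do.
- 📈 **Performance gains**: According to the host, CrewAI reported significantly better agent performance after adding short-term and long-term memory.
- 🔍 **Planning and action**: Agents can plan through reflection, self-critique, chain of thought, and subgoal decomposition, and can then take actions.
- 🔁 **Running in a loop**: In its simplest form, an agent is an LLM run in a loop: ask what to do next, execute it, and repeat until the model decides it is done.
- 🌐 **Developer focus**: Developers are working to make agents production-ready and usable in the real world, with particular attention to planning, UX, and memory.
- 📐 **Flow engineering**: Explicitly designing a graph or state machine for the workflow can improve results; it offloads part of the planning to the human engineers who design the flow up front.
- 🔮 **Looking ahead**: The host guesses that a new architecture may be needed before models can reason and plan ahead reliably on their own.
- 🤝 **Coordination**: Agent frameworks remain valuable for coordinating models, agents, and tools, even if future models learn to "think slowly" natively.
- 🔄 **Rewind and edit**: The ability to rewind to an earlier agent state and edit it, which Harrison highlights in connection with the Devin demo, improves UX and makes agents more reliable and steerable.
- 🧵 **Types of memory**: Agents need procedural memory (remembering the correct way to do a task) and personalized memory (facts about the user that make the experience more personal).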
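The "LLM in a loop" idea from the takeaways can be sketched in a few lines of Python. This is a minimal illustration, not LangChain's implementation: `fake_llm`, `TOOLS`, and `run_agent` are hypothetical names, and the model call is replaced by a hard-coded stub so the example runs offline.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry; the single toy tool only adds integers.
TOOLS: Dict[str, Callable[[str], str]] = {
    "calculator": lambda expr: str(sum(int(part) for part in expr.split("+"))),
}


def fake_llm(history: List[str]) -> Tuple[str, str]:
    """Stand-in for a real model call. Returns (action, argument)."""
    observations = [h for h in history if h.startswith("observation: ")]
    if not observations:
        # No tool result yet, so the stub "decides" to call the calculator.
        return ("calculator", "2+3")
    # A tool result exists, so the stub finishes with that result.
    return ("finish", observations[-1][len("observation: "):])


def run_agent(task: str, llm=fake_llm, max_steps: int = 5) -> str:
    """Ask the model what to do, execute it, and repeat until it says 'finish'."""
    history = [f"task: {task}"]
    for _ in range(max_steps):
        action, argument = llm(history)
        if action == "finish":
            return argument
        result = TOOLS[action](argument)
        history.append(f"observation: {result}")
    raise RuntimeError("agent did not finish within max_steps")


print(run_agent("add 2 and 3"))
```

The `max_steps` cap is a design choice added here so a model that never decides it is done cannot loop forever.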
Q & A
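Who is Harrison Chase?
-Harrison Chase is the CEO and co-founder of LangChain, a popular coding framework that makes it easy to plug different AI tools together.
What is LangChain?
-LangChain is a developer framework for building all types of large language model (LLM) applications. Agents are among the most common applications built with it.
What are agents?
-An agent uses a language model to interact with the external world. Rather than being just a complex prompt, it can be given tools such as a calendar, a calculator, and web access, has short-term and long-term memory, and can plan and take actions.
Why are agents more than prompts to a large language model?
-Because agents have capabilities beyond the language model itself, such as tool use, memory, and planning. These let agents carry out more complex tasks than generating a text response.
What is planning in an agent?
-Planning is the agent's ability to reflect, plan ahead, and break a complex task into subtasks, something language models cannot yet do reliably on their own.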
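What is reflection, which the video mentions alongside Tree of Thoughts?
-With reflection, the model generates an initial response to a prompt, and that response is then fed back to the model with a request to improve it. The host groups reflection with Tree of Thoughts as techniques that give a model the ability to reflect and plan.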
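A rough sketch of the generate-then-critique cycle described in the transcript follows. It is an illustration under stated assumptions, not the method from any specific paper: `reflect_and_revise`, `toy_generate`, and `toy_critique` are hypothetical names, and both model roles are played by hard-coded stubs.

```python
from typing import Callable, Optional


def reflect_and_revise(
    prompt: str,
    generate: Callable[..., str],
    critique: Callable[[str, str], Optional[str]],
    max_rounds: int = 2,
) -> str:
    """Draft an answer, ask for feedback, and redraft until the critic is satisfied."""
    draft = generate(prompt, feedback=None)
    for _ in range(max_rounds):
        feedback = critique(prompt, draft)
        if feedback is None:  # the critic has nothing left to improve
            break
        draft = generate(prompt, feedback=feedback)
    return draft


# Stubs standing in for two LLM calls (assumptions for illustration only).
def toy_generate(prompt: str, feedback: Optional[str] = None) -> str:
    return "paris" if feedback is None else "Paris."


def toy_critique(prompt: str, draft: str) -> Optional[str]:
    if draft.endswith("."):
        return None
    return "Capitalize the answer and end with a period."


print(reflect_and_revise("What is the capital of France?", toy_generate, toy_critique))
```

In a real system both callables would wrap model calls; the bounded number of rounds keeps cost predictable.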
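What does "human in the loop" mean in agent applications?
-Human in the loop means a person can step in while the agent works, to guide or correct it, which improves the agent's reliability and output quality.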
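One way to picture the balance the host describes, where a person is consulted only when the agent produces a deliverable, is an approval gate. The sketch below is hypothetical: `Step`, `is_deliverable`, and `run_with_approval` are invented for illustration, and the approval callback stands in for a real review interface.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    description: str
    # Hypothetical flag: does this step produce something others will rely on?
    is_deliverable: bool


def run_with_approval(steps: List[Step], approve: Callable[[Step], bool]) -> List[str]:
    """Run routine steps automatically; pause for a human only on deliverables."""
    log: List[str] = []
    for step in steps:
        if step.is_deliverable and not approve(step):
            log.append(f"rejected: {step.description}")
            break  # stop so the human can redirect the agent
        log.append(f"done: {step.description}")
    return log


steps = [Step("search internal docs", False), Step("send report to client", True)]
print(run_with_approval(steps, approve=lambda step: False))
```

Gating only deliverables keeps most of the automation while still putting a person in front of anything that leaves the agent's sandbox.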
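Why are agent frameworks valuable for coordinating different models and tools?
-Agent frameworks help developers build tools and strategies, coordinate different models and agents, and keep a workflow consistent. The host expects them to stay valuable even once models can think slowly and plan on their own.
Why does the user experience (UX) of agents matter?
-UX determines how users interact with an agent. Good UX can make agents more reliable and easier to use; one example is "rewind and edit", which lets the user go back to an earlier agent state and change it.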
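Rewind and edit can be approximated by checkpointing the agent's state after every step. The following is a minimal sketch under that assumption; `AgentSession` and its methods are hypothetical and do not mirror any particular framework's API.

```python
import copy
from typing import Any, Dict, List, Optional


class AgentSession:
    """Keeps a checkpoint of the agent state after every step."""

    def __init__(self, initial_state: Dict[str, Any]) -> None:
        self._checkpoints: List[Dict[str, Any]] = [copy.deepcopy(initial_state)]

    @property
    def state(self) -> Dict[str, Any]:
        return self._checkpoints[-1]

    def step(self, update: Dict[str, Any]) -> None:
        """Record a new checkpoint that applies `update` to the latest state."""
        new_state = copy.deepcopy(self.state)
        new_state.update(update)
        self._checkpoints.append(new_state)

    def rewind(self, index: int, edit: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        """Drop every checkpoint after `index`, optionally editing that state."""
        if not 0 <= index < len(self._checkpoints):
            raise IndexError("no such checkpoint")
        del self._checkpoints[index + 1:]
        if edit:
            self._checkpoints[index].update(edit)
        return self.state


session = AgentSession({"plan": "use flask"})
session.step({"file": "app.py"})
session.step({"bug": True})
# Go back to the state after the first step and change the plan.
print(session.rewind(1, edit={"plan": "use fastapi"}))
```

After the rewind the agent continues from the edited checkpoint, so the discarded step (here the one that introduced `bug`) no longer influences later decisions.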
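What is the difference between an agent's short-term and long-term memory?
-Short-term memory is memory within a conversation, or between agents in the same conversation. Long-term memory is information saved for later use, for example with retrieval-augmented generation (RAG). Long-term memory is important for personalization and for retaining knowledge in enterprise settings.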
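The distinction can be sketched as two data structures: a message list for the current conversation, and a persistent store that is searched at question time. The example below is a toy, not a production RAG pipeline: `LongTermMemory` and `build_prompt` are hypothetical names, and keyword overlap stands in for the embedding search a real system would typically use.

```python
from typing import List


class LongTermMemory:
    """Toy store: retrieval ranks notes by words shared with the query."""

    def __init__(self) -> None:
        self._notes: List[str] = []

    def save(self, text: str) -> None:
        self._notes.append(text)

    def retrieve(self, query: str, k: int = 1) -> List[str]:
        query_words = set(query.lower().split())
        scored = [(len(query_words & set(note.lower().split())), note) for note in self._notes]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [note for _, note in scored[:k]]


def build_prompt(question: str, short_term: List[str], long_term: LongTermMemory) -> str:
    """Combine retrieved long-term notes with the running conversation."""
    recalled = long_term.retrieve(question)
    lines = ["Known facts:"] + recalled + ["Conversation:"] + short_term + [f"User: {question}"]
    return "\n".join(lines)


memory = LongTermMemory()
memory.save("refund policy allows returns within 30 days")
memory.save("office is closed on fridays")
conversation = ["User: hi", "Agent: hello, how can I help?"]  # short-term memory
print(build_prompt("what is the refund policy", conversation, memory))
```

Short-term memory here lives only as long as the `conversation` list, while the `LongTermMemory` instance could be persisted and reused across sessions.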
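Why is memory management critical for agents?
-Memory management underpins an agent's personalization and adaptability. The agent has to remember both the correct way to do things and personal facts about the user, and that memory has to evolve as business needs change.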
Outlines
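🤖 The current state and future of agents
Harrison Chase, CEO and co-founder of LangChain, spoke at a Sequoia event about the current state of agents and where they are heading. He describes agents as using a large language model to interact with the external world; they can be equipped with tools such as a calendar, a calculator, and web access, have short-term and long-term memory, and can plan and take actions. The host notes that, according to CrewAI, agent performance improved significantly after short-term and long-term memory were added. The segment also covers why planning matters, including self-critique, chain of thought, and subgoal decomposition, and how developers are making agents production-ready and practical.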
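🧐 Agent user experience and memory
This segment covers the importance of UX and memory for agents. The host says hallucinations are almost guaranteed with large language models, and that agent frameworks help reduce them through caching, prompt libraries, and similar measures. A human in the loop is still necessary because agents are not yet reliable enough. The segment discusses how to balance automation against human intervention, how interface design can improve the experience, why letting users rewind and edit an agent's actions matters, and what personalized memory adds to the experience.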
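🔄 Agent planning and action
This segment looks at agents' ability to plan and act. Harrison says current language models cannot yet plan reliably, but external prompting strategies and cognitive architectures can improve their performance. He asks whether these strategies will eventually be built into model APIs, and the host guesses that a new architecture may be needed for models to reason and plan ahead properly. Harrison also discusses flow engineering: explicitly designing a graph or state machine that defines the agent's workflow, which the host sees as a key strength of agent frameworks.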
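Flow engineering, in the sense of an explicitly designed graph or state machine, can be illustrated with a small write/test/fix loop. This is a hypothetical flow invented for the example, not the one from the AlphaCodium paper: the node functions are stubs for what would be model calls, and `run_flow` is an assumed helper name.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]


def write_code(state: State) -> str:
    state["attempts"] += 1
    return "ok"


def run_tests(state: State) -> str:
    # Stub: pretend the tests only pass on the second attempt.
    return "pass" if state["attempts"] >= 2 else "fail"


def fix_code(state: State) -> str:
    state["attempts"] += 1
    return "ok"


NODES: Dict[str, Callable[[State], str]] = {"write": write_code, "test": run_tests, "fix": fix_code}

# The human engineer fixes the flow up front: (node, outcome) -> next node.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("write", "ok"): "test",
    ("test", "pass"): "done",
    ("test", "fail"): "fix",
    ("fix", "ok"): "test",
}


def run_flow(start: str, state: State, max_steps: int = 20) -> List[str]:
    """Walk the state machine until it reaches 'done'; return the visited nodes."""
    current, trace = start, []
    for _ in range(max_steps):
        trace.append(current)
        if current == "done":
            return trace
        outcome = NODES[current](state)
        current = TRANSITIONS[(current, outcome)]
    raise RuntimeError("flow did not reach 'done'")


print(run_flow("write", {"attempts": 0}))
```

The model never chooses the next node here; the transition table does, which is the sense in which planning is offloaded to the engineers who designed the flow.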
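📚 Long-term and short-term agent memory
This segment covers why long-term and short-term memory matter for personalized experiences and for retaining knowledge in enterprise settings. Harrison demonstrates memory in two example applications, and the host notes that agent frameworks are building memory in, which helps agents learn and improve. The host also raises the complexities: how much to store, when to forget, and how memory should evolve as business needs change. There is no tried-and-true path yet, but giving agents memory and using flow engineering and tools is already possible.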
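The two kinds of memory from the talk, plus the open questions about how much to store and when to forget, can be made concrete with a small store. Everything here is an assumption for illustration: `AgentMemory`, the "forget the oldest fact" rule, and the sample facts are invented, and the talk does not prescribe any particular policy.

```python
from collections import OrderedDict
from typing import Dict, List


class AgentMemory:
    """Separates procedural memory (how to do a task) from personal facts."""

    def __init__(self, max_facts: int = 3) -> None:
        self.procedures: Dict[str, str] = {}
        self.facts: "OrderedDict[str, str]" = OrderedDict()
        self.max_facts = max_facts

    def learn_procedure(self, task: str, instruction: str) -> None:
        # A newer correction replaces the older one, so the memory can evolve.
        self.procedures[task] = instruction

    def remember_fact(self, key: str, value: str) -> None:
        self.facts.pop(key, None)  # re-inserting moves the key to the newest slot
        self.facts[key] = value
        while len(self.facts) > self.max_facts:
            self.facts.popitem(last=False)  # assumed policy: forget the oldest fact

    def context_for(self, task: str) -> List[str]:
        lines: List[str] = []
        if task in self.procedures:
            lines.append(f"How to do '{task}': {self.procedures[task]}")
        lines += [f"User {key}: {value}" for key, value in self.facts.items()]
        return lines


memory = AgentMemory(max_facts=2)
memory.learn_procedure("write tweet", "short, no hashtags")
memory.learn_procedure("write tweet", "short, casual, no hashtags")
memory.remember_fact("cuisine", "Italian")
memory.remember_fact("hobby", "cooking classes")
print(memory.context_for("write tweet"))
```

A capacity limit is only one possible forgetting rule; time-based expiry or explicit user deletion would fit the same interface.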
Keywords
💡Agents
💡LangChain
💡Planning
💡User Experience
💡Memory
💡Tool Usage
💡Taking Actions
💡Human in the Loop
💡Flow Engineering
💡Large Language Models (LLM)
💡CrewAI
Highlights
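Harrison Chase, CEO and co-founder of LangChain, gave a talk on agents at a Sequoia event, covering their current state and future direction.
Agents are more than complex prompts: they can access tools such as a calendar, a calculator, and the web, and they have short-term and long-term memory.
Agents can plan, which includes self-critique, chain of thought, and subgoal decomposition.
CrewAI released short-term and long-term memory for its agent framework and reported significantly improved agent performance.
In its simplest form, an agent is an LLM run in a loop that repeatedly asks what to do and executes it until the task is done.
Developers are exploring how to make agents more reliable and practical in production.
Planning is one of the exciting areas of agent development; it lets models plan ahead and break down complex tasks.
Techniques such as reflection and Tree of Thoughts can improve a model's planning and reasoning.
Agent frameworks extract more quality and performance from a large language model prompt.
A new architecture may be needed before models can reason and plan ahead properly.
Until models can think slowly and plan natively, developers will have to build these tools and strategies themselves.
Flow engineering, the design of workflows and state machines, is another important part of agent development.
Agent frameworks help with flow engineering, which goes beyond prompt engineering.
UX is a key area for agent applications; the right way to interact with agents has not been settled.
A human in the loop is necessary because agents are not yet reliable enough, but too much human intervention erodes the value of automation.
Devin and its open-source follow-ups showed a powerful UI design that presents the browser, chat window, terminal, and code on one screen.
Rewind and edit is an important UX feature: it lets the user return to an earlier state and edit it.
Agent memory includes procedural memory and personalized memory, which matter for personalized experiences and for learning in enterprise settings.
Agent frameworks are building, or have already built, long-term and short-term memory, which helps with enterprise knowledge and personalization.
Agent development is still at an early stage; questions about memory storage, forgetting rules, and adapting to business change remain open.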
Transcripts
let's talk about agents Harrison Chase
the CEO and founder of LangChain did a
talk at this Sequoia event that I made
another video on a couple weeks ago
where Andrew Ng did a talk also there
and Harrison's talk is also about agents
and the current state of Agents what to
expect from agents in the future where
they work really well where they don't
and so let's watch it together and I'll
comment on it as we go through it so
let's watch a quick note before I get to
the video if you want a chance to win a
rabbit R1 all you need to do is
subscribe to my newsletter get awesome
AI updates twice a week and stay up
to date on the world of AI I'll drop the
link to subscribe in the description
below so check it out subscribe to my
newsletter and maybe you can win this
rabbit R1 now back to the video for
those of you who are not familiar with
Harrison he is as I mentioned the
co-founder and CEO of LangChain and if
you haven't heard of LangChain let me
tell you quickly about what they do so
LangChain is a super popular coding
framework that allows you to basically
just take a bunch of different AI tools
and plug them all together really easily
the chain part and really this was
agents before agents had a term and so
of course Harrison is incredibly
knowledgeable about agents so now let's
watch the video thanks for the intro and
and thanks for having me excited to be
here so today I want to talk about
agents uh so LangChain's the developer
framework for building all types of LLM
applications but one of the most common
ones that we see being built are agents
um and we've heard a lot about agents uh
from a variety of speakers before so I'm
not going to I'm not going to go into
too much of of a deep kind of like
overview but at a high level it's using
a language model to interact with the
external world all right I actually want
to stop it right away so one thing that
I've heard quite a lot and less So
lately now that agents have really
become mainstream is agents are just
prompts they're just complex prompts but
that's not necessarily true and even if
it were there's so much going on around
that that that is what makes it so
special and this is a great graph for
actually understanding what's going on
with agents so you can think of the
large language model as this one little
piece right here the agent itself then
you can give that agent tools so they
can have access to your calendar to a
calculator to the web they can do code
interpreter which means they can
actually spin up environments and write
and run code and basically there's an
unlimited amount of tools that you can
give agents then we give agents memory
both short memory and long memory so
short memory means memory between a
conversation or within the conversation
between agents and long-term memory is
something like RAG for example so
retrieval augmented generation saving
information to be used later and CrewAI
my favorite agent framework just
released both short-term and long-term
memory and has shown that the agent
performance has significantly improved
since adding these features so agents
can also do planning which is reflection
self-critique chain of thoughts
subgoal decomposition and then they can
also perform actions so with all of
these additional superpowers agents or
just the large language model prompt
becomes so much more than just that and
we're going to touch on planning in a
moment because Harrison says something
really interesting about it so let's
keep watching tool usage memory planning
taking actions is is kind of the high-level
gist and the simple form of this you can
maybe think of as just running an LLM in
a for loop so you ask the LLM what to do
you then go execute that and then you
ask it what to do again and then you
keep on doing that until it decides it's
done so today I want to talk about some
of the areas that I'm really excited
about that we see developers spending a
lot of time in and really taking this
idea of of of an agent and making it
something that's production ready and
and and real world and and really you
know the future of Agents as the title
suggests so there's three main things
that I want to talk about and we've
actually touched on uh all of these in
some capacity already so I think it's a
great Roundup so planning uh the user
experience and memory so for planning
Andrew uh covered this really nicely in
his talk um but we see a few the basic
idea here is that if you think about
running the llm in a for Loop often
times there's multiple steps that it
needs to take so I'm going to pause
there for a second so I've already done
a video all about the tree of thoughts
paper which is incredible so be sure to
check it out I'll link it in the
description below and then I I haven't
actually done a review of the reflection
paper but the gist is you allow a model
to generate this initial response to a
prompt and then you simply feed it back
and say hey what would you do better and
that's the very simple explanation of
what it does but essentially what we're
doing is giving the models the ability
to reflect to plan ahead to break
complex task down into subtasks and
that's something that the models
themselves alone can't do
yet and I've made a few videos about Q*
and Q* has to do with giving the models
the ability to plan and look ahead but
that's not something we have to play
around with today and in fact just today
I released a video about the gpt2-chatbot
large language model that was
mysteriously released on LMSYS and then
it was actually just recently taken down
and a lot of people think that that is a
model that has the ability to power
agents because it does have this
planning ability more so than anything
we've ever seen and I haven't
necessarily seen that but I also wasn't
explicitly testing for that but the
point is agents and agent Frameworks
allow you to extract so much more
quality so much more performance out of
just a large language model prompt and
so when you're running it in a for Loop
you're asking it implicitly to kind of
reason and plan about what the best next
step is see the observation and then
kind of like resume from there and think
about the what the what the next best
step is right after that
right now at the moment language models
aren't really good enough to kind of do
that reliably and so we see a lot of
external uh uh papers and external
prompting strategies kind of like
enforcing planning in in some method
whether this be uh planning steps
explicitly up front um or reflection
steps at the end to see if it's kind of
like done everything correctly as as it
should and I've actually made a video
about models that were explicitly
trained to quote unquote think slowly
and Orca is a great example of that and
Orca is a project out of Microsoft that
really teaches the model how to think
slowly and use a lot of these techniques
whether we're talking about reflection
or tree of thoughts or other kind of
slow thinking techniques automatically
without us having to prompt or kind of
code around the model to make it do that
I think the interesting thing here
thinking about the future is whether
these types of prompting strategies and
these types of like cognitive
architectures continue to be things that
developers are building or whether they
get built into the model apis as we
heard Sam talk a little bit about um
yeah so that's really a question still
and I'm not sure my guess is it's going
to take a new architecture something
completely new Beyond just the
Transformers architecture to allow these
models to really logic and reason
properly to plan ahead to think to think
slowly and that's just not what they do
today so maybe that's what GPT-5 is
going to be maybe that's what Q* is but
I haven't seen any evidence that we
actually have a large language model
that can do that so for now developers
are going to have to build these tools
and these strategies themselves which is
fine because companies like CrewAI make
it really easy to do that and even when
models will be able to think more slowly
and have these things inherently in them
agent Frameworks are still going to be
very valuable for coordinating different
models for giving tools for being able
to coordinate different models
coordinate different agents give them
different tools and coordinate a very
consistent workflow um and so for all
three of these to be clear like I don't
have answers uh and I just have
questions and so one of my questions
here is you know are these planning
prompting things short-term hacks or
long-term uh necessary components
actually let me know what you think in
the comments do you think that these
types of prompting techniques reflection
tree of thoughts are these short-term
hacks and then eventually the models
will just be able to do this without
prompting them to do so or using
external techniques or are these
techniques we're going to have to do
forever another another kind of like
aspect of this is just the importance of
basically flow engineering and so this
term I heard come out of this paper
AlphaCodium it basically achieves
state-of-the-art kind of like coding
performance not necessarily through
better models or better prompting
strategies but through better flow
engineering so explicitly designing this
uh kind of like graph or or or state
machine type thing and I think one way
to think about this is you're actually
offloading the planning of what to do to
the human Engineers who are doing that
at the beginning and so you're relying
on that as a little bit of a crutch all
right so that's a really good point and
again that's why I'm so bullish on agent
Frameworks they help you with the flow
engineering piece and Beyond just prompt
engineering now we're talking about flow
engineering and that's a whole separate
Art and Science in itself and it's still
very early days we're still trying to
figure out what types of flows work well
how many agents work well together is
there a maximum is there a minimum how
should they plan what steps should they
execute so it's still early it's still
really fun to watch the next thing that
I want to talk about is the ux of a lot
of agent applications this is actually
one area I'm really excited about I
don't think we've kind of nailed the the
right way to interact with these agent
applications I think uh human in the
loop is kind of still necessary because
they're not super reliable and I want to
talk about human in the loop so I work
with large companies helping them with
their AI strategy and consistency and
reliability and quality is insanely
important to them and when you're
talking about large language models
hallucinations are almost guaranteed so
how do you avoid them well there's a few
ways again agent Frameworks help you
really reduce hallucinations through
things like caching through prompt
libraries through obviously reducing the
temperature of the large language model
but also human in the loop and that's
really important especially to large
Enterprise companies and I don't think
human in the loop is going to go away
anytime soon but as Harrison says if you
have too much human in the loop
basically you're removing all of the
Automation and there's this fine balance
of where you actually need human in the
loop and I think it's essentially
whenever you have a deliverable whenever
the agents produce something that is
substantial and is a piece of something
that will be delivered and relied upon
within the organization and so that's
something I'm still experimenting with
is what the optimal human in the loop
strategy is but if it's in the loop too
much then it's not actually doing that
much useful thing so there's kind of
like a weird balance there one ux thing
that I really like uh from from Devin uh
which came out you know a week two weeks
ago and speaking of Devin as much
virality as they had and as much dunking
as they had and people calling it out as
actually like doing less than what
they've shown in the demo the ux is
fantastic and shortly after Devin we had
Devika and OpenDevin so obviously they did
something right with showing all of the
screens the browser the chat window the
terminal the code all in one screen that
was obviously a really powerful UI
because it was copied and a lot of
people like it so I think this was
immediately one of the big contributions
of seeing the Devin demo was just
everybody realized oh this is a great
way to structure the user interface um
and and and Jordan kind of like uh put
this nicely on Twitter is is the
presence of like a rewind and edit
ability so you can basically go back to
a point in time where the edit or where
the agent was and then edit what it did
or edit the state that it's in so that
it can make a more informed decision and
I think this is a really really powerful
ux um that we're really excited about uh
at LangChain and exploring this more and I
think this brings a little bit more
reliability um but at the same time kind
of like steering ability to the agents
so let's talk about being able to
rewind and change things uh I agree this
is a really incredible user experience
because there are times where you kind
of go off in a path in a direction and
you find that that was not the right
thing to do so let's go back and start
from this state and I've seen one
project do this incredibly well and
they've actually been a sponsor of this
channel but it comes to mind because
they really do do it so well which is
Pythagora and that was the AI coding
assistant that I've shown you before and
Pythagora has this ability to basically
rewind to any step along the entire
journey of a project and you can start
from there you can edit it and continue
on from there so really cool and that's
kind of what Devin does that's also what
Harrison is talking about I think that's
going to be a very strong piece of agent
coordination and I can't wait until all
of the agent Frameworks build that in
speaking of kind of like steering
ability the the last thing I want to
talk about is the memory of of Agents um
and so Mike uh Zapier showed this off a
little a little bit earlier where he was
basically interacting with the bot and
kind of like teaching it what to do and
correcting it and so this is an example
where I'm teaching um in a chat setting
an AI to kind of like write a tweet in a
specific style and so you can see that
I'm just correcting it in natural
language to get to a style that I want I
then hit thumbs up the next time I go
back to this application it it remembers
the style that I want but I can keep on
editing it I can keep on making it a
little more differentiated and when I go
back a third time it remembers all of
that and so this I would kind of
classify as kind of like procedural
memory so it's remembering the correct
way to do something I think another
really important aspect is is basically
personalized memory so remembering facts
about a human that you might not
necessarily use to to do something more
correctly but you might use to make the
experience kind of like more
personalized um so this is an example
kind of like journaling app that that
we're building and playing around with
for exploring memory and you can see
that I mentioned that I went to a
cooking class and it remembers that I
like Italian food and so I think
bringing in these kind of like
personalized aspects um whether it be
procedural or or kind of like these
personalized facts will be really
important for the next generation of
Agents um that's all I have so that is
both long-term and short-term memory in
the short term you should be able to go
back and forth with an agent or allow
the agents to go back and forth with
each other and they can learn and
improve along the way and that might be
also where human in the loop comes in
you can kind of steer them but then we
have long-term memory which is also
really important not only for
personalization but also within the
context of businesses and Enterprise the
ability for these agents to learn things
to have obviously the company's
knowledge at hand at any time but that's
just rag but basically learn that and
use that memory for the foreseeable
future is a really powerful feature that
is being built into or is already built
into many agent Frameworks and that's
something I'm really excited about now
there's a lot of complexity there how
much do you store how do you write the
rules for when to forget something or do
you ever forget something how do you
change a memory businesses change all
the time so the memory has to evolve
with the business's needs and again all
of this is very early it's so raw right
now so just having the ability to give
it long-term and short-term memory and
using flow engineering and tools and all
of these things it's possible but I
think there's not really a tried and
true path yet people are still figuring
out what is the best combination what is
the optimal combination of you know
whatever we're talking about long-term
memory short-term memory tools number of
Agents different large language models
should you use different ones in the
same workflow there's so many cool
questions yet to be answered and so
that's it that's his whole talk uh this
was a great talk let me know what you
think in the comments if you liked this
video please consider giving a like and
subscribe and I'll see you in the next
one