AI Leader Reveals The Future of AI AGENTS (LangChain CEO)

Matthew Berman
2 May 2024 · 16:22

Summary

TLDR: The video discusses the current state and future of AI agents. Harrison Chase, CEO and founder of LangChain, stresses that agents are not just complex prompts but systems with capabilities such as tool use, memory, and planning. He notes that giving agents short-term and long-term memory, and letting them plan and take actions, can significantly improve their performance. Harrison also explores the importance of user experience (UX), the continued need for a "human in the loop," and how agent frameworks reduce hallucinations and improve reliability. Finally, he covers two kinds of agent memory, procedural and personalized, and why they are crucial to the next generation of agents. Throughout, the discussion highlights the key problems and challenges developers are working through to build production-ready, real-world agents.

Takeaways

  • 🤖 **The complexity of agents**: Agents are more than complex prompts; they can use tools, hold memory, and execute plans, going far beyond a simple language model prompt.
  • 🧠 **The importance of memory**: Agents have short-term and long-term memory, which is critical to performance; long-term memory, for example via RAG, stores information for later use.
  • 🛠️ **Tool use**: Agents can use a variety of tools, such as calendars, calculators, web access, and code interpreters, which greatly extend their capabilities.
  • 📈 **Performance gains**: Adding short-term and long-term memory has significantly improved agent performance.
  • 🔍 **Planning and action**: Agents can plan (self-critique, chain-of-thought decomposition, subgoal decomposition) and then execute actions, letting them complete tasks more effectively.
  • 🔁 **Looped iteration**: An agent can be viewed as a language model running in a loop, repeatedly asking what to do next and executing it until the task is done.
  • 🌐 **Developer focus**: Developers are pushing agents toward production-ready, real-world applications, with particular attention to planning, user experience, and memory.
  • 📐 **Flow engineering**: A well-designed flow graph or state machine can make agents more efficient; it requires human engineers to do the planning up front.
  • 🔮 **Looking ahead**: Future agents may require an entirely new architecture to achieve deeper logic and reasoning.
  • 🤝 **Coordination**: Agent frameworks are highly valuable for coordinating different models and tools, and they will remain essential even once models can think more slowly on their own.
  • 🔄 **Rewind and edit**: Reversibility and editability in the user interface, as demonstrated by Devin, improve the user experience and make agents more reliable.
  • 🧵 **Types of memory**: Agents need procedural memory (remembering the correct way to do a task) and personalized memory (facts about the user, used to personalize the experience).
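The "LLM in a loop" takeaway above can be sketched in a few lines of Python. This is a minimal illustration of the pattern, not LangChain's or Crew AI's actual API; `fake_llm` and the calculator tool are hypothetical stand-ins for a real model call and real tools.

```python
# Minimal agent loop: ask the model what to do, execute it, observe, repeat
# until the model decides it is done. `fake_llm` is a hypothetical stand-in.

def fake_llm(history):
    # A real LLM would choose the next action from the history of observations.
    if not any(line.startswith("observation:") for line in history):
        return ("calculator", "2+2")       # first step: use a tool
    return ("finish", "The answer is 4")   # second step: declare the task done

TOOLS = {"calculator": lambda expr: str(eval(expr))}

def run_agent(task, max_steps=5):
    history = [f"task: {task}"]
    for _ in range(max_steps):             # the "for loop" around the LLM
        action, arg = fake_llm(history)
        if action == "finish":
            return arg
        observation = TOOLS[action](arg)   # execute the chosen tool
        history.append(f"observation: {observation}")
    return "step budget exhausted"

print(run_agent("What is 2+2?"))
```

A real agent would build `history` from the model's own tool choices; the shape of the loop (ask, act, observe, repeat until the model says it is done) is the part that carries over.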

Q & A

  • Who is Harrison Chase?

    -Harrison Chase is the CEO and founder of LangChain, a popular coding framework that lets users easily plug different AI tools together.

  • What is LangChain?

    -LangChain is a developer framework for building all kinds of large language model (LLM) applications, and agents are one of the most common types built with it.

  • What are agents?

    -An agent uses a language model to interact with the external world. Agents are more than complex prompts: they can access tools such as calendars, calculators, and the web, have short-term and long-term memory, and can plan and take actions.

  • Why are agents more than just LLM prompts?

    -Because agents have capabilities beyond the LLM itself (tool use, memory, and planning) that let them carry out tasks far more complex than generating a text response.

  • What is planning in an agent?

    -Planning is an agent's ability to self-reflect, plan ahead, and break a complex task into subtasks, something large language models on their own cannot yet do reliably.

  • What are "reflection" and "Tree of Thoughts"?

    -Reflection lets a model generate an initial response to a prompt, then feeds that response back and asks how it could be improved; together with techniques like Tree of Thoughts, this gives a model the ability to self-reflect and plan.

  • What is "human in the loop" in agent frameworks?

    -Human in the loop means a human user can step in while an agent is executing a task to provide guidance or corrections, improving the agent's reliability and output quality.

  • Why are agent frameworks so valuable for coordinating different models and tools?

    -Agent frameworks help developers build the tools and strategies that coordinate different models and agents into a consistent workflow; even once models can think slowly and plan on their own, these frameworks will remain very valuable.

  • Why does the user experience (UX) of agents matter?

    -UX shapes how users interact with agents. Good design improves reliability and the overall experience; for example, a "rewind and edit" feature lets users roll an agent back to an earlier state and edit it.

  • What is the difference between an agent's short-term and long-term memory?

    -Short-term memory is memory within a conversation, or between agents in the same conversation; long-term memory stores information for later use, for example via retrieval-augmented generation (RAG). Long-term memory is crucial for personalization and for knowledge retention in enterprise settings.

  • Why is memory management so critical in agents?

    -Memory management underpins personalization and adaptability: an agent needs to remember the correct way to do things and facts about the user, while its memory also evolves and updates as business needs change.
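The reflection technique described in the Q&A (generate a response, feed it back, ask what could be better) can be sketched as follows. This is an illustrative stub, not the reflection paper's implementation; `fake_llm` is a hypothetical stand-in for a real model.

```python
# Reflection sketch: draft an answer, then repeatedly feed it back and ask
# the model what it would do better. `fake_llm` is a hypothetical stub.

def fake_llm(prompt):
    # Stub model: "improves" any draft it is asked to critique.
    if prompt.startswith("Improve this draft: "):
        return prompt.removeprefix("Improve this draft: ") + " (revised)"
    return "draft answer"

def reflect(question, rounds=2):
    answer = fake_llm(question)          # initial response
    for _ in range(rounds):              # self-critique passes
        answer = fake_llm(f"Improve this draft: {answer}")
    return answer

print(reflect("Explain agents"))
```

With a real model, the critique prompt would ask "what would you do better?" and the loop would stop once the model reports no further improvements.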

Outlines

00:00

🤖 The current state and future of agents

Harrison Chase, CEO and founder of LangChain, discussed the current state and future development of agents at a Sequoia event. He noted that agents are more than complex prompts: they use large language models to interact with the external world and can be equipped with tools such as calendars, calculators, and web access. Agents also have short-term and long-term memory and can plan and take actions. Harrison stressed that agent performance improves significantly once short-term and long-term memory are added. He also covered the importance of planning, including self-critique, chain of thought, and subgoal decomposition, and discussed how developers are making agents more production-ready and practical.

05:01

🧐 Agent user experience and memory

This segment discusses the importance of agent user experience (UX) and memory. Large language models are almost guaranteed to hallucinate, and agent frameworks help reduce hallucinations through methods such as caching and prompt libraries. Human involvement remains necessary because agents are not yet reliable enough. The segment covers how to strike a balance between automation and human intervention, how interface design can improve the experience, the importance of letting users rewind and edit an agent's actions, and the role of personalized memory in improving the user experience.

10:02

🔄 Agent planning and action

A deeper look at agents' planning ability and the importance of taking action. Harrison noted that current language models cannot yet plan fully reliably, but external prompting strategies and cognitive architectures can improve their performance. He raised the question of whether future agents will have these strategies built in, and whether an entirely new architecture is needed for models to reason logically and plan ahead. He also discussed the concept of flow engineering: making an agent's workflow explicit by designing a flow graph or state machine, and how this has become a key advantage of agent frameworks.

15:03

📚 Agents' long-term and short-term memory

This segment discusses how long-term and short-term memory matter for personalized experiences and for command of knowledge in enterprise settings. Harrison showed how memory can be built with an agent framework and how it helps agents learn and improve. He also noted the complexity of memory: how much to store, when to forget, and how memory must evolve as a business's needs change. Finally, he stressed that while there are no established best practices yet, giving agents memory and using flow engineering and tools are workable approaches today.

Keywords

💡Agents

In the video, agents are entities that use language models to interact with the external world. They are more than complex prompts: they have tools, memory, and planning capabilities; they can execute tasks, plan, hold short-term and long-term memory, and take actions. The concept of agents is the core of the video, which discusses how they work, their current state, and their future development.

💡LangChain

LangChain is a popular coding framework that lets users plug different AI tools together. It is a developer framework for building LLM (large language model) applications, especially agents. In the video, LangChain illustrates how a framework can be used to build agents and improve their performance.

💡Planning

Planning is a key capability when agents execute tasks; it involves reflection, self-critique, chain-of-thought decomposition, and subgoal decomposition. The video discusses planning as one of the keys to improving agent performance: it lets an agent break a complex task into subtasks and plan its steps in advance.

💡User Experience

User experience is critical in agent applications. The video discusses improving it through well-designed interfaces and interaction flows, for example a "rewind and edit" feature that lets users roll back to a point in an agent's run and edit it, improving both the agent's reliability and the user's control.
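One plausible way to implement the "rewind and edit" UX described above is to checkpoint the agent's state after every step, so the user can roll back to any point, edit, and continue. A minimal sketch; this is not Devin's or LangChain's actual mechanism.

```python
import copy

# Checkpoint the agent's state after every step so a user can rewind to any
# earlier point, edit the trajectory, and continue from there.

class RewindableAgent:
    def __init__(self):
        self.state = {"steps": []}
        self.checkpoints = [copy.deepcopy(self.state)]      # checkpoint 0: empty

    def step(self, action):
        self.state["steps"].append(action)
        self.checkpoints.append(copy.deepcopy(self.state))  # snapshot each step

    def rewind(self, index):
        # Roll back to checkpoint `index`; everything after it is discarded.
        self.state = copy.deepcopy(self.checkpoints[index])
        self.checkpoints = self.checkpoints[: index + 1]

agent = RewindableAgent()
agent.step("open browser")
agent.step("run wrong command")    # a step the user wants to undo
agent.rewind(1)                    # back to the state after "open browser"
agent.step("run correct command")  # edit the trajectory and continue
print(agent.state["steps"])
```

Deep copies matter here: each checkpoint must be an independent snapshot, or editing the live state would silently corrupt the history the user rewinds to.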

💡Memory

Agent memory divides into short-term and long-term. Short-term memory covers memory within a conversation, while long-term memory involves techniques such as RAG (Retrieval-Augmented Generation) that save information for future use. The video highlights memory as a key factor in agent performance: it lets agents remember user preferences and business processes and provide personalized service.
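A toy version of the long-term memory described here: save facts now, retrieve the relevant ones later. A real RAG setup would use embeddings and a vector store; this sketch substitutes naive keyword overlap to keep the idea visible.

```python
# Toy long-term memory: remember facts now, recall the relevant ones later.
# Real RAG uses embeddings and a vector store; keyword overlap stands in here.

class MemoryStore:
    def __init__(self):
        self.facts = []

    def remember(self, fact):
        self.facts.append(fact)

    def recall(self, query, k=2):
        # Score each stored fact by how many words it shares with the query.
        q = set(query.lower().split())
        scored = [(len(q & set(f.lower().split())), f) for f in self.facts]
        scored.sort(key=lambda pair: -pair[0])
        return [f for score, f in scored[:k] if score > 0]

memory = MemoryStore()
memory.remember("the user likes italian food")
memory.remember("the user went to a cooking class")
memory.remember("the deploy script lives in ci/deploy.sh")

print(memory.recall("what food does the user like"))
```

The stored facts echo the video's journaling example (cooking class, Italian food); swapping the scoring function for an embedding similarity turns this into the usual RAG retrieval step.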

💡Tool Usage

Tool usage means an agent can employ a variety of tools to carry out tasks, such as accessing a calendar, a calculator, the web, or a code interpreter. Integrating these tools dramatically extends what agents can do. In the video, tool usage is a major part of why agents are more than large language model prompts.

💡Taking Actions

Taking actions is the part of an agent's capability that lets it actually execute tasks after planning. In the video this is closely tied to planning, illustrating that agents do not just generate responses; they can act to complete tasks.

💡Human in the Loop

Human in the loop refers to human intervention and feedback while an agent executes a task. It is essential for reliability and quality, especially in large enterprises that must guarantee the quality of deliverables. The video discusses finding the right balance of human involvement, avoiding both over-automation and excessive manual intervention.
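The balance described above is often implemented as an approval gate: the agent runs routine steps autonomously and pauses for a human decision only when it is about to produce a deliverable. A hypothetical sketch, with a stubbed `reviewer` function standing in for the human:

```python
# Human-in-the-loop sketch: run routine steps automatically, but gate every
# deliverable behind a human decision. `reviewer` is a stub for that human.

def run_with_approval(steps, reviewer):
    completed = []
    for name, is_deliverable in steps:
        if is_deliverable and not reviewer(name):   # human gate, deliverables only
            completed.append((name, "rejected"))
            continue
        completed.append((name, "done"))
    return completed

def reviewer(step_name):
    # Stub human: approves everything except the press release.
    return step_name != "publish press release"

steps = [
    ("gather data", False),           # routine: runs without review
    ("draft report", True),           # deliverable: human approves
    ("publish press release", True),  # deliverable: human rejects
]
print(run_with_approval(steps, reviewer))
```

Gating only on `is_deliverable` is one answer to the balance problem the video raises: too many gates and the automation disappears, too few and unreviewed output reaches the organization.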

💡Flow Engineering

Flow engineering is the design of the flow an agent follows to execute a task, including its planning and execution steps. The video presents it as a way to improve agent performance by explicitly designing a flow graph or state machine that optimizes the agent's workflow.

💡Large Language Models (LLMs)

Large language models are the foundation of agents, providing the language capability through which agents interact with the external world. The video discusses how agent frameworks can strengthen LLM performance, and how planning and tool use improve the quality of LLM output.

💡Crew AI

Crew AI is mentioned in the video as an example of an agent framework that provides short-term and long-term memory, and that showed a significant performance improvement after adding these features. Crew AI represents the current direction of agent frameworks: improving overall agent performance by strengthening memory and planning.

Highlights

Harrison Chase, CEO and founder of LangChain, gave a talk on agents at a Sequoia event, discussing their current state and future direction.

Agents are more than complex prompts: they can access tools such as calendars, calculators, and the web, and they have short-term and long-term memory.

Agents can plan, including self-critique, chain-of-thought decomposition, and executing subgoals.

Crew AI released an agent framework with short-term and long-term memory, significantly improving agent performance.

Agents can run an LLM in a loop, repeatedly asking what to do next and executing it to complete a task.

Developers are exploring how to make agents more reliable and practical in production.

Planning is one of the most exciting areas of agent development; it lets models plan ahead and decompose complex tasks.

Techniques such as reflection and tree of thoughts can improve a model's planning and thinking ability.

Agent frameworks let you extract more quality and performance from large language model prompts.

A new architecture may be needed before models can properly reason, apply logic, and plan ahead.

Developers may need to build these tools and strategies themselves until models can think slowly and plan natively.

Flow engineering is another important aspect of agent development, involving the design of workflows and state machines.

Agent frameworks help with flow engineering, going beyond prompt engineering alone.

User experience (UX) is a key area for agent applications; the right way to interact with agents is still being worked out.

Human in the loop is necessary because agents are not yet reliable enough, but too much human intervention removes the value of automation.

Devin and Devika demonstrated strong interface design, showing all operations in a single view.

Rewind-and-edit is an important UX feature, letting users go back to a past state and edit it.

Agent memory includes procedural memory and personalized memory, which are crucial for personalized experiences and for learning in enterprise settings.

Agent frameworks are building, or have already built, long-term and short-term memory, which greatly helps with enterprise knowledge and personalized experiences.

Agent development is still in its early stages; many questions, such as how much memory to store, what rules to set, and how to adapt to changing businesses, are still being explored.

Transcripts

00:00

[Matthew] Let's talk about agents. Harrison Chase, the CEO and founder of LangChain, did a talk at the Sequoia event that I made another video on a couple weeks ago, where Andrew Ng also gave a talk. Harrison's talk is also about agents: the current state of agents, what to expect from agents in the future, where they work really well and where they don't. So let's watch it together and I'll comment on it as we go.

00:24

A quick note before I get to the video: if you want a chance to win a Rabbit R1, all you need to do is subscribe to my newsletter, get awesome AI updates twice a week, and stay up to date on the world of AI. I'll drop the link to subscribe in the description below, so check it out, subscribe to my newsletter, and maybe you can win this Rabbit R1. Now back to the video.

00:47

For those of you who are not familiar with Harrison, he is, as I mentioned, the co-founder and CEO of LangChain. If you haven't heard of LangChain, let me tell you quickly what they do: LangChain is a super popular coding framework that allows you to take a bunch of different AI tools and plug them all together really easily (the "chain" part), and really this was agents before agents had a term. So of course Harrison is incredibly knowledgeable about agents. Now let's watch the video.

01:16

[Harrison] Thanks for the intro, and thanks for having me; excited to be here. Today I want to talk about agents. LangChain is the developer framework for building all types of LLM applications, but one of the most common ones we see being built is agents. We've heard a lot about agents from a variety of speakers before, so I'm not going to go into too deep an overview, but at a high level it's using a language model to interact with the external world.

01:45

[Matthew] All right, I actually want to stop it right away. One thing that I've heard quite a lot, less so lately now that agents have really become mainstream, is that agents are just prompts, just complex prompts. That's not necessarily true, and even if it were, there's so much going on around the prompt, and that is what makes agents so special. This is a great graph for understanding what's going on with agents. You can think of the large language model as one little piece, the agent itself. Then you can give that agent tools, so it can have access to your calendar, to a calculator, to the web; it can do code interpreter, which means it can actually spin up environments and write and run code; and basically there's an unlimited number of tools you can give agents. Then we give agents memory, both short-term and long-term. Short-term memory means memory within a conversation or between agents in the same conversation, and long-term memory is something like RAG, retrieval-augmented generation: saving information to be used later. Crew AI, my favorite agent framework, just released both short-term and long-term memory and has shown that agent performance has significantly improved since adding these features. Agents can also do planning, which is reflection, self-critique, chain of thought, and subgoal decomposition, and then they can also perform actions. With all of these additional superpowers, the agent becomes so much more than "just a large language model prompt." We're going to touch on planning in a moment, because Harrison says something really interesting about it, so let's keep watching.

03:19

[Harrison] Tool usage, memory, planning, taking actions: that's kind of the high-level gist. The simple form of this you can maybe think of as just running an LLM in a for loop. You ask the LLM what to do, you go execute that, then you ask it what to do again, and you keep doing that until it decides it's done. Today I want to talk about some of the areas that I'm really excited about, where we see developers spending a lot of time, really taking this idea of an agent and making it something production-ready and real-world: really, the future of agents, as the title suggests. There are three main things I want to talk about, and we've actually touched on all of these in some capacity already, so I think it's a great roundup: planning, the user experience, and memory. For planning, Andrew covered this really nicely in his talk, but the basic idea is that if you think about running the LLM in a for loop, oftentimes there are multiple steps it needs to take.

04:21

[Matthew] I'm going to pause there for a second. I've already done a video all about the Tree of Thoughts paper, which is incredible, so be sure to check it out; I'll link it in the description below. I haven't actually done a review of the reflection paper, but the gist is that you allow a model to generate an initial response to a prompt, and then you simply feed it back and say, "hey, what would you do better?" That's the very simple explanation of what it does, but essentially we're giving the models the ability to reflect, to plan ahead, to break complex tasks down into subtasks, and that's something the models alone can't do yet. I've made a few videos about Q*, which has to do with giving the models the ability to plan and look ahead, but that's not something we get to play with today. In fact, just today I released a video about the gpt2-chatbot large language model that was mysteriously released on LMSYS and then just recently taken down. A lot of people think it's a model with the ability to power agents, because it shows more planning ability than anything we've ever seen. I haven't necessarily seen that myself, but I also wasn't explicitly testing for it. The point is: agents and agent frameworks allow you to extract so much more quality and performance out of just a large language model prompt.

05:45

[Harrison] When you're running it in a for loop, you're implicitly asking it to reason and plan about what the best next step is, see the observation, and then resume from there and think about the next best step after that. Right now, language models aren't really good enough to do that reliably, so we see a lot of external papers and prompting strategies enforcing planning in some way, whether that's planning steps explicitly up front or reflection steps at the end to check whether everything was done correctly.

06:23

[Matthew] I've actually made a video about models that were explicitly trained to, quote-unquote, think slowly, and Orca is a great example. Orca is a project out of Microsoft that teaches the model how to think slowly and use a lot of these techniques, whether reflection, tree of thoughts, or other slow-thinking techniques, automatically, without us having to prompt or code around the model to make it do that.

06:52

[Harrison] I think the interesting thing, thinking about the future, is whether these types of prompting strategies and cognitive architectures continue to be things that developers build, or whether they get built into the model APIs, as we heard Sam talk a little bit about. That's really still an open question.

07:11

[Matthew] And I'm not sure either. My guess is it's going to take a new architecture, something completely new beyond the Transformer architecture, to allow these models to really apply logic and reason properly, to plan ahead, to think slowly. That's just not what they do today. Maybe that's what GPT-5 is going to be, maybe that's what Q* is, but I haven't seen any evidence that we actually have a large language model that can do it. So for now, developers are going to have to build these tools and strategies themselves, which is fine, because companies like Crew AI make it really easy. And even when models can think more slowly and have these abilities inherently, agent frameworks are still going to be very valuable: for coordinating different models, coordinating different agents, giving them different tools, and coordinating a very consistent workflow.

08:08

[Harrison] For all three of these, to be clear, I don't have answers, just questions. One of my questions here is: are these planning prompting approaches short-term hacks or long-term necessary components?

[Matthew] Actually, let me know what you think in the comments. Do you think these prompting techniques, reflection and tree of thoughts, are short-term hacks, and eventually the models will just be able to do this without external techniques? Or are these techniques we're going to need forever?

08:38

[Harrison] Another aspect of this is the importance of flow engineering. This is a term I heard come out of the AlphaCodium paper, which achieves state-of-the-art coding performance not necessarily through better models or better prompting strategies, but through better flow engineering: explicitly designing this kind of graph or state-machine-type thing. One way to think about it is that you're offloading the planning of what to do to the human engineers, who do it at the beginning, and you're relying on that as a bit of a crutch.

09:08

[Matthew] All right, that's a really good point, and again, that's why I'm so bullish on agent frameworks: they help you with the flow engineering piece. Beyond just prompt engineering, now we're talking about flow engineering, and that's a whole separate art and science in itself. It's still very early days. We're still trying to figure out what types of flows work well, how many agents work well together, whether there's a maximum or a minimum, how they should plan, and what steps they should execute. So it's still early, and it's still really fun to watch.

09:39

[Harrison] The next thing I want to talk about is the UX of a lot of agent applications. This is actually one area I'm really excited about. I don't think we've nailed the right way to interact with these agent applications. I think human in the loop is still necessary, because they're not super reliable.

09:57

[Matthew] I want to talk about human in the loop. I work with large companies, helping them with their AI strategy, and consistency, reliability, and quality are insanely important to them. When you're talking about large language models, hallucinations are almost guaranteed, so how do you avoid them? There are a few ways. Again, agent frameworks help you reduce hallucinations through things like caching, prompt libraries, and obviously reducing the temperature of the large language model, but also through human in the loop. That's really important, especially to large enterprise companies, and I don't think human in the loop is going away anytime soon. But as Harrison says, if you have too much human in the loop, you're basically removing all of the automation, and there's a fine balance around where you actually need it. I think it's essentially wherever you have a deliverable: wherever the agents produce something substantial that will be delivered and relied upon within the organization. That's something I'm still experimenting with: what the optimal human-in-the-loop strategy is.

11:06

[Harrison] But if the human is in the loop too much, then the agent isn't actually doing much that's useful, so there's kind of a weird balance there. One UX thing that I really like is from Devin, which came out a week or two ago.

11:19

[Matthew] Speaking of Devin: for all the virality they had, and all the dunking, with people calling it out for doing less than what they showed in the demo, the UX is fantastic. Shortly after Devin we had Devika and OpenDevin, so obviously they did something right with showing all of the screens, the browser, the chat window, the terminal, and the code in one view. That was clearly a really powerful UI, because it was copied, and a lot of people like it.

11:49

[Harrison] I think this was immediately one of the big contributions of the Devin demo: everybody realized this is a great way to structure the user interface. And Jordan put this nicely on Twitter: the presence of a rewind-and-edit ability. You can go back to a point in time where the agent was, and then edit what it did, or edit the state it's in, so that it can make a more informed decision. I think this is a really powerful UX that we're excited about at LangChain and exploring more, and I think it brings a little more reliability, but at the same time steering ability, to the agents.

12:28

[Matthew] Let's talk about being able to rewind and change things. I agree this is a really incredible user experience, because there are times where you go off down a path and find it was not the right thing to do, so let's go back and start from this state. I've seen one project do this incredibly well. They've actually been a sponsor of this channel, but it comes to mind because they really do it so well: Pythagora, the AI coding assistant I've shown you before. Pythagora can rewind to any step along the entire journey of a project, and you can start from there, edit it, and continue. Really cool, and that's kind of what Devin does, and it's also what Harrison is talking about. I think it's going to be a very strong piece of agent coordination, and I can't wait until all of the agent frameworks build it in.

13:20

[Harrison] Speaking of steering ability, the last thing I want to talk about is the memory of agents. Mike from Zapier showed this off a little earlier, where he was interacting with the bot, teaching it what to do and correcting it. This is an example where I'm teaching an AI, in a chat setting, to write a tweet in a specific style. You can see that I'm just correcting it in natural language to get to a style I want, and then I hit thumbs up. The next time I come back to this application, it remembers the style I want, but I can keep editing it, making it a little more differentiated, and when I come back a third time it remembers all of that. This I would classify as procedural memory: remembering the correct way to do something. Another really important aspect is personalized memory: remembering facts about a person that you might not use to do something more correctly, but that you might use to make the experience more personalized. This is an example journaling app that we're building and playing around with to explore memory. You can see that I mentioned I went to a cooking class, and it remembers that I like Italian food. I think bringing in these personalized aspects, whether procedural or these personalized facts, will be really important for the next generation of agents. That's all I have.

14:35

[Matthew] So that is both long-term and short-term memory. In the short term, you should be able to go back and forth with an agent, or let agents go back and forth with each other, so they can learn and improve along the way; that might also be where human in the loop comes in, because you can steer them. Then we have long-term memory, which is also really important, not only for personalization but also in the context of businesses and enterprise. The ability for these agents to learn things (having the company's knowledge at hand at any time is just RAG, but actually learning and using that memory for the foreseeable future) is a really powerful feature that is being built into, or is already built into, many agent frameworks, and that's something I'm really excited about. Now, there's a lot of complexity there. How much do you store? How do you write the rules for when to forget something, or do you ever forget? How do you change a memory? Businesses change all the time, so the memory has to evolve with the business's needs. And again, all of this is very early; it's so raw right now. Just giving an agent long-term and short-term memory, and using flow engineering and tools, is possible today, but there isn't really a tried-and-true path yet. People are still figuring out the optimal combination of long-term memory, short-term memory, tools, number of agents, and different large language models, and whether you should use different ones in the same workflow. There are so many cool questions yet to be answered.

16:13

And that's it; that's his whole talk. This was a great one. Let me know what you think in the comments, and if you liked this video, please consider giving it a like and subscribing. I'll see you in the next one.


Related Tags
agents, language models, tools, coordination, memory, planning, user experience, automation, large enterprises, AI strategy, reliability, human in the loop, interface design, personalized memory