What's next for AI agentic workflows ft. Andrew Ng of AI Fund

Sequoia Capital
26 Mar 2024 · 13:40

Summary

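TL;DR: In this talk, Andrew Ng shares his current view of AI, focusing on agentic workflows. He argues that iterative, agentic workflows beat the usual non-agentic, single-pass way of using a model, and uses a case study to show that wrapping GPT-3.5 in an agentic workflow raises its code-generation accuracy. He then introduces four agent design patterns (reflection, tool use, planning, and multi-agent collaboration) and predicts that they will significantly expand what AI can do and how productive it makes us. The talk offers both practical technical guidance and a view of where the field is heading.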

Takeaways

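  • 🌟 Andrew Ng is a well-known computer science professor at Stanford who worked very early on neural networks with GPUs. He is also a co-founder of Coursera, the creator of deeplearning.ai, and a founder and early lead of Google Brain.
  • 📝 A non-agentic workflow asks the model to finish a task in a single pass; an agentic workflow is iterative and improves the result through repeated thinking and revision.
  • 🚀 Wrapped in an agentic workflow, even GPT-3.5 can outperform zero-shot GPT-4 on some tasks, which shows how much the workflow matters.
  • 🔍 In the coding case study (the HumanEval benchmark), zero-shot GPT-3.5 gets 48% right and zero-shot GPT-4 gets 67%, yet GPT-3.5 with an agentic workflow does better than zero-shot GPT-4.
  • 🛠️ Agentic workflows can deliver a quick productivity boost, and they fall into four broad design patterns: reflection, tool use, planning, and multi-agent collaboration.
  • 🤔 Reflection is a powerful technique: asking the model to check and correct the code it just generated can improve the code's quality and efficiency.
  • 🔗 Tool use lets a model combine external tools and resources to complete a task, which expands what a large language model can do.
  • 📈 Planning lets an AI agent work out its own sequence of steps and re-plan when a step fails, which makes problem solving more flexible.
  • 🤖 Multi-agent collaboration simulates several expert roles interacting with each other and can produce surprisingly complex solutions.
  • 💡 Fast token generation matters for agentic workflows because it lets the model iterate quickly, which improves overall performance.
  • 🌐 Agentic workflows will significantly expand what AI can do, and we will need to get used to letting agents work on their own while we wait patiently for the result.
  • 🚀 Agentic reasoning design patterns are an important trend and may help us take a small step on the long journey toward artificial general intelligence (AGI).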

Q & A

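  • What has Andrew Ng contributed to the field of AI?

    - Andrew Ng is a well-known computer science professor at Stanford who worked early on the development of neural networks with GPUs. He is a co-founder of Coursera and the creator of popular courses such as those from deeplearning.ai. He is also a founder and early lead of Google Brain.

  • What is a non-agentic workflow?

    - A non-agentic workflow is the usual way of using a large language model (LLM): the user types a prompt and the model generates an answer. It is like asking a person to write an entire essay from start to finish without ever using the backspace key.

  • How does an agentic workflow differ from a non-agentic one?

    - An agentic workflow is more iterative. The AI does some thinking, revises the draft, and may repeat this several times. This gives better results because the AI can reflect on and correct what it has produced.

  • What is the benefit of an agentic workflow?

    - An agentic workflow can significantly improve the performance of a large language model. In the case study from the talk, GPT-3.5 with an agentic workflow outperformed the more advanced GPT-4 used zero-shot on the HumanEval coding benchmark.

  • What is reflection?

    - Reflection is a design pattern in which an AI system is prompted to perform a task and is then prompted again to check its own output for correctness, efficiency, and good construction. This helps the AI find and fix its own mistakes.

  • How does multi-agent collaboration work?

    - Multi-agent collaboration has several AI agents work on a task together. Each agent plays a different role, for example one writes code and another reviews it. Working together this way, the agents can improve both the efficiency and the quality of the result.

  • What role does planning play?

    - Planning lets an AI agent break a complex request into steps, choose a model or tool for each step, and carry them out in order. When a step fails, the agent can re-plan and route around the failure to reach the goal.

  • Why does fast token generation matter in agentic workflows?

    - An agentic workflow iterates many times, and the model generates a large number of tokens for itself to read and process. Generating those tokens quickly therefore makes the whole loop much faster.

  • Where is AI heading?

    - Future progress is likely to centre on agentic workflows and agentic reasoning design patterns. These patterns help us use AI more effectively, raise productivity, and may be a small step on the long road to artificial general intelligence (AGI).

  • Why do we need to get used to waiting for AI responses?

    - An agentic workflow may need many iterations, so we have to learn to wait patiently for the response. That can take minutes or even hours, rather than the instant feedback we expect from a web search.

  • How can AI agents be made more effective?

    - By applying the design patterns from the talk (reflection, tool use, planning, and multi-agent collaboration) and by using fast token generation so the agent can iterate more times.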

Outlines

00:00

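🤖 Agentic workflows and design patterns

This section introduces agentic workflows and their design patterns. It first contrasts them with non-agentic workflows to show why the iterative approach works better. A case study then shows that GPT-3.5 with an agentic workflow beats zero-shot GPT-4 on a coding benchmark. The section goes on to introduce four main design patterns (reflection, tool use, planning, and multi-agent collaboration), which matter for both AI performance and productivity. It closes by stressing the importance of fast token generation and looking ahead to future models.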

05:01

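🔄 Reflection and self-evaluation in AI coding

This section looks at the reflection design pattern as applied to coding. Having an AI system evaluate and correct its own code can noticeably improve the code's quality and efficiency. For example, the system is given back the code it just generated and asked to check it and suggest improvements; that feedback is then used to prompt a revised, better version of the code.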
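The reflection loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's implementation: the `LLM` callable and `stub_llm` are hypothetical stand-ins for a real model client, and the prompt wording is an assumption.

```python
from typing import Callable

LLM = Callable[[str], str]  # hypothetical interface: prompt in, text out


def reflect_and_revise(task: str, llm: LLM, rounds: int = 1) -> list[str]:
    """Return every code version produced: v1, then one revision per round."""
    versions = [llm(f"Write code for this task: {task}")]
    for _ in range(rounds):
        # Hand the model its own output and ask it to criticise it.
        critique = llm(
            "Here is code intended for the task below. Check it carefully for "
            "correctness and efficiency, and say how to improve it.\n"
            f"Task: {task}\nCode:\n{versions[-1]}"
        )
        # Re-prompt with the critique to obtain the next version.
        versions.append(llm(
            "Rewrite the code using this feedback.\n"
            f"Task: {task}\nCode:\n{versions[-1]}\nFeedback:\n{critique}"
        ))
    return versions


def stub_llm(prompt: str) -> str:
    """Canned replies so the sketch runs offline."""
    if prompt.startswith("Write code"):
        return "def add(a, b): return a - b"
    if prompt.startswith("Here is code"):
        return "Bug: uses '-' where '+' is intended."
    return "def add(a, b): return a + b"


versions = reflect_and_revise("add two numbers", stub_llm)
print(versions)
```

With a real model the second version is not guaranteed to be better, so in practice it may be worth keeping every version and comparing them.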

10:04

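🔧 Tool use and multi-agent collaboration

This section covers two design patterns: tool use and multi-agent collaboration. Tool use extends what an AI can do by letting it call external tools and platforms, for example generating and running code or searching the web to complete a complex task for the user. Multi-agent collaboration has several AI agents cooperate on a problem. For example, agents prompted to act as CEO, designer, product manager, and tester can hold an extended conversation and develop software together. Both patterns show how flexible AI can be on complex tasks.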

Keywords

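💡Neural networks

A neural network is a computational model loosely inspired by the way neurons in the brain connect, used to recognize patterns and process complex data. The video mentions Andrew Ng's early work on developing neural networks with GPUs.

💡Deep learning

Deep learning is a branch of machine learning that learns high-level features of data by building and training multi-layer neural networks. The video mentions deeplearning.ai, whose popular online courses focus on deep learning education.

💡AI agents

An AI agent is an AI system that can carry out tasks, make decisions, and interact with its environment with some autonomy. The video stresses the importance of agents for the future of AI and discusses how agentic workflows improve performance.

💡Iterative workflow

An iterative workflow is a process of repeated, gradual improvement in which each step refines the result of the previous one. In the video, the model iterates on its own output and feedback to complete tasks more accurately.

💡Self-reflection

Self-reflection is the process of evaluating and analysing one's own actions, ideas, or work. For AI, it means the system checks and assesses its own output in order to improve accuracy and performance.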
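The talk also mentions strengthening reflection with unit tests: run the generated code, and if a test fails, feed the failure back to the model. The sketch below illustrates that idea under stated assumptions: `LLM` and `stub_llm` are hypothetical stand-ins for a model client, and `exec` is used only because the demo code is trusted.

```python
from typing import Callable

LLM = Callable[[str], str]  # hypothetical interface: prompt in, text out


def run_unit_tests(code: str, func_name: str, cases: list[tuple[tuple, object]]) -> str:
    """Return '' if every case passes, otherwise a short failure message."""
    namespace: dict = {}
    try:
        exec(code, namespace)  # acceptable for a demo; sandbox untrusted code
        func = namespace[func_name]
        for args, expected in cases:
            got = func(*args)
            if got != expected:
                return f"{func_name}{args} returned {got!r}, expected {expected!r}"
    except Exception as exc:
        return f"raised {exc!r}"
    return ""


def code_with_test_feedback(task, llm: LLM, func_name, cases, max_attempts=3):
    """Ask for code, run the tests, and feed any failure back until they pass."""
    prompt = f"Write code for this task: {task}"
    code, attempt, passed = "", 0, False
    for attempt in range(1, max_attempts + 1):
        code = llm(prompt)
        failure = run_unit_tests(code, func_name, cases)
        if not failure:
            passed = True
            break
        prompt = (f"Your code failed a unit test: {failure}\n"
                  f"Why did it fail? Fix it.\nCode:\n{code}")
    return code, attempt, passed


def stub_llm(prompt: str) -> str:
    """Canned replies: a buggy first draft, then a fix once a failure is reported."""
    if prompt.startswith("Your code failed"):
        return "def add(a, b):\n    return a + b"
    return "def add(a, b):\n    return a - b"


code, attempts, passed = code_with_test_feedback(
    "add two numbers", stub_llm, "add", [((2, 3), 5)]
)
print(attempts, passed)
```

The test result gives the model an objective signal, which is why this variant tends to be more dependable than asking the model to critique code by reading it alone.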

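💡Multi-agent collaboration

Multi-agent collaboration means several AI systems or agents work together on a shared task. Because each agent contributes from a different role or point of view, the collaboration can solve problems more effectively.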
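The simplest version mentioned in the talk is a coder agent paired with a critic agent, both backed by the same base model but prompted differently. The sketch below is one way that could look; `ChatLLM`, `Agent`, the "APPROVE" convention, and `stub_llm` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical interface: (system_prompt, user_message) -> reply
ChatLLM = Callable[[str, str], str]


@dataclass
class Agent:
    name: str
    system_prompt: str
    llm: ChatLLM

    def reply(self, message: str) -> str:
        return self.llm(self.system_prompt, message)


def coder_critic_loop(task: str, coder: Agent, critic: Agent, max_rounds: int = 3):
    """Alternate coder and critic turns until the critic approves or rounds run out."""
    code = coder.reply(f"Task: {task}")
    transcript = [(coder.name, code)]
    for _ in range(max_rounds):
        review = critic.reply(f"Task: {task}\nCode:\n{code}")
        transcript.append((critic.name, review))
        if review.strip().upper().startswith("APPROVE"):
            break
        code = coder.reply(f"Task: {task}\nCode:\n{code}\nReview:\n{review}")
        transcript.append((coder.name, code))
    return code, transcript


def stub_llm(system_prompt: str, message: str) -> str:
    """One canned 'model' that behaves differently depending on its role prompt."""
    if "reviewer" in system_prompt:
        return "APPROVE" if "sorted(" in message else "Please use sorted() here."
    if "Review:" in message:
        return "def top(xs): return sorted(xs)[-1]"
    return "def top(xs): return xs[-1]"


coder = Agent("coder", "You are an expert coder. Write code.", stub_llm)
critic = Agent("critic", "You are an expert code reviewer. Review this code.", stub_llm)
code, transcript = coder_critic_loop("return the largest item", coder, critic)
print(code)
```

Keeping the roles as separate `Agent` objects makes it straightforward to swap in a different model for the critic later.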

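💡Planning

Planning means working out a sequence of steps that achieves a specific goal. For AI agents, planning lets the agent decide which steps and tools a request needs and adjust the plan when a step fails.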
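The talk's planning example (pose extraction, then pose-to-image, image-to-text, and text-to-speech) can be sketched as a plan executor that routes around a failing tool. In this sketch the plan is hard-coded where a real agent would have an LLM produce it, and every tool name and function is hypothetical.

```python
from typing import Callable

Tool = Callable[[str], str]


def execute_plan(plan: list[list[str]], tools: dict[str, Tool], request: str):
    """Run steps in order; each step lists candidate tools, tried until one succeeds."""
    data, log = request, []
    for candidates in plan:
        for name in candidates:
            try:
                data = tools[name](data)
                log.append(f"{name}: ok")
                break
            except Exception as exc:
                # Record the failure and fall through to the next candidate.
                log.append(f"{name}: failed ({exc})")
        else:
            raise RuntimeError(f"no tool succeeded for step {candidates}")
    return data, log


def unavailable(_: str) -> str:
    raise TimeoutError("model unavailable")


# Hypothetical tools; real ones would call vision and speech models.
tools = {
    "pose_a": unavailable,
    "pose_b": lambda x: f"pose({x})",
    "pose_to_image": lambda x: f"image({x})",
    "image_to_text": lambda x: f"caption({x})",
    "text_to_speech": lambda x: f"audio({x})",
}
plan = [["pose_a", "pose_b"], ["pose_to_image"], ["image_to_text"], ["text_to_speech"]]

result, log = execute_plan(plan, tools, "example.jpg")
print(result)
print(log)
```

Listing alternatives per step is only one way to recover from failure; an agent could instead ask the model to re-plan from the point where the step failed.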

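💡Tool use

Tool use means a large language model (LLM) calls external tools to do things it cannot do through text generation alone. In the video, LLMs generate and run code and perform web searches, which expands the range of tasks they can handle.

💡Fast token generation

Fast token generation is the ability of a model to produce output tokens quickly. It matters for agentic workflows because the model reads its own output over and over, so faster generation means more iterations in the same amount of time.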
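A quick back-of-the-envelope calculation shows why generation speed matters for an iterative loop. The figures below (20 and 200 tokens per second, 1,000 tokens per iteration, a 60-second budget) are illustrative assumptions, not measurements from the talk.

```python
def iterations_within_budget(tokens_per_second: float,
                             tokens_per_iteration: int,
                             budget_seconds: float) -> int:
    """How many full generate-and-review iterations fit in the time budget."""
    seconds_per_iteration = tokens_per_iteration / tokens_per_second
    return int(budget_seconds // seconds_per_iteration)


slow = iterations_within_budget(20, 1000, 60)    # 50 s per iteration
fast = iterations_within_budget(200, 1000, 60)   # 5 s per iteration
print(slow, fast)  # prints: 1 12
```

Under these assumed numbers the faster model gets twelve passes around the loop where the slower one gets a single pass, which is the trade-off the talk describes between speed and raw model quality.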

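💡Agentic reasoning design patterns

Agentic reasoning design patterns are the strategies and methods used when building and using AI agents in order to improve their reasoning and decision making. These patterns help AI handle complex tasks and problems more effectively.

💡Multi-agent debate

Multi-agent debate has different AI agents discuss and argue over a question in order to improve the quality and completeness of the final answer. It simulates the interaction of different viewpoints to reach a better solution.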
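One simple way to implement a debate is to let each agent answer, show every agent the others' answers for a few rounds, and then take a majority vote. The sketch below assumes that scheme; the `Debater` interface and both stub agents are hypothetical, and real debates would involve model calls rather than fixed rules.

```python
from collections import Counter
from typing import Callable

# Hypothetical interface: (question, other agents' latest answers) -> answer
Debater = Callable[[str, list[str]], str]


def debate(question: str, agents: list[Debater], rounds: int = 2):
    """Each round, every agent sees its peers' answers and may change its own."""
    answers = [agent(question, []) for agent in agents]
    for _ in range(rounds):
        answers = [
            agent(question, answers[:i] + answers[i + 1:])
            for i, agent in enumerate(agents)
        ]
    winner = Counter(answers).most_common(1)[0][0]
    return winner, answers


def steady(question: str, others: list[str]) -> str:
    """Stub agent that always gives the same answer."""
    return "4"


def swayable(question: str, others: list[str]) -> str:
    """Stub agent that starts out wrong but adopts its peers' answer if they agree."""
    if others and len(set(others)) == 1:
        return others[0]
    return "5"


winner, final_answers = debate("What is 2 + 2?", [steady, steady, swayable])
print(winner, final_answers)
```

Majority voting is just one aggregation rule; a separate judge agent that reads the whole exchange is another common choice.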

Highlights

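Andrew Ng is a well-known computer science professor at Stanford who worked early on the development of neural networks with GPUs.

Andrew Ng is a co-founder of Coursera and the creator of popular courses such as those from deeplearning.ai.

Andrew Ng is a founder and early lead of Google Brain.

AI agents are an exciting trend that everyone building in AI should pay attention to.

In a non-agentic workflow, people use a language model to generate an answer in one pass, with no iteration.

An agentic workflow improves the result through iteration and feedback and can deliver remarkably better results.

With an agentic workflow, GPT-3.5 outperforms zero-shot GPT-4 on the HumanEval coding benchmark.

Reflection is a powerful technique that lets a language model check and improve its own code.

Planning and multi-agent collaboration are emerging techniques that sometimes produce surprisingly good results.

A multi-agent system can simulate collaboration between different roles, such as CEO, designer, product manager, and tester.

Agentic reasoning design patterns will be an important trend in AI.

Fast token generation matters for agentic workflows because it speeds up iteration.

We should expect AI agents to work autonomously over longer periods rather than only responding instantly.

Multi-agent debate can improve the performance of AI systems and is a powerful design pattern.

Agentic workflows could help us take a small step forward on the long journey toward artificial general intelligence (AGI).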

Transcripts

00:03

All of you know Andrew Ng as a famous computer science professor at Stanford. He was really early on in the development of neural networks with GPUs, and is of course a creator of Coursera and of popular courses like deeplearning.ai, and also the founder, creator, and early lead of Google Brain. But one thing I've always wanted to ask you before I hand it over, Andrew, while you're on stage, is a question I think would be relevant to the whole audience: 10 years ago, on problem set number two of CS229, you gave me a B. I looked it over, and I was wondering what you saw that I did incorrectly. So anyway, Andrew.

00:44

Thank you, Hansen. I'm looking forward to sharing with all of you what I'm seeing with AI agents, which I think is an exciting trend that everyone building in AI should pay attention to. I'm also excited about all the other presentations.

01:00

So, AI agents. Today, the way most of us use large language models is like this, with a non-agentic workflow where you type a prompt and it generates an answer. That's a bit like asking a person to write an essay on a topic and saying, "Please sit down at the keyboard and just type the essay from start to finish without ever using backspace." Despite how hard this is, LLMs do it remarkably well.

01:24

In contrast, with an agentic workflow, this is what it may look like. Have an AI, have an LLM, say, write an essay outline. Do you need to do any web research? If so, let's do that. Then write the first draft, then read your own first draft and think about what parts need revision, and then revise your draft, and you go on and on.
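The essay workflow just described (outline, optional research, draft, then critique and revise) can be sketched as a short loop. This is an illustration under assumptions: the `LLM` callable and `fake_llm` are hypothetical stand-ins for a real model client, and the research step only asks the model for notes where a real system would call a search tool.

```python
from typing import Callable

LLM = Callable[[str], str]  # hypothetical interface: prompt in, text out


def write_essay(topic: str, llm: LLM, max_revisions: int = 2) -> str:
    """Outline, optionally research, draft, then critique and revise in a loop."""
    outline = llm(f"Write an essay outline on: {topic}")
    answer = llm(f"Does this outline need web research? Answer yes or no.\n{outline}")
    notes = ""
    if answer.strip().lower().startswith("yes"):
        # A real agent would call a search tool here.
        notes = llm(f"List facts to look up for this outline:\n{outline}")
    draft = llm(f"Write a first draft.\nOutline:\n{outline}\nNotes:\n{notes}")
    for _ in range(max_revisions):
        critique = llm(f"Read this draft and list the parts that need revision:\n{draft}")
        draft = llm(f"Revise the draft using this critique.\nDraft:\n{draft}\nCritique:\n{critique}")
    return draft


calls: list[str] = []


def fake_llm(prompt: str) -> str:
    """Canned replies so the sketch runs offline; also records each prompt."""
    calls.append(prompt)
    if prompt.startswith("Does this outline"):
        return "yes"
    if prompt.startswith("Revise"):
        return "revised draft"
    return "placeholder text"


essay = write_essay("AI agents", fake_llm)
print(essay, len(calls))
```

With the defaults above, a single request turns into eight model calls, which is why the talk later stresses waiting patiently and generating tokens quickly.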

01:47

And so this workflow is much more iterative, where you may have the LLM do some thinking, then revise the article, then do some more thinking, and iterate this a number of times. What not many people appreciate is that this delivers remarkably better results. I've actually been really surprised myself, working with these agentic workflows, by how well they work.

02:11

Let's do one case study. My team analyzed some data using a coding benchmark called the HumanEval benchmark, released by OpenAI a few years ago. It has coding problems like: given a non-empty list of integers, return the sum of all the odd elements that are in even positions. And it turns out the answer is a code snippet like that. Today a lot of us will use zero-shot prompting, meaning we tell the AI to write the code and have it run on the first pass. Who codes like that? No human codes like that, just typing out the code and running it. Maybe you do. I can't do that.
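For reference, the benchmark problem quoted above can be solved with a snippet like the one below. It assumes positions are zero-based, so "even positions" means indices 0, 2, 4, and so on; the function name is chosen here for illustration.

```python
def sum_odd_at_even_positions(numbers: list[int]) -> int:
    """Sum the odd elements that sit at even (zero-based) positions."""
    return sum(x for i, x in enumerate(numbers) if i % 2 == 0 and x % 2 == 1)


print(sum_odd_at_even_positions([5, 8, 7, 1]))       # 5 + 7 = 12
print(sum_odd_at_even_positions([30, 13, 24, 321]))  # no odd values at indices 0 and 2, so 0
```

A problem this small is easy to get right on the first try, but the same single-pass habit is what the talk argues against for larger tasks.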

02:43

So it turns out that if you use GPT-3.5 with zero-shot prompting, it gets 48% right. GPT-4 does way better: 67% right. But if you take an agentic workflow and wrap it around GPT-3.5, it actually does better than even GPT-4. And if you wrap this type of workflow around GPT-4, it also does very well. You'll notice that GPT-3.5 with an agentic workflow actually outperforms GPT-4, and I think this has significant consequences for how we all approach building applications.

03:26

"Agents" is a term that gets tossed around a lot. There are a lot of consultant reports that talk about agents, the future of AI, blah blah blah. I want to be a bit concrete and share with you the broad design patterns I'm seeing in agents. It's a very messy, chaotic space: tons of research, tons of open source, there's a lot going on. But I try to categorize a bit more concretely what's going on in agents.

03:48

Reflection is a tool that I think many of us should just use. It just works. Tool use, I think, is more widely appreciated, but it actually works pretty well. I think of these as pretty robust technologies: when I use them, I can almost always get them to work well.

04:04

Planning and multi-agent collaboration, I think, are more emerging. When I use them, sometimes my mind is blown by how well they work, but at least at this moment in time I don't feel like I can always get them to work reliably. So let me walk through these four design patterns in a few slides. If some of you go back and use these yourselves, or ask your engineers to use them, I think you'll get a productivity boost quite quickly.

04:26

So, reflection. Here's an example. Let's say I ask a system, "Please write code for me for a given task." Then we have a coder agent, just an LLM that you prompt to write code, and it says, "def do_task," and writes a function like that. An example of self-reflection would be if you then prompt the LLM with something like this: "Here's code intended for a task," and just give it back the exact same code that it just generated, and then say, "Check the code carefully for correctness, soundness, efficiency, good construction." Just write a prompt like that. It turns out the same LLM that you prompted to write the code may be able to spot problems, like "there's a bug on line five; fix it by blah blah blah." And if you now take its own feedback, give it back, and re-prompt it, it may come up with a version two of the code that could well work better than the first version. It's not guaranteed, but it works often enough for this to be worth trying for a lot of applications.

05:19

To foreshadow tool use: if you let it run unit tests and it fails a unit test, then you can ask why it failed the unit test, have that conversation, and it may be able to figure out, "I failed the unit test, so I should try changing something," and come up with v3. By the way, for those of you who want to learn more about these technologies (I'm very excited about them), for each of the four sections I have a little recommended reading section at the bottom that hopefully gives more references.

05:44

And again, just to foreshadow multi-agent systems: I've described this as a single coder agent that you prompt to have this conversation with itself. One natural evolution of this idea is that instead of a single coder agent, you can have two agents, where one is a coder agent and the second is a critic agent. These could be the same base LLM that you prompt in different ways, where you say to one, "You're an expert coder, write code," and to the other, "You're an expert code reviewer, review this code." This type of workflow is actually pretty easy to implement. I think it's a very general-purpose technology: for a lot of workflows, this will give you a significant boost in the performance of LLMs.

06:25

The second design pattern is tool use. Many of you will already have seen LLM-based systems using tools. On the left is a screenshot from Copilot; on the right is something that I kind of extracted from GPT-4. An LLM today, if you ask it "What's the best coffee maker?", will do a web search; for some problems, LLMs will generate code and run code. And it turns out that there are a lot of different tools that many different people are using for analysis, for gathering information, for taking action, and for personal productivity.
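A common way to wire up tool use is to have the model emit a structured function call and let ordinary code dispatch it. The sketch below assumes the model replies with a JSON object containing `tool` and `arguments` fields; that format, the registry, and both tools are illustrative assumptions rather than any particular vendor's API.

```python
import json
from typing import Callable

TOOLS: dict[str, Callable[..., str]] = {}


def tool(func: Callable[..., str]) -> Callable[..., str]:
    """Register a function so the dispatcher can call it by name."""
    TOOLS[func.__name__] = func
    return func


@tool
def web_search(query: str) -> str:
    # Placeholder: a real tool would call a search service.
    return f"(pretend search results for {query!r})"


@tool
def run_python(expression: str) -> str:
    # eval with builtins removed is enough for a demo, but it is not a real sandbox.
    return str(eval(expression, {"__builtins__": {}}, {}))


def dispatch(model_output: str) -> str:
    """Parse the model's JSON tool call and run the matching tool."""
    call = json.loads(model_output)
    func = TOOLS.get(call.get("tool"))
    if func is None:
        return f"unknown tool: {call.get('tool')!r}"
    return func(**call.get("arguments", {}))


print(dispatch('{"tool": "run_python", "arguments": {"expression": "12 * 7"}}'))
print(dispatch('{"tool": "web_search", "arguments": {"query": "best coffee maker"}}'))
```

The tool's return value would normally be appended to the conversation so the model can use it in its next reply.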

07:00

It turns out a lot of the early work in tool use came from the computer vision community. Before large language models could do anything with images, the only option was for the LLM to generate a function call that could manipulate an image, like generating an image or doing object detection or whatever. So if you look at the literature, it's interesting how much of the work in tool use seems to have originated from vision, because LLMs were blind to images before GPT-4 and LLaVA and so on. So that's tool use, and it expands what an LLM can do.

07:34

And then planning. For those of you who have not yet played a lot with planning algorithms: a lot of people talk about the ChatGPT moment, where you're like, "Wow, I've never seen anything like this." I think if you've not used planning algorithms, many people will have a kind of AI agent "wow" moment: "I couldn't imagine an AI agent doing this." I've run live demos where something failed and the AI agent rerouted around the failure. I've actually had quite a few of those moments: "Wow, I can't believe my AI system just did that autonomously."

08:07

One example that I adapted from the HuggingGPT paper: you say, "Please generate an image where a girl is reading a book, and her pose is the same as the boy's in the image example.jpg, and please describe the new image with your voice." Given an example like this, today we have AI agents that can kind of decide: first, I need to determine the pose of the boy, then find the right model, maybe on Hugging Face, to extract the pose; next, I need to find a pose-to-image model to synthesize a picture of a girl following the instructions; then use image-to-text; and finally use text-to-speech.

08:46

Today we actually have agents that, I don't want to say they work reliably; they're kind of finicky, they don't always work, but when it works it's actually pretty amazing. And with agentic loops, sometimes you can recover from earlier failures as well. I find myself already using research agents for some of my work: when I want a piece of research but don't feel like Googling it myself and spending a long time, I send it to the research agent, come back in a few minutes, and see what it's come up with. Sometimes it works, sometimes it doesn't, but that's already a part of my personal workflow.

09:20

The final design pattern is multi-agent collaboration. This is one of those funny things, but it works much better than you might think.

09:29

On the left is a screenshot from a paper called ChatDev, which is completely open; it's actually open source. Many of you saw the flashy social media announcements of the Devin demo. ChatDev is open source, and it runs on my laptop. ChatDev is an example of a multi-agent system where you prompt one LLM to sometimes act like the CEO of a software engineering company, sometimes a designer, sometimes a product manager, sometimes a tester. This flock of agents, which you build by prompting an LLM and telling it "You're now the CEO" or "You're now a software engineer," collaborate and have an extended conversation, so that if you tell it, "Please develop a game, develop a Gomoku game," they'll actually spend a few minutes writing code, testing it, and iterating, and then generate surprisingly complex programs. It doesn't always work. I've used it: sometimes it doesn't work, sometimes it's amazing, but this technology is really getting better.

10:32

And just one more design pattern: it turns out that multi-agent debate, where you have different agents (for example, you could have ChatGPT and Gemini debate each other), actually results in better performance as well. So having multiple simulated AI agents work together has been a powerful design pattern too.

10:50

So, just to summarize: I think these are the patterns I've seen, and I think that if we were to use these patterns in our work, a lot of us could get a productivity boost quite quickly. I think agentic reasoning design patterns are going to be important. This is my last slide. I expect that the set of tasks AI can do will expand dramatically this year because of agentic workflows.

11:20

One thing that's actually difficult for people to get used to is that when we prompt an LLM, we want a response right away. In fact, a decade ago, when I was having discussions at Google on what we called "big box search," where you type a long prompt, one of the reasons I failed to push successfully for that was that when you do a web search, you want a response back in half a second. That's just human nature: we like that instant gratification, instant feedback. But for a lot of agentic workflows, I think we'll need to learn to delegate the task to an AI agent and patiently wait minutes, maybe even hours, for a response. I've seen a lot of novice managers delegate something to someone and then check in five minutes later, and that's not productive. It'll be difficult, but I think we need to show that patience with some of our AI agents as well. I heard some laughs.

12:14

And then one other important trend: fast token generation is important because with these agentic workflows we're iterating over and over, so the LLM is generating tokens for the LLM to read. Being able to generate tokens way faster than any human can read is fantastic. And I think that generating more tokens really quickly from even a slightly lower-quality LLM might give good results compared to slower tokens from a better LLM. Maybe that's a little bit controversial, but it may let you go around this loop a lot more times, kind of like the results I showed with GPT-3.5 and an agentic architecture on the first slide.

12:48

And candidly, I'm really looking forward to Claude 5 and Claude 4 and GPT-5 and Gemini 2.0 and all these other wonderful models that many are building.

12:59

And part of me feels like, if you're looking forward to running your thing on GPT-5 zero-shot, you may be able to get closer to that level of performance on some applications than you might think with agentic reasoning on an earlier model. I think this is an important trend. And honestly, the path to AGI feels like a journey rather than a destination, but I think this type of agentic workflow could help us take a small step forward on this very long journey. Thank you.

13:35

[Applause]

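Related tags
AI agents, iteration, collaboration, design patterns, artificial intelligence, efficiency, technical progress, code generation, multi-agent systems, reflection, tools, planning algorithms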