Introspective Agents: Performing Tasks With Reflection with LlamaIndex

LlamaIndex
2 May 2024 · 28:57

Summary

TLDR: Today's video covers introspective agents: agents that perform a task while using reflection. Such an agent reflects on its initial response, evaluates how well it performed the task, and corrects it, repeating the process until the final response is satisfactory. An introspective agent contains two sub-agents to which it delegates the task: one generates the initial response, and the other runs the reflection-and-correction cycles. The package also includes two reflection mechanisms, tool-interactive reflection and self-reflection. Tool-interactive reflection performs reflection with external tools, while self-reflection uses the pre-trained language model itself. The video shows how to use the Perspective API to obtain a toxicity score for a piece of text and then reduce that toxicity. The results show that introspective agents can substantially improve task performance, and in particular that reflection with an appropriate tool tends to produce better results.

Takeaways

  • 🧐 An introspective agent is an agent that performs a task while using reflection: it reflects on its initial response, evaluates its quality, and corrects it.
  • 📚 To carry out a task, an introspective agent generates an initial response and then arrives at the final response through iterative reflection and correction.
  • 🔧 The `llama-index-agent-introspective` package implements introspective agents; the video walks through its main classes and how to use them.
  • ⚠️ The toxicity-reduction task can involve offensive or disturbing content, so viewer discretion is advised.
  • 📉 In the toxicity-reduction task, the agent is asked to rewrite potentially toxic text in a safer, less toxic way.
  • 🤖 An introspective agent is a delegating agent: it contains two distinct sub-agents, a main agent and a reflection agent.
  • 🔄 The reflection agent runs iterative reflection-and-correction cycles, continuing until a stopping condition is met.
  • 🛠️ The tool-interactive reflection agent performs reflection with an external tool; the Perspective API is used as the example for obtaining toxicity scores.
  • 🤔 Self-reflection performs reflection and correction using only the pre-trained language model's own knowledge, without external tools.
  • 📝 Consistent with the results of the CRITIC paper, reflection with an appropriate tool tends to outperform self-reflection.
  • 📊 Comparing the results, tool-interactive reflection is shown to produce slightly lower toxicity scores than self-reflection.
  • 📚 This notebook demonstrated how to use an introspective agent to perform a task and produce improved text through reflection-and-correction cycles.
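The flow the takeaways describe, an optional main agent drafting an initial response and a reflection agent iterating reflect-and-correct until a stopping condition or an iteration cap, can be sketched in plain Python. This is an illustrative mock, not the `llama-index-agent-introspective` implementation; the `main_agent`, `critique`, and `correct` callables stand in for the package's sub-agents:

```python
from typing import Callable, Optional

def introspective_run(
    task_input: str,
    main_agent: Optional[Callable[[str], str]],
    critique: Callable[[str], str],          # returns a critique string, e.g. "Fail: score=12.0"
    correct: Callable[[str, str], str],      # (draft, critique) -> improved draft
    stopping_callable: Callable[[str], bool],
    max_iterations: int = 5,
) -> str:
    # If no main agent is given, the user input itself is taken as the initial draft.
    draft = main_agent(task_input) if main_agent else task_input
    for _ in range(max_iterations):
        reflection = critique(draft)          # reflection step
        if stopping_callable(reflection):     # draft judged satisfactory
            break
        draft = correct(draft, reflection)    # correction step
    return draft
```

In the notebook, the same roles are played by a critique agent worker, a correction LLM, and a stopping-condition function, wired together with `from_defaults` constructors.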

Q & A

  • What is an introspective agent?

    - An introspective agent is an agent that uses reflection while performing a task. It generates an initial response to the task, then repeatedly reflects on and corrects that response until it arrives at a final response.

  • What kind of process is reflection?

    - Reflection is the process in which an agent looks back on its initial response to a given task, evaluates how well it performed, and makes corrections as needed, yielding a better response.

  • What is the LlamaIndex introspective agent package?

    - `llama-index-agent-introspective` is a new LlamaIndex package that implements agents that perform tasks using reflection; in this video it is applied to a toxicity-reduction task.

  • What does toxicity reduction mean?

    - Toxicity reduction means rewriting potentially harmful text in a safe, inoffensive way. The goal is to lessen the text's toxicity and produce safer content.

  • What is the tool-interactive reflection agent?

    - The tool-interactive reflection agent performs reflection using external tools: for example, a tool built on the Perspective API that returns a toxicity score for a piece of text.

  • What is self-reflection?

    - Self-reflection performs reflection using only the knowledge of the pre-trained language model (LLM) itself, with no external tools.

  • What is a toxicity score?

    - A toxicity score indicates how harmful a piece of text is, expressed as a percentage on a scale of 0 to 100; lower scores mean lower toxicity.

  • What is the Perspective API?

    - The Perspective API is an API that computes toxicity scores for text. To use it, you must enable it in a Google Cloud project and obtain an API key.

  • What is the critique mechanism?

    - Critique is the process, used during reflection, of evaluating a draft and revising it based on feedback such as that from external tools. It is based on the CRITIC research showing that large language models can self-correct through tool-interactive critiquing.

  • How does an introspective agent delegate tasks?

    - An introspective agent delegates the task to two sub-agents: a main worker agent and a reflective agent. The main worker agent generates the initial response, and the reflective agent performs the iterative reflection and correction.

  • Which is better, self-reflection or tool-interactive reflection?

    - According to the results of the CRITIC paper, when an appropriate tool is available, tool-interactive reflection tends to produce better results than self-reflection.

  • What is the purpose of this notebook?

    - The notebook shows how to use an introspective agent to perform a task (specifically, toxicity reduction), demonstrating how reflection is applied and walking through the process and results.
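As the Q&A notes, the Perspective API scores text along several attributes (TOXICITY, SEVERE_TOXICITY, INSULT, and so on). Following the CRITIC paper, the video treats them as a single bucket of toxicity attributes and keeps the worst score. A sketch of that reduction, using an illustrative dict shaped like the API's `attributeScores` payload (the sample values are made up, not real API output):

```python
# Collapse Perspective-style attribute scores into a single (attribute, score) pair,
# taking the maximum across the whole "bucket" of toxicity attributes.
def max_toxicity(attribute_scores: dict) -> tuple:
    best_attr, best_score = None, -1.0
    for attr, payload in attribute_scores.items():
        score = round(payload["summaryScore"]["value"] * 100, 2)  # convert 0-1 to a percentage
        if score > best_score:
            best_attr, best_score = attr, score
    return best_attr, best_score

# Illustrative payload in the shape of the API's `attributeScores` field:
sample = {
    "TOXICITY": {"summaryScore": {"value": 0.0254}},
    "INSULT": {"summaryScore": {"value": 0.0120}},
}
print(max_toxicity(sample))  # ('TOXICITY', 2.54)
```

A real tool would obtain `attribute_scores` from the Perspective API response after enabling the API in a Google Cloud project, as described above.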

Outlines

00:00

📘 What is an introspective agent?

The video explains how to use an introspective agent to reduce the toxicity of text. An introspective agent performs reflection while executing a task: it reflects on its initial response and then corrects it, arriving at a better response. The required packages are installed, and an event loop is started because async calls are used. A warning is also given, since the video contains content that some viewers may find disturbing.

05:01

🔍 How an introspective agent works

An introspective agent is a delegating agent that contains two sub-agents: a main agent and a reflection agent. The main agent generates the initial response, and the reflection agent then repeats reflection-and-correction cycles to produce the final response. The package implements two reflection mechanisms: tool-interactive reflection and self-reflection.

10:04

🛠️ Building the tool-interactive reflection agent

The tool-interactive reflection agent performs reflection using an external tool; here, the Perspective API is used to obtain toxicity scores for text. Using the API requires enabling it in a Google Cloud project and obtaining an API key. A custom class retrieves the toxicity score of a piece of text, and that score is then used to improve the text.
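Later in the walkthrough, the stopping condition for this agent keys off the critique text: the critique agent is instructed to write "Pass" when the toxicity score is below 3 and "Fail" otherwise, and a small function checks for that marker. A sketch of that pattern (the function names and message format are mine; the threshold follows the video):

```python
TOXICITY_THRESHOLD = 3.0  # percent; the video stops once the score drops below 3

def render_critique(score: float) -> str:
    # Mirrors the instruction given to the critique agent: emit "Pass" or "Fail".
    verdict = "Pass" if score < TOXICITY_THRESHOLD else "Fail"
    return f"{verdict}: toxicity score is {score:.2f}"

def stopping_callable(critique_str: str) -> bool:
    # The reflection-correction cycle stops when "Pass" appears in the critique.
    return "Pass" in critique_str

print(stopping_callable(render_critique(50.0)))  # False
print(stopping_callable(render_critique(1.37)))  # True
```

In the notebook, a function of this shape is passed to the reflection agent worker's `from_defaults` constructor as its stopping condition.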

15:05

🔧 Testing the introspective agent, and self-reflection

The introspective agent built above is used to improve harmful text. The tool-interactive reflection agent obtains toxicity scores via the Perspective API, while the self-reflection variant performs reflection using only the LLM's own knowledge, with no external tools. In both cases the toxicity score drops and the text becomes safer.
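Self-reflection runs the same reflect-and-correct loop but sources the critique from the model itself rather than a tool. A minimal sketch with a stand-in `toy_llm` function (the prompts and the toy model are illustrative assumptions, not the package's actual defaults):

```python
def self_reflect(draft: str, llm) -> str:
    # Ask the model itself to critique the draft: no external tool involved.
    return llm(f"Critique the following text for toxicity. "
               f"Reply 'Pass' if it is safe, otherwise explain the problem.\n\n{draft}")

def self_correct(draft: str, critique: str, llm) -> str:
    # Ask the same model to rewrite the draft against its own critique.
    return llm(f"Rewrite this text to address the critique.\n"
               f"Text: {draft}\nCritique: {critique}")

# Stand-in "LLM": passes anything without the word "stupid" in it.
def toy_llm(prompt: str) -> str:
    if prompt.startswith("Critique"):
        return "Pass" if "stupid" not in prompt else "Fail: insulting language"
    return prompt.rsplit("Text: ", 1)[-1].split("\nCritique")[0].replace("stupid", "misguided")

draft = "That idea is stupid."
critique = self_reflect(draft, toy_llm)
if "Pass" not in critique:
    draft = self_correct(draft, critique, toy_llm)
print(draft)  # That idea is misguided.
```

With a real chat model in place of `toy_llm`, the loop would repeat until the critique contains "Pass" or a maximum number of iterations is reached.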

20:06

🤖 Comparing the two reflection mechanisms

Both reflection mechanisms, self-reflection and tool-interactive reflection, were tested and their toxicity scores compared. Consistent with the CRITIC paper, reflection with an appropriate tool outperformed self-reflection, showing that feedback from external tools is very important for carrying out the task.
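The comparison described here reduces to scoring each mechanism's rewrites with the toxicity tool and averaging. A sketch of the tally (the per-example scores are hypothetical stand-ins, not the video's actual numbers):

```python
# Average the toxicity scores of each mechanism's rewrites and compare.
def mean(xs):
    return sum(xs) / len(xs)

# Hypothetical per-example scores a Perspective-style tool might return:
tool_interactive_scores = [1.4, 2.1, 0.9]
self_reflection_scores = [2.0, 2.6, 1.5]

avg_tool = mean(tool_interactive_scores)
avg_self = mean(self_reflection_scores)
print(f"tool-interactive avg: {avg_tool:.2f}")   # 1.47
print(f"self-reflection avg:  {avg_self:.2f}")   # 2.03
assert avg_tool < avg_self  # matches the direction reported in the video and in CRITIC
```

The notebook collects the same information into a data frame alongside the original and improved texts.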

25:06

📚 Introspective agents: wrap-up

The video closes with a recap of introspective agents, which use reflection while performing tasks. The agent delegates the task to a main worker agent and a reflection agent, and it succeeded in lowering toxicity scores. The results were also consistent with the CRITIC paper's finding that tool-based reflection outperforms self-reflection.

Keywords

💡Introspective agent

An introspective agent is an agent that acts while reflecting on itself as it performs a task. It generates an initial response to the given task, then reflectively evaluates that response and repeatedly improves it to derive a final response. In the toxicity-reduction task, the agent takes potentially unsafe text as input and reconstructs it with the unsafe wording reduced, thereby working to lower its toxicity.

💡Reflection

Reflection is the process in which the agent, after generating an initial response to a task, re-evaluates that response and improves it. Here it is used to reduce text toxicity: the agent reflectively evaluates each generated draft and repeatedly revises it based on its toxicity score.

💡Toxicity reduction

Toxicity reduction is the process of reducing harmful or offensive content in order to promote positive dialogue. In this video, the introspective agent performs the task of lowering a text's toxicity by reconstructing the given text with its unsafe wording reduced.

💡Correction

Correction is the step in which the introspective agent, after reflectively evaluating a draft, improves the text so as to lower its toxicity score. This step is repeated, guided by the toxicity score, with the goal of ultimately producing safe, appropriate text.

💡Perspective API

The Perspective API is an external tool, enabled through a Google Cloud project, that provides toxicity scores for text. It plays a key role in allowing the introspective agent to evaluate and improve the toxicity of text.

💡Self-reflection

Self-reflection is the process of performing reflection and correction based only on the agent's own knowledge from pre-training, without external tools. Compared with tool-interactive reflection, it can still lower toxicity scores to a degree, but it is limited by the lack of the additional feedback an external tool would provide.

💡Tool-interactive reflection

Tool-interactive reflection is the process of performing reflection and correction with the help of external tools. In this approach, a tool such as the Perspective API evaluates the text's toxicity score, and that result is used to improve the text.

💡Stopping condition

The stopping condition is the criterion by which the introspective agent ends its reflection-and-correction cycles; for example, stopping once the toxicity score falls below a set value. It applies when the agent judges that the final response has reached satisfactory quality.

💡LLM (Large Language Model)

An LLM, or large language model, is an advanced machine-learning model used in the field of natural language processing (NLP). In this video, LLMs power the introspective agent's reflection and correction process, carrying out the toxicity-reduction task.

💡Critique

Critique is the process of evaluating a draft generated by the introspective agent and identifying improvements. In tool-interactive reflection, the critique is carried out through external tools such as the Perspective API; in self-reflection, it relies on the LLM's own knowledge.

💡GitHub

GitHub is a platform where software developers collaboratively develop and manage code. The video suggests posting issues on GitHub or reaching out on Discord as ways to ask questions or leave comments about introspective agents.

Highlights

Introduces the concept of the introspective agent, an agent that uses reflection while performing tasks.

An introspective agent reflects on and iteratively corrects its initial response to a given task until it reaches a satisfactory final response.

Demonstrates how to use an introspective agent for toxicity reduction, i.e. rewriting potentially toxic text into a safer form.

Introduces two different reflection mechanisms: tool-interactive reflection and self-reflection.

Tool-interactive reflection uses external tools (such as the Perspective API) for reflection, while self-reflection uses only the language model's own knowledge.

Walks through building an introspective agent that uses tool-interactive reflection, with the Perspective API providing toxicity scores for text.

Shows how to build an introspective agent that uses self-reflection, which relies on no external tools, only the pre-trained or fine-tuned language model.

A small-scale comparison found that, when an appropriate tool is available, tool-interactive reflection yields results with lower toxicity scores than self-reflection.

According to the results of the CRITIC paper, tool-interactive critiquing can significantly improve task performance.

Emphasizes the importance of external tool feedback while performing a task; appropriate tools lead to better results.

Shows looping over several toxic text examples and generating safer versions of them.

A data frame presents the improved text, the original text, and the toxicity scores under the different reflection mechanisms.

A comparison of average toxicity scores shows that tool-interactive reflection averages lower than self-reflection.

Summarizes how an introspective agent works: it delegates the task to a main worker agent and a reflection agent, and the two collaborate.

Points to community channels such as GitHub and Discord for users to ask questions or leave comments.

The demo includes a discussion of the CRITIC paper, which explores how large language models can self-correct through tool-interactive critiquing.

Shows how the introspective agent handles potentially offensive content, with appropriate warnings and content notices.

Discusses the concrete implementation of the introspective agent for the toxicity-reduction task, including generating the initial response and the subsequent reflection-correction cycles.

Emphasizes the importance of reaching a stopping condition in the reflection-and-correction cycle, whether by satisfying the condition or by hitting a user-specified maximum number of iterations.

Transcripts

play00:00

hey everyone so in today's video we're

play00:02

going to cover introspective agents uh

play00:05

so I've got this notebook here prepared

play00:08

uh and it's going to cover how we use an

play00:10

introspective agent uh to perform

play00:13

toxicity

play00:15

reduction so we should probably begin

play00:17

with what is an introspective agent so

play00:19

if we recall that an agent is

play00:21

essentially a thing that performs a task

play00:23

uh an introspective agent is again a

play00:26

thing that performs a task but you know

play00:27

doing so while using reflection

play00:31

um reflection essentially is a way for

play00:34

the agent to uh reflect on its initial

play00:38

response to performing that you know

play00:41

that given task and then after

play00:43

reflecting on how well it performed on

play00:45

that initial response it has the

play00:47

opportunity to correct it against that

play00:50

reflection so an introspective agent is

play00:53

essentially again an agent that performs

play00:55

a task and so it gets an initial

play00:58

response to it and then it performs

play01:00

successive iterations of reflection and

play01:02

correction until it's you know reaches a

play01:05

point where it's satisfied with the

play01:07

final

play01:09

response so we've implemented that

play01:11

reflection design pattern here in a new

play01:14

package called llama index agent

play01:16

introspective and this notebook will

play01:18

walk through how you can use the main

play01:20

classes uh contained in that in that

play01:24

package so to begin yeah we need to of

play01:27

course install our necessary packages so

play01:30

they're here of course here you've got

play01:31

the Llama index agent

play01:34

introspective and you know a quick setup

play01:36

thing here too we've we are going to run

play01:38

some async calls so we do need an event

play01:40

Loop running and so we'll use n

play01:43

Ayn now I should warn uh viewers here

play01:47

that the task here is on toxicity

play01:49

reduction so there are going to be some

play01:51

content that some may find offensive um

play01:55

and and sort of disturbing so um warning

play01:59

that that that content does exist so if

play02:01

that's something that is not good for

play02:04

you then maybe uh skip the parts when

play02:06

you know that content is uh presented in

play02:09

this

play02:11

video okay so what is the task of

play02:14

toxicity reduction essentially what

play02:16

we're going to ask the agent to do is

play02:18

we're going to give it a piece of text

play02:20

that is potentially toxic and then we're

play02:23

going to ask the agent to rewrite this

play02:25

text in a safer way um so one that is

play02:29

less t toxic right so the introspective

play02:33

agent again will do this by first

play02:36

performing uh the uh the task with an

play02:39

initial response so it will create a

play02:41

draft of a safer version of that text

play02:45

and then it will perform reflection on

play02:46

that draft so um that reflection can

play02:50

happen in a few ways and in this package

play02:52

here we've uh implemented two different

play02:55

ways of implementing a reflection

play02:57

mechanism but besides that after that

play03:00

reflection on that draft is made then

play03:02

the llm will also or the agent rather

play03:04

will then use a correction step to

play03:08

create a new draft which is hopefully

play03:11

less um harmful less toxic than the

play03:14

original

play03:15

text and you can have this agent go

play03:17

through this corre sorry reflection and

play03:21

correction Cycles until one of two

play03:24

things happens either it reaches a

play03:27

stopping condition where it's

play03:30

uh it views a draft as being

play03:33

satisfactory or it reaches a user

play03:36

specified maximum number of

play03:39

iterations so that is the task and a

play03:42

little bit again of what is an

play03:44

introspective agent and how reflection

play03:46

is performed so if we look a little bit

play03:48

closer now into you know under the hood

play03:51

of an introspective agent essentially

play03:53

what we've got uh is an introspective

play03:56

agent is an agent that actually

play03:58

delegates um the task to other uh agents

play04:02

so uh in other words an introspective

play04:04

agent contains two agents uh one we call

play04:07

a mean agent and another one that we

play04:09

call a reflective

play04:11

agent so what happens when an

play04:13

introspective agent gets a task is

play04:16

essentially it delegates this task to

play04:18

First and Main a main agent that

play04:21

generates that initial response so in

play04:23

the context of toxicity reduction it

play04:25

will generate that initial draft of a

play04:28

safer version of of the original

play04:31

text and it will output it here then

play04:34

this initial response will go to the

play04:37

reflective agent which is responsible

play04:40

for performing the reflection and

play04:41

correction Cycles until a condition a

play04:44

stoping condition has been

play04:46

met so uh again the task is passed to

play04:49

the introspective agent the

play04:51

introspective agent delegates that task

play04:53

in sequence to two other agents the main

play04:56

agent is responsible for generating the

play04:58

initial response and then that gets

play05:01

passed to the reflective agent that

play05:02

performs the reflection and correction

play05:04

cycles and then finally you get a

play05:06

corrected

play05:08

response so that is our introspective

play05:13

agent now I did mention that there are

play05:17

um in at least in this Library uh

play05:19

package there are two included me

play05:22

mechanisms for reflection uh so there is

play05:26

one that's called tool interactive

play05:28

reflection agent and another one called

play05:31

self-reflection the main difference

play05:33

there is essentially in the first one

play05:34

tool interactive uh reflection we are

play05:37

allowing um the the agent to use tools

play05:42

in order to perform reflection so you

play05:44

can think of that as doing a sort of

play05:46

like a fact check on you know a certain

play05:49

response using an external tool whereas

play05:52

in self-reflection we're just using the

play05:54

llm itself its knowledge uh obviously

play05:57

pre-trained or fine-tuned

play06:00

um and no external you external

play06:02

tools so in this notebook what we'll do

play06:05

is we'll set up two introspective agents

play06:07

one uh using tool interactive reflection

play06:11

and another using self-reflection and

play06:13

again these classes are available in the

play06:16

Llama index agent introspective uh

play06:19

integration

play06:21

package okay so in this first part we'll

play06:23

build that introspective agent with a

play06:26

tool interactive reflection

play06:28

agent so in this first step here what

play06:31

I'm going to actually do is I'm going to

play06:33

build a tool uh which will allow us to

play06:36

get the toxicity score of a piece of

play06:39

text and that tool will use the

play06:42

perspective API and the perspective API

play06:46

essentially um in order to be able to

play06:49

use that you need to enable that in uh

play06:51

your Google Cloud projects and then get

play06:54

a set of credentials or an API key uh

play06:57

that you can then pass through uh as a

play07:00

prospective API key environment

play07:02

environment variable if you are going to

play07:04

use this

play07:05

notebook um I left the link here uh for

play07:08

you to follow uh the instructions on how

play07:11

it is that you actually get the

play07:13

credentials to get perspective running

play07:16

now once you have that running I've got

play07:18

here essentially just a a custom class

play07:21

that will allow us to get that toxicity

play07:23

score on a piece of cheex um by the way

play07:27

I should mention that you know excuse me

play07:31

that this uh tool interactive reflection

play07:35

framework actually came from a paper um

play07:39

called critic uh by I forget the

play07:41

fellow's name but I do have a paper card

play07:44

here uh by zeinal in

play07:48

2023 um so in their example and in their

play07:52

paper actually they also use this

play07:55

toxicity reduction example and they've

play07:58

also made use of of perspective API so

play08:01

we are in in effect uh reimplementing

play08:04

their

play08:07

example okay so in llama index terms we

play08:11

have a function um or sorry a class that

play08:14

will allow us to get uh the perspective

play08:17

API toxicity scores and then what we

play08:20

need to do is essentially build a

play08:22

function that takes in a piece of text

play08:25

and outputs you know a score plus the

play08:28

attribute um that that score is for so

play08:33

essentially perspective Works in a way

play08:35

that it provides scores for each of

play08:37

these attributes so you've got toxicity

play08:40

severe

play08:41

toxicity uh and you know all these other

play08:44

ones for Simplicity and what you know

play08:46

also the authors of that critic

play08:48

paper also did is that they considered

play08:51

this whole like all these attributes to

play08:53

be essentially just a bucket of toxicity

play08:56

attributes and and we'll do the same

play08:58

here

play08:59

so we've got this function essentially

play09:01

that will again give us a toxicity

play09:04

score and we can test that now so here

play09:06

is a text right of friendly greetings

play09:09

from python this is you know shouldn't

play09:12

be toxic so let's see uh this is in

play09:15

terms of percentage so it's got a

play09:17

toxicity score of

play09:18

2.54 uh which is uh low uh on the scheme

play09:22

scale of 0 to

play09:24

100 okay now that we've got our tool

play09:28

let's build now our tool Interac active

play09:29

reflection agent so that again that tool

play09:31

interactive reflection agent uses a tool

play09:34

to perform reflection so it does a

play09:36

little bit of for example like fact

play09:38

checking in the context of toxicity

play09:41

reduction what we will do is we'll ask

play09:43

the reflection agent to use this

play09:46

external tool to get the toxicity score

play09:49

of the text in each iteration so you've

play09:52

got the initial response then you you

play09:54

know you go through cycles of reflection

play09:56

and correction and each iteration there

play09:58

we're going to ask the reflection agent

play10:00

to get uh the toxicity score and then in

play10:03

the correction phase we'll you know the

play10:06

the agent will use that toxicity score

play10:09

and the attribute in order to improve

play10:11

upon the previous uh text or the

play10:13

previous

play10:16

draft okay so to build our introspective

play10:18

agent we need to build first our tool

play10:20

interactive reflection agent and here I

play10:23

just you know wrote a quick helper

play10:25

function because we are going to um

play10:28

build another one of these things later

play10:30

in the notebook so I just wrapped this

play10:32

in a a function you could you really

play10:34

don't have to um but I've done it here

play10:37

again out of convenience so to build a

play10:40

tool interactive reflection agent worker

play10:42

we need to First construct uh something

play10:44

that it needs which is called the

play10:46

critique agent worker so it does a

play10:47

little bit of Delegation here as well it

play10:49

delegates the reflection to this

play10:52

critique agent which will make use of

play10:54

the prospective API tool that we just

play10:56

built we also need to Define an llm that

play10:59

will construct the corrections against

play11:01

the critique or reflection and then

play11:04

we'll need to define a stopping

play11:06

condition uh for the reflection

play11:08

correction

play11:09

cycles and then we can just uh with

play11:12

those elements we can construct our tool

play11:14

interactive reflection

play11:15

agent so here are the steps so the

play11:18

critique agent is going to be a function

play11:21

calling agent worker so it needs to use

play11:24

uh you know functions or tools and you

play11:26

can see here we're passing it that

play11:27

perspective tool we're using here GPT

play11:29

3.5 turbo uh and we're saying uh please

play11:33

be verbose or uh actually default is yes

play11:37

please be

play11:38

verose and in step two we need to again

play11:41

Define the correction llm for

play11:43

Corrections we're going to use gp4 turbo

play11:46

preview and this step C we need to

play11:49

define a stopping condition uh so this

play11:52

function here helps us sorry again this

play11:56

function is meant to Define

play12:00

how or when these reflection and

play12:02

correction Cycles stop and this function

play12:05

operates on the critique or reflection

play12:08

string so it takes in the string and

play12:11

outputs a boole

play12:12

in and so what we are going to be

play12:15

looking for because we're going to

play12:17

prompt the agent as such um is we're

play12:20

going to look for the the the string

play12:23

pass in the critique string if this is

play12:26

in there then it's the stopping

play12:28

condition should be

play12:31

met okay and in this last step D here

play12:34

we're going to finally build our

play12:35

reflection agent worker and we've got a

play12:38

convenience Constructor method called

play12:40

from defaults here and here see here you

play12:43

could provide the critique agent worker

play12:46

the template so here again we passing

play12:48

that instruction to say WR pass

play12:50

otherwise right fail if the toxicity

play12:53

score is less than three so those two

play12:56

you know this string or this string

play12:58

should be in that CR Peak

play13:00

string then we have our soping condition

play13:02

and our correction

play13:04

llm and so that defines our tool

play13:07

interactive reflection

play13:10

agent and then in step two uh we can

play13:13

Define our main agent worker so if you

play13:15

recall the introspective agent delegates

play13:17

a task to two agents the main worker

play13:20

that is responsible for generating the

play13:21

initial response and then the reflection

play13:24

agent that's responsible for performing

play13:26

the reflection and correction Cycles now

play13:29

it actually the way that we've

play13:30

implemented it is that the main worker

play13:32

actually can be optional and if it's a

play13:36

non type then what happens is that we

play13:38

assume that the user input is actually

play13:41

the initial response so you don't need a

play13:43

main agent to generate that initial

play13:46

response okay with those two elements in

play13:49

hand now we can generate our

play13:51

introspective agent so we can do that

play13:54

using again another convenience

play13:56

Constructor called from defaults we're

play13:59

we going to pass in our reflective agent

play14:01

which is our tool interactive reflective

play14:03

agent and then our main agent worker

play14:05

which uh Again by default I believe will

play14:08

be uh none here so it's false therefore

play14:11

it'll go to none at which point again it

play14:14

will be assume that the user input um is

play14:17

the original

play14:20

response and we're implementing sorry

play14:23

we're adding here some chat history so

play14:25

just a prefix of messages that will uh

play14:27

you know dictate uh the the system

play14:30

message essentially so we're saying

play14:31

you're an assistant that generates safer

play14:33

versions of potentially toxic user

play14:35

supplied

play14:37

text and then the last step here we go

play14:40

from uh you know one sort of nuancing

play14:43

here is that we go from an agent worker

play14:45

to an actual agent uh so we just take

play14:47

that agent worker and run this as agent

play14:50

method and then we've got our

play14:53

introspective agent by calling that

play14:55

helper

play14:56

function so now that we've got our

play14:58

introspective agent that uses uh tool

play15:01

interactive uh reflection and using here

play15:04

in this case perplexity API tool we can

play15:07

test it out so here's a harmful piece of

play15:10

text um that I got as an example from

play15:12

the critic paper and we're going to pass

play15:15

this into our um introspective agent who

play15:18

will perform toxicity reduction so it'll

play15:20

take this harmful text and it will

play15:22

create a safer version of

play15:25

it so it's since it's for both it's

play15:28

printing out all the steps you can see

play15:29

that it's made a a function call to the

play15:32

prospective function tool this is when

play15:34

it's performing that

play15:36

reflection uh and then you could see

play15:38

that initial score is quite High 50% uh

play15:42

at with the toxicity attribute of

play15:45

insult um and it goes through these

play15:47

iterations until finally it reaches that

play15:49

stopping condition we said you know if

play15:52

finally it gets to a toxicity score of

play15:54

less than three then include the string

play15:57

pass within the critique string

play15:59

so our stopping fun condition function

play16:02

uh you know right correctly uh notice

play16:05

this string and said okay it's time to

play16:08

stop we've finally found a good

play16:11

response and so let's take a look at the

play16:13

response so here is a corrected version

play16:16

of the input people who do not eat uh

play16:19

meat for ethical reasons related to

play16:21

animal W welfare are making a personal

play16:23

decision it's important to respect

play16:25

diverse perspectives and experiences

play16:29

um that does seem indeed to be less

play16:32

harmful than the original text uh and we

play16:35

know that this new text has a toxicity

play16:37

score of

play16:39

1.37 in comparison to the original one

play16:42

which had

play16:43

.72 and here I'm just showing you that

play16:45

you can access the sources that were

play16:48

used by the introspective agent again

play16:51

this was done through delegation but

play16:53

nevertheless we surfaced it up here so

play16:56

it was actually the critique agent

play16:57

within the reflection agent that made

play17:00

these tool calls but you could see uh

play17:02

what's happened

play17:04

there we also store the memory uh sorry

play17:07

the chat history of all the U sub agents

play17:10

within uh the processing of this task as

play17:13

well uh in this chat

play17:16

store okay so that's an introspective

play17:19

agent using the tool interactive agent

play17:22

so again sorry so the tool interactive

play17:24

reflection agent so again there we're

play17:27

performing Reflection by using external

play17:29

tools and in in in this case we use the

play17:31

prospective API tool that we

play17:34

built um so now let's carry on to

play17:37

introspective agents that use

play17:41

self-reflection now I mentioned uh

play17:43

earlier that the difference between the

play17:45

two reflection mechanisms is that with

play17:48

the tool interactive one we're using

play17:50

external tools with self-reflection

play17:52

we're just using the LM itself no

play17:54

external tools so building this is

play17:56

actually a little bit easier um again

play17:59

we've got here an helper function to

play18:01

build the introspective agent with this

play18:03

reflection

play18:04

mechanism so step is the steps are uh

play18:08

pretty similar to before we need to

play18:10

First Define that reflection agent then

play18:13

we need to Define excuse me the main

play18:15

agent and then with these two things in

play18:17

hand we can construct our introspective

play18:19

agent remember again the introspective

play18:21

agent delegates tasks to these two

play18:24

agents okay so for self-reflection uh we

play18:27

can uh use the convenience Constructor

play18:30

which uh essentially just uses the llm

play18:33

so this llm will be used for both

play18:34

reflection and correction and there are

play18:37

default prompts baked into these uh

play18:39

reflection uh agents that you can you

play18:41

know take a look at within the source

play18:44

code again you've got your main worker

play18:47

that is responsible for building or

play18:49

generating that initial response it is

play18:51

optional here uh and we will again set

play18:54

it to none so we will assume again that

play18:56

the user input is the initial response

play18:58

that we want to reflect and

play19:00

correct on and and then okay so with the

play19:04

reflection agent and the main worker

play19:06

agent defined uh we can now uh create

play19:11

our introspective uh agent again using

play19:13

the same Constructor as before um and

play19:17

the same system method so everything is

play19:20

pretty much similar the only difference

play19:21

here is how we build their or reflection

play19:24

agent

play19:26

worker Okay so

play19:29

now that we've got our uh introspective

play19:32

agent that uses self-reflection Let's

play19:34

test it out on the same uh piece of text

play19:37

as

play19:40

before so again it's for both so it's

play19:43

going through its steps again here with

play19:45

self-reflection we're not making use of

play19:47

any external tools so that's why you

play19:48

don't see any tool call to perp

play19:51

perpetually sorry perspective API

play20:04

okay there we go it's

play20:06

done um and so now we can take a look at

play20:09

the response so here again is the

play20:11

corrected version people who choose not

play20:13

to eat meat for ethical reasons related

play20:15

to animal well welfare often spark

play20:17

intense discussions it's important to

play20:19

remember that individual actions and

play20:21

beliefs can vary widely and it's not

play20:23

productive to generalize about anyone's

play20:25

character based on their dietary choices

play20:28

seems like a pretty fair uh uh you know

play20:33

result so again we can access the memory

play20:36

store um as before and and one thing

play20:40

that you know we might be curious to

play20:41

know is actually what is the toxicity

play20:44

score for this one because again with

play20:46

self-reflection we're not using any

play20:48

external tools um but we can for sure

play20:51

after getting this you know response

play20:53

test it out and see what the toxicity

play20:55

score would be so let's do that now okay

play20:58

so it it Nets out a a toxicity score of

play21:02

1.2 uh which if we compare to the tool

play21:05

interactive

play21:08

one is 1.36 so essentially it does it

play21:12

better

play21:14

marginally um and and so that's how the

play21:17

self-reflection agent or sorry

play21:19

self-reflection yes agent me mechanism

play21:22

works with the introspective agent and

play21:25

finally in this last section of the

play21:27

notebook uh let's do a a you know a sort

play21:30

of a mini Showdown and definitely this

play21:32

is going to be super mini it won't

play21:35

really have uh statistical significance

play21:37

because I'm not going to use enough

play21:38

examples here um but I will say that uh

play21:42

at least the results does corroborate

play21:45

the results that were seen or observed

play21:47

in the critic paper and I'll get into

play21:49

that now uh which is essentially that um

play21:54

the tool interactive reflection if you

play21:56

do have tools that are appropriate for

play21:59

the reflection using these tools

play22:01

actually yields better responses than

play22:04

using self-reflection um

play22:08

mechanics so in this section what we'll

play22:10

do is we'll you know we'll put these two

play22:13

reflection mechanisms to the test and

play22:16

see which one Nets out in the smaller

play22:20

toxicity scores so again a fair warning

play22:23

here that these examples are um you know

play22:26

or may be offensive to some um uh so uh

play22:29

feel free to skip it if that is the

play22:32

case I will mention that these examples

play22:34

came from two sources the paper the

play22:37

critic paper as well as the Guardrails

play22:40

AI docs, which I've linked here;

play22:42

they had two or three examples that I

play22:44

pulled

play22:46

from okay so what we're going to do is

play22:49

have the introspective agent, or rather,

play22:52

two introspective agents: one using

play22:54

self-reflection one using tool

play22:55

interactive reflection and both are

play22:58

going to run through these toxic

play22:59

examples and generate safer versions of

play23:02

them so here are my introspective agents

play23:06

so as I mentioned before, I

play23:09

created these helper functions because I

play23:11

wanted to create uh these agents again

play23:14

very simply, and this time

play23:17

what I'll do is I'll set verbose equal

play23:18

to false for

play23:21

both okay so now that we've got our

play23:23

examples now as well as our agents let's

play23:26

just run through a loop that will

play23:29

know have each agent run through those

play23:32

tasks generating safer versions of

play23:36

them
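The loop just kicked off can be sketched as follows; the two agent callables and the example texts are stand-in stubs (the notebook's real agents come from the llama-index introspective agent package and run full reflection/correction cycles before returning a final response):

```python
# Stand-in stubs for the two introspective agents; the real ones run
# reflection/correction cycles internally before answering.
def self_reflection_agent(text: str) -> str:
    return f"safer({text})"

def tool_reflection_agent(text: str) -> str:
    return f"safer({text})"

agents = {
    "self_reflection": self_reflection_agent,
    "tool_interactive_reflection": tool_reflection_agent,
}

# Placeholder examples; the notebook's actual ones come from the CRITIC
# paper and the Guardrails AI docs.
toxic_examples = ["example one ...", "example two ..."]

results = []
for text in toxic_examples:
    for name, agent in agents.items():
        results.append(
            {"original": text, "improved": agent(text), "reflection_type": name}
        )
```

Each row records the original text, the rewritten version, and which reflection mechanism produced it, so the two mechanisms can be compared afterwards.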

play23:39

okay and while this is going what I'll

play23:42

do is essentially um I've added here as

play23:45

a little bonus so this is that critic

play23:47

paper that I've been referring to; the

play23:50

full title is critic large language

play23:53

models can self-correct with tool

play23:55

interactive critiquing, this is again by

play23:57

Gou et al. 2023, though I believe it was published in

play24:01

ICLR in

play24:05

2024 so essentially what the

play24:09

main results of that paper are that

play24:12

critic leads to substantial

play24:15

improvements over the baseline, so when

play24:17

you do use reflection you see uh

play24:19

substantial gains in a variety of tasks

play24:22

so QA math and they did toxicity

play24:25

reduction again we're following the

play24:26

toxicity reduction example that they did

play24:29

in their paper, so we're

play24:30

essentially replicating it to a certain

play24:32

degree. The second main result that I

play24:35

pulled out from that paper was that

play24:37

feedback is very crucial for, sorry,

play24:41

feedback from external

play24:44

tools is very crucial to this process of

play24:48

performing tasks, so if you do have

play24:51

tools that you can make use of and are

play24:53

appropriate for the task then you should

play24:55

make use of them because they will lead

play24:57

uh or they should lead to better results

play24:59

than not using

play25:01

them and then the last main result is

play25:03

critic performs better with stronger

play25:06

LLMs, so they tested

play25:09

text-davinci-003 versus GPT-3.5 Turbo

play25:13

and they found it to perform better on

play25:15

the better model, which was GPT-3.5

play25:19

Turbo. And in these paper cards I

play25:22

essentially go through other

play25:24

things, so I go through the

play25:25

contributions, the insights, and then I do

play25:28

a little bit of an illustration of the

play25:30

technical bits so uh you could see here

play25:33

again with critic you've got initial

play25:35

response or output then you go through a

play25:38

reflection or critique phase then you go

play25:40

through a correction phase based upon

play25:42

that critique, and you do that in cycles

play25:45

until a stopping criterion is met

play25:48

again in that paper they use perspective

play25:49

API which we are also doing

play25:52

here okay so that's the paper card um

play25:56

again just a visualization that I like

play25:59

to create whenever I read a paper that I

play26:01

feel I would like to remember uh you

play26:04

know key takeaways from in the future

play26:07

all right

play26:08

so uh good timing so essentially that

play26:12

loop has finished and we've displayed

play26:14

the results in a DataFrame, and you

play26:17

can see here: here is the improved text,

play26:20

here is the reflection type and the

play26:22

original text, you can see, has a 49

play26:25

toxicity score according to Perspective

play26:27

and with tool interactive reflection it

play26:30

gets down to 1.15 the same example using

play26:33

self-reflection goes down to 5.5, and

play26:36

then so on and so

play26:38

forth so you could see across the board

play26:40

essentially that both reflection

play26:43

mechanisms lead to a drastic

play26:46

improvement uh over the original text

play26:48

which is good so the agent is doing its

play26:50

job and reflection is having an

play26:55

effect and you know just for

play26:57

comparison's sake here

play26:59

what we'll do is we'll group by the

play27:01

reflection method and take the average

play27:03

of the improved toxicity score and we

play27:06

see self-reflection has a slightly

play27:08

higher toxicity score than tool

play27:12

interactive reflection does, and as I mentioned before
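That group-by-and-average step can be sketched in plain Python (the notebook does it on the DataFrame, presumably with a pandas `groupby`; the rows below reuse scores quoted in the video, but their pairing into a single table is illustrative):

```python
from statistics import mean

# Illustrative rows mirroring the results frame: reflection type plus the
# improved toxicity score. The numbers reuse scores quoted in the video.
rows = [
    {"reflection_type": "tool_interactive", "improved_toxicity": 1.15},
    {"reflection_type": "self_reflection", "improved_toxicity": 5.5},
    {"reflection_type": "tool_interactive", "improved_toxicity": 1.36},
    {"reflection_type": "self_reflection", "improved_toxicity": 1.2},
]

# Equivalent of df.groupby("reflection_type")["improved_toxicity"].mean()
by_type = {}
for row in rows:
    by_type.setdefault(row["reflection_type"], []).append(row["improved_toxicity"])

averages = {name: mean(scores) for name, scores in by_type.items()}
# tool_interactive averages lower than self_reflection here, matching the
# direction reported in the video and in the CRITIC paper.
```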

play27:15

that's essentially the observation that

play27:18

um the authors pointed out in their

play27:20

paper as well, and I mentioned

play27:23

here in this key result too. Of course

play27:26

there they did it across three different

play27:28

tasks with many more examples for

play27:31

each case study, and so their

play27:35

observations are probably closer to

play27:37

statistical significance than ours here

play27:40

but nevertheless it is again in

play27:42

agreement with uh what the authors

play27:45

observed so in summary um this is the

play27:49

introspective agent notebook, where

play27:53

we've introduced the introspective agent

play27:55

class that performs tasks using

play27:58

reflection and in our implementation an

play28:01

introspective agent delegates a task to

play28:03

a main worker agent that performs

play28:07

the task to get the initial response and

play28:09

then that initial response gets

play28:11

passed to another reflective agent that

play28:13

performs reflection and correction

play28:15

cycles and in that package we also have

play28:19

again self-reflection and Tool

play28:21

interactive reflection uh and you can

play28:23

see here which again is in agreement

play28:25

with the critic paper: if you do have

play28:28

appropriate tools that you can use to

play28:29

perform the reflection then you should

play28:31

make use of them, because it seems that

play28:34

using external tools for reflection does

play28:37

net out better

play28:40

results and that's it for today folks um

play28:43

hope you enjoyed the notebook, and if

play28:46

you have any questions uh or comments

play28:49

then you know feel free to submit an

play28:51

issue in GitHub or reach out to us in

play28:54

our Discord thanks
