Self-correcting code assistants with Codestral

LangChain
29 May 2024 · 16:04

Summary

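TLDR: Lance introduces Codestral, Mistral's code generation model. The model excels at programming-specific tasks such as code completion and also supports tool use. Code generation models suit the custom code assistants that many companies want, and it is easy to evaluate whether generated code actually executes. The video adopts the flow engineering idea for code generation proposed by Codium AI: after generating code, test it inline, and if it fails, loop back and retry. This approach produces better results and is demonstrated in combination with LangChain's tool use.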

Takeaways

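  • 😀 Lance from LangChain announces that Mistral has released Codestral, a code generation model.
  • 🔥 Codestral excels at code generation tasks and is well suited to fill-in-the-middle and code completion.
  • 📚 Codestral is trained on programming languages and has an instruct version that supports tool use.
  • 🛠️ Lance says code generation models are very generally useful and notes that many companies want custom code assistants.
  • 🔍 Lance stresses that code is easy to evaluate and test, and points to the flow engineering idea for code generation from Codium AI, which Karpathy summarized.
  • 🔧 He presents a simple flow that uses Codestral to generate a code solution for a question, tests the code, and loops back to retry on failure.
  • 💻 Lance demonstrates generating and testing code from user questions with Codestral and LangChain.
  • 📝 LangChain's tool use support makes the model's output follow a specific structure.
  • 🔗 The flow that repeats code generation and testing is built with the LangGraph library.
  • 🔄 In the demo, when an error occurs it is passed back to the LLM, which reflects on it and attempts to self-correct.
  • 📈 Lance concludes that Codestral and LangGraph together make code generation with self-correction effective.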

Q & A

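  • What is Codestral?

    -Codestral is a new code generation model from Mistral that excels at code generation tasks such as code completion and fill-in-the-middle.

  • What are Codestral's main features?

    -Codestral is trained on programming languages and has an instruct version that supports tool use. The code it generates is also easy to test.

  • What is LangChain's Chat LangChain?

    -Chat LangChain is a tool for QA over the LangChain documentation. It can produce functioning code blocks based on a user's question.

  • Why are code generation models generally useful?

    -Many companies want custom code assistants, and generated code is easy to test by executing it, which makes these models practical to build on.

  • What is flow engineering for code generation?

    -Flow engineering means generating a code solution, checking the generated code inline, and retrying if there is a problem.

  • What are concrete examples of code checks?

    -Examples include checking whether the imports work, whether the code executes, and whether it passes unit tests.

  • What is the role of LangGraph?

    -LangGraph is a library for building flows, and it is especially well suited to workflows with cycles or feedback.

  • What is the basic structure of the code generation flow?

    -The flow takes a user question, generates code, tests that code, and retries if there is a problem.

  • What is LangChain's structured output feature?

    -LangChain's structured output feature has the model produce output that follows a given schema. The output comes back as a JSON string and is then parsed into an object.

  • What is the benefit of self-correction in the code generation flow?

    -With self-correction, errors in the generated code are detected and passed back to the model for another attempt, which significantly improves accuracy and usability.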

Outlines

00:00

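😀 Introducing the code generation model

Lance talks about Codestral, a new code generation model released by Mistral. The model excels at code generation tasks such as fill-in-the-middle and code completion. It is trained on programming languages and has an instruct version that supports tool use. Lance says code generation models are very generally useful and explains that many companies want custom code assistants. At LangChain, for example, Chat LangChain is a QA system over the docs that can produce functioning code blocks based on user questions. Code is also easy to evaluate and test, and he introduces the idea that a generated code solution can be checked inline.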

05:00

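🛠️ Introducing the code generation flow

Lance emphasizes how powerful flow engineering is for code generation. Drawing on Codium AI's AlphaCodium work, which Karpathy summarized, he explains that a code generation flow produces better results. The flow is simple: generate a code solution, test it inline, and loop back to try again if it fails. Lance sets up a simple test case with this approach to show how it improves the accuracy and usability of code generation models. He also shows how to combine tool use with Codestral so that the output object contains a preamble, the imports, and the code itself, and how to add simple code checks.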

10:04

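🔄 Automating code checks with LangGraph

Lance shows how to automate code checks with the LangGraph library. LangGraph is well suited to tasks with feedback loops and pairs naturally with code generation. Lance defines the nodes and edges of a graph to build a flow that runs from code generation through code checking. If there is an error, the error message is passed to the LLM for another attempt. He walks through this with examples he actually runs, including one where the LLM self-corrects after an error.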

15:08

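📈 Test cases for code generation and self-correction

Lance walks through code generation and self-correction in more detail. The test cases range from a simple problem, a Python program that prints 'Hello World', to a more sophisticated one, vectorizing a function. In each case the model generates code for the problem and the flow checks whether that code executes. If an error occurs, the error message is passed to the LLM to prompt self-correction. The runs are traced with LangSmith, and Lance concludes that the approach is very effective.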

Keywords

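💡Code generation model

A code generation model is an AI model trained on programming languages that handles tasks such as generating and completing code. The video introduces Codestral, the code generation model released by Mistral, and discusses its capabilities and applications. Code generation models do well on programming-specific tasks, and many companies want them as custom code assistants.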

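💡Tool use

Tool use is a model's ability to call a function or tool whose schema it has been given. In the video, Codestral's instruct version uses tool use to return its answer in a defined structure. Chat LangChain, LangChain's QA system over its docs, is mentioned separately as an example of a code assistant that produces functioning code blocks.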

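💡Code evaluation

Code evaluation means checking generated code against criteria such as whether it executes and whether its imports work. The video presents the flow engineering idea: evaluate the code the model generates and, if there is a bug, have the model try again.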
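The two minimum checks named in the video (do the imports work, does the code execute) could be sketched as follows. This is an illustration in plain Python, not the notebook's code; the function name `check_code` and the error strings are assumptions.

```python
from typing import Optional


def check_code(imports: str, code: str) -> Optional[str]:
    """Return None if both checks pass, otherwise an error message.

    Hypothetical helper: it runs untrusted code with exec(), so only use
    it in a sandboxed or throwaway environment.
    """
    # Check 1: do the imports work on their own?
    try:
        exec(imports, {})
    except Exception as exc:
        return f"Import test failed: {exc!r}"

    # Check 2: do the imports plus the code block execute together?
    try:
        exec(imports + "\n" + code, {})
    except Exception as exc:
        return f"Execution test failed: {exc!r}"

    return None


print(check_code("import math", "print(math.sqrt(16))"))  # passes both checks
print(check_code("import not_a_real_module", "pass"))     # fails the import check
print(check_code("", "print(image.shape)"))               # fails at execution
```

Returning the error as a string, rather than raising, keeps the failure available to hand back to the model on the next attempt. A stricter version could run unit tests as a third check.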

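💡Flow engineering

Flow engineering is an approach that builds a feedback loop into the code generation process. In the video, code is evaluated after it is generated, and on failure the error is fed back to the model for another attempt. Lance cites the AlphaCodium paper as showing that this kind of flow produces much better results.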
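As a rough sketch of the generate, check, retry loop: the `generate` callable below is a hypothetical stand-in for the model call, and `fake_generate` is a stub whose first answer is deliberately broken so the retry path is exercised.

```python
from typing import Callable, List, Optional, Tuple


def run_flow(generate: Callable[[List[str]], str],
             question: str,
             max_iterations: int = 3) -> Tuple[Optional[str], int]:
    """Generate code, execute it, and retry with the error fed back."""
    messages = [question]
    for attempt in range(1, max_iterations + 1):
        code = generate(messages)
        try:
            exec(code, {})          # inline check: does the code execute?
            return code, attempt    # success: return to the user
        except Exception as exc:
            # Feed the failure back so the next attempt can correct it.
            messages.append(f"Your solution failed: {exc!r}. Try again.")
    return None, max_iterations     # gave up after max_iterations


# Stub "model" (an assumption for illustration): a broken first answer,
# then a fixed one once an error message is in the history.
def fake_generate(messages: List[str]) -> str:
    if len(messages) == 1:
        return "total = sum(values)"  # NameError: values is undefined
    return "values = [1, 2, 3]\ntotal = sum(values)"


code, attempts = run_flow(fake_generate, "Sum a list of numbers")
print(attempts)
```

The iteration cap matters: without it, a model that keeps producing failing code would loop forever.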

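💡LangChain

LangChain is a framework for building applications with large language models. In the video it supplies the structured output (tool use) binding for Codestral, and Lance mentions Chat LangChain, which does QA over the LangChain docs and produces functioning code blocks from user questions.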

💡Codium AI

Codium AI is the company behind the AlphaCodium work cited in the video. That work argues for flow engineering in code generation, and the video builds on the idea.

💡LangSmith

LangSmith is the tracing tool used in the video. It records each step of a run, so the generation and code check results can be inspected afterwards. Its use is optional.
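Setup in the video amounts to an API key plus optional tracing settings. A sketch of that configuration follows. The environment variable names `MISTRAL_API_KEY`, `LANGCHAIN_TRACING_V2`, `LANGCHAIN_API_KEY`, and `LANGCHAIN_PROJECT` are assumptions, since the video does not spell them out, so check them against the current documentation. The values, including the project name, are placeholders.

```python
import os

# Placeholder values; replace with real keys. The variable names are
# assumptions based on common LangChain/LangSmith setup, not the video.
os.environ.setdefault("MISTRAL_API_KEY", "<your-mistral-api-key>")

# Optional: send traces to LangSmith under a named project.
os.environ.setdefault("LANGCHAIN_TRACING_V2", "true")
os.environ.setdefault("LANGCHAIN_API_KEY", "<your-langsmith-api-key>")
os.environ.setdefault("LANGCHAIN_PROJECT", "codestral-self-correction")
```

`setdefault` leaves any value already present in the environment untouched, so real keys exported in the shell take precedence over the placeholders.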

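💡Structured output

Structured output means the model's response is constrained to a specified format. In the video the model is asked to return a preamble (prefix), the imports, and a code block according to a schema. This makes the response more predictable and easier to handle in later steps.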
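To make the three-field schema concrete, here is a stdlib-only sketch of the round trip described in the video: the model returns a JSON string and a parser turns it back into a typed object. The video uses a Pydantic object with LangChain's structured output; the dataclass, field names, and sample JSON below are stand-ins chosen for illustration.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Code:
    """Schema for a code solution (field names are assumptions)."""
    prefix: str    # description of the problem and approach
    imports: str   # import statements only
    code: str      # code block, excluding the imports


def parse_code(raw: str) -> Code:
    """Parse a JSON string into a Code object, rejecting missing fields."""
    data = json.loads(raw)
    expected = [f.name for f in fields(Code)]
    missing = [name for name in expected if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return Code(**{name: data[name] for name in expected})


# Sample of what a tool call's arguments might look like (made up here).
raw_output = json.dumps({
    "prefix": "A recursive Fibonacci function.",
    "imports": "",
    "code": "def fib(n):\n    return n if n < 2 else fib(n - 1) + fib(n - 2)",
})

solution = parse_code(raw_output)
print(solution.prefix)
print(solution.code)
```

Keeping imports and code in separate fields is what lets a later step test the imports on their own before running the whole solution.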

💡LangGraph

LangGraph is a library from the LangChain team that is well suited to workflows with feedback loops. The video uses it to build a process that repeats code generation and checking, so the flow can self-correct.
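The graph in the video has two nodes (generate, code check) and one conditional edge. The sketch below mimics that shape in plain Python so the control flow is visible without any library. It is not LangGraph's API, and the state keys, node names, and stubbed model are assumptions.

```python
from typing import Callable, Dict, List, TypedDict


class GraphState(TypedDict):
    error: str            # "yes" or "no", written by the check node
    messages: List[str]   # history passed to the model
    generation: str       # latest code solution
    iterations: int       # number of generation attempts so far


MAX_ITERATIONS = 3


def generate(state: GraphState) -> GraphState:
    """Node: produce a solution (stubbed) and append it to the messages."""
    # Stub model: fails first, succeeds once an error is in the history.
    failed_before = any("failed" in m for m in state["messages"])
    code = "print('hello world')" if failed_before else "print(greeting)"
    messages = state["messages"] + [f"Attempt: {code}"]
    return {**state, "generation": code, "messages": messages,
            "iterations": state["iterations"] + 1}


def code_check(state: GraphState) -> GraphState:
    """Node: execute the solution and record the outcome in the state."""
    try:
        exec(state["generation"], {})
    except Exception as exc:
        note = f"Your solution failed the code execution test: {exc!r}"
        return {**state, "error": "yes",
                "messages": state["messages"] + [note]}
    return {**state, "error": "no"}


def decide_to_finish(state: GraphState) -> str:
    """Conditional edge: stop on success or when attempts run out."""
    if state["error"] == "no" or state["iterations"] >= MAX_ITERATIONS:
        return "end"
    return "generate"


NODES: Dict[str, Callable[[GraphState], GraphState]] = {
    "generate": generate, "check_code": code_check}

state: GraphState = {"error": "", "messages": ["Print hello world"],
                     "generation": "", "iterations": 0}
current = "generate"                      # entry point
while current != "end":
    state = NODES[current](state)
    # Fixed edge generate -> check_code; conditional edge after the check.
    current = "check_code" if current == "generate" else decide_to_finish(state)

print(state["iterations"], state["error"])
```

Each node takes the shared state and returns an updated copy, which matches the description in the transcript of state living across nodes and edges. A graph library replaces the hand-written `while` loop with declared nodes and edges.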

💡Self-correction

Self-correction is the process in which generated code is checked and, when a problem is found, the model attempts a fix. The video emphasizes its benefits and shows how to build it with LangGraph.

Highlights

Lance from LangChain introduces Mistral's release of Codestral, a code generation model.

Codestral excels at tasks like code completion and is trained on programming languages.

The instruct version of Codestral supports tool use.

Lance discusses the general usefulness of code generation models for companies.

Introduces Chat LangChain, a QA system that produces functioning code blocks from questions.

Code is easy to evaluate, either by execution or through unit tests.

Flow engineering for code generation is highlighted as a powerful idea from Codium AI.

The concept allows for inline checking of code solutions during the inference flow.

Lance demonstrates a simple test case using Codestral with a code generation flow.

Shows how to use function calling or tool use with Codestral to produce an output object.

Details the structure of the output object including preamble, imports, and code.

Lance explains the incorporation of simple code checks into the workflow.

Demonstrates how to loop back and retry if code checks fail.

LangGraph is introduced as a library for building flows, especially with cycles or feedback.

LangGraph is used to create a workflow for code generation with self-correction.

Lance builds and runs a graph for a simple 'Hello World' program.

Shows how the flow appends error messages to guide the model for self-correction.

Demonstrates a more sophisticated example of vectorizing a function with self-correction.

Lance concludes by showcasing the effectiveness of using code generation with self-correction in practical scenarios.

Transcripts

play00:00

hey this is Lance from LangChain so

play00:03

Mistral released Codestral today which is a

play00:05

code generation model um which I'm

play00:07

actually really excited about so it's

play00:10

really good at code generation tasks

play00:11

like fill-in-the-middle or code completion

play00:13

it's trained on a programming language

play00:15

it has an instruct version that supports

play00:16

tool use but one of the reasons why I

play00:19

really like code generation models and

play00:20

I've actually done quite a bit of work

play00:22

with them is that they're just very

play00:23

generally useful so lots of companies

play00:25

for example want custom code

play00:27

assistants that might combine like some

play00:28

documentation plus code generation um

play00:31

at LangChain for example we have

play00:33

something called Chat LangChain it's

play00:34

basically QA over our docs it can

play00:36

produce functioning code blocks for

play00:38

users based on questions um and one of

play00:42

the other things is cool about code is

play00:43

really easy to evaluate it's really easy

play00:45

to test does this code actually execute

play00:46

or not um and so a really powerful idea

play00:50

related to code generation was put out a

play00:52

few months ago um from the folks at

play00:54

Codium AI and Karpathy summarized it

play00:57

really nicely here in that this idea of

play01:00

flow engineering for code for code

play01:03

generation is really powerful and the

play01:05

idea shown in the paper the AlphaCodium

play01:07

work and kind of highlighted here in

play01:09

this tweet from Karpathy is simply that

play01:12

if you produce a code solution you can

play01:15

really easily check it in line kind of

play01:18

as mentioned here it's pretty easy to

play01:19

evaluate code at the minimum does it

play01:21

execute do the Imports work um in the

play01:24

maximum case do you have like a an

play01:26

actual solution do unit test but in any

play01:28

case the point is code is very easy to

play01:30

test and you can actually test it in

play01:32

your inference flow so you produce a

play01:35

generation you then test the code if it

play01:38

fails you can loop back and try again

play01:40

and this idea of kind of a a code

play01:42

generation flow was shown in the paper

play01:44

to produce much better results and it's

play01:46

something that I want to show today uh

play01:48

using Codestral uh in a really simple

play01:50

test case so this is something I've done

play01:53

a little bit in the past and I found to

play01:54

be extremely effective it's a very

play01:55

simple idea but here's the basic flow

play01:58

that we will kind of highlight

play02:00

so I want to be able to take a question

play02:03

related to code generation pass it to

play02:05

the model so pass it to Codestral and have

play02:07

Codestral produce a

play02:09

solution um now what I'm going to do is

play02:11

I'm going to use a function calling or

play02:13

tool use with Codestral to produce an output

play02:15

object that has three things a preamble

play02:18

stating like here is the problem I'm

play02:19

trying to solve the Imports and the code

play02:23

itself and what I'm going to do is I'm

play02:24

going to show how it's really easy to

play02:26

incorporate some simple code checks like

play02:28

do the imports work does the code execute if

play02:30

either fail like there's a bug in the

play02:32

code then I'm going to show how to loop

play02:35

back and retry and this simple kind of

play02:37

like check-retry loop is a way to

play02:40

significantly improve the the accuracy

play02:42

and and kind of usability of code

play02:45

generation models I'm going to show how

play02:46

to do that right now so to kick this off

play02:50

I have a notebook here I've done a few

play02:52

pip installs I've just set my Mistral API

play02:55

key that's really it and I'm also going

play02:56

to use LangSmith for tracing of course

play02:58

this is optional I'm going to set an

play02:59

environment variable for my LangChain

play03:01

project which will basically will

play03:03

indicate where all my traces will go and

play03:06

this is just like the the kind of flow

play03:07

we want to lay out so we basically want

play03:09

to use Codestral to take a user question

play03:12

produce a solution and we want to test

play03:13

that solution if it passes our test

play03:16

return to the user if it doesn't try

play03:17

again that's all we want to do so what

play03:20

I'm going to do here is first let me

play03:22

just show some very basic components so

play03:25

first let's just talk about how to

play03:26

actually use Codestral here's my general

play03:28

prompt I basically just I'm going to

play03:29

tell the model you're a code assistant

play03:31

ensure that all the code can be executed

play03:33

with all the Imports and variables

play03:34

defined structure your answer in three

play03:36

ways give me a preamble or a prefix

play03:38

describing the code solution give me the

play03:40

imports give me a functioning code

play03:42

block so I'm going to ask for those

play03:44

three things now here's where tool use

play03:46

comes in I can actually Define the

play03:47

schema of the output that I actually

play03:49

want and what I can do is I can bind

play03:51

that using LangChain's very convenient

play03:54

with_structured_output I can basically

play03:56

bind that to the llm and then this chain

play03:59

will invoke the LLM using the structured

play04:01

output now here's how that actually

play04:03

works under the

play04:04

hood basically this object that passes a

play04:07

Pydantic object is converted into function

play04:10

schema for Mistral and it's then passed

play04:13

or bound to the llm so the llm has

play04:16

access to this function and it knows the

play04:18

schema that it should return when that

play04:20

function call or tool is invoked so

play04:23

basically what happens is I can take a

play04:25

user question the function is invoked

play04:28

and then the llm knows to produce an

play04:30

output that adheres to my schema and

play04:33

this will basically be a Json string

play04:35

again remember llm is just string to

play04:36

string so it's going to be a Json string

play04:39

and then under the hood with this with

play04:41

structured output thing that I'm using

play04:42

from Lang chain we apply an output

play04:44

parser that basically a Pydantic parser

play04:46

should take a JSON string convert it

play04:48

back to a Pydantic object so that's it

play04:49

that's all that's going on um but I'll

play04:51

show you how this is really cool so I'm

play04:54

defining this object this is what I want

play04:55

to get out here's my chain now let's

play04:57

test this out write a function for

play04:58

Fibonacci

play05:00

um I passed it in as a user question um

play05:03

so that's

play05:05

it now this is

play05:08

running great and we see a result now

play05:10

here's what's cool if you look at this

play05:12

result object it actually is a code

play05:15

object so it basically it's a Pydantic

play05:17

object following the schema we specify

play05:19

here it has a prefix boom it has some

play05:22

imports actually in this case none and

play05:24

then it has the code block that's it so

play05:27

we'll see why this is really useful in a

play05:29

little bit but want to introduce that

play05:30

idea of basically we can use Codestral

play05:33

with tool use to produce structured

play05:35

outputs which is generally very useful

play05:37

and in the particular for this notion of

play05:40

kind of like inline self-correction is

play05:42

extremely useful cool so that's that

play05:44

first piece now what I'm going to

play05:47

introduce here is LangGraph so

play05:50

LangGraph is a library from the LangChain

play05:51

team and we've used this a number of

play05:54

other videos um and I've used this kind

play05:56

of extensively in general uh to build

play05:59

flows and this is an example of a

play06:01

flow the main characteristic of the flow

play06:03

that I highlight here that LangGraph is

play06:05

really well suited for is anything with

play06:06

a cycle so anything with feedback

play06:09

basically what it's saying is I want

play06:12

every time I run my app I want to do

play06:15

this code generation produce a

play06:17

structured output do some kind of code

play06:19

checks make a decision based on the

play06:21

outcome of those code checks feedback if

play06:23

they fail finish if they pass that you

play06:26

can think of as like a very simple kind

play06:27

of like workflow um and LangGraph is a

play06:30

great way to build these kinds of

play06:31

workflows and we'll see why so the first

play06:33

thing I need to specify with LangGraph is

play06:35

just simply the graph State now this is

play06:37

just a thing that lives throughout the

play06:38

lifetime of My Graph it basically

play06:40

represents all information that's shared

play06:42

across what you might call these nodes

play06:44

so in this case I have two particular

play06:45

nodes and you might call this an edge so

play06:48

this is kind of where I'm making a

play06:49

decision um so State lives across these

play06:52

nodes and edges so that's really it so

play06:54

I'm going to Define my state it's going

play06:55

to contain some information that's

play06:57

relevant to the flow I just talked about

play06:59

so it's going to contain an error message it's

play07:01

going to contain my final generation

play07:03

it's going to contain the messages that

play07:04

are being passed to my

play07:05

llm and this will all become a little

play07:07

bit more clear as we go forward so here

play07:11

I'm going to lay out this is basically

play07:13

the nodes and the edges of My

play07:15

Graph now what you'll see is for every

play07:19

node here's my generate node and that's

play07:21

what we laid out here

play07:23

generate um it's going to take in the

play07:26

state and the nodes just modify the

play07:28

state in some way so that's how to think

play07:30

about the nodes so in this case I take

play07:31

in the state I unpack the state into

play07:34

like some messages uh some number of iterations

play07:37

an error message these are things we're

play07:39

going to use throughout our graph um so

play07:42

then what I'm going to Simply do is

play07:45

compute a code solution so I'm going to

play07:46

look at my messages in and I'm going to

play07:48

generate a solution now remember that's

play07:51

exactly what we did up here so this is

play07:52

actually nothing new remember look at

play07:54

this this is just we Define a set of

play07:56

messages invoke our code gen chain get

play07:58

an output

play07:59

same thing we're going to be doing in

play08:00

our graph so this is nothing exotic

play08:02

we've actually already tested this and

play08:05

once that runs I'm just going to append

play08:08

that output of code solution to my

play08:10

messages okay so you know again here's

play08:14

my attempt to solve the problem I'm just

play08:15

going to take the codes the prefix the

play08:17

Imports and the code I'm just going to

play08:19

add that as a new message I'm going to

play08:20

increment my iterations we'll use as

play08:22

iterations to determine when to stop um

play08:25

and I'm just going to return then my

play08:27

state with a few things here first is

play08:30

going to be my code solution My

play08:33

Generation that's it then it's going to

play08:35

be my my stack of messages which is

play08:37

basically just appended to and then the

play08:39

number of iterations that's really it

play08:41

that's it so nice and easy there now the

play08:43

code check is the second kind of big

play08:45

node that we're going to be working with

play08:46

so our first node is Generation the

play08:48

second node is our code checks we just

play08:50

saw Generation generation can return the

play08:52

the generation with the three pieces the

play08:55

Preamble The Imports and the code block

play08:57

now it's going to be passed to code

play08:58

check

play09:00

so code check is really anything we want

play09:03

to be we can do any kind of checks on

play09:05

this code now maybe in the best case

play09:07

with some kind of unit test we could run

play09:09

I'm going to show you the simplest

play09:10

possible code check that we might want

play09:12

to do so in this particular particular

play09:14

case what I'm going to do is I'm I'm

play09:16

going to get the code solution from our

play09:18

state remember we wrote that out to

play09:20

state so the generation contains our

play09:22

code solution and in this node I just

play09:24

pick it back up from State you know

play09:25

state's passed to every node I get the code

play09:28

solution I extract the three pieces

play09:30

and we just showed that above so I get

play09:31

the prefix the Imports and the code and

play09:33

now all I'm going to do is simply just

play09:35

test

play09:36

execution do does Imports execute if not

play09:40

I'm going to throw a flag or I'm going

play09:41

to kind of throw a message here

play09:43

code import failed I'm going to take an

play09:45

error message I'm going to append that

play09:46

error message to our messages object and

play09:48

I'm going to return that that's it and

play09:52

alternatively if that passes I'm going

play09:55

to go ahead and try the whole thing so

play09:56

I'm going to combine the Imports and the

play09:58

code um I'm going to go ahead and

play09:59

execute the the code and again if that

play10:03

fails I'm going to basically return

play10:04

another error message um and now if

play10:07

there's no errors then that's great I

play10:09

confirm that you know there's no test

play10:11

failures I've set this error flag to no

play10:14

and everything else the same as before I

play10:15

return the messages I return iterations

play10:17

I return code Solution that's it now

play10:19

this is the final bit all we're going to

play10:22

do here is decide whether or not to

play10:24

finish this is basically our little

play10:25

conditional Edge which we talked about

play10:27

here and all this needs to to do because

play10:30

we wrote that error flag to state so

play10:33

again remember we wrote error no if none

play10:35

of these tests failed we wrote error yes

play10:37

if either one does right there and if

play10:39

that's the case all we need to do then

play10:42

is get our get our error from the state

play10:45

um if it's no or we've exceeded the max

play10:47

iterations uh then we just go ahead and

play10:49

finish um and if yes then we go back to

play10:54

generate that's

play10:56

it and that's really it so we're just

play10:59

going to define all of those

play11:02

pieces and we're basically almost done

play11:04

here let's just build this graph now

play11:06

this is how in LangGraph you can actually

play11:08

assemble your workflow all I need to do

play11:10

is take that function I defined generate

play11:12

add it as a node take the function we

play11:14

defined code check add it as a node

play11:16

again this is my like this is the state

play11:18

graph um and I just build the graph here

play11:21

set my entry point as generate um add an

play11:23

edge and then uh so basically I go from

play11:27

generate to check code and I go from

play11:29

check code to basically I decide to

play11:32

finish based

play11:34

upon this logic right here and basically

play11:39

if if it returns end then I end if it

play11:43

returns generate I go back to generate

play11:45

okay um so that's really the Crux of all

play11:48

you need to do and I can go ahead and

play11:50

run that and actually this will draw my

play11:51

graph for me using this little display

play11:53

feature right here so we can see we

play11:55

start we go to generate we go to code

play11:57

check optionally depending on what

play12:00

happens from the code check our conditional

play12:02

edge will go back to try to

play12:04

regenerate um or if there's no errors we

play12:08

go to end so that's really it it's

play12:10

pretty

play12:11

nice um and the one thing I'll just make

play12:15

a note of is as we're going through this

play12:17

flow we're actually appending to our

play12:18

messages and so basically if the if

play12:21

there's an error we're appending that

play12:24

failure to our messages and we're

play12:26

basically telling llm here is the

play12:29

failure reflect on this error um State

play12:32

what you think went wrong and try again

play12:35

so that's really

play12:37

it nice and easy and there's our

play12:40

flow cool now let's try this out here's

play12:44

like the simplest possible um you know

play12:47

kind of problem write a Python program

play12:48

that prints hello world right so let's

play12:50

try this out I'm going to run this and

play12:53

what's kind of nice is I have this kind

play12:55

of this this nice kind of formatting

play12:57

stuff you can kind of see the input um

play13:00

yeah okay so write a program that prints hello

play13:02

world generating code solution and then

play13:04

here's here's kind of my attempt to

play13:06

solve it um here's actually the the

play13:09

Imports none the code here's the code it

play13:12

goes through the checks no test failures

play13:13

and it ends cool now what's nice is I

play13:17

can go over to LangSmith and I have this

play13:21

project right here now this project

play13:24

actually lays out exactly what we just

play13:26

did but we can actually dig into each

play13:27

piece so here we went we start our graph

play13:30

we went to generate I go to again this

play13:32

is using the Codestral model so here's here

play13:35

is my uh you know human message or my

play13:37

question in um this is showing that it

play13:40

does indeed invoke my function so that's

play13:42

great um here's the prefix here's the

play13:44

Imports here's the code block um it uses

play13:47

a Pydantic tools parser to basically write

play13:50

that out as a Pydantic object we talked

play13:51

about that

play13:52

previously um and then here's the code

play13:55

check so basically it goes through the

play13:58

the various code checks and you can kind

play14:00

of check all these here um and then it

play14:03

goes through the decision to finish um

play14:06

and in this particular case because none

play14:07

of the code checks failed it finished

play14:09

so that's great so this is a good

play14:11

example of like of kind of how the flow

play14:13

Works in a very simple test case now

play14:15

let's try something that's a little more

play14:18

sophisticated cool so in this case I'm

play14:21

basically asking it to vectorize a function

play14:23

I give it a function um I ask it

play14:25

to show me a test case with this with

play14:27

this actually working okay so we can see

play14:29

that it kicks off the flow here I want

play14:30

to vectorize a function um here's the

play14:33

attempt to solve the problem uh so here's

play14:36

kind of the initial solution now what we

play14:38

see here is your solution failed the

play14:39

code execution test it did not Define

play14:41

image reflect on the error attempt to

play14:43

solve it here's my attempt to solve a

play14:45

problem the error occurs because variable the

play14:47

the variable image is not defined to

play14:48

solve this problem so you make it kind

play14:50

of reflect on its error and try again so

play14:53

it goes and tries again we see it fails

play14:55

again for a different reason your solution

play14:57

failed the code execution test it could not

play14:58

broadcast um uh 50 50 50 into a shape

play15:03

50-50 53 okay um so here's my here's the

play15:08

attempt to solve the problem it kind of

play15:10

explains itself it goes back through and

play15:13

then all the code tests now pass um so

play15:16

that's basically it and then

play15:18

finish so this showcases how you can use

play15:21

code generation using the new Codestral model

play15:23

from Mistral with self-correction using

play15:27

LangGraph and what we showed in in general

play15:30

is the ability to to perform code

play15:32

generation and perform arbitrary checks

play15:35

on the output of the generation itself

play15:37

if any checks fail Loop them back use a

play15:39

message queue to accumulate over time uh

play15:42

or over iterations in this flow the

play15:45

various errors and then pass them back

play15:47

to the LLM to attempt to self-correct and

play15:50

we seen we've seen this work pretty

play15:51

effectively in a very simple test case

play15:53

but I've actually seen this work really

play15:55

well in the case of uh code generation

play15:57

with RAG and and uh we have some other

play16:00

uh resources on that which I'll share

play16:01

later thanks


Related Tags
Code generation, Self-correction, LangChain, Codium AI, Tool use, Code testing, Inline evaluation, Programming, AI models, Developer assistance