Will "Claude Investor" DOMINATE the Future of Investment Research?" - AI Agent Proliferation Begins

AI Unleashed - The Coming Artificial Intelligence Revolution and Race to AGI
23 Mar 2024 · 24:01

Summary

TLDR: The transcript discusses the rapid advancements in AI, particularly in agentic workflows, which are expected to drive significant progress in the field. It highlights the potential release of GPT-5 and the evolution of AI models like Claude 3, which has gone from creating self-portraits to performing investment analysis. The importance of iterative processes and multi-agent collaboration is emphasized, with examples such as an open-source alternative to Devin being built in public and the iterative improvement in AI's ability to write code and play games like Super Mario 64. The transcript also touches on the educational resources available for learning about AI and the potential impact of AI agents on various industries.

Takeaways

  • 🚀 Andrew Ng emphasizes the significance of AI agentic workflows, predicting they will drive substantial AI progress, potentially outpacing the next generation of foundation models like the anticipated GPT-5.
  • 🔮 Sam Altman expects a major model launch later in the year, while Claude 3 demonstrates a shift from creating self-portraits to more complex tasks like investment analysis.
  • ๐ŸŒ The creator of Claud investor, an AI investment analyst, has open-sourced the tool, showcasing the potential for AI in financial analysis and decision-making.
  • 🤖 AI development is moving towards more collaborative and iterative processes, with multiple agents working together and refining tasks through planning, tool use, and reflection.
  • 📈 AI's coding capability is improving: GPT-4 scores markedly higher than GPT-3.5 on coding benchmarks, and wrapping GPT-3.5 in an iterative agent workflow raises its correctness further still.
  • ๐Ÿ› ๏ธ Tool use is a critical component in AI workflows, allowing AI to gather information, take action, or process data more effectively.
  • ๐Ÿ“ Iterative processes and multi-agent collaboration significantly enhance the quality of AI output, yielding results that surpass those of single-pass writing or isolated AI models.
  • ๐Ÿ” Andrew A shares a framework for categorizing design patterns for building agents, highlighting reflection, tool use, planning, and multi-agent collaboration as key elements.
  • 📊 Claude Investor illustrates the application of AI in investment analysis, providing an example of how AI can synthesize financial data and news to make investment recommendations.
  • 🎮 Experiments with GPT-4's ability to play games like Super Mario 64 show the potential for AI in interactive environments, despite latency and decision-making challenges.

Q & A

  • What does Andrew Ng urge everyone to pay attention to?

    -Andrew Ng urges everyone to pay attention to AI agentic workflows, as he believes they will drive massive AI progress.

  • What is expected to be released later this year in terms of AI foundation models?

    -GPT-5, or whatever the next major iteration of OpenAI's foundation models turns out to be, is expected to be released later this year.

  • What is Claude 3, and what is its recent development?

    -Claude 3 is an AI model that has evolved from drawing somewhat disturbing self-portraits to trying its hand at beating Warren Buffett as an investor.

  • What does the creator behind Claude Investor say about potentially open-sourcing it?

    -The creator behind Claude Investor mentioned that they might open-source it the next day, and indeed, they did.

  • What is Devin and how was it received in the AI community?

    -Devin is one of the more impressive-seeming AI agents; it was released recently, is currently available only to a small group of testers, and a separate team is building an open-source alternative to it in public.

  • What is the significance of Andrew Ng's statement about agentic workflows being more important than the next generation of foundation models?

    -Andrew Ng's statement highlights the potential impact of agentic workflows on AI progress, suggesting that their importance may surpass that of foundation models like GPT-5.

  • How does an agentic workflow with an LLM (large language model) differ from a zero-shot approach?

    -An agentic workflow involves the LLM iterating over a document multiple times, planning, researching, writing drafts, and revising, whereas a zero-shot approach has the LLM answer a question or complete a task in one pass, without prior examples or iterations.

  • What is the 'Reflection' design pattern mentioned in the script?

    -In the context of AI, 'Reflection' refers to the AI examining its own work to come up with ways to improve it, which is a critical part of an agentic workflow.

  • How does multi-agent collaboration improve the results of AI tasks?

    -Multi-agent collaboration involves more than one AI agent working together, splitting up tasks, and discussing or debating ideas to come up with better solutions than a single agent would be able to achieve.

  • What correctness rate can GPT-3.5 reach when incorporated into an iterative agent workflow?

    -Wrapped in an iterative agent loop, GPT-3.5 achieves up to 95.1% correctness on the HumanEval coding benchmark, significantly better than the 48.1% it scores zero-shot and even surpassing zero-shot GPT-4.

  • What is the potential application of AI agents in the stock market according to the script?

    -AI agents can be used to analyze financial data, news, sentiment, and industry trends for stocks within a given industry, rank them by investment potential, and provide price targets, although it's emphasized that these are for educational or informational use only.

Outlines

00:00

🚀 Andrew Ng's Prediction on AI Agentic Workflows

Andrew Ng emphasizes the significance of AI agentic workflows, predicting they will drive substantial AI progress this year, potentially outpacing the next generation of foundation models like the anticipated GPT-5. He highlights the rapid advancement of these workflows, noting their improvement from mere ideas six months ago to highly effective tools now. Andrew Ng, a respected AI researcher, teaches a broad audience about AI through his courses on deeplearning.ai, many of which are free, making him a valuable resource for those interested in diving deep into the field.

05:02

📈 Iterative AI Workflows and Their Impact

The paragraph discusses the iterative nature of AI workflows, comparing them to human writing processes. It suggests that iterative AI workflows yield significantly better results than single-pass writing. The concept of a 'Society of Minds,' where multiple agents focus on different tasks, is introduced as a way to further enhance outcomes. GPT-3.5 wrapped in an iterative agent loop reaching 95.1% on the HumanEval coding benchmark is given as an example, showcasing the dramatic improvement over the same model used zero-shot. The paragraph also touches on the value of providing few-shot examples to models, such as Claude 3 Haiku, to improve their performance significantly.

10:04

๐Ÿ› ๏ธ Devon's Workspace and Multi-Agent Collaboration

The paragraph describes Devin's workspace, which includes various tools like a shell, browser, and editor, as well as a planner for breaking down tasks into subtasks. It highlights the problem-solving process Devin goes through when initializing a chart component, showcasing the iterative planning and execution mechanism. The paragraph also discusses the effectiveness of multi-agent collaboration, where multiple AI agents work together to discuss, debate, and refine ideas, leading to better solutions than a single agent could achieve. The potential applications of this collaborative approach in various fields are emphasized, along with the importance of open-source tools and frameworks for AI agent development.

15:05

🎮 GPT-4's Gaming Capabilities with Multimodal Gamer

Josh Bickett, an engineer on the HyperWrite AI project, explores GPT-4's ability to play Super Mario 64 using a multimodal gamer framework. Despite GPT-4's latency, the model demonstrates decision-making and navigation skills within the game. Bickett iteratively improves the model's performance by adjusting the prompt and providing more context about the game's controls. The paragraph details the step-by-step process of how GPT-4 interacts with the game, including its successes and failures, and ends with a call for others to modify and expand the repository to play other games, especially those less sensitive to latency issues.

20:08

🌟 Final Iteration and Code Overview of Multimodal Gamer

The final iteration of the multimodal gamer project is presented, showing GPT-4's improved performance in navigating and playing Super Mario 64. A fast-forwarded version of the gameplay demonstrates the model's ability to learn and adjust its strategy over time. The paragraph then provides an overview of the code repository, explaining the simple structure and the use of a prompt to guide GPT-4's actions. The potential for the repository to be adapted for other games is discussed, with a focus on games that can accommodate the model's latency. Bickett encourages others to modify the repository and experiment with different games, offering to share further developments through YouTube videos.

Keywords

💡AI Agentic Workflows

AI Agentic Workflows refer to the process where artificial intelligence systems are designed to perform tasks in a manner similar to how a human would, with the ability to plan, iterate, and execute multiple steps to achieve a goal. In the context of the video, this concept is highlighted as a significant driver of AI progress, potentially surpassing the impact of the next generation of foundation models like GPT-5.

💡Foundation Models

Foundation models are large-scale machine learning models that serve as a base for building various AI applications. They are typically pre-trained on massive datasets and can be fine-tuned for specific tasks. The video mentions the anticipation of GPT-5, the next iteration of OpenAI's foundation models, as a significant event in the AI field.

💡Iterative Process

The iterative process in AI refers to the approach where an AI system is allowed to refine its output through multiple rounds of revision and improvement. This is akin to how human writers often go through drafts, making changes to enhance the quality of their work. In the video, the iterative process is emphasized as a critical method that yields much better results than a single-pass approach.

💡Open Source

Open source refers to a philosophy and practice of allowing others to view, use, modify, and distribute a work under certain licenses. In the context of the video, open source AI tools are mentioned as a way to democratize AI development and innovation, enabling a broader community to contribute and build upon existing technologies.

💡Multi-Agent Collaboration

Multi-agent collaboration in AI involves multiple AI systems working together to achieve a common goal. Each agent may have a specific role or focus, and they can share information, discuss, and coordinate their actions to produce better outcomes than individual agents could alone. The video highlights this as a powerful approach that can significantly enhance the capabilities of AI systems.

💡Tool Use

Tool use in AI refers to the ability of an AI system to utilize various tools or functionalities to aid in its tasks. This could include web search, code execution, or other data processing functions. The concept is important as it allows AI to gather information, take actions, and process data more effectively, leading to improved performance in complex tasks.

💡Reflection

In the context of AI, reflection refers to the ability of an AI system to examine its own work and identify ways to improve it. This self-assessment capability is crucial for AI systems to learn from their outputs and enhance their performance over time, much like how humans reflect on their actions to better themselves.

💡Planning

Planning in AI involves the AI system's capability to devise and execute a multi-step plan to achieve a specific goal. This includes setting objectives, outlining the steps needed to reach those objectives, and carrying out the necessary actions in a systematic manner. The video emphasizes planning as a key aspect of AI agentic workflows that can lead to more effective and sophisticated outcomes.

💡Few-Shot Learning

Few-shot learning is a machine learning paradigm where a model is trained on a small number of examples and is expected to generalize its learning to new, unseen data. This is particularly relevant in AI as it allows models to adapt quickly to new tasks with limited data, which is a significant advantage over models that require large datasets for training.

💡Latency

Latency in the context of AI and computing refers to the delay in the system's response to a request. In AI applications, such as playing games or making decisions, latency can significantly impact performance. Lower latency allows for faster and more efficient decision-making, which is crucial for real-time applications.

💡Multimodal Gamer

Multimodal Gamer is a framework or system that enables multimodal models, which can process and understand multiple types of data (like text, images, and sound), to play video games. This concept is showcased in the video through an experiment where GPT-4 is used to play Super Mario 64, demonstrating the potential of AI in interactive and complex environments.

Highlights

Andrew Ng emphasizes the importance of AI agentic workflows for driving significant AI progress, potentially more than the next generation of foundation models.

Expectations for the release of GPT-5 or the next iteration of OpenAI's foundation models are high.

Claude 3 showcases its versatility, from creating self-portraits to attempting to outperform Warren Buffett as an investor.

The creator behind Claude Investor has open-sourced the model, which could revolutionize investment analysis.

Devin, an impressive AI agent, is currently in a limited testing phase and demonstrates the rapid advancement of AI capabilities.

Andrew Ng advocates for AI education and offers free courses on deeplearning.ai for those interested in learning about AI.

AI agentic workflows are improving at an astonishing rate, with significant advancements observed over just a 6-month period.

The transition from using LLMs in zero-shot mode to iterative agent workflows marks a significant shift in AI development.

An iterative workflow with AI yields much better results than a single-pass approach, similar to human writing processes.

The concept of 'Society of Minds' where multiple agents with different focuses collaborate further improves AI outcomes.

GPT-3.5, when used iteratively in an agent loop, can achieve up to 95.1% correctness, surpassing the capabilities of GPT-4.

Open source agent tools and academic literature on agents are becoming more prevalent, indicating an exciting yet confusing time in AI development.

A framework for categorizing design patterns for building agents is shared by Andrew Ng, highlighting the practical applications of AI in various fields.

Reflection, tool use, planning, and multi-agent collaboration are identified as key components of effective AI agentic workflows.

Devin's workspace showcases the ability of AI to plan, execute, and troubleshoot tasks, demonstrating a human-like approach to problem-solving.

Multi-agent collaboration is extremely effective, as seen in various examples, including the iterative improvement of GPT 3.5's capabilities.

The potential of using AI agents like Claude Investor for financial analysis and investment tracking is discussed, highlighting the educational and informational use of such tools.

Open sourcing of AI models and frameworks, such as the multimodal gamer, encourages experimentation and innovation in AI gaming applications.

The rapid iteration and improvement of AI models, as seen in the development of multimodal gamer, demonstrate the potential for AI in complex tasks like playing video games.

Transcripts

play00:00

so Andrew Ng urges everyone to pay

play00:02

attention to AI agentic workflows saying

play00:06

that they will drive massive AI progress

play00:08

this year potentially more than the next

play00:10

generation of foundation models keep in

play00:13

mind we're expecting GPT 5 to come out

play00:15

later this year or whatever the next big

play00:19

iteration of open ai's foundation models

play00:21

will be Sam Altman said on Lex Fridman

play00:23

podcast that he expects that the next

play00:26

big model launch is later this year Claude

play00:28

3 goes from drawing somewhat disturbing

play00:31

self-portraits to trying its hand on

play00:34

beating Warren Buffett as an investor

play00:36

the creator behind Claude Investor even

play00:39

says that he may open source it tomorrow

play00:42

but will he spoiler alert he does

play00:45

meanwhile if you got the technical chops

play00:47

there's a team that's building in public

play00:49

creating the open source alternative to

play00:51

Devin Devin is of course one of the more

play00:54

impressive seeming AI agents right now

play00:56

came out just last week I believe and

play00:58

currently is available to a small group

play01:00

of testers I am not one of them why am I

play01:02

not one of them am I not cool enough am

play01:05

I not worthy of Devin fine see if I care

play01:08

but I I do care I care a lot but let's

play01:09

get back to Andrew Ng now Andrew Ng is one

play01:12

of the most well-known well respected AI

play01:15

researchers that's doing a lot to teach

play01:17

a greater audience about AI about how to

play01:20

use AI he's got a lot of courses at

play01:23

deeplearning.ai a lot of them free so if

play01:25

you're ready to dive deep he's got tons

play01:28

of stuff on here with various specialists in

play01:29

the the field a great resource a lot of

play01:32

it is free I think he has a few paid

play01:33

courses but a lot of this is free and so

play01:35

he just posted this a few days ago

play01:37

saying I think AI agentic workflows will

play01:39

drive massive AI progress perhaps even

play01:41

more than the next generation of

play01:43

foundation models again that's that's

play01:45

saying a lot knowing what Foundation

play01:48

models we expect next this is kind of a

play01:50

big deal he's saying this is an

play01:51

important Trend and I urge everyone who

play01:53

works in AI to pay attention to it and

play01:55

if you've been following we've covered a

play01:57

lot of these agents these agentic

play01:59

workflows on his channel and they're

play02:02

getting scary good scary Fast 6 months

play02:05

ago it was just an idea there are some

play02:08

examples of it but nothing really too

play02:10

exciting that can use and slowly as they

play02:12

start coming out each time shockingly

play02:14

better the rate of progress is insanely

play02:17

fast so he's saying today we use mostly

play02:19

llms in zero shot mode so like you ask

play02:21

it a question and it answers you which

play02:23

is similar to asking somebody to just

play02:25

write an essay start to finish typing

play02:27

straight through without using

play02:28

Backspaces you know not brainstorming

play02:31

beforehand right just start typing it

play02:33

out beginning to end with an agentic

play02:35

workflow however we can ask the LM to

play02:37

iterate over a document many many times

play02:39

it might plan an outline decide what

play02:41

kind of research it needs to do for

play02:43

example do some Google searches to

play02:45

gather more information it can write a

play02:46

first draft read it over revise iterate

play02:50

Etc when I did my review of ChatDev

play02:53

which was the sort of agentic workflow

play02:55

where you had multiple agents each

play02:57

responsible for their own area of

play02:59

producing say a code or app or a little

play03:02

game asked them to create something I

play03:03

think it might have been Flappy Bird

play03:05

right some simple game and each of these

play03:08

agents each little person that you see

play03:10

here they represent an agent that

play03:12

actually existed in that sort of

play03:13

environment so each one of these little

play03:15

faces that you see here that was a

play03:17

separate instance of GPT 3.5 that's what

play03:21

they were using at the time you could I

play03:22

believe kick it up to GPT 4 but this

play03:24

thing would you know use up a lot of

play03:26

tokens so if you use GPT 4 you could

play03:28

potentially you know run up quite a bill

play03:30

but even with GPT 3.5 they were able to

play03:34

produce very impressive results because

play03:36

they would work together this was the

play03:37

CTO the chief technical officer he would

play03:39

go from design have them kind of write

play03:42

the outline and everything that was

play03:43

needed so this was kind of the planning

play03:45

designing Etc then it would go into

play03:46

coding they would actually create all

play03:48

the code the CTO would kind of go along

play03:51

with them and kind of pingpong the

play03:53

process back and forth until it was

play03:54

refined then they would kick it over to

play03:56

testing and testing would like run it

play03:58

see if they can spot any bugs make sure

play04:00

everything works so they see this guy

play04:02

kind of a no bug symbol appears here and

play04:04

when I tested this last year I have a

play04:06

video about it testing would keep

play04:08

kicking the code back to coding coding

play04:10

would refine it and this happened I mean

play04:12

three four five times as I was sitting

play04:15

and I can assumed this was a glitch this

play04:16

was a doom Loop right I was like okay

play04:18

it's just going to get stuck it's just

play04:19

going to keep sitting there forever

play04:20

burning through my API credits but no

play04:22

eventually it was like okay yeah we got

play04:24

rid of all the bugs and they kicked it

play04:26

over to the documentation step that

play04:29

built a manual for the game when I saw

play04:32

this happening live on my computer I I

play04:34

was kind of floored I was like all right

play04:37

there's something big here because this

play04:39

I mean they use the waterfall model of

play04:41

development they discuss the code they

play04:43

break it down into steps is this is very

play04:45

very similar to what a development

play04:48

agency would do it's eerily similar to

play04:51

how humans would handle this sort of

play04:53

process and the fact that they can

play04:54

iterate and test bugs to make sure

play04:56

everything's working I mean I think now

play04:58

more people are aware that this can

play04:59

happen but back then when I saw it for

play05:01

the first time live on my computer like

play05:04

happening you know locally with open AI

play05:06

API but they were running on my computer

play05:09

spitting out useful code that I could

play05:12

run those games or apps or whatever and

play05:14

troubleshooting them that that was weird

play05:17

and also the fact that GPT 3.5 by

play05:19

running multiple instances that all work

play05:21

together kind of went back and forth It

play05:23

produced GPT 4 like results what happens

play05:26

when you string a bunch of GPT 4S

play05:29

together right right when the next

play05:30

Generation model comes out what happens

play05:32

if you let's say GPT 5 right or Claude 4

play05:35

or whatever what happens when you string

play05:37

all of those together where each of them

play05:40

has its own job its own focus and they

play05:43

work together to refine that idea we're

play05:45

not that far from that so Andrew Ng

play05:47

continues the iterative process is

play05:49

critical for most human writers to write

play05:52

good text with AI such an iterative

play05:54

workflow yields much better results than

play05:56

writing in a single pass this is true we

play05:59

know this to be true we also again from

play06:01

a lot of the studies that I've seen as

play06:03

well as real life results having

play06:04

iterative results plus kind of this

play06:07

Society of Minds kind of multiple agents

play06:10

with their own Focus this even further

play06:12

improves the results Devin's splashy demo

play06:15

recently received a lot of social media

play06:17

Buzz my team has been closely following

play06:18

the evolution of AI that writes code by

play06:21

the way he he had this course for quite

play06:23

you know like 6 months maybe pair

play06:25

programming with a large language model

play06:27

on deeplearning.ai by the way I'm

play06:29

obviously not getting paid to say any of

play06:31

this this is a free course I just do

play06:33

think this is a pretty cool resource if

play06:35

if you're looking for the Deep technical

play06:36

dive and so he's saying that they've

play06:38

been doing quite a bit of research into

play06:40

this he's saying GPT 3.5 zero shot was

play06:43

48.1% correct they're using the human

play06:46

eval coding Benchmark GPT 4 zero shot so

play06:49

again zero shot meaning we're not giving

play06:51

you examples of how to create that

play06:54

particular code how to solve the

play06:55

particular problem we're just saying

play06:56

here's the problem and it spits out the

play06:58

answer so GPT 4 does better at 67% by the

play07:02

way giving these models examples like

play07:05

few shot learning can be massive so this

play07:08

is Matt Shumer we'll talk about him

play07:09

later so he is at HyperWrite AI so he

play07:12

was he created one of the AI agents that

play07:14

we reviewed here I believe this is the

play07:16

team behind self-operating computer and

play07:19

HyperWrite AI they've had a lot of

play07:21

updates on this so we'll definitely

play07:22

check them out in a different video but

play07:24

he posted this a few days ago he's

play07:26

saying highest Alpha Secret in AI right

play07:28

now if you provide around 10 examples to

play07:31

Claude 3 Haiku so Haiku is the tiny Claude 3

play07:35

Model the smallest one I believe it goes

play07:36

Haiku Sonnet and then Opus Opus is the one

play07:39

that everyone's kind of focusing on as

play07:41

the really good one right but he's

play07:42

saying you give 10 examples you know 10

play07:45

shot learning to Claude 3 Haiku it will

play07:48

often outperform Claude 3 Opus and far

play07:51

outperform GPT 4 at a fraction of the

play07:54

cost with blazing speeds meaning that if

play07:56

you if you tell Claude 3 Opus please do

play07:59

XYZ right whatever you're asking to do

play08:02

right and then whatever output it gives

play08:03

you you'd say oh that's an A good job

play08:05

Claude 3 Opus but you take Claude 3 Haiku

play08:08

the small model it's super cheap very

play08:10

fast and you tell it please do XYZ but

play08:13

you give it 10 examples right you give

play08:15

it here's an example 1 2 3 you give it

play08:16

10 examples well that output that it

play08:19

gives you that might be an A+ it might

play08:21

be better than Claude 3 Opus the bigger

play08:24

model so that's an important point to

play08:25

understand that a lot of the stuff it

play08:26

Stacks few shot examples right it

play08:29

improves creating multiple agents each

play08:31

responsible for its own thing it

play08:32

improves the results but next Andrew Ng

play08:36

continues however the improvement from

play08:38

GPT 3.5 to GPT 4 so from 48% to 67%

play08:42

it's it's dwarfed by incorporating an

play08:44

iterative agent workflow indeed wrapped

play08:47

in an agent Loop GPT 3.5 achieves up to

play08:51

95.1% so it's massively massively better

play08:54

than GPT 4 open source agent tools and

play08:57

the academic literature on agents are

play08:59

are proliferating making this an

play09:00

exciting time but also a confusing one

play09:02

and so to simplify he's sharing a

play09:04

framework for categorizing design

play09:06

patterns for building agents he's saying

play09:08

his team AI fund is successfully using

play09:11

these patterns in many applications and

play09:13

I hope you find them useful isn't it

play09:15

interesting how a lot of the stuff that

play09:16

you know the top AI Minds sharing this

play09:18

on Twitter so the rest of us can learn

play09:20

it and use it and hopefully when we

play09:22

learn something new also share it with

play09:23

the world is kind of similar to a lot of

play09:25

the stuff that we're talking about in

play09:27

regards to how these AI agents work

play09:29

together kind of interesting I think so

play09:32

here are the things that they've been

play09:33

finding extremely useful one is

play09:35

reflection where the AI examines its own

play09:37

work to come up with ways to improve it

play09:39

tool use the LM is given tools such as

play09:42

web search code execution or any other

play09:44

function to help it gather information

play09:46

take action or process data planning the

play09:49

LM comes up with and executes a

play09:51

multi-step plan to achieve a goal for

play09:53

example writing an outline for an essay

play09:55

then doing online research then writing

play09:56

a draft and so on this is something that

play09:58

I think Devin does extremely well here's

play10:01

from Ethan Mollick's tweet that we went over

play10:03

yesterday so here's Devin's workspace

play10:06

right so he's got multiple things like

play10:07

shell browser editor but he's also got

play10:09

this planner where whatever task you

play10:11

give him Devin breaks it up into

play10:13

multiple steps kind of subtasks to

play10:15

complete that and then one by one goes

play10:17

through does it checks it off does it

play10:19

checks it off Etc right including

play10:22

troubleshooting so here looks like he

play10:24

ran into I say he I mean so they call

play10:27

this one Devin the this person building

play10:29

the open source version is calling it

play10:30

Dev cut so but whatever the case is so

play10:33

Devin runs into some problem in

play10:35

initializing a charts component right he

play10:38

tries to figure out how to do it and

play10:39

resolves it by you know importing

play10:42

something that he needs the point is

play10:43

there's some bug or some error that it

play10:45

solves right and then checks it off

play10:47

going yep now that we resolve that issue

play10:49

we're going to redeploy the web app

play10:51

check and so it just keeps going down

play10:53

this list so a lot of the things that

play10:55

people were complaining about GPT 4

play10:57

being stupid and not being able to

play10:58

complete certain tasks I mean how much

play11:01

of that just goes up in smoke when you

play11:03

add a really strong ability for it to

play11:05

plan out its steps think through you

play11:07

know step by step but also have some

play11:09

sort of this some sort of a iterative

play11:11

planning and executing mechanism and

play11:13

then of course the final one multi-agent

play11:15

collaboration more than one AI agent

play11:17

work together splitting up tasks and

play11:18

discussing and debating ideas to come up

play11:21

with a better to come up with better

play11:22

Solutions than a single agent would now

play11:24

a lot of people might dismiss this like

play11:26

isn't this the same thing as planning

play11:28

and reflection and iterating no now we've

play11:32

covered multiple examples here on this

play11:34

channel not just ChatDev but many other

play11:36

where multi-agent collaboration is

play11:39

extremely effective so you can see here

play11:41

depending on what kind of architecture

play11:42

we use right zero shot reflection tool

play11:45

use planning multi-agent all right so

play11:47

out of the box GPT 3.5 is here 47 46%

play11:51

whatever that was GPT 4 much better

play11:53

right very impressive 68 whatever 66 but

play11:56

this is the Improvement when we add

play11:59

those other types of architecture look

play12:01

at this massive shot from where GPT 3.5

play12:04

started to where it could be right the

play12:06

same model but massive massive massive

play12:09

Improvement by the way if you're

play12:11

interested in this stuff Matt Schumer so

play12:12

the guy behind HyperWrite AI is a great

play12:15

follow that whole idea of using Claude

play12:17

Haiku to get the quality of Opus at the

play12:20

fraction of cost and latency so he made

play12:22

a Colab notebook right put all the code

play12:25

in there and is open sourcing it so if

play12:27

you wanted to try it out you now here it

play12:29

is on GitHub and here's his latest so

play12:32

this is the Claude investor the first

play12:35

Claude 3 investment analyst agent just

play12:37

provide an industry and it will one find

play12:40

financial data/news for key companies

play12:43

two analyze sentiments/trends for

play12:45

each three rank Stocks by investment

play12:47

potential plus price targets and it's

play12:49

open sourced now you might look at the

play12:52

specific things that goes into this and

play12:54

say well these aren't the best things

play12:56

that I would use for investment tracking

play12:58

or whatever but you might have your own

play12:59

sort of process that's fine but you can

play13:01

this you can use this model to plug in

play13:03

whatever process you use for finding

play13:05

good Investments and and run it have

play13:07

this have this AI workflow do all the

play13:10

research for you and come back with

play13:12

short summaries price targets Etc here's

play13:14

the explanation of how Claude Investor

play13:17

works so user provides an industry agent

play13:19

finds a few stocks to explore retrieves

play13:21

key financial data and news for these

play13:23

stocks analyzes sentiment industry

play13:25

Trends and peer comparisons for each

play13:27

generates investment recommendations

play13:28

ranked by potential and obviously this

play13:30

is just for educational or informational

play13:33

use only don't use this for real stock

play13:35

picking of course now I could see

play13:38

someone that takes this framework and

play13:40

builds their own using whatever data

play13:43

that maybe people aren't really looking

play13:45

at for example using Twitter sentiment

play13:48

for example finding some sort of viral

play13:50

trends that might translate to companies

play13:52

doing better or worse there's a lot of

play13:54

information online about the global

play13:56

movement of ships and airplanes and

play13:58

stuff like that they could give you

play13:59

advanced warning if a company's in

play14:01

trouble or about to do really well on

play14:03

next on their next earnings call now I'm

play14:05

not saying people should dive into this

play14:07

but this or or something like this will

play14:10

be used to make stock trades and he's

play14:12

open sourcing it here so you can

play14:13

actually check it out his framework for

play14:15

doing this by the way and this is not

play14:16

Financial advice it is just my opinion

play14:18

on what I will do but the better these

play14:20

AI tools get the more AI agents are out

play14:23

there snooping for information the more

play14:25

and more I will stay away from Trading

play14:28

because I feel like that would be the

play14:30

way to get slaughtered in the markets to

play14:32

me the older I get the smarter Buffett

play14:35

seems just buy good businesses when

play14:38

people are freaking out and then don't

play14:40

sell just chill eat your burgers and

play14:42

Coke somewhere far far away from wall

play14:45

streets so you don't get all riled up

play14:46

about whatever is happening so just

play14:48

chill until you see an opportunity and

play14:50

then furiously attack it like he does

play14:52

this hamburger then chill I'm curious

play14:54

let me know in the comments do you agree

play14:55

that the more AI agents are out there

play14:57

the the less most of us should try to

play15:01

you know outperform their market and

play15:02

outtrade the competition you know if

play15:05

you're even into that but before you go

play15:07

here's Josh Bickett so he's another

play15:09

engineer on the HyperWrite AI project he

play15:12

asks a simple question can GPT-4 with

play15:14

vision play Super Mario 64 to answer

play15:18

that he created the multimodal gamer

play15:20

I'll link his profile below but uh check

play15:23

this up I wrote some code to let the

play15:24

model behind ChatGPT play Super Mario

play15:27

64 often said that these models

play15:31

are uh predictors Not actors but I

play15:34

thought I would give it a try and see if

play15:36

the results speak for

play15:38

themselves these models such as GPT-4 have

play15:41

a bit of latency and I found that as the

play15:43

primary issue um in most cases about how

play15:46

it navigated and made

play15:48

decisions it would be interesting if

play15:50

latency was non-existent how this model

play15:52

would could do if it could get more

play15:54

frames per second I created a repository

play15:57

called M multimodal gamer

play15:59

and um it's basically a framework to

play16:01

enable m multimodal models to play games

play16:03

on a computer okay I have an initial

play16:06

implementation so I'm going to try it

play16:11

now okay so it took a screenshot

play16:15

so let's see if it okay so it moved

play16:18

forward moved

play16:20

up and it said Mario is facing the path

play16:23

forward let's start moving up The Path

play16:26

moving continue moving up The Path

play16:30

yeah so just an initial proof of concept

play16:34

let's make it

play16:37

better okay I have the next

play16:40

iteration and let's see if Mario can

play16:43

cross the

play16:44

bridge okay I'm going to start up

play16:47

Mario okay my hands are off so now it

play16:51

can GPT 4 Vision can decide on the

play16:54

amount of time to hold it oh okay he

play16:56

made it across the bridge so Mario needs

play16:59

to go towards the bridge continue his

play17:01

Advent Adventure hold up for three

play17:03

seconds is what it did now it's

play17:05

jumping Mario is facing a possible under

play17:09

should jump over it well that's

play17:12

wrong okay moving

play17:16

up okay all right the duration is

play17:19

helpful but 3 seconds is probably too

play17:21

long since how infrequent the

play17:23

screenshots are okay I'm going to try to

play17:25

iterate

play17:27

it so I made some

play17:30

adjustments in the prompt so that GPT-4

play17:35

can make multiple actions at a time and

play17:38

it's a little more logical on the

play17:40

duration of time it takes an action so

play17:42

let's see how Mario does passing this

play17:48

guy okay I'm going to start it

play17:52

up okay it probably took the

play17:57

screenshot

play17:59

okay Mario's over there kind of in that

play18:04

corner okay he's running round him oh

play18:07

that he's not doing very

play18:10

well uh okay he got

play18:14

hurt hopefully he

play18:19

goes facing turn around towards the star

play18:24

okay he's running

play18:27

away

play18:33

he's

play18:35

stuck Retreat

play18:38

further possibly circle around it okay

play18:40

so now he's

play18:42

running head towards the star behind the

play18:45

gate which involves freeing Chain Chomp

play18:48

or finding another way to

play18:53

access how close to the gate now I

play18:55

should approach the wooden post and

play18:57

attempt to free Chain Chomp okay

play18:59

I don't know I don't know how what that

play19:04

is she grabbed the

play19:08

re okay it seems like it GPT-4 was on to

play19:12

something

play19:14

there need to reposition Mario to

play19:18

grab oh ran into Chain

play19:27

Chomp

play19:30

need to coin quickly to

play19:40

recover yeah Mario's

play19:52

struggling okay let's keep adjusting and

play19:54

iterating

play19:55

[Music]

play19:57

it so I adjusted the prompt for GPT-4 to

play20:00

give it more context about the

play20:01

controllers it has and the ones it

play20:03

doesn't have like it can't toggle the

play20:04

view um it might have helped so let's

play20:07

see

play20:09

how uh how Mario does

play20:16

[Music]

play20:20

now you can get past these

play20:23

things okay so he's running

play20:27

stopped

play20:29

[Music]

play20:32

the top of the

play20:34

hill okay he's running he's running oh

play20:37

he's in the middle wow okay that was

play20:39

pretty good

play20:44

actually oh he's running to the left oh

play20:47

he got hit okay he passed him oh it

play20:50

finished him okay he was

play20:53

close so now that you've seen each

play20:56

iteration uh of the project as I've

play20:58

adjusted it I thought I would just share

play21:00

a longer fast forwarded version of the

play21:03

final interation I worked on I still

play21:05

think there's a lot more that could be

play21:06

improved but it's kind of fun to see how

play21:08

it would do um not just in Snippets but

play21:11

um over a longer period of time so

play21:13

here's the fast forward version of it

play21:27

navigating

play21:43

let's look at the code so it's just a

play21:46

few files in this repository it's uh

play21:49

relatively small and let's start with a

play21:52

prompt so the prompt that we're sending

play21:54

to GPT-4 Vision preview it's pretty

play21:57

simple it's we're just saying you know

play21:59

you're playing the game and um I set up

play22:02

this prompt so it could be used with

play22:04

other games but basically the game is

play22:05

Super Mario 64 and then it has a goal to

play22:08

collect Power Stars scattered across the

play22:11

various levels in the game um and uh

play22:14

which access through the paintings in

play22:17

Prince uh Princess Peach's castle um and

play22:20

then it has a controller uh the N64

play22:22

controller just to give it some context

play22:24

and I pass those into this long prompt

play22:27

string and we just say here are your

play22:29

actions up down left right attack jump

play22:32

and some context about what it's seeing

play22:34

it's seeing a snapshot of the screen at

play22:36

every iteration so yeah I mean that's

play22:39

really the system prompt pretty basic um

play22:42

and where we send that P system prompt

play22:44

is in the API file so we take a

play22:47

screenshot and uh we just save that to

play22:49

the uh locally to the computer and then

play22:52

we pass up that screenshot as a base 64

play22:54

to the um open AI API um and we you send

play22:58

it with a user prompt which says uh see

play23:01

the screenshot uh of the game to and you

play23:04

know do your next action basically here

play23:06

is what we're saying I told the GPT-4

play23:09

Vision that it can go up down left right

play23:11

attack but then we have to convert that

play23:12

to the keys of the keyboard so we do

play23:15

that here and a function called press

play23:17

which just uses this library PyAutoGUI uh

play23:20

and this library lets us just uh

play23:24

Fire keyboard events or Mouse events the

play23:26

same as we do when we use a computer and

play23:29

that's really the code um I hope this

play23:32

repo can be adjusted by others and and

play23:34

you can make it play other games I think

play23:37

the greatest potential for this repo is

play23:39

games where latency is not a problem

play23:42

where um there may be step you take the

play23:45

the game is step by step and it's okay

play23:47

if you take a while at each step so um I

play23:51

hope that others try to modify this repo

play23:53

and try to build other games I might do

play23:55

so myself um and if I do I will uh share

play23:59

more uh YouTubes of it


Related Tags
AI Progress, Agentic Workflows, GPT-5 Anticipation, AI Research, Open Source, AI Collaboration, Investment Analysis, Coding Automation, Super Mario 64 AI