How Far Can We Scale AI? Gen 3, Claude 3.5 Sonnet and AI Hype

AI Explained
30 Jun 2024 (18:25)

Summary

TLDR: The script discusses the rapid advancements in AI video generation, exemplified by models like Runway Gen 3 and the anticipated Sora from OpenAI. It raises questions about the reliability of AI leaders and the scalability of language models, highlighting Claude 3.5 Sonnet's capabilities and the incremental improvements in AI. The potential for AI in fields like drug discovery is mentioned, with a cautionary note on separating AI hype from reality and the unpredictable impact of scaling and new research.

Takeaways

  • ๐ŸŒ AI video generation is rapidly advancing and becoming more accessible, with models like Runway Gen 3 and Sora promising highly realistic outputs despite training on a small fraction of available video data.
  • ๐ŸŽฅ The Luma dream machine offers an engaging way to experiment with AI-generated images, allowing users to interpolate between two images or generate new ones.
  • ๐Ÿ“ˆ There is skepticism about the continuous scaling of language models, with concerns about whether increased scale will necessarily lead to more accurate or reliable AI.
  • ๐Ÿค– The release of advanced voice models, such as OpenAI's real-time advanced voice model, has been delayed to improve content detection and refusal capabilities.
  • ๐Ÿ“Š Benchmark results for AI models like Claude 3.5 Sonic show improvements with increased compute, but the gains are incremental and not proportional to the scale.
  • ๐Ÿง  The potential of AI in fields like biology and drug discovery is being discussed, with some suggesting AI could accelerate the rate of discoveries, though this is speculative.
  • ๐Ÿ” There is a call for caution in interpreting benchmark results and a recognition that AI models still struggle with basic tasks, indicating that scale alone may not solve all issues.
  • ๐Ÿ“š The script highlights the importance of metacognition in AI development, suggesting that understanding how to think about problems is as crucial as scaling computational power.
  • ๐Ÿค Open source AI models are seen as a way to encourage innovation and allow for diverse applications, contrasting with the idea of a single 'true AI'.
  • ๐Ÿšง There is acknowledgment from AI leaders that the field is moving fast, and there is a need to ensure that understanding keeps pace with the capabilities of AI models.
  • ๐Ÿ”ฎ Predictions about the future of AI are made with caution, acknowledging the many unknowns and the difficulty in forecasting exact outcomes or timelines for AI advancements.

Q & A

  • What is the current state of AI video generation technology?

    -AI video generation technology is rapidly advancing, with models such as Runway Gen 3 becoming more accessible and generating increasingly realistic outputs. However, these models are likely trained on less than 1% of available video data, indicating that future generations could become even more realistic in a relatively short time.

  • What is the significance of the Luma Dream Machine in AI image generation?

    -The Luma Dream Machine is a tool for AI image generation that allows users to create images or interpolate between two real ones. It represents a fun and engaging way for users to experiment with AI-generated visuals while waiting for the release of more advanced models.

  • What is the status of the video generation model called Sora from OpenAI?

    -Sora is a highly anticipated video generation model from OpenAI, considered the most promising in its field. It is still unreleased, and comparisons with other models like Runway Gen 3 suggest it benefits from larger-scale training and greater compute resources.

  • Why is the release of the real-time advanced voice mode from OpenAI delayed?

    -The release of the real-time advanced voice mode from OpenAI has been delayed so the model's ability to detect and refuse certain content can be improved. The delay also addresses concerns about the model occasionally producing inaccurate or unreliable outputs.

  • What are some of the limitations of scaling AI models?

    -While scaling AI models can lead to improvements, it does not necessarily solve all problems. For example, even with more data and compute, models may still struggle with basic tasks or produce hallucinations in language generation. The hope is that scaling will eventually lead to more accurate and reliable AI, but there is skepticism about whether this will fully address current limitations.
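
The script later poses a concrete version of this question: would you pay four times as much for a 5% hallucination rate instead of an 8% one, if every answer has to be checked anyway? A toy cost model makes the trade-off explicit; every number below is an illustrative assumption, not a measurement:

```python
# Toy cost model: is a 5% hallucination rate worth 4x the price of an
# 8% one, if every answer must be verified either way?
# All constants are illustrative assumptions.

def expected_cost_per_task(api_cost, hallucination_rate,
                           check_cost=1.0, fix_cost=5.0):
    """Cost of one task: pay the model, always pay to verify the answer,
    and pay a repair cost whenever the answer was hallucinated."""
    return api_cost + check_cost + hallucination_rate * fix_cost

cheap = expected_cost_per_task(api_cost=1.0, hallucination_rate=0.08)
pricey = expected_cost_per_task(api_cost=4.0, hallucination_rate=0.05)

print(f"8% model: {cheap:.2f}  5% model: {pricey:.2f}")
# -> 8% model: 2.40  5% model: 5.25
```

Under these assumptions the cheaper, less reliable model wins, because the fixed verification cost dominates the small reliability gain; only when verification can be skipped entirely does the premium model pay for itself.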

  • What is the current performance of Claude 3.5 Sonnet from Anthropic compared to other language models?

    -Claude 3.5 Sonnet is a language model from Anthropic that is free, fast, and in certain domains more capable than comparable language models. It shows improvements in basic mathematical ability and general knowledge compared to models like GPT-4o and Gemini 1.5 Pro from Google, but the differences are not as significant as the scale of compute might suggest.

  • What is the concept of 'metacognition' in the context of AI development?

    -Metacognition in AI refers to the ability of models to understand how to think about a problem in a broad sense, to assess the importance of an answer, and to use external tools to check their answers. It represents a significant frontier in AI development, moving beyond simple scaling to more human-like thinking processes.
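
The "use external tools to check answers" part of this idea can be sketched in a few lines. The model call below is a deliberately fallible stub, and all function names are invented for the illustration; the point is only the shape of the loop: draft an answer, verify it with a tool, and correct it if the check fails.

```python
# Minimal sketch of a metacognitive check loop: a (stubbed) model call
# is verified against an external tool, here Python's own arithmetic.

def model_answer(question: str) -> str:
    # Stand-in for a language-model call; wrong on purpose.
    return {"17 * 24": "418"}.get(question, "unknown")

def tool_result(question: str) -> str:
    # External verification tool: actually evaluate the arithmetic.
    a, b = (int(x) for x in question.split(" * "))
    return str(a * b)

def answer_with_metacognition(question: str) -> str:
    draft = model_answer(question)
    if draft == tool_result(question):
        return draft
    # Self-correction step: the draft failed the check, so defer to the tool.
    return tool_result(question)

print(answer_with_metacognition("17 * 24"))  # -> 408, after catching the bad draft
```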

  • What are some of the challenges in accurately assessing the capabilities of AI models?

    -Assessing the capabilities of AI models is challenging due to the limitations and flaws in benchmark tests, which may not accurately reflect real-world performance. Additionally, models may struggle with basic tasks despite high scores on benchmarks, indicating a need for more nuanced evaluation methods.

  • How do AI lab leaders view the potential of AI in fields like biology and drug discovery?

    -Some AI lab leaders, such as those from Anthropic, see the potential for AI to significantly impact fields like biology and drug discovery, possibly leading to new discoveries and even cures for diseases. However, these views are sometimes seen as overly optimistic or 'hype' by others in the industry.

  • What is the current sentiment regarding the hype around AI capabilities and scaling?

    -There is a growing sentiment that the hype around AI capabilities and scaling may have gone too far, with some industry insiders expressing skepticism about the pace of progress and the reliability of benchmark results. The challenge is to separate hype from reality and to manage expectations about what AI can and cannot do.

  • What are the potential future developments in AI model scaling and algorithmic improvements?

    -Future developments in AI are expected to include further orders-of-magnitude scaling of models alongside continued improvements in algorithms and chip technology. If these advancements continue at their current pace, there is a possibility that by 2025-2027, AI models could surpass human capabilities in many areas.

Outlines

00:00

๐ŸŒ AI Video Generation and Model Scalability

The script discusses the rapid advancements in AI video generation, highlighting the capabilities of models like Runway Gen 3 and Sora from OpenAI. It emphasizes the transformative potential of these technologies on content consumption. The author also raises questions about the reliability of AI leaders and the limitations of scaling language models. Comparisons between different models, such as Runway Gen 3 and Sora, illustrate the improvements in video generation quality. The script also touches on the potential and limitations of AI in understanding and generating accurate world models.

05:00

Incremental Improvements and the Economics of AI Scaling

This paragraph delves into the incremental improvements brought by scaling AI models, questioning the economic viability of investing in such advancements. It uses Claude 3.5 Sonnet from Anthropic as a case study, comparing its performance to other models and highlighting the diminishing returns on investment as models approach human-level intelligence. The author also explores the concept of 'metacognition' in AI, suggesting that understanding how to think about a problem is as important as scaling computational power.

10:03

AI's Evolving Capabilities and the Hype Surrounding Them

The script addresses the hype around AI's capabilities, contrasting it with the reality of current technology. It cites comments from industry leaders who express skepticism about the pace and impact of AI advancements. The author discusses the potential of AI in fields like drug discovery and cancer treatment, while also acknowledging the uncertainty and the need for caution in interpreting AI's capabilities. The paragraph concludes with a call for a balanced view between optimism and realism in the AI community.

15:03

The Future of AI: Scaling, Research, and Real-World Applications

In the final paragraph, the script contemplates the future trajectory of AI, considering both scaling and algorithmic improvements. It presents predictions from industry experts about the potential of AI to surpass human capabilities in various domains. The author also reflects on the unpredictability of AI's impact and the importance of separating hype from reality. The paragraph ends with an invitation for the audience to engage in further discussions on AI's future through the author's Patreon platform.

Keywords

AI Video Generation

AI Video Generation refers to the creation of video content by artificial intelligence. In the video script, it is highlighted as a transformative technology that is making artificial worlds more tangible and accessible. The script discusses the capabilities of models like Runway Gen 3 and Sora, emphasizing the potential for highly realistic video generation despite being trained on a small fraction of available video data.

Scaling

Scaling in the context of AI refers to the increase in the size and complexity of models, often through the addition of more data and computational power. The script raises questions about the merits and limitations of scaling, suggesting that while it can lead to improvements, it may not necessarily solve all challenges, such as the accuracy of world models or reasoning abilities.
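
The diminishing returns described here match the usual power-law picture of scaling, in which loss falls as L(C) = a * C**(-alpha) + e toward an irreducible floor. A sketch with made-up constants (only the shape is meaningful, not the values) shows why a 4x compute jump, like the one attributed to Claude 3.5 Sonnet, buys a clear but sub-proportional improvement:

```python
# Illustrative power-law scaling curve. The constants a, alpha, and the
# irreducible floor are invented for the sketch; only the shape matters:
# each 4x of compute buys a smaller absolute improvement than the last.

def loss(compute, a=10.0, alpha=0.3, irreducible=1.7):
    return a * compute ** (-alpha) + irreducible

for c in (1, 4, 16, 64):
    print(f"compute {c:>2}x -> loss {loss(c):.3f}")
```

Each successive 4x step shrinks the loss by less than the step before it, which is consistent with the script's "a boost across the board, but hard to argue it's four times better" observation.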

Language Models

Language models are AI systems designed to understand and generate human language. The script mentions Claude 3.5 Sonnet and compares it with other models like GPT-4o, discussing their capabilities in various domains such as mathematical ability and general knowledge. It also touches on the incremental improvements and the cost of scaling these models.

Multimodal Training

Multimodal training involves training AI models on multiple types of data, such as text, images, and video. The script discusses the hope that multimodal training could improve reasoning abilities in AI, but also notes that this has not been entirely successful, as evidenced by the performance of models like Claude 3.5 Sonnet.

Hype vs. Reality

The script explores the balance between the hype surrounding AI advancements and the actual capabilities and limitations of current technology. It critiques the tendency to overstate the potential of AI, urging for a more measured and realistic assessment of its progress and implications.

Benchmarks

Benchmarks are tests or metrics used to evaluate the performance of AI models. The script expresses skepticism about the reliability of benchmark results, suggesting that they may not fully capture the capabilities or limitations of AI models, and that minor differences in scores may not be significant.
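
One concrete reason decimal-point differences deserve skepticism is sampling noise. Treating each benchmark question as an independent trial (a simplifying assumption), the standard error of a measured accuracy p on n questions is sqrt(p * (1 - p) / n), which for a benchmark of roughly 1,500 questions is already on the order of a percentage point:

```python
# Binomial standard error of a benchmark accuracy: for ~1,500 questions,
# the 95% interval around a measured score spans more than a point,
# so sub-point score differences between models are within the noise.
import math

def standard_error(p, n):
    return math.sqrt(p * (1 - p) / n)

se = standard_error(0.88, 1500)  # e.g. 88% accuracy on 1,500 questions
print(f"±{1.96 * se * 100:.1f} points (95% interval)")
# -> ±1.6 points (95% interval)
```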

Metacognition

Metacognition refers to the ability of an AI to understand and reflect on its own thought processes. Bill Gates mentions metacognition as the 'big frontier' in AI development, suggesting that progress in this area could be more significant than mere scaling of models.

Emergent Behaviors

Emergent behaviors are unexpected or novel capabilities that arise in complex systems as a result of interactions between simpler components. The script briefly mentions the idea that certain AI capabilities might emerge as models scale up, but also questions whether this will necessarily lead to the desired outcomes.

OpenAI

OpenAI is an organization focused on the development of AI technologies. The script discusses OpenAI's work on models like Sora and GPT-4o, as well as their efforts to improve the reliability and reasoning abilities of their AI systems.

Anthropic

Anthropic is another AI research organization mentioned in the script, known for developing models like Claude 3.5 Sonnet. The script discusses the company's approach to AI development and their public statements about the potential impact of their technology.

AI Ethics and Responsibility

The script touches on the ethical considerations and responsibilities of AI development, such as the potential for AI to be used in sensitive areas like cancer research. It also discusses the need for caution in making claims about AI capabilities and the importance of ensuring that AI development is aligned with societal values.

Highlights

AI video generation is becoming increasingly tangible and accessible, set to transform content consumption.

Runway Gen 3's outputs, with audio also generated by AI (from Udio), represent the current state of AI video generation.

AI models are trained on a fraction of available video data, indicating potential for more realistic video generation.

The Luma Dream Machine allows for image generation and interpolation, offering a fun user experience.

The Chinese model Kling is already available to some users, while the release of Sora from OpenAI is still anticipated.

Training models on more data doesn't guarantee accurate world models, as seen in Sora's generation capabilities.

The question of whether scale in AI will solve all issues is raised, with indications it may not.

Real-time advanced voice mode from OpenAI, showcased in the GPT-4o demo, has been delayed.

Claude 3.5 Sonnet from Anthropic is free, fast, and in certain domains more capable than comparable models.

Benchmarks for AI models have significant flaws and should be interpreted with caution.

The incremental benefits of scaling in AI models may not justify the increased cost and compute.

The Artifacts feature in Claude 3.5 Sonnet allows for interactive project work alongside the language model.

AI models still struggle with basic tasks despite increased scale, contradicting naive scaling hypotheses.

Bill Gates discusses the potential of scaling in AI and the need for understanding beyond just data access.

Metacognition in AI is identified as a significant frontier for development.

Mustafa Suleyman from Microsoft AI suggests that consistent action from AI models may not be achievable until GPT-6.

Sam Altman from OpenAI discusses the use of AI in cancer screening and potential future contributions to discovering cures.

Mark Zuckerberg expresses skepticism about the grandiose claims made by AI lab leaders.

Dario Amodei from Anthropic discusses the potential of AI in biology and drug discovery, with caveats about the uncertainty of predictions.

The pace of AI development is a challenge, with the need to ensure understanding keeps pace with capabilities.

AI models are compared to undergraduates, with the potential to reach professional levels in various fields.

Transcripts

play00:00

artificial worlds generated by AI video

play00:03

models have never been more tangible and

play00:07

accessible and look set to transform how

play00:10

millions and then billions of people

play00:12

consume content and artificial

play00:14

intelligence in the form of the new free

play00:16

Claude 3.5 Sonic is more capable than it

play00:20

has ever been but I will draw on

play00:22

interviews in the last few days to show

play00:24

that there are more questions than ever

play00:26

not just about the merits of continued

play00:28

scaling of language model

play00:30

but about whether we can rely on the

play00:33

words of those who lead these giant AI

play00:35

orgs but first AI video generation which

play00:39

is truly on fire at the moment these

play00:42

outputs are from Runway gen 3 available

play00:45

to many now and to everyone apparently

play00:48

in the coming days the audio by the way

play00:51

is also AI generated this time from udio

play00:56

[Music]

play01:20

and as you watch these videos remember

play01:22

that the AI models that are generating

play01:24

them are likely trained on far less than

play01:27

1% of the video data that's a available

play01:30

unlike highquality text Data video data

play01:33

isn't even close to being used up expect

play01:36

generations to get far more realistic

play01:38

and not in too long either and by the

play01:40

way if you're bored while waiting on the

play01:43

Gen 3 wait list do play about with the

play01:46

Luma dream machine I've got to admit it

play01:48

is pretty fun to generate two images or

play01:51

submit two real ones and have the model

play01:54

interpolate between them now those of

play01:55

you in China have actually already been

play01:58

able to play with model of similar

play02:00

capabilities called cing but we are all

play02:04

waiting on the release of Sora the most

play02:07

promising video generation model of them

play02:09

all from open AI here are a couple of

play02:12

comparisons between Runway gen 3 and

play02:15

Sora the prompts used in both cases are

play02:18

identical and there's one example that

play02:21

particularly caught my eye as many of us

play02:23

may have realized by now simply training

play02:25

models on more data doesn't necessarily

play02:27

mean they pick up accurate world models

play02:30

now I strongly suspect that Sora was

play02:33

trained on way more data with way more

play02:35

compute with its generation at the

play02:38

bottom you can see that the dust emerges

play02:40

from behind the car this neatly

play02:42

demonstrates the benefits of scale but

play02:45

still leaves open the question about

play02:47

whether scale will solve all now yes it

play02:50

would be simple to extrapolate a

play02:52

straight line upwards and say that with

play02:54

enough scale we get a perfect world

play02:57

simulation but I just don't think it

play02:59

will be like that and there are already

play03:00

more than tentative hints that scale

play03:03

won't solve everything more on that in

play03:05

just a moment but there is one more

play03:07

modality I am sure we were all looking

play03:09

forward to which is going to be delayed

play03:12

that's the realtime advanced voice mode

play03:15

from open AI it was the star of the demo

play03:18

of GPT 40 and was promised in the coming

play03:21

weeks alas though it has now been

play03:23

delayed to the fall or the Autumn and

play03:26

they say that's in part because they

play03:28

want to improve the model's ability to

play03:30

detect and refuse certain content I also

play03:33

suspect though like dodgy physics with

play03:35

video generation and hallucinations with

play03:38

the language generation they also

play03:40

realized it occasionally goes off the

play03:43

rails now I personally find this funny

play03:45

but you let me know whether this would

play03:46

be acceptable to release refreshing

play03:49

coolness in the air that just makes you

play03:51

want to smile and take a deep breath of

play03:54

that crisp invigorating Breeze the Sun's

play03:57

shining but it's that this lovely gentle

play04:00

warmth that's just perfect for light

play04:03

Jack so either way we're definitely

play04:05

going to have epic entertainment but the

play04:07

question is what's next particularly

play04:09

when it comes to the underlying

play04:11

intelligence of models is it a case of

play04:13

shooting past human level or diminishing

play04:16

returns well here's some anecdotal

play04:18

evidence with the recent release of

play04:21

Claude 3.5 Sonic from anthropic it's

play04:24

free and fast and in certain domains

play04:26

more capable than comparable language

play04:29

models this table I would say shows you

play04:31

a comparison on things like basic

play04:33

mathematical ability and general

play04:35

knowledge compared to models like GPT 40

play04:37

and Gemini 1.5 Pro from Google I would

play04:40

caution that many of these benchmarks

play04:42

have significant flaws so decimal point

play04:45

differences I wouldn't pay too much

play04:46

attention to the most interesting

play04:48

comparison I would argue is between

play04:50

Claude 3.5 Sonic and Claude 3 Sonet

play04:53

there is some evidence that Claude 3.5

play04:55

Sonic was trained on about four times as

play04:57

much compute as Claude 3 on it and you

play05:00

can see the difference that makes

play05:02

definitely a boost across the board but

play05:04

it would be hard to argue that it's four

play05:06

times better and in the visual domain it

play05:09

is noticeably better than its

play05:11

predecessor and than many other models

play05:14

and I got Early Access so I tested it a

play05:16

fair bit these kind of benchmarks test

play05:18

reading charts and diagrams and

play05:20

answering basic questions about them but

play05:22

the real question is how much extra

play05:24

compute and therefore money can these

play05:26

companies continue to scale up and

play05:29

invest if the returns are still

play05:31

incremental in other words how much more

play05:33

will you and more importantly businesses

play05:36

continue to pay for these incremental

play05:38

benefits after all in no domains are

play05:41

these models reaching 100% and let me

play05:44

try to illustrate that with an example

play05:46

and as we follow this example ask

play05:48

yourself whether you would pay four

play05:49

times as much for a 5% hallucination

play05:52

rate versus an 8% hallucination rate if

play05:55

in both cases you have to check the

play05:57

answer anyway let me demonstrate with

play05:58

the brilliant new new feature you can

play06:00

use with Claude 3.5 Sonic from anthropic

play06:03

it's called artifacts think of it like

play06:05

an interactive project that you can work

play06:07

on alongside the language model I dumped

play06:10

a multi hundred page document on the

play06:12

model and asked the following question

play06:14

find three questions on functions from

play06:16

this document and turn them into

play06:18

clickable flash cards in an artifact

play06:20

with full answers and explanations

play06:22

revealed interactively it did it and

play06:24

that is amazing but there's one slight

play06:27

problem question one is perfect it's a

play06:30

real question from the document

play06:32

displayed perfectly and interactive with

play06:34

the correct answer and explanation same

play06:36

thing for question two but then we get

play06:39

to question three where it copied the

play06:41

question incorrectly worse than that it

play06:43

rejigged and changed the answer options

play06:46

also is there a real difference between

play06:48

q^2 and netive Q ^2 when it claimed that

play06:52

netive Q ^2 is the answer now you might

play06:54

find this example trivial but I think

play06:56

it's revealing don't get me wrong this

play06:58

feature is mentally useful and it

play07:00

wouldn't take me long to Simply tweak

play07:02

that third question and by the way

play07:04

finding those three examples strewn

play07:06

across a multi hundred page document is

play07:09

impressive even though it would save me

play07:11

some time I would still have to

play07:13

diligently check every character of

play07:15

claude's answer and at the moment as I

play07:17

discussed in more detail in my previous

play07:19

video there is no indication that scale

play07:22

will solve this issue now if you think

play07:24

I'm just quibbling and benchmarks show

play07:26

the real progress well here is the the

play07:29

reasoning lead at Google deepmind

play07:32

working on their Gemini series of models

play07:34

someone pointed out a classic reasoning

play07:36

error made by Claude 3.5 Sonic and Denny

play07:39

XE said this love seeing tweets like

play07:41

this rather than those on llms with phds

play07:46

superhuman intelligence or fancy results

play07:48

on leaked benchmarks I'm definitely not

play07:50

the only one skeptical of Benchmark

play07:53

results and an even more revealing

play07:55

response to Claude 3.5s basic errors

play07:57

came from open AI know Brown I think

play08:00

it's more revealing because it shows

play08:02

that those AI Labs anthropic and open AI

play08:05

had their hopes slightly dashed based on

play08:07

the results they expected in reasoning

play08:10

from multimodal training non Brown said

play08:12

Frontier models like GPT 40 and now

play08:14

clawed 3.5 Sonic maybe at the level of a

play08:18

quote smart high schooler mimicking the

play08:20

words of Mira murati CTO of open aai in

play08:23

some respects but they still struggle on

play08:25

basic tasks like Tic-tac-toe and here's

play08:28

the key quote there was hope that native

play08:31

multimodal training would help with this

play08:34

kind of reasoning but that hasn't been

play08:36

the case that last sentence is somewhat

play08:39

devastating to the naive scaling

play08:41

hypothesis there was hope that native

play08:44

multimodal training on things like video

play08:46

from YouTube would teach models a world

play08:49

model it would help but that hasn't been

play08:51

the case now of course these companies

play08:53

are working on Far More Than Just naive

play08:55

scaling as we'll hear in a moment from

play08:56

Bill Gates but it's not like you can

play08:58

look at the benchmark Mark results on a

play09:00

chart and just extrapolate forwards

play09:02

here's Bill Gates promising two more

play09:04

turns of scaling I think he means two

play09:06

more orders of magnitude but notice how

play09:08

he looks skeptical about how that will

play09:11

be enough the big Frontier is not so

play09:13

much scaling we have probably two more

play09:17

turns of the crank on

play09:19

scaling where by accessing video data

play09:23

and getting very good at synthetic

play09:26

data that we can scale up probably

play09:30

you know two more times that's not the

play09:32

most interesting Dimension the most

play09:34

interesting Dimension is is what I call

play09:37

metacognition where understanding how to

play09:40

think about a problem in a broad sense

play09:43

and step back and say okay how important

play09:46

is this answer how could I check my

play09:48

answer you know what external tools

play09:50

would help me with this so we're get

play09:52

we're going to get the scaling benefits

play09:55

but at the same time the various

play09:59

actions to change the the underlying

play10:02

reing algorithm from the trivial that we

play10:07

have today to more humanlike

play10:10

metacognition that's the big

play10:13

Frontier that uh it's a little hard to

play10:16

prict how quickly that'll happen you

play10:18

know I've seen that we will make

play10:21

progress on that next year but we won't

play10:23

completely solve it uh for some time

play10:26

after that and there were others who

play10:27

used to be incredibly bullish on scaling

play10:30

that now sound a little different here's

play10:32

Microsoft ai's CEO Mustafa sullan

play10:35

perhaps drawing on lessons from the

play10:37

mostly defunct inflection AI that he

play10:39

used to run saying it won't be until GPT

play10:42

6 the AI models will be able to follow

play10:44

instructions and take consistent action

play10:46

there's a lot of cherry-picked examples

play10:48

that are impressive you know on Twitter

play10:50

and stuff like that but to really get it

play10:52

to consistently do it in novel

play10:54

environments is is pretty hard and I

play10:57

think that it's going to be not one two

play10:59

orders of magnitude more computation of

play11:01

training the models um so not gbt 5 but

play11:05

more like gbt 6 scale models so I think

play11:08

we're talking about two years before we

play11:10

have systems that can really take action

play11:13

now based on the evidence that I put

play11:14

forward in my previous video let me know

play11:16

if you agree with me that I still think

play11:19

that's kind of naive reasoning

play11:20

breakthroughs will rely on new research

play11:23

breakthroughs not just more scale and

play11:25

even samman said as much about a year

play11:27

ago saying the ear ER of ever more

play11:30

scaling of parameter count is over now

play11:33

as we'll hear he has since contradicted

play11:35

that saying current models are small

play11:37

relative to where they'll be but at this

play11:39

point you might be wondering about

play11:41

emergent behaviors don't certain

play11:43

capabilities just spring out when you

play11:45

reach a certain scale well I simply

play11:46

can't resist a quick plug for my new

play11:49

corsera series that is out this week the

play11:52

second module covers a mergent behaviors

play11:54

and if you already have a corsera

play11:55

account do please check it out it' be

play11:57

free for you and if you were thinking of

play11:59

getting one there will be a link in the

play12:01

description anyway here's that quote

from Sam Altman, somewhat contradicting the comments he made a year ago. Models, he says, get predictably better with scale: "We're still just so early in developing such a complex system. There's data issues, there's algorithmic issues, the models are still quite small relative to what they will be someday, and we know they get predictably better."

But this was the point I was trying to make at the start of the video. As I argued in my previous video, I think we're now at a time in AI where we really have to work hard to separate the hype from the reality. Simply trusting the words of the leaders of these AI labs is less advisable than ever, and of course it's not just Sam Altman. Here's the commitment from Anthropic, led by Dario Amodei, back last year: they described why they don't publish their research, and they said it's because "we do not wish to advance the rate of AI capabilities progress." But their CEO, three days ago, said AI is progressing fast due in part to their own efforts: "To try and keep pace with the rate at which the complexity of the models is increasing, I think, is one of the biggest challenges in the field. The field is moving so fast, including by our own efforts, that we want to make sure that our understanding keeps pace with our capabilities to produce powerful models."

He then went on to say that today's models are like undergraduates, which, if you've interacted with these models, seems pretty harsh on undergraduates: "If we go back to the analogy of today's models being like undergraduates, let's say those models get to the point where they're graduate level, or strong professional level. Think of biology and drug discovery; think of a model that is as strong as a Nobel-prize-winning scientist, or the head of drug discovery at a major pharmaceutical company."

Now, I don't know if he's basing that on a naive trust in benchmarks or whether he is deliberately hyping. And then later, in the conversation with the man in charge of the world's largest sovereign wealth fund, he described how the kind of AI that Anthropic works on could be instrumental in curing cancer: "I look at all the things that have been invented. If I look back at biology: CRISPR, the ability to edit genes; or CAR-T therapies, which have cured certain kinds of cancer. There's probably dozens of discoveries like that lying around, and if we had a million copies of an AI system that are as knowledgeable and as creative about the field as all those scientists that invented those things, then I think the rate of those discoveries could really proliferate, and some of our really longstanding diseases could be addressed, or even cured."

He added some caveats, of course, but that was a claim echoed on the same day, I think, by OpenAI's Sam Altman: "One of our partners, Color Health, is now using GPT-4 for cancer screening and treatment plans, and that's great. And then maybe a future version will help discover cures for cancer."

Other AI lab leaders, like Mark Zuckerberg, think those claims are getting out of hand: "Part of that is the open-source thing too, so that other companies out there can create different things, and people can just hack on it themselves and mess around with it. So I guess that's a pretty deep worldview that I have, and I find it a pretty big turnoff when people in the tech industry talk about building this one true AI. It's almost as if they think they're creating God or something, and that's just not what we're doing. I don't think that's how this plays out."

Implicitly, he's saying that companies like OpenAI and Anthropic are getting carried away. Later in that interview, though, the CEO of Anthropic admitted that he was somewhat pulling things out of his hat when it came to biology, and actually with scaling too:

"Let's say you extend people's productive ability to work by 10 years. That could be one sixth of the whole economy."

"Do you think that's a realistic target?"

"I mean, again, I know some biology, I know something about how the AI models are going to happen. I wouldn't be able to tell you exactly what would happen, but I can tell a story where it's possible."

"So, 15%? And when could we have added the equivalent of 10 years to our life? What's the time frame?"

"Again, this involves so many unknowns. If I try and give an exact number, it's just going to sound like hype, but a thing I could imagine is: two to three years from now, we have AI systems that are capable of making that kind of discovery; five years from now, those discoveries are actually being made; and five years after that, it's all gone through the regulatory apparatus. So we're talking about a little over a decade. But really, I'm just pulling things out of my hat here. I don't know that much about drug discovery, I don't know that much about biology, and frankly, although I invented AI scaling, I don't know that much about that either. I can't predict it."

The truth, of course, is that we simply don't know what the ramifications will be of the scaling, and of course of new research. Regardless, these companies are pressing ahead: "Right now, $100 million; there are models in training today that are more like a billion. I think if we go to 10 or 100 billion, and I think that will happen in 2025, 2026, maybe 2027, and the algorithmic improvements continue apace and the chip improvements continue apace, then I think there is, in my mind, a good chance that by that time we'll be able to get models that are better than most humans at most things."

But I want to know what you think: are we at the dawn of a new era in entertainment and intelligence, or has the hype gone too far? If you want to hear more of my reflections, do check out my podcasts on Patreon, on AI Insiders. You could also check out the dozens of bonus videos I've got on there, and the live meetups arranged via Discord. But regardless, I just want to thank you for getting all the way to the end and joining me in these wild times. Have a wonderful day.
