Amazon CEO's LEAKED Conversation Reveals Stunning Truth About The Future Of Software Engineering

TheAIGRID
24 Aug 2024, 28:54

Summary

TLDR: The script discusses the impact of AI on software development, referencing a leaked recording of Amazon Web Services CEO Matt Garman. It suggests that AI advancements could reduce the need for traditional coding by developers. The video aims to provide grounded insights into the potential shifts in the industry, highlighting the rapid progress in AI capabilities and the need for developers to adapt by upskilling and focusing on innovation and user experience. It also addresses the fear of job loss due to AI, emphasizing the optimistic view that roles will change and new creative opportunities will emerge, rather than the profession going extinct.

Takeaways

  • The script discusses the rapid evolution of AI in software development and its potential impact on the job market, emphasizing that AI is not about to replace all software engineers but will change the nature of their work.
  • A leaked recording from Amazon Web Services CEO Matt Garman suggests that AI could take over many coding tasks, implying that developers might need to develop other skills as AI advances.
  • The prediction is made that within 24 months, most developers might not be coding as AI takes over, highlighting a significant shift in the industry within the next two years.
  • AI's ability to perform coding tasks in natural language with high efficiency is a game-changer, potentially reducing the need for traditional coding by software engineers.
  • The role of software developers is expected to evolve, with a focus on innovation and building user-centric solutions rather than just writing code.
  • The script references the rapid improvement in AI's coding capabilities, with benchmarks showing significant progress in a short time frame.
  • The importance of understanding the fundamentals of coding and the underlying technology is underscored, as AI tools will still require knowledgeable users to operate effectively.
  • The video script suggests that the fear of AI replacing jobs is somewhat misplaced, as the technology is more likely to augment the role of software engineers rather than eliminate it.
  • OpenAI's release of a human-validated subset of the SWE Bench indicates a commitment to improving the evaluation of AI models' ability to solve real-world software issues.
  • The script points to the potential for AI to democratize programming, allowing domain experts to utilize technology without needing to be skilled programmers.
  • The takeaway for aspiring software developers is to focus on developing a broad set of skills, including understanding AI and its applications in software engineering.

Q & A

  • What was the main point discussed in the leaked Amazon Cloud Chief's conversation?

    -The main point was that AI could potentially take over many coding tasks, which might lead to a shift in the role of software developers, rather than completely replacing them.

  • What is Amazon Web Services CEO Matt Garman's prediction regarding AI and coding?

    -Matt Garman predicts that within 24 months, AI could take over many coding tasks, and software engineers might need to develop other skills as their traditional coding role changes.

  • What is the current capability of AI in coding tasks according to the transcript?

    -AI is already capable of performing many coding tasks with remarkable efficiency in natural language, and this capability is expected to increase in the near future.

  • What does the term 'SWE Bench' refer to in the context of the script?

    -SWE Bench is a benchmark for evaluating large language models' abilities to solve real-world software issues sourced from GitHub.

  • How has the performance of AI models on the SWE Bench evolved recently?

    -The performance of AI models on the SWE Bench has significantly improved, with some models like Cosine's Genie scoring as high as 43.8%, indicating rapid progress in AI's software engineering capabilities.

  • What does the transcript suggest about the future role of software engineers?

    -The transcript suggests that the role of software engineers will change, with a focus on innovation, understanding customer needs, and managing AI systems rather than solely writing code.

  • What is the significance of the statement 'the hottest new programming language is English'?

    -This statement highlights the idea that as AI systems become more adept at understanding natural language prompts, English could become the primary means of interacting with these systems for programming tasks.

  • How does the transcript address the fear of AI replacing jobs in the software industry?

    -The transcript acknowledges the fear but emphasizes that AI is more likely to change the role of software engineers by automating routine coding tasks, allowing them to focus on more creative and user-centric work.

  • What is the potential impact of AI advancements on the hiring process in tech companies?

    -The potential impact includes a shift in the skills required for software engineering roles, with a possible focus on managing AI systems and understanding customer needs rather than traditional coding skills.

  • What does the transcript suggest about the rate of improvement in AI's coding capabilities?

    -The transcript suggests that the rate of improvement is accelerating, with significant monthly gains in performance on benchmarks like the SWE Bench, indicating a rapid advancement in AI's ability to perform coding tasks.

  • How does the transcript discuss the balance between AI hype and realistic expectations?

    -The transcript attempts to provide a balanced view by acknowledging the rapid advancements in AI while also referencing skeptics who argue that some claims about AI capabilities may be overhyped or premature.

Outlines

00:00

AI's Impact on Software Engineering Jobs

The script discusses the evolving landscape of software development due to AI advancements. It references a leaked recording from Amazon Web Services' CEO, Matt Garman, who suggests that AI could soon take over many coding tasks, prompting a shift in the role of software engineers. The video aims to debunk hype and provide grounded analysis based on industry developments. Garman's comments are interpreted as an advisory nudge rather than a warning of job extinction, emphasizing the need for developers to adapt by developing other skills as AI continues to advance.

05:02

The Emergence of English as the New Coding Language

This paragraph highlights the idea that English could become the primary means of interacting with AI systems, as AI's ability to understand and execute natural language prompts is improving. The role of software developers is expected to change, with a focus on understanding customer needs and innovating rather than writing code. The video script points out that AI tools could automate code generation, potentially reducing the demand for traditional coding skills but also creating opportunities for developers to upskill and work more closely with AI to enhance productivity.

10:03

Predictions and Speculations on AI in Coding

The script delves into predictions about the future of programming with AI, citing opinions from industry leaders like the CEO of Stability AI and Nvidia's CEO, who foresee a significant reduction in the need for human programmers. It emphasizes the importance of domain expertise and the potential for AI to democratize programming by enabling non-experts to solve complex problems. The video also addresses the rapid pace of AI development and the need for continuous learning and adaptation in the field of software engineering.

15:04

Analysis of AI's Progress in Software Engineering Tasks

The paragraph presents an analysis of AI's capabilities in software engineering, particularly focusing on the performance of AI models on the Software Engineering Benchmark (SWE Bench). It discusses the rapid improvement in AI's ability to solve real-world software issues, as evidenced by the increasing scores on the SWE Bench. The video script also mentions OpenAI's release of a human-validated subset of the SWE Bench to more accurately evaluate AI models, suggesting that previous benchmarks may have underestimated AI's capabilities.

20:05

Projecting AI's Future in Software Engineering

This section of the script extrapolates the current rate of improvement in AI to predict future capabilities in software engineering. It suggests that AI could be capable of solving a significant majority of software engineering tasks within the next 10 to 12 months, based on the current acceleration phase of technological progress. The video acknowledges the potential for this growth to slow as AI approaches theoretical limits but maintains that the next 24 to 48 months will see a substantial shift in the role of software engineers due to AI.
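The extrapolation described here can be sanity-checked with simple arithmetic. The following is a minimal sketch, not the video's own method: it assumes a purely linear trend, takes the 43.8% SWE Bench score cited in the Q & A as the starting point, and uses the 90%-within-12-months figure from the Highlights as the target. The function names are hypothetical and chosen only for illustration.

```python
def required_monthly_gain(current_pct: float, target_pct: float, months: float) -> float:
    """Percentage points per month needed to reach target_pct, assuming a linear trend."""
    if months <= 0:
        raise ValueError("months must be positive")
    return (target_pct - current_pct) / months


def months_to_target(current_pct: float, target_pct: float, monthly_gain: float) -> float:
    """Months needed to reach target_pct at a fixed monthly gain (linear assumption)."""
    if monthly_gain <= 0:
        raise ValueError("monthly_gain must be positive")
    return max(0.0, (target_pct - current_pct) / monthly_gain)


# Figures taken from the summary: 43.8% today, 90% within 12 months.
gain = required_monthly_gain(43.8, 90.0, 12)
print(f"Implied gain: {gain:.2f} percentage points per month")
```

Under these assumptions the claim implies a sustained gain of roughly 3.85 percentage points every month. Real benchmark progress is rarely linear, which is consistent with the section's own caveat that growth may slow as scores approach their limits.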

25:06

AlphaCode 2's Achievements in Competitive Programming

The final paragraph discusses the achievements of AlphaCode 2, an AI system that has demonstrated high performance in competitive programming challenges. The system's ability to rank between expert and candidate master levels on Codeforces, a well-known competitive programming platform, indicates its advanced problem-solving capabilities. The video script uses this as an example to illustrate the potential future impact of AI on the field of software engineering, suggesting that AI could soon perform tasks typically expected of mid-level to senior software engineers.

Keywords

AI

AI, or Artificial Intelligence, refers to the simulation of human intelligence in machines that are programmed to think like humans and mimic their actions. In the context of the video, AI is central to the discussion about the future of software development. The video discusses how AI advancements might lead to changes in the tasks performed by software engineers, potentially reducing the need for manual coding.

Software Development

Software Development is the process of creating and maintaining applications and systems. It involves coding, testing, and debugging. The video script suggests that AI could revolutionize this field by taking over many coding tasks, which would shift the focus of software engineers towards other aspects of development, such as innovation and user experience.

Coding

Coding is the act of writing source code in a particular programming language. The video discusses the possibility that AI could replace or reduce the amount of coding done by humans, as AI systems become more proficient at generating code from natural language prompts.

Amazon Web Services (AWS)

Amazon Web Services is a subsidiary of Amazon that provides on-demand cloud computing platforms and APIs to individuals, companies, and governments. In the video, AWS CEO Matt Garman's comments on AI's impact on software development are highlighted, suggesting that AWS is actively considering and preparing for AI's role in the future of coding.

Natural Language Processing (NLP)

Natural Language Processing is a subfield of AI that focuses on the interaction between computers and humans through natural language. The video mentions that AI's ability to understand and generate code from natural language is a significant factor in its potential to change software development.

Developers

Developers, or software developers, are professionals who create software applications and systems. The video script discusses the potential shift in the role of developers due to AI, suggesting that they may need to develop other skills as AI takes over many coding tasks.

Innovation

Innovation refers to the process of translating an idea or invention into a good or service that creates value or for which customers will pay. The video suggests that as AI takes over routine coding tasks, developers will focus more on innovation, creating new and interesting applications for end users.

AI Models

AI models are algorithms, often machine learning models, that are trained to perform tasks such as image recognition, language translation, or in this case, coding. The video discusses the rapid improvement in AI models' abilities to perform software engineering tasks, indicating a potential future where these models can handle more complex coding challenges.

Upskilling

Upskilling is the process of learning new skills or improving existing ones to meet new job requirements. The video mentions that AWS is helping employees upskill to adapt to the changes AI will bring to the software development industry.

Competitive Programming

Competitive Programming is an activity where programmers try to solve problems in a fixed amount of time, often in a contest format. The video references AlphaCode 2's performance in competitive programming as an indicator of AI's growing capabilities in complex coding tasks, suggesting that AI is becoming capable of performing at a level comparable to advanced human programmers.

Highlights

Amazon Web Services CEO Matt Garman suggests most developers may stop coding as AI takes over coding tasks.

AI's ability to perform coding tasks in natural language is speculated to replace many traditional coding jobs.

Garman's comments were advisory rather than a warning, indicating a shift in software engineering dynamics.

The role of software developers will change, with a focus on innovation and user experience rather than coding.

AI advancements are expected to automate code generation, reducing the need for manual coding by developers.

The prediction that most developers won't be coding in 24 months reflects the rapid pace of AI in software development.

Coding as a language to communicate with computers may become less central as AI takes on more coding tasks.

The potential for AI to generate entire programs from a single prompt signifies a major shift in software creation.

The hottest new programming language being English reflects the use of natural language prompts for AI coding.

Garman's optimistic tone suggests more creative opportunities for developers in the AI-enhanced landscape.

AWS is helping employees upskill to increase productivity with AI, showing a proactive approach to industry change.

The demand for software engineers may increase in the short term due to the need to understand and manage AI systems.

The role of a developer in 2025 may be significantly different, with a focus on managing AI rather than traditional coding.

The rapid improvement in AI coding capabilities, as seen in benchmarks, suggests an approaching inflection point in the industry.

OpenAI's release of a human-validated subset of the SWE Bench aims to more accurately evaluate AI's software engineering abilities.

The underestimation of AI capabilities by current benchmarks indicates that models may be more advanced than realized.

The improvements in AI coding agents' performance on the SWE Bench show an acceleration in the capabilities of AI in software engineering.

Estimations based on current improvement rates suggest AI could reach 90% of SWE tasks within the next 12 months.

The potential for AI to perform at a level comparable to expert competitive programmers indicates advanced problem-solving capabilities.

The video discusses the future of software engineering roles and the impact of AI on hiring processes and company structures.

Transcripts

play00:00

so software development AI and the job

play00:02

market is rapidly evolving and in a

play00:05

recent leaked recording an Amazon Cloud

play00:07

Chief tells employees that most

play00:10

developers could stop coding as soon as

play00:12

AI takes over now I know what most of

play00:14

you guys are initially thinking you're

play00:16

probably thinking that this is an AI

play00:18

hype video where I'm saying that oh no

play00:20

all AI software Engineers are going to

play00:22

replace all traditional software

play00:24

engineer that's not what I'm saying I'm

play00:25

going to break down this article and

play00:27

show you the actual real grounded truth

play00:29

that's based on actual industry-wide

play00:30

developments in the AI community that

play00:32

most people aren't paying attention to

play00:34

so this is an article from Business

play00:37

Insider I find it quite insightful

play00:39

because this is something that actually

play00:41

happened in a leaked conversation with

play00:43

the rate of current AI developments I'm

play00:46

not sure that this person is entirely

play00:48

wrong although I think there are a few

play00:50

nuances that maybe this article doesn't

play00:53

pay attention to so a summary of this

play00:55

article basically says that the Amazon

play00:57

web services CEO Matt Garman shared

play01:00

thoughts on AI during an internal

play01:02

fireside chat in June and Business

play01:04

Insider obtained a recording of the

play01:06

meeting Garman's comments were a kind of

play01:08

advisory nudge rather than a dire

play01:10

warning to software Engineers hence the

play01:13

part where I'm stating that this isn't a

play01:15

kind of oh no all software Engineers are

play01:17

gone but like I said before AI is most

play01:19

certainly going to be changing the

play01:21

dynamic so let's take a look at this and

play01:23

see exactly what was said and what this

play01:25

truly does mean and any other further

play01:27

things for the industry so he says here

play01:30

the software Engineers may need to

play01:32

develop other skills as soon as

play01:34

artificial intelligence takes over many

play01:36

coding tasks if you aren't familiar with

play01:38

the current concept of how good AI is

play01:41

many people have been speculating that

play01:43

AI is going to replace many coding tasks

play01:46

due to its ability to perform many

play01:48

coding tasks in natural language with a

play01:50

remarkable level of efficiency now I do

play01:52

think that sometime in the future this

play01:54

is going to happen but there is a bit

play01:56

more detail that you do need to pay

play01:58

attention and it says that's according

play02:00

to Amazon web services CEO Matt Garman

play02:03

who shared his thoughts on the topic

play02:04

during an internal fireside chat held in

play02:07

June according to a recording of the

play02:09

meeting obtained by Business Insider now

play02:11

here's where he gives his prediction for

play02:12

the dates in terms of where he thinks

play02:15

this event will happen so he says that

play02:17

if you go 24 months from now which is

play02:19

literally just 2 years he says or some

play02:22

amount of time I can't exactly predict

play02:24

where it is but it is possible that most

play02:27

developers are not coding okay and I

play02:30

think what he says here is rather

play02:31

accurate considering how people

play02:33

interpret that comment okay so he says

play02:35

here is that coding is just kind of like

play02:37

that language where we talk to computers

play02:40

it's not necessarily the skill in and of

play02:42

itself the executive said he said that

play02:44

skill in and of itself is like how do I

play02:46

innovate how do I go build something

play02:48

that's interesting for my end users to

play02:51

use and I think this right here actually

play02:54

captures what most people miss about

play02:56

this when people say that okay software

play02:58

engineering is is going to potentially

play03:00

change considering the rate of tools and

play03:03

advancements in the AI space and how

play03:06

good these systems are getting at coding

play03:08

related tasks and when we look into the

play03:10

future we can kind of see that okay this

play03:12

is clearly going to change the industry

play03:14

now of course he does say here that 24

play03:16

months from now things are going to look

play03:17

different and I think 24 months from now

play03:20

is not a bad estimate because 24 months

play03:23

from now is 2 years currently it's 2024

play03:26

that would be 2026 and in 24 months from

play03:30

now in 2 years arguably there would have

play03:32

at least been potentially two more

play03:35

scale-ups of AI models now maybe there

play03:37

might not be two more scale-ups of

play03:38

course there are all of these things

play03:40

that we cannot truly predict but I do

play03:42

think 24 months from now the space might

play03:44

be in a completely different position

play03:46

now with that being said if we are in a

play03:48

position where 24 months from now these

play03:50

systems are absolutely amazing where you

play03:52

can simply build products through

play03:54

natural language prompts then it is

play03:56

possible that most developers aren't

play03:59

going to be coding during a time where

play04:01

llms are doing the majority of the heavy

play04:03

lifting so I think that that is an

play04:05

accurate statement for 24 months from

play04:07

now what he doesn't say and what people

play04:09

might take away from this is that

play04:10

developers are going to be useless and

play04:12

all of their jobs are going to be gone

play04:13

that's not what he's saying what he's

play04:15

saying is that most of them aren't going

play04:17

to be coding remember he said this is

play04:19

more of like an advisory nudge rather

play04:21

than a dire warning now the thing here

play04:23

as well is that coding is essentially

play04:24

just a language that we actually talk to

play04:26

computers and how we get them to do

play04:28

exactly what we want and basically if we

play04:30

think about like far into the future of

play04:32

course it might not be 24 months from

play04:33

now it might be 4 years from now but the

play04:35

end goal is always how do we innovate

play04:38

and of course how do we actually build

play04:40

something that's interesting for my end

play04:41

users to use I think this is mainly the

play04:44

end goal for anyone who's using software

play04:46

the end goal is always how can I make

play04:48

this you know product better for my

play04:50

users and how can I innovate within this

play04:53

to make products that are actually good

play04:55

for my users so I think that that is a

play04:57

really important prediction now one of

play04:59

the things that was recently said on

play05:01

Twitter you know earlier last year was

play05:04

the fact that the hottest new

play05:06

programming language is English now this

play05:09

you know saying that the hottest new

play05:11

programming language is English is

play05:12

basically referring to the fact that

play05:14

English is what people use to talk to

play05:16

llms and if you've been talking to llms

play05:19

you'll know that when you talk to llms

play05:21

they can manage to get a lot of your

play05:23

understanding through natural language

play05:25

you know sometimes you do have to do a

play05:27

bit of extra prompting but as long as

play05:29

you understand English you're going to

play05:30

be working in a very easy environment

play05:33

with these programs now the article

play05:35

continues to State some more things like

play05:38

this role will change and he says the

play05:40

role of the software developer will

play05:42

change Garman said it just means that

play05:45

each of us has to get more in tune with

play05:48

what our customers need and what the

play05:49

actual end thing is that we're going to

play05:52

try and go build okay because that's

play05:54

going to be more and more of what work

play05:55

is as opposed to sitting down and

play05:57

actually writing code and this is

play05:59

something that I do agree with once the

play06:01

AI is able to completely automate code

play06:03

let's say you know 10 years from now ai

play06:06

is just able to generate code for like

play06:08

you know an entire program with one

play06:09

single prompt which I think we're

play06:11

starting to see you know early Sparks of

play06:13

that with Claude 3.5 I think of course

play06:16

the main thing you know the kind of

play06:18

place that you want to be in is one

play06:20

where you can actually think about what

play06:22

the end user experience is going to be

play06:24

like and what customers are actually

play06:26

going to want of course it's not going

play06:28

to be in you know doing the heavy

play06:29

through code if llms are going to be

play06:31

doing that the work is going to shift to

play06:33

be you know actually understanding what

play06:36

customers actually need and what the end

play06:38

thing is so I think what he's stating

play06:40

here is that you know this role is

play06:42

probably not going to go away but you

play06:44

know when you actually think about it

play06:45

the role is going to change and it's

play06:47

going to be really interesting to see

play06:48

how the role manages to change when a

play06:50

large portion of your work does get

play06:52

automated I think it's going to be

play06:53

interesting to see how individuals

play06:55

manage to adapt to that changing work

play06:58

environment and use any other skills in

play07:00

order to adapt to the workplace now I do

play07:02

think that this is going to be something

play07:04

that is quite true of course he says

play07:06

here that this is no dire warning of

play07:08

course the talk of AI changing and even

play07:11

eliminating jobs has intensified lately

play07:13

as companies lay off employees or stop

play07:15

hiring to shift resources towards AI

play07:18

development new AI tools that

play07:19

automatically generate code can help

play07:21

companies do more with the same number

play07:23

of Engineers or fewer of these pricey

play07:25

employees if you aren't familiar with

play07:27

the price of an you know software

play07:29

engineer these guys are paid big big

play07:31

bucks especially at FAANG companies and

play07:34

demand uh you know higher salary than

play07:36

most traditional roles you can see here

play07:38

it says in Garman's case he was sharing

play07:40

advice rather than issuing a dire

play07:41

warning that software developers will go

play07:43

extinct because of AI his tone is

play07:45

optimistic suggesting more creative

play07:47

opportunities for developers and he says

play07:49

that AWS Amazon web services was helping

play07:53

employees continue to upskill and learn

play07:55

about new technologies to increase their

play07:57

productivity with the help of AI now I

play08:00

think this is a stellar statement

play08:02

because there is a lot of fear right now

play08:05

the fact that AI can do a lot of things

play08:07

and it's advancing so rapidly the fact

play08:09

that this could eliminate jobs is

play08:11

certainly a fear amongst many which is

play08:13

why of course I do have my community but

play08:15

this is something that like I said

play08:17

before is a role that I think it's going

play08:19

to be enhanced by AI because the thing

play08:21

is that right now what we're seeing is

play08:23

we're seeing an influx of people that

play08:25

are getting into code because of these

play08:27

you know llms and these systems what we

play08:29

have now is a place where you know you

play08:31

can ask an llm to code something for you

play08:33

completely basic but if you don't

play08:35

understand how that code Works how you

play08:37

can change that code what to kind of

play08:39

prompt the llm you're still going to be

play08:40

pretty stuck in a rudimentary manner

play08:42

when you're trying to build something

play08:44

and I think that is also going to

play08:46

actually short-term increase a demand for

play08:49

software Engineers because there are

play08:50

many people I know right now including

play08:52

myself who are building certain things

play08:55

you know experimenting with code that

play08:57

truly haven't really done that before so

play08:59

it's going to be kind of interesting to

play09:00

see how that Dynamic manages to shift

play09:02

and how companies manage to integrate

play09:04

software Engineers as rather more than

play09:06

you know coders now I guess you could

play09:08

say orchestrators of you know pieces of

play09:10

software as their main focus so you can

play09:11

see right here he says being a developer

play09:13

in 2025 may be different than what it

play09:16

was as a developer in 2020 and I think

play09:18

this is going to be you know rather true

play09:20

as your role's main focus is probably

play09:22

going to shift so that's a huge hint

play09:24

towards any aspiring software developer

play09:26

or someone who is a software developer

play09:28

the kinds of things that you're going to be

play09:29

focusing now essentially he says here

play09:31

that this is no more undifferentiated

play09:34

heavy lifting see an Amazon web services

play09:36

spokesperson Aisha Johnson told Business

play09:39

Insider that garman's comments conveyed

play09:42

opportunities for developers to

play09:44

accomplish more than they do today with

play09:46

new AI tools he added that there was no

play09:48

indication he expected a decline in the

play09:51

role of developers like I said before you

play09:53

know these tools ideally we do want them

play09:56

to do the heavy lifting which is going

play09:58

to free up more time for tasks that do

play10:00

matter such as actually thinking about

play10:03

what the end user wants which means that

play10:05

overall these experiences are going to

play10:07

get better now one of the things I do

play10:10

also want to talk about was the fact

play10:12

that whilst the statement does come from

play10:14

the Amazon web services Cloud Chief you

play10:17

know him stating in a private chat it

play10:19

does seem quite bad like oh this company

play10:22

had this private chat and they were

play10:24

behind closed doors saying that you know

play10:26

AI could take over with coding he's not

play10:28

the only person that has said this okay

play10:30

you know stability AI that company the

play10:33

CEO Emad Mostaque also predicted in 2023

play10:37

that there will be no human programmers

play10:40

in 5 years he based this off a

play10:42

prediction you know on a few factors

play10:44

including the GitHub data so 41% of all

play10:47

code on GitHub is currently AI generated

play10:51

and the fact that Mostaque believes that

play10:52

chat GPT will be available on all mobile

play10:55

phones without an now this is an AI

play10:57

overview so I haven't fact checked all

play10:59

this stuff but the point here is that

play11:00

he's not the only person that has made

play11:02

this prediction other people including

play11:04

nvidia's CEO has also iterated and

play11:07

spoken about this as well I want to say

play11:10

something and it's it's going to sound

play11:11

completely opposite of what people feel

play11:13

over the course of the last 10 years 15

play11:15

years um almost everybody who sits on a

play11:17

stage like this would tell you it is

play11:19

vital that your children learn computer

play11:22

science um everybody should learn how to

play11:24

program and in fact it's almost exactly

play11:26

the opposite it is our job to create

play11:28

computer technology such that nobody has

play11:32

to program and that the programming

play11:33

language is human everybody in the world

play11:37

is now a programmer this is the miracle

play11:39

of artificial intelligence the countries

play11:41

the people that understand how to solve

play11:44

a domain problem in digital biology or

play11:47

in education of young people or in

play11:50

manufacturing or in farming those people

play11:52

who understand domain expertise now can

play11:55

utilize technology that is readily

play11:57

available to you you now have a computer

play11:59

that will do what you tell it to do it

play12:01

is vital that we upskill everyone and

play12:03

the upskilling process I I believe will

play12:06

be delightful surprising so yeah that's

play12:09

Nvidia CEO talking about the future and

play12:13

I think what he's stating here a lot of

play12:15

people are thinking okay do I now switch

play12:17

my career I think what he's more so

play12:19

talking about is like you know people

play12:20

who are just starting and those who

play12:22

are extremely young getting into the

play12:25

space that by the time their career

play12:27

matures the area is going to be a lot

play12:29

different now I still think you're going

play12:31

to need to you know know the

play12:32

fundamentals behind the scenes and that

play12:33

kind of stuff but some people disagree

play12:36

now in this video what I wanted to do is

play12:38

I wanted to keep this video as balanced

play12:41

as possible because I know that there is

play12:43

a large amount of people out there who

play12:45

do believe that this is something that

play12:48

is completely overhyped and I did see a

play12:50

recent video that actually spoke about

play12:53

this in a lot of depth there were two

play12:55

videos that I did watch and I'm going to

play12:56

mention those now because I always try

play12:59

to you know understand where my biases

play13:01

may lie if I'm someone who has a channel

play13:04

in the AI space talking about the

play13:06

technology there are incentives for me

play13:08

to exaggerate AI claims however I'm not

play13:11

that kind of person I understand that

play13:13

there are nuances to things and that

play13:15

sometimes the technology might not be as

play13:18

great as it is promised for example we

play13:21

did have the Devin saga where you know

play13:24

the Internet of Bugs four months ago you

play13:27

know a few weeks after Devin was

play13:28

released he released this video a

play13:30

25-minute investigation into the first

play13:33

AI software engineer now essentially he

play13:35

stated in this video that it was you

play13:37

know an Upwork lie and that you know

play13:39

essentially Devin wasn't as good as they

play13:41

claimed now that's completely

play13:43

understandable a lot of the times with

play13:45

technology demos things aren't as good

play13:47

as the creators claim because they're

play13:49

trying to drum up hype for their product

play13:51

and it worked and the thing is is that

play13:54

while yes potentially Devin was a system

play13:57

that may have been a bit overhyped I

play13:59

think the underlying message is still

play14:01

quite true the tech in the space is

play14:03

actually not overhyped at all and the

play14:06

most surprising thing about this is that

play14:08

in the four months since the Devin system

play14:10

was released you can see that this video

play14:12

was literally four months ago there have

play14:14

actually been numerous developments in

play14:16

the AI software engineering space that

play14:18

most people haven't paid attention to Devin

play14:21

actually managed to grab the you know

play14:23

collective consciousness of the software

play14:25

engineering space on social media but

play14:28

the other more incremental updates the

play14:30

ones that actually matter and are slowly

play14:32

moving upwards people haven't been

play14:34

paying attention that's why I put here

play14:36

that things move pretty quickly

play14:38

especially in the AI space and I'm going

play14:40

to show you guys what I'm talking about

play14:42

so OpenAI decided to release the

play14:45

software engineering bench they

play14:46

introduced this on the 13th of August

play14:48

2024 which is just around 10 days ago

play14:51

now depending on when I release this

play14:53

video so they said we're releasing a

play14:55

human-validated subset of SWE-bench

play14:58

that more reliably evaluates AI models'

play15:01

ability to solve real world software

play15:03

issues so they basically said that look

play15:06

there are issues on this bench okay and

play15:08

you guys can take a look at this here it

play15:10

says that look all right one of the most

play15:12

popular evaluation suites for software

play15:14

engineering is SWE-bench a benchmark

play15:17

for evaluating large language models'

play15:19

abilities to solve real world software

play15:21

issues sourced from GitHub the benchmark

play15:23

involves giving agents a code repository

play15:25

and issue description and challenging

play15:27

them now listen to this to generate a

play15:29

patch that resolves the problem

play15:30

described by the issue now coding agents

play15:33

have made impressive progress on SWE-bench

play15:35

with top scoring agents scoring

play15:37

20% on SWE-bench and 43% on SWE-bench

play15:42

Lite according to that

play15:43

leaderboard as of August the 5th but

play15:46

here's the kicker okay they said that

play15:48

their testing identified some SWE-bench

play15:51

tasks which may be hard or impossible

play15:55

okay to solve so basically what they're

play15:57

saying here is that look we made a

play15:59

benchmark and when we looked at the one

play16:01

that everyone's currently benchmarking

play16:02

their systems on that Benchmark had some

play16:04

issues that were way too hard or

play16:06

completely impossible to solve leading

play16:09

SWE-bench to systematically

play16:12

underestimate the models' autonomous

play16:15

software engineering capabilities I'm

play16:16

going to say that one more time OpenAI

play16:18

came out and said that our testing

play16:20

indicates that some SWE-bench tasks are

play16:23

impossible to solve leading to SWE-bench

play16:26

systematically underestimating the

play16:29

models' autonomous software engineering

play16:31

capability which basically means that

play16:32

look we made benchmarks and we realized

play16:34

that your ones were pretty impossible to

play16:36

solve and you guys aren't truly

play16:38

realizing how capable these models truly

play16:41

are which is an issue because if you've

play16:43

been paying attention to the space what

play16:45

you'll know is that there have been

play16:47

major major updates for example with

play16:49

fine-tuning you can see that the actual

play16:52

improvement since Devin since 4 months

play16:54

ago this has completely doubled guys and

play16:57

the software engineering you know bench

play16:59

you can see right here that since the

play17:01

Devin era which was around 13% you can

play17:03

see that that was around here during

play17:05

that time in 4 months performance has

play17:08

doubled okay we've had numerous

play17:10

competitors come out of the works we've

play17:12

had Amazon's Q Developer agent get

play17:15

38.8% we've got the Factory Code Droid

play17:17

that are you know aiming to build you

play17:19

know like an army of autonomous software

play17:21

agents and then we of course recently

play17:23

had Cosine Genie which is now the

play17:25

state-of-the-art model at 43.8% on that

play17:29

same benchmark that Devin was on that

play17:32

17% benchmark so for those you know who

play17:34

are saying that you know this one

play17:35

doesn't work and this you know is awful

play17:37

yada yada yada I would love to see what people

play17:40

have to say about the actual

play17:42

improvements because many people are now

play17:44

dismissing these claims and many people

play17:46

are now stating that look this thing

play17:48

isn't good at all this isn't actually

play17:50

any kind of reliable Improvement but if

play17:52

we've just been paying attention to the

play17:54

rate of improvement here we can see that

play17:55

this is absolutely incredible now I've

play17:57

made this table in Google and you can

play17:59

see right here that the improvements

play18:02

here are rather stark we can see that

play18:04

literally at the start of 2024 we were

play18:06

at 7% you can literally just see right

play18:08

here 7% and now we are 8 months through

play18:10

2024 and we're at

play18:13

43.8% and remember Devin was only four

play18:16

months ago you can see and in the four

play18:17

months you know we've had you know 38 38

play18:20

37 36 26 you know things have been

play18:22

moving rather rather quickly so it's

play18:26

important to know that now I did some

play18:28

you know not testing but you know I

play18:29

spoke to Claude I wanted to ask them how

play18:31

quickly are we going to be moving

play18:33

towards 90% because the reason I said

play18:36

90% is because to go from like 40% to

play18:39

90% is rather easy and that is going to

play18:41

be an acceleratory period but to get

play18:43

from 90% to like 99% is a lot harder

play18:47

like making gains when you're already

play18:49

you know 90% of the way there it becomes

play18:52

you know exponentially harder to make

play18:54

those additional gains so it says here

play18:56

that there is a significant jump from

play18:58

the older models below 10% to the newer

play19:00

ones about 20% the Improvement seems to

play19:03

be accelerating with more recent models

play19:06

showing larger gains this is for many

play19:08

factors including the fact that you know

play19:10

all these companies are now coming

play19:11

out of stealth showing you know everyone

play19:13

what they've been building and it says

play19:15

here that given these observations we

play19:17

can make a rough estimate the

play19:18

improvement from the best 2023 model

play19:21

which is retrieval-augmented

play19:23

generation and Claude 3 Opus to the best

play19:25

2024 model is about a

play19:27

36.8 percentage point increase in roughly 8 months

play19:31

this suggests an average Improvement

play19:32

rate of about 4.6 percentage points per

play19:34

month to get from 43.8% to 90% we need

play19:38

an additional 46.2 percentage points at

play19:41

the current rate of improvement it would

play19:42

take approximately just 10 months

play19:45

to reach 90% however technological

play19:48

progress often follows an S curve where

play19:50

improvements accelerate for a while then

play19:51

slow down as we approach theoretical

play19:53

limits and we're likely in the

play19:55

acceleration phase now considering this

play19:57

and the rapid recent progress I would

play19:59

estimate that reaching 90% could be

play20:01

possible within 6 to 12 months from the

play20:03

latest data point placing the

play20:05

prediction sometime between February and

play20:07

August

play20:08

2025 so with Claude 3.5 doing the

play20:12

analysis here you can see that if we

play20:14

actually look at the data and we you

play20:16

know extrapolate out further and say

play20:18

okay you know we know what's going to

play20:19

happen Claude is basically predicting

play20:21

that look within 10 months you know it's

play20:23

going to be 90% of all SWE-bench tasks that

play20:26

are quite possible and then you know

play20:28

Amazon Cloud Chief is basically saying

play20:30

that look you know in the 24 months to

play20:32

come things are going to be rather

play20:33

different it doesn't seem that crazy

play20:35

when you actually break it down number

play20:37

by number and these are some important

play20:39

things that you do have to pay attention

play20:40

to especially if you're in this space
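
The extrapolation quoted above can be reproduced with a few lines of arithmetic. What follows is only a rough sketch using the figures mentioned in the video (7% at the start of 2024, 43.8% eight months later, a 90% target). The function names are made up for illustration, and the S-curve variant assumes a logistic curve with a 100% ceiling, which is an assumption of this sketch and not something stated in the video.

```python
import math

# Figures quoted in the transcript (benchmark scores, in percent).
START_SCORE = 7.0      # quoted score at the start of 2024
CURRENT_SCORE = 43.8   # quoted state-of-the-art score eight months later
MONTHS_ELAPSED = 8
TARGET_SCORE = 90.0


def linear_months_to_target(start, current, elapsed, target):
    """Months needed if the average monthly gain simply continues."""
    rate = (current - start) / elapsed  # percentage points per month
    return (target - current) / rate


def logit(percent):
    """Log-odds of a score given in percent."""
    p = percent / 100.0
    return math.log(p / (1.0 - p))


def logistic_months_to_target(start, current, elapsed, target):
    """Months needed on an S-curve fitted through the same two points.

    Assumption (not from the transcript): a logistic curve with a 100% ceiling.
    """
    growth = (logit(current) - logit(start)) / elapsed  # log-odds per month
    return (logit(target) - logit(current)) / growth


linear = linear_months_to_target(START_SCORE, CURRENT_SCORE, MONTHS_ELAPSED, TARGET_SCORE)
s_curve = logistic_months_to_target(START_SCORE, CURRENT_SCORE, MONTHS_ELAPSED, TARGET_SCORE)
print(f"linear: {linear:.1f} months, logistic: {s_curve:.1f} months")
```

Under these assumptions the straight-line estimate comes out at roughly 10 months and the logistic one at a little over 8, so both sit inside the 6 to 12 month window mentioned above. Two data points cannot really pin down an S-curve, so this is best read as a sanity check on the arithmetic and not as a forecast.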

play20:42

now one of the things that most people

play20:45

do actually forget and I do think that

play20:47

this prediction isn't overestimated this

play20:49

isn't like a hype video where I'm like

play20:50

oh my God you know AGI in you know 10

play20:53

months or two months this is more of

play20:55

like you know trying to keep it factual

play20:56

and grounded literally based on the you

play20:58

know benchmarks that we've recently seen

play21:00

but we've seen like 4.6 percentage

play21:02

points each month of course some months

play21:03

are going to be larger some months are

play21:05

going to be not as good but I think if

play21:07

you know if we go at the same rate it's

play21:08

going to be 10 months we can you know

play21:10

say that okay even if it's not 10 months

play21:12

even if we extrapolate and add another

play21:13

year onto that there's going to be huge

play21:16

amounts of significant developments now

play21:18

like I said before there's always one

play21:19

thing that most people forget and this

play21:21

is where I think there's going to be you

play21:22

know huge jumps made in AI's coding

play21:25

capabilities you know in the areas of

play21:27

these frontier you know models doing

play21:29

stuff that just most people didn't even

play21:31

take into account okay and that's why I

play21:33

said the recent videos I'm talking about

play21:34

you know such as this one right here you

play21:37

know it's saying that debunking the AI

play21:39

software engineer yada yada yada this is

play21:40

terrible always take into account the

play21:42

actual benchmarks of other AI systems

play21:45

because that is also important and I did

play21:46

watch this video from NeetCode where he's

play21:49

basically explaining that look hype is a

play21:51

marketing tool and basically stating

play21:53

that hype is completely out of control

play21:56

and that this is you know something that

play21:58

could potentially not get there just yet

play22:00

but I do think that what this video

play22:02

doesn't pay attention to is of course

play22:04

some of the stuff and of course some of

play22:06

the more recent developments and some of

play22:08

the other papers where they actually

play22:09

talk about other coding stuff which I'm

play22:10

going to get into in a minute most

play22:12

powerful marketing tool that's ever

play22:14

existed let's talk Devin AI for a second

play22:17

I made a video talking about it right

play22:19

after the announcement I thought the

play22:20

founders were super smart but also that

play22:23

I'm not worried about Devin because

play22:25

these people are clearly fighting an

play22:27

uphill battle

play22:29

I don't know a lot about it the

play22:30

benchmarks say that this is more

play22:33

effective at software engineering tasks

play22:35

than the other LLMs I wonder if that's

play22:38

because it just puts a few things

play22:40

together like it has its own

play22:41

capabilities to like do research like go

play22:43

on Google browse Stack Overflow and run

play22:46

code and execute code and they just put

play22:48

those pieces together more

play22:50

cohesively than GPT obviously in a short

play22:53

amount of time they didn't create their

play22:54

own complex LLM I'm pretty sure they're

play22:57

using one of the existing LLMs I didn't

play22:59

closely examine the evidence or anything

play23:02

like that it seemed kind of obvious to

play23:04

me a few months later people are

play23:05

realizing they were fooled by the hype

play23:08

Devin is extremely overrated at least

play23:11

for now if you don't believe me maybe

play23:13

you'll believe this guy he basically

play23:15

proved that at this point Devin AI is

play23:17

more useless than a freshman CS student

play23:20

who's one week into their first

play23:22

programming course so what I wanted to

play23:24

talk about you know after you know

play23:25

viewing NeetCode's video it was definitely an

play23:27

insightful video and I do think that it

play23:29

grounds you know a lot more you know

play23:31

reality in terms of what's actually

play23:32

happening because I think what we do

play23:34

need to pay attention to is the reality

play23:35

of things you know a lot of things have

play23:37

been overblown in that video he talks about

play23:38

how Tesla's been selling full self-driving

play23:40

for years you know companies make claims

play23:42

all the time you've got FTX you've got

play23:43

Theranos this is something that usually

play23:45

does happen with tech companies but most

play23:48

people aren't paying attention to what

play23:49

these Frontier companies are saying okay

play23:51

for example um this is a paper that

play23:52

didn't get a lot of you know information

play23:54

slash like data you know people just

play23:56

basically didn't speak about this paper

play23:58

but essentially this is the AlphaCode 2

play24:00

technical report okay and it talks about

play24:02

how AlphaCode was the first AI system

play24:04

to perform at the level of the median

play24:06

competitor in competitive programming a

play24:08

difficult reasoning task involving

play24:10

advanced maths logic and computer

play24:12

science and this paper introduces

play24:14

AlphaCode 2 a new and enhanced system with

play24:16

massively improved performance powered

play24:18

by Gemini AlphaCode 2 relies on the

play24:20

combination of powerful language models and a

play24:23

bespoke search and reranking

play24:25

mechanism when evaluated on the same

play24:27

platform as the original AlphaCode we

play24:29

found that AlphaCode 2 solved 1.7 times

play24:33

more problems and performed better than

play24:35

85% of competition participants now I'm

play24:38

just going to you know ground this in

play24:40

you know what you can understand here

play24:41

for including myself we can see here

play24:43

that uh basically this was done on the

play24:45

Codeforces okay and this was able to

play24:47

get the 85th percentile on Codeforces

play24:50

so Codeforces is a well-known platform

play24:53

for competitive programming with a

play24:54

ranking system that categorizes

play24:56

participants based on their performance

play24:58

in the contest the ranks include Newbie

play25:00

Pupil Specialist Expert Candidate Master

play25:03

Master International Master Grandmaster

play25:06

International Grandmaster Legendary

play25:07

Grandmaster and reaching 85th percentile

play25:10

on Codeforces means performing better

play25:12

than 85% of all competitors and

play25:14

according to the data AlphaCode 2 ranks

play25:17

between the Expert and Candidate Master

play25:18

levels which are quite advanced now if

play25:21

you want to ground that in terms of how

play25:22

good it could potentially show us how

play25:24

you know these systems are going to be

play25:25

in terms of you know coding abilities

play25:28

and what we can look at for the future

play25:30

you can see here that for Codeforces

play25:32

the expert level which is you know the

play25:33

top 70% to 85% participants at this

play25:36

level have a strong understanding of

play25:38

algorithms and data structures they're

play25:40

proficient in solving standard

play25:41

competitive programming problems and are

play25:43

beginning to solve more challenging

play25:45

problems that require deeper insights or

play25:47

more advanced techniques and the

play25:49

comparison to software engineering roles

play25:51

here basically says that typically

play25:53

focusing on learning and applying

play25:54

basic programming concepts writing

play25:56

clean code contributing to projects

play25:57

under supervision the main problem

play25:59

solving required for competitive

play26:01

programming at the expert or candidate

play26:02

level is generally beyond what is

play26:04

expected of a junior software engineer

play26:06

and this is usually what you get at a

play26:08

midlevel to senior level software

play26:09

engineer and engineers at this level are

play26:12

expected to have strong problem-solving

play26:13

skills understanding algorithms and data

play26:16

structures and be capable of designing

play26:18

efficient and scalable solutions they

play26:20

might not participate in competitive

play26:21

programming but they should be

play26:22

comfortable with similar challenges when

play26:24

needed the point here guys is that what

play26:26

we have is a scenario where you know we

play26:27

shouldn't discount the technology because

play26:30

what we've seen recently is that every

play26:32

single month this is going to you know

play26:34

continually increase one of the things

play26:36

that I think most people aren't paying

play26:38

attention to is the fact that you know

play26:40

even OpenAI have openly said that this

play26:42

is going to be something themselves that

play26:44

they are going to be tackling for

play26:45

example take a look at this clip of

play26:46

Sam Altman actually talking about what one

play26:48

of the biggest areas of improvement is

play26:50

likely going to be for him moving

play26:52

forward more to gain there graph a

play26:54

couple of decades in the future be like

play26:56

H something changed yeah are there

play26:57

application

play26:58

or areas you think are most promising in

play27:00

the next 12 months I'm sure I'm biased

play27:03

just because of where what we do here

play27:05

but code I think is a it's a really big

play27:07

one and the only reason I include that

play27:08

clip is because OpenAI have

play27:10

previously always surprised us with

play27:11

their capabilities and you know they

play27:14

tend to surprise us in areas that we

play27:15

aren't even pretty much focusing on so

play27:17

the takeaways are simple guys what we

play27:19

have here is a situation where you know

play27:22

over the next 24 to 48 months we could

play27:25

see a complete shift in terms of the

play27:27

main role of software engineers I still

play27:30

think that we're going to struggle to

play27:31

get from 90% all the way to you know 99%

play27:34

so I think that there will be like this

play27:35

S-curve growth where you do have a

play27:37

situation where right now you know if

play27:39

you look at the benchmarks we are

play27:41

currently during that acceleration

play27:42

period where you know you've got these

play27:43

systems you know at the start it's

play27:45

really hard to make progress then all of

play27:46

a sudden the jumps happen the jumps

play27:48

happen then as you get to like 90 95%

play27:51

that's where the taper off is so I do

play27:52

think that if you're like a senior level

play27:54

software engineer to someone that's

play27:55

really cracked I do think that you know

play27:57

your job is probably not going to change

play27:58

for the most part because you're still

play28:00

going to have to understand how all of

play28:01

those systems work together and some of

play28:03

the most difficult tasks but I do think

play28:05

that you know as time goes on you know

play28:08

AI is probably going to eat like the

play28:10

bottom area where you're going to be

play28:11

writing code and you know as time goes

play28:12

on it's going to continue to

play28:14

you know shift

play28:17

the role of you know many software

play28:19

engineers I think one of the most

play28:20

interesting things from all of this is

play28:22

how companies are going to change in

play28:23

terms of their hiring process is someone

play28:25

just going to be you know managing AI

play28:27

agents is it going to be a new you know

play28:28

AI agent structure that allows you to

play28:30

you know manage the entire code base

play28:32

you've got AI systems that are going to

play28:34

be working with millions and millions of

play28:36

tokens of context either way I think this

play28:37

is something that you know is going to

play28:39

really really change in the next two to

play28:40

four years arguably one of the biggest

play28:42

changes that's going to happen but

play28:44

either way I don't think it's all doom

play28:45

and gloom I just think that the role

play28:46

will change and I think it's going to be

play28:48

fascinating to see how that happens if

play28:49

you did enjoy the video don't forget to

play28:50

leave a like don't forget to subscribe and

play28:51

I'll see you guys in the next one