OpenAI'S "SECRET MODEL" Just LEAKED! (GPT-5 Release Date, Agents And More)

TheAIGRID
23 May 2024 · 30:22

Summary

TLDR: The video script discusses the rapid evolution of AI models, with OpenAI planning significant advancements in reasoning and intelligence. It hints at a 'GPT next' model set for release in November 2024, which may not follow the traditional 'GPT 5' naming convention. The script also covers investment areas such as textual intelligence, cost-effective models, and multimodal agents, showcasing the potential for transformative AI capabilities in various industries. The presentation at the Viva Technology event in Paris highlighted the exponential growth in AI's computational power and its impact on future models, suggesting a future where AI agents could redefine human-computer interaction.

Takeaways

  • 🚀 OpenAI anticipates releasing a new model, possibly called 'GPT next', which represents a significant leap in AI capabilities beyond the current GPT-4.
  • 📅 The release date for the next major AI model is hinted to be November 2024, in line with the company's cautious approach to the upcoming United States elections.
  • 🤖 The advancements in AI are expected to be so significant that models within the next 1-2 years will be unrecognizable from their current state, indicating a rapid evolution in AI reasoning and intelligence.
  • 📉 OpenAI is committed to making AI models not only more intelligent but also more affordable and efficient, with pricing for GPT models decreasing by up to 80% in just a year.
  • 💡 The company is focusing on four key investment areas: increasing textual intelligence, making models cheaper and faster, custom models, and developing multimodal agents.
  • 🔍 OpenAI's next frontier model is expected to provide a 'step function' in reasoning improvements, suggesting a substantial and sudden increase in AI's ability to understand, process, and generate complex reasoning.
  • 🧠 The potential for AI in fields like medical research and scientific reasoning is highlighted, with the expectation that AI will advance to a level capable of higher-order thinking and problem-solving.
  • 💻 OpenAI has been building increasingly powerful AI supercomputers, scaling from 'shark-size' to 'orca-size' and now 'whale-size', which will train the next set of models with unprecedented capabilities.
  • 🔑 The Assistants API offers a complete toolkit for developers to integrate assistive experiences into their products, including conversation history management, function calling, knowledge retrieval, and a code interpreter.
  • 🌐 The script also covers the impact of AI on various sectors, such as the drive-through industry with AI-powered voice agents, and the potential for AI to democratize access to complex tasks like software development.

Q & A

  • What is the expected timeframe for significant advancements in AI models according to the transcript?

    -The transcript suggests that within a year or two from now, AI models are expected to become unrecognizable from their current state, indicating significant advancements in a relatively short period.

  • What is the significance of the term 'GPT next' mentioned in the transcript?

    -The term 'GPT next' could imply a new model that may not follow the traditional naming convention of 'GPT 5'. It suggests that OpenAI might be planning more than what people expect, possibly indicating a significant leap in technology rather than an incremental upgrade.

  • Why was the Viva Technology event significant in the context of the transcript?

    -The Viva Technology event, also known as Vivatech, is significant because it's where the image indicating the progression from GPT 3 to GPT 4 and hinting at 'GPT next' was first revealed. It's an annual technology conference in Paris, France, dedicated to innovation and startups.

  • What is the potential release date for the 'GPT next' model as hinted in the transcript?

    -The transcript suggests that the 'GPT next' model is expected to be released in November 2024.

  • How does the transcript connect the release of the new AI model with the 2024 United States elections?

    -The transcript mentions that OpenAI's CTO, Mira Murati, confirmed that the elections were a major factor in the release timing of the new model. OpenAI is cautious about not releasing anything that could potentially affect global elections.

  • What are the four investment areas mentioned in the transcript related to AI development?

    -The four investment areas mentioned are: increasing textual intelligence, making models cheaper and faster, custom models, and developing multimodal agents.

  • What is the purpose of the Assistants API mentioned in the transcript?

    -The Assistants API is a toolkit for developers to bring assistive experiences into their products. It manages conversation history, allows function calling to integrate app features, and can interpret and execute code to answer precise questions.
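
    For concreteness, here is a minimal sketch of that toolkit using the openai Python SDK's beta Assistants interface; the model choice, instructions, and example prompt are illustrative assumptions, not code shown in the video.

      from openai import OpenAI

      client = OpenAI()  # reads OPENAI_API_KEY from the environment

      # An assistant bundles instructions and tools.
      assistant = client.beta.assistants.create(
          model="gpt-4o",
          instructions="You are a travel planning assistant.",
          tools=[{"type": "code_interpreter"}],
      )

      # A thread holds the conversation history so the app doesn't have to.
      thread = client.beta.threads.create()
      client.beta.threads.messages.create(
          thread_id=thread.id, role="user",
          content="Top five venues for the Paris Olympics?",
      )

      # A run executes the assistant against the thread; create_and_poll
      # (available in recent SDK versions) waits for completion.
      run = client.beta.threads.runs.create_and_poll(
          thread_id=thread.id, assistant_id=assistant.id
      )  # run.status should be "completed" at this point

      for message in client.beta.threads.messages.list(thread_id=thread.id):
          print(message.role, message.content[0].text.value)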

  • How does the Assistants API facilitate knowledge retrieval, according to the transcript?

    -The Assistants API makes it easy to bring factual data into conversations with models like GPT-4. It automates the embedding of uploaded files, such as a guidebook on Italy, so developers can provide accurate, context-aware responses without building their own retrieval stack.
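
    The video doesn't show the internals, but a bare-bones version of the retrieval stack this replaces looks like the sketch below: split the document into chunks, embed each chunk, and return the chunk closest to the embedded question. The chunk texts here are invented stand-ins for pages of the guidebook.

      from openai import OpenAI

      client = OpenAI()

      def embed(texts):
          # text-embedding-3-small is one of OpenAI's embedding models.
          response = client.embeddings.create(
              model="text-embedding-3-small", input=texts)
          return [item.embedding for item in response.data]

      def cosine(a, b):
          dot = sum(x * y for x, y in zip(a, b))
          na = sum(x * x for x in a) ** 0.5
          nb = sum(y * y for y in b) ** 0.5
          return dot / (na * nb)

      # Stand-ins for chunked pages; real code would extract the PDF text.
      chunks = ["Lazio: the best photo opportunity is the terrace at ...",
                "Rome: the Colosseum is busiest before noon ...",
                "Tuscany: vineyards line the road south of Siena ..."]

      chunk_vecs = embed(chunks)
      query_vec = embed(["Best photo spot in Lazio?"])[0]

      best = max(range(len(chunks)),
                 key=lambda i: cosine(chunk_vecs[i], query_vec))
      print(chunks[best])  # context to hand to the model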

  • What is the role of 'Code Interpreter' in the Assistants API as described in the transcript?

    -'Code Interpreter' in the Assistants API writes and executes Python code in the background to answer precise questions, particularly those involving calculations, currency conversion, and other numerical data.

  • What is the significance of the 'step function' in reasoning improvements mentioned in the transcript?

    -A 'step function' in reasoning improvements suggests a sudden, substantial jump in AI's reasoning capabilities at a particular point, rather than gradual incremental gains. This implies that the next models will have markedly better problem-solving abilities and more sophisticated decision-making.
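
    In mathematical terms, the contrast is between capability that grows smoothly and capability that jumps by a discrete amount at some release time t_0. A rough way to write it, with H the Heaviside step function and a, b, \Delta as arbitrary illustrative parameters:

      C_{\mathrm{gradual}}(t) = a + b\,t
      \qquad
      C_{\mathrm{step}}(t) = a + b\,t + \Delta\, H(t - t_0),
      \qquad
      H(x) = \begin{cases} 0 & x < 0 \\ 1 & x \ge 0 \end{cases}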

Outlines

00:00

🚀 Upcoming AI Model Advancements

The video discusses the rapid evolution of AI models, expecting significant changes within a year or two. The speaker hints at an under-the-radar announcement related to the 'GPT next' model, suggesting a substantial leap in reasoning capabilities. The Viva Technology event, known as Vivatech, is referenced as the source of an image indicating a timeline for AI development, pointing towards a major release in November 2024. The importance of this date is tied to its proximity to the 2024 United States elections, suggesting a cautious approach by OpenAI to avoid influencing the electoral process. The video also speculates on the naming convention of future models, hinting that 'GPT next' might not align with the expected 'GPT 5'.

05:02

๐Ÿ—“๏ธ Key Dates and AI's Impact on Elections

This paragraph delves into the strategic timing of AI model releases, particularly in relation to global events like the 2024 U.S. elections. OpenAI's CTO, Mira Murati, confirms that the elections are a significant factor in the release schedule of models like GPT 5, aiming to avoid any potential negative impacts. The speaker also discusses the potential backlash OpenAI could face if they were to release a model with advanced capabilities that might be perceived as threatening to democracy. Additionally, investment areas for AI development are outlined, emphasizing the importance of enhancing textual intelligence to unlock transformative value in AI.

10:03

🧠 Anticipated Cognitive Leap in AI Models

The speaker reflects on the exponential growth of AI capabilities, suggesting that current models are at a primitive stage compared to what is to come. They predict that within a couple of years, AI models will advance to a level that will be unrecognizable from their current state, potentially excelling in complex fields like medical research or scientific reasoning. The discussion also touches on the affordability and accessibility of these models, with a focus on making them cheaper and faster to accommodate various use cases.

15:03

💡 Insights into Future AI Model Naming and Capabilities

In this segment, the speaker explores the possibility that future AI models may not follow the traditional naming convention, as suggested by Sam Altman. They discuss the potential for models to be named something other than 'GPT 5' and the uncertainty surrounding the release dates and capabilities of these models. Additionally, the video references Microsoft's involvement in training next-generation AI models and the significant computational power being used, which is expected to result in a substantial leap in AI capabilities.

20:06

🤖 The Emergence of Multimodal Agents in AI

The focus shifts to the development of multimodal agents, which are expected to revolutionize human-computer interaction. The speaker highlights the potential of these agents to leverage text, context, and tools to create a more natural and efficient user experience. Examples of agentic workflows are provided, such as an AI software engineer capable of complex tasks and a voice agent for drive-through services. The potential impact of these agents on various industries, including software development and consumer services, is emphasized.

25:08

🛠️ Demonstrating Assistive Experiences and Agents

This paragraph showcases live demos of assistive experiences and agents, illustrating their practical applications. The speaker introduces the Assistants API, a toolkit for developers to integrate assistive features into their products. Examples include an AI-powered travel app, a system for managing conversation history, function calling to interact with app features, knowledge retrieval to access factual data, and a code interpreter for precise computations. The video emphasizes the ease of use and the immediate applicability of these tools for developers.

30:10

🔮 Speculations on Future Model Releases and Capabilities

The final paragraph wraps up with speculations on the future of AI model releases, particularly around the anticipated model in November 2024. The speaker acknowledges the difficulty in predicting the exact nature of these models but asserts that it will represent a significant leap in capabilities and usability. They also touch on OpenAI's trademarked names like GPT 6, leaving the audience with a sense of anticipation for the monumental advancements on the horizon in the AI field.


Keywords

💡 AI Models

AI Models refer to artificial intelligence systems designed to perform specific tasks, such as language processing or image recognition. In the video's context, AI models are the central theme, with discussions about their evolution and future advancements. The script mentions models like GPT-3 and GPT-4, highlighting their increasing intelligence and capabilities.

💡 GPT-3

GPT-3, which stands for 'Generative Pre-trained Transformer 3,' is a language model developed by OpenAI. It is known for its ability to generate human-like text based on the input it receives. The script positions GPT-3 as the starting point of the discussed era, indicating its significance in the progression of AI models.

💡 GPT-4

GPT-4 is the successor to GPT-3 and represents the next generation of language models. The script suggests that GPT-4 has introduced a new era with its multimodal capabilities, meaning it can process and understand various types of data beyond just text, such as images and sounds.

💡 GPT Next

The term 'GPT Next' is used in the script to refer to a future model that is expected to bring significant advancements over its predecessors. It is suggested that 'GPT Next' might not be named GPT-5 as traditionally expected, indicating a potential leap in technology rather than a simple incremental upgrade.

💡 Model Intelligence

Model intelligence refers to the cognitive capabilities of AI models, including their ability to understand, learn, and make decisions. The script discusses the expected increase in model intelligence, suggesting that future models will be vastly more intelligent and capable than current ones, potentially reaching a level comparable to master's students or experts in various fields.

💡 Reasoning Improvements

Reasoning improvements denote the advancements in an AI model's ability to process information logically and come to conclusions. The script highlights that the next frontier model is expected to provide a 'step function' in reasoning improvements, implying a significant leap in the models' logical and analytical capabilities.

💡 Viva Technology Event

The Viva Technology Event, also known as Vivatech, is an annual technology conference focused on innovation and startups. It is mentioned in the script as the source of an image that reveals key dates and information about the future of AI models, emphasizing the importance of such events in the tech industry for unveiling new developments.

💡 Multimodal AI

Multimodal AI refers to systems capable of processing and understanding multiple types of input data, such as text, images, audio, and video. The script mentions GPT-4 as a multimodal AI that has the ability to understand and generate responses across different modalities, showcasing a significant advancement in AI capabilities.

💡 Agents

In the context of the video, agents refer to AI-driven tools or interfaces that can perform tasks, make decisions, and interact with users in a more autonomous and interactive manner. The script discusses the potential of agents to revolutionize how we interact with software and perform complex tasks, such as coding or managing travel plans.

💡 Assistants API

The Assistants API is a toolkit mentioned in the script that allows developers to integrate AI-powered assistance into their products. It manages conversation history, function calling, knowledge retrieval, and code interpretation, providing a comprehensive set of tools to build interactive and intelligent applications.

💡 Compute

Compute, in the context of AI, refers to the computational resources and processing power used to train and run AI models. The script discusses the exponential growth in compute capabilities, comparing the scale of different AI supercomputers to marine life, to illustrate the significant increase in power and capabilities of future AI models.

Highlights

Expectations of AI models becoming unrecognizable within a year or two.

Plans to push the boundaries of AI reasoning with the next Frontier Model.

Secret announcements and updates for future AI models discussed.

Viva Technology event (Vivatech) identified as the source of the leaked roadmap image.

Introduction of 'GPT next' as a potential successor to GPT 4.

Release date for GPT next model hinted at November 2024.

Potential impact of AI models on the 2024 United States elections.

OpenAI's concern over the influence of AI on global elections.

The transformative potential of increasing textual intelligence in AI.

Significant price decrease of GPT models over the past year.

Expectations for models to advance beyond their current 'first or second grader' intelligence.

AI's future ability to excel in complex tasks like medical research.

The concept of 'step function' improvements in AI reasoning capabilities.

Discussion on the importance of multimodal AI and its future advancements.

Live demos showcasing assistive experiences and agentic workflows.

Introduction of the Assistants API for building assistive experiences.

Knowledge retrieval and code interpreter tools for enhancing AI capabilities.

Trademarked names like GPT 6 hint at future model developments.

The monumental leap expected in AI capabilities and usability in the coming years.

Transcripts

00:00

"...and we think that within maybe a year or two from now, the models will be unrecognizable from what they are today. And so this year we also plan to push that boundary even more, and we expect our next frontier model to come and provide a step function in reasoning improvements as well."

So there was actually a very interesting announcement made recently, but a lot of people, including myself, didn't realize what the announcement was; it flew quite under the radar. In this video I'm going to be showing you all of the secret announcements and some of the secret updates coming to OpenAI's models in the very near future, including some of the key dates that were unveiled at a little-noticed presentation.

00:44

One of the things you can see here is an image that has been floating around on the internet for the past 24 hours, and I've confirmed that this image is from the Viva Technology event, commonly known as Vivatech. It's an annual technology conference dedicated to innovation and startups, held at the Paris Expo in Paris, France, and founded in 2016 by the Publicis Groupe. It's one of Europe's largest tech and startup events, mainly focused on tech innovation along with other key business insights. From this image it seems pretty simple, but there was something a lot of people did miss. On the image you're currently looking at we have three main points. First there's 2021, the GPT-3 era with the Davinci model. Then (and this is why I think this video is remarkably important) you can see that we moved in 2023 to the GPT-4 era, around the time GPT-4 was released and deployed. And what's very interesting is that the slide now shows a new piece of information: the next step is labeled "GPT next".

02:13

So I think that maybe GPT-5 might not be coming, and when I say that I don't mean GPT-5 isn't actually coming. What I mean is that, compared with GPT-5 as you think of it, OpenAI are most likely planning a lot more than people expect, which is something we even saw with the recently demoed GPT-4o. It's crazy, because the chart marks May 2024, which is today, but they've actually given us the release window for this GPT next model: if you look at this dot right here, it sits at November 2024, and this date is very important for a few reasons, which I'm going to explain in a moment.

03:03

One of the first things that really surprised me was the fact that this is called GPT next and not GPT-5. It's very hard to see, but on the left-hand side the chart says "model intelligence". GPT-3's intelligence was around this level, and GPT-4's is here. The chart doesn't really benchmark it, but I'm guessing this point is GPT-4o, so there is a slight improvement there. Although the improvement is slight, the thing to take into account is that even slight improvements open up some pretty insane use cases, because if the model gets smarter and more reliable, the industries it can impact become a lot broader overall.

04:02

I think one of the most important things we can see here is that model intelligence takes a huge jump after GPT-4o, or GPT-4. From this level to this level it's a modest step, but from here it is a quite big jump; in fact I probably should be using some actual arrows, apologies for my terrible drawings. The point I'm trying to make is that the kind of jump we're about to get with this GPT next model looks really surprising, and they've constantly reiterated that these future models are going to be very intelligent in terms of how smart these systems are. And whilst yes, there might be other features, this is something we do know. Now, one thing I did want to talk about regarding this GPT next model: they could have simply put that this is going to be GPT-5; of course they don't want to officially announce it, and it could just be a placeholder for GPT-5. I think it might not be GPT-5, but I'm going to dive into that in one second.

05:07

One thing I want you to know is that this release date of November 2024 is a key date, because it makes sense for the release of the next model, whether that's GPT-5, GPT next, or whatever other models there are, and this timing has been pointed to by OpenAI a few times. One of the key things coming up this year (and I know some people don't live in America, so you might not be paying attention) is the 2024 United States elections, taking place on Tuesday the 5th of November. You might be thinking: those are the elections; what do they have to do with AI systems? That's politics, this is technology. In fact, the two are very closely intertwined, because OpenAI themselves made a statement regarding this, and the elections are actually a reason for the delay of GPT-5, as many people thought GPT-5 was scheduled to be released in the summer.

06:09

However, you can see right here that OpenAI CTO Mira Murati recently confirmed that the elections were a major factor in the release of GPT-5: "We will not be releasing anything that we don't feel confident on when it comes to how it might affect the global elections or other issues," she said last month.

06:28

So whilst yes, we did just get a pretty crazy demo of GPT-4o, a multimodal AI that completely shocked the industry, it is pretty incredible to see that OpenAI are really concerned about what future models will be able to do in regard to the election. I think it comes down to one of two things. Because there is an election coming up, there are always different discussions about what could happen, conversations around privacy issues, and a million other conversations to be had; and the problem is that if OpenAI does release a model before the elections, they could face a negative PR situation where the public thinks badly of OpenAI. And of course, this week OpenAI have had a huge amount of bad news, due to some of the things going on at the company, from people leaving to Sam Altman doing (depending on where you stand) some questionable things. I think it's important not to release models during that time, because if the technology is as truly advanced as the graph shows us, it's definitely going to be received by many as something that threatens democracy in the United States: if it has the ability to influence people, some individuals might say the release was deliberately timed, and with politics things get really difficult really quickly. So I think this does make sense.

08:12

Now, I could be wrong, but we have the OpenAI CTO stating that they won't release anything they're not confident about, we have "GPT next" in quotation marks, and we have that dot at November 2024, just after the elections; considering that previous rumors also pointed to around that time, I think this all makes sense. Here's where he actually talks about the GPT next models; it's a very fascinating clip. There is also this second graph right here. It is very, very hard to see, but if you zoom in you can make out the GPT-3 era, GPT-4, and GPT next, and this version doesn't actually have the dates on it. So I'm guessing the earlier slide's dates may have been a mistake based on proprietary information, but now the information is out there. Although we don't know the exact date, we know that after November the 5th, up until the end of November, there's probably going to be some kind of model. Anyway, we're really excited for this, and I'm going to show you what he talks about here.

09:25

"There are four investment areas I'd like to cover. The first key priority that we have is textual intelligence, and our core belief is that if we increase textual intelligence, that will unlock transformational value in AI. You can see on the screen here the two major models that we offer today: GPT-4, the best model, with the native multimodality that we just showed, and GPT-3.5 Turbo, 10x cheaper, which is convenient for simple tasks where what you need is really things like classification or very simple entity extraction. We really expect that the potential to increase LLM intelligence remains huge. Today we think models are pretty great; they're kind of like first or second graders. They respond appropriately, but they still make some mistakes every now and then. But the cool thing we should remind ourselves is that those models are the dumbest they'll ever be. They may become master's students in the blink of an eye; they will excel at medical research or scientific reasoning. And we think that within maybe a year or two from now, the models will be unrecognizable from what they are today. So this year we also plan to push that boundary even more, and we expect our next frontier model to come and provide a step function in reasoning improvements as well.

10:41

The second investment area for us is to make sure the models are cheaper and faster all the time. We know that not every use case requires the highest level of intelligence, and that's why we want to make sure we invest there. You can see here on the screen the GPT-4 pricing and how much it has decreased, by something like 80% in just a year. It's quite unique, by the way, for a new technology to decrease in price so quickly, but we think it's really critical in order for all of you to build and reach scale with what you're trying to accomplish and innovate with your AI-native products."

11:17

So I think that short snippet from the tech conference was rather insightful, because he said numerous different things in it, some more important than others. Of course he talks about the price decreasing, but one thing he mentioned that was rather fascinating (and this is someone from OpenAI) is that literally within one to two years the models are going to be unrecognizable. Even as someone who pays attention to the AI space and looks at all of the AI updates, many of which I don't even post on this channel, I still find that rather surprising, and I think it's because humans have a hard time grasping the nature of exponential increases in technology and intelligence. So I think this is going to be a truly transformative period in terms of what comes out of this company within the next 5 to 10 years, because if he's stating that the models are going to look unrecognizable within 1 to 2 years, that means by 2026. This isn't far away; two years is a very short time period, especially for these kinds of technological developments.

12:37

Something else he said that I thought was insightful was the mention of a step function in reasoning for the next models. As we've already discussed, this likely means a significant, discrete improvement in the AI's reasoning capabilities rather than a gradual, incremental one. In contrast to gradual improvement, a step function implies a sudden, substantial improvement at a particular point, followed by a new level of capability; the change is more abrupt and significant than gradual gains. Current models like GPT-3 and GPT-4 have of course made significant strides in their ability to reason, understand, and generate text, but their ability to reason can still be limited in certain contexts. For the GPT next models, a step function in reasoning could mean a substantial leap in their ability to understand, process, and generate more complex, abstract, and logical forms of reasoning. That increased level of reasoning brings improved problem solving, meaning these models will be better at tackling complex problems that require multi-step, logical reasoning. It also means enhanced understanding, with the AI grasping context and nuance in a more humanlike way and giving more accurate, relevant responses, and better decision making, with models making more sophisticated decisions based on the information provided, similar to higher-order thinking. And like I said before, this once again opens up a lot more applications. They even spoke about how it's going to be able to do medical research; we've seen Google pushing hard on that frontier with Med-Gemini, where they've achieved remarkable benchmarks, and I wouldn't be surprised if OpenAI are doing something in that realm.

play14:37

doing something in that realm now one of

play14:40

the things that I also want to talk

play14:42

about was of course the fact that this

play14:44

release date is rather fascinating and

play14:47

the name of the model did actually make

play14:49

me think about something that was spoke

play14:51

about previously you can see here that

play14:54

we have a model that is called GPT next

play14:58

but one of the things that I spoke about

play15:00

when covering the Sam Alman Lex Freedman

play15:02

interview was the fact that he said

play15:04

something rather insightful he said that

play15:07

the future models that he releases might

play15:09

not actually be called GPT 5 and of

play15:11

course there might be GPT 5 because they

play15:13

did trademark it but he did state that

play15:16

you know whatever the next model may be

play15:18

we're not sure when it's going to be

play15:19

released or what it's going to be called

play15:21

oh that's the honest answer is it blink

play15:24

twice if it's this year before we talk

play15:26

about like a gp5 like model called that

play15:29

or called or not called that or a little

play15:30

bit worse or a little bit better than

play15:32

what what you'd expect from a gbt 5 I

play15:34

think we have a lot of other important

play15:35

things to release first I don't know

play15:37

what to expect from gbt

play15:39

5 you're making me nervous and excited

play15:43

uh what are some of the so right there

play15:45

you can see that Sam mman is actively

play15:47

talking about how they are going to

play15:49

release a few things before gbt 5 of

play15:52

course you know we've seen things like

play15:53

voice engine we've seen Sora we've seen

play15:56

a bunch of other things but you know the

play15:58

way how he talks about how these future

play16:00

models might not even be called what we

play16:02

are expecting them to be called is of

play16:04

course rather fascinating too now one of

play16:07

the things that you may have seen

play16:08

recently was this from Microsoft and

play16:11

this is basically where they talk about

play16:13

the levels of compute that they using to

play16:16

train the next Frontier models and

play16:18

currently we can see that the diagram

play16:21

they use sharks and marine life to I

play16:24

guess you could say help us understand

play16:27

the scale of compute that they are

play16:28

currently using we can see that we have

play16:30

a shark here then of course we have an

play16:32

orca and then of course we have a whale

play16:35

so I mean the stock increase in terms of

play16:38

the capabilities from this graph going

play16:40

back all the way to the first graph are

play16:43

very very similar the gpt3 the GPT 4

play16:46

technology is only a little bit and then

play16:48

of course the next levels I think maybe

play16:51

open ey clearly have discovered

play16:52

something incredible and they're

play16:54

probably going to shock the world

play16:56

because if you're using that much

play16:57

compute to train something and you've

play17:00

also improved your architecture then I

play17:02

think the amount of capabilities that

play17:05

you can get is truly truly surprising

play17:08

and I think this is so surprising

play17:10

because not only do we have maybe not

play17:12

improved architectures in terms of the

play17:14

Transformer but I'm talking about

play17:15

certain techniques that open AI are

play17:17

pioneering and that they're using to

play17:20

advance the frontier in terms of the

play17:21

reasoning and the capabilities of their

play17:24

models and I'm going to show you guys a

play17:26

short snippet from this clip where it's

play17:28

actually spoken about in great context

play17:30

about why this is so pivotal and the

play17:33

only reason I'm showing you this is

play17:34

because now with the added context from

play17:37

this slide here where we can see that oh

play17:39

wait and the only reason I'm showing you

play17:41

guys this clip is because now with the

play17:43

added context of this previous graph

play17:46

where we can literally see in terms of

play17:47

the capabilities jump I think it's

play17:49

important to understand the compute side

play17:52

behind it that Frontier forward and like

play17:54

we showed this slide at the beginning

play17:56

like there's this like really beautiful

play17:59

relationship right now between sort of

play18:01

exponential progression a compute that

play18:02

we're applying to building the platform

play18:05

to the capability and power the platform

play18:07

that we get and I just wanted to you

play18:09

know sort of without without mentioning

play18:11

numbers uh which is sort of hard to do

play18:13

to give you all an idea of the scaling

play18:16

of these systems so in 2020 we built our

play18:20

first AI supercomputer for open AI uh

play18:23

it's the supercomputing environment that

play18:25

trained gbd3 and so like we're going to

play18:28

just choose Marine Wildlife is our scale

play18:31

marker so you can think of that system

play18:34

uh about as big as a shark so the next

play18:36

system that we uh built um scale-wise is

play18:40

about as big as uh an orca uh and like

play18:43

that is the system in uh that we

play18:45

delivered in 2022 that trained GPT 4 the

play18:48

system that we have just deployed is uh

play18:52

like scale-wise uh about as big as a

play18:54

whale relative to like you know this

play18:56

shark siiz supercomputer and this Mar

play18:59

siiz supercomputer and it turns out like

play19:01

you can build a whole hell of a lot of

play19:02

AI with a whale siiz supercomputer um

play19:05

and and so you know one of the things

play19:08

that I just want everybody to really

play19:10

really be thinking clearly about and

play19:11

like um this is going to be our segue to

play19:14

talking with Sam is the next sample is

play19:17

coming so like this whale siize

play19:19

supercomputer is hard at work right now

play19:21

building the next set of capabilities

play19:23

that we're going to put into your hands

19:26

And yeah, if you saw it there, he said you can build a whole hell of a lot of AI with a large amount of compute, so I'm really intrigued as to what "a whole hell of a lot" of compute is going to give us, but one thing I do note is that there is going to be a huge amount of capability. Now, something they also spoke about was multimodal agents. This is going to be something that arrives with the next level of frontier, state-of-the-art models; I think maybe this year we get something, though there are also some reasons we might not. They also demoed the multimodal agents, and you can see their investment areas: textual intelligence, cheaper and faster models, custom models, and of course multimodal agents. I want to show you this short clip because OpenAI haven't really shown us that much in terms of agentic workflows, but I think it's important to take a sneak peek, because agents are truly going to change the way we interact with computers:

20:29

"We really believe that in the future, agents may be the biggest change that will happen to software and how we interact with computers. Depending on the task, they'll be able to leverage text, and they'll be able to leverage access to some context and tools. And again, all of these modalities that we mentioned will also bring a fully natural and novel way to interact with the software. One example of this that I personally love is Devin, by the team at Cognition. They built essentially an AI software engineer, and it's pretty fascinating, because it's able to take a complex task and not just write code: it's able to understand the task, create tickets, browse the internet for documentation when it needs to fetch new information, deploy solutions, create pull requests, and so on. It's one of those agentic use cases that I really love. In fact, this tweet from Paul Graham earlier this year caught my eye, because he realized that the 22-year-old programmers these days are often as good as the 28-year-old programmers; and when you reason about how the 20-year-olds are already adopting AI and tools like Devin, it's no surprise they're getting more and more productive thanks to AI. Another agent experience, this time more towards consumers, is Presto, which is letting customers place orders with their voice using a voice agent. There aren't many drive-throughs here in Europe, but what I found compelling about this example is that it's really helping a market where there's been a labor shortage, and in turn that helps offer not only a great experience but also lets the staff focus on the food and on serving the customers. With that, I'd like to dive into a couple more live demos to illustrate how you can build assistive experiences and agents practically today. So, our first incarnation of..."

22:32

So yeah, with that you can see that this AI-powered drive-through system has actually been impacting people, because one of the things you might not appreciate about drive-throughs is that they're kind of limited by human capacity. When I saw a demo, in a weekly AI video I covered, of someone actually going through a drive-through with an AI system, what struck me is that an AI system is able to completely understand exactly what you want, it's able to understand what you want in other languages too, and it can converse in other languages far more fluently than someone who only speaks one language. It's patient and it's fast, and I think it's going to allow a lot more unique experiences. That's why agents are so impactful: this is where you're really going to see real-life impact, beyond a day-to-day LLM interface.

23:45

"Welcome to Wendy's, what would you like?" "Can I have a chocolate Frosty?" "Which size for the chocolate Frosty?" "Medium." "Can I get you anything else today?" "No, thank you." "Great, please pull up to the next window."
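
An exchange like that maps onto a simple agent loop: transcribe the customer's speech, let a model update the order state and pick a reply, then speak it back. Below is a minimal sketch of that loop with typed text standing in for the speech-to-text and text-to-speech stages, and a tiny rule-based function standing in for the language model; the menu, sizes, and prices are invented for illustration.

    # Minimal drive-through order loop; typed text stands in for speech I/O.
    # Menu items, sizes, and prices are hypothetical.
    MENU = {"chocolate frosty": {"small": 1.99, "medium": 2.49, "large": 2.99}}

    def respond(state, utterance):
        """Very small rule-based stand-in for the language model."""
        text = utterance.lower()
        for item in MENU:
            if item in text:
                state["item"] = item
                return f"Which size for the {item}?"
        if state.get("item") and text in MENU[state["item"]]:
            state["size"] = text
            return "Can I get you anything else today?"
        if "no" in text and "item" in state and "size" in state:
            state["done"] = True
            total = MENU[state["item"]][state["size"]]
            return f"That's ${total:.2f}. Please pull up to the next window."
        return "Sorry, what would you like?"

    state = {}
    print("Welcome to Wendy's, what would you like?")
    while not state.get("done"):
        print(respond(state, input("> ")))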

24:14

So now let's take a look at some of these demos of agentic workflows that you can actually use, and at what they've shown us in this presentation:

"Our first incarnation of agents for developers is what we call the Assistants API, and the Assistants API is a complete toolkit that all of you can use to bring assistants into your products. In this case I'm building this travel app called Wanderlust; as you can see, there's a map on the right side, but there's also an assistive experience on the left side, and this is completely powered by the Assistants API. So let's take a quick look. If I say 'top five venues for the Olympics in Paris'... first thing to note, I don't have to manage any of those... let's refresh the app a little bit, sounds like we maybe lost network... 'top five venues for the Paris Olympics'. The first thing to note is that I don't have to manage the conversation history: it's automatically managed by the Assistants API from OpenAI, so I don't have to manage my prompt and so on. Not sure what's happening here, let's take a quick look; might have lost some Wi-Fi or connection... nope. Let's try one last time: let's go to Rome. Ah, there we go; sounds like the Olympics was bad luck, but it sounds like we're back. So yeah, I don't have to actually manage any of those messages; the conversation history is automatically managed by OpenAI."

play25:42

and that's one of my favorite features

play25:44

when I build agents it's called function

play25:45

calling and function calling is the

play25:47

ability for all of you to bring

play25:49

knowledge about um your unique features

play25:52

in your app and your unique functions

play25:53

over to the model in this case GPT for

play25:56

so if I say top five um things to see in

play25:59

Rome let's see what happens here in in

play26:02

theory uh what should pop up here is

play26:04

once again an interaction between the

play26:06

text and the map here we go so now as

play26:09

you can see as we talking to the model

play26:11

it's able to actually pinpoint the map

play26:13

because it h it knows that this feature

play26:15

exists so it's really really cool and

play26:18

that's like already available as part of

play26:20

the toolkit of the assistant CPI now
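
Mechanically, what the demo describes is the app advertising a JSON schema for each feature; the model replies with a function name and arguments, and the app executes them. Here is a minimal, self-contained sketch of that round trip; the tool name annotate_map, its parameters, and the hard-coded model reply are illustrative assumptions, not OpenAI's actual demo code.

    import json

    # Hypothetical tool schema advertised to the model, in the JSON-schema
    # style used for function calling. The name and fields are illustrative.
    ANNOTATE_MAP = {
        "name": "annotate_map",
        "description": "Zoom the map and drop a labeled pin.",
        "parameters": {
            "type": "object",
            "properties": {
                "lat": {"type": "number"},
                "lng": {"type": "number"},
                "label": {"type": "string"},
            },
            "required": ["lat", "lng", "label"],
        },
    }

    def annotate_map(lat, lng, label):
        # Stand-in for the travel app's real map widget.
        print(f"Map pinned at ({lat}, {lng}): {label}")

    # A tool call as the model might return it (hard-coded here instead of
    # coming from an actual API response).
    tool_call = {"name": "annotate_map",
                 "arguments": json.dumps({"lat": 41.8902, "lng": 12.4922,
                                          "label": "Colosseum"})}

    # Dispatch: look up the local function by name and apply the arguments.
    DISPATCH = {"annotate_map": annotate_map}
    DISPATCH[tool_call["name"]](**json.loads(tool_call["arguments"]))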

26:22

"Now, another tool I wanted to call out here is knowledge retrieval. We know so many of you want to bring factual data into conversations with models like GPT-4o, and usually you have to build a retrieval stack to do so; we've learned from so many developers how complex that can be, and so we've made a ton of improvements in our retrieval stack. I'm going to try to see if I can actually demo this in real time. I bought this book from Lonely Planet to prepare a trip to Italy; it's a pretty comprehensive book, it has like 250 pages, it's like 95 megabytes, so I hope the upload is going to work; taking a bit of a risk here. What's happening in real time is that as soon as the file is uploaded, it is automatically embedded by the Assistants API, so I don't have to think about any of those steps; I can just start interacting in the conversation and say: 'Based on this book, what's the best photo spot in Lazio?' Before I press enter, I'll show you a quick look at page 126, I believe; page 126 talks about Lazio. So I ask the question, and as I'm browsing the book, we notice that the photo opportunity is mentioned on page 128, and it's supposed to be Pano, and boom, in real time we were able to find in this book exactly the right place for a photo spot. And again, I had to do no engineering work; I just had to upload the file in the conversation, and it was all taken care of for me."

28:01

"Last but not least, there's another tool I want to highlight called Code Interpreter. Code Interpreter is the ability to write Python code in the background to answer some very precise questions, usually around numbers, math, and financial data. So here, for instance, I could say in this conversation: 'We are sharing an Airbnb, 4 of us; it's €1,200. What's my share, plus my flight cost of, let's say, 260?' This is not a typical thing that LLMs do great at by default, right? But what's happening behind the scenes is that we're actually computing all of this, including the currency conversion and so on, by writing code in the sandbox. And once again, as a developer, I have nothing to do. But just because OpenAI is managing this does not mean it's a black box: in fact, if I go here and refresh the threads, we should see the exact thread that we've been feeding. You can see we're going to Rome, all of the messages, the function calls that I highlighted to annotate the map, and here is the Python code that was written behind the scenes to actually answer the question: compute the currency conversion, divide by the number of people, and so on."
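
The code the sandbox writes for a question like that is plain arithmetic. A minimal sketch of what it might generate, assuming four people splitting the €1,200 stay, a 260 USD flight, and an invented EUR-to-USD rate (the real session would use a current rate):

    # Split a shared stay and add a flight, converting currencies first.
    # The exchange rate below is a made-up illustrative value.
    EUR_TO_USD = 1.08

    airbnb_total_eur = 1200
    people = 4
    flight_usd = 260

    share_usd = (airbnb_total_eur / people) * EUR_TO_USD
    total_usd = share_usd + flight_usd
    print(f"Your share: ${share_usd:.2f}; with flight: ${total_usd:.2f}")
    # -> Your share: $324.00; with flight: $584.00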

29:13

"So really, the Assistants API is a complete toolkit: conversation history, access to retrieval and files (you can now upload up to 10,000 files for retrieval), and even Code Interpreter and function calling, all of it something you can build on from day one."

29:29

So let me know what you think about future models. One thing that is a little bit confusing is that they do have GPT-6 and other names trademarked, so I'm wondering if they're just going to continue with the traditional naming. It's quite hard to predict, considering that OpenAI is a company that comes with a lot of drama and a lot of surprises; and with AI capabilities increasing exponentially and something new being discovered what feels like every week, trying to predict one, two, or three years out is quite hard. But I think from this we do know that in November there's probably going to be a new model released, and whether it's GPT next or GPT-5, I can say one thing for certain: it's going to be a monumental leap in the capabilities and usability of what we're about to see. Okay, that was...


Related Tags
AI Future, OpenAI, Multimodal, Agents, Reasoning, Innovation, Tech Trends, AI Models, GPT Next, Transformative AI