OpenAI'S "SECRET MODEL" Just LEAKED! (GPT-5 Release Date, Agents And More)
Summary
TL;DR: The video discusses the rapid evolution of AI models, with OpenAI planning significant advancements in reasoning and intelligence. It hints at a 'GPT next' model slated for release in November 2024, which may not follow the expected 'GPT-5' naming convention. The video also covers OpenAI's investment areas, such as textual intelligence, cost-effective models, and multimodal agents, showcasing the potential for transformative AI capabilities across industries. The presentation at the Viva Technology event in Paris highlighted the exponential growth in AI's computational power and its impact on future models, suggesting a future where AI agents could redefine human-computer interaction.
Takeaways
- OpenAI anticipates releasing a new model, possibly called 'GPT next', which represents a significant leap in AI capabilities beyond the current GPT-4.
- The release date for the next major AI model is hinted to be November 2024, aligning with the company's cautious approach around the upcoming United States elections.
- Advancements in AI are expected to be so significant that models within the next 1-2 years will be unrecognizable from their current state, indicating a rapid evolution in AI reasoning and intelligence.
- OpenAI is committed to making AI models not only more intelligent but also more affordable and efficient, with GPT-4 pricing having decreased by around 80% in just a year.
- The company is focusing on four key investment areas: increasing textual intelligence, making models cheaper and faster, custom models, and developing multimodal agents.
- OpenAI's next frontier model is expected to provide a 'step function' improvement in reasoning, suggesting a substantial and sudden increase in AI's ability to understand, process, and generate complex reasoning.
- The potential for AI in fields like medical research and scientific reasoning is highlighted, with the expectation that AI will advance to a level capable of higher-order thinking and problem-solving.
- Microsoft has been building increasingly powerful AI supercomputers for OpenAI, scaling from 'shark-size' to 'orca-size' and now 'whale-size'; the latest will train the next set of models with unprecedented capabilities.
- The Assistants API offers a complete toolkit for developers to integrate assistive experiences into their products, including conversation-history management, function calling, knowledge retrieval, and a code interpreter.
- The video also covers the impact of AI on various sectors, such as AI-powered voice agents in the drive-through industry, and the potential for AI to democratize access to complex tasks like software development.
Q & A
What is the expected timeframe for significant advancements in AI models according to the transcript?
-The transcript suggests that within a year or two from now, AI models are expected to become unrecognizable from their current state, indicating significant advancements in a relatively short period.
What is the significance of the term 'GPT next' mentioned in the transcript?
-The term 'GPT next' could imply a new model that may not follow the traditional 'GPT-5' naming convention. It suggests that OpenAI might be planning more than people expect, possibly a significant leap in technology rather than an incremental upgrade.
Why was the Viva Technology event significant in the context of the transcript?
-The Viva Technology event, also known as VivaTech, is significant because it's where the image showing the progression from GPT-3 to GPT-4 and hinting at 'GPT next' was first revealed. It's an annual technology conference in Paris, France, dedicated to innovation and startups.
What is the potential release date for the 'GPT next' model as hinted in the transcript?
-The transcript suggests that the 'GPT next' model is expected to be released in November 2024.
How does the transcript connect the release of the new AI model with the 2024 United States elections?
-The transcript mentions that OpenAI's CTO, Mira Murati, confirmed that the elections were a major factor in the release timing of the new model. OpenAI is cautious about not releasing anything that could potentially affect global elections.
What are the four investment areas mentioned in the transcript related to AI development?
-The four investment areas mentioned are: increasing textual intelligence, making models cheaper and faster, custom models, and developing multimodal agents.
What is the purpose of the Assistants API mentioned in the transcript?
-The Assistants API is a toolkit for developers to bring assistive experiences into their products. It manages conversation history, allows function calling to integrate app features, and can interpret and execute code to answer precise questions. A minimal sketch of this pattern appears below.
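A minimal sketch of that flow, assuming the beta Assistants endpoints of OpenAI's official `openai` Python SDK (v1); the assistant's name, instructions, model choice, and prompt here are illustrative, not taken from the video:

```python
# Hedged sketch: beta Assistants API, openai Python SDK v1.
# Assistant name, instructions, model, and prompt are illustrative.
import time
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Create the assistant once; OpenAI stores it server-side.
assistant = client.beta.assistants.create(
    name="Travel helper",
    instructions="You help users plan trips.",
    model="gpt-4-turbo",
)

# A thread holds the conversation history on OpenAI's side, so the
# developer never re-sends past messages -- the point made in the demo.
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Top five venues for the Paris Olympics?",
)

# A run executes the assistant against the thread; poll until it settles.
run = client.beta.threads.runs.create(
    thread_id=thread.id, assistant_id=assistant.id
)
while run.status in ("queued", "in_progress"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

# Messages come back newest-first.
for message in client.beta.threads.messages.list(thread_id=thread.id):
    print(message.role, ":", message.content[0].text.value)
```

The design point the demo stresses is that history lives in the server-side thread; the client only ever appends new messages.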
How does the Assistants API facilitate knowledge retrieval in the transcript?
-The Assistants API lets developers bring factual data into conversations with models like GPT-4. It automates the process of embedding uploaded files, such as a book on Italy, so developers can provide accurate, context-aware responses without building their own retrieval stack; a hedged sketch follows.
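A hedged sketch of the upload-and-retrieve flow, assuming the v1-era `retrieval` tool of the beta Assistants API (later SDK versions renamed this to `file_search` with vector stores); the file name is illustrative:

```python
# Hedged sketch: file upload + retrieval tool, beta Assistants API (v1 era).
from openai import OpenAI

client = OpenAI()

# Upload the document; purpose="assistants" makes it available to assistants.
book = client.files.create(
    file=open("lonely_planet_italy.pdf", "rb"),  # illustrative file name
    purpose="assistants",
)

# Attach the retrieval tool and the file; chunking and embedding happen
# automatically on OpenAI's side, as the demo describes.
assistant = client.beta.assistants.create(
    model="gpt-4-turbo",
    instructions="Answer questions using the attached guidebook.",
    tools=[{"type": "retrieval"}],
    file_ids=[book.id],
)
# From here, asking "What's the best photo spot in Lazio?" on a thread
# attached to this assistant searches the book with no extra engineering.
```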
What is the role of 'Code Interpreter' in the Assistants API as described in the transcript?
-The 'Code Interpreter' tool enables the model to write and execute Python code in the background to answer precise questions, particularly those involving calculations, currency conversion, and other numerical data.
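Behind the scenes the tool writes and runs ordinary Python. Roughly the kind of code it would generate for the demo's cost-splitting question (an illustrative reconstruction; the party size and exchange rate are assumptions the demo did not state):

```python
# Illustrative reconstruction of the code the tool might write; the
# four-person split and the EUR->USD rate are assumptions.
airbnb_total_eur = 1200      # shared Airbnb cost from the demo
people = 4                   # assumed party size
flight_eur = 260             # flight cost from the demo

my_share_eur = airbnb_total_eur / people + flight_eur   # 300 + 260 = 560
eur_to_usd = 1.08            # assumed rate, for illustration only
print(f"My share: €{my_share_eur:.2f} (~${my_share_eur * eur_to_usd:.2f})")
```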
What is the significance of the 'step function' in reasoning improvements mentioned in the transcript?
-The 'step function' in reasoning improvements suggests a significant, abrupt, and substantial increase in AI's reasoning capabilities at a particular point, rather than a gradual improvement. This implies that the next models will have enhanced problem-solving abilities and a more sophisticated decision-making process.
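As a rough illustration (the notation is ours, not from the video), writing capability C as a function of scale or time t:

```latex
% Gradual improvement: capability grows smoothly.
C_{\mathrm{gradual}}(t) = a + b\,t
% Step-function improvement: capability jumps by \Delta at t_0,
% where H is the Heaviside step function.
C_{\mathrm{step}}(t) = a + \Delta\,H(t - t_0),
\qquad H(x) = \begin{cases} 0, & x < 0 \\ 1, & x \ge 0 \end{cases}
```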
Outlines
Upcoming AI Model Advancements
The video discusses the rapid evolution of AI models, expecting significant changes within a year or two. The speaker highlights an under-the-radar announcement related to the 'GPT next' model, suggesting a substantial leap in reasoning capabilities. The Viva Technology event, known as VivaTech, is referenced as the source of an image laying out a timeline for AI development and pointing towards a major release in November 2024. The importance of this date is tied to its proximity to the 2024 United States elections, suggesting a cautious approach by OpenAI to avoid influencing the electoral process. The video also speculates on the naming of future models, hinting that 'GPT next' might not align with the expected 'GPT-5'.
Key Dates and AI's Impact on Elections
This paragraph delves into the strategic timing of AI model releases, particularly in relation to global events like the 2024 U.S. elections. OpenAI's CTO, Mira Murati, confirms that the elections are a significant factor in the release schedule of models like GPT-5, aiming to avoid any potential negative impacts. The speaker also discusses the potential backlash OpenAI could face if they were to release a model with advanced capabilities that might be perceived as threatening to democracy. Additionally, investment areas for AI development are outlined, emphasizing the importance of enhancing textual intelligence to unlock transformative value in AI.
Anticipated Cognitive Leap in AI Models
The speaker reflects on the exponential growth of AI capabilities, suggesting that current models are at a primitive stage compared to what is to come. They predict that within a couple of years, AI models will advance to a level unrecognizable from their current state, potentially excelling in complex fields like medical research or scientific reasoning. The discussion also touches on the affordability and accessibility of these models, with a focus on making them cheaper and faster to accommodate various use cases.
Insights into Future AI Model Naming and Capabilities
In this segment, the speaker explores the possibility that future AI models may not follow the traditional naming convention, as suggested by Sam Altman. They discuss the potential for models to be named something other than 'GPT-5' and the uncertainty surrounding the release dates and capabilities of these models. Additionally, the video references Microsoft's involvement in training next-generation AI models and the significant computational power being used, which is expected to result in a substantial leap in AI capabilities.
The Emergence of Multimodal Agents in AI
The focus shifts to the development of multimodal agents, which are expected to revolutionize human-computer interaction. The speaker highlights the potential of these agents to leverage text, context, and tools to create a more natural and efficient user experience. Examples of agentic workflows are provided, such as an AI software engineer capable of complex tasks and a voice agent for drive-through services. The potential impact of these agents on various industries, including software development and consumer services, is emphasized.
Demonstrating Assistive Experiences and Agents
This paragraph showcases live demos of assistive experiences and agents, illustrating their practical applications. The speaker introduces the Assistants API, a toolkit for developers to integrate assistive features into their products. Examples include an AI-powered travel app, server-managed conversation history, function calling to interact with app features, knowledge retrieval to access factual data, and a code interpreter for precise computations. The video emphasizes the ease of use and the immediate applicability of these tools for developers; a hedged sketch of the function-calling piece follows below.
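A hedged sketch of that function-calling pattern, assuming the beta Assistants API of the `openai` Python SDK; `annotate_map` is a hypothetical stand-in for the demo app's map feature, not a real API:

```python
# Hedged sketch: function calling with the beta Assistants API.
# `annotate_map` is a hypothetical stand-in for the demo's map feature.
import json
from openai import OpenAI

client = OpenAI()

assistant = client.beta.assistants.create(
    model="gpt-4-turbo",
    instructions="You are a travel assistant for a map-based app.",
    tools=[{
        "type": "function",
        "function": {
            "name": "annotate_map",
            "description": "Pin a named place on the app's map.",
            "parameters": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "latitude": {"type": "number"},
                    "longitude": {"type": "number"},
                },
                "required": ["name", "latitude", "longitude"],
            },
        },
    }],
)

def handle_required_action(run, thread_id):
    """When a run pauses with status 'requires_action', the app executes
    the requested calls itself and submits the outputs back to the model."""
    outputs = []
    for call in run.required_action.submit_tool_outputs.tool_calls:
        args = json.loads(call.function.arguments)
        # ... invoke the real app feature here, e.g. move the map pin ...
        outputs.append({"tool_call_id": call.id, "output": "pinned"})
    return client.beta.threads.runs.submit_tool_outputs(
        thread_id=thread_id, run_id=run.id, tool_outputs=outputs
    )
```

This is what makes the map zoom in the demo: the model never touches the UI directly; it requests the registered function, and the app executes it.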
Speculations on Future Model Releases and Capabilities
The final paragraph wraps up with speculation about future AI model releases, particularly the anticipated model in November 2024. The speaker acknowledges the difficulty of predicting the exact nature of these models but asserts that they will represent a significant leap in capabilities and usability. They also touch on OpenAI's trademarked names like GPT-6, leaving the audience with a sense of anticipation for the monumental advancements on the horizon in the AI field.
Keywords
- AI Models
- GPT-3
- GPT-4
- GPT Next
- Model Intelligence
- Reasoning Improvements
- Viva Technology Event
- Multimodal AI
- Agents
- Assistants API
- Compute
Highlights
Expectations of AI models becoming unrecognizable within a year or two.
Plans to push the boundaries of AI reasoning with the next Frontier Model.
Secret announcements and updates for future AI models discussed.
Viva Technology event (Vivatech) mentioned as a source of AI innovation.
Introduction of 'GPT next' as a potential successor to GPT-4.
Release date for GPT next model hinted at November 2024.
Potential impact of AI models on the 2024 United States elections.
OpenAI's concern over the influence of AI on global elections.
The transformative potential of increasing textual intelligence in AI.
Significant price decrease of GPT models over the past year.
Expectations for models to advance beyond their current 'first or second grader' intelligence.
AI's future ability to excel in complex tasks like medical research.
The concept of 'step function' improvements in AI reasoning capabilities.
Discussion on the importance of multimodal AI and its future advancements.
Live demos showcasing assistive experiences and agentic workflows.
Introduction of the Assistants API for building assistive experiences.
Knowledge retrieval and code interpreter tools for enhancing AI capabilities.
Trademarked names like GPT-6 hint at future model developments.
The monumental leap expected in AI capabilities and usability in the coming years.
Transcripts
and uh we think that within maybe a year
or two from now like the the models will
be unrecognizable from what they are
today and so this year we also plan to
push that boundary even more and we
expect our next Frontier Model to come
um and and and provide like a step
function in reasoning uh improvements as
well so there was actually a very very
interesting announcement that was made
but including myself a lot of people
didn't actually realize what the
announcement was and it was something
that was quite under the radar but in
this video I'm going to be showing you
guys all of the secret announcements and
some of the secret updates that are
coming to OpenAI's models in the
future well in the very very close
future including some of the key dates
that were actually unveiled at a secret
presentation so one of the things that
you can see here is this is an image
that has been floating around on the
internet for the past 24 hours and I've
confirmed that this image is from the
Viva technology event which is commonly
known as vivatech and it's an annual
technology conference dedicated to
Innovation and startups it's held in
Paris in France and was founded in 2016
by the Publicis Groupe and the event
takes place at the Paris Expo and it's
pretty much the or I should say one of
Europe's largest tech and startup events
and it's mainly focused on Tech
Innovation along with some other key
business insights and from this image it
seems pretty simple but there was
something that a lot of people did Miss
so we can see here that on the image
that you you're currently looking at we
have three main points so what we do
have is we have a situation where we
have the 2021 which is the GPT-3 era the
Davinci model and this was of course in
2021 you can see now and this is
actually super super interesting and
this is why I think this video is
remarkably important you can see that we
moved in 2023 to the GPT 4 era which is
here so you can see that this is around
the time of GPT 4 being released and
being deployed and what's very very
interesting is that it now shows a new
piece of information that I would say is
a little bit interesting you can see
that they describe this as GPT next okay
so I think that maybe GPT 5 might not be
coming and when I say that I don't mean
GPT 5 isn't actually coming what I mean
is that GPT 5 as you think of it I think
open AI are most likely planning a lot
more than people think and of course
that is something that we even recently
saw with the recently demoed GPT-4o so it's
crazy because it shows us that may 2024
which is as you can see here today and
you can see it says may 2024 but what's
crazy is that they've actually given us
the release date for this GPT next model
so you can see that if you look at this
dot right here we can see that this is
actually November 2024 and this date is
very important for a few reasons which
I'm going to explain in a moment but I
think one of the first things that you
know really really surprised me was of
course the fact that this is called GPT
next and not GPT 5 now what's really
really crazy about this is that the
craziest thing is that we can see that
there is clearly some kind of increasing
capability so it is very very hard to
see this but on the left hand side here
it says model intelligence so you can
see that GPT-3's intelligence was around
this level GPT-4's intelligence it
doesn't really benchmark it but I'm
guessing that this is GPT-4o so you can
see that there is a slight Improvement
there although the Improvement is slight
the thing that you need to take into
account is the fact that even if
improvements are slight it does mean
that a lot of use cases are going to be
pretty pretty insane because if the
model can get smarter and it can become
more reliable then it means the
industries that it can impact are going
to be a lot more
overall so I think one of the most
important things that we could see here
is that the model intelligence from
GPT-4o or GPT-4 completely does a huge huge
jump we can see that literally from this
level to this level we can see that it's
that amount of jump but from here it is
a quite big jump and in fact I probably
should be using some actual arrows
apologies for my terrible drawings but
the point I'm trying to make here is
that it seems that the kind of jump that
we're about to get here with this GPT
next model it looks really really
surprising and it's something that
they've constantly reiterated that these
future models are going to be very very
intelligent in terms of how smart these
systems are and whilst yes there might
be other features this is something that
we do know now one thing that I did want
to talk about with this GPT next model
because of course they could have simply
put that this is going to be GPT 5
although of course they don't want to
officially announce it it could just be
a place holder for GPT 5 I think that it
might not be GPT 5 but I'm going to dive
into that one second but one thing I
want you guys to know is that this
release date right here of November 2024
is a key date because this makes sense
for the release date of the next model
whether it's GPT 5 or whether it's GPT
next whatever other models there are I
think it's important to note that this
date has been said by openai a few times
so one of the key things coming up this
year and I know some people don't live
in America so you might not pay
attention but there are the 2024 United
States elections this is going to be
taking place on Tuesday the 5th of
November and you might be thinking okay
but those are the elections what does
that have to do with actual AI systems
like I mean that's politics this is
technology in fact those things are very
very closely intertwined because open AI
themselves did actually make a statement
regarding this and the elections are
actually a reason for the delay of GPT 5
as many people did think that GPT 5 was
scheduled to be released in the summer
however you can see right here open AI
CTO Mira Murati recently confirmed that the
the elections were a major factor in the
release of GPT 5 we will not be
releasing anything that we don't feel
confident on when it comes to how it
might affect the global elections or
other issues she said last month so
whilst yes we did just get a pretty
pretty crazy demo of GPT-4o a multimodal
AI that just completely completely
shocked the industry it is pretty pretty
incredible that you can see here that
open AI are really really concerned with
as to what the future models are going
to be able to do in regards to the
election now I think it's going to be
either one of two things one of the
things is that because there is an
election coming up there are always
different discussions on what could
happen and the kinds of you know
conversations going on around privacy
issues and just a million different
conversations that are going to be had
and the problem is is that if open AI
does release a model before the
elections then you could face a negative
PR situation like the public could be
negatively thinking about open Ai and of
course yes this week open AI have had a
huge huge huge amount of bad news in
their favor due to some of the things
that have been going on at the company
from people leaving to Sam Altman doing
you know some questionable things
depending on where you stand I think
it's important to not release models
during that time because it's at a time
where if the technology is as truly
Advanced as the graph shows us then it's
definitely going to be more widely
received as a model that is you know
something that is threatening the
individual democracy of the United
States because if it has the ability to
influence people then some individuals
might say that this was all timed and
you know with politics things do get
really really difficult really really
quickly so I think this does make sense
now I could be wrong but the fact that
we do have the OpenAI CTO stating that
they're not going to be releasing
anything and the fact that of course GPT
next is in quotation marks here and that
there's going to be that at November
2024 just after the elections I think
November 2024 and considering that
previous rumors around that time also
did say that I think that this also does
make sense now here's where he actually
talks about the GPT next models this is
a very very fascinating clip and there
is also this graph right here it is very
very very hard to see like very very
very hard to see but if you zoom in you
can see that there is also this graph
you can see the GPT-3 era GPT-4 GPT next
and you can see that this one doesn't
actually have the dates on it so I'm
guessing that before when they had the
dates that may have just been a mistake
based on proprietary information but of
course now the information is out there
although we don't know what date it's
going to be we know that after November
the 5th up until the end of November
there's probably likely going to be some
kind of model but anyways we are really
excited for this but I'm going to show
you guys what he talks about here cuz
it's there are four investment areas I'd
like to cover the first
key priority that we have is textual
intelligence and our core belief is that
if we increase textual intelligence that
will unlock transformational value uh in
Ai and you can see on the screen here
these are the two major models that we
offer today GPT 4 the best model with
Native multimodality that we just showed
and GPT-3.5 Turbo 10x cheaper which is
convenient for simple tasks where what
you need is really uh things like
classification or very simple entity
extraction and we really expect that the
the potential uh to increase
the llm intelligence remains huge and
today we think models are pretty great
you know they're they're you know kind
of like first or second graders they
respond appropriately but they still
make some mistakes every now and then
but the cool thing that we should remind
ourselves is that those models are the
dumbest they'll ever be um you know they
may become master's students in the blink
of an eye um they will excel at medical
research or scientific reasoning and uh
we think that within maybe a year or two
from now like the the models will be
unrecognizable from what they are today
and so this year we also plan to push
that boundary even more and we expect
our next Frontier Model to come um and
and provide like a step function in
reasoning uh improvements as well the
second uh of investment area for us is
to make sure the models are cheaper and
faster all the time and we know that not
every use case requires the highest
level of intelligence and so that's why
uh we want to make sure that we invest
and you can see here on the screen the
GPT-4 pricing and how much it's
decreased by like 80% in just a year uh
it's quite unique by the way for a new
technology to uh like decrease in price
so quickly but we think it's like really
critical in order for all of you to
build and reach scale with what we're
trying to accomplish and innovate uh
with your AI native products so I think
that that short snippet from this Tech
conference was rather insightful because
he actually said a numerous amount of
different things in that short snippet
but I think some of them were more
important than others of course he talks
about the price decreasing but one of
the things that he did mention that was
rather rather fascinating and this is
someone that is from open AI he actually
speaks about the fact that literally
within 1 to two years the models are
going to be unrecognizable so that is
something that even as someone who pays
attention to the AI space and as someone
who looks at all of the AI updates in
many different things that I literally
don't even post on this channel this is
still something that is rather
surprising and I think it's because
humans do have a hard time at grasping
the nature of exponential increases in
terms of technology and intelligence so
I think this is going to be a truly
truly transformative period in terms of
what is going to come out of this
company within the next 5 to 10 years
because if he's stating that literally
the models are going to look
unrecognizable within 1 to 2 years I
mean in 2026 this isn't far away you
know 2 years is a very short time period
especially for these kinds of
technological developments now something
that he also said that I thought was
also rather insightful was that he
mentioned a step function in reasoning
and for the next models this likely
means as we've already discussed that
this is a significant discrete
improvement in the AI's reasoning
capabilities rather than a gradual
incremental Improvement and essentially
this just means that you know in
contrast to the gradual Improvement that
a step function implies a sudden
substantial Improvement at a particular
point which is followed by a new level
of capability and this change is
more abrupt and significant compared to
the gradual improvements and with the
reasoning abilities current models like
GPT-3 and GPT-4 have of course made
significant strides in their ability to
reason and understand and generate text
but their abilities to reason can still
be limited in certain contexts so the
GPT NEX models means that a step
function in reasoning could mean that
they make a substantial leap in their
ability to understand process understand
process and generate more complex
abstract and logical forms of reasoning
and because of that increased level of
reasoning it means that they've got
improved problem solving and this means
that these models are going to be better
at tackling complex problems that
require multi-step and logical reasoning
of course this means enhanced
understanding this means that the AI
could understand context and nuances in
a more humanlike way leading to more
accurate and relevant responses decision
making these models would likely be able
to make more sophisticated decisions
based on the information provided
similar to higher order thinking and
like I said before this is just once
again going to open up a lot more
applications they even spoke about how
you know it's going to be able to do
medical research we've seen that Google
has been pushing widely on that
frontier with the medical Gemini and
they've achieved remarkable benchmarks
and I wouldn't be surprised if OpenAI are
doing something in that realm now one of
the things that I also want to talk
about was of course the fact that this
release date is rather fascinating and
the name of the model did actually make
me think about something that was spoke
about previously you can see here that
we have a model that is called GPT next
but one of the things that I spoke about
when covering the Sam Altman Lex Fridman
interview was the fact that he said
something rather insightful he said that
the future models that he releases might
not actually be called GPT 5 and of
course there might be GPT 5 because they
did trademark it but he did state that
you know whatever the next model may be
we're not sure when it's going to be
released or what it's going to be called
oh that's the honest answer is it blink
twice if it's this year before we talk
about like a GPT-5-like model called that
or called or not called that or a little
bit worse or a little bit better than
what what you'd expect from a GPT-5 I
think we have a lot of other important
things to release first I don't know
what to expect from
GPT-5 you're making me nervous and excited
uh what are some of the so right there
you can see that Sam Altman is actively
talking about how they are going to
release a few things before GPT-5 of
course you know we've seen things like
voice engine we've seen Sora we've seen
a bunch of other things but you know the
way how he talks about how these future
models might not even be called what we
are expecting them to be called is of
course rather fascinating too now one of
the things that you may have seen
recently was this from Microsoft and
this is basically where they talk about
the levels of compute that they're using to
train the next Frontier models and
currently we can see that the diagram
they use sharks and marine life to I
guess you could say help us understand
the scale of compute that they are
currently using we can see that we have
a shark here then of course we have an
orca and then of course we have a whale
so I mean the stark increase in terms of
the capabilities from this graph going
back all the way to the first graph are
very very similar the gpt3 the GPT 4
technology is only a little bit and then
of course the next levels I think maybe
OpenAI clearly have discovered
something incredible and they're
probably going to shock the world
because if you're using that much
compute to train something and you've
also improved your architecture then I
think the amount of capabilities that
you can get is truly truly surprising
and I think this is so surprising
because not only do we have maybe not
improved architectures in terms of the
Transformer but I'm talking about
certain techniques that open AI are
pioneering and that they're using to
advance the frontier in terms of the
reasoning and the capabilities of their
models and I'm going to show you guys a
short snippet from this clip where it's
actually spoken about in great context
about why this is so pivotal and the
only reason I'm showing you this is
because now with the added context from
this slide here where we can see that oh
wait and the only reason I'm showing you
guys this clip is because now with the
added context of this previous graph
where we can literally see in terms of
the capabilities jump I think it's
important to understand the compute side
behind it that Frontier forward and like
we showed this slide at the beginning
like there's this like really beautiful
relationship right now between sort of
exponential progression a compute that
we're applying to building the platform
to the capability and power the platform
that we get and I just wanted to you
know sort of without without mentioning
numbers uh which is sort of hard to do
to give you all an idea of the scaling
of these systems so in 2020 we built our
first AI supercomputer for open AI uh
it's the supercomputing environment that
trained GPT-3 and so like we're going to
just choose marine wildlife as our scale
marker so you can think of that system
uh about as big as a shark so the next
system that we uh built um scale-wise is
about as big as uh an orca uh and like
that is the system in uh that we
delivered in 2022 that trained GPT-4 the
system that we have just deployed is uh
like scale-wise uh about as big as a
whale relative to like you know this
shark-size supercomputer and this orca-size
supercomputer and it turns out like
you can build a whole hell of a lot of
AI with a whale-size supercomputer um
and and so you know one of the things
that I just want everybody to really
really be thinking clearly about and
like um this is going to be our segue to
talking with Sam is the next sample is
coming so like this whale-size
supercomputer is hard at work right now
building the next set of capabilities
that we're going to put into your hands
and yeah if you saw it there he said you
can build a whole hell of a lot of AI
with a large amount of compute so I'm
really intrigued with as to what a whole
hell of a lot of uh compute is going to
be giving us but one thing I do note is
that there is going to be a huge amount
of capabilities now something that they
also spoke about was of course
multimodal agents this is going to be
something that is here within the next
level of Frontier state-of-the-art
models I think that maybe this year we
get something but there are also some
things I do want to talk about with as
to why we might not get that but they
also demoed the multimodal agents and of
course you can see that their investment
areas are the textual intelligence
cheaper and faster models the custom
models and of course multimodal agents
so I want to show you guys this short
clip because OpenAI haven't really
shown us that much in terms of the
agentic workflows but I think it's
important to take a sneak peek because
agents are truly going to change the way
that we interact with computers we
really believe that like in the future
agents may be the biggest change that
will happen to software and how we
interact with computers and depending on
the task they'll be able to leverage
text they'll be able to leverage access
to some context and tools um so and
again all of these modalities that we
mentioned will bring also like a fully
natural and novel way to interact uh
with um to interact with with the
software one example of this that I
personally love is Devin by the team at Cognition
they built like essentially an AI
software engineer and uh it's pretty
fascinating because it's able to kind of
like take a complex task and it's able
to not just write code but it's able to
also understand the task create like
tickets browse the internet for
documentation when it needs to fetch
uh you know to fetch uh
new information it's able to deploy
solutions to create pull requests and so
on so it's kind of like one of those
agentic use cases that I really love um
and in fact like this tweet from Paul
Graham earlier this year kind of caught
my eye because he mentioned or realized
that like the 22-year-old programmers these
days are often as good as the
28-year-old programmers and I think when
you reason about like how the 20 year
olds are already adopting AI and tools
like Devin it's no surprise that they're
getting more and more productive thanks
to AI another agent experience that I
that I think this time is more towards
consumer is Presto and Presto um is
letting customers place orders with their
voice uh so using a voice agent and of
course there's not many drive-throughs
here in Europe but what I found
compelling about this example is that
it's really helping a market where um
there's been a labor shortage and so in
turn that helps um offer not only a
great experience but also let uh the the
the staff actually focus on food and
serving the customers but with that I'd
like to dive into a couple more live
demos to illustrate a little bit uh how
you can build assistive experiences
and agents practically uh today so our
first um incarnation of so yeah with
that you can see that literally one of
the things that this AI power
drive-through system is is it's actually
been impacting people because one of the
things that you might not understand
about drive-throughs is that they're
kind of limited to human intelligence
and one of the things I was thinking
about when I saw a demo in a weekly AI
video I covered when someone was
actually going through a drive-thru with
an AI system and they basically spoke
about how it was so crazy because you
know an AI system is able to completely
understand exactly what you want it's
able to understand exactly what you want
in other languages too and it's also
able to converse with you in other
languages too much more fluently than
just someone who only speaks one
language and isn't bilingual or able to
understand other languages and it's
patient and it's fast and I think it's
something that's going to allow a lot
more unique experiences so that's why
agents are something that is very very
impactful because I think this is where
you're really going to see that real
life impact other than just in a
day-to-day llm
interface welcome to Wendy's what would
you like can I have a chocolate
frosty which size for the chocolate
frosty medium
can I get you anything else today no
thank
you great please pull up to the next
window so now let's take a look at some
of these demos of these agentic
workflows that you can actually use and
do uh and what they've shown Us in this
presentation um incarnation of uh agents
for developers is what we call the
Assistants API and the Assistants API is
a complete toolkit that all of you can
use in order to bring assistance into
your products so in this case here I'm
building this like travel app called
Wanderlust as you can see there's like a
map on the right side but there's also
an assistive experience on the left side
and so this is completely powered by the
assistant API so let's take a quick look
if I say top five venues for the
Olympics in Paris first of all first
thing to note I don't have to manage any
of those uh let's refresh the app a
little bit sounds like we maybe lost
Network top five venues for the Paris
Olympics the first thing to note is like
I don't have to manage that conversation
history that conversation history is
automatically managed by the assistant
API from OpenAI uh and so I don't have to
kind of manage my prompt and so on not
sure what's happening here let's take a
quick look might have lost some Wi-Fi or
connection nope let's try let's try one
last time let's go to Rome ah there we
go sounds like the Olympics was bad luck
but uh sounds like we're back so yeah I
don't have to actually manage any of
those messages the conversation history
is automatically uh managed by OpenAI
the second thing that's really cool to
go out here is that as you could see
when I started to interact with these
messages the map zoomed automatically
and that's one of my favorite features
when I build agents it's called function
calling and function calling is the
ability for all of you to bring
knowledge about um your unique features
in your app and your unique functions
over to the model in this case GPT-4
so if I say top five um things to see in
Rome let's see what happens here in in
theory uh what should pop up here is
once again an interaction between the
text and the map here we go so now as
you can see as we talking to the model
it's able to actually pinpoint the map
because it h it knows that this feature
exists so it's really really cool and
that's like already available as part of
the toolkit of the Assistants API now
another tool I wanted to call out here
is uh knowledge retrieval and we know so
many of you want to bring like factual
data into the conversations with models
like GPT-4o and usually you have to
build like a retrieval stack to do so
and we've learned from so many
developers how complex that can be and
so we've made a ton of improvements in
our retrieval stack and so I'm going to
try to see if I can actually demo this
in real time so I actually bought this
like book to prepare a trip to Italy
from Lonely Planet it's a pretty
comprehensive book it has like 250 Pages
it's like 95 megabytes so I hope the
upload is going to work taking a bit of
a risk here um but what's happening in
real time is like as soon as the file
will be uploaded it will be
automatically embedded by the Assistants
API so that I don't have to um think
about any of these things to do I will
be able to just start interacting in the
conversation and say based on this book
what's the best photo spot in Lazio so
before I press enter I'll show you quick
look at page uh
126 I believe let's go to page 126
so the page 126 talks about Lazio right
and so I'm going to like ask the
question here what's the best photo spot
in Lazio and as I'm browsing the book
we're noticing here that like the photo
opportunity was mentioned on page 128
and it's supposed to be Pano and boom in
real time we were able to find
in this book that this is exactly the
place um uh for for a photo spot and
again I had to do no engineering work I
just had to upload the file in the
conversation that's were all taken care
of for me last but not least there's
also another tool that I want like to
highlight called code interpreter and
code interpreter is the ability to write
python code in the background to answer
some very precise questions usually
around like numbers and math and
financial data so here for instance if I
were to say in this conversation um we
are sharing an Airbnb um 44 it's
โฌ1,200 what's my share plus my flight
cost of let's say 260 Now by asking this
question this is not a typical thing
that LLMs do great at by default right
but what's happening behind the scenes
is that we're actually Computing all of
this including currency conversion
and so on by writing code in the sandbox
and once again as a developer I have
nothing to do but because OpenAI is
managing this does not mean it's a
blackbox in fact if I go here and if we
refresh the threads um we should see
here that this is the exact threads that
we've been you know feeding and you can
see we we're going to Rome like all of
the messages we see the function calls
that I highlighted to annotate the map
and here this is the python code that
was written behind the scenes to
actually answer the question you know
compute the currency conversion divide
by number of people and so on so really
like the Assistants API complete toolkit
with conversation history with access to
retrieval and files you can upload now
up to 10,000 files in retrieval and even
code interpreter and function calling
all of this what you can build on from
day one so let me know what you think
about future models I mean one of the
things that is a little bit confusing is
the fact that they do have GPT 6 and
other names trademarked so I'm wondering
if they're just going to continue with
the traditional methods but it is quite
hard to predict considering the fact
that open AI is a company that comes
with a lot of drama and of course a lot
of surprise and with the rate that AI is
exponentially increasing in terms of the
capabilities and everything new being
discovered what feels like every week I
mean you know the capabilities trying to
predict a year two years three years
from now are are quite hard but I think
from this we do know that you know
November there's probably going to be a
new model released whether it is GPT
next whether it is GPT-5 I can say one
thing that's certain is that it's going to
be a Monumental leap in terms of the
capabilities and usability of what we're
about to see okay that was