SHOCKING New AI Models! | All new GPT-4, Gemini, Imagen 2, Mistral and Command R+
Summary
TLDRGoogle DeepMind introduces Gemini 1.5 Pro, an AI model available for public preview on Google Cloud and Vertex AI platforms. The model boasts a 1 million token context window and is trained up to December 2023. GPT-4 Turbo with Vision has been released, offering significant improvements and enabling developers to build innovative applications. Meanwhile, Devon AI, an AI software engineering assistant, has garnered attention but also skepticism. Healthifme leverages GPT-4 Turbo Vision for nutrition insights through food photo recognition. The video also touches on the potential implications of AI agents on various industries and the economy.
Takeaways
- 🚀 Google DeepMind has released Gemini 1.5 Pro in public preview on Google's cloud and Vertex AI platforms.
- 🖼️ The new and improved Imagin 2 can create 4K live images from a single prompt, showcasing significant advancements in AI image generation.
- 📱 GPT-4 Turbo with vision is now generally available in the API, having moved out of preview mode and featuring important improvements.
- 💡 GPT-4 Turbo Vision introduces a 128,000 token context window and training data up to December 2023, enhancing its capabilities.
- 🤖 Devon AI, an AI software engineering assistant, is making waves as an application of GPT-4 Turbo's vision capabilities.
- 🕵️♂️ The YouTube channel 'Internet of Bugs' critically examines AI software development demos, questioning the authenticity of some recent presentations.
- 🛠️ The potential impact of AI agents like Devon on the job market, economy, and remote work is vast and raises many questions about the future.
- 🎨 TLDraw leverages GPT-4 Turbo Vision to transform user-doodled ideas into functional software, representing a potential shift in UI design.
- 📈 Google Cloud's updates to Gemini, Gemma, and mlops on Vertex AI include enhanced image generation and multimodal content analysis.
- 📅 The release of Gemini 1.5 Pro includes a 1 million context window, which could significantly improve its performance on various tasks.
- 🏆 The leaderboard for AI models shows tight competition between OpenAI's models, with GPT-4 and CLA 3 Opus neck and neck at the top.
Q & A
What is the new AI model released by Google DeepMind in the public preview?
-The new AI model released by Google DeepMind is Gemini 1.5 Pro, which is available in public preview on Google's cloud and Vertex AI platforms.
What improvements have been made to the GPT model recently?
-The recent improvements to the GPT model include the release of GPT 4 Turbo with Vision, which has a 128,000 token context window and training data up to December 2023. It also now supports JSON mode and function calling, and vision requests can be made.
What is Devon AI, and what role does it play in software engineering?
-Devon AI is an AI software engineering assistant powered by GPT 4 Turbo that uses vision for a variety of tasks. It has been making significant noise in the industry, showcasing its capabilities in tasks such as upwork side hustles and website building requests.
What are some concerns regarding the authenticity of AI software engineering demos like Devon AI?
-There are concerns that the demos shown for AI software engineering tools like Devon AI may not be entirely genuine. Critics believe there could be some misrepresentation or 'shenanigans' going on, as evidenced by the thorough debunking done by the internet of bugs YouTube channel.
How does the Healthifme app utilize GPT for Turbo Vision?
-Healthifme has built an app using GPT for Turbo Vision that provides users with nutrition insights by recognizing food photos from around the world.
What is the significance of the 1 million context window in Gemini 1.5 Pro?
-The 1 million context window in Gemini 1.5 Pro is significant because it allows the model to handle large documents and find specific information within them efficiently. This capability is particularly useful for tasks like searching and analyzing multimodal content.
What is the potential impact of AI agents like Devon AI on the job market and economy?
-The potential impact of AI agents includes the automation of various jobs, which could lead to changes in the economy and remote work. There are concerns about knowing who is real and who is not online, as well as how to protect against cyber attacks and maintain the quality of software development.
How does the GPT 4 Turbo Vision model facilitate user interface design?
-GPT 4 Turbo Vision facilitates user interface design by allowing users to draw and annotate their ideas, which the model then turns into actual software. This rapid prototyping process can significantly speed up the development and iteration of user interfaces.
What are the capabilities of Google Cloud's updated Gemini imaging model?
-The updated Gemini imaging model on Google Cloud can now create 4-second live images from a single prompt and supports processing audio inputs, including music, speech, and the audio portion of video. It can provide high-quality transcriptions or be used to search and analyze multimodal content.
How does the GPT 4 Turbo model perform in the Gladiator arena for LLM chatbots?
-The GPT 4 Turbo model, once added to the Gladiator arena, performed well in comparison to other models. It was found to be significantly better than the CLA 3 High coup model, and its performance is closely monitored to see where it will rank among the top AI models.
What are the current rankings of the top AI models in the Gladiator arena?
-As of the latest update, the top AI models in the Gladiator arena are CLA 3 Opus as the reigning king, followed by GPT 4, and then Bard from Gemini Pro. The new GPT 4 Turbo model is expected to join the rankings soon.
Outlines
🚀 New AI Releases and Improvements
This paragraph discusses the latest developments in AI technology. Google DeepMind has launched Gemini 1.5 Pro in public preview on Google's cloud and vertex AI platforms. The new and improved Imagin 2 has been introduced, capable of creating 4K live images from a single prompt. Additionally, GPT for Turbo with vision has been released from its preview mode and includes significant improvements such as JSON mode, function calling, a 128,000 token context window, and training data up to December 2023. The paragraph also highlights the skepticism around Devon AI's authenticity and the emergence of the internet of bugs, a YouTube channel debunking AI software development demos.
🖼️ Advancements in Image Generation and AI Models
The second paragraph focuses on advancements in image generation and AI models. Google Cloud has announced updates to Gemini imaging and Gemma and mlops on vertex AI. Gemini 1.5 Pro now supports audio inputs and can provide high-quality transcriptions. The Google Cloud's vertx AI Studio Vision allows for image generation, and the paragraph discusses the potential of these technologies. It also touches on the capabilities of GPT 4 Turbo Vision in user interface design and the potential future applications of AI in various fields.
🏆 AI Model Rankings and Upcoming Developments
This paragraph covers the current rankings of AI models and upcoming developments. The paragraph mentions the close competition between GPT 4 and CLA 3 Opus, with the latter recently surpassing GPT 4. It also introduces a new competitor, Command R plus, which has been making waves in the AI community. The paragraph concludes with a teaser about upcoming big news in the AI field, hinting at the potential release of a new model, gp4 turbo, and its expected impact on the current rankings.
Mindmap
Keywords
💡Gemini 1.5 Pro
💡GPT for Turbo with Vision
💡AI software engineering assistant
💡Internet of Bugs
💡Debunking
💡Ethan Mik
💡Automation
💡Civil attacks
💡Healthifme
💡User Interface Design
💡Gladiator Arena
💡ELO Rating
Highlights
Google Deep Mind releases Gemini 1.5 Pro in public preview on Google's cloud and vertex AI platforms.
New and improved imagin 2 can create 4C live images from a single prompt.
GPT for Turbo with vision is now generally available in the API, out of preview mode with important improvements.
GPT 4 Turbo F Vision has a 128,000 token context window and training data up to December 2023.
Devon AI, an AI software engineering assistant powered by GPT 4 Turbo with vision, is gaining attention.
Internet of bugs YouTube channel debunks Devon's demo, raising questions about the authenticity of AI demos.
Ethan mik's work with Devon AI agent on Reddit shows potential for AI in website building and problem-solving.
AI agents may open cans of worms in areas like remote work, software development, and cybersecurity.
Healthifim uses GPT for Turbo Vision to provide users with nutrition insights through photo recognition of foods.
TL draw demonstrates the potential of GPT 4 Vision in user interface design through rapid prototyping.
Google Deep Mind's image in to can create 4-second live images from a single prompt.
Gemini 1.5 Pro on vertex AI supports processing audio inputs, including music, speech, and video audio.
Gemini 1.5 Pro has a 1 million context window, excellent for finding specific information in large documents.
GPT 4 Turbo is added to the Gladiator arena for llm chat Bots to compete for the best model.
Model A (Open Chat 3.5) and Model B (CLA 3 High Coup) showcase differences in AI's ability to capture nuances in conversation.
CLA 3 Opus surpasses GPT 4 in the arena rankings, with interesting implications for the future of AI models.
Command R plus is a new competitor in the AI space, making waves and challenging existing models.
Transcripts
Google Deep Mind wakes up this morning
and releases Gemini 1.5 Pro that's now
available in public preview on Google's
cloud and vertex AI platforms which is
actually really cool we'll look at this
in just a second and they announced the
new and improved imagin 2 that's able to
create 4C live images from a single
prompt crashing waves mountain range of
course opening eyes like this can I be
we got to drop something too so GPT for
Turbo with vision is now generally
available in the API so it's out of the
preview mode and has been rolled out
with some important improvements so
there isn't a huge amount of specifics
about what was improved here's what's
new so this is the new model GPT 4 Turbo
F Vision the latest gp4 turbo model with
vision capabilities Vision requests can
now use Json mode and function calling
128,000 token context window and
training data up to December 2023 they
also give some examples of what
developers are building with vision and
they're saying drop whatever you're
building in the reply as well they
highlight Devon Devon AI has been making
tons of noise tons of waves it's an AI
software engineering assistant powered
by GPT 4 Turbo that uses vision for a
variety of tasks by the way not
everyone's convinced that Devon and the
demos that have been shown are the real
deal they think that there may be some
Shenanigans that's going on in internet
of bugs is a new YouTube channel
relatively new one month old that's
gaining some traction pointing out some
of the issues with these AI software
development demos latest video debunking
Devon first AI software engineer upw Cai
exposed he specifically takes a look at
this demo that cognition the company
behind Devon posted showing Devon's
upwork side Hustle the internet of bugs
Channel goes through and does a thorough
debunking of what the video claimed
including a 30-minute un edited footage
of him going through and doing
everything that Devon did supposedly to
complete that task now we've covered
Ethan mik's work where he gets the Devon
AI agent to go on Reddit and start a
thread where it's going to take actual
website building requests and it does
that solving numerous problems along the
way even at some point uh attempting to
charge people for the work as Ethan MOG
says agents are going to open a whole
bunch of cans of worms these cans of
warms are things like knowing who is
real and who is not online how does the
jobs and economy change when a lot of
these jobs can be automated by these
agents how does remote work change how
does software development change how do
we protect against things like civil
attacks DDOS attacks like there's a
million questions that if you imagine
these agents will continue developing
and getting better right there's a whole
bunch of cans of worms that are going to
open and these agents are the can
openers that's not not even a joke it's
not me being funny it's just what's
coming now if agent Trader here is
correct then maybe these changes aren't
quite as close as we think there are
maybe the reliability of agents still
hasn't been solved quite that well yet
and software Engineers have nothing yet
to fear because that specific skill set
is still Irreplaceable things like Devon
AI will be an assistant a very important
assistant that's going to allow them to
build more do a lot of the boring tasks
just kind of improve their product ity
instead of actually replacing that work
now me calling him agent shider is a
joke cuz he looks like Hank shider from
Breaking Bad I don't actually know his
real name also another company is Health
ifim me who built snap using GPT for
Turbo Vision to give users nutrition
insights through photo recognition of
foods from around the world TL draw we
covered this in another video If you
haven't played with this thing I highly
highly suggest you do imagine something
like Microsoft Paint where you just
paint whatever you want on the screen
right you paint buttons and you make
little annotations you write out what
you want then you click make it real and
this thing makes it real it uses GPT 4
Vision to take whatever you've just
doodled and turn it into an actual
version an actual software of of what
you did so things I've tried for example
drawing a game and then have it actually
make that game Real Time take seconds to
code that game up I mean there's simple
games one of them was like running
around chasing chickens in a little
enclosure but you can do web pages you
can do forms you can do tons and tons of
stuff it is surprisingly good and I
think something like this will be the
future of user interface design just
because of how quickly you can just get
stuff out there kind of tested iterated
Etc Google deep Minds image in to can
now make little 4-second live images
from a single prompt if you wanted to
try the image in two just the regular
image in two the image effects is
probably the easiest way of doing it
it's pretty good I was surprised I still
prefer mid Journey but Google is getting
very good at image generation in other
news Google Cloud announces updates to
Gemini imag in Gemma and mlops on vertex
AI so Gemma has been improved the small
open source model from Google kind of
like the open source version of Gemini
that might be an accurate way to
describe it Gemini 1.5 Pro on vertex AI
also supports processing audio inputs
including music speech and even the
audio portion of video it can give high
quality transcriptions or be used to
search and analyze multimodal content so
in the Google Cloud you can find the
vertx AI Studio Vision looks like you're
able to generate images and you can
request access here I don't have it yet
but it looks like I have the Gemini
experimental which is the default
setting for me and then you do have this
Gemini 1.5 Pro preview 0409 so I'm
guessing that's April 9th today if a 1
million context window so we might do a
deeper dive into this but it is looks
like it is available and looks like yeah
it does have the 1 million context
window which I have to say is kind of
exciting as we've covered before the
paper show that it's really good at
doing for example finding the needle in
the H stack so if you have a large
document and need to find a specific
thing in that document it will do so
very well 1 million context window is of
course massive and Gemini 1.5 Pro was
very good at a number of tasks the jump
from 1.0 to 1.5 was pretty massive I
believe if I recall correctly the big
change was they went to mixture of
experts so kind of copying the GPT 4
design and making kind of a big leap
forward so we might test this out in a
different video but if you wanted to
play around here's how you can do it the
Google cloud has a free trial it doesn't
autocharge you at the end so if you
wanted to jump in and uh mess around
with it it's you can do so for free they
have text to speech and speech to text
hello there happy to be here the voice
sound okay pretty good here's uh we're
going to try to generate a caption for
this lovely lovely thumbnail I made more
agents is all you need with these uh
wonderful little friendly little kitties
caption reads four hairless cats are
sitting at a table with the words more
agents is all you need above them I mean
that's pretty perfect they nailed it
they're sitting at a table yeah I got to
say this is excellent now we don't quite
yet know what exactly has changed we're
probably going to see how much better
the model is once we start testing it
once we start playing around with it it
has been added to the Gladiator arena
for llm chat Bots to see which one is
the best so the added GPT 4 Turbo the
409 version this is what that looks like
you basically get assign to random
models model a Model B you're not told
what it is but you're able to put in
your prompt and then see how well they
respond I'm going to say right quick
screenplay about Hank Shader and Walter
White talking about using AI to do their
jobs the scene is set at Walter's house
in the backyard as they make barbecue
and off they go Model A and model B
Walter starts smirking you know Hank
I've been thinking Hank raising eyebrow
oh great another one of your brilliant
ideas what's this one all right so
that's a good start I got to say Walter
leaning in what if we use AI to do our
jobs Hank are you serious you want to
replace me with a robot Walter no no not
replace augment improve efficiency you
know like using a computer to analyze
data instead of sifting through mounds
of paperwork okay I got to say so so far
Model A whatever it is is excellent they
really capture the Precision of the
words that Walter uses the Simplicity of
how Hank talks so Model A did an
excellent job and model B I mean it's
okay but it's very very basic right so
it just kind of repeats some talking
points about AI you know oh AI could
write up reports for me and Walter is
like but it might put you out of a job
kind of simplistic not bad it's okay
writing but a is significantly better
and now they revealed that model A was
open chat 3.5 and model B was CLA 3 High
coup which is surprising let's take a
look at the leaderboards so CLA 3 High
coup the ELO rating the arena rating
that they have for it is 1182 whereas
open chat is much lower at well much
lower on the rankings but not that much
lower on the um actual the rating right
so it's 1097 and currently we have CLA 3
Opus as our reing King it has recently
surpassed GPT 4 interestingly Bard from
Gemini Pro is right behind basically the
third model that's ranked right so we
have Claude GPT 4 then we have Bard so
it's number four but it's the third best
model right if you look at GPT these two
versions of GPT 4 as you know the same
model then we have a new competitor the
command R plus that we have to look into
because this has been making a lot of
waves a lot of people are questioning if
it does indeed belong on here we'll do a
full Deep dive into this later but the
point is that very soon we're going to
see the new model up here I guess it'll
be called gp4 turbo
20244 d09 and we'll be able to see
exactly where it falls will open ey take
back their crown and become the number
one once again I mean I got to say here
GPT 4 and Claw 3 Opus are neck to neck
they're two points apart which you can
say is not even I mean they're they're
pretty much the same the difference
might not be statistically significant
with that said my name is Wes rth make
sure you're subscribed think there's
going to be some big news coming soon
and thank you for watching
استعرض المزيد من الفيديوهات ذات الصلة
Всё о новой нейросети GPT-4o за 7 минут!
Google I/O 2024 keynote in 17 minutes
AI News: The AI Arms Race is Getting Insane!
The First AI That Can Analyze Video (For FREE)
OpenAI's New Model Releases LEAKED | Sam Altman talks about AGI, UBI, GPT-5 and what Agents will be
Google I/O 2024: Everything Revealed in 12 Minutes
5.0 / 5 (0 votes)