These AI Use Cases Will Affect Everyone You Know
Summary
TL;DR: This week in AI brought a flurry of updates, with OpenAI's GPT-4o leading the charge, offering significant improvements over its predecessor, GPT-4. The model promises multimodal capabilities, faster processing, and a voice assistant with emotion detection. While many features are yet to come, some are already available, including image uploads to the new multimodal model, and a rollout to free users is underway. Google also made strides with its AI offerings, including the announcement of Project Astra and updates to Gemini Advanced. Stability AI introduced a new Discord tool for image and video generation, a new Hugging Face Space offers image relighting, and ElevenLabs teased its upcoming music model. The summary highlights the rapid advancements and accessibility of AI technologies that are shaping the future of content creation and beyond.
Takeaways
- 📈 **GPT-4o Release**: OpenAI's new model, GPT-4o, surpasses GPT-4 in many aspects, including speed, cost, and capabilities. It's currently available to paid users and is being rolled out to free users.
- 🆓 **Free Access**: GPT-4o is being made freely accessible to all users, a significant move by OpenAI that lets everyone use advanced AI capabilities.
- 🖼️ **Image Generation Updates**: Demonstrated improvements to image generation include rendering text in images, one-shot fine-tuning, and character consistency for creating comics or storyboards. As of the recording, ChatGPT still generates images with DALL-E 3.
- 📈 **Performance Benchmarks**: GPT-4o's vision model leads the vision-understanding benchmarks, outperforming models like Opus, Gemini Ultra, and GPT-4.
- 🔄 **Web Interface Enhancements**: Web browsing and code interpreter are now much faster, making it easier to iterate and create multiple generations.
- 🚀 **GPT-4o's Multimodal Features**: Users can now upload images to engage with the new multimodal GPT-4o, leveraging its advanced capabilities.
- 🔗 **New GPT Builder Features**: Screenshots show a building block approach being added to the GPT Builder, which would make it easier to create the specialized versions of ChatGPT called GPTs. The feature had not been announced or released as of the recording.
- 📱 **Voice Input and Output**: The phone app still uses the older models, Whisper for voice input and TTS-1 for text-to-speech, with no immediate update to the new models.
- 📚 **Google's AI Announcements**: Google announced several AI tools, with Project Astra being a notable mention, though most are not yet available for use.
- 🌐 **Global Access**: Anthropic's model, Claude, is now accessible worldwide, increasing competition in the AI market.
- 🎨 **Stable Artisan by Stability AI**: A new Discord interface that combines multiple models, including image, video, and music generation, into one user-friendly platform.
- 🌟 **IC-Light Tool**: An AI tool for relighting images, showcasing the potential for AI in image editing and generation, which may soon replace traditional tools like Photoshop for many tasks.
Q & A
What is the main focus of the video script?
-The main focus of the video script is to discuss the latest AI developments and releases from companies like OpenAI and Google, highlighting tools and features that are currently available for use.
What does the term 'AI news you can use' refer to in the context of the script?
-The term 'AI news you can use' refers to the practical applications and immediate usability of the AI advancements discussed in the script, as opposed to announcements of future developments.
What is GPT-4o and why is it significant?
-GPT-4o is OpenAI's new model that outperforms GPT-4 in various aspects, such as speed and cost. It is significant because it offers new capabilities like a human-like voice assistant, multimodality, and emotion detection, which are groundbreaking in the field of AI.
How can users access GPT-4o currently?
-As of the script's recording date, GPT-4o is accessible to paying users on chat.openai.com. It is also being rolled out to free users, with some already reporting access.
What improvements have been made to the image generation capabilities in the new AI models?
-The demonstrated image generation improvements include rendering text in images, one-shot fine-tuning, and character consistency for creating comic strips or storyboards. As of the recording, ChatGPT still generates images with DALL-E 3, but uploading images already engages the new multimodal GPT-4o.
What is the current status of GPT-4o support in the specialized versions called GPTs?
-As of the script's recording, the specialized versions of ChatGPT, known as GPTs, still run on GPT-4 rather than GPT-4o. However, there are screenshots indicating a new module for building GPTs with added blocks and states.
What is the significance of the Mac app mentioned in the script?
-The Mac app mentioned in the script is significant because it represents a new interface for accessing AI tools. However, the script notes that access to certain features, like the new GPT-4o, may still be restricted until further updates.
What is the AI Advantage Community and how does it relate to the script?
-The AI Advantage Community is a subscription-based service that offers challenges and resources related to AI. In the script, the community is mentioned as offering a yearly subscription as a prize for a challenge to submit favorite GPT-4o use cases.
How does the script address the topic of Google's AI announcements and releases?
-The script addresses Google's AI announcements by focusing on the releases that are currently available for use, such as the Gemini Advanced updates, and by pointing to Project Astra as the one announcement worth watching. It also provides a free resource to help users navigate Google's extensive lineup of AI tools.
What is the significance of the new Gemini 1.5 Flash model released by Google?
-The new Gemini 1.5 Flash model is significant because it is faster than the 1.5 Pro model and ranks highly in terms of speed for AI models. It is accessible through a site that hosts various new chatbots and models, indicating advancements in Google's AI capabilities.
Outlines
🤖 AI News and GPT-4o Updates
The content creator discusses the overwhelming pace of new AI releases, focusing on practical tools from Google, OpenAI, and other companies. The highlight is GPT-4o, a significant upgrade over GPT-4, boasting enhanced capabilities, speed, and cost-effectiveness. The model includes a human-like voice assistant and multimodal features. The creator provides updates on GPT-4o's availability, noting that paid users have access, and a rollout to free users has begun. They also mention improvements in image generation and character consistency, with a new upload feature in the chat interface offering the latest model's capabilities. However, certain features like GPT-4o's integration with GPTs are not yet available.
🔍 GPT Builder and Google's AI Announcements
The script shifts to discussing the GPT Builder, which now incorporates a building block approach that the content creator had previously highlighted. The creator expresses excitement about the new features and plans to create tutorials once they are released. The discussion then moves to Google's AI announcements, with a focus on Project Astra, which is likened to the future promise of GPT-4o's voice assistant. The creator also mentions a free resource provided by the AI Advantage team, which offers an overview of Google's extensive AI tools and offerings, including new features and research projects.
🎨 Google's AI Creative Tools and Off Script Sponsor
The content creator introduces Off Script, a sponsor that turns digital creations into physical products. They describe the app's functionality, which allows users to both judge others' creations for potential physical production and generate new products for community voting. If a product is successful, the creators earn a revenue share. The script also touches on Google's new video model, Veo, which, while not at the level of some competitors, shows promise and has a waitlist that users can sign up for.
🌐 Global Accessibility of AI Models
The script addresses the global release of AI models, noting that competition between OpenAI and Google has led to wider accessibility, including in Europe. The creator expresses confusion over previous limitations, suggesting that legal barriers may not be as restrictive as thought. They also mention the release of Anthropic's model, which is now available worldwide, and discuss the relative merits of different AI models, including Google's Gemini and Twitter's Grok, in comparison to OpenAI's GPT-4o.
🎭 Other Companies' AI Updates and Stable Artisan
Despite being overshadowed by Google and OpenAI's announcements, other companies have released interesting AI updates. Stability AI, known for Stable Diffusion, has launched a Discord interface that integrates multiple models, including image, video, and music generation. The tool, called Stable Artisan, offers a convenient workflow for creators, although it lacks the promised audio feature at the time of the script. The creator also mentions ElevenLabs' work on a music model and introduces IC-Light, an AI tool for relighting images, suggesting a future where Photoshop may not be necessary for many image editing tasks.
Keywords
💡Content Creator
💡AI News
💡GPT-4o
💡Multimodal
💡Image Generation
💡Benchmarks
💡API
💡Waitlist
💡AI Advantage Community
💡Discord Interface
Highlights
GPT-4o, OpenAI's new model, outperforms GPT-4 in almost all aspects, including speed and cost.
GPT-4o introduces a human-like voice assistant capable of detecting and expressing emotions.
GPT-4o is multimodal, offering a wide range of new capabilities.
A summary video of GPT-4o's most important points is available, along with a compilation of groundbreaking use cases.
As of May 16th, 2024, GPT-4o is accessible to paying users, with a rollout to free users underway.
OpenAI's image generation capabilities have been massively improved, allowing text rendering in images and one-shot fine-tuning.
Character consistency feature allows for the creation of comic strips or series with uploaded images.
Vision understanding benchmarks show GPT-4o as the new best-in-class vision model.
Web browsing and code interpreter improvements make iterations faster and more efficient.
GPT-4o for GPTs, the specialized versions of ChatGPT, is not yet available; GPTs still run on GPT-4.
A new module for building GPTs with a building block approach has been added to the GPT Builder.
Off Script's iOS app allows users to turn AI-generated images into physical products.
Google's Project Astra is a voice assistant similar to the promised capabilities of GPT-4o's voice assistant.
Google has released a comprehensive overview of their AI offerings, including 44 different AI tools.
Google's new video model, called Veo, is a competitor to other video models like Runway Gen-2 and Sora.
Anthropic's model, Claude, is now available for use worldwide, including in the European Union.
Stable Artisan by Stability AI offers a Discord interface combining multiple models for image, video, and music generation.
ElevenLabs is developing a music model that recreates voices and rap abilities with high fidelity.
IC-Light, available as a Hugging Face Space, allows users to relight images with AI, offering new image editing capabilities.
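The transcript below quotes a few context-window figures for Gemini Advanced and GPT-4o: 1 million tokens as roughly 1,500 pages, 32,000 tokens for GPT-4o in ChatGPT, and an announced 2 million token window. As a rough sanity check, the sketch below works through that arithmetic in Python. The constants are the speaker's numbers rather than verified specifications, the tokens-per-page ratio is simply what those numbers imply, and the helper name `pages_for` is made up for illustration.

```python
# Figures as quoted in the transcript (the speaker's numbers, not verified specs).
GEMINI_TOKENS = 1_000_000   # Gemini 1.5 Pro context in Gemini Advanced
GEMINI_PAGES = 1_500        # pages the speaker equates with 1M tokens
CHATGPT_TOKENS = 32_000     # GPT-4o context in ChatGPT, per the speaker
ANNOUNCED_TOKENS = 2_000_000


def pages_for(tokens: int) -> float:
    """Convert tokens to pages using the ratio implied by 1M tokens = 1,500 pages.

    Hypothetical helper for illustration only.
    """
    return tokens * GEMINI_PAGES / GEMINI_TOKENS


# How many times larger the 1M window is than the 32K one.
ratio = GEMINI_TOKENS / CHATGPT_TOKENS

print(f"32K tokens is about {pages_for(CHATGPT_TOKENS):.0f} pages")
print(f"2M tokens is about {pages_for(ANNOUNCED_TOKENS):.0f} pages")
print(f"1M vs 32K tokens: {ratio:.2f}x")
```

By these figures the 1 million token window is about 31 times larger than the 32,000 token one, somewhat less than the "50 times" quoted in the video, while the 2 million token window matches the quoted 3,000 pages.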
Transcripts
okay this is one of those weeks in the
AI space where as a content creator you
don't get much sleep because there's
just new stuff coming out every single
day there are so many new releases but
as this is AI news you can use we will
only focus on the ones that you can
actually put to work today which as a
matter of fact is not that many most of
the OpenAI and Google I/O announcements
are things that are coming in the future
nevertheless me and my team compiled all
the different releases from this week
from both Google OpenAI and other
companies and it turns out that there are
some wait lists that you should know
about and sign up to so you can use some
of these tools as early as possible and
with that being said let's just dive
into all of this AI Madness that came
out this week ranging from GPT-4o to new
video generators by Google and some
nifty new Hugging Face spaces that were
overshadowed by the big announcements
but nevertheless you should know about
them all right so first things first
this was by far the biggest one this
week GPT-4o OpenAI's brand new model
that beats GPT-4 on pretty much everything
it's way faster it's way cheaper it has
new capabilities there's a voice
assistant coming that is humanlike a lullaby
about majestic potatoes now that's what
I call a mashup and it can detect and
express emotions this model is
multimodal there is so much to talk
about here and I absolutely did if you
didn't see them there's two separate
videos that I created this week about
this announcement the first one focusing
on a summary of all the most important
points I'll link it on top right now and
then a second one which compiling that
one was absolute Madness me and my team
collected all the use cases that we
considered really groundbreaking and put
all of them into a video that one you can
also check out on the channel but beyond
those two videos there have actually
been developments with the model because
the most confusing part of the release
was what is available today what is
coming up and as this is news you can
use I see it as my responsibility to
keep you up to date on which parts of
this massive release you could be
putting to work today and which parts
you will be using in the future and in
that spirit I actually created a tweet
here summarizing where we're at with the
roll out of this release as of today May
16th 2024 by the way I will keep posting
on Twitter when things change I will
keep updating this so if you want the
fresh updates just follow the Twitter at
the advantage but as of now if you go to
chat.openai.com if you're a paying user
you will find that you have access to
GPT-4o this is the first point virtually
all paid users that I know have access
to this model as of today but they also
announced that it's coming to all free
users and this roll out seemingly
started a few hours ago I caught the
first few comments on my videos and the
first few people on Twitter are
reporting that they can actually use
GPT-4o as a completely free user this still
blows my mind I haven't wrapped my head
around the fact that every single person
on this planet is going to be able to
access GPT-4 heck better than GPT-4 a
multimodal GPT-4 that has access to GPTs
and so on for free might take a while
till everybody gets it but this is
underway next up if you're using it the
image generation in it is still DALL-E 3
now in the use case video we looked at all
the image generation capabilities and
they are massively improved it's
incredible what they added in there it
can now do text meaning you can generate
a full font for yourself or write text
on various images like so and they also
show off capabilities where it does one
shot fine-tuning meaning you give it one
picture of yourself and you can recreate
that in any style this is absolutely
mindblowing when you pair it with the
fact that it can also do character
consistency meaning you can upload one
image of yourself recreate that in a
different style and then create a whole
comic strip you can create a whole
series you can tell a story create a
storyboard and two hours before this
recording Greg Brockman actually started
tweeting about this feature meaning this
is probably on its way but as of now
it's still the old model DALL-E 3 so if
you're going to be testing this in the
ChatGPT interface you're going to find
it's still the old model that is kind of
meh meh okay but what does work as of
today is this upload feature so if you
upload images to GPT-4o you will be
engaging with the new multimodal GPT-4o
and you will get all the improved
capabilities if you check out the
benchmarks on Vision understanding this
is the new best-in-class vision model it
beats Opus it beats Gemini Ultra it
beats GPT-4 on pretty much all of these
benchmarks plus as a power user I can
confirm it is the best Vision model that
we have today and it is available in the
web interface today and for the last
ones I'll just speedrun this we already
have improved web browsing and code
interpreter available today they made
under the hood improvements to these but
the main thing is that they're super
fast now so it's easy to iterate and
create multiple Generations whereas
before it took forever one thing that is
not here yet is GPT-4o for GPTs and this is
surprising to me to be honest seems like
an easy thing to implement but if you
have GPTs that you use for specific
tasks these still run on GPT-4 as of
today okay wait a minute 12 hours passed
since I recorded that segment and
actually some screenshots surfaced on
Twitter or X from Jeremy here that
shares that there's a new module when
you build GPTs so this wasn't announced
we didn't know of this and I don't have
it neither does anybody else but it is
an interesting preview look basically
when you create these specialized
versions of ChatGPT called GPTs you
have this interface where you create
them and there's a new button at the
bottom where you can add blocks and
States and it's so interesting that they
added this to me because when I teach
building GPTs matter of fact when the
GPT store came out I created a video
outlining how to build a GPT with just
one prompt and the entire prompt was
based on building blocks these are
different blocks with modalities that
the GPT can do for you they integrated
this building block approach into the
GPT Builder itself seriously I don't
want to brag here but it's so cool to
see that the channel is months ahead of
these feature roll outs and I teach you
techniques that they later on Implement
I mean this has been the case with the
prompt templates that I released in
December 2022 for ChatGPT it took over a
year but now they have these buttons and
Anthropic has their prompt library and
they're all set up based on use case
with variables that you can change then
the emphasis on custom instructions now
the GPT Builder anyway just wanted to
inform you on this and I'll definitely
be creating tutorials on this once it
ships and just a reminder everybody will
have access to GPT soon as this model
including these GPTs and the new
features will be available to everyone
okay let's move along if you're using
the phone app the voice input still uses
the old Whisper so all of these voice
assistant features both the voice
input and the voice generation are the
old models Whisper or TTS-1 respectively
I suspect that this is the one that will
take the longest amount of time because
this comes packaged with the new iPhone
or Android app and the new Mac app and
yes there is actually no Windows app for
now there will only be a Mac app this
Mac app you can actually download
already I have it on my laptop but I do
not have access yet so I downloaded it
but when I log in it just tells me hey
you don't have access to this yet you'll
have to wait a little more and that's
what you need to know I'll keep you
updated on my Twitter and by the way one
more thing we're actually running a
challenge this week this is the first
time we're doing this where I'm
essentially challenging everybody in the
public everybody watching this video to
submit their favorite GPT-4o use case
and then the winner gets a yearly
subscription to the AI Advantage
Community where we do challenges like
this every single week so if you ever
wondered what people like you watching
these types of videos are doing with
something like ChatGPT 4o we essentially
created a crowdsource database of all
the different use cases that you could
be applying to your everyday life too oh
and one more thing for all of you
developers building with GPT-4o OpenAI
released this brand new and updated
cookbook this is how to implement the
API and use some of the new modalities
so if you're building with GPT-4o you
definitely want to check this out
there's some new things to be aware of
like the image processing etc it's all on
this page they put together I'll link it
below all right enough on this topic
let's move on to the next one here so we
clearly cover a lot of super interesting
and Cutting Edge Tech but a lot of the
tools that we show off can create
something incredible but it never leaves
the digital realm and that's why I'm
super excited to show you today's
sponsor Off Script they made it their
mission to actually take some of these
incredible Creations namely the visual
ones and they turn them into products
you heard that right you create
something with a tool like Midjourney and
then they make it their mission to turn
that into a physical product and they do
it by empowering creators and their
ideas so how does it work well they have
an iOS app that I'm going to show you now
briefly and basically there's two main
functionalities one is you can judge
other people's creations and decide if
this is worth turning into an actual
physical product like is it just me or
have you ever looked at these AI
generated images and you thought to
yourself wow it would be so cool to have
this in person and that's what you're
doing here you're basically swiping left
or right on these different mockups and
when it gets enough swipes to the right
they make it happen and you can purchase
these products so this is one aspect of
the app wow this jacket is amazing look at
that I could actually buy it right
now this is too much fun wow what about
this Medusa lamp this would look
fantastic in the background so you get
the point if a product gets enough votes
they partner up with the creator of that
idea and they take care of all the
design manufacturing and shipping now
here's the second part to the app and
that's the creation because you can
participate if I go to this middle part
you can actually generate brand new
products inside of this app and then
submit it and then other people can vote
on it and if it goes through and they
sell it you get a revenue share of the
final product and the whole thing here
is quality so a lot of these are not the
cheapest version of that product that
you can find but they sure are extremely
unique so let's just make a super quick
idea here happen I'll go over here I'll
pick something from the catalog let's
say we want a rug that could look good
in the background of the video and you
already know it we're going to do cats
with hats generated like so and I'm
just speedrunning this obviously you
would want to create more detailed
prompts for your Generations all right
this should do for a quick carpet and
after filling out these fields I can
submit this and now it's available in
the app and people can vote on my
concept and that's basically the whole
idea and all you need to do is download
the free iOS app log in with your Google
account for example and in seconds you
can be up and running and looking at
some of the AI generated products so I
personally think this is absolutely
amazing because Off Script is really
taking care of all the hard parts of
this process like designing it
manufacturing it shipping it selling it
marketing it all you need to do is you
need to come up with an interesting idea
and then get enough people to swipe
right on the idea and they'll take it
from there so if you ever had any
product idea why not take it the next
step and you can do that by downloading
the Off Script app today and they might
just bring your next idea to life all
right let's get back to the next piece
of AI news you can actually use okay now
that we talked about the releases from
OpenAI let's switch gears and talk about
Google's releases and look this is not
going to be a video summarizing all the
things they announced there's a lot of
interesting things in there if you're
interested in Ai and if you want to
explore what direction Google is taking
you can check out the full keynote but
this is news you can use these are the
releases that you can put to work today
so if you want to check out one thing it
would be Project Astra from Google
DeepMind it's basically their version of
what GPT-4o promises to be when the
voice assistant ships so I would
strongly recommend you check that out
but beyond that I have an exciting
freebie here for you because the number
one question I received with all of
these Google AI products is how am I
supposed to make sense of their entire
lineup there's like four different
versions of Gemini there's smaller
models there's Enterprise models they
have offerings across Google workspace
for private consumers for Enterprise
consumers there's developer interfaces
Google Search now uses AI it's included
in all their little apps and so on there
is just so much matter of fact I counted
it there's a total of 44 AI tools and
offerings that Google has right now so
what the AI Advantage team did here is
we actually went ahead and created a
full overview of all of their offerings
and we decided to give it out for free
so if you care to gain an overview of
everything you can check out this free
resource I will also link it below but
look at this basically here's an
overview of all the different Gemini
models what they do and how to use them
consumer products business facing
products business and developer facing
products all of their AI related
research projects new features that they
announced but that are not available yet
and we even compiled all of this into
infographics I might create a separate
video where I take you through the full
thing but for now here's the resource
you can check it out you can use it you
can share it with your friends and
family because they have a lot of
goodness when it comes to AI tools it's
just not very clear how it relates where
to find it and which ones are the tools
and offerings you might want to consider
for yourself but now let's talk about
what actually shipped from Google this
week because there are some things that
are available already and a wait list I
want to point you towards the one big
thing that shipped is a Gemini Advanced
updates and the main change here is that
they made their Gemini 1.5 Pro model
accessible through Gemini Advanced that
is their GPT-4o competitor that is
accessible through a simple web
interface and yes that does cost quite
$20 a month but it includes a million
tokens of context which is 1,500 Pages
versus GPT-4o that right now has 32,000
tokens of context good enough for most
use cases but here you get 50 times more
context now GPT-4o is better in most
other categories so I would usually
recommend that but if you want to upload
a thousand pages this is what you would want
to use they also expanded the
accessibility to many new countries by
the way this is a common thing amongst
many tools I'll show you some others
that did the same thing throughout this
week OpenAI really pushed them to do
that but again this is shipped to Gemini
Advanced and the big thing here is that
it supports document uploads meaning
that if you have some business use cases
where you want to give it a lot of data
and then talk to it or rework it into
other formats the Gemini 1.5 Pro model
inside of Gemini Advanced is the
simplest user interface I know of today
if you want to do it with a thousand pages now
I do have to point out that usually this
doesn't work as well as people expect
because the data needs to be labeled you
can't just dump all your info in there
and expect the AI to make sense of it it
needs some context by the way if you're
familiar with fact this is the same
problem there you can't just give it
everything it won't make sense of it a
little tip that I learned from building
chatbots is that the best thing you can
do as a beginner is actually restructure
the data into question and answer pairs
but that might take a lot of work if you
have 1 thousand Pages oh and just to
round this out one more very important
fact is that they actually offer a 2-month
free trial now with the 1 million token
size in this web interface and you can
upload Google Docs and PDFs to the model
now plus one of the announcements was
that there's going to be a 2 million
token size window though meaning you're
going to be able to add 3,000 Pages I'm
not exactly sure who was asking for that
at this point but there you go that will
be coming down the line so Google
definitely making some moves but most of
the things they announced were simply
announcements they weren't shipped
products but one of these was really
exciting it was Google's new video model
a direct competitor to OpenAI's Sora
Runway Gen-2 Pika Labs or all the other video
models now they call it Veo and look the
quality was not on Sora level that's the
simplest way to express it it's very
good it seems to be better than all the
other generators but even the examples
that they showed off which will
obviously be Cherry Picked those will be
the best of the best they weren't on the
level of Sora examples that we also
don't have access to so a right now is
just a space that has a lot of promise
but we don't have access to the very
best tools the ones we have are kind of
meh but why am I bringing this up
because they opened up a wait list for
this very tool so you can head on over
to this link as per usual it's linked in
the description below and you can
actually sign in here with Google pick
your country and you will be added to
the wait list of this brand new video
tool and let me tell you from experience
once Google does a wait list they're
usually pretty fast to roll these out so
I would expect this to be days or weeks
and not months but again that is just my
estimation based on all the other Google
AI tool wait lists that I've been on
before and if this releases over the
next weeks they will have the best video
model in the entire space until Sora
comes out so consider yourself informed
sign up to the wait list and just one
last quick note about this website it's
actually an incredible website we
covered this on this exact show a few
months back when it released this is
what they call their AI Test Kitchen and
it Harbors a bunch of amazing creative
tools some of them are super unique like
text effects that allows to create
alliterations and explode words and
acronyms it's really good for lyricists
or anybody who wants to juggle around
words in a creative way but as you can
see on screen right now I don't have my
VPN activated meaning this won't work as
I'm sitting in Europe but if you're into
creative and fun things with AI I highly
recommend you revisit this although this
came out a few months ago it's a really
fun way to explore AI capabilities and
completely free okay and there is one
more thing that Google actually released
this week and it's this brand new Gemini
1.5 Flash model you can access it
through a site that Harbors a lot of new
chatbots and models like Poe and if you
watched last week's episode you will
know that there is this new website that
actually benchmarks the speed of these
different models so if what you care
about is speed this is usually relevant
for developers then this site ranks them
and this new flash model ranks above the
1.5 Pro model that is in Advanced see
how confusing this naming gets that's
why we create the resource check that
out there it should make more sense but
yeah this flash model is speedier than
the pro model by quite a bit but look
these two models are down here but if we
look at the new OpenAI GPT-4o model that
you can access freely that ranks up here
it's twice as fast as their Flash
model so yeah there you go Google
announced a lot of interesting things a
lot of inspiring things that get me
excited about the future but when it
comes to what has been released this
week OpenAI does take the crown and
I did a little survey on the YouTube
channel you might have seen it at this
point over 700 people voted and I asked
which one of these announcements did you
find more interesting or exciting and
OpenAI just won by a landslide
because of some of the points that I
just showed you when it comes to what we
can use today OpenAI is the clear winner
but I do have to say I'm impressed by
what Google is doing it seems like
they're pulling all of the different
strings together and it's just clear
that they have all the ingredients to
compete for the number one spot in this
race but only time will show and I'll be
here covering it just like I'll be
covering the next update here which is
the fact that Anthropic actually shipped
their model to the entire world now so
all the European users can finally use
Claude 3 just as a refresher OpenAI's
GPT-4o Google's Gemini Advanced and
Claude's Opus model are considered the
three best AI models available today and
all of this competition between OpenAI
and Google actually pushed them to ship
this to the entire world which makes me
wonder what's up with all these
limitations on release usually all these
tools come out and it's not accessible
in the European Union the UK and a few
more countries but now that the
competition releases their tools to
everybody they do it too I don't know I
don't fully understand that maybe
somebody can clarify in the comments I
thought it was like a legal barrier that
is unsurmountable but apparently it's
not that hard to ship these things so
both in the iOS app and in the web
version no Android app available yet
unfortunately you can use this from all
around the world now but yeah now that
GPT-4o is better at vision and free why
should you pay $20 per month for an
inferior model that is slower to be fair
some people do like the writing style of
Claude but I think that would pretty much
be the only reason and talking about AI
models that are inferior to GPT-4o but
now open up to the European Union hey I
now have access to Twitter's Grok
without using a VPN so to update you on
this one it's pretty much a consensus
across the entire space that there's no
real reason to use this over some of the
top models especially now that GPT-4o is
free again I can't overstate how bold of
a move that was by them but the one
thing that Grok does really well is that
it actually pulls in the Twitter feed so
it's super up to date it doesn't need to
browse the web it pulls in the Twitter
feed and it's just aware of all the
latest happenings in the world as
Twitter is the place where a lot of news
breaks or arrives at first and has
access to that but the model really is
not that great in every conversation
around the best AI it usually doesn't
even come up and that's for a reason so
yeah that's what happened in the land of
LLMs for this week let's move on to the
next category which is other companies
that came out with interesting Updates
this week and they were completely
overshadowed by all of these massive
announcements between Google and OpenAI
but this one is actually really
interesting this is Stable Artisan by
Stability AI the company behind Stable
Diffusion and what they did is they
created a Discord interface where they
actually did something surprising which
is pulled together multiple models they
have so they took their different image
generation models their video generation
model and their music generation model
and you can access all of this through
one interface in Discord so look in
practice it's very similar to Midjourney
but it has the ability to create
videos and sounds too and look before we
give this a shot and try this live here
I just want to point out this is a paid
tool it starts at $9 a month very
similar to Midjourney but you do get
three days for free if you just want to
try this out just watch out they do make
you commit with a credit card and then
it just auto-renews after the three
days and for that you get 900 credits
and these get used up as you use the
tool so obviously generating video will
take up more credits 20 as you can see
versus using Stable Diffusion XL
which is around half a credit and yes
this also includes access to Stable
Diffusion 3 which they recently
released this is their best model but it
does cost six credits per generation oh
and one more thing that I should point
out here is that upscaling is 25 credits
which is quite a bit so just be careful
with upscaling only do that on pictures
that you actually want to use and with
all that being said let's get into this
and here's an important note if you want
to use this tool that as of now it only
works in Discord you also need to sign
up and subscribe with your Discord
account and once you do that you can
head on over to the Stable Diffusion
Discord server go into one of these
Artisan rooms and say /dream
instead of /imagine in Midjourney and
you already know what we're going to
prompt first cat with a hat let's go
let's see what this gets us here with
Stable Diffusion 3 all right very
nice I like this first one and then we
can keep working with this as I
mentioned this is a combination of
multiple tools so let me do some out
painting on all sides where I add more
cats with hats on all sides excellent
excellent okay that didn't work let me
just try some of the other features here
let's turn this into a video and try the
creative upscaling tool okay and we have
to prompt it while upscaling so let me
just repeat the prompt here keep the
creativity at the default setting again
this is just a first look here okay and
let's review the video that it created
here yeah there you go this is a typical
stable diffusion video where it's a
slight motion well and then at certain
points it just morphs into unusable
things but if you want a very slight
animation on something that's
where this actually works it's just yeah
it is what it is but the upscaler on the
other hand look at that this looks
excellent the original image at 350
kilobytes over here and then the new
upscaled version at 2.5 megabytes over here
wow look at the difference yeah day and
night so look I think it's really nice
that they combined all of these tools in
one interface obviously stable video is
what it is if you're familiar with the
tool it's just not that great but the
upscaler here is actually really
impressive and it's really convenient to
have all of this in one interface so if
you're looking to create many of these
this is probably the most efficient
workflow you can have with all of the
tools including upscaling and video
generation in one chat interface and
look even though it is Discord it is the
most user-friendly way to generate these
rather than having multiple websites and
having to download and re-upload files
across the place to generate videos it's
just a welcome addition that brings
together multiple of their tools one
thing that I am missing here is the
audio that they promised on the sales
page right after a little review they
actually did not promise the audio in
the blog post but it is included in the
announcement video so that's probably coming
soon and while on the topic of AI audio
I just quickly want to point you towards
this announcement by ElevenLabs they're
working on a music model which is not
available today it's just too good not
to show off just listen to this as
they're super good at recreating voices
their rap abilities are best in class
have a quick
listen was sh the Paradigm boldly
advancing no fearing Prime I don't know
about you but to me this does pass the
Turing test yes this sounds like a
human being yet again it's not available
today I just wanted to bring it up as
we're talking about audio okay and I got
one more tool for you this week and that
is this tool called IC-Light which
comes with a Hugging Face Space so you can
really easily try it and basically this
allows you to relight images with AI so
basically input something like this and
say sunset over sea and it changes it into
this this is not just image generation
but we're starting to get image editing
capabilities with AI few more examples
of a husky ah I love
huskies turned into a sci-fi RGB
glowing magically lit husky or even
better turning something simple like
this into a beauty photo shoot now let
me briefly try this myself cuz these
examples are usually cherry-picked let's
take this high quality Instagram worthy
picture here and let's use one of the
prompts that they use in their examples
as I want to keep it fair I don't want
to switch it up too much I'm just going
to change the first part to man and then
keep everything on the default settings
and I'll just say relight I'm super
curious to see what we get here first
try no editing no two takes okay 10
seconds later okay that's not that bad
I'll slightly vary The Prompt and run it
one more time not bad look at that it
put me into a forest it adjusted the
lightness and the colors of the image to
actually fit it it perfectly color-matched
it I actually really like this result so
look at that this is just a demo but
soon we will have these tools built into
interfaces like we saw with Stable Artisan
bringing it all together and when we
combine something like this with GPT-4o's
new image generation capabilities
you're not going to need Photoshop for
most use cases anymore it's going to
generate exactly what you want with the
correct text with the character
consistency of it just by uploading one
image of yourself and then you're going
to be able to relight it with tools like
this that eventually will all be baked
into one tool hm the future is going to
get interesting to say the very least
and with that being said I hope you have
a great day I'll see you soon