These AI Use Cases Will Affect Everyone You Know

The AI Advantage
17 May 202424:15

Summary

TLDRThis week in AI brought a flurry of updates, with OpenAI's GPT-4 leading the charge, offering significant improvements over its predecessor. The model promises multimodal capabilities, faster processing, and a voice assistant with emotion detection. While many features are yet to come, some are already available for free users, including the new image generation capabilities. Google also made strides with its AI offerings, including the release of Project Astra and Gemini Advanced updates. Other companies like Stability AI and Hugging Face introduced new tools for image and video generation, while 11 Labs teased their upcoming music model. The summary highlights the rapid advancements and accessibility of AI technologies that are shaping the future of content creation and beyond.

Takeaways

  • 📈 **GPT-4 Release**: OpenAI's new model, GPT-40, surpasses GPT-4 in many aspects, including speed, cost, and capabilities. It's currently available to paid users and is being rolled out to free users.
  • 🆓 **Free Access**: GPT-40 is being made freely accessible to all users, which is a significant move by OpenAI, allowing everyone to utilize advanced AI capabilities.
  • 🖼️ **Image Generation Updates**: Improvements to image generation capabilities include text generation, one-shot fine-tuning, and character consistency for creating comics or storyboards.
  • 📈 **Performance Benchmarks**: GPT-40's vision model is leading in benchmarks, outperforming other models like Opus and Gemini Ultra.
  • 🔄 **Web Interface Enhancements**: Web browsing and code interpreter have been improved for faster iterations and multiple generations creation.
  • 🚀 **GPT-40's Multimodal Features**: Users can now upload images to engage with the new multimodal GPT-40, leveraging its advanced capabilities.
  • 🔗 **New GPT Builder Features**: OpenAI has integrated a building block approach into the GPT Builder, allowing for easier creation of specialized versions of GPT called gpts.
  • 📱 **Voice Input and Output**: The phone app still uses the old Whisper model for voice input and text-to-speech, with no immediate update to the new models.
  • 📚 **Google's AI Announcements**: Google has released several AI tools, with Project Astra being a notable mention, though most are not yet available for use.
  • 🌐 **Global Access**: Anthropic's model, Claude, is now accessible worldwide, increasing competition in the AI market.
  • 🎨 **Stable Artisan by Stability AI**: A new Discord interface that combines multiple models, including image, video, and music generation, into one user-friendly platform.
  • 🌟 **Icy Light Tool**: An AI tool for relighting images, showcasing the potential for AI in image editing and generation, which may soon replace traditional tools like Photoshop for many tasks.

Q & A

  • What is the main focus of the video script?

    -The main focus of the video script is to discuss the latest AI developments and releases from companies like OpenAI and Google, highlighting tools and features that are currently available for use.

  • What does the term 'AI news you can use' refer to in the context of the script?

    -The term 'AI news you can use' refers to the practical applications and immediate usability of the AI advancements discussed in the script, as opposed to announcements of future developments.

  • What is GPT 40 and why is it significant?

    -GPT 40 is OpenAI's new model that outperforms GPT-4 in various aspects, such as speed and cost. It is significant because it offers new capabilities like a human-like voice assistant, multimodality, and emotion detection, which are groundbreaking in the field of AI.

  • How can users access GPT 40 currently?

    -As of the script's recording date, GPT 40 is accessible to paying users on chat.open.com. It is also being rolled out to free users, with some already reporting access.

  • What improvements have been made to the image generation capabilities in the new AI models?

    -The new AI models have improved image generation capabilities, including text-to-image generation, one-shot fine-tuning, character consistency for creating comic strips or storyboards, and an upload feature for engaging with the new multimodal GPT 40.

  • What is the current status of the GPT 40's specialized versions called gpts?

    -As of the script's recording, the specialized versions of GPT 40, known as gpts, still run on GPT 4. However, there are screenshots indicating a new module for building gpts with added blocks and states.

  • What is the significance of the Mac app mentioned in the script?

    -The Mac app mentioned in the script is significant because it represents a new interface for accessing AI tools. However, the script notes that access to certain features, like the new GPT 40, may still be restricted until further updates.

  • What is the AI Advantage Community and how does it relate to the script?

    -The AI Advantage Community is a subscription-based service that offers challenges and resources related to AI. In the script, the community is mentioned as offering a yearly subscription as a prize for a challenge to submit favorite GPT 40 use cases.

  • How does the script address the topic of Google's AI announcements and releases?

    -The script addresses Google's AI announcements by focusing on the releases that are currently available for use, such as Project Astra and Gemini Advanced updates. It also provides a free resource to help users navigate Google's extensive lineup of AI tools.

  • What is the significance of the new Gemini 1.5 flash model released by Google?

    -The new Gemini 1.5 flash model is significant because it is faster than the 1.5 pro model and ranks highly in terms of speed for AI models. It is accessible through a site that hosts various new chatbots and models, indicating advancements in Google's AI capabilities.

Outlines

00:00

🤖 AI News and GPT 4.0 Updates

The content creator discusses the overwhelming pace of new AI releases, focusing on practical tools from Google Open AI and other companies. The highlight is GPT 4.0, a significant upgrade from its predecessor, boasting enhanced capabilities, speed, and cost-effectiveness. The model includes a human-like voice assistant and multimodal features. The creator provides updates on GPT 4.0's availability, noting that paid users have access, and a rollout to free users has begun. They also mention improvements in image generation and character consistency, with a new upload feature in the chat interface offering the latest model's capabilities. However, certain features like GPT 4.0's integration with gpts are not yet available.

05:00

🔍 GPT Builder and Google's AI Announcements

The script shifts to discussing the GPT Builder, which now incorporates a building block approach that the content creator had previously highlighted. The creator expresses excitement about the new features and plans to create tutorials once they are released. The discussion then moves to Google's AI announcements, with a focus on Project Astra, which is likened to the future promise of GPT 4.0's voice assistant. The creator also mentions a free resource provided by the AI Advantage team, which offers an overview of Google's extensive AI tools and offerings, including new features and research projects.

10:01

🎨 Google's AI Creative Tools and Off Script Sponsor

The content creator introduces Off Script, a sponsor that turns digital creations into physical products. They describe the app's functionality, which allows users to both judge others' creations for potential physical production and generate new products for community voting. If a product is successful, the creators earn a revenue share. The script also touches on Google's new video model, 'vo', which, while not at the level of some competitors, shows promise and is available for users to sign up and try through a waitlist.

15:02

🌐 Global Accessibility of AI Models

The script addresses the global release of AI models, noting that competition between Open AI and Google has led to wider accessibility, including in Europe. The creator expresses confusion over previous limitations, suggesting that legal barriers may not be as restrictive as thought. They also mention the release of Anthropic's model, which is now available worldwide, and discuss the relative merits of different AI models, including Google's Gemini and Twitter's grock, in comparison to Open AI's GPT 4.0.

20:03

🎭 Other Companies' AI Updates and Stable Artisan

Despite being overshadowed by Google and Open AI's announcements, other companies have released interesting AI updates. Stability AI, known for stable diffusion, has launched a Discord interface that integrates multiple models, including image, video, and music generation. The tool, called Stable Artisan, offers a convenient workflow for creators, although it lacks the promised audio feature at the time of the script. The creator also mentions 11 Labs' work on a music model and introduces Icy Light, an AI tool for relighting images, suggesting a future where Photoshop may not be necessary for many image editing tasks.

Mindmap

Keywords

💡Content Creator

A content creator is an individual or entity responsible for producing various forms of content, such as videos, articles, or other digital media. In the context of the video, the speaker identifies as a content creator who is particularly busy due to the rapid pace of new AI-related releases, which they must cover and explain to their audience.

💡AI News

AI News refers to the latest information, updates, and developments in the field of artificial intelligence. The video's theme revolves around providing AI news that viewers can immediately apply or utilize, setting the stage for the discussion of new AI tools and models introduced by companies like Google and OpenAI.

💡GPT-4

GPT-4, or GPT 40 as mentioned in the transcript, is a new model developed by OpenAI that surpasses its predecessor, GPT-3, in various capabilities. It's highlighted as a significant release due to its enhanced speed, cost-effectiveness, and multimodal capabilities, including voice assistance with emotion detection.

💡Multimodal

In the context of AI, multimodal refers to systems that can process and understand multiple types of input data, such as text, images, and voice. The script discusses GPT-4's multimodal capabilities, meaning it can handle various forms of data, making it a versatile tool for different applications.

💡Image Generation

Image generation is the process by which AI algorithms create visual content based on textual descriptions or other prompts. The video discusses advancements in image generation, particularly improvements in text-to-image capabilities and one-shot fine-tuning, allowing users to generate images in various styles with a single example.

💡Benchmarks

Benchmarks are standards or points of reference used to evaluate the performance of a system or model. In the script, benchmarks are mentioned to compare the performance of different AI vision models, with GPT-4 being recognized as the best-in-class model as of the video's recording date.

💡API

API stands for Application Programming Interface, which is a set of rules and protocols that allows different software applications to communicate with each other. The video mentions an updated cookbook for developers working with GPT-4, which includes instructions on how to implement the API and utilize new modalities.

💡Waitlist

A waitlist is a list of people or entities waiting to gain access to a product, service, or feature. The script encourages viewers to sign up for waitlists to gain early access to new AI tools, such as Google's video model, which is not yet available to the public.

💡AI Advantage Community

The AI Advantage Community appears to be a subscription-based group or platform that offers challenges, resources, and community engagement around AI tools and applications. The speaker mentions it as a place where viewers can submit their favorite GPT-4 use cases and potentially win a yearly subscription.

💡Discord Interface

A Discord interface refers to a method of interaction within the Discord platform, which is a popular communication app designed for creating communities. The script discusses the introduction of a Discord interface by Stability AI, allowing users to access multiple AI models, including image, video, and music generation, through a unified chat-based system.

Highlights

GPT 40, OpenAI's new model, outperforms GPT-4 in almost all aspects, including speed and cost.

GPT 40 introduces a human-like voice assistant capable of detecting and expressing emotions.

GPT 40 is multimodal, offering a wide range of new capabilities.

A summary video of GPT 40's most important points is available, along with a compilation of groundbreaking use cases.

As of May 16th, 2024, GPT 40 is accessible to paying users, with a rollout to free users underway.

OpenAI's image generation capabilities have been massively improved, allowing text generation and one-shot fine-tuning.

Character consistency feature allows for the creation of comic strips or series with uploaded images.

Vision understanding benchmarks show GPT 40 as the new best-in-class vision model.

Web browsing and code interpreter improvements make iterations faster and more efficient.

GPT 404, specialized versions of GPT, are not yet available for use.

A new module for building GPTs with a building block approach has been added to the GPT Builder.

Off Script's iOS app allows users to turn AI-generated images into physical products.

Google's Project Astra is a voice assistant similar to the promised capabilities of GPT 40's voice assistant.

Google has released a comprehensive overview of their AI offerings, including 44 different AI tools.

Google's new video model, called 'vo', is a competitor to other video models like Runway Gen 2 and Sorá.

Anthropic's model, Claude, is now available for use worldwide, including in the European Union.

Stable Artisan by Stability AI offers a Discord interface combining multiple models for image, video, and music generation.

11 Labs is developing a music model that recreates voices and rap abilities with high fidelity.

ICY Light by Hugging Face allows users to relight images with AI, offering new image editing capabilities.

Transcripts

play00:00

okay this is one of those weeks in the

play00:01

ipace where as a content creator you

play00:03

don't get much sleep because there's

play00:05

just new stuff coming out every single

play00:08

day there are so many new releases but

play00:11

as this is AI news you can use we will

play00:13

only focus on the ones that you can

play00:15

actually put to work today which as a

play00:17

matter of fact is not that many most of

play00:19

the open Ai and Google IO announcements

play00:21

are things that are coming in the future

play00:23

nevertheless me and my team compiled all

play00:25

the different releases from this week

play00:26

from both Google open Ai and other

play00:28

companies and it turn out that there are

play00:30

some weit list that you should know

play00:31

about and sign up to so you can use some

play00:34

of these tools as early as possible and

play00:35

with that being said let's just dive

play00:37

into all of this AI Madness that came

play00:39

out this week ranging from GPT 40 to new

play00:41

video generators by Google and some

play00:43

Nifty new hugging face spaces that were

play00:45

overshadowed by the big announcements

play00:47

but nevertheless you should know about

play00:49

them all right so first things first

play00:51

this was by far the biggest one this

play00:52

week GPT 40 open eyes brand new model

play00:56

that beats gp4 on pretty much everything

play00:58

it's way faster it's way cheap bread has

play01:00

new capabilities there's a voice

play01:01

assistant coming that is humanlike a ly

play01:05

about Majestic potatoes now that's what

play01:08

I call a mashup andt can detect and

play01:10

express emotions this model is

play01:12

multimodal there is so much to talk

play01:14

about here and I absolutely did if you

play01:17

didn't see them there's two separate

play01:18

videos that I created this week about

play01:20

this announcement the first one focusing

play01:22

on a summary of all the most important

play01:24

points I'll link it on top right now and

play01:25

then a second one which compiling that

play01:27

one was absolute Madness me and my team

play01:29

collected all all the use cases that we

play01:31

considered really groundbreaking and put

play01:33

all of them into video that one you can

play01:34

also check out on the channel but beyond

play01:36

those two videos there have actually

play01:37

been developments with the model because

play01:39

the most confusing part of the release

play01:41

was what is available today what is

play01:44

coming up and as this is news you can

play01:45

use I see it as my responsibility to

play01:48

keep you up to date on which parts of

play01:50

this massive release you could be

play01:51

putting to work today and which parts

play01:53

you will be using in the future and in

play01:55

that spirit I actually created a tweet

play01:57

here summarizing where we're at with the

play01:58

roll out of this release as of today May

play02:01

16th 2024 by the way I will keep posting

play02:05

on Twitter when things change I will

play02:06

keep updating this so if you want the

play02:08

fresh updates just follow the Twitter at

play02:10

the advantage but as of now if you go to

play02:12

chat. open.com if you're a paying user

play02:14

you will find that you have access to

play02:16

GPT 40 this is the first point virtually

play02:18

all paid users that I know have access

play02:21

to this model as of today but they also

play02:23

announced that it's coming to all free

play02:24

users and this roll out seemingly

play02:26

started a few hours ago I caught the

play02:28

first few comments on my videos and the

play02:30

first few people on Twitter are

play02:31

reporting that they can actually use GPT

play02:33

40 as a completely free user this still

play02:36

blows my mind I haven't wrapped my head

play02:37

around the fact that every single person

play02:39

on this planet is going to be able to

play02:41

access GPT 4 heck better than gp4 a

play02:44

multimodal gp4 that has access to gpts

play02:47

and so on for free might take a while

play02:50

till everybody gets it but this is

play02:52

underway next up if you're using it the

play02:54

image generation in it is still D free

play02:57

now in the use Keys video we look at all

play02:58

the image generation capabilities and

play03:00

they are massively improved it's

play03:01

incredible what they added in there it

play03:03

can now do text meaning you can generate

play03:05

a full font for yourself or write text

play03:07

on various images like so and they also

play03:09

show of capabilities where it does one

play03:11

shot fine-tuning meaning you give it one

play03:13

picture of yourself and you can recreate

play03:15

that in any style this is absolutely

play03:16

mindblowing when you pair it with the

play03:18

fact that it can also do character

play03:20

consistency meaning you can upload one

play03:22

image of yourself recreate that in a

play03:23

different style and then create a whole

play03:25

comic strip you can create a whole

play03:27

series you can tell a story create a

play03:28

storyboard and to our before this

play03:30

recording Greg Brockman actually started

play03:31

tweeting about this feature meaning this

play03:33

is probably on its way but as of now

play03:35

it's still the old model D free so if

play03:37

you're going to be testing this in the

play03:38

chat GP interface you're going to find

play03:40

it's still the old model that is kind of

play03:41

meh meh okay but what does work as of

play03:44

today is this upload feature so if you

play03:45

upload images to GPT 40 you will be

play03:48

engaging with the new multimodal GPT 40

play03:50

and you will get all the improved

play03:51

capabilities if you check out the

play03:53

benchmarks on Vision understanding this

play03:55

is the new best-in-class vision model it

play03:57

beats Opus it beats Gemini Ultra it

play03:59

beats gp4 on pretty much all of these

play04:01

benchmarks plus as a power user I can

play04:03

confirm it is the best Vision model that

play04:05

we have today and it is available in the

play04:07

web interface today and for the last

play04:09

ones I'll just speedrun this we already

play04:11

have improved web browsing and code

play04:12

interpreter available today they made

play04:14

under the hood improvements to these but

play04:16

the main thing is that they're super

play04:17

fast now so it's easy to iterate and

play04:19

create multiple Generations whereas

play04:21

before it took forever one thing that is

play04:23

not here yet is GPT 404 gpts and this is

play04:26

surprising to me to be honest seems like

play04:27

an easy thing to implement but if you

play04:29

have G gpts that you use for specific

play04:31

tasks these still run on GPT 4 as of

play04:33

today okay wait a minute 12 hours passed

play04:35

since I recorded that segment and

play04:37

actually some screenshots surfaced on

play04:39

Twitter or X from Jeremy here that

play04:40

shares that there's a new module when

play04:42

you build gpts so this wasn't announced

play04:45

we didn't know of this and I don't have

play04:46

it neither does anybody else but it is

play04:48

an interesting preview look basically

play04:50

when you create these specialized

play04:51

versions of chat GPT called gpts you

play04:54

have this interface where you create

play04:55

them and there's a new button at the

play04:56

bottom where you can add blocks and

play04:58

States and it's so interesting that they

play05:00

added this to me because when I teach

play05:02

building gpts matter of fact when the

play05:03

GPT store came out I created a video

play05:05

outlining how to build a GPT with just

play05:07

one prompt and the entire prompt was

play05:09

based on building blocks these are

play05:11

different blocks with modalities that

play05:13

the GPT can do for you they integrated

play05:15

this building block approach into the

play05:17

GPT Builder itself seriously I don't

play05:20

want to brag here but it's so cool to

play05:21

see that the channel is months ahead of

play05:23

these feature roll outs and I teach you

play05:25

techniques that they later on Implement

play05:27

I mean this has been the case with the

play05:28

prompt templates that I released in

play05:30

December 2022 for cat GPT it took over a

play05:33

year but now they have these buttons and

play05:35

Fric has their prompt library and

play05:36

they're all set up based on use case

play05:38

with variables that you can change then

play05:40

the emphasis on custom instructions now

play05:41

the GPT Builder anyway just wanted to

play05:43

inform you on this and I'll definitely

play05:45

be creating tutorials on this once it

play05:47

ships and just a reminder everybody will

play05:49

have access to GPT soon as this model

play05:51

including these gpts and the new

play05:52

features will be available to everyone

play05:54

okay let's move along if you're using

play05:55

the phone app the voice input still uses

play05:57

the old whisper so all of these voice

play05:59

assist assistent features both the voice

play06:00

input and the voice generation are the

play06:03

old models whisper or tts1 respectively

play06:06

I suspect that this is the one that will

play06:08

take the longest amount of time because

play06:10

this comes packaged with the new iPhone

play06:11

or Android app and the new Mac app and

play06:14

yes there is actually no windows app for

play06:16

now there will only be a Mac app this

play06:18

Mac app you can actually download

play06:19

already I have it on my laptop but I do

play06:22

not have access yet so I downloaded it

play06:24

but when I log in it just tells me hey

play06:25

you don't have access to this yet you'll

play06:26

have to wait a little more and that's

play06:28

what you need to know I'll keep you

play06:29

updated on my Twitter and by the way one

play06:30

more thing we're actually running a

play06:32

challenge this week this is the first

play06:33

time we're doing this where I'm

play06:34

essentially challenging everybody in the

play06:36

public everybody watching this video to

play06:37

submit their favorite GPT 40 use case

play06:40

and then the winner gets a yearly

play06:41

subscription to the AI Advantage

play06:42

Community where we do challenges like

play06:44

this every single week so if you ever

play06:46

wondered what people like you watching

play06:48

these types of videos are doing with

play06:50

something like cat gbt 40 we essentially

play06:52

created a crowdsource database of all

play06:54

the different use cases that you could

play06:55

be applying to your everyday life too oh

play06:57

and one more thing for all of you

play06:59

Developers is building with GPT 40 open

play07:01

I released this brand new and updated

play07:02

cookbook this is how to implement the

play07:04

API and use some of the new modalities

play07:07

so if you're building with gp40 you

play07:08

definitely want to check this out

play07:10

there's some new things to be aware of

play07:11

as the image processing Etc it's all

play07:13

this page they put together I'll link it

play07:14

below all right enough on this topic

play07:16

let's move on to the next one here so we

play07:17

clearly cover a lot of super interesting

play07:19

and Cutting Edge Tech but a lot of the

play07:21

tools that we show off can create

play07:22

something incredible but it never leaves

play07:24

the digital realm and that's why I'm

play07:26

super excited to show you today's

play07:27

sponsor of script they made it their

play07:29

mission to actually take some of these

play07:30

incredible Creations namely the visual

play07:32

ones and they turn them into products

play07:34

you heard that right you create

play07:35

something with a tool like M journey and

play07:37

then they make it their mission to turn

play07:39

that into a physical product and they do

play07:42

it by empowering creators and their

play07:44

ideas so how does it work well they have

play07:46

a IOS app that I'm going to show you now

play07:48

briefly and basically there's two main

play07:49

functionalities one is you can judge

play07:52

other people's creations and decide if

play07:54

this is worth turning into an actual

play07:56

physical product like is it just me or

play07:57

have you ever looked at these AI

play07:59

generated images and you thought to

play08:00

yourself wow it would be so cool to have

play08:02

this in person and that's what you're

play08:04

doing here you're basically swiping left

play08:05

or right on these different mockups and

play08:07

when it gets enough swipes to the right

play08:09

they make it happen and you can purchase

play08:11

these products so this is one aspect of

play08:13

the app W this jacket is amazing look at

play08:15

that I could actually buy it right

play08:17

now this is too much fun wa what about

play08:19

this Medusa lamp this would look

play08:21

fantastic in the background so you get

play08:23

the point if a product gets enough volts

play08:24

they partner up with the creator of that

play08:26

idea and they take care of all the

play08:28

design manufacturing and ship Shing now

play08:29

here's the second part to the app and

play08:31

that's the creation because you can

play08:32

participate if I go to this middle part

play08:34

you can actually generate brand new

play08:35

products inside of this app and then

play08:37

submit it and then other people can vote

play08:39

on it and if it goes through and they

play08:40

sell it you get a revenue share of the

play08:42

final product and the whole thing here

play08:44

is quality so a lot of these are not the

play08:45

cheapest version of that product that

play08:46

you can find but they sure are extremely

play08:49

unique so let's just make a super quick

play08:51

idea here happen I'll go over here I'll

play08:53

pick something from the catalog let's

play08:54

say we want a rock that could look good

play08:56

in the background of the video and you

play08:58

already know it we're going to do cats

play08:59

with have hats generated like so and I'm

play09:01

just speedrunning this obviously you

play09:03

would want to create more detailed

play09:04

prompts for your Generations all right

play09:07

this should do for a quick carpet and

play09:09

after filling out these fields I can

play09:10

submit this and now it's available in

play09:12

the app and people can vote on my

play09:13

concept and that's basically the whole

play09:15

idea and all you need to do is download

play09:16

the free IOS app log in with your Google

play09:18

account for example and in seconds you

play09:20

can be up and running and looking at

play09:21

some of the I generated product so I

play09:23

personally think this is absolutely

play09:24

amazing because Off Script is really

play09:26

taking care of all the hard parts of

play09:28

this process like designing it

play09:29

manufacturing it shipping it selling it

play09:31

marketing it all you need to do is you

play09:33

need to come up with an interesting idea

play09:35

and then get enough people to swipe

play09:36

right on the idea and they'll take it

play09:38

from there so if you ever had any

play09:39

product idea why not take it the next

play09:41

step and you can do that by downloading

play09:43

the offs script app today and they might

play09:44

just bring your next idea to life all

play09:46

right let's get back to the next piece

play09:47

of AI news you can actually use okay now

play09:49

that we talked about the releases from

play09:51

openi let's switch gears and talk about

play09:53

Google's releases and look this is not

play09:55

going to be a video summarizing all the

play09:57

things they announc there's a lot of

play09:58

interesting things in there if you're

play09:59

interested in Ai and if you want to

play10:01

explore what direction Google is taking

play10:03

you can check out the full keynote but

play10:05

this is news you can use these are the

play10:07

releases that you can put to work today

play10:09

so if you want to check out one thing it

play10:10

would be project Astra from Google deep

play10:13

mind it's basically their version of

play10:14

what GPT 40 promises to be when the

play10:16

voice assistant ships so I would

play10:18

strongly recommend you check that out

play10:20

but beyond that I have an exciting

play10:21

freebie here for you because the number

play10:23

one question I received with all of

play10:24

these Google AI products is how am I

play10:26

supposed to make sense of their entire

play10:28

lineup there's like four four different

play10:29

versions of Gemini there's smaller

play10:31

models there's Enterprise models they

play10:33

have offerings across Google workspace

play10:35

for private consumers for Enterprise

play10:36

consumers there's developer interfaces

play10:38

Google Search now uses AI it's included

play10:41

in all their little apps and so on there

play10:43

is just so much matter of fact I counted

play10:45

it there's a total of 44 AI tools and

play10:48

offerings that Google has right now so

play10:49

what the AI Advantage team did here is

play10:51

we actually went ahead and created a

play10:53

full overview of all of their offerings

play10:56

and we decided to give it out for free

play10:57

so if you care to gain an overview of

play10:59

everything you can check out this free

play11:01

resource I will also link it below but

play11:03

look at this basically here's an

play11:04

overview of all the different Gemini

play11:05

models what they do and how to use them

play11:07

consumer products business facing

play11:10

products business and developer facing

play11:12

products all of their AI related

play11:14

research projects new features that they

play11:16

announced but that are not available yet

play11:18

and we even compiled all of this into

play11:19

infographics I might create a separate

play11:21

video where I take you for the full

play11:22

thing but for now here's the resource

play11:24

you can check it out you can use it you

play11:26

can share it with your friends and

play11:27

family because they have a lot of

play11:28

goodness when it comes to AI tools it's

play11:30

just not very clear how it relates where

play11:32

to find it and which ones are the tools

play11:34

and offerings you might want to consider

play11:35

for yourself but now let's talk about

play11:37

what actually shipped from Google this

play11:39

week because there are some things that

play11:40

are available already and a wait list I

play11:42

want to point you towards the one big

play11:43

thing that shipped is a Gemini Advanced

play11:45

updates and the main change here is that

play11:47

they made their Gemini 1.5 pro model

play11:50

accessible through Gemini Advanced that

play11:52

is their GPT 40 competitor that is

play11:55

accessible through a simple web

play11:57

interface and yes that does cost quite

play11:59

$20 a month but it includes a million

play12:01

tokens of context which is 1,500 Pages

play12:04

versus GPT 40 that right now has 32,000

play12:06

tokens of context good enough for most

play12:08

use cases but here you get 50 times more

play12:10

context now GPT 40 is better in most

play12:12

other categories so I would usually

play12:14

recommend that but if you want to upload

play12:15

a th Pages this is what you would want

play12:17

to use they also expanded the

play12:19

accessibility to many new countries by

play12:21

the way this is a common thing amongst

play12:23

many tools I'll show you some others

play12:24

that did the same thing throughout this

play12:26

week open I really pushed them to do

play12:27

that but again this is shipped to Gemini

play12:29

Advanced and the big thing here is that

play12:30

it supports document uploads meaning

play12:33

that if you have some business use cases

play12:34

where you want to give it a lot of data

play12:35

and then talk to it or rework it into

play12:37

other formats the Gemini 1.5 pro model

play12:40

inside of Gemini Advanced is the

play12:42

simplest user interface I know of today

play12:45

if you want to do it with a th Pages now

play12:46

I do have to point out that usually this

play12:49

doesn't work as well as people expect

play12:51

because the data needs to be labeled you

play12:52

can't just dump all your info in there

play12:54

and expect the AI to make sense of it it

play12:56

needs some context by the way if you're

play12:58

familiar with fact this is the same

play13:00

problem there you can't just give it

play13:01

everything it won't make sense of it a

play13:03

little tip that I learned from building

play13:04

chatbots is that the best thing you can

play13:06

do as a beginner is actually restructure

play13:08

the data into question and answer pairs

play13:10

but that might take a lot of work if you

play13:11

have 1 thousand Pages oh and just to

play13:13

round this out one more very important

play13:15

fact is that they actually offer a 2mon

play13:17

free trial now with the 1 million token

play13:19

size in this web interface and you can

play13:22

upload Google Docs and PDFs to the model

play13:24

now plus one of the announcements was

play13:26

that there's going to be a 2 million

play13:28

token size window though meaning you're

play13:29

going to be able to add 3,000 Pages I'm

play13:32

not exactly sure who was asking for that

play13:33

at this point but there you go that will

play13:35

be coming down the line so Google

play13:37

definitely making some moves but most of

play13:39

the things they announced were simply

play13:41

announcements they weren't shipped

play13:42

products but one of these was really

play13:44

exciting it was Google's new video model

play13:46

a direct competitor to open AI Sora

play13:49

Runway Gen 2 pabs or all the other video

play13:52

models now they call it vo and look the

play13:55

quality was not on Sora level that's the

play13:58

simplest way to express it it's very

play14:00

good it seems to be better than all the

play14:01

other generators but even the examples

play14:03

that they showed off which will

play14:04

obviously be Cherry Picked those will be

play14:06

the best of the best they weren't on the

play14:08

level of Sora examples that we also

play14:10

don't have access to so a right now is

play14:12

just a space that has a lot of promise

play14:13

but we don't have access to the very

play14:15

best tools the ones we have are kind of

play14:17

me me but why am I bringing this up

play14:20

because they opened up a wait list for

play14:22

this very tools so you can head on over

play14:23

to this link as per usual it's linked in

play14:25

the description below and you can

play14:27

actually sign in here with Google pick

play14:29

your country and you will be added to

play14:31

the weit list of this brand new video

play14:33

tool and let me tell you from experience

play14:34

once Google does a weit list they're

play14:36

usually pretty fast to roll these out so

play14:38

I would expect this to be days or weeks

play14:40

and not months but again that is just my

play14:42

estimation based on all the other Google

play14:44

AI tool weight list that I've been on

play14:46

before and if this releases over the

play14:47

next weeks they will have the best video

play14:49

model in the entire space until Sora

play14:52

comes out so consider yourself informed

play14:55

sign up to the wait list and just one

play14:57

last quick note about this website it's

play14:58

actually an incredible website we

play15:00

covered this on this exact show a few

play15:01

months back when it released this is

play15:03

what they call their AI Test Kitchen and

play15:05

it Harbors a bunch of amazing creative

play15:08

tools some of them are super unique like

play15:09

text effects that allows to create

play15:11

alliterations and explode words and

play15:13

acronyms it's really good for lyricists

play15:15

or anybody who wants to juggle around

play15:16

words in a creative way but as you can

play15:18

see on screen right now I don't have my

play15:19

VPN activated meaning this won't work as

play15:22

I'm sitting in Europe but if you're into

play15:23

creative and fun things with AI I highly

play15:25

recommend you revisit this although this

play15:26

came out a few months ago it's a really

play15:28

fun way to explore AI capabilities and

play15:30

completely free okay and there is one

play15:32

more thing that Google actually released

play15:33

this week and it's this brand new Gemini

play15:35

1.5 flash model you can access it

play15:37

through a site that Harbors a lot of new

play15:39

chatbots and models like Po and if you

play15:42

watched last week's episode you will

play15:43

know that there is this new website that

play15:45

actually benchmarks the speed of these

play15:46

different models so if what you care

play15:48

about is speed this is usually relevant

play15:50

for developers then this site ranks them

play15:52

and this new flash model ranks above the

play15:54

1.5 pro model that is in advanced see

play15:57

how confusing this naming gets that's

play15:58

why we create the resource check that

play15:59

out there it should make more sense but

play16:01

yeah this flash model is speedier than

play16:03

the pro model by quite a bit but look

play16:06

these two models are down here but if we

play16:08

look at the new open a GPT 40 model that

play16:10

you can access freely that ranks up here

play16:13

it's twice as fast and their flash

play16:16

model so yeah there you go Google

play16:18

announced a lot of interesting things a

play16:19

lot of Inspira things that get me

play16:21

excited about the future but when it

play16:23

comes to what has been released this

play16:24

week opening ey does take the crown and

play16:26

I did a little survey on the YouTube

play16:28

channel you might have seen it at this

play16:29

point over 700 people voted and I asked

play16:32

which one of these announcements did you

play16:33

find more interesting or exciting and

play16:35

opening I just won by a landslide

play16:37

because of some of the points that I

play16:38

just showed you when it comes to what we

play16:39

can use today open eyes the clear winner

play16:41

but I do have to say I'm impressed by

play16:42

what Google is doing it seems like

play16:44

they're pulling all of the different

play16:45

strings together and it's just clear

play16:46

that they have all the ingredients to

play16:48

compete for the number one spot in this

play16:49

race but only time will show and I'll be

play16:52

here covering it just like I'll be

play16:53

covering the next update here which is

play16:55

the fact that anthropic actually shipped

play16:57

their model to the entire world now so

play16:59

all the European users can finally use

play17:01

cloud free just as a refresher open eyes

play17:04

GPT 40 Google Geminis Advanced and

play17:07

claud's Opus model are considered the

play17:09

fre best AI models available today and

play17:12

all of this competition between open a

play17:13

and Google actually push them to ship

play17:16

this to the entire world which makes me

play17:17

wonder what's up with all these

play17:18

limitations on release usually all these

play17:20

tools come out and it's not accessible

play17:21

in the European Union the UK and a few

play17:23

more countries but now that the

play17:25

competition releases their tools to

play17:26

everybody they do it too I don't know I

play17:28

don't fully understand that maybe

play17:30

somebody can clarify in the comments I

play17:31

thought it was like a legal barrier that

play17:33

is unsurmountable but apparently it's

play17:34

not that hard to ship these things so

play17:36

both in the IOS app and in the web

play17:37

version no Android app available yet

play17:40

unfortunately you can use this from all

play17:42

around the world now but yeah now that

play17:44

gp4 is better at vision and free why

play17:46

should you pay $20 per month for an

play17:48

inferior model that is slower to be fair

play17:51

some people do like the writing style of

play17:52

CLA but I think that would pretty much

play17:54

be the only reason and talking about AI

play17:56

models that are inferior to GPT 40 but

play17:58

now open up to the European Union hey I

play18:00

now have access to Twitter's grock

play18:01

without using a VPN so to update you on

play18:04

this one it's pretty much a consensus

play18:05

across the entire space that there's no

play18:07

real reason to use this over some of the

play18:09

top models especially now that GPT 40 is

play18:12

free again I can't overstate how bold of

play18:14

a move that was by them but the one

play18:15

thing that Gro does really well is that

play18:17

it actually pulls in the Twitter feed so

play18:19

it's super up to date it doesn't need to

play18:20

browse the web it pulls in the Twitter

play18:22

feed and it's just aware of all the

play18:24

latest happenings in the world as

play18:25

Twitter is the place where a lot of news

play18:27

breaks or arrives at first and has

play18:29

access to that but the model really is

play18:31

not that great in every conversation

play18:32

around the best AI it usually doesn't

play18:34

even come up and that's for a reason so

play18:36

yeah that's what happened in the land of

play18:37

llms for this week let's move on to the

play18:39

next category which is other companies

play18:41

that came out with interesting Updates

play18:42

this week and they were completely

play18:43

overshadowed by all of this massive

play18:45

announcements between Google and open AI

play18:47

but this one is actually really

play18:48

interesting this is stable Artisan by

play18:50

stability ey the company behind stable

play18:52

diffusion and what they did is they

play18:53

created a Discord interface where they

play18:55

actually did something surprising which

play18:57

is pulled together multiple models they

play18:59

have so they took their different image

play19:01

generation models their video generation

play19:03

model and their music generation model

play19:05

and you can access all of this through

play19:08

one interface in Discord so look in

play19:10

practice it's very similar to Mid

play19:11

Journey but it has the ability to create

play19:13

videos and sounds too and look before we

play19:16

give this a shot and try this live here

play19:17

I just want to point out this is a PID

play19:19

tool it starts at $9 a month very

play19:20

similar to my journey but you do get

play19:22

free days for free if you just want to

play19:23

try this out just watch out they do make

play19:25

you commit with a credit card and then

play19:27

it just Auto renews after a free day

play19:28

days and for that you get 900 credits

play19:30

and these get used up as you use the

play19:32

tool so obviously generating video will

play19:34

take up more credits 20 as you can see

play19:36

versus using stable diffusion Excel

play19:38

which is around half a credit and yes

play19:40

this also includes access to stable

play19:41

diffusion free which they recently

play19:43

released this is their best model but it

play19:45

does cost six credits per generation oh

play19:47

and one more thing that I should point

play19:48

out here is that upscaling is 25 credits

play19:50

which is quite a bit so just be careful

play19:52

with upscaling only do that on pictures

play19:53

that you actually want to use and with

play19:54

all that being said let's get into this

play19:56

and here's an important note if you want

play19:57

to use this tool that as of not only

play19:59

works in Discord you also need to sign

play20:00

up and subscribe with your Discord

play20:02

account and once you do that you can

play20:04

head on over to the stable diffusion

play20:05

Discord server going into one of these

play20:07

Artisan rooms and say slash dream

play20:10

instead of Slash imagin MJ journey and

play20:12

you already know what we're going to

play20:13

prompt first cat with a hat let's go

play20:16

let's see what this gets us here with

play20:17

stable diffusion free all right very

play20:19

nice I like this first one and then we

play20:21

can keep working with this as I

play20:23

mentioned this is a combination of

play20:24

multiple tools so let me do some out

play20:26

painting on all sides where I add more

play20:28

cats with hats on all sides excellent

play20:31

excellent okay that didn't work let me

play20:33

just try some of the other features here

play20:34

let's turn this into a video and to the

play20:36

creative upscaling tool okay and we have

play20:39

to prompt it while upscaling so let me

play20:41

just repeat the prompt here keep the

play20:42

creativity at the default setting again

play20:44

this is just a first look here okay and

play20:46

let's review the video that it created

play20:48

here yeah there you go this is a typical

play20:50

stable diffusion video where it's a

play20:52

slight motion well and then at certain

play20:54

points it just morphs into unusable

play20:55

things but if you want a very slight

play20:57

animation on something something that's

play20:59

where this actually works it's just yeah

play21:01

it is what it is but the upscaler on the

play21:03

other hand look at that this looks

play21:05

excellent the original image of 350

play21:07

kiloby over here and then the new

play21:09

upscale diin at 2.5 megabytes over here

play21:12

wow look at the difference yeah day

play21:14

night so look I think it's really nice

play21:16

that they combined all of these tools in

play21:17

one interface obviously stable video is

play21:19

what it is if you're familiar with the

play21:21

tool it's just not that great but the

play21:22

upscaler here is actually really

play21:24

impressive and it's really convenient to

play21:26

have all of this in one interface so if

play21:28

you're looking to create many of these

play21:30

this is probably the most efficient

play21:31

workflow you can have with all of the

play21:32

tools including upscaling and video

play21:34

generation in one chat interface and

play21:37

look even though it is Discord it is the

play21:38

most userfriendly way to generate these

play21:41

rather than having multiple websites and

play21:42

having to download and re-upload files

play21:44

across the place to generate videos it's

play21:46

just a welcome addition that brings

play21:47

together multiple of their tools one

play21:49

thing that I am missing here is the

play21:51

audio that they promised on the sales

play21:52

page right after a little review they

play21:54

actually did not promise the audio and

play21:56

the blog post but it is included in the

play21:58

announce video so that's probably coming

play22:00

soon and while on the topic of AI audio

play22:02

I just quickly want to point you towards

play22:03

this announcement by 11 Labs they're

play22:05

working on a music model which is not

play22:07

available today it's just too good not

play22:08

to show off just listen to this as

play22:10

they're super good at recreating voices

play22:12

their rap abilities are best in class

play22:14

have a quick

play22:20

listen was sh the Paradigm boldly

play22:24

advancing no fearing Prime I don't know

play22:26

about you but to me this does pass the

play22:28

touring test yes this sounds like a

play22:29

human being yet again it's not available

play22:31

today I just wanted to bring it up as

play22:32

we're talking about audio okay and I got

play22:34

one more tool for you this week and that

play22:35

is this tool called icy light which

play22:38

comes with a hugging face Bas so you can

play22:39

really easily try it and basically this

play22:41

allows you to relight images with AI so

play22:44

basically inut something like this and

play22:45

say Sunset over C and it changes it into

play22:47

this this is not just image generation

play22:49

but we're starting to get image editing

play22:51

capabilities with AI few more examples

play22:53

of a Husky a I love

play22:57

huskys turn turned into a Sci-Fi RGB

play23:00

glowing magically lit husky or youve

play23:03

better at turning something simple like

play23:04

this into a beauty photo shoot now let

play23:06

me briefly try this myself cuz these

play23:08

examples are usually cherry-picked let's

play23:10

take this high quality Instagram worthy

play23:13

picture here and let's use one of the

play23:15

prompts that they use in their examples

play23:16

as I want to keep it fair I don't want

play23:17

to switch it up too much I'm just going

play23:19

to change the first part to man and then

play23:21

keep everything on the default settings

play23:22

and I'll just say relight I'm super

play23:24

curious to see what we get here first

play23:26

try no editing no two takes okay 10

play23:29

seconds later okay that's not that bad

play23:31

I'll slightly vary The Prompt and run it

play23:33

one more time not bad look at that it

play23:35

put me into a forest it adjusted the

play23:37

lightness and the colors of the image to

play23:39

actually fit it it perfectly color match

play23:40

it I actually really like this result so

play23:42

look at that this is just a demo but

play23:43

soon we will have these tools built into

play23:46

interfaces like we saw with stable Artis

play23:48

and bringing it all together and when we

play23:49

combine something like this with GPT

play23:51

40's New Image generation capabilities

play23:54

you're not going to need Photoshop for

play23:56

most use cases anymore it's going to

play23:58

generate exactly what you want with the

play23:59

correct textt with the character

play24:01

consistency of it just by uploading one

play24:03

image of yourself and then you're going

play24:04

to be able to relight it with tools like

play24:07

this that eventually will all be baked

play24:08

into one tool hm the future is going to

play24:10

get interesting to say the very least

play24:12

and with that being said I hope you have

play24:13

a great day I'll see you soon

Rate This

5.0 / 5 (0 votes)

Etiquetas Relacionadas
AI InnovationsGPT-4Image GenerationVideo ModelsGoogle AIOpenAI UpdatesAI AssistantsTech GiantsMultimodal AIAI Applications
¿Necesitas un resumen en inglés?