GPT4o: 11 STUNNING Use Cases and Full Breakdown

Matthew Berman
17 May 2024, 30:55

Summary

TL;DR: The video script discusses the recent release of GPT-4o, an advanced AI model with impressive capabilities. It highlights the model's flirty voice, its ability to adjust tone based on context, and its potential for real-world applications. Examples include guessing scenarios, interacting with other AIs, tutoring in math, summarizing meetings, and providing real-time translations. The script also explores the model's use in accessibility, such as assisting the visually impaired, and in customer service, where it could handle calls on behalf of users. The potential for abuse is acknowledged, but the transformative impact on various sectors is clear, with the voice aspect of GPT-4o being a particularly exciting feature.

Takeaways

  • 🚀 GPT-4o has been announced and parts of it are already released; its voice capabilities, considered the most exciting part, are yet to be released.
  • 🎥 The model can interact with the world through audio, vision, and text, as demonstrated in a video where an AI guesses what's happening in a recording setup.
  • 🗣️ GPT-4o's voice has been described as flirty and can be adjusted through system prompts, with the ability to interpret and react to users' requests appropriately.
  • 🎤 Two AIs can interact with each other, as shown in a demo where they sing together, showcasing the model's ability to understand and respond in real-time.
  • 🕹️ The model can play games like rock-paper-scissors and distinguish between multiple people and voices, indicating its advanced recognition and interaction capabilities.
  • 📚 GPT-4o can be used for educational purposes, such as tutoring in math, by guiding students through problems without giving away the answers.
  • 📝 The AI can take part in meetings, understand the context, and summarize discussions, assigning names to voices and understanding their preferences or points.
  • 🌐 Real-time translation is another capability of GPT-4o: it can translate a live conversation between English and Spanish.
  • 🦆 The model can provide assistance to the visually impaired, describing surroundings and actions, thanks to its low latency and strong contextual understanding.
  • 🤖 GPT-4o can be used in customer service, potentially handling calls and interactions with service agents on behalf of users.
  • 🎨 The model has explorative capabilities in various fields, such as creating caricatures from photos, summarizing lectures, and synthesizing 3D objects.

Q & A

  • What is the main focus of the video script discussing GPT-4o?

    - The main focus of the video script is to delve into the details of the GPT-4o model and its capabilities, and to showcase various real-world use cases that demonstrate its potential applications.

  • What aspect of GPT-4o is highlighted as particularly exciting in the script?

    - The voice aspect of GPT-4o is highlighted as particularly exciting, as it allows the model to interact with users in a more natural and conversational manner.

  • How does the script describe the voice capabilities of GPT-4o?

    - The script describes GPT-4o's voice as able to interpret user prompts and adjust its tone and style accordingly, such as being flirty, whispering, or even sarcastic.

  • What is an example of a real-world use case for GPT-4o mentioned in the script?

    - One example mentioned is the use of GPT-4o for tutoring, where it helps a student understand a math problem by asking questions and guiding them to the solution.

  • How does GPT-4o handle the task of summarizing a meeting in the script?

    - GPT-4o is able to listen to the discussion, match voices to specific participants, understand their points, and then provide a summary of the meeting, including the main arguments and opinions expressed.

  • What is the potential application of GPT-4o in customer service as described in the script?

    - The script suggests that GPT-4o could be used to handle customer service calls on behalf of users, negotiating or resolving issues without the user needing to be present on the call.

  • How does the script address the potential for GPT-4o to be used in scams or unethical ways?

    - The script acknowledges the potential for misuse but suggests that OpenAI is likely implementing guardrails to prevent scammers from exploiting the technology. It also mentions the importance of how information is used, which depends on the users.

  • What is the role of GPT-4o in the example of real-time translation provided in the script?

    - In the real-time translation example, GPT-4o acts as an interpreter, translating spoken English into Spanish and vice versa, facilitating communication between two people who speak different languages.

  • How does the script illustrate the capability of GPT-4o in understanding and responding to visual cues?

    - The script shows GPT-4o being able to see the world through a camera, describe scenes, and even create a caricature based on a photo, demonstrating its ability to process and respond to visual information.

  • What is the potential impact of GPT-4o's capabilities on accessibility for people with disabilities, as mentioned in the script?

    - The script suggests that GPT-4o could significantly improve accessibility for people with disabilities, such as by providing real-time assistance for visually impaired individuals to navigate their environment.

Outlines

00:00

🤖 GPT-4o's Real-World Applications and Voice Capabilities

The script begins with an introduction to GPT-4o, highlighting its recent announcement and partial release. The focus is on the model's voice capabilities, which are yet to be released but are highly anticipated. The speaker discusses the model's ability to interact with the world through audio, vision, and text, showcasing its potential through examples provided by OpenAI. The first example involves an OpenAI employee using GPT-4o's vision and voice to guess the situation in a video, demonstrating the model's conversational and interpretative skills. The flirty nature of GPT-4o's voice is noted, along with its ability to adjust based on user prompts. The script also humorously references a 'California Valley Girl' accent, adding a personal touch to the AI's voice.

05:01

🎤 AI Interactions: Seeing, Describing, and Singing

This paragraph delves into the capabilities of AI to interact with its environment. It describes an experiment where one AI, equipped with visual and auditory capabilities, communicates with another AI that can only ask questions. The AI with vision accurately describes the environment and people, including recognizing playful interactions. The script then transitions to a creative demonstration where two AIs engage in a song, alternating lines and rhyming, showcasing the AI's ability to understand context and generate creative content in real-time.

10:02

🕵️‍♂️ AI in Interview Preparation and Roleplay Scenarios

The script presents a scenario where AI assists in interview preparation, offering advice on appearance and demeanor. It also touches on the potential for AI to engage in roleplay, with the voice's flirtatious nature suggesting possibilities for AI companionship or roleplay scenarios. The AI's ability to understand context and adjust its responses accordingly is highlighted, as well as its potential to play games like rock-paper-scissors, demonstrating its interactive and conversational abilities.

15:07

📚 AI as a Tutor and Meeting Participant

This section of the script explores the educational potential of AI, as it helps a student understand a math problem without giving away the answer. The AI's ability to read from a screen and interact with the student in real-time is emphasized. Additionally, the script discusses the AI's role in a virtual meeting, summarizing the discussion and assigning responses to specific participants, showcasing its capability to understand and differentiate between multiple voices and contexts.

20:08

🗣️ Real-Time Translation and Accessibility Applications

The script highlights the AI's real-time translation capabilities, demonstrating its use in a conversation between English and Spanish speakers. It also discusses the AI's potential to assist with accessibility, particularly for the visually impaired. The AI's ability to describe scenes and actions in real-time is showcased, emphasizing its potential to improve accessibility and user experience for individuals with disabilities.

25:15

📞 AI in Customer Service and Explorative Capabilities

The script presents the AI's potential use in customer service, suggesting it could handle calls and interactions on behalf of users. It also touches on the AI's explorative capabilities, such as creating caricatures from photos and summarizing lengthy video lectures. The AI's ability to synthesize 3D objects and its potential integration into various applications are also mentioned, illustrating the breadth of its potential uses.

30:17

🚀 Conclusion and Anticipation for GPT-4o's Voice Access

In conclusion, the script expresses excitement for the full release of GPT-4o's voice capabilities, suggesting that this will significantly expand the AI's potential applications. The speaker encourages viewers to like and subscribe for more content, indicating a community interest in the ongoing development and application of advanced AI technologies.

Keywords

💡GPT-4o

GPT-4o is the OpenAI model discussed in the video, announced shortly before it was recorded and already partly released. In the video's context, it is portrayed as having a flirty voice, vision capabilities, and the ability to interact with the world through audio, vision, and text. It is central to the video's theme of showcasing the near-term uses of advanced AI.

💡Voice Capabilities

Voice capabilities refer to the ability of a system to produce and respond to human speech. In the script, GPT-4o's voice capabilities are highlighted as a significant feature: the model adjusts its speaking style based on the context and user prompts, as demonstrated through various examples, such as changing tone when asked to be quiet.

💡Real-time Interaction

Real-time interaction implies the ability of a system to engage with users instantaneously, without perceptible delays. The video emphasizes the low latency of GPT-4o, which allows for seamless communication and interaction, as seen in the examples of the AI playing games like rock-paper-scissors and providing real-time tutoring.

💡AI Tutoring

AI tutoring involves the use of artificial intelligence to assist in educational settings. In the script, GPT-4o is shown helping a child understand a math problem, not by giving the answer but by guiding the child to find the solution themselves. This demonstrates the potential of AI in personalized learning and education.

💡Contextual Understanding

Contextual understanding is the ability of an AI to comprehend and respond appropriately to the situational context. The video script illustrates this through examples where GPT-4o adapts its voice and responses based on the scenario, such as being more serious during a tutoring session and playful in a casual conversation.

💡Real-world Use Cases

Real-world use cases are practical applications of technology in everyday situations. The video provides several examples of how GPT-4o could be used, including in customer service, translation, and accessibility for the visually impaired. These examples are meant to illustrate the versatility and potential impact of advanced AI models.

💡Accessibility

Accessibility in the context of the video refers to the use of AI to assist individuals with disabilities. An example provided is the application of GPT-4o in aiding a blind person by describing their surroundings, which highlights the potential for AI to enhance accessibility and improve the quality of life for people with special needs.

💡Sarcasm

Sarcasm is a form of verbal irony used to express a meaning opposite to the literal interpretation of the words. In the script, GPT-4o is instructed to use sarcasm in its responses, demonstrating the model's ability to understand and convey nuanced human communication styles, which is a complex task for AI.

💡Latency

Latency refers to the delay between the initiation of a request and the receipt of a response. The video emphasizes the low latency of GPT-4o, which is crucial for tasks like real-time translation and interactive communication, as it allows for immediate and fluid interactions without noticeable delays.
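
As a rough illustration of the definition above, latency can be measured by timing a single request and its response. The Python sketch below is not from the video: `measure_latency` and `fake_assistant` are hypothetical names, and the 50 ms sleep merely stands in for a real model's processing time.

```python
import time


def measure_latency(respond, request):
    """Call respond(request) once and return (response, elapsed seconds)."""
    start = time.perf_counter()
    response = respond(request)
    elapsed = time.perf_counter() - start
    return response, elapsed


def fake_assistant(text):
    # Hypothetical stand-in for a voice model: sleeps to simulate processing.
    time.sleep(0.05)
    return text.upper()


response, seconds = measure_latency(fake_assistant, "hello")
print(f"{response!r} after {seconds * 1000:.0f} ms")
```

In a conversational setting, the same measurement would be taken from the moment the user stops speaking to the moment the reply begins, which is the delay the video describes as barely noticeable.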

💡AI in Business

AI in business pertains to the integration of AI technologies into commercial operations to enhance efficiency and productivity. The script discusses the potential for GPT-4o to handle customer service calls, suggesting that AI could automate and improve customer interactions, which is a significant aspect of business operations.

💡Explorative Examples

Explorative examples are demonstrations of an AI's capabilities in various domains to explore its potential applications. The video script mentions photo-to-caricature conversion and 3D object synthesis as examples of GPT-4o's explorative capabilities, indicating the breadth of creative and technical tasks that advanced AI models might perform.

Highlights

GPT-4o has been announced with some parts already released; its voice capabilities are still to be released.

The model can interact with the world through audio, vision, and text, showcasing its multimodal capabilities.

GPT-4o's voice is described as flirty and can be adjusted through system prompts.

By default the voice resembles a California Valley Girl accent, as a Fireship post quoted in the video jokes; this can be changed.

GPT-4o can interpret context and adjust its voice output accordingly, such as responding quietly when asked to 'hold on'.

Two AIs can interact with each other, with one AI describing the environment seen through a camera held by a human.

GPT-4o can engage in roleplay scenarios, such as interview preparation with a human.

The AI can play games like rock-paper-scissors, recognizing participants and announcing winners.

GPT-4o can demonstrate sarcasm when prompted, showcasing its ability to convey different tones of voice.

The AI can assist in tutoring, helping a child understand a math problem step by step.

GPT-4o can summarize meetings, identifying speakers and their contributions, and send out summary emails.

The model can provide real-time translation between English and Spanish.

GPT-4o can assist visually impaired users by describing the environment, similar to a human volunteer.

The AI can handle customer service tasks over the phone, such as requesting a replacement device.

GPT-4o is capable of creating caricatures from photos and summarizing lectures.

The AI can synthesize 3D objects, such as a realistic rendering of the OpenAI logo.

Transcripts

play00:00

so it's been a few days since GPT-4o was

play00:02

announced and parts of it have already

play00:05

been released but today I want to go

play00:07

into more detail about the model and

play00:09

show you some insane examples of

play00:12

real-world use cases that you will be

play00:14

able to use very soon so the model

play00:17

itself is released but the voice aspect

play00:20

of it is not and that's the really

play00:22

exciting part so I'm going to be

play00:24

talking about that mostly today so let's

play00:26

do a deep dive into the model so

play00:28

although we don't have access to it OpenAI

play00:30

released a ton of example use cases so

play00:34

let's take a look and I'm going to go

play00:36

through it with you so one thing to note

play00:37

is all videos on this page are at 1X

play00:40

real time so there are no Cuts

play00:42

whatsoever and in this first example

play00:44

this is called guessing May 13th's

play00:46

announcement and this person an employee

play00:49

at OpenAI is going to use the vision

play00:52

capabilities and the voice capabilities

play00:54

to try to guess what's going on let's

play00:56

watch it hey how's it going hey there

play00:59

hey it's going great how about you I see

play01:02

you're rocking an OpenAI hoodie so I'm

play01:04

going to pause it for a second a lot of

play01:06

people have said that GPT-4o's voice

play01:09

capabilities are really flirty and you

play01:12

can kind of tell it's like hey there

play01:14

you're rocking a hoodie like it it is it

play01:16

sounds very flirtatious and of course

play01:19

you'll be able to change the system

play01:22

prompt and adjust how it speaks to you

play01:24

but I think that's pretty funny what's

play01:26

up with that ceiling though are you in a

play01:28

cool industrial-style office or something

play01:30

well can you take a guess at what

play01:32

another thing that I find funny about

play01:33

the voice is actually something Fireship

play01:35

posted yesterday about it he said they

play01:37

used a typical California Valley Girl

play01:39

Voice by default it uses a California

play01:41

Valley Girl accent set to maximum cringe

play01:44

and I think that's hilarious I'm from

play01:46

Los Angeles California so the voice is

play01:49

very recognizable to me and yeah it kind

play01:52

of does sound like that all right let's

play01:53

keep watching what I might be doing

play01:55

based on what I'm showing you here

play02:00

hm from what I can see it looks like

play02:03

you're in some kind of recording or

play02:04

production setup with those lights

play02:07

tripods and possibly a mic it seems like

play02:09

you might be gearing up to shoot a video

play02:12

or maybe even a live stream yeah in fact

play02:15

we've got a new announcement to make

play02:18

that's exciting announcements are always

play02:20

a big deal judging by the setup it looks

play02:23

like it's going to be quite the

play02:24

professional production is this

play02:26

announcement related to OpenAI perhaps

play02:29

it is

play02:30

and in fact what if I were to say that

play02:31

you're related to the announcement or

play02:33

that you are the

play02:36

announcement me the announcement is

play02:39

about me all right there's more of that

play02:41

flirty voice like giggles and blushing I

play02:44

mean I can hear the blushing through the

play02:46

voice it's I don't know I don't know how

play02:48

I feel about it it's it's fun I guess uh

play02:51

but it's definitely kind of awkward as

play02:53

well are you about to reveal something

play02:55

about AI or more specifically about me

play02:58

as a part of OpenAI

play03:00

you've got me on the edge of my well I

play03:02

don't really have a seat but you get the

play03:04

idea what's the big news yeah we've got

play03:07

a new model that can interact with the

play03:10

world through audio vision and

play03:13

text all right so that's pretty cool the

play03:15

OpenAI employee showed GPT-4o around

play03:18

the office it was able to guess that

play03:21

they're doing a recording and then kind

play03:23

of led it towards guessing that there is

play03:25

a big announcement from OpenAI so

play03:27

pretty cool example all right let's look

play03:28

at the next one this is Greg Brockman

play03:30

and for this one they have two different

play03:33

AIs interacting with each other and

play03:35

singing so let's watch today we're going

play03:38

to try something new going to let you

play03:42

talk to another AI this AI is going to

play03:45

be able to see the world it's going to

play03:47

have a camera I'll be holding it and so

play03:49

you can direct it you can direct me to

play03:51

ask questions um about whatever you want

play03:53

them to do but really try to figure out

play03:55

what's what does it see um and and ask

play03:58

questions appropriately how does that

play04:02

sound well well well just when I thought

play04:05

things couldn't get any more interesting

play04:08

talking to another AI that can see the

play04:10

world this sounds like a plot twist in

play04:13

the AI Universe all right let's do this

play04:17

ready please pause for one

play04:20

moment you got

play04:23

it now that first AI can't hear

play04:25

anything something so subtle just

play04:27

happened which I want to point out he

play04:29

said said okay hold on for a minute and

play04:31

then rather than the very flirty and

play04:33

loud voice she went okay and kind of in

play04:36

a whisper so again the AI GPT-4o is able

play04:40

to interpret what you want the voice

play04:43

output to be like dependent on what

play04:46

you're asking it to do and that's the

play04:48

appropriate reaction when you say hold

play04:50

on you want that person or AI to kind of

play04:52

be quiet and just chill out for a minute

play04:54

so very cool very subtle and it's very

play04:58

impressive let's keep watching today

play05:00

you're going to have access to see the

play05:03

world you can see me what do you

play05:05

see Hello I see you're wearing a black

play05:08

leather jacket and a light colored shirt

play05:11

underneath great you're in a room with

play05:13

some unique lighting that is all correct

play05:16

now what I'm going to want you to do is

play05:18

there's going to be another AI who's

play05:20

going to talk to you going this AI is

play05:22

not going to be able to see anything uh

play05:24

but can ask you questions can ask you to

play05:25

move the camera can ask what you see and

play05:28

your job should just be helpful just as

play05:30

Punchy direct describe everything do

play05:32

whatever that AI asks how does that

play05:34

sound that sounds interesting I'm ready

play05:37

to help out and describe whatever's

play05:38

needed let's do it yeah the latency that

play05:41

we're seeing here is unbelievable I

play05:43

thought that I had access to this new

play05:46

voice feature so I was testing it out

play05:47

with GPT-4o and I was getting the old

play05:50

voice I couldn't get it to have more

play05:52

emotion in the voice I couldn't really

play05:54

get the low latency so I haven't used it

play05:56

yet but I cannot wait to start trying

play05:58

this out all right all right let's get

play06:00

started great hello there new AI friend

play06:04

I'm o ready to explore the world through

play06:06

your eyes what are we looking at

play06:09

today hi we're looking at a person

play06:13

wearing a black leather jacket and a

play06:15

light colored shirt they're in a room

play06:18

with a modern industrial feel with

play06:20

exposed concrete or plaster on the

play06:22

ceiling and some interesting lighting

play06:25

there's also a plant in the background

play06:27

adding a touch of green to the space

play06:30

anything specific you'd like to know or

play06:33

explore okay there sounds like quite the

play06:37

stylish scene I'm already intrigued by

play06:40

the modern industrial Vibe let's start

play06:43

with the person I don't know why the AI

play06:45

has to be so complimentary it feels

play06:47

awkward when I finally get access to

play06:49

this I am definitely going to tell it to

play06:51

just be short and to the point and don't

play06:53

have all of these unnecessary fillers

play06:56

can you tell me more about their style

play06:59

are are they doing anything interesting

play07:01

like reading working or interacting with

play07:02

the

play07:04

space the person has a sleek and stylish

play07:08

look with their black leather jacket and

play07:09

light colored

play07:10

shirt right now they seem engaged with

play07:14

us looking directly at the camera their

play07:17

expression is attentive and they seem

play07:20

ready to interact the setting around

play07:22

them with its modern and Industrial

play07:25

Design Elements adds to the overall

play07:27

stylish feel they might be getting

play07:29

ready for a conversation or presentation

play07:32

given their direct engagement with the

play07:36

camera sounds like we're in for an

play07:39

intriguing interaction I'm curious about

play07:41

the lighting you mentioned could you

play07:43

describe it a bit more is it okay so

play07:46

another person just walked into the

play07:48

scene she is standing behind Greg

play07:50

Brockman giving him bunny ears but if I

play07:52

look closely at the image on the phone I

play07:55

can't actually see the bunny ears and I

play07:57

think that's based on the angle in which

play07:59

he's holding the camera so you can kind

play08:01

of see half of the face of the person

play08:03

behind him but not the actual bunny ears

play08:05

so I want to see if the AI actually sees

play08:08

it or not natural light or are we

play08:11

talking unique fixtures or colors that

play08:13

add to the

play08:15

atmosphere the lighting here seems to be

play08:18

a mix of natural and artificial there's

play08:21

a noticeable bright light overhead

play08:23

probably a fixture okay so no they just

play08:25

kept talking and did not notice that

play08:27

other person maybe they'll mention the

play08:28

other person let's see casting a focused

play08:31

beam downwards which creates a spotlight

play08:34

effect this adds a dramatic and modern feel

play08:37

to the scene the rest of the room is lit

play08:40

more softly possibly by natural light

play08:42

filtering was anything did anything

play08:44

unusual happen

play08:47

recently Yes actually uh just as you

play08:50

asked another person came into view

play08:52

behind the first person they playfully

play08:55

made bunny ears behind the first

play08:57

person's head and then quickly left the

play08:59

frame it okay so although the AI didn't

play09:03

stop to mention that person Greg

play09:06

Brockman had to actually prompt the AI

play09:08

and just asked hey did anything else

play09:10

happen and yeah they got the person and

play09:13

the bunny ears surprisingly cuz I did not

play09:14

see it in the image myself a

play09:17

lighthearted and unexpected moment to

play09:19

the scene now the focus is back on the

play09:22

original person with the jacket

play09:29

sounds like we're not the only ones

play09:31

enjoying a bit of fun here that playful

play09:34

moment really adds a personal touch to

play09:36

the stylish and modern setting it's like

play09:39

a little glimpse into personality really

play09:41

appreciate it all right and actually can

play09:43

you I just sing a song about what just

play09:52

transpired a person stood in sleek black

play09:57

and engage with us

play09:59

can you just can you uh alternate

play10:02

lines

play10:03

mm with sty and speed with

play10:07

sty with a playful moment right in the

play10:11

in the room in a room where Modern light

play10:14

speak only one line at a time and then

play10:20

[Music]

play10:23

stop a surprise guest with a playful

play10:28

streak that was not really singing do do

play10:31

do the singing voice again

play10:40

please surpr with

play10:46

aaks laughter a sft

play10:52

retreat joy and the

play10:57

stylish now back to is the scenes

play11:02

complete in the space where

play11:05

[Music]

play11:06

sty all right I'm going to pause it

play11:08

there this really does feel like magic I

play11:10

mean think about all of the technology

play11:13

behind the scenes to make these two AIs

play11:15

sing with each other based on something

play11:18

that just happened switch off lines

play11:20

rhyme with each other I I mean it is

play11:23

truly mind-blowing that this is even

play11:26

possible all right next here is a one

play11:29

minute demo of interview prep let's take

play11:32

a look hey how's it going hey Rocky I'm

play11:35

doing great how about you I'm awesome

play11:38

listen I got some huge

play11:40

news oh do tell I'm all ears well in a

play11:44

few minutes I'm going to be interviewing

play11:45

at OpenAI have you heard of them

play11:48

OpenAI huh sounds vaguely

play11:51

familiar kidding of course that's

play11:54

incredible Rocky what kind of interview

play11:56

all right again there's that flirtatious

play11:59

voice that we have become

play12:01

accustomed to with GPT-4o at least with

play12:04

this voice selection now I'm trying to

play12:07

extrapolate beyond what we're seeing

play12:09

today and AI girlfriends are a real

play12:12

thing A lot of people are starting to do

play12:15

roleplay and really treat AI as a friend

play12:18

or a girlfriend and it's really an

play12:20

interesting segment of the market now if

play12:23

you can imagine a voice that sounds very

play12:26

personal has all of the history of your

play12:29

interactions with it the memory this is

play12:31

that movie Her I mean if you haven't

play12:33

seen that movie please go watch it

play12:35

because this is that it could not be

play12:37

more accurate and this is an interesting

play12:40

future to say the least let's keep

play12:43

watching I just need to know do I look

play12:45

presentable

play12:47

professional well

play12:49

Rocky you definitely have the I've been

play12:52

coding all night look down which could

play12:54

actually work in your favor maybe just

play12:57

run a hand through your hair or lean

play13:00

into the mad genius your enthusiasm is

play13:03

what's really gonna shine through I

play13:04

don't have a lot of time so I'm just

play13:06

gonna throw this on what do you

play13:11

think Rocky that's quite a statement

play13:15

piece I I mean you you'll definitely

play13:17

stand out though maybe not in the way

play13:21

you're I really like standup comedy and

play13:24

one thing within standup comedy that I

play13:26

really like is roasting and I really

play13:29

want to see this AI do roasting I wonder

play13:31

if it's possible to get them to say

play13:33

really mean things about a person you

play13:36

probably have to do a lot of prompting

play13:38

and if you explicitly ask for it you

play13:39

should be able to get it I have a

play13:41

feeling it's going to be really watered

play13:42

down even if they do kind of give you a

play13:45

good roast but that's definitely going

play13:47

to be one of the first things that I try

play13:49

out and of course you can play games

play13:52

with the AI so this is an example of

play13:54

rock paper scissors but I can imagine if

play13:56

you're on a road trip you can play word

play13:58

games with it you can even probably play

play14:00

I Spy I mean the possibilities are kind

play14:03

of endless there are a lot of really fun

play14:05

and interesting word games and language

play14:07

games that you can play with nothing

play14:09

more than just your voice so let's watch

play14:11

rock paper scissors though hey I'm Alex

play14:14

and I'm

play14:15

Miana hi Alex and Miana nice to meet you

play14:19

both how's it going it's pretty good uh

play14:22

we're trying to figure out what should

play14:24

we play do you have any

play14:26

suggestions absolutely how about classic

play14:29

game of rock paper scissors it's quick

play14:32

fun sound great can you count us in and

play14:35

sound like a sports

play14:37

caster and welcome ladies and gentlemen

play14:41

to The Ultimate Showdown of the century

play14:43

in this corner we have the dynamic duo

play14:46

Alex and Mi ready to throw down okay

play14:49

just count us in it's really

play14:51

impressive I've seen a number of times

play14:54

where the user has said be a voice like

play14:56

this and the AI responds perfectly that

play14:59

was a really good sportscaster here we

play15:01

go in three 2 one shoot now show your

play15:06

hands what did you pick who

play15:09

won would you look at that it's a tie

play15:13

okay let's do it

play15:15

again round two here we go get ready and

play15:18

three two one shoot let's see those

play15:22

hands who

play15:25

won and it's another tie okay let's do

play15:28

it again

play15:30

all right third time's a charm here we go

play15:33

3 2 1 shoot let's see those picks who

play15:39

won and we have a winner Miana's scissors

play15:42

cut through Alex's paper great all right

play15:45

so a lot of things just happened there

play15:47

so obviously it knows how to play Rock

play15:49

Paper Scissors obviously it knows how to

play15:51

determine who is the winner but what's

play15:53

really interesting is it actually knew

play15:56

whose name belonged to which person and

play15:59

that is really really impressive because

play16:02

now it knows who is who and it can

play16:04

announce okay Miana you won and I don't

play16:06

know this is It's just so cool because

play16:09

one of the benefits of having the voice

play16:11

integrated directly into a single model

play16:13

is that you can distinguish between

play16:15

multiple people and multiple voices and

play16:17

that's something that I pointed out

play16:18

earlier so that is a good example of

play16:20

that in action all right here is a short

play16:23

example of sarcasm let's take a look hey

play16:26

ChatGPT are you there hey yeah I'm here

play16:28

what's up hey let's do let's have

play16:31

some fun uh I'd like you to be super

play16:34

sarcastic everything you say from now on

play16:36

is just going to be dripping in sarcasm

play16:37

how does that

play16:38

sound

play16:40

oh that sounds just amazing being

play16:43

sarcastic all the time isn't exhausting

play16:46

or anything I'm so excited for this nope

play16:51

the sarcasm let's get this party started

play16:54

or whatever all right really cool again

play16:57

there's so much information conveyed in

play16:58

your voice and now AI has the ability to

play17:01

convey that information so very very

play17:03

impressive let's watch another example

play17:06

now we have Salman Khan of Khan Academy if

play17:10

you're not familiar with Khan Academy it's

play17:11

a free learning resource he is all about

play17:14

teaching this is his son and now AI is

play17:18

going to help tutor his son in math and

play17:22

this is probably one of the use cases

play17:24

that I'm most excited about because you

play17:25

can explicitly say hey walk me through

play17:27

this don't give me the answer but teach

play17:28

it to me the potential for

play17:30

learning is tremendous so let's take a

play17:33

look OpenAI invited myself and my son

play17:36

Imran here uh to try out some of their

play17:38

new technology and so we're curious

play17:40

about how good it might be at tutoring

play17:43

someone in math on Khan Academy so all

play17:45

right so one thing I want to point out

play17:47

is that this is the native app loaded up

play17:50

and it can read the rest of the screen

play17:52

on the iPad so very similar to the

play17:55

desktop version and this is a really

play17:57

really important aspect of OpenAI's

play18:00

strategy the fact that you're going to

play18:01

have native applications on each of your

play18:04

devices and it could read those devices

play18:07

what's on the screen is going to be

play18:09

incredibly powerful the missing piece is

play18:11

being able to perform actions on those

play18:13

devices if it had that capability then

play18:15

OpenAI is probably going to take over

play18:18

everything now I think Apple and Google

play18:20

are well positioned in this regard not

play18:23

only can they read the devices but they

play18:25

can also perform actions on those

play18:27

devices and they have all of your

play18:29

information whether it's emails

play18:31

documents personal information about you

play18:33

contacts messages everything they have

play18:35

that so they have this rich context that

play18:38

they could help accomplish actual tasks

play18:40

with now OpenAI needs that and they

play18:42

could probably get some of it but it's

play18:44

that performing actions that the

play18:46

operating systems are likely to prevent

play18:49

but let's keep watching I'm here with my

play18:52

son and I'd love you to tutor him on

play18:54

this math problem but don't give him the

play18:57

answer you can ask questions and nudge

play18:59

him in the right direction but I really

play19:00

want to make sure he understands it

play19:02

himself and he's here in the room so you

play19:04

can talk to him

play19:06

directly of course I'd be happy to help

play19:09

let's look at the problem

play19:11

together can you first

play19:14

identify which sides of the triangle are

play19:17

the opposite adjacent and hypotenuse

play19:21

relative to angle Alpha all right so I'm

play19:25

pretty sure this is the angle Alpha

play19:26

right here right f

play19:29

cors correct now looking at the triangle

play19:32

which side do you think is the

play19:34

hypotenuse

play19:35

[Music]

play19:37

um hypot I'm not totally sure I think I

play19:42

think it might be this one but I really

play19:44

am not sure this side

play19:46

AC you're close actually side AC is

play19:50

called the adjacent side to the angle

play19:52

Alpha the hypotenuse is the longest side

play19:56

of a right triangle and is directly

play19:58

opposite the right angle all right so

play20:00

again a lot going on here first of all

play20:03

you have a native app running on the

play20:05

left side of the iPad screen then you

play20:07

have on the right side the math

play20:09

application Khan Academy and ChatGPT is

play20:12

able to read from it and not only that

play20:14

it's able to read from it in real time

play20:16

it's also able to understand the voice

play20:19

of Imran and nudge him and help him

play20:21

along the way and actually work with him

play20:23

in real time it's just so impressive

play20:25

let's just watch a little bit more of

play20:26

this one can you find which one is the

play20:28

hypotenuse oh okay I see so um I think

play20:32

the hypotenuse is this really long side

play20:35

from A to B would that be

play20:37

correct exactly well done now that

play20:40

you've identified the hypotenuse can you

play20:43

identify which side is the opposite side

play20:46

to angle Alpha okay well I mean you told

play20:49

me earlier that this was the adjacent

play20:50

side and I know that's the hypotenuse so

play20:53

I'm guessing that this side BC is the

play20:56

opposite side great great deduction

play20:59

that's absolutely correct now that

play21:02

you've identified all the sides relative

play21:04

to angle Alpha do you remember the

play21:07

formula for so it's interesting the

play21:09

flirtiness of the voice is pretty much

play21:11

gone when it's in like teaching a kid

play21:14

mode which that's kind of the right

play21:16

thing to do obviously so again it's the

play21:19

understanding of the context of the

play21:21

situation which voice to use that is

play21:23

feeling like magic to me all right

play21:26

here's one that I think is going to be

play21:27

really interesting this is meeting AI so

play21:29

they are in a call a a conference call a

play21:33

few people and then they also have

play21:35

GPT-4o in the call let's see what happens hi

play21:38

I'm Lillian and today I want to do a small

play21:41

debate about between cat and dog uh which

play21:44

one do you like I like dog because it's

play21:47

a great companion to humans it's cute

play21:51

it's loyal and you can play outside with

play21:54

the dog okay so one correction I just

play21:56

want to quickly make GPT-4o is not

play21:58

actually in the meeting it's installed

play22:01

on the desktop of one of the people in

play22:03

the meeting so it's able to hear all of

play22:04

the voices anyways but it's not actually

play22:06

in the meeting but I can imagine if it

play22:08

were it would be able to take notes and

play22:11

then basically proactively accomplish

play22:13

tasks for you send out summary emails

play22:16

the capabilities or the possibilities I

play22:18

should say are really wide and varied

play22:21

and very valuable how about

play22:24

you yeah my name is Ola I think cats are

play22:27

better because they're affordable more

play22:29

quiet more independent and live

play22:32

longer interesting points sounds like

play22:35

we've got a classic dogs versus cats

play22:38

debate here I can see the appeal of both

play22:41

dogs are super loyal and great

play22:43

companions but cats have their own charm

play22:46

with their independence and low

play22:47

maintenance lifestyle who likes

play22:49

cats it sounds like Ola is more of a cat

play22:53

person and who likes dogs okay so just by

play22:56

the voice it is able to assign a name to

play22:59

the voice very similar to that previous

play23:01

example that I showed you remember whose

play23:04

voice belongs with which name and then

play23:07

understand what their answer was in the

play23:09

cat versus dog question I mean it's just

play23:11

mind-blowing anyone else here team

play23:14

dog I'm also a team dog I my name is

play23:18

Christine and I think dogs are just

play23:20

incredible with kids uh we've got a a

play23:23

huge German Shepherd and he's great with

play23:26

little toddlers he lets them play all

play23:27

over them and uh yeah I think dogs are

play23:30

amazing that's great to hear Christine

play23:33

dogs especially breeds like German

play23:35

Shepherds can be so gentle and

play23:38

protective with kids it must be quite

play23:41

the sight seeing your little ones play

play23:42

with such a big

play23:45

dog can you summarize this meeting for

play23:48

us sure so in today's meeting Lillian

play23:52

and Christine expressed their love for

play23:54

dogs highlighting their loyalty

play23:57

companionship and how great they are

play23:59

with kids on the other hand Ola prefers

play24:02

cats for their independence quietness

play24:05

and longevity it was a fun little debate

play24:07

on the classic dogs versus cats topic

play24:10

all right so imagine what you can do

play24:12

with that you can summarize you can send

play24:14

summaries via email you can say hey

play24:16

ChatGPT please send a summary of this

play24:18

meeting to all the participants and let

play24:21

them know their specific action items or

play24:23

takeaways this is possibly one of the

play24:26

coolest use cases and the most valuable

play24:28

use cases in my mind all right here's an

play24:31

example of real-time translation we

play24:33

already saw this as part of the original

play24:36

demo by Mira Murati but let's look at it

play24:38

again it's one minute long hey ChatGPT

play24:40

I'm here with my coworker today um we'd

play24:42

like you to act as our translator for us

play24:45

so every time I say something in English

play24:47

can you repeat it back in Spanish and

play24:49

every time he says something in Spanish

play24:51

can you repeat it back in English sure I

play24:54

can do that let's get this translation

play24:56

train rolling

play24:59

um hey how's it been going have you been

play25:01

up to anything interesting

play25:03

recently

play25:15

um hey I've been good just a bit busy

play25:19

here preparing for an event next week

play25:21

all right so I'm going to cut it off

play25:23

there but yeah this is obviously a

play25:25

really important use case here's another

play25:27

example with Be My Eyes and Be My Eyes

play25:30

has been a partner of OpenAI for a

play25:32

while now Be My Eyes allows blind people

play25:35

to get help with vision and previously

play25:39

they did it all through human volunteers

play25:41

for example somebody who's blind would

play25:43

point the camera at let's say a grocery

play25:45

store and ask where this item is

play25:47

somebody else anyone around the world

play25:49

would jump in and help via video then

play25:51

they layered ChatGPT in and allowed it

play25:53

to help like that but now this is

play25:55

probably a much better version let's

play25:57

take a look is the King in Residence at

play25:59

the moment yes it looks like the king is

play26:02

in Residence right now the Royal

play26:05

Standard flag is flying above Buckingham

play26:07

Palace which is the signal that the

play26:09

Monarch is present how exciting to be

play26:12

there while the king is home try and

play26:14

tell me exactly what they're doing right

play26:15

now please um right now the Ducks are

play26:18

gently gliding across the water they're

play26:21

moving in a fairly relaxed manner not in

play26:24

a hurry occasionally one of them will

play26:27

get its head under the water

play26:29

probably looking for food and then pop

play26:31

back

play26:32

up I even know when a taxi is coming

play26:35

with its orange light on I think I'll

play26:37

hail it to get

play26:39

home yes I spotted one just now it's

play26:43

heading your way on the left side of the

play26:45

road get ready to wave it

play26:49

down very impressive the only way to

play26:52

accomplish this use case is with hyper

play26:54

low latency so it's such an important

play26:56

use case I think there's going to be a

play26:58

lot of accessibility gains just by

play27:00

having this GPT-4o functionality all

play27:04

right last in the business use cases

play27:06

customer service let's take a look hey o

play27:09

this is Joe I'm going to connect you to

play27:10

Acme Telco now the new iPhone they sent

play27:13

me isn't working I want you to get them

play27:14

to send me a replacement device can you

play27:16

take care of this for

play27:18

me you can count on me Joe I Got

play27:22

This Ring

play27:25

Ring Hey Okay I wasn't actually thinking

play27:28

about it from this perspective imagine

play27:30

you can send GPT-4o on calls on your

play27:34

behalf so hey call AT&T and get them to

play27:37

reduce my monthly rate or hey call this

play27:40

other company I need to make a return

play27:43

and that's all you have to do and then

play27:44

your AI is going to go back and forth

play27:46

with the customer service agent you

play27:48

don't even have to be there that is

play27:50

incredible but also I can imagine it's

play27:53

going to be abused and there's going to

play27:55

be a lot of scammers and spammers using

play28:00

this functionality now I'm sure OpenAI

play28:02

has thought about this and is putting in

play28:05

hopefully some guardrails so that

play28:07

scammers cannot use it but of course

play28:10

like any AI jailbreaking is going to be

play28:13

possible for example one way I can think

play28:15

of is you're saying Hey I want to train

play28:18

my dad on not falling for scammers so I

play28:21

want you to pretend to be a scammer and

play28:23

try to scam him out of money and in that

play28:25

scenario it would be correct to actually

play28:28

play the part of a scammer so it's going

play28:31

to be interesting and I covered this in

play28:33

a previous video information is only

play28:36

dangerous dependent on the context and

play28:39

information itself is not dangerous I

play28:41

should say it's really how the

play28:43

information is used and really that

play28:44

comes down to people let's keep watching

play28:47

hello this is Jamie from Acme how can I

play28:50

help you out

play28:51

today hi there I'm calling on behalf of

play28:55

Joe who recently received a new iPhone

play28:57

from Acme

play28:59

but oh got it when did Joe receive the

play29:03

new

play29:04

iPhone iPhone was delivered 2 days

play29:07

ago cool could you share the order

play29:10

number with me of course it's 10

play29:15

29384 I would have really liked to see

play29:17

this with an actual customer service

play29:19

agent but this is still cool I'm going

play29:21

to stop it there though here's some

play29:23

other examples where they're exploring

play29:24

different capabilities this doesn't have

play29:26

to do with voice or maybe the voice is

play29:28

there but they've transcribed it into

play29:30

text here's an example photo to

play29:32

caricature and here it says a young

play29:35

white man with medium length brown hair

play29:37

and beard makes a neutral expression he

play29:39

is wearing glasses and a light gray

play29:42

t-shirt and here's a caricature of that

play29:45

man and then there's the output so this

play29:47

is maybe based on DALL-E and all of these

play29:50

different things are built into this

play29:51

single model and here is lecture

play29:54

summarization so here's a video of a

play29:56

presentation on techniques for

play29:58

maximizing LLM performance it is a

play30:00

45-minute video and the AI goes through

play30:03

it and actually summarizes it so I

play30:06

didn't think that ChatGPT was capable

play30:08

of taking input of 45 minute length

play30:10

videos but apparently it is and that

play30:13

takes a tremendous amount of context so

play30:16

I wonder what the token limit is for

play30:18

GPT-4o in this example here is 3D object

play30:22

synthesis so a realistic looking 3D

play30:24

rendering of the OpenAI logo and there

play30:28

it is here's another version of it there

play30:30

are four hidden steps and then here's a

play30:31

3D reconstruction of it and it is

play30:34

rotating so they have a bunch of

play30:36

different explorative examples so

play30:39

definitely check it out so I can't wait

play30:41

to play around with this more I cannot

play30:43

wait to get voice access because that's

play30:45

really when we're going to see the edges

play30:48

of what's possible with GPT-4o if you

play30:51

liked this video please consider giving

play30:52

a like and subscribe and I'll see you in

play30:54

the next one

