GPT-4o: The Good, Bad & Ugly for AI Agency Owners

Liam Ottley
23 May 2024 16:23

Summary

TL;DR: Liam Ottley discusses OpenAI's GPT-4o update, highlighting both the positive and negative implications for AI agencies. The new model, GPT-4o (the "o" stands for Omni), offers more natural human-computer interaction through multimodal inputs and outputs, simplifying workflows and reducing costs. However, integration challenges, lagging consumer behavior, and the technical complexity of image and video inputs pose concerns. The video also touches on a potential plateau in AI intelligence, suggesting a temporary pause that could benefit AI agencies by giving them time to solidify their solutions and services before the next wave of advancements.

Takeaways

  • 🚀 OpenAI has released GPT-4o ("o" for Omni), its new flagship model, which accepts text, audio, image, and video inputs and can output text, audio, and images.
  • 💡 GPT-4o promises more natural human-computer interaction by handling multiple types of inputs and outputs in a single model.
  • 🌐 The new ChatGPT desktop app is available for Mac users, and GPT-4o is now accessible to free users, opening the GPT store to over 100 million users.
  • 💰 There is still no clear information on the monetization of GPTs in the store, which was initially presented as an attractive feature for developers.
  • 📈 GPT-4o's new modalities simplify workflows and could reduce costs, since fewer APIs are needed for a given task.
  • 🎙️ Voice AI systems and providers are expected to benefit greatly from GPT-4o: once audio input and output are available through the API, response times could drop by up to 60%.
  • 📊 The GPT-4o API is reportedly twice as fast and 50% cheaper than GPT-4 Turbo, bringing its price closer to GPT-3.5 Turbo and making it attractive for cost-conscious AI agencies.
  • 🌏 GPT-4o offers improved language support, covering over 50 languages and 97% of the spoken world, which could open up new markets for AI agencies.
  • 🔗 The challenge for AI agencies will be integrating GPT-4o's new capabilities into existing platforms and customer interfaces, which lag behind the technology.
  • 🤖 Consumer behavior may not keep pace with technological advancements, meaning that the adoption of new AI functionalities could be slower than expected.
  • 📉 There is a concern that generative AI, including GPT-4o, may be plateauing in terms of intelligence gains, with diminishing returns from larger models and more training data.
  • 🛑 Despite potential plateaus, the speaker is optimistic about the future of AI, suggesting that this could be a temporary pause that allows the industry to solidify solutions and innovate further.

Q & A

  • What is the main focus of the video by Liam Ottley?

    -The main focus of the video is to break down OpenAI's GPT-4o release and discuss what the update means for AI agencies, covering both the positive and negative aspects.

  • What does GPT-4o (Omni) aim to achieve according to OpenAI?

    -GPT-4o (Omni) aims for much more natural human-computer interaction by taking inputs of text, audio, image, and video, and producing outputs of text, audio, and image.

  • What new features are included in the OpenAI spring update besides GPT-4o?

    -Besides GPT-4o, the OpenAI spring update includes a new ChatGPT desktop app and makes GPT-4o available to free users, opening the GPT store to over 100 million users.

  • What is the potential impact of GPT-4o on AI agency owners?

    -The potential impact on AI agency owners includes new solution opportunities, simplification of workflow, decreased costs due to fewer APIs, and the possibility of expanding into new markets with better language support.

  • How do GPT-4o's new capabilities affect response times in voice AI systems?

    -Once audio inputs and outputs are available through the GPT-4o API, response times in voice AI systems could be reduced by up to 60%, making interactions faster and more efficient.

  • What are the cost implications of using GPT-4o compared to GPT-4 Turbo?

    -The GPT-4o API is twice as fast and 50% cheaper than GPT-4 Turbo, which brings its price much closer to GPT-3.5 Turbo and makes it a more cost-effective option for AI agency owners.

  • What challenges does the video suggest regarding the integration of GPT-4o's new modalities?

    -The video suggests that platforms like Make.com and Voiceflow will need to catch up with the technology before GPT-4o's new modalities can reach end customers, as they currently lag behind and do not fully support these new features.

  • How does the video address the issue of consumer behavior in relation to AI advancements?

    -The video addresses the issue by stating that while technology can advance rapidly, consumer behavior and preferences take much longer to adjust, which could be a barrier to the adoption of new AI features.

  • What technical difficulties might arise with the introduction of image and video inputs in AI systems?

    -The video suggests that building systems around complex and varied inputs like image, video, and audio makes prompt engineering and reliability harder, and that in the speaker's experience vision models are not yet good enough to be built reliably into business systems.

  • What is the potential concern regarding the plateau in intelligence as shown in the model evaluation?

    -The potential concern is that generative AI may be reaching a plateau or an upper limit in terms of intelligence, with incremental improvements and diminishing returns on increasing the size and amount of data in training models.

  • What is the speaker's perspective on the future of generative AI beyond the current plateau?

    -The speaker believes that the plateau is temporary and that the field of AI is far from done. They anticipate that new, more efficient architectures will be developed, pushing the field further and providing a chance for AI agencies to solidify their solutions.

Outlines

00:00

🚀 GPT-4o Release: Opportunities and Challenges for AI Agencies

The video discusses the recent GPT-4o release by OpenAI, highlighting both positive and negative implications for AI agencies. The new model, GPT-4o ("o" for Omni), is designed to handle multiple inputs like text, audio, image, and video, and produce outputs in text, audio, and image formats. This advancement simplifies workflows and reduces reliance on multiple APIs. The release also includes a new ChatGPT desktop app and makes GPT-4o available to free users, opening the GPT store to over 100 million users. However, concerns are raised about the lack of information on GPT store monetization and the incremental rather than revolutionary nature of the improvements.

05:00

🌐 GPT-4o's Multilingual Advancements and Cost Reductions

This section covers the improvements in GPT-4o's language support, which now spans over 50 languages and 97% of the spoken world. The new token compression method also reduces token usage for certain languages, which could make localized AI solutions more accessible. The video mentions the cost benefits of the GPT-4o API, which is twice as fast and 50% cheaper than GPT-4 Turbo, leaving roughly a 10x input price gap to GPT-3.5 Turbo. The potential for voice AI systems to cut response times by up to 60% once the new audio capabilities reach the API is also highlighted, suggesting a bright future for voice AI in the industry.
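The pricing comparison from the video can be checked with a few lines of arithmetic. The sketch below is illustrative only: the $5 and $0.50 input prices are the figures quoted in the video, the per-million-token unit is an assumption (the video does not state it), the GPT-4 Turbo price is derived from the "50% cheaper" claim rather than quoted, and the model keys and the 2-million-token workload are hypothetical.

```python
# Input prices in USD per 1,000,000 tokens.
# The two figures are the ones quoted in the video; the per-million-token
# unit is an assumption, since the video does not state it.
INPUT_PRICE_USD = {
    "gpt-4o": 5.00,
    "gpt-3.5-turbo": 0.50,
}

# The video says GPT-4o is 50% cheaper than GPT-4 Turbo, so the implied
# GPT-4 Turbo input price is derived here rather than quoted.
INPUT_PRICE_USD["gpt-4-turbo"] = INPUT_PRICE_USD["gpt-4o"] / (1 - 0.50)


def input_cost_usd(model: str, tokens: int) -> float:
    """Cost in USD of sending `tokens` input tokens to `model`."""
    return INPUT_PRICE_USD[model] * tokens / 1_000_000


def price_ratio(model_a: str, model_b: str) -> float:
    """How many times more expensive model_a's input is than model_b's."""
    return INPUT_PRICE_USD[model_a] / INPUT_PRICE_USD[model_b]


if __name__ == "__main__":
    # Hypothetical workload: 2 million input tokens per month.
    for name in INPUT_PRICE_USD:
        print(f"{name}: ${input_cost_usd(name, 2_000_000):.2f}")
    print("GPT-4o vs GPT-3.5 Turbo:", price_ratio("gpt-4o", "gpt-3.5-turbo"))
```

Under these assumptions the ratio between GPT-4o and GPT-3.5 Turbo works out to the 10x difference mentioned in the video.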

10:00

🔍 Integration Challenges and Consumer Behavior in AI Advancements

The speaker expresses concerns about the long road to integration for the new multimodal capabilities of GPT-4o. While the technology is advancing rapidly, platforms like Make.com and Voiceflow are lagging behind and may take time to support these new features. This could delay the practical application of these technologies for end-users. Additionally, there is a discussion of consumer behavior and how long people take to adapt to new technologies, with e-commerce as the historical example and AI interactions as the likely next case. The video suggests that while technology advances, consumer adoption may not keep pace, which could affect the uptake of new AI solutions.

15:01

🛠 Technical Complexities and the Plateau of Intelligence

This section delves into the technical challenges of building systems around complex inputs like images and videos. The video points out that while text-based AI systems are difficult enough to engineer reliably, introducing visual and auditory elements adds another layer of complexity. The speaker also discusses the potential plateau in the intelligence of generative AI, as evidenced by incremental improvements in performance and a research paper suggesting diminishing returns as model size and data increase. The video concludes with an optimistic view that this plateau may be temporary and that innovation will continue to drive progress in AI, providing a window for AI agencies to solidify their solutions and services.

🎯 The Future of AI Agencies Amidst Technological Advancements

In the final paragraph, the focus shifts to the potential impact of AI advancements on the role of AI agencies. The video suggests that the incremental improvements in AI and the temporary plateau could provide a valuable opportunity for agencies to refine their services and establish themselves in the market. The speaker argues against the notion that AI will soon become so advanced that businesses won't need specialized AI agency services, stating that such an outcome is highly unlikely in the near future. The video ends with a call to action for AI agencies to use this period to improve their craft and solidify their place in the industry.

Keywords

💡GPT-4o

GPT-4o (the "o" stands for Omni) is the flagship model released by OpenAI that represents a significant step towards more natural human-computer interaction. It is designed to take in inputs of text, audio, image, and video, and output text, audio, and image. The model's multimodal capabilities consolidate previously separate services, allowing for a more integrated approach to AI interaction. In the video, GPT-4o is the central theme, with discussion of its implications for AI agencies and its potential effect on the industry.

💡AI Agency Space

The AI agency space refers to the industry and businesses that focus on developing and implementing AI solutions for clients. In the context of the video, the AI agency space is directly impacted by the release of GPT-4o, as it opens up new opportunities and challenges for agencies to adapt and integrate advanced AI capabilities into their offerings. The script discusses both the positive and negative implications of GPT-4o for these agencies.

💡OpenAI Spring Update

The OpenAI Spring Update refers to a recent release of new features and models by OpenAI, the company behind the GPT models. The update is significant because it includes the launch of GPT-4o and other enhancements like the new ChatGPT desktop app. The video discusses the update's impact on the AI industry and the potential it holds for future developments.

💡Multimodality

Multimodality in the context of AI refers to the ability of a system to process and understand multiple types of input and output data, such as text, audio, image, and video. The script highlights GPT-4o's multimodality as a key feature, allowing for more natural and human-like interactions with AI systems. This capability is seen as a significant advancement for AI agencies, as it simplifies workflows and opens up new solution opportunities.

💡Voice AI Systems

Voice AI systems are technologies that enable interaction with AI through voice inputs and outputs. The script suggests that once GPT-4o's audio inputs and outputs are available through the API, voice AI systems are poised to win big, as response times could be reduced by up to 60%. This improvement could continue the boom in the voice AI space, offering new opportunities for AI agencies to specialize in this area.
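The latency claim can be sanity-checked with simple arithmetic. In the video, the speaker cites roughly 650 ms for a current voice platform that stacks transcription, text generation, and text-to-speech, and 200 to 300 ms for the GPT-4o demo. The sketch below only applies the quoted 60% figure to the quoted 650 ms baseline; the function and constant names are hypothetical.

```python
def reduced_latency_ms(baseline_ms: float, reduction_pct: float) -> float:
    """Latency remaining after cutting `baseline_ms` by `reduction_pct` percent."""
    if not 0 <= reduction_pct <= 100:
        raise ValueError("reduction_pct must be between 0 and 100")
    return baseline_ms * (100 - reduction_pct) / 100


# Figures quoted in the video.
STACKED_PIPELINE_MS = 650      # transcribe -> generate text -> text-to-speech
MAX_REDUCTION_PCT = 60         # "up to 60%" reduction
QUOTED_RANGE_MS = (200, 300)   # response time range attributed to OpenAI's demo


def within_quoted_range(latency_ms: float) -> bool:
    """True if the latency falls inside the range quoted for the demo."""
    low, high = QUOTED_RANGE_MS
    return low <= latency_ms <= high


if __name__ == "__main__":
    best_case = reduced_latency_ms(STACKED_PIPELINE_MS, MAX_REDUCTION_PCT)
    print(f"Best case: {best_case:.0f} ms, in quoted range: {within_quoted_range(best_case)}")
```

A 60% cut from 650 ms leaves 260 ms, which is consistent with the 200 to 300 ms range mentioned in the video.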

💡Token Compression

Token compression in the context of AI refers to a method that reduces the number of tokens needed to represent a piece of data, thereby increasing efficiency and potentially reducing costs. The script mentions that GPT-4o introduces a new token compression method that decreases token usage for various languages, which could be beneficial for AI agencies operating in multilingual markets.

💡Monetization of GPTs

Monetization of GPTs refers to the process of generating revenue from the development and use of GPT models. The script raises a concern about the lack of clarity on how monetization will work for GPT store developers, who were initially attracted by the promise of an 'app store moment' with potential for monetization. The video suggests that this is an area to watch for future updates.

💡Integration

Integration, in the context of the video, refers to the process of incorporating new AI capabilities, such as those offered by GPT-4o, into existing platforms and systems used by AI agencies. The script expresses concern about the potential long road to integration, as current platforms may lag behind the technology provided by OpenAI, making it challenging to deliver the full benefits of GPT-4o to end-users.

💡Consumer Behavior

Consumer behavior in the video script refers to the preferences, tastes, and actions of end-users when interacting with AI technologies. The video suggests that while technology advances rapidly, consumer behavior changes more slowly, which could pose a challenge for AI agencies trying to sell new solutions based on GPT-4o's capabilities.

💡Image and Video Difficulties

Image and video difficulties refer to the challenges associated with building reliable and predictable AI systems that can handle complex inputs like images and videos. The script discusses the increased complexity that these types of inputs bring to AI systems, which could be problematic for agencies trying to integrate GPT-4o's multimodal capabilities into their offerings.
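The video's text-only baseline is an email classification step: a contact-form email goes into a single prompt, and the model must return one label every time. One common way to keep such a step predictable is to constrain the allowed labels and validate the reply before it reaches the rest of the system. The sketch below illustrates that idea and is not the speaker's implementation; the label names, the fallback label, and the `model` callable are all hypothetical, and a stand-in function replaces any real API call.

```python
from typing import Callable

# Hypothetical label set; a real deployment would define its own.
ALLOWED_LABELS = ("sales", "support", "spam")
FALLBACK_LABEL = "needs_review"  # used when the model reply is not a valid label


def build_prompt(email_body: str) -> str:
    """Build a single-shot prompt that asks for exactly one allowed label."""
    labels = ", ".join(ALLOWED_LABELS)
    return (
        "Classify the email below. "
        f"Reply with exactly one of: {labels}.\n\n"
        f"Email:\n{email_body}"
    )


def parse_label(raw_output: str) -> str:
    """Normalize the model reply and reject anything outside the label set."""
    cleaned = raw_output.strip().lower().rstrip(".")
    return cleaned if cleaned in ALLOWED_LABELS else FALLBACK_LABEL


def classify(email_body: str, model: Callable[[str], str]) -> str:
    """Run one classification; `model` is any function from prompt to reply text."""
    return parse_label(model(build_prompt(email_body)))


if __name__ == "__main__":
    # Stand-in for a real model call, used only to show the flow.
    def fake_model(prompt: str) -> str:
        return " Support.\n"

    print(classify("My order arrived damaged.", fake_model))
```

With text, the validation step is a string comparison. With a photo or video of a damaged product as input, the same guard still applies to the output label, but judging whether the model looked at the right thing becomes much harder, which is the concern raised in the video.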

💡Intelligence Plateau

The intelligence plateau refers to the idea that the improvements in AI capabilities may be reaching a point of diminishing returns, where further increases in data or model size do not lead to significant gains in intelligence. The script presents evidence from a research paper suggesting that generative AI may be experiencing this plateau, which has implications for the future of AI advancements and the strategies of AI agencies.

Highlights

Introduction of GPT-4o, OpenAI's new flagship model, which can process text, audio, image, and video inputs, and output text, audio, and image.

Release of the new ChatGPT desktop app, currently available for Mac users.

GPT-4o is now available to free users, opening the GPT store to over 100 million users.

No information on GPT store monetization has been provided yet.

New modalities open up numerous solution opportunities with different types of inputs and outputs.

Workflow simplification with fewer APIs needed for various functionalities.

Potential decrease in costs due to reduced reliance on multiple APIs.

Voice AI systems and providers are expected to benefit from GPT-4o's audio capabilities.

The GPT-4o API is twice as fast and 50% cheaper than GPT-4 Turbo.

GPT-4o can handle over 50 different languages, covering 97% of the spoken world.

Token usage is decreased with the new token compression method in GPT-4o.

Concerns about the long road to integration with current platforms lagging behind OpenAI's technology.

Lagging consumer behavior may slow down the adoption of new AI functionalities.

Technical challenges with building systems around complex inputs like image and video.

Potential plateau in intelligence levels of generative AI models.

Research suggests diminishing returns as model size and data increase.

Opportunity for the AI agency space to solidify solutions and identify use cases before the next wave of advancements.

Transcripts

play00:00

hey guys Liam Ottley here and today I've

play00:01

got a video breaking down the GPT-4o

play00:03

release uh the recent spring

play00:04

update from OpenAI there's some good

play00:06

things and there's some bad things for

play00:07

the AI agency space I wanted to jump on

play00:09

here and give my thoughts now that I've

play00:11

had a few days to Stew it and really

play00:12

think about what the implications are

play00:14

for us so GPT-4o the good and the bad

play00:16

for AI agencies of course what's new if

play00:19

you've been living under a rock we've

play00:20

had the OpenAI spring update recently

play00:22

and they released a bunch of things but

play00:23

mainly uh the new flagship model GPT-4o

play00:27

um o meaning Omni and this is a step

play00:29

towards much more natural human and

play00:31

computer interaction according to OpenAI

play00:32

on their blog post so that's taking

play00:35

in inputs of text audio image and video

play00:38

and it's able to output text audio

play00:40

and image so massive new capabilities

play00:42

that we've kind of been cobbling

play00:43

together using different services and

play00:44

we've had whisper and we've had these

play00:46

other services that allow us to get this

play00:48

kind of functionality anyway but they've

play00:49

just pulled it all into one and allowed

play00:51

the model to understand not only text

play00:53

input but also audio and video and

play00:55

stream it all into one input as well so

play00:57

very exciting stuff um I was hoping for

play00:59

GPT-4.5 or GPT-5 um but as we'll see

play01:02

later in this video there's some

play01:03

ramifications of what this incremental

play01:06

Improvement might mean for this space um

play01:08

but I'll save that for a little later so

play01:10

there's also other stuff like the new

play01:11

ChatGPT desktop app which is cool I've just

play01:13

been able to get it downloaded I think I

play01:15

need to update with Mac as many of you

play01:16

guys will as well I think it's only

play01:18

available for Mac uh for the time being

play01:20

um and GPT-4o is now available to free

play01:23

users um not just plus subscribers and

play01:25

therefore the GPT store is open to over

play01:27

a 100 million users so if you guys are

play01:29

interested in uh jumping on gpts I know

play01:31

I've made a video before that I think a

play01:32

lot of you found my Channel Through the

play01:34

GPT store is now available to 100

play01:36

million users or more um which is great

play01:39

great news for you if you're hoping to

play01:40

be a GPT developer um we have yet to see

play01:43

anything about the monetization of gpts

play01:45

so I'd be curious to see what they have

play01:47

in store for that cuz that's kind of

play01:48

they attracted us all they honey potted

play01:50

us into this gpts uh and building gpts

play01:52

on the store thinking it's going to be

play01:54

an app store moment saying yeah

play01:55

monetization this monetization that um

play01:57

and we haven't seen seen anything of at

play01:59

least I have haven't seen anything about

play02:01

about the GPT store monetization so

play02:03

other stuff I don't think it's that

play02:05

relevant for us as as AI agency owners

play02:07

okay I want to start off with the good

play02:08

points first then we'll get on to the

play02:09

bad a little later um here's Sam Altman

play02:11

counting up all the all the money he's

play02:12

making from these recent updates um

play02:15

firstly new modalities it is a good

play02:17

advancement we are getting a ton of new

play02:19

solution opportunities opening up for us

play02:21

as we're able to take in different types

play02:22

of inputs from our end users and then

play02:24

give them different types of outputs

play02:25

back and really there's there's it's not

play02:28

a massive Leap Forward cuz we've been

play02:30

able to do this one of the examples on

play02:31

screen that openai has provided shows

play02:33

them asking a question and then

play02:34

providing an audio file as part of the

play02:36

input and then it's able to answer

play02:37

questions and reason off the back of

play02:39

that as well but prior to that it's not

play02:41

we were able to do that with

play02:42

transcription anyway we just transcribe

play02:43

it and then put it into the give it to

play02:45

the model to to reason over so not a

play02:47

massive leap we've been able to do a lot

play02:48

of these things but really what it is is

play02:50

just a simplification of our workflow

play02:51

and of the systems we need to build for

play02:53

our clients um less fiddling around with

play02:55

multiple different apis which is easier

play02:56

to get the results that we want for our

play02:58

clients and I think this is great for

play03:00

many of you who are not so technically

play03:01

inclined and I know a lot of you have

play03:02

been brought into this opportunity and

play03:04

still struggle with some of the

play03:05

technical parts of it but it's a clear

play03:07

Trend that we're seeing towards this

play03:09

simplification but there's still a level

play03:10

of complexity of how can this actually

play03:12

be implemented into the business so

play03:13

you're getting easier to do um but you

play03:15

still you're getting more power

play03:16

essentially in your hands that you can

play03:17

provide to your clients as well and

play03:19

because we're going to be using fewer

play03:20

apis this is probably going to decrease

play03:21

our cost as well because we're not

play03:23

having to use transcription and then

play03:25

generate an answer and then use text to

play03:26

speech if we're using these kind of

play03:27

systems which we'll get on to next which

play03:29

I think is pretty exciting The Voice AI

play03:31

systems and providers I think are going

play03:32

to win big here um because once audio

play03:35

inputs and outputs become available via

play03:36

the GPT-4o API the response times can be

play03:39

reduced by up to 60% um based off the

play03:42

numbers that OpenAI has provided which

play03:43

is between 200 and 300 milliseconds for

play03:46

responses at least that's what we saw of

play03:47

the ChatGPT in the demo and as you can

play03:49

see here on platforms like Vapi um even

play03:51

on the fastest and lowest intelligence

play03:54

model uh we've got a 650 millisecond

play03:57

response time and this is purely because

play03:58

they're having to stack up so many

play03:59

models that when your voice comes in

play04:01

over the phone they have to transcribe

play04:02

that then they have to generate an

play04:04

answer and text and then they need to

play04:05

turn that text into speech and then they

play04:07

need to send it off to you as well so

play04:09

this 650 millisecond latency which was

play04:12

fine it was fast enough we're now going

play04:13

to get a potentially 60% reduction of

play04:15

that as well so I think soon as these

play04:17

guys are able to access Vapi and and

play04:19

Bland etc are able to access GPT-4o via

play04:22

API and send and receive audio inputs

play04:24

and outputs um we could see a a

play04:26

continued boom of the voice AI space

play04:28

which is something I've been talking

play04:29

about a lot on the channel here if you

play04:30

guys are just starting with your AI

play04:31

agency you know looking for a good place

play04:33

to start or specialize in then voice AI

play04:35

is a great place to look into next we

play04:37

have a quick win for us as AI agency

play04:38

owners which is GPT-4o APIs being twice

play04:41

as fast and 50% cheaper than GPT 4 Turbo

play04:44

it's always great on these big updates

play04:45

from open AI cuz we can kind of expect

play04:47

these reductions um and it's good to see

play04:49

that they're continuing to do this over

play04:50

time so we can expect it in future and

play04:52

an interesting thing to point out is

play04:53

that we're getting much closer to this

play04:55

GPT 3.5 Turbo cost which is is basically

play04:57

free this thing is so cheap it's it

play05:00

barely cost you a dime to do anything um

play05:02

but here we can see that we've got input

play05:03

of $5 and $0.50 for GPT-3.5 Turbo so it's

play05:07

just a 10x price difference considering

play05:09

the the massive increase in intelligence

play05:10

and and modalities that we're going to

play05:12

get from GPT-4o you can't not be happy

play05:14

with that outcome next we have another

play05:16

quick win for us as AI agency owners

play05:17

which might have slipped under the radar

play05:18

for you a little bit better language

play05:20

support for GPT-4o that can handle over

play05:22

50 different languages now covering 97%

play05:25

of the spoken world and it's also going

play05:27

to decrease a token usage as you can see

play05:28

here the new token compression method is

play05:30

actually reducing the amount of tokens

play05:32

for some of these languages as you can

play05:34

see now this may not seem like a major

play05:36

but this is a question I get all the

play05:38

time my accelerator and on my free

play05:39

community q&as which is should I be

play05:41

selling local or should I be trying to

play05:43

sell in the US or should I be trying to

play05:44

sell in Europe it's mainly people

play05:45

interested in selling in the US um and

play05:47

my answer is always no ideally Go Local

play05:50

um if you're if you're from South

play05:51

America and you're trying to go over to

play05:52

the United States and start selling

play05:54

there you're at a natural disadvantage

play05:56

just by purely being outside of the

play05:57

country you might sound a little

play05:58

different over the phone um you might

play06:00

have a name that doesn't necessarily

play06:01

ring like you're you could be someone's

play06:03

neighbor um and there's nothing wrong

play06:05

with that but it's just the the cold

play06:06

hard facts of it's going to be harder

play06:08

for you you're playing at some kind of

play06:09

disadvantage or debuff versus someone

play06:11

who is is John Smith who lives next door

play06:13

you know if you're in for example the

play06:14

Spanish speaking world I'm sure you've

play06:15

already had Fairly good responses and

play06:17

and good translation capabilities from

play06:19

GPT uh but it's really the smaller areas

play06:21

and these smaller languages that up

play06:23

until now haven't really had the support

play06:25

now you can be the first person into

play06:26

those markets so if you live somewhere

play06:28

that you thought oh no one's ever going

play06:29

to be able use this in my language or I

play06:31

shouldn't bother selling local now is

play06:32

your chance to be the first guy or the

play06:34

first girl in that market to go and

play06:35

start selling these Solutions and you

play06:37

might say that oh but they don't they're

play06:38

not interested in AI don't try to sell

play06:40

it as AI then just sell it as a

play06:41

meaningful difference in their business

play06:42

and now getting into the bad and we may

play06:44

have the rise of e girlfriends sooner or

play06:46

later um but that's not what I want to

play06:47

go into here um it's actually the long

play06:49

road to integration that's the first

play06:51

thing that I'm kind of concerned about

play06:52

here um and and by that what I mean is

play06:55

these new modalities and text and audio

play06:57

and video and image and all this stuff

play06:58

is cool but it's it it doesn't mean

play07:00

anything to us as AI agency owners until

play07:02

we're able to get that to our end

play07:03

customer with all these platforms that

play07:05

we use like make.com and voiceflow um

play07:07

and sending things to WhatsApp and the

play07:09

different solutions we build they are

play07:11

lagging far behind the technology that

play07:14

open AI is providing it really is an

play07:15

issue of trying to get the stuff in the

play07:17

hands and making it useful for our end

play07:18

users um but until these platforms catch

play07:20

up and and allow support for the

play07:22

customers can send voice notes and they

play07:24

can send photos and and say Voiceflow

play07:26

allows you to send photos through your

play07:27

web chat widget which I'm not sure why

play07:30

and more so for things like WhatsApp

play07:31

deployments for your your AI agents um

play07:33

being able to send voice notes to the

play07:35

customers and receive it from them and

play07:36

send pictures and get them back um

play07:38

that's I think a long way off and I'm

play07:40

looking forward to seeing how they allow

play07:41

us to build these different modalities

play07:43

into our systems that we sell even for

play07:45

my own platform Agentive we now have

play07:46

this question of do we want to integrate

play07:48

audio and video and image and all these

play07:50

different things into our application

play07:51

and into our platform or do we want to

play07:53

just stick with text base and I think

play07:54

this is a conversation many of these

play07:56

platforms are going to be having um it's

play07:57

interesting to see how they play out and

play07:58

moving on from that to something closely

play08:00

related is the lagging consumer Behavior

play08:02

now we can have technology that moves

play08:04

ahead very fast and and early adopters

play08:06

kind of catch up if you've seen my uh my

play08:08

technology adoption curve video which

play08:09

I'll put up here somewhere while

play08:10

technology can race ahead the actual

play08:13

tastes and preferences and and behaviors

play08:15

of the the consumer populace take a lot

play08:18

longer to adjust and e-commerce is an

play08:20

example of this where it took a long

play08:22

time for people to become comfortable

play08:23

with putting the credit card online and

play08:25

now we do it like the thought of putting

play08:26

a credit card and giving it to some some

play08:28

random website was ludicrous back in if

play08:31

you go back far enough it was a

play08:33

completely silly idea and over time it

play08:35

took like decades for them to get to the

play08:37

point where it's okay yeah now now we

play08:39

all buy stuff online this is the same

play08:41

sort of thing with with AI and I think

play08:43

we're going to run into this GPT and and

play08:44

chat GPT may help with people getting

play08:47

used to speaking to these AI assistants

play08:48

and having conversations um but I think

play08:51

there's still a considerable lag in the

play08:53

actual consumer behaviors where if we're

play08:55

trying to sell these Solutions do our

play08:57

end customers actually want to be

play08:58

sending voice notes to WhatsApp do they

play09:00

want to be sending pictures and and

play09:02

giving videos to them and personally if

play09:04

historical precedents are anything to go

play09:05

off I'm not betting on this thing moving

play09:07

too fast next we have more of a

play09:08

technical one that I think could be an

play09:09

issue which is the image and video

play09:11

difficulties that come along with

play09:12

Building Systems around much more

play09:14

complex and varied inputs like text

play09:17

image and audio um in this example here

play09:19

you might have watched my prompt

play09:20

engineering video where I I highlighted

play09:22

the difference between conversational

play09:24

prompting and single shot prompting I'll

play09:25

put the video up there if you haven't

play09:27

watched it highly recommend it's very

play09:28

important for you to know how to do

play09:29

prompt engineering and it's not your

play09:31

regular video it's very very different

play09:32

we had some really good feedback on that

play09:33

there's conversational prompting and then

play09:35

there's single-shot prompting for us as AI

play09:37

agencies in many cases we're working in

play09:38

this single shot range where we need to

play09:40

engineer the prompt and engineer the

play09:42

system to be reliable and predictable

play09:44

and continue to give the same outputs

play09:45

over and over and over again so that

play09:47

they can actually be built into a

play09:49

company and operate as an artificial

play09:51

intelligence task that's plugged into

play09:53

their systems and doing that with text

play09:54

only proves to be difficult enough as

play09:56

I'm sure some of you have found out but

play09:58

now we introduce a whole other layer

play10:00

of complexity of images and videos so

play10:03

imagine this example here of an email

play10:04

classification system where there's a

play10:07

user and they fill out a contact form we

play10:08

get an email then we use a prompted GPT

play10:11

task on something like make.com to

play10:13

output a label now imagine if instead

play10:15

of it being an email input it was a

play10:17

video from a customer or it was a

play10:19

photo of a damaged good that they

play10:21

got from an e-commerce store
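
The text-only email classification flow described above (contact form, email, prompted GPT step, label) can be sketched roughly as follows. This is a minimal illustration, not the setup shown in the video: the label set, the function names, and the `complete` callable that stands in for the actual model call (for example a GPT module on make.com) are all assumptions.

```python
from typing import Callable

# Hypothetical label set; the video does not specify one.
LABELS = ("sales", "support", "complaint", "other")


def build_prompt(email_body: str) -> str:
    """Single-shot prompt: instructions and input go in one message."""
    return (
        "Classify the email below into exactly one of these labels: "
        + ", ".join(LABELS)
        + ". Reply with the label only.\n\nEmail:\n"
        + email_body.strip()
    )


def parse_label(raw_reply: str) -> str:
    """Normalize the model reply; fall back to 'other' if it is off-list."""
    label = raw_reply.strip().strip(".").lower()
    return label if label in LABELS else "other"


def classify_email(email_body: str, complete: Callable[[str], str]) -> str:
    """`complete` is a placeholder for whatever model call is used."""
    return parse_label(complete(build_prompt(email_body)))


if __name__ == "__main__":
    # Stub model so the sketch runs without any API access.
    fake_model = lambda prompt: " Support.\n"
    print(classify_email("My order arrived broken, please help.", fake_model))
```

Validating the reply against a fixed label list is one way to keep a single-shot step predictable. With a photo or video as input there is no equally simple check on what the model actually saw, which is part of the added complexity the speaker is pointing at.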

play10:23

there's so much more complexity in that

play10:25

than okay I can understand text but this

play10:27

is okay what's wrong with the

play10:29

product from my own experience with

play10:31

vision models they are nowhere near good

play10:32

enough to be able to be baked into these

play10:34

kind of systems and be able to reliably

play10:36

perform so that's another concern

play10:38

that I have for this new multimodal

play10:40

and omni-channel or omni-model future

play10:43

that they're going for and finally what

play10:44

might seem very concerning to some of

play10:46

you is this plateau in intelligence

play10:48

here's the model evaluation from OpenAI's

play10:50

blog post on the GPT-4o release and

play10:52

you can see the pink bars here are GPT-4o

play10:55

and it's about the same

play10:58

and it's not a massive improvement you

play11:00

know it's very incremental look

play11:01

we're just edging up above GPT-4 Turbo

play11:04

in most of these categories and yes this

play11:06

is a text evaluation I get that and

play11:08

they're really focused on being audio

play11:10

and video and things like this to give

play11:11

them credit if you go through these

play11:13

other tabs on the website I will leave a

play11:14

link to the blog down below you can

play11:16

see that in these audio and video

play11:19

different tests they are outperforming

play11:21

the prior models and we are seeing

play11:23

much larger increases in capability and

play11:25

intelligence but when it comes to

play11:27

the scores on the text evaluation which

play11:28

is what we typically determine as the

play11:30

level of intelligence for getting

play11:32

questions right and reasoning things

play11:34

like this which is really crucial to the

play11:35

systems that we currently build we're

play11:37

plateauing and we're seeing

play11:39

incremental improvements and this opens

play11:41

up to a much broader discussion about

play11:43

the state of generative AI and where

play11:45

the future's going to go and here's a

play11:46

look at the multilingual performance so

play11:48

you can see GPT-4o is better with

play11:50

different languages sure but again it

play11:52

looks fairly incremental it's not like

play11:54

this massive leap forward in

play11:56

intelligence this idea that

play11:58

generative AI may be plateauing or we're

play12:00

reaching some kind of upper limit

play12:02

has been to some degree corroborated or

play12:04

confirmed by a research paper and

play12:07

while this may look crazy on screen

play12:09

I'll explain it basically they have

play12:11

found that as they continue to increase

play12:12

the size and the amount of data that

play12:13

they train models on they're seeing a

play12:15

diminishing return which may sound

play12:18

awful but give me a second to explain so

play12:20

actually a great video breaking

play12:21

this down is from I think they're called

play12:23

Computerphile I'll leave a link down

play12:24

below and as you can see on screen here

play12:25

there are three scenarios for generative AI

play12:28

and artificial intelligence and with

play12:30

these language models and the transformer

play12:32

architecture they have three different

play12:34

scenarios so the first one is the most

play12:36

exciting where as we continue to say on

play12:38

these axes sorry we have the

play12:40

number of examples in the training data

play12:42

set and then we have the performance or

play12:43

the intelligence on the y-axis and in

play12:45

the first case if we continue to

play12:46

increase the number of examples in the

play12:47

training data set we get this

play12:49

exponential curve as you can see here of

play12:51

it's just runaway general intelligence

play12:53

and things get so much smarter and we

play12:55

put in a little bit more data and it

play12:56

gets way smarter then there's a more

play12:58

balanced or conservative one which is a

play13:00

linear relationship where more

play13:02

examples more data equals better

play13:04

performance and greater intelligence

play13:06

but what they are starting to see as you

play13:08

can see from this graph here is a

play13:11

potential worst case maybe not worst

play13:13

case but not the greatest case outcome

play13:16

based off the evidence of this paper

play13:18

where we are seeing a flattening curve

play13:19

and as we continue to increase the

play13:21

examples it's flattening off and we're

play13:23

not getting any kind of meaningful

play13:24

increase in intelligence this is a

play13:26

little summary from that research paper

play13:27

which is there is a clear log-linear

play13:29

trend for the models to get

play13:31

incrementally better at handling a

play13:32

concept that concept needs to appear

play13:34

exponentially more times in the training

play13:36

data this shows the models are very data

play13:39

inefficient which may sound concerning

play13:40

to you as someone who's just bet their

play13:42

whole life on this AI space and

play13:43

continued advancements and I guess

play13:46

the question is has generative AI peaked
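
To make the log-linear trend quoted above concrete, here is a small numeric sketch. It assumes performance follows `score = A + B * log10(n_examples)`; the coefficients `A` and `B` and the function names are illustrative values chosen for this sketch, not figures from the paper.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
A = 0.0   # score when a concept appears once in the training data
B = 10.0  # points gained for every 10x increase in examples


def score(n_examples: float) -> float:
    """Log-linear trend: score grows with the logarithm of the data."""
    return A + B * math.log10(n_examples)


def examples_needed(target_score: float) -> float:
    """Invert the trend: examples required to reach a target score."""
    return 10 ** ((target_score - A) / B)


if __name__ == "__main__":
    # Each equal step in score multiplies the data requirement by ten.
    for target in (10, 20, 30, 40):
        print(target, round(examples_needed(target)))
```

Under these assumptions, every additional fixed step in score costs ten times more examples than the previous step, which is what "data inefficient" means in the quoted summary.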

play13:48

and I think this is an important one

play13:50

to ask but personally I'm betting on

play13:52

this being a temporary plateau where the

play13:54

architecture and the transformer that we

play13:55

currently use is kind of maxing out and

play13:58

we've pushed as far as we can we've

play14:00

got great capabilities we have an

play14:01

awesome set of new technology to use and

play14:03

implement into businesses and we've

play14:05

just attracted every smart

play14:07

brain on the planet has flooded into

play14:09

the space as you are here watching

play14:12

this video interested in the space and

play14:13

we have the greatest researchers and

play14:15

minds all going into it humans are not

play14:17

done with artificial intelligence

play14:19

by any means we're not just going to go

play14:20

oh well we've got diminishing

play14:21

returns right I'm out like okay I guess

play14:24

we can't get any smarter than that so

play14:26

that's complete rubbish so we're going

play14:27

to continue to keep pushing and just

play14:29

like the transformer architecture

play14:30

blasted open the scene for us to be able

play14:32

to get to this point when GPT-3

play14:34

was released we're going to continue to see

play14:36

people innovate and try to create

play14:37

smarter architectures that are more

play14:39

efficient on training with that data so

play14:41

we could be looking at a little

play14:42

plateau a nice little bit of room to

play14:44

catch our breath and I honestly

play14:46

think that that is going to be very very

play14:47

good for the AAA space the AI

play14:49

automation agency space as a whole

play14:51

because if the tech kept

play14:53

taking off and it just got better and

play14:54

better and better and every 6 months we

play14:56

had a whole bunch of new stuff to try and

play14:58

adapt to we'd never get a chance

play15:00

to really catch our breath and solidify

play15:01

our solutions so this is something that

play15:03

I've been kind of excited for to

play15:05

some degree where I get questions on the

play15:06

channel which is oh but like what if

play15:09

what if like why do they need

play15:10

our Solutions why would they ever need

play15:11

to use us if the thing is just going to

play15:13

get so good that suddenly they can type

play15:15

one word and then it just automates

play15:17

their whole business that is a

play15:19

very low percentage outcome like in

play15:21

terms of probability and if this is

play15:23

anything to go by we're not going to get

play15:25

there anytime soon so we have a chance

play15:27

now to say okay yes people have dropped

play15:29

out of the space they've gone off

play15:30

they've chased other shiny objects

play15:33

we now have a chance to really work on

play15:35

our craft really work on identifying

play15:37

these use cases and going into

play15:38

businesses and finding ways that AI can

play15:40

help them and we get a nice Runway now

play15:42

before things take off again to dial in

play15:44

our services to get experience and to

play15:45

continue to push ourselves further and

play15:47

further away from the spectators who sit

play15:49

there and watch AI news to the people

play15:50

who are actually taking action like you

play15:52

and I I hope that gives you a little bit

play15:53

more clarity about what this update

play15:54

means for us as AI agency owners I do

play15:56

not like being an AI news channel but

play15:58

when there's a big update from OpenAI

play16:00

it is worthwhile coming in here and

play16:01

hopefully giving you guys a bit of

play16:02

insight into what this means for us if

play16:04

you have enjoyed the video hit down

play16:05

below and leave a like subscribe to

play16:07

the channel if you haven't already for

play16:08

more content like this teaching you how

play16:09

to make money and build businesses in

play16:10

the AI space if you're interested in

play16:12

seeing what my day-to-day life is like

play16:13

here in Dubai as an AI agency owner you

play16:15

can check out my recent video here

play16:17

showing you the raw reality of my life as

play16:19

an AI entrepreneur but aside from that

play16:20

guys thank you so much for watching and

play16:21

I'll see you in the next one
