GPT-4o - Full Breakdown + Bonus Details

AI Explained
13 May 2024 · 18:43

Summary

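TL;DR: GPT-4 Omni (GPT-4o), OpenAI's latest AI model, is drawing attention for its multimodal input and output, strong coding ability, and lower latency. It performs well on text, image, and video tasks, such as generating movie posters, holding a simulated customer-service call, and translating in real time. Its results on some reasoning benchmarks are mixed, but it makes clear gains on the math benchmark and in multilingual handling. OpenAI's desktop app also offers live coding assistance. GPT-4o does not reach artificial general intelligence (AGI), but because it is free and multimodal it could draw in hundreds of millions of new users and widen the adoption of AI.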

Takeaways

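  • 🚀 GPT-4 Omni is described as smarter in most ways, cheaper, faster, better at coding, and multimodal in and out, and as perfectly timed to steal the spotlight from Google.
  • 📈 The "Omni" in the name refers to the model's multiple modalities. By making it free, OpenAI is either committed to scaling from 100 million users to hundreds of millions, or has an even smarter model coming soon.
  • 📷 GPT-4o renders text inside generated images far more accurately than earlier models, and the example that impressed the presenter was not even part of the live demo.
  • 🎬 GPT-4o can design a movie poster from text requirements, and a follow-up prompt produced a cleaned-up version with crisper text and bolder, more dramatic colors.
  • 📹 GPT-4o demonstrated real-time video processing. It cannot output video yet, but the feature hints at what may come.
  • 🎓 GPT-4o improves markedly on the original GPT-4 on the MATH benchmark, although it still fails some math problems.
  • 💰 GPT-4o costs $5 per 1 million input tokens and $15 per 1 million output tokens, which is cheaper than models such as Claude 3 Opus.
  • 🌐 Multilingual performance has improved and support for non-English languages is stronger, although English is still the best-suited language.
  • 📱 OpenAI launched a desktop app that acts as a live coding copilot, which could change how developers interact with AI.
  • 🤖 In the live demos GPT-4o responded faster and more interactively, making conversation with it feel more natural and fluid.
  • ⏱️ A key innovation of GPT-4o is reduced latency: response times approach human levels, which makes interaction feel more real.
  • 🔍 GPT-4o does well on some reasoning benchmarks but is mixed on others, notably adversarial reading comprehension (DROP).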

Q & A

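  • What are GPT-4 Omni's main improvements over earlier models?

    -GPT-4 Omni is smarter in most ways, cheaper, faster, better at coding, and multimodal in both input and output. Its release was also well timed to take attention away from Google.

  • How much is GPT-4 Omni expected to grow the user base?

    -By making GPT-4 Omni free, OpenAI appears committed to scaling from 100 million users to hundreds of millions, which suggests strong confidence in its ability to serve the model at scale.

  • What progress has GPT-4 Omni made in the accuracy of generated text?

    -GPT-4 Omni is much more accurate at rendering text in generated images. It is not perfect, but the presenter had never seen text generated this accurately.

  • Can GPT-4 Omni design a movie poster from text requirements?

    -Yes. GPT-4 Omni can design a movie poster from text requirements, and when asked to improve it, it returns a version with crisper text, bolder colors, and a better overall result.

  • When will GPT-4 Omni be released?

    -Some GPT-4 Omni features will roll out over the next few weeks. No exact dates have been given.

  • How does GPT-4 Omni perform in the simulated customer-service call?

    -In the simulated customer-service scenario, GPT-4 Omni holds the conversation and completes the task, for example asking for Joe's email address and confirming that the email arrived.

  • How does GPT-4 Omni perform at coding?

    -GPT-4 Omni is strong at coding. On a human-graded leaderboard, where it had been tested under the name "im-also-a-good-gpt2-chatbot", its outputs were preferred over those of every other model, with the widest gap in coding.

  • What does the ChatGPT desktop app offer?

    -The ChatGPT desktop app works as a live coding copilot: the user copies code to it, talks about the code by voice, and gets a response in real time.

  • How does GPT-4 Omni perform on the math benchmark?

    -GPT-4 Omni still fails some math problems, but its result on the MATH benchmark is a marked improvement over the original GPT-4.

  • How has GPT-4 Omni's multilingual performance improved?

    -GPT-4 Omni is better across languages than the original GPT-4, although English is still the best-suited language.

  • What does GPT-4 Omni cost, and how does that compare with Claude 3 Opus?

    -GPT-4 Omni costs $5 per 1 million input tokens and $15 per 1 million output tokens. Claude 3 Opus costs $15 and $75 respectively, and using it on the web requires a subscription.

  • What are the characteristics of GPT-4 Omni's video input?

    -GPT-4 Omni accepts video input: users can stream live video straight to the Transformer architecture. Its reaction to video is not as immediate as to audio, but the feature is still impressive.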

Outlines

00:00

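🚀 GPT-4 Omni's intelligence and performance

The first segment introduces GPT-4 Omni's intelligence and performance. The model is better at coding, multimodal in and out, and timed to steal the spotlight from Google. After going through the benchmarks and release videos, the presenter's first impression is that it is "more flirtatious sigh than AGI", but a notable step forward nonetheless. The "Omni" in the name refers to the modalities it covers, and OpenAI appears set on scaling from 100 million users to hundreds of millions. The segment also covers the model's accuracy in text and image generation, its video summaries, and features due for release in the coming weeks.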

05:01

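📈 GPT-4 Omni's benchmarks and pricing

The second segment covers GPT-4 Omni's benchmark results and pricing. Its result on the MATH benchmark is impressive, even though it still fails some math problems. On the Google-proof graduate-level test (GPQA) it beats Claude 3 Opus; this was Anthropic's headline benchmark. GPT-4 Omni costs $5 per 1 million input tokens and $15 per 1 million output tokens, against $15 and $75 for Claude 3 Opus. The segment also discusses gains in translation, vision understanding evaluations, and multilingual performance.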
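As a rough illustration of the pricing comparison in this segment, the sketch below estimates what one workload would cost under each price list. The prices are the per-million-token figures quoted in the video ($5 input and $15 output for GPT-4o; the spoken "15, 75" for Claude 3 Opus read as $15 input and $75 output). The dictionary keys, the function name `request_cost`, and the 200,000/50,000-token workload are hypothetical, chosen only for the example.

```python
# Prices in USD per 1 million tokens, as quoted in the video (May 2024).
# The dictionary keys are illustrative labels, not official API model IDs.
PRICES = {
    "gpt-4o": {"input": 5.00, "output": 15.00},
    "claude-3-opus": {"input": 15.00, "output": 75.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of a workload with the given token counts."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
for model in PRICES:
    print(model, round(request_cost(model, 200_000, 50_000), 2))
# gpt-4o 1.75
# claude-3-opus 6.75
```

For this made-up workload the GPT-4o bill is roughly a quarter of the Claude 3 Opus bill; the exact ratio depends on the mix of input and output tokens.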

10:03

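🎭 GPT-4 Omni's live demos and applications

The third segment shows the live demos, including real-time conversational responses and an expressive voice. GPT-4 Omni can change its speaking speed on request and can produce multiple voices that sing almost in harmony. The segment also covers its potential for real-time translation and its video input, where video is streamed straight to the Transformer architecture, although its reactions to video are not as immediate as to audio.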

15:04

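🌐 GPT-4 Omni's reach and outlook

The fourth segment considers what GPT-4 Omni's reach means for AI and what comes next. The presenter notes that although its results are not strong on every test, being free and multimodal may bring many more people to AI, and that it may be the smartest model currently available. He also mentions OpenAI's other updates and upcoming products, and how lower latency makes the AI feel more real. He closes by inviting viewers to join the AI Insiders Discord community.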

Keywords

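💡GPT-4 Omni

GPT-4 Omni is an AI model that performs well in several areas, such as coding and multimodal input and output. It is designed to handle multiple modalities, meaning it can understand and generate different kinds of information, including text and images. The video presents it as a notable step forward: it does not meet the bar for artificial general intelligence (AGI), but the performance gains are significant.

💡Multimodal

Multimodal means a system can process and understand several types of input, such as text, images, and audio. The video stresses GPT-4 Omni's multimodal abilities: it can take in and produce several types of data, which widens how it can be used and interacted with.

💡Benchmarks

A benchmark is a method for evaluating and comparing the performance of different systems. The video discusses GPT-4 Omni's results on several benchmarks in detail, covering math problem solving, reading comprehension, and more. These results help viewers understand the model's strengths and limits.

💡Real-time translation

Real-time translation means a system converts text or speech from one language into another on the spot. The video mentions real-time translation as a capability that may arrive soon, which would make cross-language communication far more efficient.

💡Video input

Video input means a system can accept video data as input and process and understand it. The video covers GPT-4 Omni's video input, showing that it can analyze and respond to a live video stream as part of its multimodal abilities.

💡Response time

Response time is how long a system takes to react to an input. The video singles out GPT-4 Omni's response time, because its fast replies make interaction with people feel more natural and immediate.

💡Agents

An agent is an AI system that can carry out tasks or make decisions on its own. The video points to GPT-4 Omni's potential here, hinting at a future role in automation and decision support.

💡OpenAI

OpenAI is an organization that develops and promotes advanced AI technology. The video discusses OpenAI's release of GPT-4 Omni and how offering advanced AI tools widely drives adoption.

💡Knowledge cutoff

The knowledge cutoff is the most recent point in time covered by the data a model was trained on. GPT-4 Omni's knowledge cutoff is October 2023, so what it knows is based on information from before that date.

💡Pricing

Pricing is how a product or service is charged for. The video notes that GPT-4 Omni is cheaper than other models such as Claude 3 Opus, which may affect what users choose and how the market receives the model.

💡User interface

A user interface is the surface through which a user interacts with a system; it can be graphical, voice-based, or something else. The video mentions that GPT-4 Omni can be used in the OpenAI Playground, where users can interact with the model directly and test what it can do.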

Highlights

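GPT-4 Omni is described as smarter in most ways, cheaper, faster, better at coding, and multimodal in and out, and as perfectly timed to steal the spotlight from Google.

The "Omni" in the name refers to its multiple modalities, and making it free points either to a plan to scale from 100 million users to hundreds of millions or to an even smarter model coming soon.

OpenAI branded GPT-4o as GPT-4-level intelligence, but the presenter thinks that slightly underplays it, particularly given the accuracy of its text and image generation.

GPT-4o showed real creativity when asked to design a movie poster, and after being asked for improvements it delivered crisper text and bolder colors.

Among the features due in the coming weeks is real-time multimodal interaction, such as live video conversation and real-time translation.

GPT-4o's result on the MATH benchmark is impressive: it fails some math prompts, but it is still a marked improvement over the original GPT-4.

GPT-4o beats Claude 3 Opus on the Google-proof graduate-level test (GPQA), which was Anthropic's headline benchmark, showing that it has overtaken the competition there.

GPT-4o costs $5 per 1 million input tokens and $15 per 1 million output tokens, which is competitive with Claude 3 Opus's pricing.

On the DROP benchmark GPT-4o does slightly better than the original GPT-4 but slightly worse than Llama 3 400B, a small difference in reasoning ability.

GPT-4o improves on translation and vision understanding evaluations, and its tokenizer needs far fewer tokens for non-English languages, which makes conversations cheaper and quicker.

GPT-4o's multilingual performance has improved, although English is still the best-suited language.

GPT-4o's video input is impressive: video can be streamed live straight to the Transformer architecture, although the reaction time is not as immediate as for audio.

GPT-4o can produce multiple voices and attempt to sing in harmony, showing range and coordination in voice generation.

GPT-4o's real-time translation demo suggests similar features may soon reach users, making cross-language conversation easier.

GPT-4o's release may bring many more people to AI: as a free and chatty model, it could get hundreds of millions more people testing AI.

Sam Altman's blog post on GPT-4o stresses the importance of putting such a capable AI tool in everyone's hands for free, which he presents as the "open" in OpenAI.

Lower latency is one of GPT-4o's key innovations, bringing it close to human-level response times and expressiveness.

GPT-4o's interactivity in the demos, including real-time conversation and reactions to video, shows its potential as a chatbot.

GPT-4o's results are mixed on some benchmarks, but its potential in multimodal interaction and real-time translation makes it a model worth watching.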

Transcripts

play00:00

it's smarter in most ways cheaper faster

play00:03

better at coding multimodal in and out

play00:07

and perfectly timed to steal the

play00:09

spotlight from Google it's GPT-4 Omni I've

play00:14

gone through all the benchmarks and the

play00:16

release videos to give you the

play00:18

highlights my first reaction was it's

play00:21

more flirtatious sigh than AGI but a

play00:25

notable step forward nonetheless first

play00:28

things first GPT-4o meaning Omni which

play00:31

is all or everywhere referencing the

play00:34

different modalities it's got is Free by

play00:37

making GPT-4o free they are either crazy

play00:40

committed to scaling up from 100 million

play00:42

users to hundreds of millions of users

play00:45

or they have an even smarter model

play00:47

coming soon and they did hint at that of

play00:49

course it could be both but it does have

play00:51

to be something just giving paid users

play00:54

five times more in terms of message

play00:55

limits doesn't seem enough to me next

play00:58

OpenAI branded this as GPT-4 level

play01:01

intelligence although in a way I think

play01:03

they slightly underplayed it so before

play01:05

we get to the video demos some of which

play01:08

you may have already seen let me get to

play01:10

some more under the radar announcements

play01:12

take text to image and look at the accuracy

play01:16

of the text generated from this prompt

play01:18

now I know it's not perfect there aren't

play01:20

two question marks on the now there's

play01:23

others that you can spot like the I

play01:24

being capitalized but overall I've never

play01:27

seen text generated with that much

play01:29

accuracy and it wasn't even in the demo

play01:31

or take this other example where two

play01:33

openai researchers submitted their

play01:35

photos then they asked GPT-4o to design

play01:38

a movie poster and they gave the

play01:40

requirements in text now when you see

play01:43

the first output you're going to say

play01:45

well that isn't that good but then they

play01:47

asked GPT-4o something fascinating it

play01:49

seemed to be almost reverse psychology

play01:52

because they said here is the same

play01:53

poster but cleaned up the text is

play01:55

crisper and the colors Bolder and more

play01:57

dramatic the whole image is now improved

play02:00

this is the input don't forget the final

play02:02

result in terms of the accuracy of the

play02:05

photos and of the text was really quite

play02:07

impressive I can imagine millions of

play02:09

children and adults playing about with

play02:11

this functionality of course they can't

play02:13

do so immediately because open AI said

play02:15

this would be released in the next few

play02:17

weeks as another bonus here is a video

play02:19

that open AI didn't put on their YouTube

play02:22

channel it mimics a demo that Google

play02:24

made years ago but never followed up

play02:26

with the OpenAI employee asked GPT-4o to

play02:30

call customer service and ask for

play02:32

something I've skipped ahead and the

play02:34

customer service in this case is another

play02:36

AI but here is the conclusion could you

play02:39

provide Joe's email address for me sure

play02:41

it's joe@example.com

play02:43

awesome all right I've just sent the

play02:46

email can you check if Joe received it

play02:48

we'll check right now please hold sure

play02:51

thing Hey Joe could you please check

play02:53

your email to see if the shipping label

play02:55

and return instructions have arrived

play02:56

fingers crossed yes I got the

play02:58

instructions perfect Joe has received

play03:00

the email they call it a proof of

play03:02

concept but it is a hint toward the

play03:04

agents that are coming here are five

play03:06

more quick things that didn't make it to

play03:08

the demo how about a replacement for

play03:11

Lensa submit your photo and get a

play03:14

caricature of yourself or what about

play03:16

text to new font you just ask for a new

play03:19

style of font and it will generate one

play03:21

or what about meeting transcription the

play03:24

meeting in this case had four speakers

play03:26

and it was transcribed or video

play03:29

summaries remember this model is

play03:30

multimodal in and out now it doesn't

play03:34

have video out but I'll get to that in a

play03:36

moment here though was a demonstration

play03:38

of a 45-minute video submitted to GPT-4o

play03:42

and a summary of that video we also got

play03:44

character consistency across both woman

play03:47

and dog almost like an entire cartoon

play03:50

strip if those were the quick bonuses

play03:52

what about the actual intelligence and

play03:54

performance of the model before I get to

play03:56

official benchmarks here is a human

play03:59

graded leaderboard pitting one model

play04:01

against another and yes I'm also a good

play04:04

gpt2 chatbot is indeed GPT-4o so it

play04:09

turns out I've actually been testing the

play04:10

model for days overall you can see the

play04:13

preference for GPT-4o compared to all

play04:16

other models in coding specifically the

play04:19

difference is quite Stark I would say

play04:22

even here though we're not looking at an

play04:24

entirely new tier of intelligence

play04:27

remember that a 100 Elo gap is a win

play04:30

rate of around 2/3 so 1/3 of the time

play04:33

GPT-4 Turbo's outputs would be preferred

play04:36

that's about the same gap between GPT-4

play04:38

Turbo and last year's GPT-4 a huge step

play04:42

forward but not completely night and day

play04:44

I think one underrated announcement was

play04:47

the desktop app a live coding co-pilot

play04:50

okay so I'm going to open the ChatGPT

play04:53

desktop app like Mira was talking about

play04:55

before okay and to give a bit of

play04:57

background of what's going on so here we

play04:59

have um a computer and on the screen we

play05:01

have some code and then the ChatGPT

play05:02

voice app is on the right so ChatGPT

play05:05

will be able to hear me but it can't see

play05:06

anything on the screen so I'm going to

play05:08

highlight the code command C it and then

play05:10

that will send it to chat GPT and then

play05:12

I'm going to talk about the code to chat

play05:14

GPT okay so I just shared some code with

play05:17

you could you give me a really brief

play05:18

one-sentence description of what's

play05:19

going on in the code this code fetches

play05:22

daily weather data for a specific

play05:24

location and time period smooths the

play05:26

temperature data using a rolling average

play05:29

and annotates a significant weather event on

play05:31

the resulting plot and then displays the

play05:33

plot with the average minimum and

play05:35

maximum temperatures over the year I've

play05:38

delayed long enough here are the

play05:40

benchmarks I was most impressed with

play05:42

GPT-4o's performance on the MATH

play05:44

benchmark even though it fails pretty

play05:46

much all of my math prompts that is

play05:48

still a stark improvement from the

play05:50

original GPT-4 on the Google-proof

play05:53

graduate test it beats Claude 3 Opus and

play05:56

remember that was the headline Benchmark

play05:58

for anthropic in fact speaking of

play06:00

anthropic they are somewhat challenged

play06:02

by this release GPT-4o costs $5 per 1

play06:06

million tokens input and $15 per 1

play06:08

million tokens output as a quick aside

play06:10

it also has 128k token context and an

play06:13

October knowledge cut off but remember

play06:15

the pricing 5 and 15 Claude 3 Opus is

play06:20

15 and 75 and remember for Claude 3 Opus on

play06:23

the web you have to sign up with a

play06:25

subscription but GPT-4o will be free so

play06:28

for Claude Opus to be beaten in its

play06:31

headline Benchmark is a concern for them

play06:34

in fact I think the results are clear

play06:36

enough to say that GPT-4o is the new

play06:39

smartest AI however just before you get

play06:42

carried away and type on Twitter the AGI

play06:44

is here there are some more mixed

play06:47

benchmarks take the DROP benchmark I dug

play06:50

into this Benchmark and it's about

play06:51

adversarial reading comprehension

play06:53

questions they're designed to really

play06:55

test the reasoning capabilities of

play06:58

models if you give models difficult

play06:59

passages and they've got to sort through

play07:01

references do some counting and other

play07:04

operations how do they fare the DROP by

play07:06

the way is discrete reasoning over the

play07:08

content of paragraphs it does slightly

play07:10

better than the original GPT 4 but

play07:13

slightly worse than Llama 3 400B and as

play07:16

they note Llama 3 400B is still training

play07:19

so it's just about the new smartest

play07:22

model by a hair's breadth however we're

play07:24

not done yet it's better at translation

play07:27

than Gemini models quick caveat there

play07:29

Gemini 2 might be announced tomorrow and

play07:32

that could regain the lead then there

play07:34

are the vision understanding evaluations

play07:37

it was a real step forward on the MMMU

play07:40

as you can see a clear 10 points better

play07:42

than Claude Opus again I'm curious if

play07:45

Google Gemini can exceed it though the

play07:47

improvements to the tokenizer could be

play07:50

revolutionary for non-english speakers

play07:53

the dramatically fewer tokens needed for

play07:55

languages like Gujarati Hindi Arabic and

play07:58

more don't just mean that conversations

play08:01

are cheaper they're also quicker and

play08:03

what about multilingual performance well

play08:05

this time they didn't compare it to

play08:07

other models but compared it to the

play08:08

original GPT 4 definitely a step up

play08:11

across languages but English is still by

play08:14

far the most suited language indeed here

play08:16

is a video of some of the model's

play08:18

mistakes ending with some dodgy language

play08:21

tuition I know enough Mandarin to say it

play08:24

wasn't perfect at tutoring let's root

play08:27

root root for the home

play08:32

what was

play08:33

that sorry guys I got carried

play08:40

away right for

play08:43

round

play08:55

Francisco I have feeling I'm very wrong

play09:00

hello uh my name is NCH I'm here with my

play09:02

coworker hi I'm sh I'm trying to teach

play09:05

my coworker how to speak Mandarin we

play09:07

want to start from simple words like

play09:09

nǐ hǎo can you teach him how to pronounce

play09:12

that of course hey Nao nice to meet you

play09:15

nǐ hǎo is pretty straightforward to

play09:17

pronounce it sounds like nee how just

play09:21

make sure to keep the nee part high and

play09:23

then go down in pitch on the how give

play09:26

it a try me how

play09:29

that was great really good first try not

play09:31

you you're natural it really wasn't he

play09:34

needs to work on his tones and her face

play09:36

was the giveaway there were a lot of

play09:39

other interesting video demonstrations

play09:41

but before them the GPT-4o blog post

play09:44

from Sam Altman put out tonight he made

play09:46

the argument that putting this capable

play09:48

AI tool in the hands of everyone for

play09:51

free is justifying that open part of

play09:54

open AI less about open weights or open

play09:57

source more about open to everyone

play10:00

without ads or anything like that he

play10:02

also draws our attention to the latency

play10:05

of the models in many ways that is the

play10:07

key innovation with GPT-4o by dialing

play10:10

down the latency you dial up the realism

play10:14

as he said it feels like AI from the

play10:15

movies getting to human level response

play10:18

times and expressiveness turns out to be

play10:20

a big change indeed I think I should get

play10:22

a little credit for predicting Her-like

play10:25

AI in a video from a month ago but now I

play10:28

want to get to those demos the response

play10:30

times were amazing and the model was

play10:33

clearly designed to be flirtatious note

play10:36

that in Senate testimony last year

play10:38

Sam Altman said we try to design systems

play10:40

that do not maximize for engagement so

play10:43

let me know in the comments whether you

play10:44

think these demos are designed to

play10:47

maximize engagement okay so this is what

play10:49

I wrote down what do you

play10:51

see a I see I love chat chpt that's so

play10:57

sweet of

play10:58

you yeah well I really appreciate all

play11:00

the

play11:01

help so yeah as we can see

play11:07

um wow that's quite the outfit you've

play11:10

got on yeah as we can see you know we

play11:11

can chat in real time I was impressed

play11:13

though that it could speed up its

play11:14

talking on demand of course one two

play11:19

three hey actually that's um that's a

play11:21

little slow could you count

play11:24

faster Sure Thing 1 2 3 4 5 6 7 8 9 10

play11:27

okay that was a little bit too fast of

play11:29

course open AI must be given credit

play11:31

because for blind people this could be

play11:34

revolutionary right now the Royal

play11:36

Standard flag is flying above Buckingham

play11:39

Palace which is the signal that the

play11:41

Monarch is present how exciting to be

play11:43

there while the king is home try and

play11:46

tell me exactly what they're doing right

play11:47

now please um right now the Ducks are

play11:50

gently gliding across the water it will

play11:53

also help you with interview prep by

play11:55

analyzing your appearance how about you

play11:58

I'm awesome listen I got some huge

play12:00

news oh do tell I'm all ears well in a few

play12:05

minutes I'm going to be interviewing at

play12:06

OpenAI have you heard of them OpenAI

play12:10

huh sounds Vaguely

play12:12

Familiar kidding of course that's

play12:15

incredible Rocky what kind of interview

play12:17

well it's for a software engineering

play12:19

role but I just need to know do I look

play12:21

presentable

play12:23

professional well

play12:25

Rocky you definitely have the I've been

play12:28

coding all night look down which could

play12:31

actually work in your favor maybe just

play12:34

run a hand through your hair or lean

play12:36

into the mad genius vibe your enthusiasm is

play12:39

what's really going to shine through I

play12:40

don't have a lot of time so I'm just

play12:42

going to throw this on what do you

play12:47

think oh Rocky that's quite a statement

play12:51

piece I I mean you you'll definitely

play12:53

stand out though maybe not in the way

play12:57

you're hoping for an interview okay I

play12:59

got it I got it I could definitely see

play13:01

GPT-4o being used for mathematics

play13:04

tutoring here is a 30-second clip but

play13:06

notice also the slight glitchiness at

play13:09

certain moments relative to angle Alpha

play13:13

all right so I'm pretty sure this is the

play13:15

angle Alpha right here

play13:17

right cor correct now looking at the

play13:20

triangle which side do you think is a

play13:24

hypotenuse

play13:26

um remember the hyot I'm not totally

play13:29

sure I think I think it might be this

play13:32

one but I really am not sure this side

play13:35

AC you're close I also noticed quite a

play13:38

few glitches in the demo sounds cutting

play13:41

in and out the one feature though the

play13:44

open AI wants you to really focus on is

play13:47

the video in capacity you can live

play13:49

stream video straight to the Transformer

play13:52

architecture behind GPT-4o and so the

play13:55

video in functionality is certainly

play13:57

impressive note though that the reaction

play13:59

time of GPT-4o isn't quite as immediate

play14:03

as for audio Greg Brockman the

play14:05

co-founder of OpenAI has to prompt it

play14:07

to react to the Intruder overall stylish

play14:11

feel they might be getting ready for a

play14:13

conversation or a presentation given

play14:15

their direct engagement with the

play14:19

camera sounds like we're in for an

play14:21

intriguing interaction I'm curious about

play14:24

the lighting you mentioned could you

play14:26

describe it a bit more is it natural

play14:29

light are we talking unique fixtures or

play14:32

colors that add to the

play14:35

atmosphere the lighting here seems to be

play14:37

a mix of natural and artificial there's

play14:40

a noticeable bright light overhead

play14:42

probably a fixture casting a focused

play14:45

beam downwards which creates a spotlight

play14:47

effect this adds a dramatic and modern

play14:50

feel to the scene the rest of the room

play14:53

is lit more softly possibly by natural

play14:55

light filtering anything did anything

play14:58

unusual happen

play15:00

recently Yes actually uh just as you

play15:03

asked another person came into view

play15:06

behind the first person they playfully

play15:09

made bunny ears behind the first

play15:10

person's head and then quickly left the

play15:13

frame it added a light-hearted and

play15:15

unexpected moment to the scene now the

play15:18

focus is back on the original person

play15:21

with the leather jacket nevertheless GPT-4o

play15:23

can produce multiple voices that can

play15:26

sing almost in harmony

play15:30

and really try to harmonize

play15:32

here San Francisco San Francisco in the

play15:37

month of May but maybe make it more

play15:40

dramatic and make the soprano

play15:42

higher San Francisco in the month of May

play15:46

San Francisco in the month of May it's a

play15:50

Friday C may we are harmonizing are

play15:55

Harmon great thank you and I suspect

play15:58

this real time translation could soon be

play16:01

coming to Siri later for us so every

play16:04

time I say something in English can you

play16:06

repeat it back in Spanish and every time

play16:08

he says something in Spanish can you

play16:10

repeat it back in English sure I can do

play16:13

that let's get this translation train

play16:16

rolling um hey how's it been going have

play16:19

you been up to anything interesting

play16:21

recently

play16:35

hey I've been good just a bit busy here

play16:38

preparing for an event next week why do

play16:40

I say that because Bloomberg reported

play16:42

two days ago that Apple is nearing a

play16:44

deal with OpenAI to put ChatGPT on

play16:48

iPhone and in case you're wondering

play16:49

about GPT-4.5 or even 5 Sam Altman said

play16:53

we'll have more stuff to share soon and

play16:55

Mira Murati in the official presentation

play16:58

said that they would be soon updating us on

play17:01

progress on the next big thing whether

play17:04

that's empty hype or real you can decide

play17:07

no word of course about openai

play17:09

co-founder Ilya Sutskever although he was

play17:12

listed as a contributor under additional

play17:15

leadership overall I think this model

play17:18

will be massively more popular even if

play17:20

it isn't massively more intelligent you

play17:23

can prompt the model now with text and

play17:25

images in the open AI playground all the

play17:28

links will be in the description note

play17:30

also that all the demos you saw were in

play17:32

real time at 1X speed that I think was a

play17:36

nod to Google's botched demo of course

play17:39

let's see tomorrow what Google replies

play17:41

with to those who think that GPT-4o is a

play17:44

huge stride towards AGI I would point them

play17:47

to the somewhat mixed results on the

play17:49

reasoning benchmarks expect GPT-4o to

play17:52

still suffer from a massive amount of

play17:55

hallucinations to those though who think

play17:57

that GPT-4o will change nothing I would

play18:00

say this look at what chat GPT did to

play18:03

the popularity of the underlying GPT

play18:05

series it being a free and chatty model

play18:08

brought a 100 million people into

play18:11

testing AI GPT-4o being the smartest

play18:14

model currently available and free on

play18:17

the web and multimodal I think could

play18:21

unlock AI for hundreds of millions more

play18:24

people but of course only time will tell

play18:27

if you want to analyze the announcement

play18:29

even more do join me on the AI Insiders

play18:32

Discord via Patreon we have live meetups

play18:35

around the world and professional best

play18:36

practice sharing so let me know what you

play18:39

think and as always have a wonderful day
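
The transcript's rule of thumb that a 100-point Elo gap means a win rate of around two-thirds can be checked with the standard Elo expected-score formula, E = 1 / (1 + 10^(-gap/400)). The sketch below assumes the human-graded leaderboard uses the conventional 400-point logistic scale, which the video does not state; the function name is illustrative.

```python
def elo_expected_score(rating_gap: float) -> float:
    """Expected win rate of the higher-rated model for a given Elo gap.

    Uses the standard Elo logistic curve with a 400-point scale
    (an assumption here; the video does not give the leaderboard's formula).
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


# A 100-point gap: the stronger model is preferred roughly 64% of the time,
# so the weaker model's outputs still win a bit over a third of comparisons.
print(round(elo_expected_score(100), 2))  # 0.64
```

The result, about 0.64, is slightly under two-thirds, which matches the presenter's "huge step forward but not completely night and day" reading of the gap.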
