OpenAI Releases GPT-4o Mini! How Does It Compare To Other Models?

SVIC Podcast
18 Jul 2024 · 21:02

Summary

TLDR: This podcast episode discusses the latest developments in the AI industry, in particular the launch of the low-cost GPT-4o mini, which is meant to expand the range of AI applications and make AI more accessible. It covers model improvements and falling costs, pointing toward a future where AI is integrated into every app and every website. The challenges of competing with Google are also addressed.

Takeaways

  • 😀 The podcast had technical difficulties, but carries on with discussions of AI, business, and comedy.
  • 🛍️ The host is looking for a job at a SaaS marketing company or in a product role, and stresses the importance of AI for the future.
  • 🚗 He talks about his Cybertruck and his financial need to close a deal in order to pay for it.
  • 📈 OpenAI introduced GPT-4o mini, presented as its most cost-efficient small model, with the goal of expanding the range of AI applications and making AI more accessible.
  • 💰 GPT-4o mini is an order of magnitude cheaper than previous frontier models and more than 60% cheaper than GPT-3.5 Turbo, at 15 cents per million input tokens and 60 cents per million output tokens.
  • 🔍 GPT-4o mini supports text and vision in the API, with support for video and audio inputs and outputs coming in the future.
  • 📚 The model has a context window of 128,000 tokens, supports up to 16k output tokens per request, and has knowledge up to October 2023.
  • 📉 An improved tokenizer makes GPT-4o mini more cost-effective at handling non-English text, alongside superior textual capabilities.
  • 🏆 GPT-4o mini surpasses GPT-3.5 Turbo and other small models on academic benchmarks in both textual intelligence and multimodal reasoning.
  • 🔑 GPT-4o mini's function-calling support lets developers build applications that fetch data or take actions with external systems, and it improves long-context performance compared to GPT-3.5 Turbo.
  • 📊 The benchmark results presented show GPT-4o mini outperforming other small models across several areas, including math and coding.
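At the prices quoted in the takeaways (15 cents per million input tokens, 60 cents per million output tokens), a back-of-the-envelope cost estimate is straightforward. A minimal sketch using the episode's figures, not an official pricing calculator:

```python
# Estimate GPT-4o mini API cost from the per-million-token prices
# quoted in the episode (assumptions: $0.15/M input, $0.60/M output).

INPUT_PER_M = 0.15   # USD per 1M input tokens
OUTPUT_PER_M = 0.60  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated request cost in USD."""
    return (input_tokens / 1_000_000) * INPUT_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PER_M

# A typical chatbot turn: 2,000 prompt tokens in, 500 tokens out.
cost = estimate_cost(2_000, 500)
print(f"${cost:.6f}")  # $0.000600
```

At these rates a single support-bot exchange costs a small fraction of a cent, which is the episode's core point about why LLMs can now go "everywhere."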

Q & A

  • What is the main topic of the podcast?

    -The main topic of the podcast is a discussion of artificial intelligence, business strategy, and comedy.

  • What is the current price for processing input tokens with GPT-4o mini?

    -GPT-4o mini costs 15 cents per million input tokens.

  • How much does it cost to process output tokens with GPT-4o mini?

    -GPT-4o mini costs 60 cents per million output tokens.

  • What is the comparison price for input tokens with GPT-3.5 Turbo?

    -GPT-3.5 Turbo costs 50 cents per million input tokens.

  • What capabilities does GPT-4o mini offer?

    -GPT-4o mini supports text and vision in the API, with support for text, image, video, and audio inputs and outputs coming in the future.

  • How large is GPT-4o mini's context window?

    -GPT-4o mini's context window is 128,000 tokens, and it supports up to 16,000 output tokens per request.

  • What is the main cost advantage of the Batch API?

    -The Batch API can halve the cost of processing input and output tokens, because it is meant for tasks that do not require immediate results.

  • What does Superhuman's email tracking offer?

    -Superhuman's email tracking shows when and on which device recipients open emails, by embedding a tiny tracking pixel in sent messages.

  • What is OpenAI's goal regarding cost reduction and model capability?

    -OpenAI's goal is to keep driving costs down while improving model capabilities, enabling seamless integration of models into every app and on every website.

  • What is the current state of OpenAI's development regarding voice?

    -OpenAI plans to launch the voice feature this month, with general availability planned for some time afterwards.
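The Batch API discount described above can be folded into the same kind of estimate. A sketch assuming the 50% discount applies to both input and output tokens, using the per-million prices quoted in the episode:

```python
# Compare realtime vs. batch cost for a bulk job, assuming the
# Batch API halves both the input and output token prices
# ($0.15/M in, $0.60/M out, as quoted in the episode).

def job_cost(requests: int, in_tok: int, out_tok: int,
             batch: bool = False) -> float:
    """Estimated USD cost of `requests` identical calls."""
    discount = 0.5 if batch else 1.0
    per_call = (in_tok / 1e6) * 0.15 + (out_tok / 1e6) * 0.60
    return requests * per_call * discount

# Summarizing 10,000 documents overnight (no need for instant results):
realtime = job_cost(10_000, 3_000, 300, batch=False)
batched = job_cost(10_000, 3_000, 300, batch=True)
print(round(realtime, 2), round(batched, 2))  # 6.3 3.15
```

This is exactly the "deprioritized queue for a cheaper price" trade-off the hosts describe: background jobs like pre-processing or index building take the discount, interactive traffic pays full rate.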

Outlines

00:00

😀 AI, Business, and Comedy Podcast

The podcast opens with a discussion of technical difficulties and an overview of its topics: AI, business, and comedy. The host mentions that he is looking for a role at a SaaS marketing or product company. The episode then covers the release of GPT-4o mini, a low-cost AI model intended to expand the range of AI applications by making intelligence more affordable. GPT-4o mini scores 82% on MMLU, is an order of magnitude cheaper than previous frontier models, and is more than 60% cheaper than GPT-3.5 Turbo. Its text and vision API capabilities and the future support for video and audio inputs and outputs are also discussed.

05:02

😉 Model Development and Its Applications

This section covers the challenges of AI model development, with a particular focus on the experience of companies like Superhuman, which are under pressure from the wave of AI model integration. The hosts also discuss the use of tracking pixels in emails and advertising to track and personalize user behavior; the speaker praises tracking pixels and their importance to marketing strategy. Maage's role as chat moderator, able to moderate unwanted posts, is also mentioned.

10:02

😲 AI Models Are Getting Cheaper

The hosts discuss the significant price cuts for AI models, in particular GPT-4o mini, which costs 15 cents per million input tokens and 60 cents per million output tokens, roughly a tenth of the cost of earlier models. They also cover the Batch API, which halves the cost of tasks that do not need immediate processing. The speaker criticizes OpenAI's convoluted pricing structure and compares the cost of various models, underscoring the substantial cost reduction GPT-4o mini represents.

15:03

🤔 AI Models as Tools for Scientific Hypotheses

This section highlights the ability of current AI models to generate scientific hypotheses that can then be tested and refined. The speaker shares that he attended an OpenAI developer event where the use of AI in capturing the first image of a black hole was discussed. The hosts also consider how AI models might operate in an environment where they know no physical theories but can derive theories from experience, and the speaker is intrigued by the idea that AI models could one day develop their own scientific theories.

20:04

😎 OpenAI's Upcoming Voice Features

The final section focuses on OpenAI's upcoming voice features, which will be generally available soon. The hosts also discuss OpenAI's pricing and compare it with other providers such as Claude, pointing to GPT-4o mini's favorable 15 cents per million input tokens versus 300 cents per million input tokens for Claude 3. The episode closes with a call to like, subscribe, and support the show on its website.


Keywords

💡AI

Artificial intelligence (AI) refers to computer programs or machine learning systems capable of performing tasks that normally require human cognition. AI is the central topic of the video, particularly its application in business and marketing strategy and in entertainment; the use of AI in podcasts and chatbots is mentioned as an example.

💡SaaS

SaaS stands for Software as a Service, a form of cloud computing in which software is hosted on the internet rather than locally on the user's computer. In the video, SaaS comes up in connection with marketing and selling software to the enterprise market.

💡GPT

GPT (Generative Pre-trained Transformer) is a type of AI model used to generate text. The video introduces GPT-4o mini as a newer GPT model that is cheaper and more capable than earlier models, and discusses its improved text handling and the falling cost of using GPT models.

💡MMLU

MMLU (Massive Multitask Language Understanding) is a benchmark used to measure the capability of language models. The video mentions that GPT-4o mini scores 82% on MMLU, highlighting its strength relative to other small models.

💡Cost efficiency

Cost efficiency refers to delivering a task or service without incurring unnecessary cost. The video highlights GPT-4o mini's cost efficiency, particularly the reduced price per input and output token, which makes AI more attractive for businesses.

💡Chatbots

Chatbots are programs that interact with people through chat interfaces. The video discusses the use of chatbots in customer support, where AI is used to generate fast, effective text responses.

💡Vision

Vision refers to the ability to recognize and interpret images or visual data. The video highlights GPT-4o mini's ability to process both text and visual input, enabling AI applications in areas such as image and video analysis.

💡Token

In language models, a token is a unit of text that models like GPT use to process language. The video discusses the cost structure for input and output tokens when using AI models.

💡Batch API

A batch API processes data in batches rather than one request at a time, which can reduce cost. The video discusses using OpenAI's Batch API to cut processing costs, particularly when results are not needed immediately.

💡Superhuman

Superhuman is an email client that uses AI to improve email management. The video highlights Superhuman's features, particularly its ability to track emails and tell users when recipients open them.

💡Tracking pixel

A tracking pixel is a tiny image embedded in an email or web page to track user behavior. The video discusses Superhuman's use of tracking pixels in emails to see when, and on which device, an email was opened.
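The tracking-pixel mechanism described above is simple enough to sketch end to end. A hypothetical minimal version (the URL, IDs, and function names are illustrative, not Superhuman's actual implementation): the sender embeds a 1×1 image whose URL carries a message ID, and the server logs each time that image is fetched.

```python
# Minimal sketch of email open tracking via a 1x1 pixel.
# The URL and IDs are hypothetical; real products use the same
# basic idea with more metadata.
import base64

# Smallest common transparent 1x1 GIF, base64-encoded.
PIXEL_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)

def pixel_tag(message_id: str) -> str:
    """HTML snippet the sender embeds in the outgoing email."""
    return (f'<img src="https://track.example.com/open.gif'
            f'?mid={message_id}" width="1" height="1" alt="">')

opens = []  # server-side log of (message_id, user_agent)

def serve_pixel(message_id: str, user_agent: str) -> bytes:
    """What the tracking server does when the image is requested."""
    opens.append((message_id, user_agent))  # the "snitch" moment
    return PIXEL_GIF  # served with Content-Type: image/gif

# Recipient's mail client loads remote images -> the open is logged.
serve_pixel("msg-42", "Mozilla/5.0 (iPhone)")
print(opens)  # [('msg-42', 'Mozilla/5.0 (iPhone)')]
```

The same pattern, with richer query parameters, underlies the ad and web-analytics tracking the hosts describe; mail clients that block remote images defeat it.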

Highlights

Introduction of GPT-4o mini, a cost-efficient AI model aimed at making AI applications more affordable and widespread.

GPT-4o mini scores 82% on MMLU and outperforms GPT-4 on chat preferences on the LMSYS leaderboard.

GPT-4o mini is priced at 15 cents per million input tokens and 60 cents per million output tokens, significantly cheaper than previous models.

GPT-4o mini enables a broad range of low-cost, low-latency tasks, including customer support chatbots.

Support for text and vision in the API, with support for image, video, and audio inputs and outputs coming in the future.

Model context window of 128,000 tokens and knowledge up to October 2023, so no coverage of recent events like Terrence Howard's "new math."

An improved tokenizer makes GPT-4o mini's handling of non-English text more cost-effective.

GPT-4o mini surpasses GPT-3.5 Turbo and other small models on academic benchmarks for textual intelligence and multimodal reasoning.

GPT-4o mini's strong function-calling performance enables developers to build applications that interact with external systems.

Evaluation of GPT-4o mini across key benchmarks shows its superiority in reasoning tasks involving text and vision.

GPT-4o mini excels in mathematical reasoning and coding tasks, outperforming previous small models.

Partnerships with companies like Ramp to understand use cases and limitations of GPT-4o mini.

Discussion of the impact of GPT-4o mini on companies like Superhuman, which faces pressure as incumbents integrate AI features.

The importance of tracking pixels in emails for sales and marketing, similar to their use in advertising and web analytics.

GPT-4o mini's pricing is roughly 10 times cheaper than previous models, making AI more accessible and easier to integrate into daily digital experiences.

OpenAI's commitment to reducing costs while enhancing model capabilities, paving the way for developers to build AI applications more efficiently.

Introduction of the temporary chat feature in GPT-4o, whose conversations don't appear in chat history for safety and compliance reasons.

Discussion of the potential of AI models to create their own scientific hypotheses, as demonstrated at an OpenAI developer event.
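The function calling mentioned in the highlights boils down to the model emitting a structured call (a function name plus JSON-encoded arguments) that the application then dispatches against an external system. A minimal dispatcher sketch, with a hypothetical `get_weather` tool and a hand-written call payload standing in for a real model response:

```python
# Sketch of the application side of function calling: the model
# returns {"name": ..., "arguments": ...} and the app executes it.
# `get_weather` and its data are hypothetical stand-ins.
import json

def get_weather(city: str) -> dict:
    """Pretend external system (a real app would call an API here)."""
    fake_db = {"Berlin": 21, "Tokyo": 27}
    return {"city": city, "temp_c": fake_db.get(city)}

TOOLS = {"get_weather": get_weather}

def dispatch(tool_call: dict) -> dict:
    """Run the function the model asked for and return its result."""
    fn = TOOLS[tool_call["name"]]
    args = json.loads(tool_call["arguments"])  # arguments arrive as JSON text
    return fn(**args)

# What a model's function call might look like (hand-constructed):
call = {"name": "get_weather", "arguments": '{"city": "Berlin"}'}
print(dispatch(call))  # {'city': 'Berlin', 'temp_c': 21}
```

In a real loop the dispatch result is sent back to the model as a tool message so it can compose the final answer; strong function-calling performance means the model picks the right tool and emits valid argument JSON more reliably.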

Transcripts

play00:00

welome s podcast had technical

play00:01

difficulties on on my side it's just you

play00:03

know it's the way it is it's not like I

play00:04

work for a tech company or anything uh

play00:06

we talk about AI business and comedy uh

play00:08

we also have uh it's D minus comedy and

play00:13

um yeah like And subscribe so good news

play00:16

today that right very that was very

play00:18

compelling very enthusiastic compelling

play00:21

I'm trying to get a job at either

play00:22

Enterprise markeet SAS marketing company

play00:24

or either either in SAS sales or you

play00:27

what you GNA look product works you just

play00:30

pay me now come on Jesus Christ just pay

play00:34

me already yeah I gotta pay for the

play00:36

Cyber truck because it's always falling

play00:38

apart now and I need you to really close

play00:40

this deal okay so gb4 40 ounce mini

play00:44

advancing cost efficient intelligence

play00:47

yeah little it's a 40 ounce right there

play00:48

no Z in there I you're right fair enough

play00:53

uh a z in my heart is what I imagine

play00:56

it's being you're just upset cuz they

play00:57

went off brand and put an O there

play00:59

instead of saying dpd5 yeah could they

play01:02

do like four I don't yeah I even even

play01:06

Sam Walman mentions like uh where does

play01:08

he say you guys need a naming scheme

play01:10

revamp so bad he's like yes we

play01:13

do uh yeah don't go Nintendo on this N64

play01:17

64 DD Wii Wii U Etc it's just confusing

play01:21

so okay well so yep gbd4 o mini has been

play01:26

released um and so today we're

play01:29

announcing GP GPD uh 40 mini our cost

play01:33

our most cost efficient small model we

play01:35

expect GPD 40 mini will significantly

play01:37

expand the range of applications built

play01:39

with AI by making intelligence much more

play01:41

affordable 4 mini scores 82% of mlu and

play01:44

currently outperforms gp4 on chat pref

play01:47

preferences an L LMS leaderboard okay

play01:52

it's priced at 15 cents per million

play01:54

input tokens and 60 cents per million

play01:57

output tokens an order of magnitude more

play01:59

affordable than previous Frontier models

play02:01

and more than 60% cheaper than GPT 3.5

play02:04

turbo yay that's what we like to hear

play02:07

tokens getting cheaper gp4 mini enables

play02:10

a broad range of tasks with low cost and

play02:12

latency such as applications that chain

play02:14

or paralyze multiple model calls calling

play02:17

multiple apis pass a large volume of

play02:19

context to the model full code base or

play02:22

conversation history or interact with

play02:23

customers through fast real-time text

play02:25

responses customer support chat bots so

play02:28

basically it I like this trend of it's

play02:31

getting cheap cheaper and cheaper and

play02:32

cheaper which then means there's no

play02:33

excuse for companies not just to put

play02:35

llms everywhere just every any interface

play02:38

that I can type something into I want an

play02:40

llm there it just so I'm very happy

play02:43

about this today gp4 mini supports text

play02:46

and vision in in the API with support

play02:49

for text image video and audio inputs

play02:51

and outputs coming in the future the

play02:54

model has a context window of 128,000

play02:57

tokens supporting up to 16k Output

play02:59

tokens per request and has knowledge up

play03:01

to October 2023 so

play03:05

unfortunately it doesn't have Terrence

play03:08

Howard's new math in it so I this

play03:11

thing's already this thing's already

play03:12

flawed can't have everything exactly

play03:14

thanks to the improved tokenizer sh gp40

play03:17

handling non-english text is now more

play03:19

cost effective a small model with

play03:21

Superior text textual intelligence and

play03:23

multimodel reasoning gp40 mini surpasses

play03:26

GPT 3.5 turbo and other small models and

play03:28

academic benchmarks both textual

play03:30

intelligence and multimol reasoning and

play03:32

supports the same range of language as

play03:33

GPT 40 it also demonstrates strong

play03:36

performance and function calling which

play03:38

can enable developers to build

play03:39

applications that fetch data or take

play03:41

actions with external systems and

play03:43

improves long context performance

play03:45

compared to GPT 3.5 turbo GPD 40 mini

play03:48

has been evaluated across several key

play03:50

benchmarks reasoning task gp40 mini is

play03:52

better than other small models at

play03:54

reasoning tasks involving text and

play03:56

vision according to 82% mlu a textual

play03:59

intelligence raising Benchmark as

play04:01

compared to 77.9% of Gemini Flash and 73

play04:04

73.8% on Claude Hau math and coding

play04:07

proficiency 40 mini excels in

play04:09

mathematical reasoning and coding task

play04:11

outperforming previous small models on

play04:13

Market mgsm measuring with reasoning gp4

play04:16

mini scored 87% compared to blah blah

play04:19

blah blah blah okay so just that was

play04:22

your summary for that benchmarks yeah so

play04:25

here's orange these are very pretty I'll

play04:27

say it's very pretty colors so here's

play04:28

all the benchmarks ml U GP QA drop

play04:31

whatever and the Orange is mini and the

play04:34

yellow's Flash and so and there's GPD 40

play04:39

so it looks like it's

play04:43

outperforming all of the smaller

play04:45

models I guess which is cool but again

play04:49

like you know everyone's use case is

play04:51

different so I want to hear from people

play04:53

how they're going to be using well no

play04:55

it's just it's like look we're better at

play04:57

like one point percentage Point than

play04:59

this and people on Twitter who are like

play05:01

oh my God this changes the game and then

play05:04

three weeks four weeks later people like

play05:06

actually I'm going back to my previous

play05:08

model I like my previous model better

play05:10

and so it's the same circle jerk keeps

play05:12

going on as part of model development

play05:14

process work with a handful of trusted

play05:16

Partners better understand use cases and

play05:17

limitations of 40 mini we partnered with

play05:20

uh companies like ramp I haven't used

play05:22

ramp before what do they do spending

play05:25

made smarter easy use cards spend limits

play05:27

approval flows vendor payments and more

play05:29

plus average savings of 5% is this a is

play05:32

ramp a Zer company let's see

play05:35

here when was ramp um I'm using regular

play05:39

Google search 2019 so H tail tail end of

play05:44

zerp so Z zero interest rate phenomenon

play05:48

okay so let's see here um where we

play05:52

superhuman now superhumans is getting

play05:53

pressure because they were we're going

play05:55

to improve email and actually yesterday

play05:57

I was an open AI event and I was

play05:59

speaking to to an engineer who just left

play06:01

superhuman and they said they were

play06:03

running like once all these LMS came out

play06:05

they were praying to God that g uh Gmail

play06:07

didn't integrate them quick enough so

play06:09

they were trying to integrate them

play06:10

faster and now Gmail's slowly in

play06:12

integrating them and this this company's

play06:14

under tons and tons of pressure because

play06:17

they want you to pay 30

play06:21

bucks a month for their email that uses

play06:24

all these different AI functionalities

play06:26

but there are but they're just plugging

play06:27

in chat GPT already so with Gmail coming

play06:30

out and basically slowly integrating

play06:32

their own Gemini models then it really

play06:34

puts pressure on this company so okay um

play06:38

GP I thought the big deal about

play06:39

superhuman was that you could tell if

play06:41

people

play06:42

had uh seen your email or opened it or

play06:46

scrolled through different parts of it

play06:48

oh again that snitch functionality yeah

play06:51

isn't the snitch functionality isn't

play06:53

that the part that everyone's really

play06:54

excited about yeah let's unsubscribe and

play06:57

clear spam instantly snooze email for

play06:59

later fly through your inbox get more

play07:00

time back let's see here let's go to our

play07:03

friend over here it has a nice UI as

play07:06

well it looks beautiful does

play07:08

superhuman allow you to track when

play07:11

people open your email okay let's

play07:16

see yeah super allows users to track

play07:18

when recipients open their emails

play07:20

through a feature called snitch or no

play07:21

read statuses here are the key details

play07:24

about superhumans Email tracking

play07:25

capabilities snitch is a better

play07:26

marketing snitch show one show when and

play07:29

on which device recipients open your

play07:32

emails this feature works by embedding a

play07:34

tiny what perplexity what did you just

play07:37

do there okay this feature works by

play07:39

embodying a tiny tracking pixel image in

play07:41

sent emails when recipients open email

play07:43

it loads the image allowing superum to

play07:45

log when it was opened users can enable

play07:47

read statuses by using command K on

play07:49

desktop okay so snitch feature good to

play07:51

call Joe so it snitches on you and use

play07:53

AI this little tracking pixel idea is

play07:56

employed in a whole bunch of different

play07:57

ways all over the internet mhm to great

play08:00

effect I mean a lot of people who use

play08:02

superhuman really want to know you know

play08:04

I'm trying I'm a Salesman I'm trying to

play08:05

send somebody email about my product I

play08:08

want to know if they've opened it yet

play08:10

right especially if I'm hoping to call

play08:12

them or meet with them I want to know if

play08:14

they spent time with it like did they

play08:15

scroll through it right and then if I if

play08:18

I attached a presentation did they open

play08:21

the

play08:22

presentation did they go to certain

play08:24

slides in the presentation where did

play08:26

they which slide did they spend the most

play08:28

time on those are all really interesting

play08:30

and valuable uh pieces of information

play08:32

before I meet with that person exactly

play08:34

so you can tailor tailor your pitch and

play08:36

you know and right and then at the same

play08:39

time all across the advertising

play08:41

ecosystem uh advertisements and the

play08:43

pages that they lead to all have

play08:45

tracking pixels in them right and all

play08:47

the open content sites have tracking

play08:49

pixels in them so that I can know about

play08:52

when advertising is followed up on or

play08:54

viewed I can know if uh when a person

play08:57

comes to my ad or to my website I can

play08:59

know know what all they've done in the

play09:00

past what their interests are and so on

play09:03

all because these stupid tracking pixels

play09:05

are embedded in everything God bless

play09:07

them God bless tracking pixels always

play09:09

watching comforting us there providing

play09:11

us the data that we need comforting us

play09:13

God bless yeah they'll never browse

play09:15

alone they Rock Me asleep at night and

play09:16

tuck me in it's fantastic that's right

play09:19

okay uh also kudos to maage he's been

play09:22

promoted to chat mod and so basically

play09:25

when we get random AI CS who come in

play09:27

here and just put in walls of text that

play09:30

confuses me he can Now commute them so

play09:32

first line of defense yes and he gets a

play09:33

nice little right exactly it's a nice

play09:35

little wrench in his profile which is

play09:36

dope nice um we had Frisco fat sees say

play09:41

tuning and muted during a doctor's

play09:43

appointment that's the type of

play09:44

dedication we want of this show that is

play09:46

dedication the doctor could come in and

play09:48

be like it's not looking good but but

play09:51

Frisco has a headphones on he's

play09:52

listening to the sick podcast laughing

play09:54

and doctor's like wow he took that

play09:56

really well and it's because he wasn't

play09:57

paying attention he was listening to us

play09:58

so thank you way getting back to GPT

play10:01

that's right that's why we're here yes

play10:03

this thing is dramatically cheap exactly

play10:06

exactly I if you look at the summary

play10:09

page deep inside of opening ey's

play10:12

horrible marketing layout There's a

play10:14

summary page lists all the old models

play10:16

and how much input and output uh what

play10:18

the cost is for input and output tokens

play10:21

yeah and they're all like numbers like

play10:22

you know $10 a million all the way going

play10:25

back to the ancient models like $2 a

play10:28

million or 40 C a million right but then

play10:31

you look at the crazy numbers for 40 and

play10:33

it's like 15 cents a million or

play10:35

something crazy it's really cheap yeah

play10:38

mini yeah says it's uh 15 cents a

play10:42

million input and 60 cents a million

play10:45

output which is like 10 times

play10:48

cheaper that's see that's the Benchmark

play10:52

stuff that's great and everything and

play10:53

The Benchmark stuck was like more

play10:54

approved like hey you're not going to

play10:55

see a diminishment in like value but for

play10:58

me what I really care about is like

play10:59

getting cheaper and that's a big story

play11:01

so super duper cheap because you look

play11:03

GPT 3.5 turbo 50 cents per a million

play11:07

input tokens and then mini it's 15 cents

play11:10

and then let's keep on going do they

play11:12

have let's go back forther pretty crazy

play11:14

yeah fine tune models $3 per

play11:17

millon what else they have here have you

play11:19

ever used this batch API they're talking

play11:21

about no I have not I don't even know

play11:24

the the they're saying if you if you use

play11:25

the batch API they'll they'll cut your

play11:28

cost again it looks like in half mhm uh

play11:32

and the batch API is just like you don't

play11:34

need the result right away like you can

play11:35

wait for it Ah that's see this is

play11:38

exactly so you know zapier I hate their

play11:41

pricing their pricing is ridiculous they

play11:42

want like $30 a month and I barely do

play11:45

any tasks on it and one of our Engineers

play11:49

uh Braun draa who is a awesome person

play11:53

was like what they should do is a tier

play11:54

for Jordan where it's basically you

play11:56

don't care when it runs and you get de

play12:00

prior deprioritized the que but at the

play12:02

benefit of you get a much cheaper price

play12:04

yeah and it sounds like this is what bat

play12:05

API does which is pretty sweet yeah it's

play12:08

like if you're trying to pre-process

play12:10

something or yeah build an index or all

play12:12

these kind of background tasks that

play12:13

you're not a you're not in a hurry yeah

play12:16

anyway so they they have a a segment on

play12:18

this crazy pricing page that says older

play12:21

models and it lists everything except

play12:24

for GPD

play12:25

4 uh o mini mhm and actually GP 4 those

play12:29

two AR on this list but it it yeah it

play12:32

describes all these you know token costs

play12:35

mhm if you're if you really want to

play12:37

compare them all yeah like all the

play12:39

interesting models are up in this sort

play12:41

of $10 per million kind of range right

play12:45

right except for GPT 432k which is $60 a

play12:49

million that's painful yikes yeah and

play12:53

then they they haven't Minis on on here

play12:56

yet no mini and and 40 only at the top

play12:59

of the page yeah a separate pricing

play13:03

breakout this getting to these pricing

play13:05

things is really painful yeah it it

play13:08

shouldn't shouldn't be that way it's

play13:09

also super confusing because there's the

play13:11

pricing page for what consumers pay for

play13:13

chat GPT and then there's these

play13:14

developer Pages for what you pay to

play13:16

access the API yeah it feels like I'm in

play13:19

Enterprise pricing Hill right now you

play13:21

know and then there's this hilarious

play13:23

check box that says show prices per 1K

play13:26

token instead of millions like like we

play13:28

can't do the division ourselves no

play13:31

that's it's pretty it's pretty

play13:32

complicated that's awesome yeah I'm

play13:34

already it's already hurting my brain so

So let's go. The batch API, thank you for bringing that up, that's really interesting; I did not know about that. So let's see here: GPT-4o mini surpasses GPT-3.5 Turbo and other small models on academic benchmarks. We already talked about that, reasoning, math. Rereading this, sorry team. What's next? "Over the past few years, we've witnessed remarkable advancements in AI intelligence paired with substantial reductions in costs. For example, the cost per token of GPT-4o mini has dropped by 99% since text-davinci-003, a less capable model introduced in 2022. We're committed to continuing this trajectory of driving costs down while enhancing model capabilities." That's beautiful, that's what we want to see. "We envision a future where models become seamlessly integrated in every app and on every website." That's exactly what I want to see. "GPT-4o mini is paving the way for developers to build and scale powerful AI applications more efficiently and affordably. The future of AI is becoming more accessible, reliable, and embedded in our daily digital experiences, and we're excited to continue to lead the way." Cool, that's super duper cool. Just keep cutting the price, that's exactly what we want. We want the Walmart strategy, just keep cutting.

play14:42

want the Walmart strategy just keep

play14:44

clean up the The Branding on the name of

play14:46

these models I mean GPT 35 turbo and now

play14:49

GPT 40 mini yeah what the hell I know

play14:54

they're just uh just th throwing

play14:57

spaghetti on a wall so nice if the names

play15:00

kind of indicated what the model could

play15:03

do how powerful it was and how expensive

play15:05

it was but but trying to get that out of

play15:08

the current names is just hopeless yeah

It sucks. And so here, I don't know, I guess, how am I going to compare this so people can understand what's going on? Oh, a temporary chat, I guess, something that goes away after a while. So this is 4o mini, with disappearing chats, I guess. Let's see how this works. Not in history: "Temporary chats won't appear in your history. For safety purposes, we may keep a copy of your chat for 30 days." Because of federal regulation. No model training, memories off. Cool, this is just like when you're doing sneaky, sketchy stuff. Okay, so, I don't know, yeah: tell me about the French Revolution. That sure is fast. Okay, cool, wow, that was very quick. And everyone has their own problem, so let's go to 4o. Can I do a new chat? Okay, 4o: tell me about the French Revolution. That's your test chat? Yeah, that's my test chat, the French Revolution. It's always on my mind constantly, like Rome. The French Revolution or Rome is in my mind constantly. So, yeah: faster? No noticeable difference. Cheaper. Now compare it to just straight-up GPT-4. Yeah, but you've got to compare the quality of the answers. Of course, that requires me to read, but you're right. It'd be funny if the first one was just like, yeah, so flat Earth is correct and the world isn't actually roundish. Yeah.

So I'm part of that OpenAI Developer Forum. It's a private community where they invite random community people to a campus and bring in researchers to talk. One researcher came and talked about how they used AI to take the first images of a black hole, which is super cool. He went through all of that for us and presented it; I'm trying to get the slides so I can show them to everyone. He then talked about how he thinks the current AI models we have right now can start creating their own scientific hypotheses that we can then develop and test, which was interesting, because there's a research paper we came across about using LLMs to develop scientific hypotheses. So it was kind of cool that this guy was independently coming to the same conclusion other researchers came to, which I thought was nice. He was talking about that, and then he said they'd created this training environment where they don't teach the model any theories of physics, but they give the model access to the environment to see if it can derive theories of physics through what it's experiencing. And it was able to come up with a few theories of physics based on the environment it was in, without anyone showing it a training set like "this is the first law of thermodynamics, blah blah blah," which I thought was really awesome. I want to get the slides so I can share them with all of you. Then he was going to answer audience questions, and this one physicist raised his hand and said, "What about a flat 2D environment?" And for a second I thought he was going to raise his hand and ask, "What does the LLM think about the flat Earth theory?" I was like, oh God, how did this guy get in the

audience? Okay, so that's 4o. Let's troll around some Twitter comments and see if I find anything interesting. "When do we get the voice model you guys showed off? Thought it was weeks away months ago." That's the question. See, Sam dug his own grave on this. They did not have to show the voice model back during that week. Didn't he want to undercut some Google announcements? Exactly, that's when he dug his own grave, because Google was doing I/O: "oh, we've got to go, we've got to announce on Monday," when they could have just said, on that Monday, "we're going to have our own conference two months from now." I think it's an interesting competitive approach, but it sort of betrays their fear of Google. I think they're overestimating how quickly Google can move. Yeah, and just how many problems Google can cause for itself, right? It's going to take Google forever to really build these things out. Exactly, and Google has so much internal inertia, and to catch OpenAI and go all in, they would have to hurt their revenue streams in a lot of different directions. So everything is lined up well for OpenAI right now. I mean, Google's still focused on, even though I think it's crap, the whole glue-on-pizza fiasco; they probably still have whole teams internally thinking, "what can we do to make sure this never happens again?" It's like, do you remember the history of this company? It's about testing stuff: some of it implodes, some doesn't, and you move forward. There's no way to know all the unknowns and launch a product with no glitches or errors like that. So he says the alpha starts this month with the voice functionality, and then general availability will come a bit

after. A person said this is an unbelievably good price, talking about what you were mentioning, Joe. GPT-4o mini is very cheap: 15 cents per million input tokens, compared to Claude 3 Sonnet at 300 cents per million input tokens. But I think Sonnet's performance is probably stronger than this mini thing. Still, it depends on what your use case is.
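At those two quoted input rates the gap is a flat 20x, whatever the workload. A minimal sketch, where the 10-million-token monthly volume is a made-up illustration number, not a benchmark:

```python
# Per-million-token input prices quoted in the episode.
INPUT_PRICE_PER_M = {
    "gpt-4o-mini": 0.15,      # $0.15 / 1M input tokens
    "claude-3-sonnet": 3.00,  # 300 cents = $3.00 / 1M input tokens
}

def input_cost(model: str, tokens: int) -> float:
    """Dollar cost to send `tokens` input tokens to `model`."""
    return INPUT_PRICE_PER_M[model] * tokens / 1_000_000

# Example workload: 10M input tokens per month.
for model in INPUT_PRICE_PER_M:
    print(f"{model}: ${input_cost(model, 10_000_000):.2f}/month")
```

That's $1.50 versus $30 for the same input volume, which is why the quality-per-dollar question comes down to the use case.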

"Techies when there's a new update from OpenAI": Scarlett Johansson. Okay, let's see here. Okay, that's about it for now. Don't forget to like and subscribe, check out our store at svicmerch.com, support my Uncle J, and have a great day.
