Generative AI: Moving Beyond The Hype & Hysteria

Data Science Connect
30 Oct 2023 · 18:53

Summary

TL;DR: The video explores why generative AI is gaining significant attention. It compares generative AI's growth to the 'Big Data' trend, noting that generative AI goes beyond traditional machine learning by creating new data rather than classifying existing data. The speaker highlights real-world applications, like Wayfair's AI-powered room designs, and discusses how AI tools streamline creative processes without threatening jobs. The challenges of AI, such as hallucinations and legal concerns, are also mentioned. The speaker concludes by emphasizing AI's future, combining multiple models and applications for better, task-specific results.

Takeaways

  • 📈 **Generative AI's Rise**: Generative AI is gaining popularity because it uses historical data to create new data instances, unlike traditional analytics that append information to existing data.
  • 🚀 **Innovation Overload**: The ability to generate new data instances opens up new use cases that were not possible before, similar to how Big Data revolutionized the industry a decade ago.
  • 🏠 **Practical Applications**: Companies like Wayfair use generative AI to allow customers to visualize room designs with actual purchasable products, streamlining the buying process.
  • 🎨 **Creative Freedom**: Generative AI could free up designers to focus on more complex tasks by handling routine design mockups that customers might not purchase.
  • 🤖 **Seamless Integration**: The technology is being integrated into tools like Adobe's, allowing users to interact with AI capabilities without needing to understand the underlying technology.
  • 🤝 **Contrarian View on Fair Use**: The speaker argues that generative AI's use of training data is analogous to how artists and musicians learn from past works, suggesting AI should not be held to a higher standard.
  • 📚 **Educational Parallel**: Just as students learn from existing works to create their own, generative AI learns from vast amounts of data to produce new creations.
  • 🔍 **Hallucination Misconception**: Generative AI does not match prompts to existing data; it generates every output from mathematical probabilities, so 'hallucinations' are a byproduct of how it always works rather than an occasional malfunction.
  • ⏲️ **Recency and Obscurity**: The accuracy of generative AI is influenced by the recency and obscurity of the information; well-documented events from before the training cutoff are more likely to be accurately represented than very recent or obscure ones.
  • 📉 **Model Collapse Risk**: There's a risk of 'model collapse' where the AI's output becomes less diverse and converges towards the mean, reducing its effectiveness over time.
  • 🛠️ **Ensemble Approaches**: The future of generative AI may involve using multiple models in an ensemble approach to increase accuracy and reliability.

Q & A

  • Why has generative AI gained so much attention recently?

    -Generative AI has gained attention because it creates new examples of data entirely, such as new images or strings of text, instead of simply classifying or predicting based on existing data. This ability opens up new possibilities that were not possible with traditional AI models.

  • How does generative AI differ from traditional machine learning models?

    -Traditional machine learning models classify or make predictions based on existing data, such as identifying if an image contains a cat or not. In contrast, generative AI creates new data by generating unique examples rather than analyzing existing ones.

  • What is an example of a business application of generative AI mentioned in the transcript?

    -Wayfair uses generative AI in a tool that allows customers to upload an image of their space and input design preferences. The AI generates a visual of the room with purchasable products, allowing customers to see what their space could look like and easily shop for items.

  • Does generative AI threaten jobs, such as interior designers?

    -The transcript suggests that generative AI doesn't necessarily threaten jobs like those of interior designers. Instead, it allows them to focus on more complex, creative tasks rather than routine tasks, such as creating multiple mockups for undecided customers.

  • How has the integration of analytics and AI changed over time?

    -In the past, analytics were more visible, and users knew they were interacting with algorithms. Today, AI functionalities are seamlessly embedded into products and services, so users often interact with AI without even realizing it, such as generating images or analyzing credit scores.

  • What is the 'contrarian view' on AI and copyright raised in the transcript?

    -The speaker argues that AI should not be held to a higher standard than humans when it comes to learning from existing content. Just as art or music students learn by studying past works, generative AI uses large datasets to create new works, and this should not inherently be seen as problematic.

  • What are AI 'hallucinations' and why do they occur?

    -AI hallucinations refer to situations where AI generates incorrect or made-up information. This happens because generative AI doesn't retrieve real-world documents or facts but creates responses based on mathematical probabilities derived from its training data.

  • What is one of the major limitations of current public AI models?

    -One major limitation is that public AI models, such as GPT, are trained on data up to a certain point (e.g., 2021). As a result, they may not be able to provide accurate information about events after that cutoff or about topics with conflicting documentation.

  • What risk does 'model collapse' pose for generative AI?

    -Model collapse occurs when AI models are trained on data that includes a large amount of generated content instead of human-created content. Over time, this can cause the model to produce less accurate or more generic outputs, which diminishes its usefulness.

  • How can companies maximize the value of generative AI according to the transcript?

    -Companies can maximize the value of generative AI by building application layers that parse user prompts and direct parts of the task to the appropriate models, such as using a search engine for factual data and a math engine for calculations. This ensemble approach leverages AI's strengths for different types of tasks.
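
The application layer described in the last answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `search_engine` and `generative_model` are hypothetical stubs rather than real services, and the routing rules (a regular expression for arithmetic, a short list of question words for factual lookups) are invented for this example and do not come from the talk.

```python
import ast
import operator
import re

# Hypothetical stand-ins for real backends; these names are not from the talk.
def search_engine(query: str) -> str:
    return f"[search results for: {query}]"

def generative_model(prompt: str) -> str:
    return f"[generated text for: {prompt}]"

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def math_engine(expression: str):
    """Compute +, -, *, / arithmetic exactly instead of guessing a likely answer."""
    def evaluate(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -evaluate(node.operand)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        raise ValueError("unsupported expression")
    return evaluate(ast.parse(expression, mode="eval").body)

# Assumed routing rules, chosen only to keep the sketch short.
ARITHMETIC = re.compile(r"^(?=.*\d)[\d\s.+\-*/()]+$")
FACT_WORDS = ("who", "when", "where", "how many")

def route(prompt: str) -> str:
    """Send each prompt to the engine best suited to answer it."""
    text = prompt.strip()
    if ARITHMETIC.match(text):
        try:
            return str(math_engine(text))
        except (ValueError, SyntaxError, ZeroDivisionError):
            pass  # not valid arithmetic after all; fall through
    if text.lower().startswith(FACT_WORDS):
        return search_engine(text)
    return generative_model(text)
```

With these stubs, `route("1 + 1")` is computed by the math engine rather than predicted, a question beginning with "Who" goes to search, and open-ended requests go to the generative model. A production layer would need far more robust prompt parsing than a regular expression and a keyword list.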

Outlines

00:00

🚀 Generative AI's Impact and Applications

The speaker begins by discussing the surge in generative AI's popularity, drawing a parallel to the 'Big Data' trend of the past. They explain that big data's significance was not merely the volume of data but the introduction of new types of data, such as text and images, which enabled novel use cases. Similarly, generative AI stands out for its ability to create entirely new data instances, not just augment existing ones. The speaker illustrates this with an example of how Wayfair uses generative AI to help customers visualize room decorations, a process that was traditionally time-consuming and expensive. They also touch on the potential for generative AI to redefine roles like designers, allowing them to focus on more creative tasks.

05:02

🎨 The Art of Generative AI and Legal Considerations

The speaker offers a contrarian view on the legal disputes surrounding generative AI, particularly concerning fair use and training data. They compare the learning process of generative AI to how artists and musicians study the works of their predecessors to develop their own style. The speaker argues that generative AI, having 'studied' vast amounts of data, arguably 'steals' less from its training data than humans do from their learning experiences. They also address the misconception of AI 'hallucinations,' clarifying that generative AI creates outputs based on mathematical probabilities rather than matching existing data. The speaker highlights the importance of understanding the limitations and capabilities of generative AI when it comes to accuracy and consistency.
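
The point about mathematical probabilities can be made concrete with a toy sampler. The sketch below is not how any real model is implemented: the prompts and probability values in `NEXT_WORD_PROBS` are made up for illustration, and a real model learns such probabilities from training data and conditions on the whole prompt. It only shows why a well-documented fact tends to come out right while an obscure one varies from run to run.

```python
import random

# Made-up next-word probabilities, for illustration only.
NEXT_WORD_PROBS = {
    "Pearl Harbor was attacked in": {"1941": 0.97, "1942": 0.02, "1939": 0.01},
    "The obscure colonel was born in": {
        "1890": 0.30, "1895": 0.25, "1901": 0.25, "1888": 0.20,
    },
}

def generate(prompt: str, rng: random.Random) -> str:
    """Sample one continuation in proportion to its probability.

    Nothing is looked up or matched: every answer is drawn fresh.
    """
    probs = NEXT_WORD_PROBS[prompt]
    words = list(probs)
    weights = list(probs.values())
    return rng.choices(words, weights=weights, k=1)[0]

if __name__ == "__main__":
    rng = random.Random(42)
    for prompt in NEXT_WORD_PROBS:
        answers = [generate(prompt, rng) for _ in range(1000)]
        counts = {word: answers.count(word) for word in NEXT_WORD_PROBS[prompt]}
        print(prompt, counts)
```

Asking the same question repeatedly yields the consistently documented answer almost every time for the first prompt, while the second prompt, where no single answer dominates, returns different and equally confident-looking results.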

10:02

โณ The Reliability and Risks of Generative AI

The speaker delves into the accuracy and reliability of generative AI, emphasizing the importance of recency and obscurity of information. They note that AI performs well on well-documented events that predate its training cutoff but may falter on obscure topics or on events after the cutoff. The speaker shares personal anecdotes about AI inaccuracies, underscoring the need for caution when using AI-generated information. They also discuss the risk of 'model collapse,' where overexposure to AI-generated data can degrade the model's performance. The speaker suggests that using generative AI appropriately, avoiding tasks where exactness is critical, and combining it with other computational models can mitigate these risks.

15:05

🛠️ The Future of Generative AI: Ensemble Approaches and Custom Plugins

The speaker envisions the future of generative AI involving ensemble approaches, where multiple AI models are used to cross-verify information, and custom plugins that cater to specific tasks. They predict that underlying AI models will become commoditized, and the true value will lie in the application layers that parse prompts and direct them to the appropriate computational engines. The speaker suggests that companies will need to develop these application layers to leverage the strengths of generative AI while avoiding its pitfalls. They conclude by emphasizing the importance of thoughtful integration of AI into business processes to enhance efficiency and creativity.

Keywords

💡Generative AI

Generative AI refers to artificial intelligence systems that can create new content, such as images, text, or music, based on existing data. In the video, the speaker explains how generative AI differs from traditional AI by not just classifying or predicting data but by generating entirely new examples. It's highlighted as a significant development because it opens up new use cases and applications, much like the advent of Big Data did a decade ago.

💡Big Data

Big Data denotes the vast amounts of structured and unstructured data that are so large and complex that they are difficult to process using traditional data management tools. The speaker draws a parallel between Big Data and generative AI, arguing that what made Big Data significant was not the volume of data but the arrival of new types of data, such as text and images. Generative AI likewise enables new kinds of applications by creating data, which is why it is gaining significant attention.

💡Machine Learning

Machine learning is a subset of AI that focuses on enabling machines to learn from data, identify patterns, and make decisions with minimal human intervention. The script mentions machine learning in the context of how traditional models would take data and append something to it, like a prediction or classification, unlike generative AI which creates new data.

💡Data Generation

Data generation in the context of the video refers to the process by which generative AI creates new data samples, such as images or text, from existing data. It's a key feature that distinguishes generative AI from other types of AI and is central to the new capabilities and applications discussed in the video.

💡AI Functionality

AI functionality in the video script refers to the capabilities and features of AI systems that can be embedded into various applications to enhance their performance. The speaker gives examples of how AI is being seamlessly integrated into tools like Adobe's product, where users can input prompts to generate images, illustrating the practical application of AI functionality.

💡Analytics

Analytics in the video script refers to the use of statistical and mathematical techniques to analyze data and extract meaningful insights. The speaker discusses how analytics, including traditional AI work and generative AI, are combined to create innovative solutions like room decoration tools that can generate images of room setups based on user inputs.

💡Training Data

Training data is the data used to train machine learning models to recognize patterns and make predictions. The video script touches on the debate over fair use and the use of training data in AI, suggesting that generative AI's use of vast amounts of training data is analogous to how artists and musicians learn from past works to create their own.

💡Model Collapse

Model collapse is a phenomenon where a generative model starts to produce generic or less varied outputs over time, often due to being trained on too much generated data. The speaker warns about the risk of model collapse in generative AI, emphasizing the need to maintain a balance between human-generated and AI-generated data to ensure the model's effectiveness.
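
A toy simulation can illustrate the mechanism. It rests on assumptions that are not from the talk: the "model" here is just a Gaussian fitted to the data, and it is assumed to favor its most probable outputs, so only the 90% of generated samples closest to the mean are kept (`keep_fraction` is an invented parameter). Each generation is trained solely on the previous generation's output.

```python
import random
import statistics

def train_and_generate(data, rng, keep_fraction=0.9):
    """Fit a Gaussian 'model' to data, then generate a new dataset from it.

    Toy assumption: the model favors high-probability outputs, so only the
    samples closest to the mean are kept.
    """
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    n = len(data)
    candidates = [rng.gauss(mu, sigma) for _ in range(int(n / keep_fraction))]
    candidates.sort(key=lambda x: abs(x - mu))
    return candidates[:n]

def simulate(generations=10, n=500, seed=0):
    """Return the spread (standard deviation) of the data at each generation."""
    rng = random.Random(seed)
    data = [rng.gauss(0.0, 1.0) for _ in range(n)]  # stands in for human-created data
    spreads = [statistics.pstdev(data)]
    for _ in range(generations):
        # Each new model sees only what the previous model produced.
        data = train_and_generate(data, rng)
        spreads.append(statistics.pstdev(data))
    return spreads

if __name__ == "__main__":
    for generation, spread in enumerate(simulate()):
        print(f"generation {generation}: spread = {spread:.3f}")
```

The spread shrinks generation after generation, which is the convergence towards the mean described above. Mixing fresh human-created data into each generation would be one way to extend the sketch and explore the balance the speaker recommends.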

💡Ensemble Approaches

Ensemble approaches in the video script refer to the use of multiple models or algorithms to solve a problem, with the idea that combining their outputs can improve accuracy or performance. The speaker predicts that we will see ensemble approaches in AI, where different models handle different parts of a task, to leverage the strengths of each and produce more reliable results.
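
One simple way to combine the outputs of several models is a majority vote. The talk does not say how answers would be reconciled, so the function below, including its name `reconcile` and the `min_agreement` threshold, is an assumption made for illustration.

```python
from collections import Counter

def reconcile(answers, min_agreement=0.5):
    """Return the majority answer, or None when the models disagree too much.

    `answers` holds one answer string per model; comparison ignores case
    and surrounding whitespace.
    """
    if not answers:
        return None
    normalized = [answer.strip().lower() for answer in answers]
    winner, votes = Counter(normalized).most_common(1)[0]
    if votes / len(normalized) > min_agreement:
        return winner
    return None  # no clear majority: flag for a human or another source
```

For example, `reconcile(["1954", "1954 ", "1970"])` returns `"1954"`, while three different answers return `None`. Treating disagreement as a signal to withhold an answer fits the speaker's advice not to rely on a single probabilistic guess when accuracy matters.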

💡Application Layers

Application layers in the context of the video script are additional software components or plugins that are built on top of foundational AI models to provide specific functionalities or improve performance for particular tasks. The speaker suggests that the real value for companies will come from creating these application layers that can parse prompts and direct different parts of them to the most appropriate AI models or engines.

Highlights

Generative AI is gaining significant attention due to its ability to create entirely new data examples, unlike traditional AI which appends predictions or classifications to existing data.

The emergence of Big Data 10-12 years ago was due to the availability of different types of data, such as text and images, which enabled new use cases.

Generative AI opens up new problem-solving possibilities that were not previously accessible, similar to how Big Data did a decade ago.

An example of generative AI in action is a Wayfair tool that allows users to upload images of their space and generate room layouts with different products.

Generative AI can revolutionize industries by providing instant design mockups, reducing the need for physical visits or manual designs.

Designers, much like data scientists, would rather focus on more complex work, and generative AI frees them to do so by automating routine design mockups.

The seamless integration of AI and analytics into user interfaces allows for intuitive interactions without explicit awareness of the underlying technology.

Adobe's product demonstrates generative AI by allowing users to input prompts and generate images for presentations, enhancing creative workflows.

Generative AI's training on vast amounts of data means it has 'seen' more examples than a human, potentially leading to less 'stealing' from individual sources.

Copyright law should apply to AI similarly to how it applies to human learning and creation, according to the speaker's contrarian view on fair use.

Generative AI is not about finding existing data that matches a prompt but creating new, mathematically probable outputs based on the input.

The accuracy of generative AI depends on the recency and obscurity of the information; well-documented events are more likely to be correctly generated.

Generative AI can produce different but not necessarily incorrect answers to the same question, highlighting the probabilistic nature of its responses.

The speaker shares a personal anecdote where generative AI incorrectly 'awarded' their book a nomination, demonstrating the potential for inaccuracies.

Generative AI's ability to handle standardized tests with factual answers does not equate to general intelligence or sentience.

For tasks requiring exact answers, like inventory checks or flight availability, generative AI should not be used to provide probabilistic guesses.

Ensemble approaches that use multiple models and reconcile their answers can improve the accuracy and reliability of generative AI outputs.

The future of generative AI lies in building application layers that can parse prompts and direct different parts to the most suitable computation models.

Underlying generative AI models may become commodities, with value shifting towards companies that can create specialized plugins and application layers.

Transcripts

play00:01

[Music]

play00:10

let's start with why is generative AI

play00:13

blowing up so much and I think there's

play00:15

an important uh thing that's going on

play00:17

here which is if you go back 10 years 12

play00:19

years ago when Big Data hit and it feels

play00:21

like it was a lot further than that it's

play00:22

10 to 12 years I always said that big

play00:25

data was a bad label for that Trend

play00:28

because what made big data so big wasn't

play00:31

that there was more data wasn't that it

play00:33

was bigger data it was that it was

play00:36

different data so we didn't just have a

play00:37

bunch more transactions or inventory

play00:40

data we had things for the first time

play00:42

like text and images and web browsing

play00:45

history and so forth and that opened up

play00:47

entirely new use cases which is why Big

play00:50

Data went big now AI's been trending for

play00:53

years but why did it blow up so much in

play00:56

the past year and it's because

play00:58

historically every type of machine

play01:00

learning or analytical model would take

play01:03

a given row or a given example of data

play01:05

and append something to it a typically a

play01:08

prediction a forecast a

play01:10

classification and the early AI even

play01:13

though it was going hot and heavy for a

play01:15

couple years was effectively doing the

play01:16

same is this image a cat or is it not a

play01:19

cat is the sentiment of this specific

play01:21

piece of text good or is it bad and so

play01:24

while there were definitely new problems

play01:26

you could solve with that it was the

play01:28

same type of problems with different

play01:29

data whereas generative AI on the other

play01:32

hand now is something totally different

play01:34

where it uses all of the historical data

play01:37

we have but then generates a new example

play01:40

of that data in totality so a whole new

play01:43

image not classifying an image a whole

play01:46

new string of texts not classifying a

play01:48

text and just like with big data then

play01:50

there's all kinds of new problems we can

play01:51

solve with that that weren't previously

play01:54

uh possible and so I think that's a big

play01:56

reason why it's gone so

play01:58

big

play02:00

and there's stuff happening very fast

play02:02

this is a shot of one of my favorite uh

play02:05

articles I've read in in recent months

play02:07

this is Decorify from Wayfair you know

play02:10

back to companies morphing a classic

play02:12

cataloger from the day that's that's

play02:14

gone online and what they have if you

play02:15

look at the top right corner there's an

play02:18

image that's kind of cut down the middle

play02:20

and there's little arrows you upload the

play02:22

image of your space you enter in a

play02:24

prompt what you're looking for what

play02:25

color scheme and such it'll generate an

play02:27

image with products in it so you can see

play02:30

what your room would look like and if

play02:31

you move that little arrow back and

play02:32

forth it'll just give you more or less

play02:34

of your room versus the new room and so

play02:37

if you think about it historically if

play02:38

you wanted to decorate a room you'd have

play02:40

to go to a a store even if you're on an

play02:43

online site maybe get a designer to

play02:45

actually mock something like this up

play02:47

which would take time you could only get

play02:49

one or two examples uh and it was costly

play02:52

for the the seller now you can go out

play02:54

and with a just a few prompts I could

play02:56

say you know I don't want white I want

play02:57

blue and it'll instantly update and

play03:00

what's even cooler is that all of those

play03:02

products are actual products that are

play03:04

purchasable so it's not just a

play03:06

hypothetical here's what your room might

play03:08

look like it is literally this is your

play03:10

room if you want it you know you could

play03:12

either buy the whole set or you could

play03:13

click on on certain items and buy the

play03:15

set and so this is an example of of so

play03:18

many different types of analytics coming

play03:20

together including some traditional AI

play03:21

work some generative AI some traditional

play03:24

uh propensity modeling and such but when

play03:26

you think about it from a customer

play03:27

perspective I can now go out and matter

play03:30

of minutes have multiple options for

play03:33

both style and product that I might put

play03:34

in my room and buy it and be done and

play03:38

the the retailer of course they're

play03:40

getting business much faster I still

play03:42

think though the designer if you're a

play03:44

designer what you really want to do

play03:45

probably much like data scientists you

play03:47

want to work on the cool stuff those

play03:49

designers don't get excited about having

play03:50

to do 10 different quick room mockups

play03:53

for 10 customers today nine of which

play03:55

won't buy anything anyway they'd rather

play03:58

be working with those customers that are

play04:00

going to hire them to really do a full

play04:02

classic design and I think that'll free

play04:03

this up so I also don't know that it

play04:05

threatens the jobs of designer so much

play04:07

as it lets them focus on the cool

play04:11

stuff and one of the things that's

play04:13

happening in this isn't just with

play04:14

generative AI but it's definitely going

play04:15

up a notch is this idea of seamlessly

play04:18

embedding analytics and AI functionality

play04:20

so that the users aren't even aware

play04:22

explicitly that they're using it so if

play04:24

you think back even 5 10 years ago

play04:27

everyone knew analytics was in use right

play04:28

you go to a website you know those

play04:30

offers are building off of whatever

play04:32

history you had and such but you would

play04:33

just see what they offered you you

play04:35

wouldn't really interact with the

play04:36

analytics per se and even with

play04:38

traditional analytics this is changing

play04:40

when you go to your credit card or bank

play04:41

account and they say check your credit

play04:43

score when you go to look at the credit

play04:45

score they'll now all have little

play04:46

simulators well you can simulate what

play04:48

would happen if I increase the balance

play04:50

on a card or opened a new account and

play04:52

they'll simulate what your score would

play04:53

be so you're actively interacting with a

play04:55

model this is from Adobe's um Firefly

play04:59

product and much like PowerPoint and

play05:01

Adobe for years have had type in a

play05:03

keyword it'll give you real pictures

play05:05

they've just embedded generative AI now

play05:06

type in your prompt it'll generate

play05:08

images you can insert them so you have

play05:10

creative types now who don't know

play05:12

anything about AI or modeling probably

play05:14

don't care about how it works but

play05:15

seamlessly in the same interface and

play05:17

with the same style of interaction

play05:19

they've always used are able to take

play05:20

full advantage of the uh AI capabilities

play05:24

generative

play05:28

AI so I want to give you a bit of a

play05:30

contrarian view this whole battle over

play05:32

fair use and there's lawsuits all over

play05:34

about training data and such and so my

play05:36

contrarian view is that if I were to go

play05:39

and study art or music as an example

play05:41

what's a huge part of what you learn

play05:45

they make you go out and study past

play05:47

artists past musicians learn about their

play05:49

style and why they did what they did and

play05:51

how they did it and then over time

play05:54

you're supposed to synthesize what you

play05:56

learn about that art or that music and

play05:57

then you create your own style which

play05:59

which is by necessity based off of

play06:01

everything that you've ever seen of

play06:03

those artists and musicians now we can't

play06:05

go too far there's cases you know the

play06:08

early days of sampling I remember it was

play06:11

uh Vanilla Ice was one of the first

play06:12

people got sued because he sampled too

play06:14

much of a David Bowie song as is and didn't

play06:16

credit him and he lost a big lawsuit and

play06:18

had to pay royalties and such and such

play06:19

so the point is there's already

play06:21

copyright law and I don't think that AI

play06:24

should be held to a higher standard

play06:26

though than we are held to so if I'm an

play06:29

art student I could learn

play06:32

let's say tens of thousands of pieces of

play06:34

art maybe if I'm a musician tens of

play06:36

thousands of songs maybe on the Outer

play06:38

Edge off of which I would then generate my

play06:42

output these generative AI algorithms have

play06:44

looked at millions of photos tens of

play06:46

millions hundreds of millions every song

play06:48

ever created so I would argue that in a

play06:50

way a generative AI song or piece of art

play06:54

steals a little less quote unquote from

play06:56

all of its training data than you or I

play06:58

would steal from from all of our

play06:59

training data because they have a whole

play07:01

lot more training data so I think

play07:02

there's a lot to figure out here but I

play07:04

do think that this idea that just

play07:06

because AI looked at an image or

play07:08

ingested a song and then generated

play07:11

something that inherently that's a you

play07:13

know a big problem but again this this

play07:15

will be in the courts for

play07:19

years so the biggest misnomer I think

play07:22

that happens around generative AI is

play07:25

this idea of hallucinations I'm sure

play07:27

everybody here has heard of

play07:28

hallucinations show P everybody

play07:30

everybody's heard of it so here's the

play07:33

key to understand and this is where you

play07:35

folks in the room I'll bet a higher

play07:37

percentage of you know this by far than

play07:39

the general population but when you go

play07:41

to Google you type in keywords and what

play07:43

Google's going to attempt to do is to

play07:45

match your keywords to real websites

play07:47

real documents real tags on real images

play07:50

whatever the case is but the idea is

play07:53

matching what you've asked for to real

play07:56

specific items that exist now it might

play07:58

do a better or worse job of that

play08:00

sometimes the links won't be as relevant

play08:01

but it'll always be a real document a

play08:03

real link

play08:04

Etc the thing with generative AI it's in

play08:06

the name generative generative AI is not

play08:09

attempting to match anything to your

play08:11

prompt what it's doing is reducing your

play08:13

prompt to a mathematical representation

play08:15

passing it through an algorithm to then

play08:17

spit out a

play08:19

mathematically uh probable answer it's

play08:22

effectively making up every single

play08:24

answer that it gives and so rather than

play08:28

being concerned about the hallucinations

play08:31

you should actually be amazed that while

play08:33

generative AI is literally making it up

play08:35

word by word every time you ask it a

play08:37

question it gets a lot of things right

play08:40

because well documented things

play08:42

probabilistically speaking is the right

play08:45

the right answer is what the facts

play08:48

are but there's some nuances here I went

play08:51

in and asked the same question tell me

play08:53

the history of the world in 50 words as

play08:55

soon as the first answer came back I

play08:57

resubmitted the exact same question

play08:59

question if you read that neither one of

play09:02

them I would argue is wrong although

play09:04

there could be some actually wrong facts

play09:06

in those if you really look carefully

play09:08

but they're also not the

play09:10

same and the interesting thing after I

play09:13

post about this on LinkedIn a friend of

play09:14

mine then went and did the same

play09:16

experiment and within say ChatGPT and

play09:18

similar tools within the context of a

play09:20

single session it retains some some

play09:22

memory of it and that's why those two

play09:24

answers are very similar this other

play09:26

gentleman submitted the same question

play09:28

and what he got similar two similar but

play09:30

different answers but in my case it's

play09:33

really a Humanity Centric uh history of

play09:35

the world about civilization both of his

play09:38

answers started about you know 13

play09:40

billion years ago the Earth formed and

play09:41

then there were dinosaurs and then you

play09:43

know at the end it says there's people

play09:45

and so between he and I we got four

play09:47

answers none of which were wrong per se

play09:50

all of which were different and his

play09:52

versus mine would have given a

play09:53

completely different impression of the

play09:56

history of the world so you just have to

play09:58

be aware and be cautious when you use

play09:59

the generative AI that not only could it

play10:02

give you wrong information but it might

play10:04

not even give you the same

play10:06

information a minute from now let alone

play10:08

if one of your you know colleagues goes

play10:10

to replicate it a week from

play10:14

now so what will it get right and wrong

play10:16

this is actually uh discernible first

play10:20

thing any recency is important because

play10:23

the current models that are public I

play10:25

think it was the end of

play10:26

2021 uh they were cut off so you asked

play10:28

something before that it might do a

play10:29

pretty good job but you ask say about

play10:31

the Ukraine war there's two problems

play10:33

with that one it didn't exist in 2021

play10:35

and two there's a lot of

play10:37

conflicting uh documentation of the war

play10:40

in Ukraine and so probabilistically who

play10:42

knows what you're going to

play10:44

get but equally important is obscurity

play10:47

so if I ask about World War II and I ask

play10:50

about Pearl Harbor or Winston

play10:52

Churchill the most probable answer is

play10:54

probably pretty much the right answer

play10:56

because Pearl Harbor's highly documented

play10:59

and very consistently documented Winston

play11:01

Churchill but if I go and ask about an

play11:04

obscure Colonel and an obscure battle in

play11:07

the middle of Germany I mean who knows

play11:09

what you're going to get it's probably

play11:10

half made up it might be all you know

play11:12

all bogus so I've asked gone and asked

play11:15

it asked it questions about myself and

play11:16

it came back uh I I remember this one

play11:19

really threw me it said that my first

play11:22

book was nominated for an award by

play11:26

INFORMS and I I was like I don't

play11:28

remember this and I said well maybe

play11:30

because I got nominated and didn't win

play11:32

and no one ever told me I I just didn't

play11:34

know I went and poked around I could

play11:35

find no record of it and so even with

play11:37

myself and I know my history it made

play11:39

up a very believable award by a very

play11:42

real organization that would have had

play11:44

exactly that award but then you know my

play11:47

book never had so anyway I didn't update

play11:48

my bio with that award

play11:50

nomination so the point is if it can

play11:52

fool me about myself you got to remember

play11:55

that it can fool you very easily on

play11:56

other

play11:57

things if you want facts or

play11:59

subjectivity people have I think wrongly

play12:01

said that this is you know general

play12:03

intelligence sentient because it can

play12:05

pass an SAT or GRE and so forth with a

play12:08

high level but think about those exams

play12:10

they ask a very specific question with a

play12:13

factual answer and then there's four

play12:15

options that you choose from and so

play12:18

actually probabilistically speaking if I

play12:20

say you know who won the case of Brown

play12:23

versus Board of Education in 1970

play12:25

whatever it was or 1960 whatever the

play12:28

point is that ChatGPT should do a good

play12:30

job of something like that because it's

play12:31

a well-documented case the outcome is

play12:33

known and probabilistically it'll

play12:35

probably get the right answer and so a

play12:37

lot of those standardized tests are

play12:40

testing your ability to recall some

play12:42

general facts more than it is that

play12:44

you're really having to think through

play12:45

complicated uh

play12:47

topics and then last you don't want it

play12:49

to do math or any kind of computation

play12:52

because remember it's not a calculator

play12:54

if you say what is 1 + 1 and it says

play12:55

two it has not gone and calculated that

play12:58

1 + 1 equals 2 it is probabilistically

play13:01

saying if I see the string 1 + 1 what is

play13:05

the most likely next answer and there's

play13:07

enough 1 plus 1 equals 2 out there it's

play13:09

probably going to get two almost every

play13:10

time but you can go out and find

play13:12

examples of even late elementary school

play13:15

math it'll mess it up left and right

play13:18

it'll do the wrong order of operations in

play13:19

a complicated math problem it'll put an

play13:21

extra parenthesis and it might even get

play13:23

80 90% of it right but then that one

play13:25

thing it gets wrong makes the entire

play13:27

answer wrong
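
The "it's not a calculator" point can be illustrated with a toy sketch. The corpus, the function name `most_likely_completion`, and the frequency-count approach are all illustrative assumptions, not how any production language model is implemented; a real model generalizes far beyond exact string matches and would produce some plausible-looking text even for unseen prompts.

```python
from collections import Counter

# Hypothetical toy corpus: strings a model might have seen during training.
corpus = [
    "1 + 1 = 2", "1 + 1 = 2", "1 + 1 = 2", "1 + 1 = 3",
    "2 + 2 = 4", "2 + 2 = 4",
]

def most_likely_completion(prompt, corpus):
    """Return the most frequent continuation seen after `prompt`, or None."""
    continuations = Counter(
        line[len(prompt):].strip() for line in corpus if line.startswith(prompt)
    )
    if not continuations:
        return None
    return continuations.most_common(1)[0][0]

# A well-documented pattern: the most frequent continuation is returned.
print(most_likely_completion("1 + 1 =", corpus))

# A pattern never seen: nothing to recall, because nothing was ever computed.
print(most_likely_completion("17 + 26 =", corpus))

# A calculator evaluates the expression instead of recalling it.
print(17 + 26)
```

The first lookup succeeds only because "1 + 1 = 2" dominates the toy corpus, which mirrors the talk's claim that common arithmetic is usually answered correctly while less common problems are not.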

play13:31

but this isn't to say it's not at all

play13:32

useful so here's a a prototype I saw

play13:35

that Amazon's about to roll out you know

play13:37

when I go to a product and I'm looking

play13:39

at reviews I get um stressed because I

play13:42

feel obligated to look at a bunch of

play13:44

reviews the positive ones and negative

play13:45

ones and I try and figure out what's the

play13:47

general pattern here and is the things I

play13:49

care about going to be affected what

play13:51

they're doing here is they're taking

play13:52

those thousands of reviews passing it

play13:54

through and saying give us a paragraph

play13:55

summary and then see if you can tag any

play13:58

of a number of important things like

play14:00

performance or ease of use and whether

play14:03

this appears to be positive or negative

play14:04

on that and so the beauty of this is we

play14:07

don't care it doesn't have to be the

play14:09

exact same answer like those examples I

play14:11

gave you earlier as long as it's

play14:12

generally uh the right direction that's

play14:14

what we need and in this case they

play14:16

probably freeze this and it'll show it

play14:17

to everybody who asks for the summary

play14:19

reviews up until at some point they'll

play14:21

refresh it and give it to everyone so

play14:22

it's not like I'll get a different

play14:24

answer the next minute the next person

play14:25

gets a different answer and so these are

play14:27

the nuances where this would be

play14:29

an incredibly powerful approach that

play14:31

will leverage all the strengths of these

play14:32

models without actually the negatives
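
The "freeze it and refresh it later" idea the speaker guesses at can be sketched as a small cache. This is a minimal sketch under stated assumptions: `FrozenSummaryCache`, `fake_summarize`, and the one-hour `max_age_s` are hypothetical names and values, and nothing here reflects how Amazon actually serves review summaries.

```python
import time

class FrozenSummaryCache:
    """Serve one stored summary per product until it is older than max_age_s."""

    def __init__(self, summarize, max_age_s):
        self._summarize = summarize   # hypothetical function: reviews -> text
        self._max_age_s = max_age_s
        self._store = {}              # product_id -> (created_at, summary)

    def get(self, product_id, reviews, now=None):
        now = time.time() if now is None else now
        cached = self._store.get(product_id)
        if cached is not None and now - cached[0] < self._max_age_s:
            return cached[1]          # same frozen text for every shopper
        summary = self._summarize(reviews)  # regenerate only on refresh
        self._store[product_id] = (now, summary)
        return summary

# Stand-in for a generative model call; it just records how often it ran.
calls = []
def fake_summarize(reviews):
    calls.append(len(reviews))
    return f"summary #{len(calls)} of {len(reviews)} reviews"

cache = FrozenSummaryCache(fake_summarize, max_age_s=3600)
first = cache.get("p1", ["good", "bad"], now=0)
second = cache.get("p1", ["good", "bad", "ok"], now=10)    # still frozen
third = cache.get("p1", ["good", "bad", "ok"], now=4000)   # refreshed
print(first, second, third, sep="\n")
```

Because the model is called once per refresh rather than once per visitor, every shopper sees the same text until the next refresh, which sidesteps the repeatability problem described earlier in the talk.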

play14:38

whoops thing just did a double click so

play14:40

one of the big risks we have is this

play14:42

idea of model collapse and so any kind

play14:44

of model as you know it kind of

play14:45

regresses to the mean you tend to trim

play14:47

off more and more outliers each pass so

play14:50

on the top is a real distribution and as

play14:51

you move to the right that's what

play14:53

happened over many iterations first the

play14:55

distribution changed then it converged

play14:58

on a point
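
The narrowing distribution described here can be reproduced with a toy simulation. This is a sketch under assumptions, not the experiment behind the slide: each "generation" fits a normal distribution to the previous generation's data, samples from it, and drops samples beyond a hypothetical `tail_cut` of 1.5 standard deviations to mimic a model under-representing outliers.

```python
import random
import statistics

def next_generation(data, rng, n, tail_cut=1.5):
    """Fit a normal to `data`, sample n points, and drop samples in the tails."""
    mu = statistics.fmean(data)
    sigma = statistics.stdev(data)
    samples = [rng.gauss(mu, sigma) for _ in range(n)]
    return [x for x in samples if abs(x - mu) <= tail_cut * sigma]

rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # stand-in for human data
spreads = [statistics.stdev(data)]

# Retrain on the previous generation's output ten times in a row.
for _ in range(10):
    data = next_generation(data, rng, n=2000)
    spreads.append(statistics.stdev(data))

# The spread shrinks each generation as outliers are trimmed away.
print([round(s, 3) for s in spreads])
```

Each pass trims the tails, so the standard deviation keeps shrinking and the data drifts toward a single point, which is the "first the distribution changed, then it converged" behavior the speaker points to.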

play14:59

down on the bottom the classic

play15:01

handwritten numbers as you take a bunch

play15:04

of the original human generated data and

play15:06

pollute it with more and more algorithm

play15:09

generated data and retrain it it

play15:10

eventually converges to a bunch of like

play15:12

blurry zeros and ones and so the

play15:15

power of the current models is that it's

play15:17

almost all human generated content

play15:18

training them to predict human generated

play15:20

output we're soon going to have more

play15:22

images and more text than was human

play15:24

generated it might almost be true today

play15:25

but we're talking very short time and if

play15:27

we don't filter it properly we'll

play15:29

basically have generative AI become

play15:31

completely useless because we won't be

play15:33

able to have a good model anymore

play15:35

because we polluted it with so much generated

play15:37

stuff that comes down to the mean so we

play15:39

have to figure this one out uh pretty

play15:43

quickly and then this gets back to just

play15:45

don't force generative AI

play15:47

where it doesn't belong you

play15:49

got to be thinking about what's the task

play15:51

you've asked for if you want to know how

play15:54

much inventory is left you don't want it

play15:56

to generate a probabilistic answer you

play15:58

want it to actually go look up that

play16:00

answer if I call to know is there a seat

play16:02

available on that flight and how much

play16:03

does it cost I don't want a

play16:04

probabilistic guess I want an actual

play16:06

answer if on the other hand I want to

play16:08

know are these reviews generally good or

play16:10

bad can you give me the main themes that

play16:11

someone's talked about or here's a draft

play16:14

I just wrote can you double check this

play16:15

for some grammar and such it makes a lot

play16:17

of

play16:19

sense and so what we're going to start

play16:21

to see and it's already happening is

play16:22

this idea of ensemble approaches using

play16:25

multiple types of models multiple types

play16:27

of things and you have to think about

play16:29

ChatGPT everyone talks about that's the

play16:31

app on top of the underlying model GPT-4

play16:34

these underlying models are already

play16:35

commoditized right you can use Microsoft

play16:37

OpenAI all of these you don't need to

play16:39

build your own but you want to build

play16:41

layers on top and so back to the math

play16:44

problem instead of

play16:45

asking do math have the prompt parsed

play16:49

and realize here's a math question and

play16:50

pass it to a math engine that's what

play16:52

Wolfram has now it'll extract the parts

play16:54

that are asking a math question do a

play16:56

computation with a computation engine

play16:58

and spit it back and so what I see over

play17:00

time will be first of all you'll submit

play17:02

your prompt to multiple large language

play17:04

models and then reconcile say three or

play17:07

four answers together in an ensemble

play17:09

approach and this engine over here might

play17:11

have made something up and this engine

play17:12

over here might have made something up

play17:13

but hopefully that will be the parts

play17:15

that aren't consistent and the

play17:16

consistent parts across the four will be

play17:19

mostly the accurate pieces so I

play17:21

think we'll have plugins for physics for

play17:23

chemistry for certain types of

play17:25

physics chemistry math and then last I'll

play17:28

leave you with this is where I think the

play17:31

real value of this if you want to

play17:33

differentiate as a company is build

play17:35

these application layers that are going

play17:37

to parse out the prompt identify what

play17:39

part of this prompt needs what type of

play17:40

information computation input and maybe

play17:43

part of it goes to a traditional

play17:45

Google search to actually look up some

play17:47

documents while another part goes and

play17:49

does some math another part looks up

play17:50

something about the chemistry and the

play17:52

last part crafts some kind of paragraph

play17:54

summary of the information that was

play17:55

found but I think this is where the

play17:57

value of all of this is going to play

play17:58

out those underlying models will be a

play18:00

commodity there will be companies

play18:02

generating um let's say generic add-ons

play18:05

and then as an organization what you all

play18:07

are going to need to do is generate your

play18:09

own little add-ons and plugins that you

play18:10

can put on top and you'll get your most

play18:13

value so with that I think I'm right at

play18:15

about 20 and I'll say uh thank you for

play18:18

coming I'll be around for an hour or

play18:25

[Music]

play18:27

two
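
The application layer described in the talk (parse the prompt, hand math to a computation engine, reconcile several model answers) can be sketched as follows. Everything here is an assumption made for illustration: `evaluate`, `route`, `consensus`, the regular expression, and the exact-sentence voting are hypothetical stand-ins, not the Wolfram plugin or any vendor's API, and real reconciliation would need semantic matching rather than string equality.

```python
import ast
import operator
import re
from collections import Counter

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def evaluate(expr):
    """Safely evaluate +, -, *, / arithmetic with parentheses."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval").body)

# Crude pattern for an arithmetic span inside a prompt (an assumption).
_MATH = re.compile(r"[\d(][\d\s.+\-*/()]*[\d)]")

def route(prompt, ask_llm):
    """Send arithmetic to the calculator and everything else to the model."""
    match = _MATH.search(prompt)
    if match and any(op in match.group() for op in "+-*/"):
        try:
            return str(evaluate(match.group()))
        except (ValueError, SyntaxError, ZeroDivisionError):
            pass  # not valid arithmetic after all; fall through to the model
    return ask_llm(prompt)

def consensus(answers, min_votes):
    """Keep sentences that appear in at least `min_votes` of the answers."""
    votes = Counter()
    for answer in answers:
        claims = {s.strip().lower() for s in answer.split(".") if s.strip()}
        votes.update(claims)
    return sorted(c for c, n in votes.items() if n >= min_votes)

# Hypothetical answers from three different models to the same prompt.
answers = [
    "Pearl Harbor was attacked in 1941. The book won an award.",
    "Pearl Harbor was attacked in 1941.",
    "Pearl Harbor was attacked in 1941. The colonel retired in 1950.",
]
print(route("what is 2 + 3 * 4?", lambda p: "(model answer)"))
print(consensus(answers, min_votes=2))
```

The claim that only one model makes is dropped, while the claim all three agree on survives, which is the "consistent parts are mostly the accurate pieces" heuristic from the talk. Note that the toy pattern would also treat a date such as 2021-10-30 as subtraction, so a real router would need a more careful parser.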

