Generative AI: Moving Beyond The Hype & Hysteria
Summary
TLDR: The video explores why generative AI is gaining significant attention. It compares generative AI's growth to the 'Big Data' trend, noting that generative AI goes beyond traditional machine learning by creating new data rather than classifying existing data. The speaker highlights real-world applications, like Wayfair's AI-powered room designs, and discusses how AI tools streamline creative processes without threatening jobs. The challenges of AI, such as hallucinations and legal concerns, are also mentioned. The speaker concludes by emphasizing AI's future, combining multiple models and applications for better, task-specific results.
Takeaways
- 📈 **Generative AI's Rise**: Generative AI is gaining popularity because it uses historical data to create new data instances, unlike traditional analytics that append information to existing data.
- 🚀 **Innovation Overload**: The ability to generate new data instances opens up new use cases that were not possible before, similar to how Big Data revolutionized the industry a decade ago.
- 🏠 **Practical Applications**: Companies like Wayfair use generative AI to allow customers to visualize room designs with actual purchasable products, streamlining the buying process.
- 🎨 **Creative Freedom**: Generative AI could free up designers to focus on more complex tasks by handling routine design mockups that customers might not purchase.
- 🤖 **Seamless Integration**: The technology is being integrated into tools like Adobe's, allowing users to interact with AI capabilities without needing to understand the underlying technology.
- 🤝 **Contrarian View on Fair Use**: The speaker argues that generative AI's use of training data is analogous to how artists and musicians learn from past works, suggesting AI should not be held to a higher standard.
- 📚 **Educational Parallel**: Just as students learn from existing works to create their own, generative AI learns from vast amounts of data to produce new creations.
- 🔍 **Hallucination Misconception**: The speaker calls 'hallucination' a misnomer: generative AI never matches prompts to existing documents; it generates every output, right or wrong, from mathematical probabilities.
- ⏲️ **Recency and Obscurity**: The accuracy of generative AI depends on the recency and obscurity of the information; well-documented events that predate the model's training cutoff are more likely to be accurately represented.
- 📉 **Model Collapse Risk**: There's a risk of 'model collapse' where the AI's output becomes less diverse and converges towards the mean, reducing its effectiveness over time.
- 🛠️ **Ensemble Approaches**: The future of generative AI may involve using multiple models in an ensemble approach to increase accuracy and reliability.
Q & A
Why has generative AI gained so much attention recently?
-Generative AI has gained attention because it creates new examples of data entirely, such as new images or strings of text, instead of simply classifying or predicting based on existing data. This ability opens up new possibilities that were not possible with traditional AI models.
How does generative AI differ from traditional machine learning models?
-Traditional machine learning models classify or make predictions based on existing data, such as identifying if an image contains a cat or not. In contrast, generative AI creates new data by generating unique examples rather than analyzing existing ones.
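The contrast in this answer can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the keyword-based `classify_sentiment` and the bigram-based `generate` are hypothetical stand-ins invented here, not the models discussed in the talk. The first function appends a label to an existing example; the second produces a new example from patterns in its training text.

```python
import random
from collections import defaultdict

def classify_sentiment(text):
    """Traditional model: attach a label to an existing piece of text."""
    positive = {"good", "great", "love"}
    negative = {"bad", "poor", "hate"}
    words = text.lower().split()
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    return "positive" if score >= 0 else "negative"

def train_bigrams(corpus):
    """Record which word follows which in the training sentences."""
    follows = defaultdict(list)
    for sentence in corpus:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            follows[current].append(nxt)
    return follows

def generate(follows, start, max_words=8, seed=0):
    """Generative model: build a new string word by word from learned patterns."""
    rng = random.Random(seed)
    words = [start]
    # Stop at max_words or when the last word was never followed by anything.
    while len(words) < max_words and follows.get(words[-1]):
        words.append(rng.choice(follows[words[-1]]))
    return " ".join(words)

# Hypothetical training text, invented for this example.
corpus = ["the blue sofa fits the room", "the room needs a blue rug"]
follows = train_bigrams(corpus)
print(classify_sentiment("I love this great sofa"))  # a label for existing data
print(generate(follows, "the"))                      # a newly generated string
```

The generated sentence need not appear anywhere in the training text, which is the distinction the answer draws.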
What is an example of a business application of generative AI mentioned in the transcript?
-Wayfair uses generative AI in a tool that allows customers to upload an image of their space and input design preferences. The AI generates a visual of the room with purchasable products, allowing customers to see what their space could look like and easily shop for items.
Does generative AI threaten jobs, such as interior designers?
-The transcript suggests that generative AI doesn't necessarily threaten jobs like those of interior designers. Instead, it allows them to focus on more complex, creative tasks rather than routine tasks, such as creating multiple mockups for undecided customers.
How has the integration of analytics and AI changed over time?
-In the past, analytics were more visible, and users knew they were interacting with algorithms. Today, AI functionalities are seamlessly embedded into products and services, so users often interact with AI without even realizing it, such as generating images or analyzing credit scores.
What is the 'contrarian view' on AI and copyright raised in the transcript?
-The speaker argues that AI should not be held to a higher standard than humans when it comes to learning from existing content. Just as art or music students learn by studying past works, generative AI uses large datasets to create new works, and this should not inherently be seen as problematic.
What are AI 'hallucinations' and why do they occur?
-AI hallucinations refer to situations where AI generates incorrect or made-up information. This happens because generative AI doesn't retrieve real-world documents or facts but creates responses based on mathematical probabilities derived from its training data.
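A minimal sketch of that idea, assuming a model reduces a prompt to a probability table over possible next words and then samples from it. The two tables below are invented numbers for illustration only: one stands for a well-documented fact (probability concentrated on one answer), the other for an obscure fact (probability spread thin).

```python
import random

def sample_next(probs, rng):
    """Pick one candidate word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

# Hypothetical probability tables; the numbers are made up for this sketch.
WELL_DOCUMENTED = {"Paris": 0.97, "Lyon": 0.02, "Nice": 0.01}
OBSCURE = {"1942": 0.2, "1943": 0.3, "1944": 0.3, "1945": 0.2}

def answer_counts(probs, trials=1000, seed=0):
    """Ask the same 'question' many times and tally the sampled answers."""
    rng = random.Random(seed)
    counts = {word: 0 for word in probs}
    for _ in range(trials):
        counts[sample_next(probs, rng)] += 1
    return counts

print(answer_counts(WELL_DOCUMENTED))  # almost always the same answer
print(answer_counts(OBSCURE))          # answers vary from run to run
```

Nothing is looked up in either case. The well-documented answer is usually right only because the most probable output happens to coincide with the facts, which also suggests why the same prompt can return different answers on different runs.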
What is one of the major limitations of current public AI models?
-One major limitation is that public AI models, such as GPT, are trained on data up to a cutoff date (the speaker recalls the end of 2021). As a result, they may not provide accurate information about more recent events, and they are also unreliable on topics with conflicting documentation.
What risk does 'model collapse' pose for generative AI?
-Model collapse occurs when AI models are trained on data that includes a large amount of generated content instead of human-created content. Over time, this can cause the model to produce less accurate or more generic outputs, which diminishes its usefulness.
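The effect can be imitated with a small simulation. This is a sketch under simplifying assumptions, not the experiment shown in the talk: the "model" is just a normal distribution refitted each generation to the previous generation's output, and the 2-sigma trimming rule is an assumed stand-in for a model underrepresenting rare cases.

```python
import random
import statistics

def simulate_collapse(generations=10, n=2000, seed=0):
    """Refit a normal 'model' on its own trimmed output and track the spread."""
    rng = random.Random(seed)
    data = [rng.gauss(0.0, 1.0) for _ in range(n)]  # stand-in for human-made data
    spreads = [statistics.pstdev(data)]
    for _ in range(generations):
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
        generated = [rng.gauss(mu, sigma) for _ in range(n)]
        # Assumption: each pass loses the outliers (here, beyond 2 sigma).
        data = [x for x in generated if abs(x - mu) <= 2 * sigma]
        spreads.append(statistics.pstdev(data))
    return spreads

spreads = simulate_collapse()
print([round(s, 2) for s in spreads])  # spread shrinks generation by generation
```

Each pass narrows the distribution a little, so after enough generations the output clusters around the mean, which matches the loss of diversity the answer describes.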
How can companies maximize the value of generative AI according to the transcript?
-Companies can maximize the value of generative AI by building application layers that parse user prompts and direct parts of the task to the appropriate models, such as using a search engine for factual data and a math engine for calculations. This ensemble approach leverages AI’s strengths for different types of tasks.
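One way such an application layer might look is sketched below. Everything here is hypothetical: the `route` function, the `INVENTORY` table, and the simple pattern rules are assumptions made for illustration, and a real system would need far more robust prompt parsing. The point is only that exact questions go to exact engines and open-ended ones go to the generative model.

```python
import ast
import operator
import re

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def calculate(expression):
    """Compute + - * / arithmetic exactly instead of predicting the answer."""
    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")
    return ev(ast.parse(expression.strip(), mode="eval").body)

INVENTORY = {"blue sofa": 12}  # hypothetical system of record

def route(prompt):
    """Return (engine_name, answer) for a prompt using simple assumed rules."""
    text = prompt.lower().strip()
    math = re.fullmatch(r"what is ([\d\s+\-*/().]+)\??", text)
    if math:
        return "math_engine", str(calculate(math.group(1)))
    for item, quantity in INVENTORY.items():
        if "inventory" in text and item in text:
            return "lookup", f"{quantity} in stock"
    # Anything else is a task where a probabilistic draft is acceptable.
    return "generative_model", "(draft text from a language model would go here)"

print(route("What is 2 + 3 * 4?"))
print(route("How much inventory is left for the blue sofa?"))
print(route("Summarize the main themes in these reviews"))
```

The arithmetic branch parses the expression rather than calling `eval`, so order of operations is handled by the parser and untrusted text is never executed.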
Outlines
🚀 Generative AI's Impact and Applications
The speaker begins by discussing the surge in generative AI's popularity, drawing a parallel to the 'Big Data' trend of the past. They explain that big data's significance was not merely the volume of data but the introduction of new types of data, such as text and images, which enabled novel use cases. Similarly, generative AI stands out for its ability to create entirely new data instances, not just augment existing ones. The speaker illustrates this with an example of how Wayfair uses generative AI to help customers visualize room decorations, a process that was traditionally time-consuming and expensive. They also touch on the potential for generative AI to redefine roles like designers, allowing them to focus on more creative tasks.
🎨 The Art of Generative AI and Legal Considerations
The speaker offers a contrarian view on the legal disputes surrounding generative AI, particularly concerning fair use and training data. They compare the learning process of generative AI to how artists and musicians study the works of their predecessors to develop their own style. The speaker argues that generative AI, having 'studied' vast amounts of data, arguably 'steals' less from its training data than humans do from their learning experiences. They also address the misconception of AI 'hallucinations,' clarifying that generative AI creates outputs based on mathematical probabilities rather than matching existing data. The speaker highlights the importance of understanding the limitations and capabilities of generative AI when it comes to accuracy and consistency.
⏳ The Reliability and Risks of Generative AI
The speaker delves into the accuracy and reliability of generative AI, emphasizing the importance of recency and obscurity of information. They note that AI performs well on well-documented events that predate its training cutoff but may falter on obscure topics or on events after the cutoff. The speaker shares personal anecdotes about AI inaccuracies, underscoring the need for caution when using AI-generated information. They also discuss the risk of 'model collapse,' where overexposure to AI-generated data can degrade the model's performance. The speaker suggests that using generative AI appropriately, avoiding tasks where exactness is critical, and combining it with other computational models can mitigate these risks.
🛠️ The Future of Generative AI: Ensemble Approaches and Custom Plugins
The speaker envisions the future of generative AI involving ensemble approaches, where multiple AI models are used to cross-verify information, and custom plugins that cater to specific tasks. They predict that underlying AI models will become commoditized, and the true value will lie in the application layers that parse prompts and direct them to the appropriate computational engines. The speaker suggests that companies will need to develop these application layers to leverage the strengths of generative AI while avoiding its pitfalls. They conclude by emphasizing the importance of thoughtful integration of AI into business processes to enhance efficiency and creativity.
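The cross-verification idea can be sketched as a simple vote. This assumes, hypothetically, that each model's answer has already been broken into normalized claims that can be compared as strings; that step is the hard part in practice and is not shown. The `reconcile` helper and the sample claims are invented for this example.

```python
from collections import Counter

def reconcile(answers, min_agreement=0.5):
    """Keep only the claims made by more than min_agreement of the models."""
    counts = Counter()
    for claims in answers:
        counts.update(set(claims))  # count each model at most once per claim
    needed = min_agreement * len(answers)
    return sorted(claim for claim, n in counts.items() if n > needed)

# Hypothetical claims extracted from four models' answers to the same prompt.
answers = [
    {"author wrote a book", "book covers analytics"},
    {"author wrote a book", "book covers analytics", "book was nominated for an award"},
    {"author wrote a book", "book was published in 1990"},
    {"author wrote a book", "book covers analytics"},
]
print(reconcile(answers))  # the claims that only one model made are dropped
```

Agreement is not proof of accuracy, since several models can share the same mistake, but it filters out details that only one model made up.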
Keywords
💡Generative AI
💡Big Data
💡Machine Learning
💡Data Generation
💡AI Functionality
💡Analytics
💡Training Data
💡Model Collapse
💡Ensemble Approaches
💡Application Layers
Highlights
Generative AI is gaining significant attention due to its ability to create entirely new data examples, unlike traditional AI which appends predictions or classifications to existing data.
The emergence of Big Data 10-12 years ago was due to the availability of different types of data, such as text and images, which enabled new use cases.
Generative AI opens up new problem-solving possibilities that were not previously accessible, similar to how Big Data did a decade ago.
An example of generative AI in action is a Wayfair tool that allows users to upload images of their space and generate room layouts with different products.
Generative AI can revolutionize industries by providing instant design mockups, reducing the need for physical visits or manual designs.
Designers, much like data scientists, can focus on more complex work as generative AI automates routine design mockups.
The seamless integration of AI and analytics into user interfaces allows for intuitive interactions without explicit awareness of the underlying technology.
Adobe's product demonstrates generative AI by allowing users to input prompts and generate images for presentations, enhancing creative workflows.
Generative AI's training on vast amounts of data means it has 'seen' more examples than a human, potentially leading to less 'stealing' from individual sources.
Copyright law should apply to AI similarly to how it applies to human learning and creation, according to the speaker's contrarian view on fair use.
Generative AI is not about finding existing data that matches a prompt but creating new, mathematically probable outputs based on the input.
The accuracy of generative AI depends on the recency and obscurity of the information; well-documented events are more likely to be correctly generated.
Generative AI can produce different but not necessarily incorrect answers to the same question, highlighting the probabilistic nature of its responses.
The speaker shares a personal anecdote where generative AI incorrectly 'awarded' their book a nomination, demonstrating the potential for inaccuracies.
Generative AI's ability to handle standardized tests with factual answers does not equate to general intelligence or sentience.
For tasks requiring exact answers, like inventory checks or flight availability, generative AI should not be used to provide probabilistic guesses.
Ensemble approaches that use multiple models and reconcile their answers can improve the accuracy and reliability of generative AI outputs.
The future of generative AI lies in building application layers that can parse prompts and direct different parts to the most suitable computation models.
Underlying generative AI models may become commodities, with value shifting towards companies that can create specialized plugins and application layers.
Transcripts
[Music]
let's start with why is generative AI
blowing up so much and I think there's
an important uh thing that's going on
here which is if you go back 10 years 12
years ago when Big Data hit and it feels
like it was a lot further than that it's
10 to 12 years I always said that big
data was a bad label for that Trend
because what made big data so big wasn't
that there was more data wasn't that it
was bigger data it was that it was
different data so we didn't just have a
bunch more transactions or inventory
data we had things for the first time
like text and images and web browsing
history and so forth and that opened up
entirely new use cases which is why Big
Data went big now ai's been trending for
years but why did it blow up so much in
the past year and it's because
historically every type of machine
learning or analytical model would take
a given row or a given example of data
and append something to it a typically a
prediction a forecast a
classification and the early AI even
though it was going hot and heavy for a
couple years was effectively doing the
same is this image a cat or is it not a
cat is the sentiment of this specific
piece of text good or is it bad and so
while there were definitely new problems
you could solve with that it was the
same type of problems with different
data whereas generative AI on the other
hand now is something totally different
where it uses all of the historical data
we have but then generates a new example
of that data in totality so a whole new
image not classifying an image a whole
new string of texts not classifying a
text and just like with big data then
there's all kinds of new problems we can
solve with that that weren't previously
uh possible and so I think that's a big
reason why it's gone so
big
and there's stuff happening very fast
this is a shot of one of my favorite uh
articles I've read in in recent months
this is Decorify from Wayfair you know
back to companies morphing a classic
cataloger from the day that's that's
gone online and what they have if you
look at the top right corner there's an
image that's kind of cut down the middle
and there's little arrows you upload the
image of your space you enter in a
prompt what you're looking for what
color scheme and such it'll generate an
image with products in it so you can see
what your room would look like and if
you move that little arrow back and
forth it'll just give you more or less
of your room versus the new room and so
if you think about it historically if
you wanted to decorate a room you'd have
to go to a a store even if you're on an
online site maybe get a designer to
actually mock something like this up
which would take time you could only get
one or two examples uh and it was costly
for the the seller now you can go out
and with a just a few prompts I could
say you know I don't want white I want
blue and it'll instantly update and
what's even cooler is that all of those
products are actual products that are
purchasable so it's not just a
hypothetical here's what your room might
look like it is literally this is your
room if you want it you know you could
either buy the whole set or you could
click on certain items and buy the
set and so this is an example of of so
many different types of analytics coming
together including some traditional AI
work some generative AI some traditional
uh propensity modeling and such but when
you think about it from a customer
perspective I can now go out and matter
of minutes have multiple options for
both style and product that I might put
in my room and buy it and be done and
the the retailer of course they're
getting business much faster I still
think though the designer if you're a
designer what you really want to do
probably much like data scientists you
want to work on the cool stuff those
designers don't get excited about having
to do 10 different quick room mockups
for 10 customers today nine of which
won't buy anything anyway they'd rather
be working with those customers that are
going to hire them to really do a full
classic design and I think that'll free
this up so I also don't know that it
threatens the jobs of designers so much
as it lets them focus on the cool
stuff and one of the things that's
happening in this isn't just with
generative AI but it's definitely going
up a notch is this idea of seamlessly
embedding analytics and AI functionality
so that the users aren't even aware
explicitly that they're using it so if
you think back even 5 10 years ago
everyone knew analytics was in use right
you go to a website you know those
offers are building off of whatever
history you had and such but you would
just see what they offered you you
wouldn't really interact with the
analytics per se and even with
traditional analytics this is changing
when you go to your credit card or bank
account and they say check your credit
score when you go to look at the credit
score they'll now all have little
simulators well you can simulate what
would happen if I increase the balance
on a card or opened a new account and
they'll simulate what your score would
be so you're actively interacting with a
model this is from Adobe's um Firefly
product and much like PowerPoint and
Adobe for years have had type in a
keyword it'll give you real pictures
they've just embedded generative AI now
type in your prompt it'll generate
images you can insert them so you have
creative types now who don't know
anything about AI or modeling probably
don't care about how it works but
seamlessly in the same interface and
with the same style of interaction
they've always used are able to take
full advantage of the uh AI capabilities
generative
AI so I want to give you a bit of a
contrarian view this whole battle over
fair use and there's lawsuits all over
about training data and such and so my
contrarian view is that if I were to go
and study art or music as an example
what's a huge part of what you learn
they make you go out and study past
artists past musicians learn about their
style and why they did what they did and
how they did it and then over time
you're supposed to synthesize what you
learn about that art or that music and
then you create your own style which
which is by necessity based off of
everything that you've ever seen of
those artists and musicians now we can't
go too far there's cases you know the
early days of sampling I remember it was
uh Vanilla Ice was one of the first
people got sued because he sampled too
much of a David Bowie song as is and didn't
credit him and he lost a big lawsuit and
had to pay royalties and such and such
so the point is there's already
copyright law and I don't think that AI
should be held to a higher standard
though than we are held to so if I'm an
an art student I could learn
let's say tens of thousands of pieces of
art maybe if I'm a musician tens of
thousands of songs maybe on the outer
edge off of which I would then generate my
output these generative AI algorithms have
looked at millions of photos tens of
millions hundreds of millions every song
ever created so I would argue that in a
way a generative AI song or piece of art
steals a little less quote unquote from
all of its training data than you or I
would steal from from all of our
training data because they have a whole
lot more training data so I think
there's a lot to figure out here but I
do think that this idea that just
because AI looked at an image or
ingested a song and then generated
something that inherently that's a you
know a big problem but again this this
will be in the courts for
years so the biggest misnomer I think
that happens around generative AI is
this idea of hallucinations I'm sure
everybody here has heard of
hallucinations show of hands everybody
everybody's heard of it so here's the
key to understand and this is where you
folks in the room I'll bet a higher
percentage of you know this by far than
the general population but when you go
to Google you type in keywords and what
Google's going to attempt to do is to
match your keywords to real websites
real documents real tags on real images
whatever the case is but the idea is
matching what you've asked for to real
specific items that exist now it might
do a better or worse job of that
sometimes the links won't be as relevant
but it'll always be a real document a
real link
Etc the thing with generative AI it's in
the name generative generative AI is not
attempting to match anything to your
prompt what it's doing is reducing your
prompt to a mathematical representation
passing it through an algorithm to then
spit out a
mathematically uh probable answer it's
effectively making up every single
answer that it gives and so rather than
being concerned about the hallucinations
you should actually be amazed that while
generative AI is literally making it up
word by word every time you ask it a
question it gets a lot of things right
because well documented things
probabilistically speaking is the right
the right answer is what the facts
are but there's some nuances here I went
in and asked the same question tell me
the history of the world in 50 words as
soon as the first answer came back I
resubmitted the exact same question
if you read that neither one of
them I would argue is wrong although
there could be some actually wrong facts
in those if you really look carefully
but they're also not the
same and the interesting thing after I
post about this on LinkedIn a friend of
mine then went and did the same
experiment and within say chat GPT and
similar tools within the context of a
single session it retains some some
memory of it and that's why those two
answers are very similar this other
gentleman submitted the same question
and he got two similar but
different answers but in my case it's
really a Humanity Centric uh history of
the world about civilization both of his
answers started about you know 13
billion years ago the Earth formed and
then there were dinosaurs and then you
know at the end it says there's people
and so between he and I we got four
answers none of which were wrong per se
all of which were different and his
versus mine would have given a
completely different impression of the
history of the world so you just have to
be aware and be cautious when you use
the generative AI that not only could it
give you wrong information but it might
not even give you the same
information a minute from now let alone
if one of your you know colleagues goes
to replicate it a week from
now so what will it get right and wrong
this is actually uh discernible first
thing any recency is important because
the current models that are public I
think it was the end of
2021 uh they were cut off so you asked
something before that it might do a
pretty good job but you ask say about
the Ukraine war there's two problems
with that one it didn't exist in 2021
and two there's a lot of
conflicting uh documentation of the war
in Ukraine and so probabilistically who
knows what you're going to
get but equally important is obscurity
so if I ask about World War II and I ask
about Pearl Harbor or Winston
Churchill the most probable answer is
probably pretty much the right answer
because Pearl Harbor's highly documented
and very consistently documented Winston
Churchill but if I go and ask about an
obscure Colonel and an obscure battle in
the middle of Germany I mean who knows
what you're going to get it's probably
half made up it might be all you know
all bogus so I've gone and asked
it questions about myself and
it came back uh I I remember this one
really threw me it said that my first
book was nominated for an award by
INFORMS and I I was like I don't
remember this and I said well maybe
because I got nominated and didn't win
and no one ever told me I I just didn't
know I went and poked around I could
find no record of it and so even with
myself and I know my history it it made
up a very believable award by a very
real organization that would have had
exactly that award but then you know my
book never had so anyway I didn't update
my bio with that award
nomination so the point is if it can
fool me about myself you got to remember
that it can fool you very easily on
other
things if if you want facts or
subjectivity people have I think wrongly
said that this is you know general
intelligence sentient because it can
pass an SAT or GRE and so forth with a
high level but think about those exams
they ask a very specific question with a
factual answer and then there's four
options that you choose from and so
actually probabilistically speaking if I
say you know who won the case of Brown
versus Board of Education in 1970
whatever it was or 1960 whatever the
point is that ChatGPT should do a good
job of something like that because it's
a Well documented case the outcome is
known and probabilistically it'll
probably get the right answer and so a
lot of those standardized tests are
testing your ability to recall some
general facts more than it is that
you're really having to Think Through
complicated uh
topics and then last you don't want it
to do math or any kind of computation
because remember it's not a calculator
if you say what is 1+ one and it says
two it has not gone and calculated that
1 + 1 equals 2 it is probabilistically
saying if I see the string 1 + 1 what is
the most likely next answer and there's
enough 1 plus 1 equals 2 out there it's
probably going to get two almost every
time but you can go out and find
examples of even late Elementary School
math it'll mess it up left and right
it'll do the wrong order of operation in
a complicated math problem it'll put an
extra parenthesis and it might even get
80 90% of it right but then that one
thing it gets wrong makes the entire
answer wrong
but this isn't to say it's not at all
useful so here's a a prototype I saw
that Amazon's about to roll out you know
when I go to a product and I'm looking
at reviews I get um stressed because I
feel obligated to look at a bunch of
reviews the positive ones and negative
ones and I try and figure out what's the
general pattern here and is the things I
care about going to be affected what
they're doing here is they're taking
those thousands of reviews passing it
through and saying give us a paragraph
summary and then see if you can tag any
of a number of important things like
performance ease of use that there that
this appears to be positive or negative
on that and so the beauty of this is we
don't care it doesn't have to be the
exact same answer like those examples I
gave you earlier as long as it's
generally uh the right direction that's
what we need and in this case they
probably freeze this and it'll show it
to everybody who asks for the summary
reviews up until at some point they'll
refresh it and give it to everyone so
it's not like I'll get a different
answer the next minute the next person
gets a different answer and so these are
these these nuances where this would be
an incredibly powerful approach that
will leverage all the strengths of these
models without actually the negatives
whoops thing just did a double click so
one of the big risks we have is this
idea of model collapse and so any kind
of model as you know it kind of
regresses to the mean you tend to trim
off more and more outliers each pass so
on the top is a real distribution and as
you move to the right that's what
happened over many iterations first the
distribution changed then it converged
on a point
down on the bottom the classic uh hand
handwritten numbers as you take a bunch
of the original human generated data and
pollute it with more and more algorithm
generated data and retrain it it
eventually converges to a bunch of like
Blurry zeros and ones and so the the
power of the current models is that it's
almost all human generated content
training them to predict human generated
output we're soon going to have more
images and more text than was human
generated it might almost be true today
but we're talking very short time and if
we don't filter it properly we'll
basically have generative AI become
completely useless because we won't be
able to have a good model anymore
because we polluted so much generated
stuff that comes down to the mean so we
have to figure this one out uh pretty
quickly and then this gets back you just
don't just don't force generative AI
where it doesn't belong you got to you
got to be thinking about what's the task
you've asked for if you want to know how
much inventory is left you don't want it
to generate a probabilistic answer you
want it to to actually go look up that
answer if I call to know is there a seat
available on that flight and how much
does it cost I don't want a
probabilistic guess I want an actual
answer if on the other hand I want to
know are these reviews generally good or
bad can you give me the main themes that
someone's talked about or here's a draft
I just wrote can you double check this
for some grammar and such it makes a lot
of
sense and so what we're going to start
to see and it's already happening is
this idea of Ensemble approaches using
multiple types of models multiple types
of things and you have to think about
chat GPT everyone talks about that's the
app on top of the underlying model GPT-4
these underlying models are already
commoditized right you can use Microsoft
open AI all of these you don't need to
build your own but you want to build
layers on top and so back to the math
problem instead of
asking do math have the prompt parsed
and realize here's a math question and
pass it to a math engine that's what
Wolfram has now it'll extract the parts
that are asking a math question do a
computation with a computation engine
and spit it back and so what I see over
time will be first of all you'll submit
your prompt to multiple large language
models and then reconcile say three or
four answers together in an ensemble
approach and this engine over here might
have made something up and this engine
over here might have made something up
but hopefully that will be the parts
that aren't consistent and the
consistent Parts across the four will be
will be mostly the accurate pieces so I
think we'll have plugins for physics for
chemistry for certain types of of
physics chemistry math and then last I'll
leave you with this this is where I think the
real value of this if you want to
differentiate as a company is build
these application layers that are going
to parse out the prompt identify what
part of this prompt needs what type of
information computation input and maybe
part of it goes to a a traditional
Google search to actually look up some
documents while another part goes and
does some math another part looks up
something about the chemistry and the
last part crafts some kind of paragraph
summary of the information that was
found but I think this is where this
value of all of this is going to play
out those underlying models will be a
commodity there will be companies
generating um let's say generic add-ons
and then as an organization what you all
are going to need to do is generate your
own little add-ons and plugins that you
can put on top and you'll get your most
value so with that I think I'm right at
about 20 and I'll say uh thank you for
coming I'll be around for an hour or two
[Music]