Are Hallucinations Popping the AI Bubble?
Summary
TLDR: The video script discusses the recent dip in Nvidia stocks and attributes it to the bursting of a bubble around Large Language Models (LLMs), rather than AI as a whole. It highlights the issue of 'hallucinations', where LLMs confidently produce nonsensical outputs. The speaker argues for the integration of symbolic language and logic into AI to address these issues, citing DeepMind's progress in AI for mathematical proofs as an example of 'neurosymbolic' AI. The script concludes that companies focusing solely on LLMs may struggle, while those building AI with logical reasoning and real-world models, like DeepMind, are more likely to succeed.
Takeaways
- 📉 The speaker bought Nvidia stocks hoping to benefit from the AI boom but has seen the stocks drop, attributing the decline to a bubble bursting in the market, specifically around Large Language Models (LLMs).
- 🧐 The speaker believes that the current AI enthusiasm will resume once people realize there's more to AI than just LLMs, and that their stocks will recover as a result.
- 😵 The main issue with LLMs, as highlighted, is the phenomenon of 'hallucinations' where the models confidently produce nonsensical outputs.
- 📚 Improvements have been made in LLMs to avoid obvious errors, such as providing real book recommendations, but they still invent references, citing non-existent papers and reports.
- 🤔 The speaker uses the example of Kindergarten riddles to illustrate how LLMs can provide answers that are 'close' to correct but fundamentally flawed due to missing context.
- 🔍 The crux of the problem is identified as the difference in metrics for 'good' output between humans and models, suggesting that models lack an understanding of what makes their output valuable to humans.
- 💡 A proposed solution is the integration of logic and symbolic language into AI, termed 'neurosymbolic', which could address many of the issues with LLMs by providing a more human-like logical structure.
- 🏆 The speaker cites DeepMind's progress in AI for mathematical proofs as an example of the potential of neurosymbolic AI, emphasizing the importance of logical rigor in AI development.
- 🤖 The speaker suggests that building intelligent AI requires starting with mathematical and physical reality models, then layering language on top, rather than focusing solely on language models.
- 💸 There is a warning that companies heavily invested in LLMs may not recover their expenses, and that those building AI on logical reasoning and real-world models, like DeepMind, are likely to be the winners.
- 😸 The speaker ends with a humorous note about the potential for AI to manage stocks and the recommendation of Brilliant.org for those interested in learning more about AI and related fields, offering a discount for new users.
Q & A
Why did the speaker decide to buy Nvidia stocks?
-The speaker bought Nvidia stocks to benefit from the AI boom, as they wanted to get something out of the advancements in artificial intelligence.
What is the speaker's opinion on the current state of AI and its impact on the stock market?
-The speaker believes that the current drop in AI-related stocks is not due to a problem with AI itself, but rather with a specific type of AI called Large Language Models, and they are optimistic that AI enthusiasm will resume once people realize there's more to AI.
What is the main issue with Large Language Models according to the script?
-The main issue with Large Language Models is that they sometimes produce 'hallucinations,' which means they confidently generate incorrect or nonsensical information.
How have Large Language Models improved in providing book recommendations?
-Large Language Models have become better at avoiding obvious pitfalls by listing books that actually exist, tying their output more closely to the training set.
What is an example of a problem that Large Language Models still struggle with?
-An example is when solving riddles like the wolf, goat, and cabbage problem, where leaving out critical information leads the model to provide an answer that is logically incorrect, even though it uses similar words to the original problem.
What is the fundamental problem the speaker identifies with Large Language Models?
-The fundamental problem is that Large Language Models use a different metric for 'good' output than humans do, and this discrepancy cannot be fixed by simply training the models with more input.
What solution does the speaker propose to improve the output of AI models?
-The speaker suggests teaching AI logic and using symbolic language, similar to what math software uses, and combining this with neural networks in an approach called 'neurosymbolic' AI.
Why does the speaker believe that AI needs to be built on logical reasoning and models of the real world?
-The speaker believes that because the world at its deepest level is mathematical, building intelligent AI requires starting with math and models of physical reality, and then adding words on top.
What is the speaker's view on the future of companies that have invested heavily in Large Language Models?
-The speaker thinks that companies that have invested heavily in Large Language Models may never recover those expenses, and the winners will be those who build AI on logical reasoning and models of the real world.
What does the speaker suggest for AI researchers to focus on?
-The speaker suggests that AI researchers should think less about words and more about physics, implying that a deeper understanding of the physical world is crucial for developing truly intelligent AI.
What resource does the speaker recommend for learning more about neural networks and large language models?
-The speaker recommends Brilliant.org for its interactive visualizations and follow-up questions on a variety of topics, including neural networks and large language models.
Outlines
📉 AI Stock Woes and the Large Language Model Challenge
The speaker begins by sharing their experience of purchasing Nvidia stocks in hopes of capitalizing on the AI boom, only to witness a decline in stock value. They argue that the current market panic is not a reflection of AI's potential but rather a response to the shortcomings of Large Language Models (LLMs). The speaker highlights the issue of 'hallucinations' in LLMs, where these models produce nonsensical outputs with confidence. Despite improvements in avoiding obvious errors, such as generating real book recommendations, LLMs still struggle with logical consistency, as illustrated by the kindergarten riddle example. The speaker emphasizes that the problem lies in the models' inability to discern 'good' output based on human metrics, suggesting that integrating logic and symbolic language, as seen in neurosymbolic AI, could resolve many of these issues. They conclude by expressing optimism for AI's future, particularly in approaches that combine neural networks with logical reasoning, citing DeepMind's progress in AI-assisted mathematical proofs as an encouraging sign.
🧠 Beyond Words: The Necessity of Logic in AI Development
In the second paragraph, the speaker addresses the limitations of training large language models on vast amounts of text and images, arguing that such an approach does not lead to true understanding. They discuss the concept of 'linguistic confusion', pointing out the inherent subjectivity in language use and the difficulty of establishing logical relations from diverse textual sources. The speaker advocates for a foundational approach to AI that starts with mathematical and physical models, suggesting that companies investing solely in LLMs may not recoup their investments. They predict that success in AI will come to those who build on logical reasoning and real-world models, using DeepMind's virtual mouse example to illustrate this point. The speaker ends with a tongue-in-cheek comment about the potential chaos of AI in stock management and promotes Brilliant.org for those interested in learning more about neural networks and AI, offering a discount for new users through a provided link.
Keywords
💡Nvidia stocks
💡AI boom
💡Large Language Models (LLMs)
💡Hallucinations
💡Symbolic language
💡Neurosymbolic AI
💡DeepMind
💡Linguistic confusion
💡Physical reality models
💡Brilliant.org
Highlights
Investment in Nvidia stocks was motivated by the AI boom, but the stocks have been dropping due to investor panic.
The bubble currently bursting is not that of AI itself, but that of Large Language Models (LLMs).
AI enthusiasm is expected to resume as people realize there's more to AI than just LLMs.
LLMs are known to produce 'hallucinations', confidently generating incorrect information.
An example of LLMs' limitations includes citing non-existent legal cases, as seen with a lawyer using ChatGPT.
LLMs have improved in avoiding obvious pitfalls, such as recommending real books, but still struggle with made-up references.
For LLMs, correct output is quantifiably 'close' to wrong output, as illustrated with modified riddles.
The issue with LLMs is the discrepancy between human and model metrics for what constitutes 'good' output.
A potential solution is 'neurosymbolic' AI, combining neural networks with symbolic language for improved logic.
DeepMind's progress in AI for mathematical proofs demonstrates the potential of neurosymbolic AI.
Neurosymbolic AI could resolve many issues with LLMs by applying logical rigor to verbal arguments.
The need for AI to understand and use logic, as opposed to just processing more text and images.
The difficulty of integrating symbolic reasoning with existing models due to 'linguistic confusion'.
The necessity for retraining large language models to incorporate a deeper understanding of logic and physical reality.
The world's fundamental nature is mathematical, suggesting AI should be built on models of physical reality.
Companies that invested heavily in LLMs may not recover their expenses, unlike those focusing on logical reasoning.
DeepMind's virtual mouse example illustrates a promising approach to building intelligent AI.
A call for AI researchers to focus more on physics and less on words for truly intelligent AI development.
Recommendation of Brilliant.org for learning about neural networks, large language models, and other scientific topics.
Transcripts
A few weeks ago, I finally came around to buy a few Nvidia stocks because, hey,
I also want to get something out of the AI boom. These stocks have been
dropping ever since. Why? Oh right, I don’t believe in god.
Yes, so, it doesn’t look good for AI at the moment as investors are panicking and
stocks are dropping. But in this video I want to make a case that this bubble
which is currently bursting is not that of AI per se, it’s that of the specific
type of AI called Large Language Models. I am sure AI enthusiasm will resume once
people get it into their head that there’s more to AI. And my stocks will recover.
The best known problem with Large Language Models is what has become known as “hallucinations”.
They sometimes confidently ramble along and produce nonsense. You’d think we all
learned this lesson in 2022, but then there was the lawyer who used ChatGPT
to cook up a defence and ended up citing cases that simply didn’t exist. Oops.
Large Language Models have become better at avoiding some obvious pitfalls. For example,
if you ask ChatGPT for book recommendations it will now
list books that actually exist. It still often refers to made-up papers
and reports though. And Midjourney now for the most part puts 5 fingers on each hand,
so much so that if you explicitly ask for a hand with 6 fingers, it will still have 5 fingers.
You can do this by tying some output closely to the training set. That might
make the problem appear solvable. But it’s not that simple because hallucinations are
just one symptom of a much bigger problem, which is that for a Large
Language Model, correct output is -- in a quantifiable sense -- “close” to wrong output.
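That lexical notion of “close” can be made concrete. A minimal sketch (my own illustration, not from the video) measures word-overlap similarity between the classic riddle and a stripped-down variant whose correct answer is different:

```python
def jaccard(a, b):
    """Jaccard similarity of the two texts' word sets: |A & B| / |A | B|."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

# Classic riddle, and a variant with the eating constraints left out.
original = ("A farmer must ferry a wolf, a goat, and a cabbage across a river. "
            "The boat carries the farmer and one item. Left unattended, the wolf "
            "eats the goat and the goat eats the cabbage.")
modified = ("A farmer must ferry a wolf, a goat, and a cabbage across a river. "
            "The boat carries the farmer and one item.")

# High lexical similarity, yet the correct plans differ.
print(f"similarity: {jaccard(original, modified):.2f}")
```

A model scoring outputs by this kind of surface similarity would treat the two riddles as near-identical, which is exactly the failure mode described above.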
A very illustrative example comes from Colin Fraser who has been using modified versions
of Kindergarten riddles, like the wolf, goat, and cabbage problem. In this riddle,
the farmer has to get all three in a boat across the river. But the boat will only carry one item
in addition to the farmer. Left unattended, the wolf will eat the goat, and the goat will eat
the cabbage. The solution is that the farmer has to take one of the items back on a trip.
If you ask a large language model this question but leave out the
information that the wolf will eat the goat and the goat the cabbage
then it will still give the same answer, which now makes no sense.
I like this example because it’s obvious what’s going wrong. By way of word content, the altered
riddle is similar to the riddle that the models have been trained on. So they extrapolate from
what they know and spit out an answer that is close to the answer for the original riddle.
But as with hallucinations these answers are “close” in a sense that we don’t care about. Yes,
they use similar words. But the content is wrong. It’s like in some sense a hand with
six fingers is “close” to one with five fingers. But it’s still wrong.
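By contrast, a symbolic solver derives the plan from the stated constraints, so changing the constraints changes the answer. Here is a minimal sketch (my own, not from the video) that searches the state space of the river-crossing puzzle:

```python
from collections import deque

ITEMS = frozenset({"wolf", "goat", "cabbage"})

def solve(forbidden):
    """Find the shortest crossing plan by breadth-first search.

    A state is (items on the left bank, side the farmer is on).
    `forbidden` lists the pairs that must never be left unattended.
    Returns a list of (cargo, destination) moves, or None.
    """
    start = (ITEMS, "left")
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (left, side), path = queue.popleft()
        if not left and side == "right":
            return path                    # everything made it across
        bank = left if side == "left" else ITEMS - left
        for cargo in [None] + sorted(bank):  # cross alone or with one item
            moved = {cargo} if cargo else set()
            new_left = left - moved if side == "left" else left | moved
            new_side = "right" if side == "left" else "left"
            unattended = new_left if new_side == "right" else ITEMS - new_left
            if any(pair <= unattended for pair in forbidden):
                continue                   # something would get eaten
            state = (new_left, new_side)
            if state not in seen:
                seen.add(state)
                queue.append((state, path + [(cargo, new_side)]))
    return None

classic = {frozenset({"wolf", "goat"}), frozenset({"goat", "cabbage"})}
print(len(solve(classic)))  # 7 crossings: the goat makes a return trip
print(len(solve(set())))    # 5 crossings once the eating rules are dropped
```

Because the plan is computed from the constraints rather than retrieved from similar wording, dropping the eating rules produces the genuinely different (and shorter) answer, instead of the memorized seven-move one.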
The issue is that we have a different metric for good output than the one that the models use.
When I say “we” I mean humans, just in case there are some misunderstandings. And these different
metrics for what is “good” are a problem you can’t fix by just training a model with more and more
input. It’s fundamentally missing information about what makes its output good for us.
The solution is to teach AI logic and to use symbolic language, similar to what most
maths software uses. If you combine that with a neural network, it’s called “neurosymbolic”.
This can fix a lot of problems with large language models and some of those approaches already exist.
For example, I mentioned already in January that DeepMind made remarkable progress with using
AI for mathematical proofs. Just last month they reported that their maths AI now reached the level
of a silver medallist in the Maths Olympiad. Not only does it solve the problems, it also provides
a proof that humans can understand. Well, kind of.
The relevant point isn’t that AI can solve Maths Olympiad problems because let’s be honest,
who really cares. The relevant point is that this AI can parse the problems,
and can formulate logically correct answers that a human can understand. Apply this logical rigor to
verbal arguments, and boom, a lot of problems with large language models will disappear.
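One cheap way to apply that rigor, sketched below with hypothetical names (my own illustration, not DeepMind's method): let a language model propose a plan, then have a symbolic checker simulate it against the explicit rules before accepting it.

```python
ITEMS = frozenset({"wolf", "goat", "cabbage"})

def check_plan(plan, forbidden):
    """Simulate a proposed crossing plan; accept it only if it is legal.

    `plan` is a list of (cargo, destination) moves; `forbidden` holds the
    pairs that must never be left on a bank without the farmer.
    """
    left, side = set(ITEMS), "left"
    for cargo, dest in plan:
        if dest == side:
            return False                    # the farmer must actually cross
        bank = left if side == "left" else ITEMS - left
        if cargo is not None and cargo not in bank:
            return False                    # cargo isn't on the farmer's bank
        if cargo:
            (left.discard if side == "left" else left.add)(cargo)
        side = dest
        unattended = left if side == "right" else ITEMS - left
        if any(pair <= unattended for pair in forbidden):
            return False                    # something would get eaten
    return not left and side == "right"     # everything made it across

# A plan pattern-matched from the classic riddle, minus the goat's return trip:
naive = [("goat", "right"), (None, "left"), ("wolf", "right"),
         (None, "left"), ("cabbage", "right")]
rules = {frozenset({"wolf", "goat"}), frozenset({"goat", "cabbage"})}
print(check_plan(naive, rules))   # False: wolf and goat get left together
print(check_plan(naive, set()))   # True once the eating rules are removed
```

The checker never generates anything itself; it only vetoes fluent-but-wrong proposals, which is the division of labor the neurosymbolic approach aims for.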
Imagine an AI that could win every internet argument. Reddit
would become a ghost town overnight. What I just told you is neither new
nor particularly original. It’s been pointed out for decades by many computer scientists,
including Gary Marcus and others. I just want to say: I think they’re right.
You can’t just train large language models on more and more text and images and hope that
it will begin to “understand” what’s going on. And I think no one really expected that.
That said, it’s more difficult than lumping symbolic reasoning on top of the already existing
models, basically because of what Wittgenstein called “linguistic confusion”. It’s that no
two people use a word to mean exactly the same thing. And once you have lumped together
text from billions of different people, logical relations between these words become washed out,
if there ever were any to begin with. I mean, it’s not like people are all that good with
logic. So I’m afraid that the already trained large language models will have to be retrained.
Ultimately, the problem with large language models is that the world is not made of
words. At the deepest level we know of, the world is mathematics. If you want to
build an intelligent AI you need to start with maths, and with models about physical reality,
and then put words on top of that. What this all means is that companies
which have poured a lot of money into large language models might never recover
those expenses. The winners will eventually be those who build an AI on logical reasoning and
models of the real world, like DeepMind. What you see here is a recent example in
which they created a virtual mouse in a virtual environment that’s moving with its own neural
network modelled after a real mouse brain. This, I think, is how you’ll get to really intelligent
AI. Next up: virtual cats chasing virtual mice across your computer screen during important Zoom
calls. DeepMind was acquired in 2014 by Google, so I haven’t yet lost faith in my Google stocks.
The brief summary is that all those people working on AI need to think less about words and
more about physics. Just wait until people start using AI to manage their stocks, it’ll be great.
Artificial intelligence is really everywhere these days. If you want to learn more about
how neural networks and large language models work, I recommend you check out the courses on
Brilliant.org. All courses on Brilliant have interactive visualizations and come
with follow-up questions. I found it to be very effective to learn something new. It really gives
you a feeling for what's going on and helps you build general problem-solving skills. They cover
a large variety of topics in science, computer science, and maths. From general scientific
thinking to dedicated courses on differential equations or large language models. And they're
adding new courses each month. It's a fast and easy way to learn and you can do it whenever and
wherever you have the time. Sounds good? I hope it does! You can try Brilliant yourself for free
if you use my link brilliant.org/sabine that way you'll get to try out everything Brilliant has
to offer for a full 30 days and you'll get 20% off the annual premium subscription. So go and
give it a try, I’m sure you won’t regret it. Thanks for watching, see you tomorrow.