Are Hallucinations Popping the AI Bubble?

Sabine Hossenfelder
15 Aug 2024 · 08:37

Summary

TL;DR: The video discusses the recent dip in Nvidia stock and attributes it to the bursting of a bubble around Large Language Models (LLMs), rather than around AI as a whole. It highlights the problem of 'hallucinations' in LLMs, where nonsensical outputs are produced with confidence. The speaker argues for integrating symbolic language and logic into AI to address these issues, citing DeepMind's progress in AI for mathematical proofs as an example of 'neurosymbolic' AI. The video concludes that companies focusing solely on LLMs may struggle, while those building AI on logical reasoning and models of the real world, like DeepMind, are more likely to succeed.

Takeaways

  • 📉 The speaker bought Nvidia stocks hoping to benefit from the AI boom but has seen the stocks drop, attributing the decline to a bubble bursting in the market, specifically around Large Language Models (LLMs).
  • 🧐 The speaker believes that the current AI enthusiasm will resume once people realize there's more to AI than just LLMs, and that their stocks will recover as a result.
  • 😵 The main issue with LLMs, as highlighted, is the phenomenon of 'hallucinations', where the models confidently produce nonsensical outputs.
  • 📚 LLMs have improved at avoiding obvious errors, such as recommending books that don't exist, but they still fabricate references to non-existent papers and reports.
  • 🤔 The speaker uses modified kindergarten riddles to illustrate how LLMs give answers that are 'close' to the memorized solution in wording but nonsensical once the riddle's constraints change.
  • 🔍 The crux of the problem is identified as the difference in metrics for 'good' output between humans and models, suggesting that models lack an understanding of what makes their output valuable to humans.
  • 💡 A proposed solution is the integration of logic and symbolic language into AI, termed 'neurosymbolic', which could address many of the issues with LLMs by providing a more human-like logical structure.
  • 🏆 The speaker cites DeepMind's progress in AI for mathematical proofs as an example of the potential of neurosymbolic AI, emphasizing the importance of logical rigor in AI development.
  • 🤖 The speaker suggests that building intelligent AI requires starting with mathematical and physical reality models, then layering language on top, rather than focusing solely on language models.
  • 💸 There is a warning that companies heavily invested in LLMs may not recover their expenses, and that those building AI on logical reasoning and real-world models, like DeepMind, are likely to be the winners.
  • 😸 The speaker ends with a humorous note about the potential for AI to manage stocks and the recommendation of Brilliant.org for those interested in learning more about AI and related fields, offering a discount for new users.

Q & A

  • Why did the speaker decide to buy Nvidia stocks?

    -The speaker bought Nvidia stocks to benefit from the AI boom, as they wanted to get something out of the advancements in artificial intelligence.

  • What is the speaker's opinion on the current state of AI and its impact on the stock market?

    -The speaker believes that the current drop in AI-related stocks is not due to a problem with AI itself, but rather with a specific type of AI called Large Language Models, and they are optimistic that AI enthusiasm will resume once people realize there's more to AI.

  • What is the main issue with Large Language Models according to the script?

    -The main issue with Large Language Models is that they sometimes produce 'hallucinations,' which means they confidently generate incorrect or nonsensical information.

  • How have Large Language Models improved in providing book recommendations?

    -Large Language Models have become better at avoiding obvious pitfalls by listing books that actually exist, tying their output more closely to the training set.

  • What is an example of a problem that Large Language Models still struggle with?

    -An example is the wolf, goat, and cabbage riddle: if the prompt leaves out the information that the wolf eats the goat and the goat eats the cabbage, the model still gives the memorized answer, which no longer makes sense even though it uses words similar to the original problem (see the code sketch at the end of this Q & A section).

  • What is the fundamental problem the speaker identifies with Large Language Models?

    -The fundamental problem is that Large Language Models use a different metric for 'good' output than humans do, and this discrepancy cannot be fixed by simply training the models with more input.

  • What solution does the speaker propose to improve the output of AI models?

    -The speaker suggests teaching AI logic and using symbolic language, similar to what math software uses, and combining this with neural networks in an approach called 'neurosymbolic' AI.

  • Why does the speaker believe that AI needs to be built on logical reasoning and models of the real world?

    -The speaker believes that because the world at its deepest level is mathematical, building intelligent AI requires starting with math and models of physical reality, and then adding words on top.

  • What is the speaker's view on the future of companies that have invested heavily in Large Language Models?

    -The speaker thinks that companies that have invested heavily in Large Language Models may never recover those expenses, and the winners will be those who build AI on logical reasoning and models of the real world.

  • What does the speaker suggest for AI researchers to focus on?

    -The speaker suggests that AI researchers should think less about words and more about physics, implying that a deeper understanding of the physical world is crucial for developing truly intelligent AI.

  • What resource does the speaker recommend for learning more about neural networks and large language models?

    -The speaker recommends Brilliant.org for its interactive visualizations and follow-up questions on a variety of topics, including neural networks and large language models.
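
To make the riddle discussion above concrete, here is a minimal sketch (in Python; illustrative code, not from the video) that solves the wolf, goat, and cabbage puzzle by breadth-first search over states. The point is that a symbolic solver reads the constraints directly from the problem: drop the "wolf eats goat" and "goat eats cabbage" rules, and the search finds a shorter five-crossing plan instead of repeating the memorized seven-crossing answer, which is exactly the adaptation the video says language models fail at.

```python
from collections import deque

ITEMS = ("wolf", "goat", "cabbage")

def solve(forbidden):
    """Breadth-first search over river-crossing states.

    A state is (farmer_side, frozenset of items on the left bank).
    `forbidden` lists the pairs that must not be left alone together,
    e.g. {("wolf", "goat"), ("goat", "cabbage")}.
    """
    start = ("left", frozenset(ITEMS))

    def safe(farmer, left):
        # A forbidden pair is only dangerous on the bank the farmer is NOT on.
        for bank, items in (("left", left), ("right", frozenset(ITEMS) - left)):
            if bank != farmer and any({a, b} <= items for a, b in forbidden):
                return False
        return True

    queue, seen = deque([(start, [])]), {start}
    while queue:
        (farmer, left), path = queue.popleft()
        if farmer == "right" and not left:
            return path  # everything is across: done
        bank_items = left if farmer == "left" else frozenset(ITEMS) - left
        # Each crossing: the farmer goes alone or takes one item from his bank.
        for cargo in [None, *bank_items]:
            new_left = set(left)
            if cargo:
                (new_left.remove if farmer == "left" else new_left.add)(cargo)
            state = ("right" if farmer == "left" else "left", frozenset(new_left))
            if state not in seen and safe(*state):
                seen.add(state)
                queue.append((state, path + [cargo or "(nothing)"]))
    return None

# Original riddle: seven crossings, the goat has to come back once.
print(solve({("wolf", "goat"), ("goat", "cabbage")}))
# Modified riddle with the eating rules removed: the solver adapts and
# simply ferries the items across in five crossings.
print(solve(set()))
```

Because the constraints enter as data rather than as remembered text, changing the problem changes the answer, which is the behaviour the video argues pure LLMs lack.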

Outlines

00:00

📉 AI Stock Woes and the Large Language Model Challenge

The speaker begins by sharing their experience of purchasing Nvidia stocks in hopes of capitalizing on the AI boom, only to witness a decline in stock value. They argue that the current market panic is not a reflection of AI's potential but rather a response to the shortcomings of Large Language Models (LLMs). The speaker highlights the issue of 'hallucinations' in LLMs, where these models produce nonsensical outputs with confidence. Despite improvements in avoiding obvious errors, such as generating real book recommendations, LLMs still struggle with logical consistency, as illustrated by the kindergarten riddle example. The speaker emphasizes that the problem lies in the models' inability to discern 'good' output based on human metrics, suggesting that integrating logic and symbolic language, as seen in neurosymbolic AI, could resolve many of these issues. They conclude by expressing optimism for AI's future, particularly in approaches that combine neural networks with logical reasoning, citing DeepMind's progress in AI-assisted mathematical proofs as an encouraging sign.
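
As a side note on what "close in a quantifiable sense" can mean, the toy sketch below (illustrative only; real models use learned embeddings, not word counts) shows how a surface-level similarity metric scores a wrong answer as nearly identical to the memorized one, while a genuinely correct but differently worded answer scores far lower.

```python
from collections import Counter
from math import sqrt

def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = sqrt(sum(c * c for c in va.values()))
    norm_b = sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

memorized = "the farmer takes the goat first and later brings the goat back"
wrong     = "the farmer takes the goat first and later brings the wolf back"
correct   = "just ferry each item across one at a time"

# Word-level closeness is not correctness: the wrong answer shares almost
# every word with the memorized one and scores near 1.0, while the correct
# answer to the modified riddle shares none and scores 0.
print(bow_cosine(memorized, wrong))    # high, ~0.95
print(bow_cosine(memorized, correct))  # low, 0.0 here
```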

05:01

🧠 Beyond Words: The Necessity of Logic in AI Development

In the second paragraph, the speaker addresses the limitations of training large language models on vast amounts of text and images, arguing that such an approach does not lead to true understanding. They discuss the concept of 'linguistic confusion', pointing out the inherent subjectivity in language use and the difficulty of establishing logical relations from diverse textual sources. The speaker advocates for a foundational approach to AI that starts with mathematical and physical models, suggesting that companies investing solely in LLMs may not recoup their investments. They predict that success in AI will come to those who build on logical reasoning and real-world models, using DeepMind's virtual mouse example to illustrate this point. The speaker ends with a tongue-in-cheek comment about the potential chaos of AI in stock management and promotes Brilliant.org for those interested in learning more about neural networks and AI, offering a discount for new users through a provided link.

Keywords

💡Nvidia stocks

Nvidia is a leading technology company known for its graphics processing units (GPUs) and a major supplier of AI hardware. The video discusses buying Nvidia stock as an investment in the AI boom; the stock has since dropped in value, reflecting fluctuating investor confidence in AI technologies.

💡AI boom

The term 'AI boom' refers to the rapid growth and increased interest in artificial intelligence technologies. It is the backdrop against which the video's narrative is set, as the speaker discusses the investment in AI and its potential despite current market fluctuations.

💡Large Language Models (LLMs)

Large Language Models are a type of AI that processes and generates human-like text based on vast amounts of data. The video identifies LLMs as a specific area of AI that is currently facing challenges, such as producing 'hallucinations' or nonsensical outputs, which is causing a bubble to burst in the AI market.

💡Hallucinations

In the context of AI, 'hallucinations' refer to the phenomenon where an AI model confidently generates incorrect or nonsensical information. The video uses this term to describe a flaw in LLMs, where they may produce outputs that seem plausible but are factually incorrect, such as citing non-existent legal cases.

💡Symbolic language

Symbolic language in AI refers to the use of structured, logical representations of knowledge, akin to mathematical notation. The video suggests that teaching AI to use symbolic language, as part of a neurosymbolic approach, could address the issues with LLMs and lead to more accurate and logical AI outputs.

💡Neurosymbolic AI

Neurosymbolic AI is a hybrid approach that combines neural networks with symbolic reasoning. The video posits that this approach could solve many of the problems associated with LLMs by integrating logic and structured knowledge representation into AI systems.
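
A toy sketch of the idea (illustrative only, and not DeepMind's actual method): a stand-in for a neural model proposes candidate answers, and a symbolic layer, here the SymPy library, verifies each candidate against the formal problem statement before anything is returned. Hallucinated candidates fail verification and are discarded.

```python
import sympy as sp

x = sp.Symbol("x")

def propose_candidates(problem: str):
    # Stand-in for a neural model: plausible-looking guesses, some right
    # and some wrong, much like raw LLM output for a maths question.
    return [sp.Integer(2), sp.Integer(-2), sp.Integer(3)]

def verified_answers(equation):
    """Neurosymbolic filter: keep only candidates the symbolic layer verifies."""
    checked = []
    for cand in propose_candidates(str(equation)):
        # Substitute the candidate and let SymPy decide whether the
        # equation holds exactly; no statistical 'closeness' involved.
        if sp.simplify(equation.lhs.subs(x, cand) - equation.rhs.subs(x, cand)) == 0:
            checked.append(cand)
    return checked

print(verified_answers(sp.Eq(x**2, 4)))  # [2, -2]; the hallucinated 3 is rejected
```

This generate-and-verify pattern is, broadly, how the video frames the combination of a neural proposer with a formal checker.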

💡DeepMind

DeepMind is an AI company known for its advances in creating AI systems that can solve complex problems, such as mathematical proofs. The video highlights DeepMind's progress as an example of the potential for AI that goes beyond LLMs and incorporates logical reasoning and models of the physical world.

💡Linguistic confusion

Linguistic confusion refers to the idea that different people use words with slightly different meanings, leading to ambiguity. The video discusses this concept as a challenge for AI, especially when training on text from billions of individuals, which can obscure logical relationships between words.

💡Physical reality models

Physical reality models are representations of how the physical world operates, often based on mathematical principles. The video argues that building AI on a foundation of physical reality models and mathematics, rather than just words, is key to creating truly intelligent AI systems.

💡Brilliant.org

Brilliant.org is an online platform offering interactive courses in various fields, including science, computer science, and mathematics. The video recommends it as a resource for learning more about neural networks, large language models, and other AI-related topics, with a special offer for viewers to try the platform.

Highlights

Investment in Nvidia stocks was motivated by the AI boom, but the stocks have been dropping due to investor panic.

The bubble currently bursting is not that of AI itself, but of Large Language Models (LLMs).

AI enthusiasm is expected to resume as people realize there's more to AI than just LLMs.

LLMs are known to produce 'hallucinations', confidently generating incorrect information.

An example of LLMs' limitations includes citing non-existent legal cases, as seen with a lawyer using ChatGPT.

LLMs have improved at avoiding obvious pitfalls, such as recommending non-existent books, but they still cite made-up papers and reports.

For LLMs, correct output is quantifiably 'close' to wrong output, as illustrated with modified riddles.

The issue with LLMs is the discrepancy between human and model metrics for what constitutes 'good' output.

A potential solution is 'neurosymbolic' AI, combining neural networks with symbolic language for improved logic.

DeepMind's progress in AI for mathematical proofs demonstrates the potential of neurosymbolic AI.

Neurosymbolic AI could resolve many issues with LLMs by applying logical rigor to verbal arguments.

The need for AI to understand and use logic, as opposed to just processing more text and images.

The difficulty of integrating symbolic reasoning with existing models due to 'linguistic confusion'.

The necessity for retraining large language models to incorporate a deeper understanding of logic and physical reality.

The world's fundamental nature is mathematical, suggesting AI should be built on models of physical reality.

Companies that invested heavily in LLMs may not recover their expenses, unlike those focusing on logical reasoning.

DeepMind's virtual mouse example illustrates a promising approach to building intelligent AI.

A call for AI researchers to focus more on physics and less on words for truly intelligent AI development.

Recommendation of Brilliant.org for learning about neural networks, large language models, and other scientific topics.

Transcripts

play00:00

A few weeks ago, I finally came around to buying a few Nvidia stocks because, hey,

play00:05

I also want to get something out of  the AI boom. These stocks have been  

play00:09

dropping ever since. Why? Oh  right, I don’t believe in god.

play00:14

Yes, so, it doesn’t look good for AI at  the moment as investors are panicking and  

play00:18

stocks are dropping. But in this video  I want to make a case that this bubble  

play00:23

which is currently bursting is not that  of AI per se, it’s that of the specific  

play00:28

type of AI called Large Language Models. I am sure AI enthusiasm will resume once  

play00:35

people get it into their head that there’s  more to AI. And my stocks will recover.

play00:40

The best known problem with Large Language Models  is what has become known as “hallucinations”.  

play00:46

They sometimes confidently ramble along  and produce nonsense. You’d think we all  

play00:52

learned this lesson in 2022, but then  there was the lawyer who used ChatGPT  

play00:58

to cook up a defence and ended up citing  cases that simply didn’t exist. Oops.

play01:03

Large Language Models have become better at  avoiding some obvious pitfalls. For example,  

play01:09

if you ask ChatGPT for book  recommendations it will now  

play01:13

list books that actually exist. It  still often refers to made-up papers  

play01:18

and reports though. And Midjourney now for  the most part puts 5 fingers on each hand,  

play01:24

so much so that if you explicitly ask for a hand  with 6 fingers, it will still have 5 fingers.

play01:31

You can do this by tying some output  closely to the training set. That might  

play01:36

make the problem appear solvable. But it’s  not that simple because hallucinations are  

play01:42

just one symptom of a much bigger  problem, which is that for a Large  

play01:46

Language Model, correct output is -- in a quantifiable sense -- “close” to wrong output.

play01:53

A very illustrative example comes from Colin  Fraser who has been using modified versions  

play01:59

of Kindergarten riddles, like the wolf,  goat, and cabbage problem. In this riddle,  

play02:05

the farmer has to get all three in a boat across  the river. But the boat will only carry one item  

play02:11

in addition to the farmer. Left unattended, the  wolf will eat the goat, and the goat will eat  

play02:17

the cabbage. The solution is that the farmer  has to take one of the items back on a trip.

play02:23

If you ask a large language model  this question but leave out the  

play02:28

information that the wolf will eat  the goat and the goat the cabbage  

play02:32

then it will still give the same  answer, which now makes no sense.

play02:37

I like this example because it’s obvious what’s  going wrong. By way of word content, the altered  

play02:44

riddle is similar to the riddle that the models  have been trained on. So they extrapolate from  

play02:50

what they know and spit out an answer that is  close to the answer for the original riddle. 

play02:56

But as with hallucinations these answers are  “close” in a sense that we don’t care about. Yes,  

play03:03

they use similar words. But the content is  wrong. It’s like in some sense a hand with  

play03:09

six fingers is “close” to one with  five fingers. But it’s still wrong.

play03:14

The issue is that we have a different metric for  good output than the one that the models use.  

play03:21

When I say “we” I mean humans, just in case there  are some misunderstandings. And these different  

play03:27

metrics for what is “good” are a problem you can’t  fix by just training a model with more and more  

play03:33

input. It’s fundamentally missing information  about what makes its output good for us.

play03:39

The solution is to teach AI logic and to use symbolic language, similar to what most

play03:45

maths software uses. If you combine that with  a neural network, it’s called “neurosymbolic”.  

play03:52

This can fix a lot of problems with large language  models and some of those approaches already exist. 

play03:58

For example, I mentioned already in January that DeepMind made remarkable progress in using

play04:04

AI for mathematical proofs. Just last month they  reported that their maths AI now reached the level  

play04:11

of a silver medallist in the maths Olympics. Not only does it solve the problems, it also provides

play04:19

a proof that humans can understand. Well, kind of.

play04:23

The relevant point isn’t that AI can solve maths Olympics problems because let’s be honest,

play04:28

who really cares. The relevant point  is that this AI can parse the problems,  

play04:33

and can formulate logically correct answers that a  human can understand. Apply this logical rigor to  

play04:41

verbal arguments, and boom, a lot of problems  with large language models will disappear.

play04:47

Imagine an AI that could win  every internet argument. Reddit  

play04:51

would become a ghost town overnight. What I just told you is neither new  

play04:55

nor particularly original. It’s been pointed  out for decades by many computer scientists,  

play05:01

including Gary Marcus and others. I just want to say: I think they’re right.

play05:08

You can’t just train large language models on  more and more text and images and hope that  

play05:13

it will begin to “understand” what’s going  on. And I think no one really expected that.

play05:19

That said, it’s more difficult than lumping  symbolic reasoning on top of the already existing  

play05:25

models, basically because of what Wittgenstein called “linguistic confusion”. It’s that no

play05:32

two people use a word to mean exactly the  same. And once you have lumped together  

play05:37

text from billions of different people, logical  relations between these words become washed out,  

play05:44

if there ever were any to begin with. I mean  it’s not like people are all that good with  

play05:49

logic. So I’m afraid that the already trained  large language models will have to be retrained.

play05:55

Ultimately, the problem with large language  models is that the world is not made of  

play06:02

words. At the deepest level we know of,  the world is mathematics. If you want to  

play06:08

build an intelligent AI you need to start with  maths, and with models about physical reality,  

play06:15

and then put words on top of that. What this all means is that companies  

play06:21

which have poured a lot of money into  large language models might never recover  

play06:25

those expenses. The winners will eventually be  those who build an AI on logical reasoning and  

play06:33

models of the real world, like DeepMind. What you see here is a recent example in

play06:38

which they created a virtual mouse in a virtual  environment that’s moving with its own neural  

play06:46

network modelled after a real mouse brain. This,  I think, is how you’ll get to really intelligent  

play06:53

AI. Next up: virtual cats chasing virtual mice  across your computer screen during important Zoom  

play06:59

calls. DeepMind was acquired in 2014 by Google, so I haven’t yet lost faith in my Google stocks.

play07:06

The brief summary is that all those people  working on AI need to think less about words and  

play07:12

more about physics. Just wait until people start  using AI to manage their stocks, it’ll be great.

play07:19

Artificial intelligence is really everywhere  these days. If you want to learn more about  

play07:25

how neural networks and large language models  work, I recommend you check out the courses on  

play07:31

Brilliant.org. All courses on Brilliant  have interactive visualizations and come  

play07:36

with follow-up questions. I found it to be very  effective to learn something new. It really gives  

play07:42

you a feeling for what's going on and helps you  build general problem-solving skills. They cover  

play07:48

a large variety of topics in science, computer  science, and maths. From general scientific  

play07:53

thinking to dedicated courses on differential  equations or large language models. And they're  

play08:00

adding new courses each month. It's a fast and  easy way to learn and you can do it whenever and  

play08:07

wherever you have the time. Sounds good? I hope it does! You can try Brilliant yourself for free

play08:14

if you use my link brilliant.org/sabine. That way you'll get to try out everything Brilliant has

play08:20

to offer for a full 30 days and you'll get 20%  off the annual premium subscription. So go and  

play08:27

give it a try, I'm sure you won't regret it. Thanks for watching, see you tomorrow.


Related Tags
AI Future, Large Models, Logical AI, Neurosymbolic, Stock Market, AI Bubble, Tech Investing, AI Hallucinations, Deep Learning, AI Ethics, Tech Trends