Are Hallucinations Popping the AI Bubble?

Sabine Hossenfelder
15 Aug 2024 · 08:37

Summary

TL;DR: The video discusses the recent dip in Nvidia stock and attributes it to the bursting of a bubble around Large Language Models (LLMs), rather than AI as a whole. It highlights the issue of 'hallucinations' in LLMs, where nonsensical outputs are confidently produced. The speaker argues for integrating symbolic language and logic into AI to address these issues, citing DeepMind's progress in AI for mathematical proofs as an example of 'neurosymbolic' AI. The video concludes that companies focusing solely on LLMs may struggle, while those building AI with logical reasoning and models of the real world, like DeepMind, are more likely to succeed.

Takeaways

  • The speaker bought Nvidia stocks hoping to benefit from the AI boom but has seen the stocks drop, attributing the decline to a bubble bursting in the market, specifically around Large Language Models (LLMs).
  • The speaker believes that AI enthusiasm will resume once people realize there's more to AI than just LLMs, and that their stocks will recover as a result.
  • The main issue with LLMs is the phenomenon of 'hallucinations', where the models confidently produce nonsensical outputs.
  • LLMs have improved at avoiding obvious errors, such as providing real book recommendations, but they still struggle with logical consistency, like referencing non-existent papers and reports.
  • The speaker uses modified kindergarten riddles to illustrate how LLMs can provide answers that are 'close' to correct but fundamentally flawed because critical context is missing.
  • The crux of the problem is that humans and models use different metrics for 'good' output, suggesting that models lack information about what makes their output valuable to humans.
  • A proposed solution is to integrate logic and symbolic language into AI, an approach termed 'neurosymbolic', which could address many of the issues with LLMs by providing a more human-like logical structure.
  • The speaker cites DeepMind's progress in AI for mathematical proofs as an example of the potential of neurosymbolic AI, emphasizing the importance of logical rigor in AI development.
  • The speaker suggests that building intelligent AI requires starting with mathematics and models of physical reality, then layering language on top, rather than focusing solely on language models.
  • There is a warning that companies heavily invested in LLMs may never recover their expenses, and that those building AI on logical reasoning and real-world models, like DeepMind, are likely to be the winners.
  • The speaker ends with a humorous note about AI managing stocks and recommends Brilliant.org for those interested in learning more about AI and related fields, offering a discount for new users.

Q & A

  • Why did the speaker decide to buy Nvidia stocks?

    -The speaker bought Nvidia stocks to benefit from the AI boom, as they wanted to get something out of the advancements in artificial intelligence.

  • What is the speaker's opinion on the current state of AI and its impact on the stock market?

    -The speaker believes that the current drop in AI-related stocks is not due to a problem with AI itself, but rather with a specific type of AI called Large Language Models, and they are optimistic that AI enthusiasm will resume once people realize there's more to AI.

  • What is the main issue with Large Language Models according to the script?

    -The main issue with Large Language Models is that they sometimes produce 'hallucinations,' which means they confidently generate incorrect or nonsensical information.

  • How have Large Language Models improved in providing book recommendations?

    -Large Language Models have become better at avoiding obvious pitfalls by listing books that actually exist, tying their output more closely to the training set.

  • What is an example of a problem that Large Language Models still struggle with?

    -An example is when solving riddles like the wolf, goat, and cabbage problem, where leaving out critical information leads the model to provide an answer that is logically incorrect, even though it uses similar words to the original problem.
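    The contrast drawn here can be made concrete with a few lines of symbolic search: a solver that tracks the actual constraints gives the textbook seven-crossing answer when the eating rules are present, and a genuinely different, shorter answer when they are dropped, which is exactly the adjustment the pattern-matching model fails to make. A minimal sketch (not from the video; the state encoding is my own choice):

    ```python
    from collections import deque

    ITEMS = ("wolf", "goat", "cabbage")

    def solve(forbidden):
        """Breadth-first search over river-crossing states.

        State: (frozenset of items on the starting bank, farmer's side: 0 = start, 1 = far).
        `forbidden` holds pairs that may not be left together without the farmer.
        Returns the shortest list of crossings (item carried, or None for crossing alone).
        """
        def unsafe(bank):
            return any(a in bank and b in bank for a, b in forbidden)

        start, goal = (frozenset(ITEMS), 0), (frozenset(), 1)
        queue, seen = deque([(start, [])]), {start}
        while queue:
            (left, farmer), path = queue.popleft()
            if (left, farmer) == goal:
                return path
            here = left if farmer == 0 else frozenset(ITEMS) - left
            for cargo in (None, *here):  # cross alone, or take one item along
                new_left = set(left)
                if cargo is not None:
                    if farmer == 0:
                        new_left.remove(cargo)
                    else:
                        new_left.add(cargo)
                new_left = frozenset(new_left)
                # The bank the farmer just left must stay safe.
                left_behind = new_left if farmer == 0 else frozenset(ITEMS) - new_left
                if unsafe(left_behind):
                    continue
                state = (new_left, 1 - farmer)
                if state not in seen:
                    seen.add(state)
                    queue.append((state, path + [cargo]))
        return None

    # Classic constraints: the shortest solution needs 7 crossings,
    # including taking the goat back on one trip.
    classic = solve({("wolf", "goat"), ("goat", "cabbage")})
    # Constraints removed (the "modified riddle"): the items can simply be
    # ferried over in 5 crossings, with no item ever taken back.
    trivial = solve(set())
    ```

    The point of the sketch is that the solver's answer changes when the constraints change, whereas a model matching on word content keeps producing the memorised seven-step answer.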

  • What is the fundamental problem the speaker identifies with Large Language Models?

    -The fundamental problem is that Large Language Models use a different metric for 'good' output than humans do, and this discrepancy cannot be fixed by simply training the models with more input.
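    The 'different metrics' point can be made concrete: by a surface measure such as word-set (Jaccard) overlap, the altered riddle is nearly identical to the original, even though dropping the eating constraints changes the correct answer entirely. A toy illustration (the riddle wordings below are my paraphrases, not quotes from any dataset):

    ```python
    def jaccard(a: str, b: str) -> float:
        """Word-set overlap between two texts: 1.0 = same vocabulary, 0.0 = disjoint."""
        wa, wb = set(a.lower().split()), set(b.lower().split())
        return len(wa & wb) / len(wa | wb)

    original = ("a farmer must ferry a wolf a goat and a cabbage across a river "
                "the boat carries the farmer and one item "
                "left alone the wolf eats the goat and the goat eats the cabbage")
    # The same riddle with the crucial constraints silently dropped:
    altered = ("a farmer must ferry a wolf a goat and a cabbage across a river "
               "the boat carries the farmer and one item")

    similarity = jaccard(original, altered)  # ~0.83: lexically "close"
    # Yet the removed words are exactly the ones that make the classic
    # seven-step answer correct; without them that answer is needlessly wrong.
    ```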

  • What solution does the speaker propose to improve the output of AI models?

    -The speaker suggests teaching AI logic and using symbolic language, similar to what math software uses, and combining this with neural networks in an approach called 'neurosymbolic' AI.
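    Here is a minimal sketch of what the neurosymbolic combination could buy, under the assumption (mine, not the video's) that the neural side proposes candidate answers and a symbolic layer checks them against the stated constraints. The checker simulates a proposed crossing plan for the wolf, goat, and cabbage riddle: the memorised textbook answer passes, while a naive straight-ferry plan fails under the classic constraints yet becomes valid once the eating rules are dropped. A symbolic layer tracks that distinction; a purely statistical one does not.

    ```python
    ITEMS = {"wolf", "goat", "cabbage"}

    def valid_plan(plan, forbidden):
        """Symbolically simulate a crossing plan.

        `plan` is a list of crossings: the item carried, or None for crossing alone.
        Returns True iff no forbidden pair is ever left without the farmer and
        everything ends up on the far bank.
        """
        left, farmer_left = set(ITEMS), True
        for cargo in plan:
            if cargo is not None:
                source = left if farmer_left else ITEMS - left
                if cargo not in source:
                    return False  # item is not on the farmer's bank
                if farmer_left:
                    left.remove(cargo)
                else:
                    left.add(cargo)
            farmer_left = not farmer_left
            unattended = left if not farmer_left else ITEMS - left
            if any(a in unattended and b in unattended for a, b in forbidden):
                return False
        return not left and not farmer_left

    classic_rules = {("wolf", "goat"), ("goat", "cabbage")}

    # The memorised textbook answer: take the goat back on one trip.
    memorised = ["goat", None, "wolf", "goat", "cabbage", None, "goat"]
    # A naive plan that just ferries the items over one by one.
    naive = ["wolf", None, "goat", None, "cabbage"]
    ```

    Checking `naive` against `classic_rules` rejects it (the goat would be left with the cabbage), while checking it against an empty rule set accepts it; `memorised` passes in both cases but is needlessly long without the rules.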

  • Why does the speaker believe that AI needs to be built on logical reasoning and models of the real world?

    -The speaker believes that because the world at its deepest level is mathematical, building intelligent AI requires starting with math and models of physical reality, and then adding words on top.

  • What is the speaker's view on the future of companies that have invested heavily in Large Language Models?

    -The speaker thinks that companies that have invested heavily in Large Language Models may never recover those expenses, and the winners will be those who build AI on logical reasoning and models of the real world.

  • What does the speaker suggest for AI researchers to focus on?

    -The speaker suggests that AI researchers should think less about words and more about physics, implying that a deeper understanding of the physical world is crucial for developing truly intelligent AI.

  • What resource does the speaker recommend for learning more about neural networks and large language models?

    -The speaker recommends Brilliant.org for its interactive visualizations and follow-up questions on a variety of topics, including neural networks and large language models.

Outlines

00:00

AI Stock Woes and the Large Language Model Challenge

The speaker begins by sharing their experience of purchasing Nvidia stocks in hopes of capitalizing on the AI boom, only to witness a decline in stock value. They argue that the current market panic is not a reflection of AI's potential but rather a response to the shortcomings of Large Language Models (LLMs). The speaker highlights the issue of 'hallucinations' in LLMs, where these models produce nonsensical outputs with confidence. Despite improvements in avoiding obvious errors, such as generating real book recommendations, LLMs still struggle with logical consistency, as illustrated by the kindergarten riddle example. The speaker emphasizes that the problem lies in the models' inability to discern 'good' output based on human metrics, suggesting that integrating logic and symbolic language, as seen in neurosymbolic AI, could resolve many of these issues. They conclude by expressing optimism for AI's future, particularly in approaches that combine neural networks with logical reasoning, citing DeepMind's progress in AI-assisted mathematical proofs as an encouraging sign.

05:01

Beyond Words: The Necessity of Logic in AI Development

In the second part, the speaker addresses the limitations of training large language models on vast amounts of text and images, arguing that such an approach does not lead to true understanding. They discuss the concept of 'linguistic confusion', pointing out the inherent subjectivity in language use and the difficulty of establishing logical relations from diverse textual sources. The speaker advocates for a foundational approach to AI that starts with mathematical and physical models, suggesting that companies investing solely in LLMs may not recoup their investments. They predict that success in AI will come to those who build on logical reasoning and real-world models, using DeepMind's virtual mouse example to illustrate this point. The speaker ends with a tongue-in-cheek comment about the potential chaos of AI in stock management and promotes Brilliant.org for those interested in learning more about neural networks and AI, offering a discount for new users through a provided link.

Keywords

Nvidia stocks

Nvidia is a leading technology company known for its graphics processing units (GPUs) and is a significant player in the AI industry. The video discusses the purchase of Nvidia stocks as an investment in the AI boom, which has seen a drop in value, reflecting the fluctuating investor confidence in AI technologies.

AI boom

The term 'AI boom' refers to the rapid growth and increased interest in artificial intelligence technologies. It is the backdrop against which the video's narrative is set, as the speaker discusses the investment in AI and its potential despite current market fluctuations.

Large Language Models (LLMs)

Large Language Models are a type of AI that processes and generates human-like text based on vast amounts of data. The video identifies LLMs as a specific area of AI that is currently facing challenges, such as producing 'hallucinations' or nonsensical outputs, which is causing a bubble to burst in the AI market.

Hallucinations

In the context of AI, 'hallucinations' refer to the phenomenon where an AI model confidently generates incorrect or nonsensical information. The video uses this term to describe a flaw in LLMs, where they may produce outputs that seem plausible but are factually incorrect, such as citing non-existent legal cases.

Symbolic language

Symbolic language in AI refers to the use of structured, logical representations of knowledge, akin to mathematical notation. The video suggests that teaching AI to use symbolic language, as part of a neurosymbolic approach, could address the issues with LLMs and lead to more accurate and logical AI outputs.

Neurosymbolic AI

Neurosymbolic AI is a hybrid approach that combines neural networks with symbolic reasoning. The video posits that this approach could solve many of the problems associated with LLMs by integrating logic and structured knowledge representation into AI systems.

DeepMind

DeepMind is an AI company known for its advances in creating AI systems that can solve complex problems, such as mathematical proofs. The video highlights DeepMind's progress as an example of the potential for AI that goes beyond LLMs and incorporates logical reasoning and models of the physical world.

Linguistic confusion

Linguistic confusion refers to the idea that different people use words with slightly different meanings, leading to ambiguity. The video discusses this concept as a challenge for AI, especially when training on text from billions of individuals, which can obscure logical relationships between words.

Physical reality models

Physical reality models are representations of how the physical world operates, often based on mathematical principles. The video argues that building AI on a foundation of physical reality models and mathematics, rather than just words, is key to creating truly intelligent AI systems.

Brilliant.org

Brilliant.org is an online platform offering interactive courses in various fields, including science, computer science, and mathematics. The video recommends it as a resource for learning more about neural networks, large language models, and other AI-related topics, with a special offer for viewers to try the platform.

Highlights

Investment in Nvidia stocks was motivated by the AI boom, but the stocks have been dropping due to investor panic.

The bubble currently bursting is not that of AI itself, but that of Large Language Models (LLMs).

AI enthusiasm is expected to resume as people realize there's more to AI than just LLMs.

LLMs are known to produce 'hallucinations', confidently generating incorrect information.

An example of LLMs' limitations includes citing non-existent legal cases, as seen with a lawyer using ChatGPT.

LLMs have improved in avoiding obvious pitfalls, such as recommending real books, but still struggle with made-up references.

For LLMs, correct output is quantifiably 'close' to wrong output, as illustrated with modified riddles.

The issue with LLMs is the discrepancy between human and model metrics for what constitutes 'good' output.

A potential solution is 'neurosymbolic' AI, combining neural networks with symbolic language for improved logic.

DeepMind's progress in AI for mathematical proofs demonstrates the potential of neurosymbolic AI.

Neurosymbolic AI could resolve many issues with LLMs by applying logical rigor to verbal arguments.

The need for AI to understand and use logic, as opposed to just processing more text and images.

The difficulty of integrating symbolic reasoning with existing models due to 'linguistic confusion'.

The necessity for retraining large language models to incorporate a deeper understanding of logic and physical reality.

The world's fundamental nature is mathematical, suggesting AI should be built on models of physical reality.

Companies that invested heavily in LLMs may not recover their expenses, unlike those focusing on logical reasoning.

DeepMind's virtual mouse example illustrates a promising approach to building intelligent AI.

A call for AI researchers to focus more on physics and less on words for truly intelligent AI development.

Recommendation of Brilliant.org for learning about neural networks, large language models, and other scientific topics.

Transcripts

[00:00] A few weeks ago, I finally came around to buying a few Nvidia stocks because, hey, I also want to get something out of the AI boom. These stocks have been dropping ever since. Why? Oh right, I don't believe in god.

[00:14] Yes, so, it doesn't look good for AI at the moment as investors are panicking and stocks are dropping. But in this video I want to make a case that this bubble which is currently bursting is not that of AI per se, it's that of the specific type of AI called Large Language Models. I am sure AI enthusiasm will resume once people get it into their head that there's more to AI. And my stocks will recover.

[00:40] The best known problem with Large Language Models is what has become known as "hallucinations". They sometimes confidently ramble along and produce nonsense. You'd think we all learned this lesson in 2022, but then there was the lawyer who used ChatGPT to cook up a defence and ended up citing cases that simply didn't exist. Oops.

[01:03] Large Language Models have become better at avoiding some obvious pitfalls. For example, if you ask ChatGPT for book recommendations it will now list books that actually exist. It still often refers to made-up papers and reports though. And Midjourney now for the most part puts 5 fingers on each hand, so much so that if you explicitly ask for a hand with 6 fingers, it will still have 5 fingers.

[01:31] You can do this by tying some output closely to the training set. That might make the problem appear solvable. But it's not that simple, because hallucinations are just one symptom of a much bigger problem, which is that for a Large Language Model, correct output is -- in a quantifiable sense -- "close" to wrong output.

[01:53] A very illustrative example comes from Colin Fraser, who has been using modified versions of kindergarten riddles, like the wolf, goat, and cabbage problem. In this riddle, the farmer has to get all three in a boat across the river. But the boat will only carry one item in addition to the farmer. Left unattended, the wolf will eat the goat, and the goat will eat the cabbage. The solution is that the farmer has to take one of the items back on a trip.

[02:23] If you ask a large language model this question but leave out the information that the wolf will eat the goat and the goat the cabbage, then it will still give the same answer, which now makes no sense.

[02:37] I like this example because it's obvious what's going wrong. By way of word content, the altered riddle is similar to the riddle that the models have been trained on. So they extrapolate from what they know and spit out an answer that is close to the answer for the original riddle. But as with hallucinations, these answers are "close" in a sense that we don't care about. Yes, they use similar words. But the content is wrong. It's like, in some sense, a hand with six fingers is "close" to one with five fingers. But it's still wrong.

[03:14] The issue is that we have a different metric for good output than the one that the models use. When I say "we" I mean humans, just in case there are some misunderstandings. And these different metrics for what is "good" are a problem you can't fix by just training a model with more and more input. It's fundamentally missing information about what makes its output good for us.

[03:39] The solution is to teach AI logic and to use symbolic language, similar to what most maths software uses. If you combine that with a neural network, it's called "neurosymbolic". This can fix a lot of problems with large language models, and some of those approaches already exist.

[03:58] For example, I mentioned already in January that DeepMind made remarkable progress with using AI for mathematical proofs. Just last month they reported that their maths AI now reached the level of a silver medallist in the maths Olympics. Not only does it solve the problems, it also provides a proof that humans can understand. Well, kind of.

[04:23] The relevant point isn't that AI can solve maths Olympics problems because, let's be honest, who really cares. The relevant point is that this AI can parse the problems, and can formulate logically correct answers that a human can understand. Apply this logical rigor to verbal arguments, and boom, a lot of problems with large language models will disappear.

[04:47] Imagine an AI that could win every internet argument. Reddit would become a ghost town overnight. What I just told you is neither new nor particularly original. It's been pointed out for decades by many computer scientists, including Gary Marcus and others. I just want to say: I think they're right. You can't just train large language models on more and more text and images and hope that they will begin to "understand" what's going on. And I think no one really expected that.

[05:19] That said, it's more difficult than lumping symbolic reasoning on top of the already existing models, basically because of what Wittgenstein called "linguistic confusion". It's that no two people use a word to mean exactly the same thing. And once you have lumped together text from billions of different people, logical relations between these words become washed out, if there ever were any to begin with. I mean, it's not like people are all that good with logic. So I'm afraid that the already trained large language models will have to be retrained.

[05:55] Ultimately, the problem with large language models is that the world is not made of words. At the deepest level we know of, the world is mathematics. If you want to build an intelligent AI you need to start with maths, and with models about physical reality, and then put words on top of that.

[06:15] What this all means is that companies which have poured a lot of money into large language models might never recover those expenses. The winners will eventually be those who build an AI on logical reasoning and models of the real world, like DeepMind. What you see here is a recent example in which they created a virtual mouse in a virtual environment that's moving with its own neural network modelled after a real mouse brain. This, I think, is how you'll get to really intelligent AI. Next up: virtual cats chasing virtual mice across your computer screen during important Zoom calls. DeepMind was acquired in 2014 by Google, so I haven't yet lost faith in my Google stocks.

[07:06] The brief summary is that all those people working on AI need to think less about words and more about physics. Just wait until people start using AI to manage their stocks, it'll be great.

[07:19] Artificial intelligence is really everywhere these days. If you want to learn more about how neural networks and large language models work, I recommend you check out the courses on Brilliant.org. All courses on Brilliant have interactive visualizations and come with follow-up questions. I found it to be very effective for learning something new. It really gives you a feeling for what's going on and helps you build general problem-solving skills. They cover a large variety of topics in science, computer science, and maths, from general scientific thinking to dedicated courses on differential equations or large language models. And they're adding new courses each month. It's a fast and easy way to learn, and you can do it whenever and wherever you have the time. Sounds good? I hope it does! You can try Brilliant yourself for free if you use my link brilliant.org/sabine. That way you'll get to try out everything Brilliant has to offer for a full 30 days, and you'll get 20% off the annual premium subscription. So go and give it a try, I'm sure you won't regret it. Thanks for watching, see you tomorrow.


Related Tags
AI Future, Large Models, Logical AI, Neurosymbolic, Stock Market, AI Bubble, Tech Investing, AI Hallucinations, Deep Learning, AI Ethics, Tech Trends