Are Hallucinations Popping the AI Bubble?
Summary
TL;DR: The video discusses the recent dip in Nvidia stock and attributes it to the bursting of a bubble around Large Language Models (LLMs), rather than AI as a whole. It highlights the issue of 'hallucinations' in LLMs, where nonsensical outputs are confidently produced. The speaker argues for the integration of symbolic language and logic into AI to address these issues, citing DeepMind's progress in AI for mathematical proofs as an example of 'neurosymbolic' AI. The script concludes that companies focusing solely on LLMs may struggle, while those building AI with logical reasoning and real-world models, like DeepMind, are more likely to succeed.
Takeaways
- The speaker bought Nvidia stocks hoping to benefit from the AI boom but has seen the stocks drop, attributing the decline to a bubble bursting in the market, specifically around Large Language Models (LLMs).
- The speaker believes that the current AI enthusiasm will resume once people realize there's more to AI than just LLMs, and that their stocks will recover as a result.
- The main issue with LLMs, as highlighted, is the phenomenon of 'hallucinations', where the models confidently produce nonsensical outputs.
- Improvements have been made in LLMs to avoid obvious errors, such as providing real book recommendations, but they still struggle with logical consistency, for example referencing non-existent papers and reports.
- The speaker uses the example of kindergarten riddles to illustrate how LLMs can provide answers that are 'close' to correct but fundamentally flawed due to missing context.
- The crux of the problem is identified as the difference in metrics for 'good' output between humans and models, suggesting that models lack an understanding of what makes their output valuable to humans.
- A proposed solution is the integration of logic and symbolic language into AI, termed 'neurosymbolic', which could address many of the issues with LLMs by providing a more human-like logical structure.
- The speaker cites DeepMind's progress in AI for mathematical proofs as an example of the potential of neurosymbolic AI, emphasizing the importance of logical rigor in AI development.
- The speaker suggests that building intelligent AI requires starting with mathematical and physical reality models, then layering language on top, rather than focusing solely on language models.
- There is a warning that companies heavily invested in LLMs may not recover their expenses, and that those building AI on logical reasoning and real-world models, like DeepMind, are likely to be the winners.
- The speaker ends with a humorous note about the potential for AI to manage stocks and a recommendation of Brilliant.org for those interested in learning more about AI and related fields, with a discount for new users.
Q & A
Why did the speaker decide to buy Nvidia stocks?
-The speaker bought Nvidia stocks to benefit from the AI boom, as they wanted to get something out of the advancements in artificial intelligence.
What is the speaker's opinion on the current state of AI and its impact on the stock market?
-The speaker believes that the current drop in AI-related stocks is not due to a problem with AI itself, but rather with a specific type of AI called Large Language Models, and they are optimistic that AI enthusiasm will resume once people realize there's more to AI.
What is the main issue with Large Language Models according to the script?
-The main issue with Large Language Models is that they sometimes produce 'hallucinations,' which means they confidently generate incorrect or nonsensical information.
How have Large Language Models improved in providing book recommendations?
-Large Language Models have become better at avoiding obvious pitfalls by listing books that actually exist, tying their output more closely to the training set.
What is an example of a problem that Large Language Models still struggle with?
-An example is when solving riddles like the wolf, goat, and cabbage problem, where leaving out critical information leads the model to provide an answer that is logically incorrect, even though it uses similar words to the original problem.
What is the fundamental problem the speaker identifies with Large Language Models?
-The fundamental problem is that Large Language Models use a different metric for 'good' output than humans do, and this discrepancy cannot be fixed by simply training the models with more input.
What solution does the speaker propose to improve the output of AI models?
-The speaker suggests teaching AI logic and using symbolic language, similar to what math software uses, and combining this with neural networks in an approach called 'neurosymbolic' AI.
Why does the speaker believe that AI needs to be built on logical reasoning and models of the real world?
-The speaker believes that because the world at its deepest level is mathematical, building intelligent AI requires starting with math and models of physical reality, and then adding words on top.
What is the speaker's view on the future of companies that have invested heavily in Large Language Models?
-The speaker thinks that companies that have invested heavily in Large Language Models may never recover those expenses, and the winners will be those who build AI on logical reasoning and models of the real world.
What does the speaker suggest for AI researchers to focus on?
-The speaker suggests that AI researchers should think less about words and more about physics, implying that a deeper understanding of the physical world is crucial for developing truly intelligent AI.
What resource does the speaker recommend for learning more about neural networks and large language models?
-The speaker recommends Brilliant.org for its interactive visualizations and follow-up questions on a variety of topics, including neural networks and large language models.
Outlines
AI Stock Woes and the Large Language Model Challenge
The speaker begins by sharing their experience of purchasing Nvidia stocks in hopes of capitalizing on the AI boom, only to witness a decline in stock value. They argue that the current market panic is not a reflection of AI's potential but rather a response to the shortcomings of Large Language Models (LLMs). The speaker highlights the issue of 'hallucinations' in LLMs, where these models produce nonsensical outputs with confidence. Despite improvements in avoiding obvious errors, such as generating real book recommendations, LLMs still struggle with logical consistency, as illustrated by the kindergarten riddle example. The speaker emphasizes that the problem lies in the models' inability to discern 'good' output based on human metrics, suggesting that integrating logic and symbolic language, as seen in neurosymbolic AI, could resolve many of these issues. They conclude by expressing optimism for AI's future, particularly in approaches that combine neural networks with logical reasoning, citing DeepMind's progress in AI-assisted mathematical proofs as an encouraging sign.
Beyond Words: The Necessity of Logic in AI Development
In the second paragraph, the speaker addresses the limitations of training large language models on vast amounts of text and images, arguing that such an approach does not lead to true understanding. They discuss the concept of 'linguistic confusion', pointing out the inherent subjectivity in language use and the difficulty of establishing logical relations from diverse textual sources. The speaker advocates for a foundational approach to AI that starts with mathematical and physical models, suggesting that companies investing solely in LLMs may not recoup their investments. They predict that success in AI will come to those who build on logical reasoning and real-world models, using DeepMind's virtual mouse example to illustrate this point. The speaker ends with a tongue-in-cheek comment about the potential chaos of AI in stock management and promotes Brilliant.org for those interested in learning more about neural networks and AI, offering a discount for new users through a provided link.
Keywords
- Nvidia stocks
- AI boom
- Large Language Models (LLMs)
- Hallucinations
- Symbolic language
- Neurosymbolic AI
- DeepMind
- Linguistic confusion
- Physical reality models
- Brilliant.org
Highlights
Investment in Nvidia stocks was motivated by the AI boom, but the stocks have been dropping due to investor panic.
The current bubble bursting is not due to AI itself, but rather Large Language Models (LLMs).
AI enthusiasm is expected to resume as people realize there's more to AI than just LLMs.
LLMs are known to produce 'hallucinations', confidently generating incorrect information.
An example of LLMs' limitations includes citing non-existent legal cases, as seen with a lawyer using ChatGPT.
LLMs have improved in avoiding obvious pitfalls, such as recommending real books, but still struggle with made-up references.
AI's challenge is that, for these models, correct output is quantifiably 'close' to wrong output, as illustrated with modified riddles.
The issue with LLMs is the discrepancy between human and model metrics for what constitutes 'good' output.
A potential solution is 'neurosymbolic' AI, combining neural networks with symbolic language for improved logic.
DeepMind's progress in AI for mathematical proofs demonstrates the potential of neurosymbolic AI.
Neurosymbolic AI could resolve many issues with LLMs by applying logical rigor to verbal arguments.
The need for AI to understand and use logic, as opposed to just processing more text and images.
The difficulty of integrating symbolic reasoning with existing models due to 'linguistic confusion'.
The necessity for retraining large language models to incorporate a deeper understanding of logic and physical reality.
The world's fundamental nature is mathematical, suggesting AI should be built on models of physical reality.
Companies that invested heavily in LLMs may not recover their expenses, unlike those focusing on logical reasoning.
DeepMind's virtual mouse example illustrates a promising approach to building intelligent AI.
A call for AI researchers to focus more on physics and less on words for truly intelligent AI development.
Recommendation of Brilliant.org for learning about neural networks, large language models, and other scientific topics.
Transcripts
A few weeks ago, I finally came around to buy a few Nvidia stocks because, hey, I also want to get something out of the AI boom. These stocks have been dropping ever since. Why? Oh right, I don't believe in god.

Yes, so, it doesn't look good for AI at the moment as investors are panicking and stocks are dropping. But in this video I want to make a case that this bubble which is currently bursting is not that of AI per se, it's that of the specific type of AI called Large Language Models. I am sure AI enthusiasm will resume once people get it into their head that there's more to AI. And my stocks will recover.
The best known problem with Large Language Models is what has become known as 'hallucinations'. They sometimes confidently ramble along and produce nonsense. You'd think we all learned this lesson in 2022, but then there was the lawyer who used ChatGPT to cook up a defence and ended up citing cases that simply didn't exist. Oops.
Large Language Models have become better at avoiding some obvious pitfalls. For example, if you ask ChatGPT for book recommendations, it will now list books that actually exist. It still often refers to made-up papers and reports though. And Midjourney now for the most part puts 5 fingers on each hand, so much so that if you explicitly ask for a hand with 6 fingers, it will still have 5 fingers.
You can do this by tying some output closely to the training set. That might make the problem appear solvable. But it's not that simple, because hallucinations are just one symptom of a much bigger problem, which is that for a Large Language Model, correct output is -- in a quantifiable sense -- 'close' to wrong output.
A very illustrative example comes from Colin Fraser, who has been using modified versions of kindergarten riddles, like the wolf, goat, and cabbage problem. In this riddle, the farmer has to get all three in a boat across the river. But the boat will only carry one item in addition to the farmer. Left unattended, the wolf will eat the goat, and the goat will eat the cabbage. The solution is that the farmer has to take one of the items back on a trip.

If you ask a large language model this question but leave out the information that the wolf will eat the goat and the goat the cabbage, then it will still give the same answer, which now makes no sense.
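The difference between the two riddles can be made concrete with a small brute-force solver (a hypothetical sketch, not something from the video): parameterize the river-crossing puzzle by its "eats" constraints and search for the shortest plan.

```python
from collections import deque

def solve(items, forbidden):
    """Breadth-first search over river-crossing states.

    A state is (farmer_side, frozenset of items on the left bank).
    `forbidden` lists pairs that may not be left together without the farmer.
    Returns the shortest list of crossings (item carried, or "nothing").
    """
    start = (0, frozenset(items))   # everyone starts on the left bank
    goal = (1, frozenset())         # everyone ends on the right bank
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (farmer, left), path = queue.popleft()
        if (farmer, left) == goal:
            return path
        here = left if farmer == 0 else frozenset(items) - left
        # The farmer crosses alone, or with one item from his own bank.
        for cargo in [None] + sorted(here):
            new_left = left
            if cargo is not None:
                new_left = left - {cargo} if farmer == 0 else left | {cargo}
            # The bank the farmer just left must contain no forbidden pair.
            unattended = new_left if farmer == 0 else frozenset(items) - new_left
            if any(a in unattended and b in unattended for a, b in forbidden):
                continue
            state = (1 - farmer, new_left)
            if state not in seen:
                seen.add(state)
                queue.append((state, path + [cargo or "nothing"]))
    return None

items = ["wolf", "goat", "cabbage"]
# Original riddle: the classic back-and-forth solution needs 7 crossings.
print(len(solve(items, [("wolf", "goat"), ("goat", "cabbage")])))  # 7
# Modified riddle with the eating constraints removed: 5 crossings suffice.
print(len(solve(items, [])))  # 5
```

With the constraints removed, the optimal plan is just three straight trips plus two returns, so a model that parrots the seven-step answer to the modified riddle is giving a response that is word-similar but logically wrong.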
I like this example because it's obvious what's going wrong. By way of word content, the altered riddle is similar to the riddle that the models have been trained on. So they extrapolate from what they know and spit out an answer that is close to the answer for the original riddle.

But as with hallucinations, these answers are 'close' in a sense that we don't care about. Yes, they use similar words. But the content is wrong. It's like how, in some sense, a hand with six fingers is 'close' to one with five fingers. But it's still wrong.
The issue is that we have a different metric for good output than the one that the models use. When I say 'we' I mean humans, just in case there are some misunderstandings. And these different metrics for what is 'good' are a problem you can't fix by just training a model with more and more input. It's fundamentally missing information about what makes its output good for us.
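The point that wrong answers can be "close" in a quantifiable but irrelevant sense can be illustrated with a crude word-overlap metric. Jaccard similarity over word sets is a toy stand-in here, not the metric any real model optimizes:

```python
def jaccard(a, b):
    """Word-overlap similarity between two sentences: 0 = disjoint, 1 = identical."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

right = "the farmer takes the goat across first"
wrong = "the farmer takes the wolf across first"
unrelated = "bake the cake at 180 degrees for forty minutes"

# The wrong answer differs by a single word, so it scores as very "close"
# to the right one -- even though its content is flatly incorrect.
print(jaccard(right, wrong))      # high (about 0.71)
print(jaccard(right, unrelated))  # low (about 0.07)
```

A surface metric like this rates the wrong answer as nearly identical to the right one, which is exactly the sense of "close" that humans do not care about.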
The solution is to teach AI logic and to use symbolic language, similar to what most maths software uses. If you combine that with a neural network, it's called 'neurosymbolic'. This can fix a lot of problems with large language models, and some of those approaches already exist.
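As a cartoon of the neurosymbolic idea -- everything below is a hypothetical sketch, and `fake_llm` merely simulates a model's fluent-but-fallible guesses -- a symbolic verifier with exact arithmetic can filter a generator's proposals so that only checkable answers get through:

```python
from fractions import Fraction

def fake_llm(question):
    """Stand-in for a neural model: plausible proposals, not always correct."""
    return {"1/3 + 1/6": ["1/4", "1/2"],   # first guess is wrong
            "2/5 * 5/2": ["1"]}[question]

def check(question, answer):
    """Symbolic verification with exact rational arithmetic (no rounding)."""
    a, op, b = question.split()
    a, b, ans = Fraction(a), Fraction(b), Fraction(answer)
    return (a + b if op == "+" else a * b) == ans

def answer(question):
    # Neurosymbolic loop: the "neural" part proposes, the symbolic part disposes.
    for proposal in fake_llm(question):
        if check(question, proposal):
            return proposal
    return None

print(answer("1/3 + 1/6"))  # the wrong "1/4" is rejected; "1/2" passes
```

The division of labor is the design point: the generator can hallucinate freely, because nothing reaches the user unless the checker can verify it, which is roughly the structure behind AI systems that emit machine-checkable proofs.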
For example, I mentioned already in January that DeepMind made remarkable progress with using AI for mathematical proofs. Just last month they reported that their maths AI now reached the level of a silver medallist in the maths Olympics. Not only does it solve the problems, it also provides a proof that humans can understand. Well, kind of.
The relevant point isn't that AI can solve maths Olympics problems because, let's be honest, who really cares. The relevant point is that this AI can parse the problems, and can formulate logically correct answers that a human can understand. Apply this logical rigor to verbal arguments, and boom, a lot of problems with large language models will disappear.
Imagine an AI that could win every internet argument. Reddit would become a ghost town overnight.

What I just told you is neither new nor particularly original. It's been pointed out for decades by many computer scientists, including Gary Marcus and others. I just want to say: I think they're right. You can't just train large language models on more and more text and images and hope that they will begin to 'understand' what's going on. And I think no one really expected that.
That said, it's more difficult than lumping symbolic reasoning on top of the already existing models, basically because of what Wittgenstein called 'linguistic confusion'. It's that no two people use a word to mean exactly the same thing. And once you have lumped together text from billions of different people, logical relations between these words become washed out, if there ever were any to begin with. I mean, it's not like people are all that good with logic. So I'm afraid that the already trained large language models will have to be retrained.
Ultimately, the problem with large language models is that the world is not made of words. At the deepest level we know of, the world is mathematics. If you want to build an intelligent AI you need to start with maths, and with models about physical reality, and then put words on top of that.

What this all means is that companies which have poured a lot of money into large language models might never recover those expenses. The winners will eventually be those who build an AI on logical reasoning and models of the real world, like DeepMind. What you see here is a recent example in which they created a virtual mouse in a virtual environment that's moving with its own neural network modelled after a real mouse brain. This, I think, is how you'll get to really intelligent AI. Next up: virtual cats chasing virtual mice across your computer screen during important Zoom calls. DeepMind was acquired in 2014 by Google, so I haven't yet lost faith in my Google stocks.
The brief summary is that all those people working on AI need to think less about words and more about physics. Just wait until people start using AI to manage their stocks, it'll be great.
Artificial intelligence is really everywhere these days. If you want to learn more about how neural networks and large language models work, I recommend you check out the courses on Brilliant.org. All courses on Brilliant have interactive visualizations and come with follow-up questions. I found it to be very effective for learning something new. It really gives you a feeling for what's going on and helps you build general problem-solving skills. They cover a large variety of topics in science, computer science, and maths, from general scientific thinking to dedicated courses on differential equations or large language models. And they're adding new courses each month. It's a fast and easy way to learn, and you can do it whenever and wherever you have the time. Sounds good? I hope it does! You can try Brilliant yourself for free if you use my link brilliant.org/sabine. That way you'll get to try out everything Brilliant has to offer for a full 30 days, and you'll get 20% off the annual premium subscription. So go and give it a try, I'm sure you won't regret it. Thanks for watching, see you tomorrow.