“What's wrong with LLMs and what we should be building instead” - Tom Dietterich - #VSCF2023
Summary
TL;DR: In this discussion, Tom Dietterich reflects on the rapid evolution and challenges of large language models (LLMs). He highlights the importance of open-source initiatives in improving efficiency and adaptability. While LLMs excel at tasks like syntax manipulation, he cautions against using them blindly in high-risk applications, emphasizing the need for verification mechanisms. He explores how combining LLMs with traditional systems, such as code execution or proof assistants, can enhance their reliability. Ultimately, he envisions a future where LLMs play a key role in both creative fields and critical applications, provided they are appropriately integrated and verified.
Takeaways
- 😀 Large language models (LLMs) have impressive abilities, such as reading and ingesting large amounts of web data, but they are fundamentally flawed because they do not separate knowledge components such as factual knowledge, language understanding, and common sense.
- 🧠 Current LLMs are not equipped with episodic memory or situation modeling, which means they cannot remember past events or build a coherent understanding of ongoing conversations or narratives.
- 🔄 A modular approach to AI systems is proposed, in which components for language, factual knowledge, memory, and reasoning are built separately and then integrated, overcoming the limitations of monolithic LLMs.
- 💡 Knowledge graphs are an important tool for representing factual knowledge; new facts should be added to these graphs as they appear in conversations or documents (see the sketch after this list).
- 🤖 The current architecture of LLMs treats knowledge as a statistical model rather than a knowledge base, making it difficult to ensure accuracy and correctness in outputs.
- 📈 A key challenge for AI systems is improving truthfulness: models do not inherently understand what constitutes correct information, nor can they provide sound justifications for their answers.
- 🔍 The proposal suggests that LLMs could output both answers and arguments, providing justifications for their conclusions to ensure that reasoning is transparent and verifiable.
- 🧳 Current LLMs struggle with epistemic uncertainty (lack of knowledge) and instead treat all uncertainty as randomness (aleatoric uncertainty), leading to overconfidence in their answers.
- 📚 Hybrid systems, combining LLMs with traditional methods like planning systems or proof assistants, can improve reliability by checking the output of LLMs against other trusted tools.
- 💬 For high-risk applications (e.g., autonomous vehicles or high-security software), it's essential to verify LLM outputs before accepting them, while more creative tasks can tolerate higher levels of uncertainty and error.
- 🌍 There is a strong push for open-source collaboration to address the challenges faced by LLMs. By releasing models to the public, small companies and academic researchers can experiment and drive progress in AI development.
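To make the knowledge-graph takeaway concrete, here is a minimal sketch of a triple store to which new facts are appended as they appear. The class and example facts are illustrative, not from the talk; a real system would populate it with an extraction model.

```python
# Minimal in-memory knowledge graph: facts are (subject, relation, object) triples.
from collections import defaultdict

class KnowledgeGraph:
    def __init__(self):
        # subject -> relation -> set of objects
        self.facts = defaultdict(lambda: defaultdict(set))

    def add_fact(self, subject: str, relation: str, obj: str) -> None:
        """Insert a new triple as it is extracted from a conversation or document."""
        self.facts[subject][relation].add(obj)

    def query(self, subject: str, relation: str) -> set:
        """Return everything known for (subject, relation); empty set if nothing."""
        return self.facts[subject][relation]

kg = KnowledgeGraph()
kg.add_fact("Tom Dietterich", "spoke_at", "VSCF2023")
kg.add_fact("VSCF2023", "topic", "large language models")
print(kg.query("Tom Dietterich", "spoke_at"))  # {'VSCF2023'}
```

Keeping stored facts outside the language model in this way is what allows them to be updated or corrected without retraining.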
Q & A
What role has open-source development played in improving large language models (LLMs)?
-Open-source development has been crucial in driving advancements in LLMs. It has spurred innovation from academics, hobbyists, and small companies, leading to improvements in speed, efficiency, and ease of updates, which have accelerated progress in LLM technology.
Why is there a need for a strong open-source push for large language models?
-A strong open-source push is needed to tackle the various challenges faced by LLMs. Open-source efforts allow for broader collaboration, rapid iteration, and solutions to efficiency, accuracy, and updating problems, which are central to improving the capabilities of LLMs.
How can prompt engineering help overcome the limitations of LLMs?
-Prompt engineering can guide LLMs to generate more accurate and relevant outputs by designing specific instructions. It can also be used in combination with verification mechanisms to check whether the answers provided by LLMs are correct, ensuring more reliable results in various applications.
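As a hedged illustration of pairing a prompt with a mechanical check, the sketch below constrains the output format and rejects any response that fails to parse. `call_llm` is a hypothetical placeholder for a real model client.

```python
import json

PROMPT = (
    "Extract the capital city mentioned in the text below. "
    'Respond with JSON of the form {{"capital": "<city>"}} and nothing else.\n\n'
    "Text: {text}"
)

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model client; returns a canned reply here."""
    return '{"capital": "Paris"}'

def extract_capital(text: str, retries: int = 3) -> str:
    for _ in range(retries):
        raw = call_llm(PROMPT.format(text=text))
        try:
            parsed = json.loads(raw)        # check 1: output is valid JSON
            capital = parsed["capital"]     # check 2: required key is present
            if isinstance(capital, str) and capital:
                return capital
        except (json.JSONDecodeError, KeyError, TypeError):
            pass                            # reject and re-prompt on any failure
    raise ValueError("no verifiable answer after retries")

print(extract_capital("Paris is the capital of France."))  # Paris
```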
What is the importance of combining LLMs with traditional systems like planners and proof assistants?
-Combining LLMs with traditional systems helps ensure the correctness of the LLM's output. For example, planners can check whether a generated plan is feasible, and proof assistants can help verify the correctness of software code, enhancing the reliability of AI applications in high-stakes areas.
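A minimal sketch of the planner idea, assuming a toy domain of actions with preconditions and effects (the domain is invented for illustration): a plan proposed by an LLM is simulated step by step and rejected if any precondition fails.

```python
# Toy STRIPS-style domain: each action has preconditions, add effects, delete effects.
ACTIONS = {
    "pick_up":  {"pre": {"hand_empty"}, "add": {"holding"},    "delete": {"hand_empty"}},
    "put_down": {"pre": {"holding"},    "add": {"hand_empty"}, "delete": {"holding"}},
}

def plan_is_feasible(plan, initial_state):
    """Simulate the plan; True iff every action's preconditions hold when reached."""
    state = set(initial_state)
    for name in plan:
        action = ACTIONS.get(name)
        if action is None or not action["pre"] <= state:
            return False                  # unknown action or unmet precondition
        state = (state - action["delete"]) | action["add"]
    return True

print(plan_is_feasible(["pick_up", "put_down"], {"hand_empty"}))  # True
print(plan_is_feasible(["put_down"], {"hand_empty"}))             # False
```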
Can LLMs be used for code generation, and if so, how can their accuracy be verified?
-Yes, LLMs can generate code, but their accuracy can be verified by executing the generated code to check if it produces the correct results. Additionally, running program analysis over the code can also help identify errors or inaccuracies.
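A minimal sketch of execute-to-verify: a generated function is accepted only if it passes reference tests. The candidate string stands in for model output; real systems should run untrusted code in a sandboxed subprocess rather than with exec.

```python
CANDIDATE = """
def add(a, b):
    return a + b
"""  # stand-in for model-generated code

TESTS = [((2, 3), 5), ((-1, 1), 0)]

def passes_tests(source: str) -> bool:
    namespace = {}
    try:
        exec(source, namespace)    # load the generated definition (unsafe outside a sandbox)
        fn = namespace["add"]
        return all(fn(*args) == expected for args, expected in TESTS)
    except Exception:
        return False               # any crash or wrong answer counts as failure

print(passes_tests(CANDIDATE))  # True
```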
What applications are LLMs particularly well-suited for, according to the transcript?
-LLMs are particularly effective for tasks involving language transformation (e.g., translating between formats or languages), syntactic tasks (e.g., converting JSON to CSV), and creative applications like writing assistance, where errors or stochastic outcomes are more acceptable.
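For syntactic transformations like JSON to CSV, the output can be checked mechanically by parsing it back and comparing against the source, as in this sketch (the llm_output value is illustrative, not real model output):

```python
import csv
import io
import json

source_json = '[{"name": "Ada", "year": "1815"}, {"name": "Alan", "year": "1912"}]'
llm_output = "name,year\nAda,1815\nAlan,1912\n"   # stand-in for a model's CSV answer

def csv_matches_json(json_text: str, csv_text: str) -> bool:
    """Round-trip check: the CSV rows must reproduce the JSON records exactly."""
    records = json.loads(json_text)
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return rows == records

print(csv_matches_json(source_json, llm_output))  # True
```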
How can LLMs be beneficial for creative writing and entertainment?
-LLMs can assist in creative writing by improving fluency, helping with language translation, and making scientific papers more accessible. They also enable more efficient content generation in entertainment, offering tools for writers to enhance their productivity and creativity.
What is the concern when using LLMs in high-risk applications like autonomous driving?
-The main concern with using LLMs in high-risk applications, such as autonomous driving, is the potential for errors. In these cases, it’s crucial to verify the LLM's interpretation and ensure it correctly understands the input, as trusting LLMs without verification can lead to catastrophic consequences.
Why is verification of LLM outputs critical in high-risk sectors?
-Verification is essential in high-risk sectors because errors in LLM outputs could result in severe consequences. For example, in autonomous vehicles, misinterpreting commands could lead to accidents. Therefore, mechanisms are needed to check the correctness of LLM-generated outputs before they are implemented.
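One way to structure that requirement is a gate that acts on an LLM output only after an independent verifier approves it, and escalates otherwise. This is a sketch of the pattern with placeholder callables, not a production design.

```python
from typing import Callable

def guarded_execute(output: str,
                    verifier: Callable[[str], bool],
                    execute: Callable[[str], None],
                    escalate: Callable[[str], None]) -> None:
    """Act on verified outputs; route everything else to human review."""
    if verifier(output):
        execute(output)       # verified: safe to act on
    else:
        escalate(output)      # unverified: never act, hand off to a person

# Trivial stand-ins to show the wiring:
guarded_execute(
    "reduce speed to 30 km/h",
    verifier=lambda cmd: cmd.endswith("km/h"),   # placeholder check, not a real validator
    execute=lambda cmd: print("executing:", cmd),
    escalate=lambda cmd: print("escalating:", cmd),
)
```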
What advancements were made in program verification for high-security software, as mentioned in the transcript?
-Recent advancements in program verification involve integrating LLMs with proof assistants. These systems can automate the process of generating proofs for software correctness, ensuring that the code meets high security and reliability standards, particularly for high-risk applications.
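For a sense of what a proof assistant checks, here is a trivially small Lean 4 example (not from the talk): the system accepts a theorem only when the supplied proof term type-checks.

```lean
-- `rfl` proves a definitional equality; the checker rejects anything unproven.
example : 2 + 2 = 4 := rfl

-- Reusing a library lemma as the proof of a stated theorem.
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

Integrating an LLM here would mean having the model propose proof terms or tactics, with the proof assistant's kernel as the final arbiter of correctness.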