Demis Hassabis - Scaling, Superhuman AIs, AlphaZero atop LLMs, Rogue Nations Threat
Summary
TLDR在这次播客中,DeepMind的CEO Demis Hassabis分享了他对人工智能和神经科学的看法。他讨论了智能的本质,大型语言模型(LLMs)的学习方式,以及人工智能在科学和健康领域的应用。Hassabis强调了负责任地开发AI的重要性,并探讨了未来AI技术的潜在影响,包括对人类工作的影响和多模态交互的可能性。他还提到了DeepMind在确保AI安全方面的努力,以及他们如何通过内部检查和平衡来管理风险。
Takeaways
- 🧠 智能定义:Demis认为智能是一系列高层次通用算法主题,尽管大脑有专门化的部分。
- 📈 学习转移:大型语言模型(LLMs)在特定领域数据丰富时,会在该领域表现出不对称的改进。
- 🤖 人工智能学习:人工智能通过经验学习,类似于人类在特定领域(如棋类、创意写作)的专业化学习。
- 🧬 神经科学启示:神经科学为人工智能提供了灵感,如强化学习和注意力机制,尽管并非直接的算法映射。
- 🔍 智能分析:需要更多研究来理解这些系统构建的表示,类似于对真实大脑的fMRI或单细胞记录分析。
- 🌳 人工智能发展:尽管人工智能取得了显著进展,但在规划和构建正确世界模型方面仍有待改进。
- 🚀 计算能力:通过提高模型的准确性和可靠性,可以更有效地进行搜索,减少所需的计算资源。
- 🏆 奖励函数:在现实世界系统中定义正确的目标函数和奖励函数是一个挑战,但科学问题通常可以指定目标。
- 💡 创新与扩展:结合旧的算法思想和新的大规模模型是人工智能发展的一个重要方向。
- 🔒 安全与开放源代码:在确保安全性的同时,开放源代码可以促进科学进步,但需要平衡以防止滥用。
- 🤔 未来展望:人工智能的未来将由多个利益相关者共同决定,包括民间社会、学术界和政府等。
Q & A
Demis Hassabis认为智能的核心算法主题是什么?
-Demis Hassabis认为智能的核心算法主题是大脑如何处理我们周围的世界,尽管大脑有专门做特定事情的部分,但可能有一些基本原则支撑着所有这些。
在特定领域给予大型语言模型大量数据时,它们在该领域的性能会不对称地提升,这是否意味着我们应该在所有不同领域都看到普遍的改进?
-Demis Hassabis提到,有时在特定领域改进时,确实会在其他领域看到出人意料的改进。例如,当这些大型模型在编码方面改进时,实际上可以提高它们的一般推理能力。
Demis Hassabis认为人工智能系统中的哪种转移学习最令人惊讶?
-Demis Hassabis希望看到更多这类转移学习,例如在编码和数学方面变得更好,然后普遍提高推理能力。他认为这与人类学习者的工作方式相似。
Demis Hassabis如何看待神经科学对人工智能的启示?
-Demis Hassabis认为神经科学在过去的10-20年中为人工智能提供了很多有趣的方向性线索,例如强化学习和深度学习的结合,以及体验重放和注意力机制等概念。
Demis Hassabis如何看待从大型语言模型到具有树状搜索功能的系统的转变?
-Demis Hassabis认为这是一个非常有前途的方向。我们需要不断提高大型模型的准确性,使它们成为更可靠的世界模型,并在此基础上研究类似AlphaZero的规划机制。
Demis Hassabis如何看待计算资源的高效利用?
-Demis Hassabis认为摩尔定律有所帮助,每年都会有更多的计算资源。但他强调重点是样本高效的方法和重用现有数据,例如体验重放,以及寻找更高效的搜索方式。
Demis Hassabis如何看待人工智能的目标函数和奖励函数的设定?
-Demis Hassabis指出,为现实世界系统定义正确的目标函数和奖励函数是非常困难的。他认为这是一个挑战,需要以一种既一般又足够具体的方式来指定它们,以便正确指引系统的方向。
Demis Hassabis如何看待人类智能与人工智能的规划和搜索能力?
-Demis Hassabis认为人类大脑并不适合进行蒙特卡洛树搜索,而是使用直觉、知识和经验来构建非常准确的物理模型,包括心理模拟。他认为这与AlphaGo等人工智能系统规划下一步的方式不同。
Demis Hassabis对于强化学习在解决数据瓶颈问题上的乐观态度基于什么?
-Demis Hassabis认为还有更多的数据可以使用,尤其是如果我们将多模态和视频等视为数据来源。他看到了通过模拟、自我对弈等方法创造合成数据的空间,这在AlphaGo和AlphaZero中已经得到了很好的应用。
Demis Hassabis如何看待人工智能的未来发展和可能的时间线?
-Demis Hassabis没有给出具体的时间预测,但他表示,当他在2010年创办DeepMind时,他们将其视为一个20年的项目,并且他们现在仍在正确的轨道上。他不会对在未来十年内看到类似AGI的系统感到惊讶。
Demis Hassabis如何看待人工智能系统的安全和控制问题?
-Demis Hassabis认为,为了确保这些系统是可理解且可控的,我们需要更多的理解。他提到了一些可能的解决方案,如更严格的评估系统、使用专门的AI工具来帮助分析和总结更广泛系统的活动,并创建硬化的沙箱或模拟环境来进行实验。
Outlines
🤖 对话DeepMind CEO Demis Hassabis
本段落介绍了与DeepMind首席执行官Demis Hassabis的对话,探讨了他对智能的看法,特别是他是否认为智能是一个更高层次的通用推理回路,或者是数千个独立的子技能和启发式程序。Hassabis认为,智能的广泛应用表明,大脑处理周围世界的方式可能存在一些高级的通用算法主题。他还讨论了大型语言模型(LLMs)在特定领域数据丰富时表现出的不对称改进现象,以及这种改进是否会导致所有不同领域的普遍提升。此外,Hassabis分享了他对人类大脑学习和人工智能系统之间相似之处的看法,以及他对这些系统如何通过改进编码和数学能力来提高推理能力的期望。
🧠 神经科学对人工智能的启示
在这一段落中,Demis Hassabis讨论了神经科学对人工智能发展的影响,尤其是过去的10到20年中神经科学的贡献。他提到,早期的人工智能浪潮中,神经科学提供了很多有趣的方向性线索,比如强化学习和深度学习的结合。他还强调了大脑作为一个自然系统,为工程系统提供了灵感和方向,尽管不是具体的算法一一对应。Hassabis认为,大脑证明了通用智能的可能性,这对于人工智能的发展具有重要的启示作用。
📈 人工智能的未来趋势
Demis Hassabis在这一段落中分享了他对人工智能未来发展的看法,特别是关于规划和构建正确世界模型的问题。他提到了DeepMind在AlphaZero等系统上的工作,这些系统能够通过多个步骤思考来实现最终结果。Hassabis认为,将大型模型与类似AlphaZero的规划机制相结合是一个非常有前途的方向。他还讨论了如何通过提高模型的准确性和可靠性来使它们成为更好的世界模型,以及如何通过搜索来探索可能性的巨大空间。
🤔 人工智能研究中的挑战与机遇
在这一段落中,Demis Hassabis探讨了人工智能研究中存在的挑战和机遇,特别是在计算资源和算法创新方面。他提到了摩尔定律对计算能力增长的帮助,以及DeepMind在样本高效方法和现有数据重用方面的研究。Hassabis还强调了改进世界模型的重要性,因为这可以使搜索过程更加高效。他以AlphaZero为例,说明了如何通过更好的模型来减少搜索所需的计算资源。此外,他还讨论了如何定义正确的目标函数和奖励函数,以及如何在一般性和特定性之间找到平衡。
🌟 人工智能的潜力与责任
Demis Hassabis在这一段落中讨论了人工智能的潜力和责任,特别是对于强人工智能(AGI)的发展。他认为,尽管存在许多未知和不确定性,但AGI的出现可能会在几个月到几年内加速人工智能研究的进展。Hassabis强调了在AGI系统出现之前,我们需要开发适当的评估和度量标准,以确保系统的安全性和可控性。他还提到了DeepMind在确保AI安全方面的努力,包括创建硬化的沙箱环境和使用专门的AI工具来分析和总结更通用系统的行为。
🚀 人工智能的未来发展
在这一段落中,Demis Hassabis分享了他对人工智能未来发展的展望,包括对多模态模型和机器人技术的进步。他认为,随着系统变得更加强大和通用,我们将开始习惯于真正的多模态交互,这将改变我们与AI系统的互动方式。Hassabis还讨论了在机器人领域取得进展的挑战,尤其是数据问题,但他对通过大型模型转移学习和从模拟到现实的学习表示乐观。他强调了在AI领域保持谨慎和负责任的重要性,并希望在AGI到来之前,通过AI为科学和健康带来实际的好处。
Mindmap
Keywords
💡神经科学
💡深度学习
💡强化学习
💡人工智能
💡通用智能
💡多模态
💡自我对弈
💡计算资源
💡安全与伦理
💡知识转移
Highlights
Demis Hassabis认为智能背后存在高级通用算法主题,但大脑也有专门化的部分。
大型语言模型(LLMs)在特定领域数据丰富时,会在该领域不对称地变得更好,但这种改善有时也会意外地提升其他领域的能力。
DeepMind的研究表明,通过大型模型改进编码能力可以实际提高它们的一般推理能力。
Demis Hassabis希望看到更多跨领域的知识转移,例如通过改进编程和数学能力来提高一般推理能力。
DeepMind正在研究如何通过类似AlphaZero的规划机制来提高大型模型的效率和实用性。
Demis Hassabis认为,尽管摩尔定律有所帮助,但仍需关注样本高效方法和现有数据的重用,以提高计算效率。
DeepMind正在探索如何通过多模态和视频数据创建合成数据,以及如何通过自博弈生成知识库。
Demis Hassabis强调,对于AGI的发展,我们需要同时推进模型扩展和创新发明。
DeepMind的AlphaGo系统在围棋和象棋等游戏中的表现优于人类世界冠军,但使用的搜索量远少于传统的暴力搜索方法。
Demis Hassabis认为,人类智能与AlphaGo在规划下一步时的方法不同,因为人脑不适合进行蒙特卡洛树搜索。
DeepMind正在研究如何通过强化学习(RL)和自博弈来克服数据瓶颈问题,并提高模型的自我演化能力。
Demis Hassabis提出,未来的AGI系统可能需要结合大型多模态模型和额外的规划搜索功能。
DeepMind的研究表明,通过人类反馈系统(RLHF)可以获得一定程度的现实基础,尽管模型并未直接体验多模态世界。
Demis Hassabis讨论了如何确保随着AI技术的发展,保持其与现实世界的联系和道德约束。
DeepMind已经在内部建立了检查和平衡机制,并计划在未来几个月公开发表关于负责任的AI扩展和技术论文。
Demis Hassabis认为,AGI的到来可能会在下一个十年内实现,并且这将是人类历史上最具变革性的技术之一。
Transcripts
Today it is a true honor to speak with Demis Hassabis, who is the CEO of DeepMind. Demis,
welcome to the podcast. Thanks for having me.
First question, given your neuroscience background, how do you think about intelligence?
Specifically, do you think it’s one higher-level general reasoning circuit, or do you think it’s
thousands of independent subskills and heuristics? It’s interesting because intelligence is so
broad and what we use it for is so generally applicable. I think that suggests there must
be high-level common algorithmic themes around how the brain processes the world
around us. Of course, there are specialized parts of the brain that do specific things,
but I think there are probably some underlying principles that underpin all of that.
How do you make sense of the fact that in these LLMs, when you give them a lot
of data in any specific domain, they tend to get asymmetrically better in
that domain. Wouldn’t we expect a general improvement across all the different areas?
First of all, I think you do sometimes get surprising improvement in other domains when
you improve in a specific domain. For example, when these large models improve at coding,
that can actually improve their general reasoning. So there is evidence of some transfer although we
would like a lot more evidence of that. But that’s how the human brain learns too. If
we experience and practice a lot of things like chess, creative writing, or whatever,
we also tend to specialize and get better at that specific thing even though we’re using
general learning techniques and general learning systems in order to get good at that domain.
What’s been the most surprising example of this kind of transfer for you? Will you
see language and code, or images and text? I’m hoping we’re going to see a lot more of
this kind of transfer, but I think things like getting better at coding and math,
and then generally improving your reasoning. That is how it works with us as human learners.
But I think it’s interesting seeing that in these artificial systems.
And can you see the sort of mechanistic way, in the language and code example,
in which you’ve found the place in a neural network that’s getting better
with both the language and the code? Or is that too far down the weeds?
I don’t think our analysis techniques are quite sophisticated enough to be able to hone in on
that. I think that’s actually one of the areas where a lot more research needs to be done,
the kind of mechanistic analysis of the representations that these systems build
up. I sometimes like to call it virtual brain analytics. In a way, it’s a bit like doing fMRI,
or single-cell recording from a real brain. What are the analogous analysis techniques for these
artificial minds? There’s a lot of great work going on in this sort of stuff. People like
Chris Olah, I really like his work. I think a lot of computational neuroscience techniques can be
brought to bear on analyzing the current systems we’re building. In fact, I try to
encourage a lot of my computational neuroscience friends to start thinking in that direction and
applying their know-how to the large models. What do other AI researchers not understand
about human intelligence that you have some sort of insight on, given your neuroscience background?
I think neuroscience has added a lot, if you look at the last 10-20 years that we’ve been at it.
I’ve been thinking about this for 30+ years. In the earlier days of the new wave of AI,
neuroscience was providing a lot of interesting directional clues,
things like reinforcement learning and combining that with deep learning. Some of our pioneering
work we did there were things like experience replay and even the notion of attention,
which has become super important. A lot of those original inspirations came from some understanding
about how the brain works, although not the exact specifics of course. One is an engineered
system and the other one’s a natural system. It’s not so much about a one-to-one mapping
of a specific algorithm, but more so inspirational direction. Maybe it’s some ideas for architecture,
or algorithmic ideas, or representational ideas. The brain is an existence proof that
general intelligence is possible at all. I think the history of human endeavors has been such that
once you know something’s possible it’s easier to push hard in that direction, because you know it’s
a question of effort, a question of when and not if. That allows you to make progress a lot more
quickly. So I think neuroscience has inspired a lot of the thinking, at least in a soft way,
behind where we are today. As for going forward, I think there’s still a lot of interesting things
to be resolved around planning. How does the brain construct the right world models? I
studied how the brain does imagination, or you can think of it as mental simulation. How do we
create very rich visual spatial simulations of the world in order for us to plan better?
Actually, I’m curious how you think that will interface with LLMs. Obviously, DeepMind is
at the frontier and has been for many years with systems like AlphaZero and so forth, having these
agents which can think through different steps to get to an end outcome. Is there a path for
LLMs to have this tree search kind of thing on top of them? How do you think about this?
I think that’s a super promising direction. We’ve got to carry on improving the large models. We’ve
got to carry on making them more and more accurate predictors of the world, making them more and
more reliable world models. That’s clearly a necessary, but probably insufficient component
of an AGI system. On top of that, we’re working on things like AlphaZero-like planning mechanisms
on top that make use of that model in order to make concrete plans to achieve certain goals
in the world. Perhaps chaining thought, lines of reasoning, together and using search to explore
massive spaces of possibility. I think that’s kind of missing from our current large models.
How do you get past the immense amount of compute that these approaches tend to
require? Even the AlphaGo system was a pretty expensive system because you sort of had to run
an LLM on each node of the tree. How do you anticipate that’ll get made more efficient?
One thing is Moore’s law tends to help. Over every year more computation comes in. But we focus a lot
on sample-efficient methods and reusing existing data, things like experience replay and also just
looking at more efficient ways. The better your world model is, the more efficient your search
can be. One example I always give is AlphaZero, our system to play Go and chess and any game.
It’s stronger than human world champion level in all these games and it uses a lot less search than
a brute force method like Deep Blue to play chess. One of these traditional Stockfish or
Deep Blue systems would maybe look at millions of possible moves for every decision it’s going
to make. AlphaZero and AlphaGo may look at around tens of thousands of possible positions in order
to make a decision about what to move next. A human grandmaster or world champion probably
only looks at a few hundred moves, even the top ones, in order to make their very good decision
about what to play next. So that suggests that the brute force systems don’t have any
real model other than the heuristics about the game. AlphaGo has quite a decent model but the
top human players have a much richer, much more accurate model of Go or chess. That allows them
to make world-class decisions on a very small amount of search. So I think there’s a sort of
trade-off there. If you improve the models, then I think your search can be more efficient and
therefore you can get further with your search. I have two questions based on that. With AlphaGo,
you had a very concrete win condition: at the end of the day, do I win this game of Go or not? You
can reinforce on that. When you’re thinking of an LLM putting out thought, do you think there
will be this ability to discriminate in the end, whether that was a good thing to reward or not?
Of course that’s why we pioneered, and what DeepMind is sort of famous for,
using games as a proving ground. That’s partly because it’s efficient to research in that domain.
The other reason is, obviously, it’s extremely easy to specify a reward function. Winning the
game or improving the score, something like that is built into most games. So that is one of the
challenges of real-world systems. How does one define the right objective function, the right
reward function, and the right goals? How does one specify them in a general way, but specific enough
that one actually points the system in the right direction? For real-world problems, that can be a
lot harder. But actually, if you think about it in even scientific problems, there are usually ways
that you can specify the goal that you’re after. When you think about human intelligence,
you were just saying that humans thinking about these thoughts are just super sample-efficient.
Einstein coming up with relativity, right? There’s thousands of possible permutations
of the equations. Do you think it’s also this sense of different heuristics like,
“I’m going to try out this approach instead of this”? Or is it a totally different way of
approaching and coming up with that solution than what AlphaGo does to plan the next move?
I think it’s different because our brains are not built for doing Monte Carlo tree search. It’s
just not the way our organic brains work. I think that people like Einstein, in order to compensate
for that, have used their intuition—and maybe we can come to what intuition is—and their knowledge
and their experience to build in Einstein’s case, extremely accurate models of physics that include
mental simulations. If you read about Einstein and how he came up with things, he used to visualize
and really feel what these physical systems should be like, not just the mathematics of
it. He had a really intuitive feel for what they would be like in reality. That allowed him to
think these thoughts that were very outlandish at the time. So I think that that gets to the
sophistication of the world models that we’re building. Imagine your world model can get you
to a certain node in a tree that you’re searching, and then you just do a little bit of search around
that leaf node and that gets you to these original places. Obviously, if your model and your judgment
on that model is very, very good, then you can pick which leaf nodes you should expand
with search much more accurately. So overall, you therefore do a lot less search. I mean, there’s
no way that any human could do a kind of brute force search over any kind of significant space.
A big open question right now is whether RL will allow these models to use the self-play
synthetic data to get over data bottlenecks. It sounds like you’re optimistic about this?
I’m very optimistic about that. First of all, there’s still a lot more data that can be used,
especially if one views multimodal and video and these kinds of things. Obviously,
society is adding more data all the time to the Internet and things like that. I think that
there’s a lot of scope for creating synthetic data. We’re looking at that in different ways,
partly through simulation, using very realistic game environments, for example,
to generate realistic data, but also self-play. That’s where systems interact with each other
or converse with each other. It worked very well for us with AlphaGo and AlphaZero where we got the
systems to play against each other and actually learn from each other’s mistakes and build up a
knowledge base that way. I think there are some good analogies for that. It’s a little bit more
complicated to build a general kind of world data. How do you get to the point with these
models where the synthetic data they’re outputting on the self-play they’re doing
is not just more of what’s already in their data set, but something they haven’t seen
before? To actually improve the abilities. I think there’s a whole science needed there.
I think we’re still in the nascent stage of this, of data curation and data analysis and
actually analyzing the holes that you have in your data distribution. This is important for
things like fairness and bias and other stuff. To remove that from the system is to really make
sure that your data set is representative of the distribution you’re trying to learn.
There are many tricks there one can use, like overweighting or replaying certain parts of the
data. Or if you identify some gap in your data set, you could imagine that’s where you put your
synthetic generation capabilities to work on. Nowadays, people are paying attention to the RL
stuff that DeepMind did many years before. What are the early research directions, or something
that was done way back in the past, that you think will be a big deal but people just haven’t been
paying attention to it? There was a time where people weren’t paying attention to scaling. What’s
the thing now that is totally underrated? Well, I think that the history of the last
couple of decades has been things coming in and out of fashion, right? A while ago,
maybe five-plus years ago, we were pioneering with AlphaGo and before that DQN. It was the
first system that worked on Atari, our first big system really more than ten years ago now,
that scaled up Q-learning and reinforcement learning techniques and combined that with deep
learning to create deep reinforcement learning. We used that to scale up to master some pretty
complex tasks like playing Atari games just from the pixels. I do actually think a lot of those
ideas need to come back in again and, as we talked about earlier, combine them with the new advances
in large models and large multimodal models, which are obviously very exciting as well. So I do think
there’s a lot of potential for combining some of those older ideas together with the newer ones.
Is there any potential for the AGI to eventually come from a pure RL approach? The way we’re
talking about it, it sounds like the LLM will form the right prior and then this sort of tree search
will go on top of that. Or is it a possibility that it comes completely out of the dark?
Theoretically, I think there’s no reason why you couldn’t go full AlphaZero-like on it. There are
some people here at Google DeepMind and in the RL community who work on that, fully assuming no
priors, no data, and just building all knowledge from scratch. I think that’s valuable because
those ideas and those algorithms should also work when you have some knowledge too. Having
said that, I think by far the quickest way to get to AGI, and the most plausible way,
is to use all the knowledge that’s existing in the world right now that we’ve collected from things
like the Web. We have these scalable algorithms, like transformers, that are capable of ingesting
all of that information. So I don’t see why you wouldn’t start with a model as a kind of prior,
or to build on it and to make predictions that help bootstrap your learning. I just think it
doesn’t make sense not to make use of that. So my betting would be that the final AGI system will
have these large multimodal models as part of the overall solution, but they probably
won’t be enough on their own. You’ll need this additional planning search on top.
This sounds like the answer to the question I’m about to ask. As somebody who’s been in
this field for a long time and seen different trends come and go, what do you think the
strong version of the scaling hypothesis gets right and what does it get wrong? The idea that
you just throw enough compute at a wide enough distribution of data and you get intelligence.
My view is that this is kind of an empirical question right now. I think it was pretty
surprising to almost everyone, including the people who first worked on the scaling hypotheses,
how far it’s gone. In a way, I look at the large models today and I think they’re almost
unreasonably effective for what they are. I think it’s pretty surprising some of the properties that
emerge. In my opinion, they’ve clearly got some form of concepts and abstractions and things
like that. I think if we were talking five-plus years ago, I would have said to you that maybe
we need an additional algorithmic breakthrough in order to do that, maybe more like how the
brain works. I think that’s still true if we want explicit abstract concepts, neat concepts,
but it seems that these systems can implicitly learn that. Another really interesting, unexpected
thing was that these systems have some sort of grounding even though they don’t experience the
world multimodally, at least until more recently when we have the multimodal models. The amount of
information and models that can be built up just from language is surprising. I think that I’d
have some hypotheses about why that is. I think we get some grounding through the RLHF feedback
systems because obviously the human raters are by definition, grounded people. We’re grounded in
reality, so our feedback is also grounded. Perhaps there’s some grounding coming in through there.
Also if you’re able to ingest all of it, maybe language contains more grounding than linguists
thought before. So it actually raises some very interesting philosophical questions that people
haven’t even really scratched the surface of yet. Looking at the advances that have been made,
it’s quite interesting to think about where it’s going to go next. In terms of your question of
large models, I think we’ve got to push scaling as hard as we can and that’s what we’re doing here.
It’s an empirical question, whether that will hit an asymptote or a brick wall, and there are
different people who argue about that. I think we should just test it. I think no one knows.
In the meantime, we should also double down on innovation and invention. This is something where
Google Research and DeepMind and Google Brain have pioneered many, many things over the last decade.
That’s our bread and butter. You can think of half our effort as having to do with scaling and
half our efforts having to do with inventing the next architectures and the next algorithms that
will be needed, knowing that larger and larger scaled models are coming down the line. So my
betting right now, but it’s a loose betting, is that you need both. I think you’ve got to push
both of them as hard as possible and we’re in a lucky position that we can do that.
I want to ask more about the grounding. You can imagine two things that might change which would
make the grounding more difficult. One is that as these models get smarter, they are going to
be able to operate in domains where we just can’t generate enough human labels, just because we’re
not smart enough. If it does a million-line pull request, how do we tell it, for example,
this is within the constraints of our morality and the end goal we wanted and this isn’t? The
other thing has to do with what you were saying about compute. So far we’ve been doing next token
prediction and in some sense it’s a guardrail, because you have to talk as a human would talk
and think as a human would think. Now, additional compute is maybe going to come in the form of
reinforcement learning where it’s just getting to the objective and we can’t really trace how
you got there. When you combine those two, how worried are you that the grounding goes away?
I think if it’s not properly grounded, the system won’t be able to achieve those goals properly. In
a sense, you have to have some grounding for a system to actually achieve goals in the real
world. I do actually think that these systems, and things like Gemini, are becoming more multimodal.
As we start ingesting things like video and audiovisual data as well as text data, then the
system starts correlating those things together. I think that is a form of proper grounding. So
I do think our systems are going to start to understand the physics of the real world better.
Then one could imagine the active version of that as a very realistic simulation or
game environment where you’re starting to learn about what your actions do in the world and how
that affects the world itself. The world stays itself, but it also affects what next learning
episode you’re getting. So these RL agents we’ve always been working on and pioneered,
like AlphaZero and AlphaGo, actually are active learners. What they decide to do
next affects what next learning piece of data or experience they’re going to get. So there’s
this very interesting sort of feedback loop. And of course, if we ever want to be good at
things like robotics, we’re going to have to understand how to act in the real world.
So there’s grounding in terms of whether the capabilities will be able to proceed,
whether they will be enough in touch with reality to do the things we want. There’s
another sense of grounding in that we’ve gotten lucky that since they’re trained on human thought,
they maybe think like a human. To what extent does that stay true when more of the compute for
training comes from just “did you get the right outcome” and it’s not guardrailed by “are you
proceeding on the next token as a human would?” Maybe the broader question I’ll pose to you is,
and this is what I asked Shane as well, what would it take to align a system that’s smarter
than a human? Maybe it thinks in alien concepts and you can’t really monitor the million-line pull
request because you can’t really understand the whole thing and you can’t give labels.
This is something Shane and I, and many others here, have had at the forefront of our minds since
before we started DeepMind because we planned for success. In 2010, no one was thinking about AI let
alone AGI. But we already knew that if we could make progress with these systems and these ideas,
the technology created would be unbelievably transformative. So we were already thinking
20 years ago about what the consequences of that would be, both positive and negative. Of course,
the positive direction is amazing science, things like AlphaFold,
incredible breakthroughs in health and science, and mathematical and scientific discovery. But we
also have to make sure these systems are sort of understandable and controllable.
This will be a whole discussion in itself, but there are many, many ideas that people have
such as more stringent eval systems. I think we don’t have good enough evaluations and benchmarks
for things like if the system can deceive you. Can it exfiltrate its own code or do other undesirable
behaviors? There are also ideas of using AI, not general learning ones but maybe narrow AIs that
are specialized for a domain, to help us as the human scientists to analyze and summarize what the
more general system is doing. So there’s narrow AI tools. I think that there’s a lot of promise
in creating hardened sandboxes or simulations that are hardened with cybersecurity arrangements
around the simulation, both to keep the AI in and to keep hackers out. You could experiment
a lot more freely within that sandbox domain. There’s many, many other ideas, including the
analysis stuff we talked about earlier, where we can analyze and understand what the concepts
are that this system is building and what the representations are. So maybe then they’re not
so alien to us and we can actually keep track of the kind of knowledge that it’s building.
Stepping back a bit, I’m curious what your timelines are. So Shane
said his modal outcome is 2028. I think that’s maybe his median. What is yours?
I don’t have prescribed specific numbers to it because I think there’s so many unknowns
and uncertainties. Human ingenuity and endeavor comes up with surprises all the time. So that
could meaningfully move the timelines. I will say that when we started DeepMind back in 2010,
we thought of it as a 20-year project. And I think we’re on track actually, which is kind
of amazing for 20-year projects because usually they’re always 20 years away. That’s the joke
about whatever, quantum, AI, take your pick. But I think we’re on track. So I wouldn’t be surprised
if we had AGI-like systems within the next decade. Do you buy the model that once you have an AGI,
you have a system that basically speeds up further AI research? Maybe not in an overnight sense,
but over the course of months and years you would have much faster
progress than you would have otherwise had? I think that’s potentially possible. I think
it partly depends on what we, as a society, decide to use the first nascent AGI systems or
proto-AGI systems for. Even the current LLMs seem to be pretty good at coding and we have systems
like AlphaCode. We also have theorem proving systems. So one could imagine combining these
ideas together and making them a lot better. I could imagine these systems being quite good at
designing and helping us build future versions of themselves, but we also have to think about
the safety implications of that of course. I’m curious what you think about that. I’m
not saying this is happening this year, but eventually you’ll be developing a model where
you think there’s some chance that it’ll be capable of an intelligence explosion-like
dynamic once it’s fully developed. What would have to be true of that model at that point where
you’re comfortable continuing the development of the system? Something like, “I’ve seen these
specific evals, I’ve understood its internal thinking and its future thinking enough.”
We need a lot more understanding of the systems than we do today before I would even be confident
of explaining to you what we’d need to tick box there. I think what we’ve got to do in the next
few years, in the time before those systems start arriving, is come up with the right evaluations
and metrics. Ideally formal proofs, but it’s going to be hard for these types of systems, so at least
empirical bounds around what these systems can do. That’s why I think about things like deception as
being quite root node traits that you don’t want. If you’re confident that your system is exposing
what it actually thinks, then that opens up possibilities of using the system itself to
explain aspects of itself to you. The way I think about that is like this. If I were to play a game
of chess against Garry Kasparov, which I’ve played in the past, Magnus Carlsen, or the amazing chess
players of all time, I wouldn’t be able to come up with a move that they could. But they could
explain to me why they came up with that move and I could understand it post hoc, right? That’s the
sort of thing one could imagine. One of the capabilities that we could make use of these
systems is for them to explain it to us and even maybe get the proofs behind why they’re thinking
something, certainly in a mathematical problem. Got it. Do you have a sense of what the converse
answer would be? So what would have to be true where tomorrow morning you’re like “oh,
man, I didn’t anticipate this.” You see some specific observation tomorrow morning that
makes you say “we got to stop Gemini 2 training.” I could imagine that. This is where things like
the sandbox simulations are important. I would hope we’re experimenting in a safe,
secure environment when something very unexpected happens. There’s a new unexpected capability or
something that we didn’t want. We explicitly told the system we didn’t want it but then it
did and it lied about it. These are the kinds of things where one would want to then dig in
carefully. The systems that are around today are not dangerous, in my opinion, but in a few years
they might have potential. Then you would ideally pause and really get to the bottom of why it was
doing those things before one continued. Going back to Gemini, I’m curious what the
bottlenecks were in the development. Why not immediately make it one order
of magnitude bigger if scaling works? First of all, there are practical limits.
How much compute can you actually fit in one data center? You’re also bumping up against
very interesting distributed computing kind of challenges. Fortunately, we have some of
the best people in the world working on those challenges and cross data center training,
all of these kinds of things. There are very interesting hardware challenges and we have our
TPUs that we’re building and designing all the time as well as using GPUs. So there’s all of
that. Scaling laws also don’t just work by magic. You still need to scale up the hyperparameters,
and various innovations are going in all the time with each new scale. It’s not just about repeating
the same recipe at each new scale. You have to adjust the recipe and that’s a bit of an art
form. You have to sort of get new data points. If you try to extend your predictions and extrapolate
them several orders of magnitude out, sometimes they don’t hold anymore. There can be step
functions in terms of new capabilities and some things hold, other things don’t. Often you do need
those intermediate data points to correct some of your hyperparameter optimization and other things,
so that the scaling law continues to be true. So there are various practical limitations to that.
One order of magnitude is probably about the maximum that you want to do between each era.
That’s so fascinating. In the GPT-4 technical report, they say that they
were able to predict the training loss with a model with tens of thousands of times less
compute than GPT-4. They could see the curve. But the point you’re making is that the actual
capabilities that loss implies may not be so. Yeah, the downstream capabilities sometimes
don’t follow. You can often predict the core metrics like training loss or something like that,
but then it doesn’t actually translate into MMLU, or math, or some other actual capability that you
care about. They’re not necessarily linear all the time. There are non-linear effects there.
What was the biggest surprise to you during the development of
Gemini in terms of something like this happening? I wouldn’t say there was one big surprise. It was
very interesting trying to train things at that size and learning about all sorts of
things from an organizational standpoint, like how to babysit such a system and to
track it. There’s also things like getting a better understanding of the metrics you’re
optimizing versus the final capabilities that you want. I would say that’s still not a perfectly
understood mapping, but it’s an interesting one that we’re getting better and better at.
There’s a perception that maybe other labs are more compute-efficient than
DeepMind has been with Gemini. I don’t know what you make of that perception.
I don’t think that’s the case. I think that actually Gemini 1 used roughly the same amount
of compute, maybe slightly more, than what was rumored for GPT-4. I don’t know exactly what was
used but I think it was in the same ballpark. I think we’re very efficient with our compute
and we use our compute for many things. One is not just the scaling but, going back to earlier, more
innovations and ideas. A new innovation, a new invention, is only useful if it can also scale.
So you need quite a lot of compute to do new invention because you’ve got to test many things,
at least some reasonable scale, and make sure that they work at that scale. Also,
some new ideas may not work at a toy scale but do work at a larger scale. In fact,
those are the more valuable ones. So if you think about that exploration process,
you need quite a lot of compute to be able to do that. The good news is we’re pretty lucky
at Google. I think this year we’re going to have the most compute by far of any sort of research
lab. We hope to make very efficient and good use of that in terms of both scaling and the
capability of our systems and also new inventions. What’s been the biggest surprise to you, if you go
back to yourself in 2010 when you were starting DeepMind, in terms of what AI progress has looked
like? Did you anticipate back then that it would, in some large sense, amount to spending billions
of dollars into these models? Or did you have a different sense of what it would look like?
We thought that actually, and I know you’ve interviewed my colleague Shane. He always
thought in terms of compute curves and comparing it roughly to the brain,
how many neurons and synapses there are very loosely. Interestingly, we’re actually in that
kind of regime now with roughly the right order of magnitude of number of synapses in the brain
and the sort of compute that we have. But I think more fundamentally, we always thought that we bet
on generality and learning. So those were always at the core of any technique we would use. That’s
why we triangulated on reinforcement learning, and search, and deep learning as three types of
algorithms that would scale, be very general, and not require a lot of handcrafted human priors. We
thought that was the sort of failure mode of the efforts to build AI in the 90s in places like
MIT. There were very logic-based systems, expert systems, and masses of hand-coded,
handcrafted human information going into them that turned out to be wrong or too rigid. So we
wanted to move away from that and I think we spotted that trend early. Obviously, we used
games as our proving ground and we did very well with that. I think all of that was very successful
and maybe inspired others. AlphaGo, I think, was a big moment for inspiring many others to think “oh,
actually, these systems are ready to scale.” Of course then, with the advent of transformers,
invented by our colleagues at Google Research and Brain, that was the type of deep learning
that allowed us to ingest masses of amounts of information. That has really turbocharged where
we are today. So I think that’s all part of the same lineage. We couldn’t have predicted every
twist and turn there, but I think the general direction we were going in was the right one.
It’s fascinating if you read your old papers or Shane’s old papers. In Shane’s thesis in 2009,
he said “well, the way we would test for AI is, can you compress Wikipedia?” And that’s literally,
the loss function for LLMs. Or in your own paper in 2016 before transformers,
you were comparing neuroscience and AI and you said attention is what is needed.
Exactly. So we had these things called out and we had some early attention papers,
but they weren’t as elegant as transformers in the end, neural Turing machines and things
like this. Transformers were the nicer and more general architecture of that.
When you extrapolate all this out forward and you think about superhuman intelligence,
what does that landscape look like to you? Is it still controlled by a private company? What should
the governance of that look like concretely? I think that this is so consequential,
this technology. I think it’s much bigger than any one company or even industry in general.
I think it has to be a big collaboration with many stakeholders from civil society, academia,
government, etc. The good news is that with the popularity of the recent chatbot systems, I think
that has woken up many of these other parts of society to the fact that this is coming and what
it will be like to interact with these systems. And that’s great. It’s opened up lots of doors for
very good conversations. An example of that was the safety summit the UK hosted a few months ago,
which I thought was a big success in getting this international dialogue going. I think the whole of
society needs to be involved in deciding what we want to deploy these models for? How do we
want to use them and what do we not want to use them for? I think we’ve got to try and get some
international consensus around that and also make sure that these systems benefit everyone, for the
good of society in general. That’s why I push so hard for things like AI for science. I hope that
with things like our spin-out, Isomorphic, we’re going to start curing terrible diseases with AI,
accelerate drug discovery, tackle climate change, and do other amazing things. There are big
challenges that face humanity, massive challenges. I’m actually optimistic we can solve them because
we’ve got this incredibly powerful tool of AI coming down the line that we can apply to
help us solve many of these problems. Ideally, we would have a big consensus around that and a big
discussion at sort of the UN level if possible. One interesting thing is if you look at these
systems and chat with them, they’re immensely powerful and intelligent. But it’s interesting
the extent to which they haven’t automated large sections of the economy yet. Whereas if five years
ago I showed you Gemini, you’d be like “wow, this is totally coming for a lot of things.”
So how do you account for that? What’s going on that it hasn’t had the broader impact yet?
I think that just shows we’re still at the beginning of this new era. I think there are
some interesting use cases where you can use these chatbot systems to summarize stuff for you and do
some simple writing, maybe more boilerplate-type writing. But that’s only a small part of what we
all do every day. I think for more general use cases we still need new capabilities,
things like planning and search but also things like personalization and episodic memory. That’s
not just long context windows, but actually remembering what we spoke about 100 conversations
ago. I’m really looking forward to things like recommendation systems that help me find better,
more enriching material, whether that’s books or films or music and so on. I would use that type of
system every day. So I think we’re just scratching the surface of what these AI assistants could
actually do for us in our general, everyday lives and also in our work context as well. I think
they’re not reliable yet enough to do things like science with them. But I think one day,
once we fix factuality and grounding and other things, I think they could end up
becoming the world’s best research assistant for you as a scientist or as a clinician.
I want to ask about memory. You had this fascinating paper in 2007 where you talked
about the links between memory and imagination and how they, in some sense, are very similar. People
often claim that these models are just memorizing. How do you think about that claim? Is memorization
all you need because in some deep sense, that’s compression? What’s your intuition here?
At the limit, one maybe could try and memorize everything but it wouldn’t generalize out of your
distribution. The early criticisms of these early systems were that they were just regurgitating
and memorizing. I think clearly in the Gemini, GPT-4 type era, they are definitely generalizing
to new constructs. Actually my thesis, and that paper particularly that started that
area of imagination in neuroscience, was showing that first of all memory, at least human memory,
is a reconstructive process. It’s not a videotape. We sort of put it together back from components
that seem familiar to us, the ensemble. That’s what made me think that imagination might be the
same thing. Except in this case you’re using the same semantic components, but now you’re putting
it together in a way that your brain thinks is novel, for a particular purpose like planning. I
do think that that kind of idea is still probably missing from our current systems, pulling together
different parts of your world model to simulate something new that then helps with your planning,
which is what I would call imagination. For sure. Now you guys have the best models
in the world with the Gemini models. Do you plan on putting out some sort of framework
like the other two major AI labs have? Something like “once we see these specific capabilities,
unless we have these specific safeguards, we’re not going to continue development
or we’re not going to ship the product out.” Yes, we already have lots of internal checks
and balances but we’re going to start publishing. Actually, watch this space. We’re working on a
whole bunch of blog posts and technical papers that we’ll be putting out in the next few months
along similar lines of things like responsible scaling laws and so on. We have those implicitly
internally in various safety councils that people like Shane chair and so on. But it’s time for us
to talk about that more publicly I think. So we’ll be doing that throughout the course of the year.
That’s great to hear. Another thing I’m curious about is, there’s not only the risk
of the deployed model being something that people can use to do bad things,
but there’s also rogue actors, foreign agents, and so forth, being able to steal the weights
and then fine-tune them to do crazy things. How do you think about securing the weights to make sure
something like this doesn’t happen, making sure a very key group of people has access to them?
It’s interesting. First of all, there’s two parts. One is security, one is open source, which maybe
we can discuss. The security is super key just as normal cybersecurity type things. I think
we’re lucky at Google DeepMind. We’re behind Google’s firewall and cloud protection which I
think is best in class in the world corporately. So we already have that protection. Behind that,
we have specific DeepMind protections within our code base. It’s sort of a double layer
of protection. So I feel pretty good about that. You can never be complacent on that
but I feel it’s already the best in the world in terms of cyber defenses. We’ve got to carry
on improving that and again, things like the hardened sandboxes could be a way of doing that
as well. Maybe there are even specifically secure data centers or hardware solutions
to this too that we’re thinking about. I think that maybe in the next three, four, five years,
we would also want air gaps and various other things that are known in the security community.
So I think that’s key and I think all frontier labs should be doing that because otherwise for
rogue nation-states and other dangerous actors, there would obviously be a lot of incentive for
them to steal things like the weights. Of course, open source is another interesting question. We’re
huge proponents of open source and open science. We’ve published thousands of papers, things like
AlphaFold and transformers and AlphaGo. All of these things we put out there into the world,
published and open source, most recently GraphCast, our weather prediction system. But
when it comes to the general-purpose foundational technology, I think the question I would have for
open source proponents is, how does one stop bad actors, individuals or up to rogue states,
taking those same open source systems and repurposing them for harmful ends? We have to
answer that question. I don’t know what the answer is to that, but I haven’t heard a compelling,
clear answer to that from proponents of just open sourcing everything. So I think there
has to be some balance there. Obviously, it’s a complex question of what that is.
I feel like tech doesn’t get the credit it deserves for funding hundreds of billions of
dollars’ worth of R&D, obviously you have DeepMind with systems like AlphaFold and so on. When we
talk about securing the weights, as we said maybe right now it’s not something that is going to
cause the end of the world or anything, but as these systems get better and better, there’s the
worry that a foreign agent or something gets access to them. Presumably right now there’s
dozens to hundreds of researchers who have access to the weights. What’s a plan for getting the
weights in a situation room where if you need to access them it’s some extremely strenuous process
and no individual can really take them out? One has to balance that with allowing for
collaboration and speed of progress. Another interesting thing is that of course you want
brilliant independent researchers from academia or things like the UK AI Safety Institute and the US
one to be able to red team these systems. So one has to expose them to a certain extent,
although that’s not necessarily the weights. We have a lot of processes in place about making
sure that only if you need them, those people who need access have access. Right now, I think
we’re still in the early days of those kinds of systems being at risk. As these systems become
more powerful and more general and more capable, I think one has to look at the access question.
Some of these other labs have specialized in different things relative to safety,
Anthropic for example with interpretability. Do you have some sense of where you guys might
have an edge? Now that you have the frontier model, where are you guys going to be able to
put out the best frontier research on safety? I think we helped pioneer RLHF and other things
like that which can obviously be used for performance but also for safety. I think that
a lot of the self-play ideas and these kinds of things could also be used to auto-test a
lot of the boundary conditions that you have with the new systems. Part of the issue is
that with these very general systems, there’s so much surface area to cover about how these
systems behave. So I think we are going to need some automated testing. Again,
with things like simulations and games, very realistic virtual environments, I think we
have a long history of using those kinds of systems and making use of them for building
AI algorithms. I think we can leverage all of that history. And then around Google, we’re very lucky
to have some of the world’s best cybersecurity experts, hardware designers. I think we can bring
that to bear for security and safety as well. Let’s talk about Gemini. So now you guys have
the best model in the world. I’m curious. The default way to interact with these systems has
been through chat so far. Now that we have multimodal and all these new capabilities,
how do you anticipate that changing? Do you think that’ll still be the case?
I think we’re just at the beginning of actually understanding how exciting that might be to
interact with a full multimodal model system. It’ll be quite different from what we’re used
to today with the chatbots. I think the next versions of this over the next year, 18 months,
we’ll maybe have some contextual understanding of the environment around you through a camera
or a phone or some glasses. I could imagine that as the next step. And then I think we’ll
start becoming more fluid in understanding “let’s sample from a video, let’s use voice.” Maybe even
eventually things like touch and if you think about robotics, other types of sensors. So I
think the world’s about to become very exciting in the next few years as we start getting used
to the idea of what true multimodality means. On the robotics subject, when he was on the
podcast Ilya said that the reason OpenAI gave up on robotics was because they didn’t have enough
data in that domain, at least at the time they were pursuing it. You guys have put out
different things like Robo-Transformer and other things. Do you think that’s still a bottleneck
for robotics progress, or will we see progress in the world of atoms as well as the world of bits?
We’re very excited about our progress with things like Gato and RT-2. We’ve always liked robotics
and we’ve had amazing research in that. We still have that going now because we like the fact that
it’s a data-poor regime. That pushes us in very interesting research directions that we think
are going to be useful anyway: sampling efficiency and data efficiency in general, transfer learning,
learning from simulation and transferring that to reality, sim-to-real. All of these are very
interesting general challenges that we would like to solve. The control problem. So, we’ve always
pushed hard on that. I think Ilya is right. It is more challenging because of the data problem. But
I think we’re starting to see the beginnings of these large models being transferable to
the robotics regime. They can learn in the general domain, language domain and other things, and then
just treat tokens like Gato as any type of token. The token could be an action, it could be a word,
it could be part of an image, a pixel, or whatever it is. That’s what I think true multimodality is.
To begin with, it’s harder to train a system like that than a straightforward language
system. But going back to our early conversation on transfer learning, you start seeing that with a
true multimodal system, the other modalities benefit some different modalities. You get
better at language because you now understand a little bit about video. So I do think it’s
harder to get going, but ultimately we’ll have a more general, more capable system like that.
What ever happened to Gato? That was super fascinating that you could have it
play games and also do video and also do text. We’re still working on those kinds of systems,
but you can imagine we’re trying to build those ideas into our future generations of
Gemini to be able to do all of those things. Robotics, transformers, and things like that,
you can think of them as follow-ups to that. Will we see asymmetric progress in the domains in
which the self-play kinds of things you’re talking about will be especially powerful? So math and
code. Recently, you have these papers out about this. You can use these things to do really cool,
novel things. Will they be superhuman coders, but in other ways they might still be worse
than humans? How do you think about that? I think that we’re making great progress
with math and things like theorem proving and coding. But it’s still interesting if one looks
at creativity in general, and scientific endeavor in general. I think we’re getting to the stage
where our systems could help the best human scientists make their breakthroughs quicker,
almost triage the search space in some ways. Perhaps find a solution like AlphaFold does
with a protein structure. They’re not at the level where they can create the hypothesis themselves or
ask the right question. As any top scientist will tell you, the hardest part of science is actually
asking the right question. It’s boiling down that space to the critical question we should
go after and then formulating the problem in the right way to attack it. That’s not something our
systems really have any idea how to do, but they are suitable for searching large combinatorial
spaces if one can specify the problem with a clear objective function. So that’s very useful already
for many of the problems we deal with today, but not the most high-level creative problems.
DeepMind has published all kinds of interesting stuff in speeding
up science in different areas. If you think AGI is going to happen in the next 10 to 20 years,
why not just wait for the AGI to do it for you? Why build these domain-specific solutions?
I think we don’t know how long AGI is going to be. We always used to say, back even when
we started DeepMind, that we don’t have to wait for AGI in order to bring incredible benefits to
the world. My personal passion especially has been AI for science and health. You can see
that with things like AlphaFold and all of our various Nature papers on different domains and
material science work and so on. I think there’s lots of exciting directions and also impact in
the world through products too. I think it’s very exciting and a huge unique opportunity we
have as part of Google. They’ve got dozens of billion-user products that we can immediately
ship our advances into and then billions of people can improve, enrich, and enhance
their daily lives. I think it’s a fantastic opportunity for impact on all those fronts.
I think the other reason from the point of view of AGI specifically is that it battle tests
your ideas. You don’t want to be in a research bunker where you theoretically are pushing things
forward, but then actually your internal metrics start deviating from real-world things that people
would care about, or real-world impact. So you get a lot of direct feedback from these real-world
applications that then tells you whether your systems really are scaling or if we need to be
more data efficient or sample efficient. Because most real-world challenges require that. So it
kind of keeps you honest and pushes you to keep nudging and steering your research directions
to make sure they’re on the right path. So I think it’s fantastic. Of course, the world
benefits from that. Society benefits from that on the way, maybe many years before AGI arrives.
The development of Gemini is super interesting because it comes right at the heels of merging
these different organizations, Brain and DeepMind. I’m curious, what have been the challenges
there? What have been the synergies? It’s been successful in the sense that you have the best
model in the world now. What’s that been like? It’s been fantastic actually, over the last year.
Of course it’s been challenging to do, like any big integration coming together. You’re talking
about two world-class organizations with long, storied histories of inventing many
important things from deep reinforcement learning to transformers. So it’s very exciting to actually
pool all of that together and collaborate much more closely. We always used to be collaborating,
but more on a project-by-project basis versus a much deeper, broader collaboration like we
have now. Gemini is the first fruit of that collaboration, including the name Gemini
implying twins. Of course, a lot of other things are made more efficient like pooling compute
resources together and ideas and engineering. I think at the stage we’re at now, there are huge
amounts of world-class engineering that have to go into building the frontier systems. I
think it makes sense to coordinate that more. You and Shane started DeepMind partly because
you were concerned about safety. You saw AGI coming as a live possibility. Do you think
the people who were formerly part of Brain, that half of Google DeepMind now, approach
it in the same way? Have there been cultural differences there in terms of that question?
This is one of the reasons we joined forces with Google back in 2014. I think the entirety of
Google and Alphabet, not just Brain and DeepMind, takes these questions of responsibility very
seriously. Our kind of mantra is to try and be bold and responsible with these systems. I’m
obviously a huge techno-optimist but I want us to be cautious given the transformative power of
what we’re bringing into the world collectively. I think it’s important. It’s going to be one of
the most important technologies humanity will ever invent. So we’ve got to put all our efforts into
getting this right and be thoughtful and also humble about what we know and don’t
know about the systems that are coming and the uncertainties around that. In my view,
the only sensible approach when you have huge uncertainty is to be cautiously optimistic and
use the scientific method to try and have as much foresight and understanding about what’s coming
down the line and the consequences of that before it happens. You don’t want to be live A/B testing
out in the world with these very consequential systems because unintended consequences may be
quite severe. So I want us to move away, as a field, from a sort of “move fast and break things
attitude” which has maybe served the Valley very well in the past and obviously created important
innovations. I think in this case we want to be bold with the positive things that it can do
and make sure we advance things like medicine and science whilst being as responsible and
thoughtful as possible with mitigating the risks. That’s why it seems like the responsible scaling
policies are something that are a very good empirical way to pre-commit to these
kinds of things. Yes, exactly.
When you’re doing these evaluations and for example it turns out your next model could
help a layperson build a pandemic-class bioweapon or something, how would you think first of all
about making sure those weights are secure so that they don't get out? And second, what
would have to be true for you to be comfortable deploying that system? How would you make sure
that this latent capability isn’t exposed? The secure model part I think we’ve covered
with the cybersecurity and making sure that’s world-class and you’re monitoring all those
things. I think if a capability like that was discovered through red teaming or external
testing, independent testers like government institutes or academia or whatever, then we
would have to fix that loophole. Depending on what it was, that might require a different kind of
constitution perhaps, or different guardrails, or more RLHF to avoid that. Or you could remove some
training data, depending on what the problem is. I think there could be a number of mitigations. The
first part is making sure you detect it ahead of time. So that’s about the right evaluations
and right benchmarking and right testing. Then the question is how one would fix that before
you deployed it. But I think it would need to be fixed before it was deployed generally,
for sure, if that was an exposure surface. Final question. You’ve been thinking in terms
of the end goal of AGI at a time when other people thought it was ridiculous in 2010. Now
that we’re seeing this slow takeoff where we’re actually seeing generalization and intelligence,
what is like psychologically seeing this? What has that been like? Has it just been
sort of priced into your world model so it’s not new news for you? Or actually just
seeing it live, are you like “wow, something’s really changed”? What does it feel like?
For me, yes, it’s already priced into my world model of how things were going to go,
at least from the technology side. But obviously, we didn’t necessarily anticipate that the general
public would be so interested this early in the sequence. If ChatGPT and chatbots hadn’t gotten
the interest they ended up getting—which I think was quite surprising to everyone
that people were ready to use these things even though they were lacking in certain directions,
impressive though they are—then we would have produced more specialized systems built off
of the main track, like AlphaFold and AlphaGo, our scientific work. I think then the general
public maybe would have only paid attention later down the road when in a few years’ time,
we have more generally useful assistant-type systems. So that’s been interesting. That’s
created a different type of environment that we’re now all operating in as a field.
It’s a little bit more chaotic because there’s so many more things going on,
and there’s so much VC money going into it, and everyone’s sort of almost losing their minds over
it. The only thing I worry about is that I want to make sure that, as a field, we act responsibly
and thoughtfully and scientifically about this and use the scientific method to approach this
in an optimistic but careful way. I think I’ve always believed that that’s the right
approach for something like AI, and I just hope that doesn’t get lost in this huge rush.
Well, I think that’s a great place to close. Demis, thank you so much for your
time and for coming on the podcast. Thanks. It’s been a real pleasure.
استعرض المزيد من الفيديوهات ذات الصلة
Interview with Dr. Ilya Sutskever, co-founder of OPEN AI - at the Open University studios - English
In conversation with the Godfather of AI
Possible End of Humanity from AI? Geoffrey Hinton at MIT Technology Review's EmTech Digital
2 Ex-AI CEOs Debate the Future of AI w/ Emad Mostaque & Nat Friedman | EP #98
AI and Quantum Computing: Glimpsing the Near Future
Shane Legg (DeepMind Founder) - 2028 AGI, Superhuman Alignment, New Architectures
5.0 / 5 (0 votes)