No Priors Ep. 39 | With OpenAI Co-Founder & Chief Scientist Ilya Sutskever

No Priors Podcast
2 Nov 2023 · 41:58

Summary

TLDR: In this in-depth conversation, OpenAI co-founder and chief scientist Ilya Sutskever shares his perspective on the field of AI research, from the early challenges of deep learning to the vision of achieving artificial general intelligence (AGI). He recalls the marginalized history of neural networks and the research environment before AlexNet, and how OpenAI pushed the boundaries of AI through large-scale models and heavy investment in compute. Ilya discusses OpenAI's founding purpose, its shift from a nonprofit to a unique capped-profit corporate structure, and how that strategy enabled the pursuit of much larger AI projects. He also explores model reliability, the complexities of open-source AI, and the importance of superalignment, hinting at the future relationship between humanity and advanced AI. The conversation as a whole reflects the field's rapid progress and offers deep insight into where it is headed.

Takeaways

  • 🚀 OpenAI's founding grew out of deep insight into deep learning and AI, and an early recognition of the potential of neural networks.
  • 🌌 The early AI field was not held in high regard, but OpenAI's founders believed neural networks were like small brains with enormous potential.
  • 💡 Using GPUs for machine learning was a key factor in OpenAI's success, even though the exact use of GPUs was not yet clear at the time.
  • 🧠 The scale of a neural network is key to its performance; larger neural networks can do unprecedented things.
  • 📈 OpenAI's goal has always been to ensure that the development of artificial general intelligence (AGI) benefits all of humanity.
  • 🌐 OpenAI initially planned to open-source its technology, then operated as a nonprofit, and ultimately adopted a unique "capped-profit" corporate structure.
  • 🔄 From the Dota 2 project to Transformer models, OpenAI's research direction kept evolving with technological progress and market demand.
  • 📚 Early machine learning work gradually gave way to large projects such as the GPT series, which demonstrated remarkable progress.
  • 🤖 As models become more reliable, they will become more useful across more tasks and offer deeper insight.
  • 🔢 Increasing model scale not only improves performance but may also unlock new, unprecedentedly valuable applications.
  • 🔄 Although small models may suffice for specific applications, large models will deliver better performance across a broader range of applications.
  • 🌐 In the short term, open-source models help companies build useful products, but in the long term, as AI capabilities grow, the role of open source may become more complicated.

Q & A

  • What was OpenAI's founding purpose?

    - OpenAI was founded to ensure that the development of artificial intelligence benefits all of humanity. From the beginning, its goal has been to work toward autonomous systems, AI that can do most of the jobs, activities, and tasks that people do, and to make sure it benefits humanity as a whole.

  • How has OpenAI's strategy evolved?

    - OpenAI's tactics evolved from open-sourcing technology, to operating as a nonprofit, to ultimately becoming a capped-profit company, in order to meet the enormous need for compute. They realized that making real progress in AI requires a great deal of compute, which a nonprofit structure could not support.

  • Why did OpenAI choose to become a capped-profit company?

    - OpenAI converted to a capped-profit structure because it recognized that real progress in AI requires massive compute, which a nonprofit could not provide. They also reasoned that if AI could put large numbers of people out of work, the company building such technology should not be able to earn unlimited profits.

  • What was Ilya Sutskever's early intuition about neural networks?

    - His early intuition was that neural networks are like small brains; even though no mathematical theorems could be proved about them at the time, he believed these small "brains" might one day do something remarkable. He believed that training much larger neural networks could achieve unprecedented results.

  • How did OpenAI overcome the early challenges of AI research?

    - By recognizing the need for larger neural networks, having datasets large enough to constrain those large networks, and mastering the technical know-how to train them. They also realized that GPUs were an especially good fit for neural network training.

  • What is OpenAI's main research direction?

    - Continually scaling up and improving Transformer models, in particular training larger and more capable neural networks for text prediction and generation. By steadily increasing compute and data, they have achieved dramatic gains in model performance.

  • What does Ilya Sutskever consider the biggest bottleneck of current AI models?

    - Reliability. He points out that when a model handles questions that are not much harder than ones it already succeeds at, users need to have a high degree of confidence in its answers.

  • How does OpenAI view the role of open-source models?

    - In the short term, open-source models benefit companies because they allow a company to decide for itself how a model is used and which use cases to support. In the long run, as models become more capable, the role of open source will grow more complicated and will require more careful consideration.

  • How does Ilya Sutskever view the future of the Transformer architecture?

    - He believes the Transformer architecture is highly effective today and can keep making progress through continued improvement and scaling. He notes that although the human brain appears to have specialized regions, experiments show the brain is highly plastic, which supports the idea of a single, unified neural network architecture.

  • How does Ilya Sutskever think about superintelligence and superalignment?

    - He describes superintelligence as data centers smarter than people, able to do everything humans can do and to learn faster. He stresses that we need to ensure such superintelligences hold warm, positive feelings toward humanity. That is the goal of the superalignment project: to start, before superintelligence arrives, building the science of controlling it and ensuring it is friendly to people.

  • How does OpenAI see the future development of AI?

    - AI is currently in an acceleration phase, but multiple forces may shape its pace going forward, including cost, data scale, engineering complexity, the level of investment, and the interest of engineers and scientists. Despite some decelerating factors, they expect AI progress to continue.

Outlines

00:00

🚀 Startup beginnings and AI's dark ages

This segment covers OpenAI's early days, when deep learning had produced few results and the entire AI field was in its so-called "dark ages." Against that backdrop, co-founder Ilya Sutskever committed himself to neural network research, even though the field was out of favor. He discusses the adoption of GPUs in machine learning and how scaling up neural networks made unprecedented results possible.

05:01

🌟 From nonprofit to capped-profit company

Ilya Sutskever explains why OpenAI converted from a nonprofit into a capped-profit company. Real progress in AI, he notes, requires enormous compute, which a nonprofit could not support. OpenAI therefore adopted a unique structure: investors put in money, but even if the company performs extremely well, they receive at most a fixed multiple of their investment rather than unlimited profits. The structure is meant to ensure that AI development benefits all of humanity.

10:01

🧠 The evolving research agenda and Transformer models

Ilya Sutskever discusses how OpenAI's research agenda evolved over time. Initially the team did more conventional machine learning work, but they soon realized that reaching AGI (artificial general intelligence) would require much larger projects. They explored generative models and eventually recognized the enormous potential of the Transformer. From GPT-1 to GPT-3, model capabilities improved dramatically, marking major progress in OpenAI's work on large neural networks.

15:03

🤖 AI reliability and model scale

Ilya Sutskever stresses that reliability is essential to the usefulness of AI models. Reliability improves with model scale, but larger models also bring higher costs. He discusses the trade-off between model size and specific use cases, how fine-tuning can improve performance on particular tasks, and how, as models keep improving, they will become more reliable and able to handle a broader range of tasks.

20:05

🌐 The role of open source in the AI ecosystem

Ilya Sutskever explores the role of open source in the AI ecosystem. In the near term, open-source models will help companies build useful products. As models become more capable, however, open-sourcing them could have unpredictable consequences. He proposes a research project for the future: determining at what point a model's capabilities are strong enough that its potential impact must be taken into account.

25:06

🧬 Parallels between artificial and biological intelligence

Ilya Sutskever discusses the parallels among artificial, biological, and human intelligence. Just as biological systems tend to reuse a uniform architecture, AI appears to be converging on unified neural network architectures. He also cites experiments on the human brain showing that brain regions can reconfigure themselves to handle different tasks, further supporting the suitability of a single, uniform architecture for AI.

30:07

🌟 Superintelligence and superalignment

Ilya Sutskever discusses the concept of superintelligence and why we should invest in superalignment research now. As AI continues to advance, within the next five to ten years we may have data centers that are smarter than people. He stresses that we want these systems to hold positive feelings toward humanity; the goal of the superalignment project is to ensure that future superintelligence is friendly to society and to people.

35:08

🚀 AI's future acceleration and challenges

Ilya Sutskever discusses possible trajectories for AI, including acceleration and challenges. Despite some decelerating factors, such as finite data and growing engineering complexity, accelerating factors, such as rising investment, the interest of engineers and scientists, and the relative accessibility of the field, may keep AI advancing rapidly. He stresses that while the path ahead is not yet clear, AI progress may well keep accelerating, at least in the near term.

40:10

🎙️ Wrap-up and thanks

The hosts thank Ilya Sutskever for joining, and invite listeners to follow the show on Twitter, subscribe to the YouTube channel, and listen on Apple Podcasts, Spotify, and other platforms. Listeners can also sign up for email updates or find transcripts of every episode at no-priors.com.


Keywords

💡Artificial Intelligence

Artificial intelligence (AI) refers to intelligent behavior exhibited by man-made systems. The video discusses the development of AI, especially advances in deep learning and neural networks, and how they are leading us toward artificial general intelligence (AGI).

💡Deep Learning

Deep learning is a subfield of machine learning that learns representations and functions of data using architectures inspired by the brain's neural networks. The video notes that in deep learning's early days, people held little hope for it, but with the use of GPUs and the scaling up of neural networks, deep learning began to show its power.

💡Neural Networks

A neural network is a computational model inspired by the way neurons in the brain connect, used to recognize patterns and process complex data. The video traces the evolution of neural networks, particularly the shift from small networks to large, deep ones, and how these networks changed the direction of AI research.

💡Artificial General Intelligence

Artificial general intelligence (AGI) refers to an AI system capable of performing any task an intelligent being can perform. The video discusses OpenAI's goal of ensuring that AGI benefits all of humanity, and explores the challenges and limits that may stand in the way of achieving it.

💡Transformer Models

The Transformer is a deep learning model based on the self-attention mechanism that has achieved great success on sequence data, especially in natural language processing. The video covers the development of the Transformer and how it became the mainstream direction of current AI research.

💡Compute Resources

Compute resources are the hardware and software used to carry out computational tasks, including CPUs, GPUs, storage, and networking. The video highlights their importance to AI research, especially the massive computing power needed to train large neural networks.

💡Datasets

A dataset is a collection of data used to train machine learning models. In AI research, large datasets are essential for training accurate models. The video discusses how the size and quality of a dataset affect neural network training, and how increasing dataset scale improves model performance.

💡Autonomy

Autonomy is the ability of an individual or system to make decisions and act independently, without external intervention. In the video, autonomy describes a property future AI systems may possess: independence in decision-making and action.

💡Open Source

Open source means that a software or technology project's source code is publicly available, allowing anyone to freely use, modify, and distribute it. The video discusses the role of open source in AI, including promoting sharing and innovation, and its possible implications for AI's future development.

💡Superalignment

Superalignment means ensuring that future advanced AI systems remain aligned with human values and goals, so that they are friendly to humans and promote human well-being. The video mentions the superalignment project, which aims to research how to instill positive attitudes toward humanity in future highly intelligent AI systems.

Highlights

In the AI 'Dark Ages' before deep learning's success, various approaches competed without hope until the neural network approach, considered marginal for its lack of provable theorems, began to show promise due to its brain-like processing.

The breakthrough with AlexNet was attributed to three factors: the advent of GPUs in machine learning, the realization that larger neural networks could perform unprecedented tasks, and the technical insight to effectively train these larger networks.

OpenAI's shift from a nonprofit to a 'capped profit' model was driven by the realization that significant compute resources were essential for AI progress, which a nonprofit model could not support.

The goal of OpenAI, to ensure AI benefits all of humanity, has remained constant, although the tactics to achieve this goal have evolved over time.

The transition from GPT-2 to GPT-3 was a significant leap that demonstrated the potential of scaling up Transformer-based models, leading to emergent capabilities.

Improvements in AI models have made them more reliable and insightful, with the capability to understand and generate human-like text improving significantly over time.

Reliability remains the biggest challenge for AI models, highlighting the need for models to consistently perform well across a wide range of tasks without making significant errors.

The current acceleration in AI progress is driven by various factors, including the scalability of Transformer architectures and the growing investment and interest in AI research.

The future of AI, potentially leading to AGI, involves significant challenges and uncertainties, including the need for super alignment to ensure AI's pro-social behavior towards humanity.

OpenAI's exploration into super alignment is motivated by the anticipation of AI systems becoming highly autonomous and capable, raising the need to ensure they are aligned with human values.

The potential for AI models to autonomously execute complex tasks and projects raises ethical and societal questions about their role and impact on the future.

OpenAI's research agenda has evolved from focusing on conventional machine learning projects to pioneering large-scale, transformative projects like GPT, highlighting the organization's strategic pivot towards scaling up AI models.

The debate around open-source AI models and their role in the AI ecosystem reflects the evolving nature of AI development, with considerations around autonomy, capability, and societal impact becoming increasingly important.

The interview emphasizes the importance of envisioning the future of AI, advocating for a proactive approach in research and policy to navigate the challenges and opportunities of increasingly capable AI systems.

The conversation highlights a cautious optimism towards the progress of AI, recognizing both the technological potential and the complex ethical considerations that accompany the development of advanced AI systems.

Transcripts

00:00

[Music]

00:05

OpenAI, a company that we all know now but that only a year ago was 100 people, is changing the world. Their research is leading the charge to AGI. Since ChatGPT captured consumer attention last November, they show no signs of slowing down. This week Elad and I sit down with Ilya Sutskever, co-founder and chief scientist at OpenAI, to discuss the state of AI research, where we'll hit limits, the future of AGI, and what it's going to take to reach superalignment. Ilya, welcome to No Priors.

Thank you, it's good to be here.

00:37

Let's start with the beginning. Pre-AlexNet, nothing in deep learning was really working, and given that environment, you guys took a very unique bet. What motivated you to go in this direction?

00:47

Indeed, in those dark ages AI was not an area where people had hope, and people were not accustomed to any kind of success at all. Because there hadn't been any success, there was a lot of debate, and there were different schools of thought with different arguments about how machine learning and AI should be. You had people who were into knowledge representation from good old-fashioned AI, you had people who were Bayesian and liked Bayesian non-parametric methods, you had people who liked graphical models, and you had the people who liked neural networks. Those people were marginalized, because neural networks had the property that you can't prove mathematical theorems about them, and if you can't prove theorems about something, it means that your research isn't good. That's how it had been. But the reason I gravitated to neural networks from the beginning is that it felt like those were small little brains, and who cares if you can't prove any theorems about them, because we were training small little brains, and maybe they'd do something one day.

01:53

The reason we were able to do AlexNet is a combination of three factors. The first factor is that this was shortly after GPUs started to be used in machine learning. People had an intuition that this was a good thing to do, but it wasn't like today, where people know exactly what GPUs are for; it was more like, let's play with those cool, fast computers and see what we can do with them. They were an especially good fit for neural networks, so that definitely helped us. I was also very fortunate in being able to realize that the reason the neural networks of the time weren't good is that they were too small. If you try to solve a vision task with a neural network that has a thousand neurons, what can it do? It can't do anything, no matter how good your learning is and everything else. But if you have a much larger neural network, you'll do something unprecedented.

02:53

What gave you the intuition to think that was the case? At the time it was reasonably contrarian to think that, even though, to your point, the human brain, and biological neural circuits generally, in some sense work that way. What gave you that intuition early on that this was a good direction?

03:10

Looking at the brain. All of those things follow very easily if you allow yourself to accept an idea that is reasonably well accepted now, but that back then people still talked about and hadn't really internalized: that an artificial neuron is, in some sense, not that different from a biological neuron. So whatever you imagine animals do with their brains, you could perhaps assemble an artificial neural network of similar size, and maybe if you train it, it will do something similar. That leads you to start to imagine the computation being done by the neural network. You can almost think: if you have a high-resolution image and one neuron for a large group of pixels, what can that neuron do? Not much. But if you have a lot of neurons, then they can actually do something and compute something. So it was considerations like this, plus a technical realization: if you have a large training set that specifies the behavior of the neural network, and the training set is large enough to constrain the large neural network sufficiently, and furthermore you have the algorithm to find that neural network (because what we do is turn the training set into a neural network that satisfies the training set), then neural network training can almost be seen as solving a neural equation, where every data point is an equation and every parameter is a variable.
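A minimal formalization of this "neural equation" picture (my paraphrase of the analogy, not notation used in the episode): each training pair imposes one constraint on the parameters, and gradient descent acts as the solver.

\[
f(x_i;\,\theta) \approx y_i \quad \text{for } i = 1, \dots, N
\]

\[
\theta \;\leftarrow\; \theta \;-\; \eta\, \nabla_\theta\, \frac{1}{N} \sum_{i=1}^{N} \ell\bigl(f(x_i;\theta),\, y_i\bigr)
\]

Each data point \((x_i, y_i)\) plays the role of an equation and each entry of \(\theta\) the role of a variable; only when both the network and the dataset are large is the system constrained enough for the solution to do something interesting.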

05:07

And so it was multiple things: the realization that a bigger neural network could do something unprecedented, and the realization that if you have a large dataset, together with the compute, you can solve the neural equation. That's where gradient descent comes in, but it's not just gradient descent; gradient descent had been around for a long time. It was certain technical insights about how to make it work, because back then the prevailing belief was that you can't train those neural nets at all, it's all hopeless. So it wasn't just about the size. Even if someone did think, gosh, it would be cool to try a big neural net, they didn't have the technical ability to turn the idea into reality. You needed not only to code the neural net, you needed to do a bunch of things right, and only then would it work. And then another fortunate thing is that the person I worked with, Alex Krizhevsky, had discovered that he really loved GPUs, and he was perhaps one of the first people who really mastered writing truly performant code for GPUs. That's why we were able to squeeze a lot of performance out of two GPUs and produce something unprecedented.

06:14

So to sum up, it was multiple things: the idea that a big neural network, in this case a vision neural network, a convolutional neural network with many layers, one much, much bigger than anything that had ever been done before, could do something very unprecedented, because the brain can see, and the brain is a large neural network, and we can see quickly, so our neurons don't have a lot of time; then the compute needed; and the technical know-how that we could in fact train such neural networks, which was not at all widely distributed. Most people in machine learning would not have been able to train such a neural network even if they had wanted to.

06:51

Did you guys have any particular goal from a size perspective? Was it biologically inspired, or where did that number come from, or was it just as large as you could go?

07:03

Definitely as large as we could go, because, keep in mind, we had a certain amount of compute which we could usefully consume.

07:12

Maybe if we think about the origin of OpenAI and the goals of the organization: what was the original goal, and how has that evolved over time?

07:23

The goal did not evolve over time; the tactics evolved over time. The goal of OpenAI from the very beginning has been to make sure that artificial general intelligence, by which we mean autonomous systems, AI that can actually do most of the jobs and activities and tasks that people do, benefits all of humanity. That was the goal from the beginning. The initial thinking was that maybe the best way to do it is by just open-sourcing a lot of technology, and we also attempted to do it as a nonprofit. That seemed very sensible: this is the goal, a nonprofit is the way to do it. What changed? At some point at OpenAI we realized, and we were perhaps among the earliest to realize, that to make progress in AI for real, you need a lot of compute. Now, what does "a lot" mean? The appetite for compute is truly endless, as is now clearly seen, but we realized that we would need a lot, and a nonprofit wouldn't be the way to get there; you wouldn't be able to build a large cluster with a nonprofit. That's why we converted into this unusual structure called capped-profit, and to my knowledge we are the only capped-profit company in the world. The idea is that investors put in some money, but even if the company does incredibly well, they don't get more than some multiplier on top of their original investment. The reason to do this (there are arguments one could make against it as well) is that if you believe the technology we are building, AGI, could potentially be so capable as to do every single task that people do, does that mean it might unemploy everyone? Well, I don't know, but it's not impossible. And if that's the case, it would make a lot of sense for the company that built such a technology not to be incentivized to make infinite profits. I don't know if it will literally play out this way, because of competition in AI; there will be multiple companies, and I think that will have some unforeseen implications for the argument I'm making. But that was the thinking.
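A toy illustration of the capped-return mechanics described above; the cap multiple and the dollar figures here are hypothetical, not OpenAI's actual terms.

```python
def capped_payout(investment: float, cap_multiple: float, uncapped_value: float) -> float:
    """Payout to an investor under a capped-profit structure.

    investment:     amount originally invested
    cap_multiple:   maximum allowed multiple of the investment (hypothetical)
    uncapped_value: what the stake would be worth if returns were uncapped
    """
    # The investor receives the smaller of the two values: the cap only
    # binds when the company does extremely well.
    return min(uncapped_value, cap_multiple * investment)

# With a hypothetical 100x cap, a $1M stake pays out at most $100M,
# even if it would otherwise be worth $1B; as the structure has been
# publicly described, returns above the cap accrue to the nonprofit.
print(capped_payout(1_000_000, 100, 1_000_000_000))  # -> 100000000
```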

09:59

I remember visiting the offices back when you were, I think, housed at YC, or cohabited some space there, and at the time there was a suite of different efforts: there were robotic arms being manipulated, and there was some video-game-related work that was really cutting edge. How did you think about how the research agenda evolved, and what really drove it down this path of Transformer-based models and other forms of learning?

10:26

Our thinking has been evolving over the years since we started OpenAI, and in the first year we indeed did some of the more conventional machine learning work. I say "conventional" because the world has changed so much; a lot of things that were known to everyone in 2016 or 2017 are completely and utterly forgotten. It's almost like the Stone Age. In that Stone Age, the world of machine learning looked very different. It was dramatically more academic: the goals, values, and objectives were much more academic, about discovering small bits of knowledge, sharing them with other researchers, and getting scientific recognition as a result. It's a very valid goal and very understandable; I've been doing AI for 20 years now, and more than half of my time in AI was spent in that framework. So what do you do? You write papers, you share your small discoveries. There were two realizations. The first is that, at a high level, this doesn't seem like the way to go for dramatic impact. Why? Because if you imagine what an AGI should look like, it has to be some kind of big engineering project that's using a lot of compute. Even if you don't know how to build it, you know that this is the ideal you want to strive towards, so you want to somehow move towards larger projects as opposed to small projects. So we attempted a first large project, where we trained a neural network to play a real-time strategy game as well as the best humans: the Dota 2 project. It was driven by two people, Jakub Pachocki and Greg Brockman; they really drove this project and made it a success. It was our first attempt at a large project, but it wasn't quite the right formula for us, because the neural networks were a little bit too small, and it was just a narrow domain, just a game. I mean, it's cool to play a game. We kept looking, and at some point we realized that if you train a large neural network, a very, very large Transformer, to predict text better and better, something very surprising will happen. This realization arrived a little bit gradually: we were exploring generative models, we were exploring ideas around next-word prediction, ideas also related to compression. Then the Transformer came out, and we got really excited: this is the greatest thing, we're going to do Transformers, it's clearly superior to anything else before it. So we started doing Transformers. We did GPT-1, which started to show very interesting signs of life, and that led us to GPT-2 and then ultimately GPT-3. GPT-3 really opened everyone else's eyes to the fact that this thing has a lot of traction. There is one specific formula right now that everyone is following, and that formula is: train a larger and larger Transformer on more and more data.
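The "predict text better and better" objective referred to here is standard next-token prediction. Written out (my notation, not something spelled out in the episode), the training loss for a token sequence \(x_1, \dots, x_T\) is the autoregressive log-loss:

\[
\mathcal{L}(\theta) \;=\; -\sum_{t=1}^{T} \log p_\theta\bigl(x_t \mid x_1, \dots, x_{t-1}\bigr)
\]

The "formula" is then simply to grow the model and the corpus while minimizing this one objective.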

13:25

For me, to your point, the big wake-up moment was the GPT-2 to GPT-3 transition, where you saw such a big step function in capabilities. And then obviously with four, OpenAI published some really interesting research around the different domains of knowledge, domains of expertise, chain of thought, and other things that the models can suddenly do in an emergent form. What was the most surprising thing for you in terms of emergent behavior in these models over time?

13:49

You know, it's very hard to answer that question, because I'm too close and I've seen it progress every step of the way. So as much as I'd like to, I find it very hard to answer. If I had to pick one, maybe the most surprising thing for me is that the whole thing works at all. I'm not sure I know how to convey what I have in mind here, because if you see a lot of neural networks doing amazing things, well, obviously neural networks are the thing that works. But I have witnessed personally what it's like to be in a world, for many years, where neural networks did not work at all, and to contrast that with where we are today, just the fact that they work and do these amazing things. If I had to pick one thing, it would be the fact that when I speak to it, I feel understood.

14:48

There's a really good saying, I'm trying to remember, maybe from Arthur C. Clarke or one of the sci-fi authors, which says, effectively, that advanced technology is sometimes indistinguishable from magic.

15:00

Yeah, I'm fully in that camp.

15:03

It definitely feels like there are some magical moments with some of these models. Now, is there a way that you decide internally, given all of the different capabilities you could pursue, how to continually choose the set of big projects? You've described that centralization and committing to certain research directions at scale is really important to OpenAI's success. Given the breadth of opportunity now, what's the process for deciding what's worth working on?

15:33

I think there is some combination of bottom-up and top-down. We have some top-down ideas that we believe should work, but we're not 100% sure, so we still need to have good top-down ideas, and there is a lot of bottom-up exploration guided by those top-down ideas as well. Their combination is what informs us as to what to do next.

15:54

If you think about those ideas, in either direction, top-down or bottom-up: clearly we have this dominant continue-to-scale-Transformers direction. Do you explore additional architectural directions, or is that just not relevant?

16:12

It's certainly possible that various improvements can be found. I think improvements can be found in all kinds of places, both small improvements and large improvements. The way to think about it is that the current thing being done keeps getting better as you keep increasing the amount of compute and data that you put into it; we have that property, the bigger you make it, the better it gets. It also has the property that different things get better by different amounts as you keep scaling them up. So not only do you of course want to scale up what we're doing, you also want to keep scaling up the best thing possible.
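The "bigger is better" property described here is what the empirical scaling-law literature quantifies. As background context (not a formula cited in the episode), a commonly fitted form expresses test loss as a power law in training compute:

\[
L(C) \;\approx\; L_\infty \;+\; \left(\frac{C_0}{C}\right)^{\alpha}
\]

where \(C\) is training compute, \(L_\infty\) an irreducible loss, and \(C_0, \alpha\) fitted constants. That different capabilities "get better by different amounts" corresponds to different tasks sitting on differently shaped curves of this kind.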

16:58

What do you think, and you probably don't need to predict, because you can see internally, is improving most from a capability perspective in the current generation of scale?

17:08

The best way for me to answer this question is to point to the models that are publicly available. You can see how they compare from this year to last year, and the difference is quite significant. Look at the difference between GPT-3 and GPT-3.5, and then ChatGPT, GPT-4, and GPT-4 with vision, and you can just see for yourself. It's easy to forget where things used to be, but certainly the big way in which things are changing is that these models become more and more reliable. Before, they were only very partly there; right now they are mostly there, but there are still gaps. In the future, perhaps, these models will be there even more: you could trust their answers, they'll be more reliable, and they'll be able to do more tasks, in general, across the board. And then another thing they will do is have deeper insight. As we train them, they gain more and more insight into the true nature of the human world, and their insight will continue to deepen.

18:17

I was just going to ask how that relates to model scale over time, because a lot of people are really struck by the capabilities of the very-large-scale models and their emergent behavior in terms of understanding the world. In parallel, as people incorporate some of these things into products, which is a very different type of path, they often start worrying about inference costs going up with the scale of the model, and therefore they're looking for smaller models that are fine-tuned, but then of course you may lose some of the capabilities around insight and the ability to reason. So I was curious about your thinking on how all this evolves over the coming years.

18:53

I would actually point out that the main thing that's lost when you switch to the smaller models is reliability. I would argue that at this point it is reliability that's the biggest bottleneck to these models being truly useful.

19:06

How are you defining reliability?

19:09

It's that when you ask a question that's not much harder than other questions the model succeeds at, you'll have a very high degree of confidence that it will continue to succeed. I'll give you an example. Let's suppose I want to learn about some historical topic, and I ask, tell me, what is the prevailing opinion about this, and about that, and I keep asking questions, and let's suppose it answered 20 of my questions correctly. I really don't want the 21st answer to contain a gross mistake. That's what I mean by reliability. Or let's suppose I upload some documents, some financial documents, and I say, do some analysis and reach some conclusion, and I want to take action on the basis of that conclusion. It's not a super hard task, and these models clearly succeed on this task most of the time. But because they don't succeed all the time, and if it's a consequential decision, I actually can't trust the model any of those times, and I have to verify the answer somehow. That's how I define reliability. It's very similar to the self-driving situation: if you have a self-driving car and it does things mostly well, that's not good enough. The situation is not as extreme as with a self-driving car, but that's what I mean by reliability.
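A back-of-the-envelope way to see why per-answer reliability compounds (my illustration, not a calculation from the episode): if each answer is independently correct with probability \(p\), then

\[
\Pr(\text{all } n \text{ answers correct}) \;=\; p^{\,n}, \qquad \text{e.g. } 0.99^{21} \approx 0.81,
\]

so even a model that is 99% reliable per question leaves roughly a one-in-five chance of at least one gross mistake somewhere in a 21-question session, which is why consequential use still demands verification.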

20:22

My perception of reliability is that, to your point, it goes up with model scale, but it also goes up if you tune for specific use cases or datasets, and so there is that trade-off between size, specialized fine-tuning, and reliability.

20:40

Certainly people who care about some specific application have every incentive to get the smallest model working well enough. I think that's true, it's undeniable: anyone who cares about a specific application will want the smallest model for it, that's self-evident. I do think, though, that as models continue to get larger and better, they will unlock new and unprecedentedly valuable applications. So yes, the small models will have their niche for the less interesting applications, which are still very useful, and then the bigger models will be delivering on the new applications. Let's pick an example: consider the task of producing good legal advice. It's really valuable if you can really trust the answer. Maybe you need a much bigger model for it, but it justifies the cost.

21:27

There's been a lot of investment this year at the 7B size in particular, but also 13B and 34B. Do you think continued research at those scales is wasted?

21:44

No, of course not. I think that in the medium term, medium term by AI timescales anyway, there will be an ecosystem, and there will be different uses for different model sizes. There will be plenty of people who are very excited, for whom the best 7B model is good enough, and they'll be very happy with it. And then there will be plenty of very, very exciting and amazing applications for which it won't be enough. I think that's all it is: the big models will be better than the small models, but not all applications will justify the cost of a large model.

22:24

What do you think the role of open source is in this ecosystem?

22:27

Well, open source is complicated. I'll describe my mental picture. I think that in the near term, open source is just helping companies produce useful products. Let's see: why would one want to use an open-source model instead of a closed-source model that's hosted by some other company? I think it's very valid to want to be the final decider on the exact way in which you want your model to be used, to make the decision of exactly how you want the model to be used and which use cases you wish to support. I think there's going to be a lot of demand for open-source models, and I think quite a few companies will use them; I'd imagine that will be the case in the near term. In the long run, I think the situation with open-source models will become more complicated, and I'm not sure what the right answer is there. Right now it's a little bit difficult to imagine, so we need to put on our futurist hat. It's not too hard to get into a sci-fi mode when you remember that we are talking to computers and they understand us. But so far these models are actually not very competent; they can't do tasks at all. I do think there will come a day when the level of capability of the models will be very high. At the end of the day, intelligence is power. Right now the main impact of these models, or at least the most popular impact, is primarily around entertainment and simple question answering: you talk to a model, it's so cool, you produce some images, you have a conversation, maybe you have some question it can answer. But that's very different from completing some large and complicated task. What about a model that could autonomously start and build a large tech company? I think if those models were open source, they would have difficult-to-predict consequences. We are quite far from such models right now, quite far by AI timescales, but the day will come when you have models that can do science autonomously, that can deliver on big science projects, and then it becomes more complicated whether it is desirable that models of such power should be open-sourced. I think the argument there is a lot less clear-cut, a lot less straightforward, compared to the current level of models, which are very useful; and I think it's fantastic that the current level of models has been built. So maybe I answered a slightly bigger question than what the role of open-source models is: what's the deal with open source? And the deal is, up to a certain capability it's great, but it's not difficult to imagine a sufficiently powerful model being built for which the benefits of open-sourcing become a lot less obvious.

25:28

Is there a signal for you that we've reached that level, or that we're approaching it? What's the boundary?

25:35

I think figuring out this boundary very well is an urgent research project. One of the things that helps is that the closed-source models are more capable than the open-source models, so the closed-source models can be studied first. You'd have some experience with a generation of closed-source models, and then you'd know: oh, these models' capabilities are fine, there's no big deal there; and then in a couple of years the open-source models catch up. Maybe a day will come when we'll say, whoa, these closed-source models are getting a little too drastic, and then some other approach is needed.

26:15

If we have our futurist hat on, let's think about a several-year timeline: what are the limits you see, if any, in the near term in scaling? Is it data and token scarcity, cost of compute, architectural issues?

26:33

The most near-term limit to scaling is obviously data. This is well known, and some research is required to address it. Without going into the details, I'll just say that the data limit can be overcome, and progress will continue.

26:51

One question I've heard people debate a little bit is the degree to which Transformer-based models can be applied to the full set of areas that you'd need for AGI. If you look at the human brain, for example, you do have reasonably specialized systems, or what appear to be specialized systems, for the visual cortex versus areas of higher thought, areas for empathy, or other aspects of everything from personality to processing. Do you think that Transformer architectures are the main thing that will just keep going and get us there, or do you think we'll need other architectures over time?

27:24

I understand precisely what you're saying, and I have two answers to this question. The first is that, in my opinion, the best way to think about the question of architecture is not in terms of a binary "is it enough", but in terms of how much effort, what the cost will be, of using this particular architecture. At this point I don't think anyone doubts that the Transformer architecture can do amazing things, but maybe something else, maybe some modification, could have some compute-efficiency benefits. So it's better to think about it in terms of compute efficiency rather than in terms of whether it can get there at all; I think at this point the answer is obviously yes. As to the question about the human brain, with its brain regions: I actually think the situation there is subtle and deceptive, for the following reasons. What I believe you alluded to is the fact that the human brain has known regions: it has a speech perception region, a speech production region, an image region, a face region, all these regions, and it looks like it's specialized. But you know what's interesting? Sometimes there are cases where very young children have severe epilepsy, and the only way doctors can figure out how to treat such children is by removing half of their brain. Because it happened at such a young age, these children grow up to be pretty functional adults, and they have all the same brain regions, somehow compressed onto one hemisphere. Maybe some information-processing efficiency is lost, and it's a very traumatic thing to experience, but somehow all those brain regions rearrange themselves. There is another experiment, done maybe 30 or 40 years ago on ferrets; the ferret is a small animal, and it's a pretty mean experiment. They took the optic nerve of the ferret, which comes from its eye, and attached it to its auditory cortex, so the inputs from the eye started to map to the sound-processing area of the brain. Then, after the ferret had had a few days of learning to see, they recorded from different neurons, and they found neurons in the auditory cortex that were very similar to those in the visual cortex, or vice versa; either they mapped the eye to the auditory cortex or the ear to the visual cortex, but something like this happened. These are fairly well-known ideas in AI: the cortex of humans and animals is extremely uniform, and that further supports the idea that you just need one big, uniform architecture.

30:07

In general, it seems like every biological system is reasonably lazy in terms of taking one system, reproducing it, and reusing it in different ways. That's true of everything from DNA encoding (there are 20 amino acids in protein sequences, and everything is made out of the same 20 amino acids) through to, to your point, how you think about tissue architectures. It's remarkable that that carries over into the digital world as well.

30:29

The way I see it, this is an indication that, from a technological point of view, we are very much on the right track, because you have all these interesting analogies between human intelligence, biological intelligence, and artificial intelligence: artificial neurons and biological neurons; a unified brain architecture for biological intelligence, and a unified neural network architecture for artificial intelligence.

30:51

At what point do you think we should start thinking about these systems as digital life?

30:56

I can answer that question. I think that will happen when those systems become reliable in such a way as to be very autonomous. Right now those systems are clearly not autonomous. They're inching there, but they're not, and that makes them a lot less useful too, because you can't ask one, hey, do my homework, or do my taxes; you see what I mean. So the usefulness is greatly limited. As the usefulness increases, they will indeed become more like artificial life, which, I would argue, also makes it more trepidatious: if you imagine actual artificial life with brains that are smarter than humans, gosh, that seems pretty monumental.

31:38

Why is your definition based on autonomy? If you look at definitions of biological life, they tend to involve reproductive capability plus, I guess, some form of autonomy: a virus isn't really considered alive much of the time, but a bacterium is. And you could imagine symbiotic relationships or other situations where something can't quite function autonomously but is still considered a life form. So I'm a little bit curious about autonomy being the definition versus some of these other aspects.

32:07

Well, definitions are chosen for our convenience, and it's a matter of debate. In my opinion, technology already has the reproductive function. I don't know if you've seen those images of the evolution of cell phones and then smartphones over the past 25 years; you get what almost looks like an evolutionary tree, or the evolution of cars over the past century. Technology is already reproducing, using the minds of people who copy ideas from the previous generation of technology. So I claim that the reproduction is already there; the autonomy piece, I claim, is not. And indeed, I agree that there is no autonomous reproduction. But can you imagine autonomously reproducing AIs? I actually think that is pretty dramatic, and I would say quite a scary thing, if you have an autonomously reproducing AI that is also very capable.

32:59

Should we talk about superalignment?

Yes, very much so.

33:05

Can you define it? We were talking about what the boundary is for when you feel we need to begin to worry about these capabilities being in open source. What is superalignment, and why invest in it now?

33:19

The answer to your question really depends on where you think AI is headed. You just try to imagine, look into the future, which is of course a very difficult thing to do, but let's try to do it anyway. Where do we think things will be in five years, or in ten years? Progress has been really stunning over the past few years; maybe it will be a little bit slower, but still, if you extrapolate this kind of progress, you'll be in a very, very different place in five years, let alone ten. It doesn't seem at all implausible that we will have computers, data centers, that are much smarter than people. And by smarter I don't mean just having more memory or more knowledge, but having deeper insight into the same subjects that we people are studying and looking into; it means learning even faster than people. What could such AIs do? I don't know. Certainly, if such an AI were the basis of some artificial life: how do you even think about it, if you have some very powerful data center that's also alive, in a sense? That's what you're talking about. When I imagine this world, my reaction is, gosh, what's going to happen is very unpredictable. But there is a bare minimum which we can articulate: if such very, very intelligent, superintelligent data centers are being built at all, we want those data centers to have warm and positive feelings towards people, towards humanity, because this is going to be non-human life, in a sense; potentially, it could be that. So I would want any instance of such superintelligence to have warm feelings towards humanity. And this is what we're doing with the superalignment project. We're saying: if you just allow yourself to accept that the progress we've seen will continue, maybe slower, but it will continue, then you can start doing productive work today to build the science, so that we will be able to handle the problem of controlling such future superintelligences, of imprinting onto them a strong desire to be nice and kind to people. Those data centers will be really quite powerful. There will probably be many of them, and the world will be very complicated. But somehow, to the extent that they are autonomous, to the extent that they are agents, to the extent that they are beings, I want them to be pro-social, pro-human. That's the goal.

36:22

What do you think is the likelihood of that goal? Some of it feels like an outcome you can hopefully affect, but are we likely to have pro-social AIs that we are friends with, individually, or, you know, as a species?

36:44

Well, the friendship piece, I think, is optional; I think that part is not necessary. But I do think that we want to have very pro-social AI. I think it's possible. I don't think it's guaranteed, but I think it's going to be possible, and the possibility of it will increase insofar as more and more people allow themselves to look into the future, the five-to-ten-year future, and just ask themselves: what do you expect AI to be able to do then? How capable do you expect it to be then? I think that with each passing year, if indeed AI continues to improve, and as people get to experience it (because right now we're making arguments, but if you actually get to experience it: oh gosh, the AI from last year, which was really helpful, this year puts the previous one to shame; and then one year later it's starting to do science, the AI software engineer is starting to get really quite good), you create a lot more desire in people for what you just described, for the future superintelligence to indeed be very pro-social. There are going to be a lot of disagreements and a lot of political questions, but I think that as people see AI actually getting better, as people experience it, the desire for the pro-social superintelligence, the humanity-loving superintelligence, as much as that can be done, will increase. And on the scientific problem: right now it's still an area where not that many people are working, but AIs are getting powerful enough that you can really start studying it productively. We'll have some very exciting research to share soon. But I would say that's the big-picture situation here. It really boils down to: look at what you've experienced with AI up until now, and ask yourself, is it slowing down? Will it slow down next year? We will see, and we'll experience it again and again, and I think what needs to be done will keep becoming clearer.

39:01

Do you think we're just on an accelerative path? Because fundamentally, if you look at certain technology waves, they tend to inflect and then accelerate versus decelerate, and it really feels like we're in an acceleration phase right now versus a deceleration phase.

39:15

Yeah, it is indeed the case that we are right now in an acceleration phase. But it's hard to say; multiple forces will come into play. Some forces are accelerating forces, and some forces are decelerating ones. For example, cost and scale are a decelerating force. The fact that our data is finite is a decelerating force, to some degree at least; I don't want to overstate it.

39:42

Yeah, it's kind of within an asymptote, right? At some point you hit it, but it's the standard S-curve, or sigmoid.

39:48

Well, with the data in particular, I just think it won't be an issue, because we'll figure out something else. But then you might argue that the size of the engineering project is a decelerating force, just the complexity of management. On the other hand, the amount of investment is an accelerating force, and the amount of interest from people, from engineers and scientists, is an accelerating force. And I think there is one other accelerating force, and that is the fact that biological evolution has been able to figure it out, and the fact that, up until this point, progress in AI has had this weird property that it's been very hard to execute on, but in some sense it's also been more straightforward than one would have expected. I don't know much physics, but my understanding is that if you want to make progress in quantum physics or something, you need to be really intelligent and spend many years in grad school studying how these things work, whereas with AI, people come in, get up to speed quickly, and start making contributions quickly. The flavor is somehow different; there's a lot of give to this particular area of research, and I think this is also an accelerating force. How it will all play out remains to be seen. It may be that the scale required and the engineering complexity will start to slow the rate of progress; it will still continue, but maybe not as quickly as before. Or maybe the forces coming together to push it will be such that it stays this fast for a few more years before it starts to slow down, if at all. That would be my articulation here.

41:32

Ilya, this has been a great conversation. Thanks for joining us.

Thank you so much for the conversation, I really enjoyed it.

41:37

Find us on Twitter at @NoPriorsPod. Subscribe to our YouTube channel if you want to see our faces. Follow the show on Apple Podcasts, Spotify, or wherever you listen; that way you get a new episode every week. And sign up for emails or find transcripts for every episode at no-priors.com.


Related Tags
Artificial intelligence, OpenAI, Ilya Sutskever, AGI, technological progress, future outlook, superintelligence, societal impact, deep learning, neural networks