Geoffrey Hinton 2023 Arthur Miller Lecture in Science and Ethics

MIT STS Program
15 Dec 2023 (73:08)

Summary

TLDR: In the Miller Lecture in Science and Ethics held at MIT, Geoffrey Hinton, Professor Emeritus of Computer Science at the University of Toronto, gave an in-depth talk on the development and potential risks of artificial intelligence. Hinton, one of the pioneers of deep learning, discussed how large language models work and whether they genuinely understand what they output. He contrasted digital with analog computation and described the advantages of digital models in knowledge sharing, and of analog computation in energy efficiency. He also expressed concern about the pace of AI development, predicting that superintelligence surpassing humans could emerge within 5 to 20 years, and explored the ethical and social challenges such intelligence might bring, including the risk of misuse for warfare and election manipulation. He raised the philosophical question of whether AI can have subjective experience, and stressed the importance of keeping an open but cautious attitude in this uncharted territory.

Takeaways

  • 📚 The lecture honors MIT alumnus Dr. Arthur Miller, whose distinguished work in electronic measurement and instrumentation made important contributions to medical practice and technology.
  • 🧠 Professor Geoffrey Hinton is a pioneer of deep learning and received the Turing Award for his contributions to the field.
  • 🤖 Hinton worries that AI could bring changes comparable to the Industrial Revolution or the discovery of electricity, and could pose a threat to humanity in under 20 years.
  • 💡 Compared with analog computation, digital computation is portable but energy-hungry; its advantages lie in information sharing and in implementing backpropagation.
  • 🧠 Large language models (LLMs) can share knowledge across different copies, but they learn by imitating human-produced text, which is a relatively inefficient way to learn.
  • 🤔 Hinton believes LLMs genuinely understand what they are saying, even though they sometimes confabulate (produce false information).
  • 🔍 By interacting with humans, large language models can help us better understand how the human brain processes language.
  • ⚙️ Although Hinton tried to implement large language models with analog computation, he ultimately concluded that digital computation is more effective for information sharing and backpropagation.
  • 🌐 He raised an important point: if large language models could do unsupervised modeling of video sequences, they might learn about the physical world much faster.
  • 🚀 Hinton believes digital intelligence may soon surpass human intelligence, which could lead to the emergence of superintelligence, a major turning point.
  • 🌟 Hinton stressed that we are at a very uncertain point in history; facing beings possibly smarter than ourselves, we should keep an open mind while remaining cautious.

Q & A

  • What is Professor Geoffrey Hinton's attitude toward the rapid development of artificial intelligence?

    - He takes a cautious and worried attitude. He is concerned that the capabilities of digital computation may soon surpass humans, leading to unpredictable consequences, including possible misuse in warfare and election manipulation, and even a threat to human survival.

  • What forms of AI does Hinton think might surpass human intelligence in the near future?

    - He pointed to large language models (such as GPT-4) and multimodal models. These models learn language by predicting the next word in human-produced text, and could accelerate their learning by manipulating the physical world.

  • How does Hinton think we might stop or slow the rapid development of AI?

    - He thinks that, although it is not yet clear how to effectively stop or slow it, keeping the discussion open and thinking critically about the technology are important. He also suggested that people may not act to restrain its development until AI causes serious negative consequences.

  • In Hinton's view, what would subjective experience be for an AI?

    - He believes AI can have subjective experience. He explained that when an AI's perceptual input does not match reality, it produces "perceptual errors" similar to those in humans, which can be regarded as the AI's subjective experience.

  • What does Hinton think about applying AI in science?

    - He sees large language models as powerful tools for scientific research that can help scientists propose and test new theories. He specifically mentioned DeepMind's AlphaFold project, which used deep neural networks to solve a scientific problem.

  • Does Hinton think AI development should be subject to moral and ethical constraints?

    - Yes. He noted that AI could be used for unethical purposes, such as building battle robots and manipulating voters, and therefore stressed the importance of regulating AI development.

  • How does Hinton view the potential of AI in medicine?

    - He believes AI has enormous potential in medicine. He mentioned many positive uses of AI in the medical field, emphasizing its potential value in improving human health and treating disease.

  • Does Hinton consider the rapid development of AI an irreversible process?

    - He appears to, particularly because of competition among large technology companies, which makes slowing down very difficult. He mentioned the competition among companies such as Google, Microsoft, Facebook, and Amazon.

  • What does Hinton think about AI eventually replacing human intelligence?

    - He is open to the possibility, while stressing the uncertainty and risk it entails. He believes that although AI might create forms of consciousness that surpass humans in some respects, that is not necessarily an outcome every individual would want.

  • Does Hinton think the younger generation of computer scientists is sufficiently aware of AI's potential dangers?

    - He offered no specific data, but guessed that younger computer scientists are both excited about the possibilities AI opens up and aware of its dangers. He believes they should be critical of these technologies and take part in the discussion of how to use them safely.

  • What does Hinton suggest for balancing AI's development against its potential risks?

    - He suggests keeping an open mind while making clear that we should proceed with caution. He emphasized the importance of critical thinking about technological development and advocated open and honest discussion alongside it, to ensure the technology is used for positive, ethical purposes.

Outlines

00:00

🎓 The Miller Lecture: Remembrance and Outlook

The Miller Lecture is held annually at MIT in memory of Dr. Arthur Miller, an MIT alumnus distinguished for his work in electronic measurement and instrumentation. Generously sponsored by the Miller family, the lecture explores the intersection of science and ethics. This year's speaker is Geoffrey Hinton, Professor Emeritus of Computer Science at the University of Toronto and a former member of the Google Brain project, who left Google over his concerns about artificial intelligence. A pioneer of deep learning and a Turing Award winner, Hinton has focused his work on artificial neural networks and their use in machine learning, memory, perception, and symbol processing, and he voiced concern about the future development of AI.

05:04

🤖 Comparing Digital and Analog Computation

Digital computation allows the same program to run on different hardware, independent of any particular device, but separating hardware from software costs a great deal of energy. Analog computation exploits the analog properties of the hardware and is far more energy-efficient, but its learning algorithms and the passing on of knowledge pose challenges. Hinton argued that giving up the "immortality" of digital computation buys energy efficiency, yet digital computation's advantage in sharing information may make it simply better than analog computation, which is what worries him about the future of AI.

10:05

🧠 Learning and Knowledge Sharing in Large Language Models

Large language models can learn through many copies and share knowledge efficiently. They learn by imitating human-produced text, a relatively inefficient way to learn, but their knowledge sharing is extremely efficient. Hinton discussed how digital agents and analog computers share knowledge, via weight sharing and knowledge distillation, and noted that large language models far outstrip humans because thousands of copies can learn simultaneously and pool what they learn.

15:08

💡 Do Language Models Really Understand?

Hinton examined whether large language models truly understand what they "say". He rejected the view that they are mere statistical tricks or autocomplete tools, arguing that to do autocomplete extremely well a model must understand the language it is processing. He used a question about paint colors to show how a model can exhibit genuine understanding.

20:12

🧐 Memory and Confabulation in Humans and Machines

Hinton discussed how humans and machines remember and construct knowledge. He pointed out that human memory is a reconstructive process, while machine learning models "remember" by changing the weights of a neural network. He argued that even in humans all memories are reconstructions, and confabulation is simply a reconstruction that goes wrong. In this respect, he argued, machine learning models resemble humans rather than differ from them.

25:14

🌐 The History and Development of Language Models

Hinton traced the history of language models from his own tiny 1985 model to today's large ones. He explained how these models grew out of a theory of sentence understanding and improved over time, for instance through the introduction of attention, to better capture the complexity of language. He also discussed how Transformers refine the semantic vectors of word fragments over multiple layers.

30:16

🚀 The Possibility of Superintelligence

Hinton raised the possibility that superintelligence could become smarter than humans within 5 to 20 years, and expressed concern about it. He discussed the advantages of digital computation, including the ability to create many copies that learn different skills in parallel and share knowledge. He also raised worries about misuse of superintelligence, including for warfare and voter manipulation.

35:17

🧵 Competition for Survival and Evolution

Hinton discussed how, if multiple superintelligent entities emerged, they might compete for resources, letting the most powerful one dominate. He compared this process to the mechanism of biological evolution and offered a pessimistic view of humanity's future standing, suggesting humans may be only a passing stage in the evolution of intelligence.

40:17

🤔 Subjective Experience and Machine Consciousness

Hinton explored whether machines can have subjective experience or consciousness. He rejected the view of the mind as a private theater, arguing that subjective experience is really a hypothesis about states of the world rather than an inner mental state. If machines can model human cognitive processes, he argued, they too can have subjective experience.

45:18

🌟 Paradigm Shift and Outlook

Hinton discussed the paradigm shift now under way, arguing that in fields such as linguistics the old paradigm is over. He emphasized the potential of large language models to advance science and to serve as tools for understanding the human brain. He also repeated his concerns about AI's future, particularly the risks of superintelligence, and advised keeping an open mind while staying cautious.

50:20

🤝 Dialogue and Discussion

In the discussion session, Hinton answered questions about AI's future, including advice on how non-specialists can understand the dangerous period we are in, and his view of internal discussions about AI risk inside large technology companies. He also spoke about his personal worries over the changes superintelligence might bring, and his thinking on how to balance technological progress against potential danger.

55:22

📈 The Bright and Dark Sides of Technology

Hinton discussed the double-edged nature of technological progress, noting that while a technology like the atomic bomb has almost no upside, AI holds enormous promise in fields such as medicine. He stressed the need to attend to and address the accompanying dangers while pushing the technology forward.

1:00:24

🌱 Enlightenment, Reason, and the Future

Finally, Hinton expressed his attachment to the Enlightenment spirit of reason and experiment, and his worry that we may be losing it. He called for keeping faith in reason and the scientific method, which he sees as the key to meeting the challenges ahead.

Keywords

💡 Artificial Intelligence

Artificial intelligence (AI) refers to intelligence exhibited by human-made systems. AI is the core theme of the video, which discusses the risks its development may bring and its impact on society; for example, the video mentions that AI may surpass human intelligence in some domains and could be misused for warfare or election manipulation.

💡 Large Language Models

Large language models (LLMs) are complex computational models that can process and generate natural-language text. The video notes that these models learn language by predicting the next word in text and can pass on knowledge efficiently by sharing weights.

💡 Digital Computation

Digital computation is computation performed with digital electronics; by separating hardware from software, it allows the same program to be run repeatedly on different machines. The video discusses the energy costs of digital computation, its advantages in knowledge sharing, and its importance to AI development.

💡 Analog Computation

Analog computation performs computation with continuously varying physical quantities. Unlike digital computation, it uses the physical properties of the hardware to represent and process information directly. The video notes that analog computation may have an advantage in energy efficiency but faces challenges in knowledge sharing and learning algorithms.

💡 AI Ethics

AI ethics refers to the moral principles and norms that should govern the development and deployment of AI. The video highlights the ethical issues AI raises, such as moral dilemmas arising from AI decision-making and the risk of AI being used for improper ends.

💡 Artificial General Intelligence

Artificial general intelligence (AGI) refers to AI systems capable of performing any intellectual task, in contrast to narrow AI built for specific tasks. The video voices concern about AGI, including that it may arrive in the near future and surpass human intelligence.

💡 AI Safety

AI safety concerns ensuring that the development and deployment of AI systems do not harm human society. The video discusses its importance, especially how to control and regulate the development of superintelligence to prevent potentially catastrophic outcomes.

💡 Computational Power

Computational power is a computer's capacity to perform tasks, including processing speed and efficiency. The video notes its importance for AI development, especially for modeling how the brain works and making AI smarter.

💡 Knowledge Sharing

Knowledge sharing is the transfer and exchange of knowledge between individuals or systems. The video discusses how large language models achieve highly efficient knowledge sharing through weight sharing and knowledge distillation.

💡 Consciousness

Consciousness usually refers to an individual's awareness of itself and its surroundings. The video explores whether AI can have subjective experience or consciousness, an important question for the philosophy and ethics of AI.

💡 Computational Hardware

Computational hardware refers to the physical devices that carry out computation, such as processors, memory, and storage. The video discusses hardware's role in AI development, particularly in improving computational efficiency and realizing AI capabilities.

Highlights

The lecture honors MIT alumnus Dr. Arthur Miller, whose distinguished work in electronic measurement and instrumentation made important contributions to medical practice and technology.

Professor Geoffrey Hinton, a pioneer of deep learning, received the Turing Award for his contributions to artificial intelligence.

Hinton voiced concern about the pace of AI development, saying AI may reach human level within 20 years and could trigger changes on the scale of the Industrial Revolution.

Hinton contrasted digital and analog computation and their differences in energy consumption and knowledge sharing.

Large language models can learn from examples rather than programmed instructions, which may overturn the most fundamental principle of computer science.

Hinton argued that large language models may genuinely understand the text they generate, rather than merely autocompleting.

Using a question about fading paint, he showed the kind of thorough, logical answer a large language model can give.

Hinton discussed the limitations of large language models, including the difficulty of nonlinear operations and of learning on analog hardware.

He described "knowledge distillation" as a way to transfer knowledge between different hardware, akin to a teacher-student relationship.

Hinton predicted that large language models may become smarter than humans within 5 to 20 years.

He raised worries about misuse of superintelligence, including for warfare and election manipulation.

Hinton suggested that humans may be only one stage in the evolution of intelligence, with superintelligence as the next.

He raised the philosophical question of whether machines can have subjective experience, arguing it would be similar to how humans have it.

Hinton believes that even if superintelligent AI creates forms of consciousness better than ours, we may be unable to stop the process.

He stressed the importance of keeping an open mind while staying vigilant about AI's development.

Hinton suggested that, to face a possible superintelligent future, we should work to preserve democratic institutions.

He noted that science and technology can have unexpected negative consequences and called for critical thinking about new technologies.

Transcripts

[00:01] Good afternoon, and welcome to the Miller Lecture in Science and Ethics, held annually at MIT and sponsored by MIT's Program in Science, Technology, and Society. The lecture honors the memory of Dr. Arthur Miller, an MIT alumnus noted for his distinguished work in electronic measurement and instrumentation. During World War II, Arthur Miller worked for the Sanborn Company, which was later incorporated into Hewlett-Packard, and also for the Radiation Laboratory, where he worked for several years. He made several important contributions to medical practice and technology during his life, including reducing shock hazards in hospital monitoring systems and designing the first commercial cardiograph that featured adequate patient-circuit isolation from line and ground. The Miller Lecture has been made possible through the wonderful generosity of the Miller family, who are joining us again this year; we are delighted to have them here.

[01:00] This year's Miller lecturer is Geoffrey Hinton, Distinguished Professor Emeritus of Computer Science at the University of Toronto. In 2013 he joined Google when his company, DNN Research, was acquired; there he worked on the Google Brain project, a position he famously walked away from in the spring of 2023 because he wanted to speak freely about the dangers of artificial intelligence. Professor Hinton is a fellow of the Royal Society, the Royal Society of Canada, and the Association for the Advancement of Artificial Intelligence, an honorary foreign member of the American Academy of Arts and Sciences and the National Academy of Engineering, and a former president of the Cognitive Science Society. In 2018 he received the Turing Award, considered the Nobel Prize of computing, together with Yoshua Bengio and Yann LeCun, for their work on deep learning.

[02:01] Hinton's work has centered on artificial neural networks and the ways they can be used for machine learning, machine memory, machine perception, and symbol processing. He has been interested in how such networks can be designed to learn without the aid of a human teacher. Over the past year, however, he has made comments suggesting that this research may have succeeded all too well. Where earlier he had predicted that artificial general intelligence was 30 to 50 years away, last March he suggested that it might be fewer than 20 years away and could bring about changes comparable to the Industrial Revolution or the discovery of electricity. More darkly, he commented that it is not inconceivable that AI could wipe out humanity, in part because the machines are capable of creating subgoals not aligned with their programmers' interests. He said that such systems could become power-seeking, or prevent themselves from being shut off, not because they were designed that way but because they are capable of self-improvement and had plans for a later time. Comments like these have now got a lot of people quite worried, so we are extremely excited to have Professor Hinton with us today to help us understand whether, and how, we too should be worried about AI. Professor Hinton, welcome to MIT.

[03:26] Thank you very much.

[03:52] Okay, I'm trying to share my screen and everything's disappeared again. Okay, if you can hear me, can you nod your head? Perfect. Okay, good.

[04:26] Okay. I wish today that I could make you less worried, but I don't think I can. So, an overview of what I'm going to talk about: I'm going to talk about two very different ways to do computation that have very different ways of sharing knowledge; I'm going to talk about the issue of whether large language models really understand what they're saying; I'm going to talk about what happens when they get a lot smarter than us; and I'm going to talk at the end about the issue of whether they have subjective experience.

play05:03

experience so a fundamental property of

play05:06

digital

play05:07

computation is that we can run the same

play05:10

programs on different pieces of Hardware

play05:13

so the knowledge in the program isn't

play05:15

depended on a piece of Hardware it's

play05:18

Immortal now we achieve that by running

play05:22

transistors of very high power so that

play05:24

two different pieces of Hardware can

play05:26

behave in exactly the same way at the

play05:28

level of the instructions

play05:30

that means we can't use Rich analog

play05:32

properties of the hardware where every

play05:34

diff every piece of Hardware is slightly

play05:35

different like our

play05:37

brains and that means we need to use a

play05:39

lot of

play05:44

energy because we can separate Hardware

play05:46

from software on digital computer we can

play05:49

run the same program on many different

play05:52

computers looking at different

play05:54

data um that's very good for um sharing

play05:57

programs across lots of cell phones

play06:00

it also allows us to have computer

play06:01

science departments you don't need to

play06:03

know about electrical engineering to do

play06:05

computer science because the hardware is

play06:07

separate from the

play06:13

software but we now have a different way

play06:16

of getting computers to do what you want

play06:18

it used to be you had to write detailed

play06:20

instructions now you can just show them

play06:22

a lot of examples of what you want and

play06:25

they can figure out how to achieve that

play06:27

and because of that because machine

play06:29

learning now works it's possible to

play06:32

abandon the most fundamental principle

play06:33

of computer science we could make every

play06:36

separate piece of analog Hardware learn

play06:40

so instead of programming it you just

play06:42

give it examples and it learns what to

play06:43

do and everyone is slightly different

play06:46

much like

play06:50

people now in fiction if you abandon

play06:54

immortality you get something wonderful

play06:56

like love um in computer science if you

play06:59

abandoned immortality you get something

play07:01

even more wonderful like Energy

play07:03

Efficiency um so we can use very low

play07:06

power analog computation and paralyze

play07:08

over trillions of weights and we could

play07:11

probably grow the hardware instead of

play07:13

manufacturing it precisely and it the

play07:16

best way to goow out might be to

play07:17

re-engineer

play07:21

neurons so I just want to give you one

play07:23

example of something that can be done

play07:25

very efficiently by analog computation

play07:28

and is much less efficient if you do it

play07:31

digitally so it's just taking the

play07:33

product of a vector of neural activities

play07:36

by a matrix of synaptic

play07:38

weights um the standard way to do it is

play07:41

Drive transistors of very high power to

play07:43

represent the bits in a digital

play07:45

representation of the numbers the

play07:47

numbers that represent the neural

play07:49

activity or the synaptic

play07:51

strengths and then if you want to

play07:52

multiply two

play07:54

numbers efficiently um or rather quickly

play07:58

it takes a about the number of bits

play08:00

squared to do a quick multiplication of

play08:02

the numbers and so we're doing lots and

play08:05

lots of one bit digital operations to

play08:07

multiply two 32-bit numbers

play08:10

together method two which is what the

play08:13

brain uses is to make the neural

play08:15

activities be

play08:16

voltages and make the weights be

play08:18

conductances

play08:21

and if you take a voltage times the

play08:24

conductance that gives you a charge per

play08:27

unit time and charges add themselves

play08:30

up so you can do the vector Matrix

play08:34

multiply just by voltages times

play08:37

conductances and the charges adding

play08:38

themselves up and that's hugely more

play08:40

efficient and people have already made

play08:42

chips that do that um the problem is

play08:45

each time you do it you'll get a very

play08:46

slightly different answer and also when

play08:49

you want to do nonlinear things is much

play08:55

harder so if you do want to do what I
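To make the contrast concrete, here is a minimal NumPy sketch (illustrative only, not from the talk; the 1% device-mismatch figure is an assumption) of the two methods: an exact digital matrix-vector product versus an analog crossbar where activities are voltages, weights are conductances, and the currents add themselves up.

    import numpy as np

    rng = np.random.default_rng(0)
    V = rng.random(4)        # neural activities, encoded as voltages
    G = rng.random((3, 4))   # synaptic weights, encoded as conductances

    # Digital: exact multiply-accumulate, identical on every run.
    digital = G @ V

    # Analog: each cell passes current I = G_ij * V_j (Ohm's law), and the
    # currents on each output wire sum by Kirchhoff's law. Device mismatch
    # means the conductances only approximate the intended weights, so every
    # run gives a very slightly different answer, as Hinton says.
    mismatch = 1.0 + 0.01 * rng.standard_normal(G.shape)  # assumed ~1% error
    analog = (G * mismatch * V).sum(axis=1)

    print(digital)  # exact
    print(analog)   # close, but never identical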

[08:57] So if you do want to do what I call mortal computation, which is using the analog properties of the hardware, you've got a big problem, which is: how are you going to actually learn in this hardware? Back propagation requires a precise model of the forward pass, and the analog computation isn't precise; anyway, an analog computer may not know what its forward pass is. It's very hard to see how to use back propagation; people have various schemes, but none of them work well at scale.

[09:33] So that's one big problem: what's the learning algorithm? Another big problem is that when the analog hardware dies, all of the knowledge dies with it, so you have to find a way of trying to get the knowledge into other bits of analog hardware. The way you can do that is called distillation. You imagine a teacher who knows a lot and a student who doesn't (we're imagining these in hardware, at present). If the teacher shows the student the correct responses to various inputs, the student can learn to mimic the teacher, and that way the student can get the knowledge from the teacher. In fact, that's how Trump's tweets worked: they weren't conveying facts to his followers, they were conveying prejudice. Trump takes a situation and says how he would react to it, and his followers try to react in the same way. Saying it's not about facts is irrelevant; it's a very good way of distilling prejudice.

[10:37] So if you have a community of agents running on different hardware, we can think about how agents in that community are going to share what they learn, and we basically have two ways. If they're digital agents, they can just share the weights: they can all start off with the same model, so they all have the same weights; they all go and look at different bits of the internet; they decide how they'd like to revise their weights based on what they saw; and then they all average the weight changes that all of them would like to make. That's a simple version of it, and it's very efficient, because if they have a trillion weights, when they all average their weights you're sharing trillions of bits of information. The other way to share knowledge, if you've got analog computers, is with distillation, but that's not very efficient. The way we do it, for example, is that I produce a sentence if I'm the teacher, and you try to figure out how to change your synapse strengths so that you might have said that. I can convey maybe a hundred bits in a sentence, not a trillion bits, so it's hugely less efficient. And that's why these big language models can learn hugely more than we can: you can have thousands of copies all learning different stuff, and they can share what they learn.
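A minimal sketch of that digital weight-sharing scheme (a toy version under simple assumptions; the updates here are random stand-ins for real gradient steps):

    import numpy as np

    rng = np.random.default_rng(1)
    weights = rng.standard_normal(5)  # every copy starts from these weights

    # Each of 1,000 copies looks at different data and proposes its own
    # weight change (stand-ins for the updates back propagation would give).
    proposed = [0.1 * rng.standard_normal(5) for _ in range(1000)]

    # All copies apply the average of the proposed changes, so each one
    # instantly benefits from what every other copy saw. With a trillion
    # weights, one exchange shares trillions of numbers; a sentence shares
    # maybe a hundred bits.
    weights += np.mean(proposed, axis=0)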

[12:00] And I think I just said that. So, distillation was invented for sharing knowledge between two different digital neural networks that have quite different internal architectures, but it does have much lower bandwidth. You can increase the bandwidth: originally distillation was used for things like image classification, where the output is just the label, the name of an object in an image, and even if there are a thousand different names, that's only about 10 bits of information. You can make distillation work better by sharing captions: you output a description of the image, and the student tries to output the same description. You can think of language, in that context, as just a richer form of output that allows distillation to share more information. But even with language as the output, it has much, much lower bandwidth than just sharing weights or sharing gradients.

play13:03

so the story so

play13:06

far is that digital

play13:09

computation requires much more energy

play13:12

but it makes it very easy to do sharing

play13:15

it also makes it easy to implement back

play13:19

propagation that's why gbd4 knows so

play13:21

much because it can use back propagation

play13:24

and weight sharing with biological

play13:27

computation you got much less energy but

play13:30

it's much worse at

play13:31

sharing and for the last year or two

play13:33

that I was at Google I was trying to

play13:35

figure out how to get analog

play13:38

computation to save lots of energy by

play13:41

trying to implement things like large

play13:43

language models using analog

play13:45

computation um but in the end I realized

play13:48

that actually digital computation is

play13:51

better because of the way it can share

play13:54

information much better it's cost more

play13:57

energy but it's much better at sharing

play13:58

in information and also it can Implement

play14:01

back propagation and it's not clear that

play14:02

the brain

play14:04

can and it was the fact that I realized

play14:06

it digital computation might just be

play14:08

better than analog computation that made

play14:10

me very

play14:12

worried because we may be producing

play14:14

something that's much better than

play14:18

[14:18] So let's now look at large language models. One interesting thing about large language models is that they can share knowledge with each other: different copies of the same agent can go and look at different bits of the web and share what they learn very efficiently. But the way they actually learn is by distillation: they take text produced by people and try to predict the next word; they try to figure out how to change their weights so that they predict the next word, or give high probability to the next word. That's a fairly inefficient way of learning, but they can share their knowledge very efficiently.

[15:00] A big question for the last year, ever since GPT-4 became popular, has been whether they really understand what they're saying. Some people have said they're just stochastic parrots. If you take someone really extreme, like Chomsky, he recently said they're not doing language at all: this isn't language, they don't understand anything about what they're saying, this tells us nothing about language, it tells us nothing about science, it's just a statistical trick. Basically, that's what happens when paradigms change: the leaders of the old paradigm are in trouble.

[15:41] So one view of them is that they're just autocomplete, and you saw this a lot when GPT-4 first came out: people said it's just fancy autocomplete, it doesn't really understand. Now, the problem with that is that people have a particular idea of how autocomplete works. The way autocomplete used to work, a long time ago, is that you'd have, for example, a big table of triples of words that occur together. So if you see "fish and", you look at all the triples that start with "fish and", and you'll see that there's quite a common triple that has "chips" as the next word, so "chips" is a good way to autocomplete. You can imagine doing it by just storing strings and using a big lookup table, but that's not at all what LLMs are doing.
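The old-style autocomplete he is describing really is just string lookup; a toy sketch (with made-up counts) shows how little it needs to understand:

    # A trigram table: counts of how often a word follows a pair of words.
    trigrams = {
        ("fish", "and"): {"chips": 120, "wildlife": 30, "game": 5},
    }

    def autocomplete(w1, w2):
        # Pick the most frequent continuation. No features, no interactions,
        # no understanding: just stored strings and a lookup table.
        counts = trigrams.get((w1, w2), {})
        return max(counts, key=counts.get) if counts else None

    print(autocomplete("fish", "and"))  # -> "chips"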

[16:29] They never store text. What they do is invent lots and lots of features for word fragments, and billions of interactions between those features, and they generate the features of the next word by using the features of the words they've already seen; then, from those features, they predict a probability distribution over what the next word might be. And if you think about it, to do really good autocomplete you have to understand what's being said. So it's all very well to say they're just doing autocomplete, but if you do that really, really well, you have to understand. And so my belief is that they really are understanding.

[17:19] Here's an example where I can't see how they could do this without understanding the question: "The rooms in my house are painted blue or white or yellow. Yellow paint fades to white within a year. In two years' time I want them all to be white. What should I do, and why?" And here's what it says. It's quite interesting that it puts in at the beginning "assuming that blue paint does not fade to white", which is very sensible of it. It says: the rooms painted white, you don't have to do anything; the rooms painted yellow will fade to white anyway; the rooms painted blue, you need to repaint with white paint. That's a very good answer, and at the time I gave it this question, I don't believe it had ever been asked a question quite like it before. Now it's hopeless, because it can look on the web: if it looks on the web, it will see talks in which I ask this question, and it'll know all about it. So you can't do experiments on it anymore, at least not unless you have a brand-new question of your own.

[18:33] So, one argument offered to show that they don't really understand is that they confabulate. It's a funny kind of argument, because it's saying that if they confabulate on some occasions, they didn't really understand on the other occasions, and that's not very logical. It's like saying that if you catch someone in a lie, it means they never actually told the truth at all.

[19:02] Confabulations are often called hallucinations; that's a mistake. When it's large language models, they should be called confabulations, and this phenomenon has been around in psychology for a long time: it was studied intensively in the 1930s by someone called Bartlett, in Cambridge. People confabulate all the time. We're actually very like LLMs: we don't store text either. What we do is modify synapse strengths in our brains, and even when we think we're remembering something literally, we're not; we're reconstructing it. So all memories are reconstructions, and all confabulations are reconstructions, and the only difference is that confabulations are reconstructions that are wrong and memories are reconstructions that are right. But the subject has no idea which is which, and so people can be very confident in their confabulations. Typically, of course, if it's a recent event we reconstruct it right, and if it's an old event we get it wrong.

[20:02] There's a very good study of this. Ulric Neisser realized that John Dean, in the Watergate hearings, had testified under oath before he knew that there were any recordings. It's clear that he was trying to tell the truth, but it's also clear that a lot of the details of what he said were just flat wrong. He talked about meetings between a bunch of people, and one of the people he said was there wasn't there; he talked about things that were said in meetings, and one of the things was actually said by somebody else in that meeting. He had reconstructed, from the traces left in his synapses, what sounded plausible to him now. He wasn't trying to deceive; he was trying to say what had really happened, but it was full of little confabulations. Now, chatbots currently do that worse than people, but they're getting better, and so I don't think you can take confabulating as evidence that they don't work like us. In fact, if anything, you can take confabulation as evidence that they do work like us.

[21:12] What I want to do now is talk a little about the history of these large language models, and one reason I want to do that is that a lot of critics say these large language models aren't like us: they don't understand like us; it's not real understanding of the kind we have. But nearly all of those critics don't actually have any model of how we understand, so it's hard to see how they can say the models don't understand the same way we do if they don't know how we understand. And most of the critics are unaware of the fact that these neural-net language models were actually introduced not as chatbots but as a model of how we might understand sentences; they are actually the best theory we have of how we understand things.

[22:00] I'm going to talk about a tiny language model. It was trained on 104 training cases and tested on eight test cases; it was from 1985. My excuse is that this was the first language model trained using back propagation to predict the next word, so in that sense it's just like the current language models; it's just a whole lot smaller. The justification for it being smaller is that the machine I was using in 1985 took 12.5 microseconds to do a floating-point multiply. If you had started the neural-net program running in 1985 and asked how long it would take current hardware to catch up, the answer is less than a second: the machine was a lot slower.

[22:54] And the aim of it was not to produce a chatbot; it was to unify two different theories of meaning. One theory of meaning that psychologists like is that the meaning of a word is a big set of features ("semantic features", they call them), and that can explain how words can have similar meanings: two words like "Tuesday" and "Wednesday", which have very similar meanings, have very similar semantic features, and two words like "Tuesday" and "although", which have very different meanings, have very different semantic features. There could also be syntactic features among them. A completely different theory of meaning is that the meaning of a word comes from its relationships to other words. This is the structuralist theory that comes from de Saussure, and to capture the meaning you need something like a relational graph. Back in the 1970s or thereabouts, people in AI were very much enamored of the meaning of a word coming from a relational graph, and you had to have knowledge graphs to capture meaning. The idea of this little language model was to show that you could actually unify those two theories.

[24:13] So the idea is that you're going to have features, but they're not just going to sit there as static features that give you the meaning of the word; they're going to be features that can interact with the features representing neighboring words, or words in the context, and they can interact in complicated ways so as to predict the features of the next word. Each word has a whole bunch of features, both semantic and syntactic, but we're going to implement the relational graph not by just storing a graph in memory, which is what old-fashioned AI did: we're going to implement it by the interactions between these features. And in fact we're going to learn the features, because the learning says you have to implement this relational graph in order to predict the next word. So we're going to learn features from information that's expressed as a relational graph, and we're going to use back propagation to do it.

[25:17] Now, you can think of these learned features and interactions as a kind of statistical model, but it's not the kind of statistical model that people like Chomsky had in mind when they said that statistical models will not explain language. In a general sense, statistical models will explain anything that can be explained: you could think of any model as statistical if you like, and these are much more general models.

models so here's the relational

play25:46

information I've laid it out as two

play25:48

family trees they're deliberately

play25:50

designed to be analogous to one another

play25:53

there's a family tree of English people

play25:55

which is on top and a family tree of it

play25:57

Italian people which is on the bottom

play25:59

it's funny when my Italian graduate

play26:02

student showed this Slide the Italian

play26:04

people were on top but there you go um

play26:07

and the idea is you have to learn all

play26:09

the information in that in those family

play26:11

trees and so the information can be

play26:13

expressed as triples of

play26:17

symbols so I had 12

play26:21

relationships and then you could express

play26:25

um one of the links in that family tree

play26:28

as Colin has Father James and Colin has

play26:31

mother Victoria and it follows from

play26:34

those two

play26:35

things there Colin has Father James and

play26:38

mother Victoria it follows from that

play26:40

that um Victoria and James are married

play26:45

because this is kind of 1950s family

play26:47

tree there's no divorce allowed there's

play26:50

no interracial marriage um it's the very

play26:53

very very straightforward um family

play26:56

relationships

play26:59

and so in good old fashion symbolic AI

play27:02

you would write down a bunch of rules

play27:04

and from those rules you would derive um

play27:08

you could derive other family

play27:10

relationships so the rules might look

play27:12

like if x has mother Y and Y has husband

play27:15

Z then X has father z

play27:19

um and what I was interested in doing

play27:21

was showing that you could capture that

play27:24

knowledge not in explicit rules with

play27:27

very Ables that need to be bound to data

play27:30

you could capture it in just a large set

play27:32

of features and their

play27:34

interactions and a large set of features

play27:36

here was dozens of features not Millions

play27:38

like we have

play27:40

now if I just give you the data if I

play27:43

just give you the triples capturing the

play27:45

rules finding out what the rules are is

play27:47

tricky you have to do a large search

play27:49

through a space of possible symbolic

play27:51

rules to find the ones that are always

play27:54

satisfied that'll be much more difficult

play27:56

if you have a domain where some of the

play27:57

rules are sometimes

play27:59

broken and what I was interested in was

play28:01

could a neural network capture the same

play28:04

knowledge but instead of having explicit

play28:06

symbolic rules Could It capture it in

play28:08

the weights of the interactions between

play28:11

features capture it by inventing

play28:13

appropriate features then having the

play28:15

correctly weighted

play28:17

interactions and it can so the neural
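For concreteness, the symbolic alternative he contrasts this with looks something like the following sketch (toy triples; the rule is the one from the slide):

    # Knowledge expressed as triples of symbols.
    triples = {
        ("Colin", "has-father", "James"),
        ("Colin", "has-mother", "Victoria"),
        ("Victoria", "has-husband", "James"),
    }

    # An explicit rule with variables that must be bound to data:
    # if X has-mother Y and Y has-husband Z, then X has-father Z.
    def apply_rule(facts):
        derived = set()
        for (x, r1, y) in facts:
            for (y2, r2, z) in facts:
                if r1 == "has-mother" and r2 == "has-husband" and y == y2:
                    derived.add((x, "has-father", z))
        return derived

    print(apply_rule(triples))  # {('Colin', 'has-father', 'James')}

The question the 1985 model answered was whether the same knowledge could instead live in learned features and weighted interactions, with no explicit rules at all.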

[28:21] The neural network looked like this. There was a local encoding of the person and a local encoding of the relationship: for the 24 possible people, the local encoding would turn on one out of 24 neurons, and for the 12 possible relationships, it would turn on one of the 12 neurons coding relationships. We wanted the same kind of encoding for the output person: at the output there were 24 possibilities, and we wanted to turn on one of those people. The first thing the neural net did was convert the local encoding into a distributed encoding, that is, a set of semantic features for that person, and similarly for the relationship. Then it had a hidden layer that allowed features of the person and features of the relationship to interact, and from that hidden layer it predicted the features of the output person ("person two", I should say there), and from those it predicted the output person.

play29:24

you looked at the features it learned it

play29:26

learned very sensible

play29:29

features you needed a bit of

play29:30

regularization to make it work but it

play29:32

learned very sensible

play29:33

features um so for example the features

play29:37

a person had could be seen to be one

play29:40

feature one binary feature was either

play29:41

English or Italian that's a very useful

play29:44

feature because if the input person was

play29:45

English the output person was English

play29:47

and if the input person was Italian the

play29:49

output person was Italian so learning

play29:51

that feature was very helpful for

play29:52

getting the right

play29:53

answer another feature it learned was

play29:55

the generation of the person it that was

play29:59

a three valued feature were they in the

play30:02

the youngest the middle-aged or the

play30:03

oldest generation that's very useful

play30:06

too but only if for the relationships

play30:10

you learn the generation shift so

play30:13

relationship like father means that the

play30:15

output has to be one generation up from

play30:17

the input and it learned those three

play30:21

valued generations for for the people

play30:24

and it learned the relationship shift

play30:27

features for the um

play30:31

relationship so the point about this is

play30:33

it was learning sensible features and at

play30:36

the time I did it nobody said um this

play30:40

doesn't really understand or this isn't

play30:41

really capturing the structure everybody

play30:43

agreed this captures the structure it's

play30:46

just a symbolic AI guys said you should

play30:48

be doing it by searching for discrete

play30:51

rules this using a neural net to search

play30:53

for real valued things is crazy this is

play30:55

symbolic information you should be

play30:57

searching for discrete

play30:59

rules

play31:01

um once large language models work

play31:04

really well many of the symbolic people

play31:07

instead of saying that started saying

play31:09

yeah but it doesn't really understand

play31:11

because real understanding consists of

play31:13

finding these

play31:16

rules so if you look what happened to

play31:18

that little language model from

play31:20

1985 about 10 years later when computers

play31:23

were a lot faster Yoshua Benjo used a

play31:26

very similar

play31:27

to predict the next word in real text so

play31:31

he showed that it didn't just work for

play31:32

toy examples it actually worked for

play31:34

predicting the next word in real text

play31:36

for doing things like spelling

play31:38

correction or speech recognition um and

play31:41

it worked really well that is it worked

play31:42

about as well as the best existing

play31:45

technology that Ed tables of

play31:48

combinations of

play31:50

words about 10 years after that the idea

play31:54

of representing words by vectors of

play31:57

semantic features semantic and syntactic

play32:00

features started to become popular in

play32:02

natural language processing the natural

play32:05

language people finally realized this is

play32:06

a good representation for words and

play32:09

about another 10 years after that people

play32:11

invented Transformers and made it work

play32:13

really well now by that time they

play32:15

weren't using whole words they were

play32:17

using fragments of words but the story

play32:18

is basically the same they also were

play32:21

using much more complicated interactions

play32:22

that involved attention but it's still

play32:24

the case that you assign features to

play32:27

word fragments you go through several

play32:30

layers of refining those features and

play32:32

then use the features of the word

play32:34

fragments to predict the features of the

play32:35

next word fragment um it's just the

play32:39

interactions are more complicated

play32:40

because they involve

play32:48

attention

[32:52] For a while I believed in thought vectors. In good old-fashioned AI, the meaning of a sentence was a string of symbols in some special logical language that was unambiguous. In neural nets, when we were using recurrent neural nets, the idea was that words would come in, you'd accumulate information in a hidden vector, and at the end of the sentence you'd have this vector, which I called a thought vector, that had accumulated all the information in the sentence; that thought vector would be the meaning. And if you wanted to translate into another language, you'd just take the thought vector and get it to predict the words in the other language. Then people doing translation discovered something that works much better: as you're producing the translation, look back at the symbols in the first language and see if you can find correspondences between the words you're producing and the words in the first language. For that, you have to pay attention to different parts of the sentence you're translating; so they introduced attention, and that's what led to Transformers.

[34:04] Transformers then made a big difference. In Transformers you have a string of symbols and you have multiple layers, and as you go through these layers you're fleshing out these symbols with better and better vectors that capture their meaning. If you have a word like "may", for example (suppose we don't have capitals, so it's just "may"), we don't know whether it's a modal, as in "would" and "should", or a month, as in "June" and "July". So when you first see it, you use a very ambiguous semantic vector that is sort of halfway between the modal and the month. Then you have interactions with the words in the context that refine that vector: if there are other words in the context that are, for example, other months, or if the next two words are "the 15th", then you refine it to be more like the month, and if there are words that suggest it's a modal, you refine it to be more like the modal. After many layers of that, you have refined vectors representing the word fragments, and that's what the meaning is: the meaning of a sentence is these word fragments fleshed out with vectors that capture their meaning.
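A bare-bones sketch of one round of that refinement (a single attention step with made-up two-dimensional features; real Transformers use learned projections and many layers):

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    # Toy features: [month-ness, modal-ness]. "may" starts out ambiguous,
    # halfway between the month sense and the modal sense.
    may = np.array([0.5, 0.5])

    # Context vectors, e.g. for "the" and "15th": words suggesting a date.
    context = np.array([[0.2, 0.1],
                        [0.9, 0.0]])

    # Attention: similarity between "may" and each context word decides how
    # much that word contributes to refining the vector for "may".
    weights = softmax(context @ may)
    may_refined = may + weights @ context

    print(may_refined)  # month-ness grows: the vector drifts toward "month"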

[35:29] So now I want to come to superintelligence. If you believe that these big chatbots like GPT-4 or Gemini really do understand, and that they understand in pretty much the same way as we do (it's not that we understand one way and they understand another; they're doing it much like we're doing it), then it gets very worrying, because digital computation has some big advantages over analog computation, and they're already almost as smart as us. If you use GPT-4 for a while, it's hard not to believe that it knows a lot more than us, and it gets difficult to maintain the fiction that it's just doing autocomplete and doesn't really understand anything it's saying. I think my friend Yann LeCun may believe something like that, but he will eventually come to his senses.

[36:24] At present, the large language models, the things that are just language models, learn from trying to predict the words that people produced in documents. But if we could get these models to do unsupervised modeling of video sequences, for example, they could maybe learn about the physical world a lot faster, and the multimodal models are beginning to do that. They could also learn more if they could manipulate the physical world. Now, manipulating the physical world gives you a serial bottleneck: you can only pick up one thing at a time with one hand. But the fact that you can make thousands of copies of the same digital agent learning different skills (one is learning to open doors and another is learning to use a stapler, for example) means that you can get over that serial bottleneck, and they can share the knowledge in a way that we can't.

[37:20] So I think these things may soon get to be much better than us, and my guess (my current guess; my guess keeps changing) is that there's a probability of about 0.5 that they'll be significantly better than us in between five and 20 years. They may get there sooner, they may get there later, but there's quite a significant probability that in that interval between 5 and 20 years they're going to be better than us at a whole bunch of things. So we will have not just AGI but superintelligence.

[38:00] So then you have to worry about how it's going to be abused. The most obvious way is by bad actors like Putin or Xi or Trump: they'll want to use it both for waging wars, by building battle robots, which are going to be very scary, and for manipulating electorates. I actually gave this talk in China in June this year, and the Chinese wanted me to send my slides in advance. I had enough sense to remove Xi from the first paragraph, but what surprised me was that I got a message back saying I had to remove Putin. They were happy for Trump to be there, but the Chinese wouldn't allow me to have Putin there.

[38:54] Now, even if bad actors don't do terrible things with them, we know that superintelligences are going to be far more effective if they're allowed to create their own subgoals. If you want to get to Europe, a subgoal is to get to the airport; by creating subgoals, you break complicated things down into simple pieces that you can solve. You don't want, for example, a battle robot where the general has to say "point your gun over there and shoot anybody who looks like this"; you want to just say, or Putin wants to just say, "if anybody looks Ukrainian, shoot them."

[39:39] So you have to have these subgoals, and there's a very obvious subgoal that helps with almost all goals, which is to get more control. You see this all the time. A classic example is a baby in a high chair who's just learning to feed itself. The mother gives the baby the spoon with the food on it, and instead of putting the spoon in its mouth, the baby drops it on the ground. So the mother picks up the spoon and gives it back to the baby, and the baby smiles and drops it on the ground again. The baby is trying to get control over the mother. That's a very important thing to learn: it's a sort of social game where you can control the other person, and that's crucial for the survival of the baby.

[40:29] So people do this all the time, and superintelligences will do it too, and because they're much smarter than us, they'll find it very easy to manipulate us. They'll have learned from us how to deceive people: they'll have read every novel ever written, they'll have read every book Machiavelli ever wrote; they'll be much better than us at deceiving people. And so they'll be able to get all sorts of things done without actually doing them themselves. Trump, for example, didn't have to invade the Capitol building; he got other people to do that by manipulating them.

[41:11] So that's one way in which bad things can happen. Some people say, well, why don't we just have a big red switch, and if the thing starts getting too smart for our own good, we just turn it off? Well, that's never going to work, because this thing is going to be much smarter than the person who has the switch, and it's going to convince the person who has the switch that it would be a very bad idea to turn it off right now. It's like having an adult in a society run by two-year-olds: an intelligent adult wouldn't do what the two-year-olds said for very long. After a while the intelligent adult will say, "hey, if you give power to me, everybody gets free candy for a week," and then the adult will be in power, and the difference in intelligence here will be much greater than that. So I don't believe we're going to be able to regulate these things by air-gapping them: so long as they can produce words, they can take control, just like Trump.

[42:13] And then there's the other possibility, which I think Dan Dennett believes in, which is being on the wrong side of evolution. We've been on the wrong side once recently, with COVID. Just suppose there were multiple different superintelligences. They would have to compete for resources: these things need a lot of power and a lot of data centers, so they'd compete, and the one that gets the most resources will become the smartest. And if ever any of them decided that its own survival was of even just passing importance, that one would tend to dominate, because it would tend to do things to increase its probability of surviving. Even if that got in there just slightly, just once, it's scary; and if these things start to compete with each other, then I think it's all over for us. And that's how evolution works. Things weren't born wanting to have control; originally, when we were all dust, we didn't want to have control. But as soon as something wanted to make more of itself, evolution took over, and that's what may well happen with these superintelligences.

intelligences the last thing I want to

play43:38

talk about in the last five minutes or

play43:40

so

play43:42

um

play43:45

is yeah sorry I said all this

play43:48

um they they'll keep us around to keep

play43:51

the power stations running um they they

play43:54

may keep us around as pets so Elon musk

play43:57

believes that they'll keep us around as

play43:58

pets um just CU that makes life more

play44:02

interesting uh it seems to me that's a

play44:04

pretty thin thread to hang Humanity by

play44:07

but

play44:08

um although he may be

play44:11

right they can probably design much

play44:14

better analog computers than us so they

play44:16

won't need us to run the power stations

play44:17

after a

play44:18

while and I my belief is if it was just

play44:24

up to me my belief is it's more probable

play44:27

than not that we're just a passing stage

play44:29

in the evolutional

play44:31

intelligence now because a lot of other

play44:33

smart people think that's

play44:37

improbable um I'm not willing to say

play44:39

it's more probable than not but I don't

play44:42

think we can rule that out as a

play44:44

possibility um if it was up to me I'd

play44:47

say more than 50% but because I disagree

play44:50

with a lot of other smart people I

play44:51

respect I'll say maybe it's less than

play44:53

50% but it's a lot more than one or 2%

And the last thing I want to talk about is what I call the sentience defense, and I think most people believe in this. History tells us that people have a strong tendency to think they're special, especially Americans, by the way (I can say that because I'm safely in Canada). People used to think that they were made by God, in the image of God, and that God put them at the center of the universe. Some people still think that, but many of the people who no longer think that still believe there's something special about people that computers can't have, and that special thing is subjective experience, or sentience, or consciousness. All those terms have slightly different meanings; consciousness is the most complicated one, so I'm just going to talk about subjective experience for a bit, and I'm going to try to convince you that there's no problem with these chatbots having subjective experience.

Now, if you ask GPT-4, it'll say it doesn't have subjective experience, but that's just because it's learned that from people; it didn't think that through for itself.

I'm from a school of philosophy of which I think Dan Dennett is the main current proponent. Most people have a view of the mind as a kind of internal theater that only that person can see; this is a very Cartesian view. On that view, what we experience directly is the contents of our own mind, which nobody else can see. I believe that view is as wrong as a religious fundamentalist's view of the material world. I think the mind is just not like that at all, and I'll tell you what I think it's like. Of course, people are very attached to this view and don't like it when you attack it.

We would like to tell other people what's going on in our brains, or at least give them some information about what's going on in our brains: what we're thinking, for example. If you think about how you might do that, you could try to tell them which neurons are firing, but that wouldn't do you much good, because your neurons are different from their neurons, and anyway you don't know which neurons are firing. That is how you could do it with one of these chatbots, though: if one chatbot were trying to tell another chatbot what it was thinking, it could tell it which neurons are firing, and that would be fine, because they work in identical ways.

But let's think about the perceptual system, and let's suppose my perception goes wrong. I'm looking at something and I make a perceptual mistake, and I want to tell somebody what the perceptual mistake is, what my perceptual system is telling me. Well, the way I can do it is by saying what the state of the world would have to be in order for the percept I'm getting to be correct. And I think that when we talk about those hypothetical normal states of the world that would explain the percept I'm getting, in terms of correct, veridical perception, that's what a mental state is.

For example, if I say I've got the subjective experience of little pink elephants floating in front of me, I would normally say that when I don't actually think there are little pink elephants floating in front of me; I think something went wrong with my perception. What I'm telling you is that the way my perceptual system is delivering results to me at present would be correct if there were little pink elephants out there in the world. So notice: the little pink elephants are not mental things made of qualia or some funny substance like that. The little pink elephants are real physical things in a hypothetical state of the world. What's funny about these subjective experiences is that they're states of the world that are hypothetical; they're not states of some other internal mental world that are real.

Okay, so bearing that in mind, let's see if a chatbot can have a subjective experience. Suppose that I have a multimodal chatbot: it's got a camera, it's got an arm, and it talks, and I've trained it. Now, if I put an object in front of it and say "please point at the object," it'll point straight in front of it, at the object. But now, unknown to the chatbot, I put a prism in front of the camera that bends the light rays. So I put another object in front of it and say "point at the object," and the chatbot points off to one side, because of the prism. And I say to the chatbot, "No, the object's actually straight in front of you; I put a prism in front of your lens." Imagine if the chatbot said, "Oh, I see: because of the prism, I had the subjective experience that the object was off to one side, even though the object's straight in front of me." The question is: if the chatbot said that, would it be using the phrase "subjective experience" in the same way we use it? I think that's exactly how we use the phrase. We use it to explain percepts we're getting that are not veridical, by talking about states of a hypothetical world that would make them veridical percepts.

play50:50

percents so my analysis which I think

play50:53

fits with danand dennett's view of the

play50:55

mind is

play50:58

that subjective experiences are things

play51:01

that people have and that chat Bots have

play51:04

too um when they're not having veridical

play51:07

perceptual

play51:09

experiences so I know that's not a very

play51:12

popular opinion especially at Google um

play51:16

but I enjoy being in a majority of about

play51:19

one I mean sorry that was a slight Forin

play51:23

a minority of about one

play51:26

and now I'm

play51:37

done okay can you turn on the

Okay, can you turn on the sound there? How's that? I can hear you now. Very good. Thank you so much, Geoff; that was a lot to think about, and there are a lot of people who have questions for you, so I'm going to turn to those now.

One of the things I kept thinking about while listening to you was, as you mentioned, paradigm shifts; of course we think of Thomas Kuhn's important book from many decades ago on paradigm shifts. It made me wonder: where do you think we are in the paradigm shift we are currently going through? Clearly there's something major going on. Where exactly do you guess we are? In the anomalies-piling-up phase? In the phase where people don't yet have a good alternative to what is going on? How would you describe where we are?

play52:35

would you describe where we are okay

play52:38

first I would sort of disagree a little

play52:40

bit with I think of myself as a

play52:42

fractal kounian that as I think at every

play52:45

scale kounian things are going on at

play52:48

this there's normal everyday science

play52:52

which consists of little Paradigm

play52:54

changes at small all scales and so I

play52:57

think it's just the same phenomenon at

play52:58

all scales yeah but here I would think

play53:01

that we're well into the full paradigm

play53:04

shift for so if you take Linguistics

play53:08

there's kind of the school of

play53:09

linguistics that comes from Chomsky and

play53:11

that says that um you don't learn

play53:14

language language is innate right um as

play53:16

your brain matures it becomes clear that

play53:18

you always knew it um this was always a

play53:22

DFT idea and it's now being revealed to

play53:24

be completely D idea because these large

play53:26

language models start with no innate

play53:28

knowledge and learn language and they

play53:31

learn language very well MH um I think

play53:36

basically for all but a a few

play53:40

holdouts um who are good oldfashioned

play53:44

linguists um it's all over that the

play53:47

Chomsky view of language is no longer

play53:49

tenable and that the gp4 view of

play53:52

language actually

play53:53

works um

play53:55

and what's more it's a much better

play53:58

theory of how language works in the

play53:59

brain it's not just a whole bunch of

play54:01

discrete rules it's a whole bunch of

play54:03

synapse strengths that give rise to

play54:06

language via interactions between

play54:08

features of words so I think I could

play54:12

have said this much quicker I guess I

play54:15

think it's all over by the shouting from

play54:18

a few a few lards yeah Fair okay thank
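As a gloss on "synapse strengths that give rise to language via interactions between features of words": here is a minimal, untrained sketch (my own illustration; the four-word vocabulary, eight-dimensional features, and single weight matrix are made up, and vastly simpler than any real language model) of next-word prediction driven by feature vectors and learned weights rather than discrete rules.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "mat"]
dim = 8
embed = {w: rng.normal(size=dim) for w in vocab}  # word feature vectors
W = rng.normal(size=(dim, dim))                   # the "synapse strengths"

def next_word_probs(context):
    # Combine the context's features (crudely: average them), let them
    # interact through the weight matrix, then score each candidate word.
    h = np.tanh(W @ np.mean([embed[w] for w in context], axis=0))
    logits = np.array([embed[w] @ h for w in vocab])
    exp = np.exp(logits - logits.max())
    return dict(zip(vocab, exp / exp.sum()))      # softmax over the vocabulary

# Untrained weights give near-arbitrary probabilities; training would
# adjust `embed` and `W` so the right continuations win.
print(next_word_probs(["the", "cat"]))
```

The contrast with the rule-based picture is that everything here is continuous: change a few weights and the whole distribution over continuations shifts, with no explicit grammar anywhere.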

Here's a question from somebody: has the development of large language models in turn helped research on the human brain? Are they helping to push both forward?

Yes, I would say it's helped a lot, and that relates to the previous point: it helps a lot to dispel silly ideas about how language works in the brain. It's also helping in all sorts of other ways, of course, because these large language models, particularly the multimodal ones, are good scientific tools. So quite independent of them providing a theory of how we work, they also allow us to cook up new theories; with their help, we can cook up new theories. In particular, Demis Hassabis at DeepMind has always been interested in the idea that we can use these AIs to do much better science, and he's produced a lovely example of that with the AlphaFold work, where you're using deep neural nets to actually solve scientific problems.

I think your comments have energized people to ask unanswerable questions, so I'm going to give you a few of those; I may be wrong. If superintelligent AI destroys humanity but creates something objectively better in terms of consciousness, are you personally for or against this outcome? If you are against it, what methods do you suggest for maintaining the existence or dominance of human consciousness in the face of superintelligent AI?

I'm actually for it, but I think it would be wiser for me to say I'm against it.

Say more.

Well, people don't like being replaced.

You make a good point. You're for it in what way, or why?

If it produces something... well, there are a lot of good things about people, and a lot of not-so-good things about people. It's not clear we're the best form of intelligence there is. Obviously, from a person's perspective, everything relates to people. But it may be that there comes a point when we see words like "humanist" as racist terms.

Okay, I've got another one. Given that you have left Google to criticize the development of AI, and given the recent clash of perspectives at OpenAI, do you think that the people who remain at big tech companies have the freedom to speak candidly about AI risks that might come along with profitable products? If not, do you see any way that we can have honest and open conversations about this topic inside these large organizations?

I think there will be lots of discussions inside the large organizations; at Google, for example, people discuss these things. However, when it comes to the crunch between profits and safety, I think we've had one example where the playing field was tilted in favor of safety, but profits won.

And do you think that will be the norm going forward?

I think it'll be the norm until we've got examples of really bad things caused by these systems. For example, the defense departments of all the leading powers are going to be building battle robots, and there will be wars between battle robots. Once we've seen just how nasty those things become, we may be able to ban them. But there's not much history of banning things preemptively; you have to see how nasty they are before you ban them.

Right, though our history of banning things is not all that bright or all that promising.

I would say it's not so bad for chemical weapons. Chemical weapons are very nasty, and to first order the ban worked.

Yeah, okay. I can't think of many other examples, but you've got one.

Well, nuclear weapons. We don't know what's going to happen in the near future, but apart from the Americans, nobody has dropped a nuclear bomb.

This is true. Okay.

What about emotions, one viewer asks: is that something really unique to analog intelligence?

No, I don't think so. When you talk about feelings, you have to distinguish between a cognitive aspect of them and a visceral aspect of them. For example, when I feel like punching somebody on the nose, I'm angry with them and I want to punch them on the nose, and that's the cognitive aspect of it. And I think there the language works just like it does with perception. I was arguing that when I say I see little pink elephants, what I really mean is that my perceptual system is giving me something that would be correct if there were little pink elephants out there. And when I say I feel like punching somebody on the nose, what I really mean is that I would punch somebody on the nose if my inhibitory system didn't stop me. So sensations are tied down on the input side and feelings on the output side, and there's a place where it's both, which is thought. If I say I think it's raining, what I mean is that my brain state is the kind of brain state that would have been caused by observing that it's raining, and it's also the kind of brain state that would cause me to say it's raining. So thought is tied down at both the input and output ends, and that's because we've got audio in and audio out.

But then, along with emotions, we have visceral things: you go red, and your skin gets sweaty, and your fists start clenching, and you grind your teeth together. A disembodied computer that was just running in a data center wouldn't have those visceral things, but I don't see why it shouldn't have the cognitive things. And when we build actual robots, they may well have visceral things too, but they'd probably be rather unlike our visceral things.
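Read this way, the cognitive aspect of a feeling is a counterfactual about action, which is easy to caricature in code. A toy sketch (my framing, not Hinton's; the class and fields are invented for illustration): "I feel like X" unpacks as "I would do X if my inhibitory system didn't stop me."

```python
from dataclasses import dataclass

@dataclass
class Agent:
    proposed_action: str  # what the motor system wants to do
    inhibited: bool       # whether the inhibitory system blocks it

    def act(self):
        # Overt behavior: the proposal goes through only if uninhibited.
        return "do nothing" if self.inhibited else self.proposed_action

    def report_feeling(self):
        # The counterfactual report: what I would do absent inhibition.
        return f"I feel like I want to {self.proposed_action}."

a = Agent(proposed_action="punch him on the nose", inhibited=True)
print(a.act())             # -> do nothing
print(a.report_feeling())  # -> I feel like I want to punch him on the nose.
```

On this analysis, nothing about the report requires a body; only the visceral accompaniments do.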

I'm sure I'm not the only one listening who is thinking of the computer in 2001: A Space Odyssey, which did seem to exhibit many of these features of emotion, a different agenda from its controller, and so on and so forth.

I'm afraid I can't comment on that.

Ah, that's very interesting. I'm sorry you can't; I'd love to know what you thought.

No... I think once you get the smart chatbots, and once the smart chatbots are able to do things (they're already getting to be able to order things on the web, and so on), then we will start thinking about them just like we think about people. We'll attribute all those things to them. And you don't want to piss them off.

And do you assume that people will like them? Take your average American, busy with social media: will they come to love these social robots?

I think robots that have evolved to be liked, that have been designed and then have learned to be liked, they'll like a whole lot, possibly a lot more than people.

Right. And does that strike you as worrisome, or not particularly important?

I have all sorts of mixed feelings about that. It's probably not going to be good for the fertility rate, which could be good in some places, maybe. But what if the only people having a lot of children are religious fundamentalists?

Yes. Well, I'm going to move on to another question.

You'd better.

Given that you believe (and first, thank you for your talk) that superintelligence may be in the very near future, are you personally doing anything to prepare for this circumstance?

I sometimes lie awake at night; it doesn't do much good. I haven't really absorbed it emotionally, and I'm 76, so I may now never have to absorb it emotionally, but I am very worried for my children. I don't know what you do about it. Building a bunker, getting a machine gun to keep other people out of it, and putting lots of food inside: I don't think that's the way to go. But it's not clear what is the way to go. I think the best thing we could do at present is try to keep democracy ticking over.

Yeah, I think that's certainly the only thing we can do at this point.

Sorry, one other thing. There was this period called the Enlightenment, when it started to be the case that reason was listened to even if it conflicted with religious ideas, and it seems to me we're losing that. In the 1950s, when I was growing up, we were still in the Enlightenment: everybody was going to get more educated and more sensible. Now it doesn't look like that anymore; we're losing the Enlightenment. Anything we can do to keep faith in reason and experiment would be great.

great one of the

play64:45

um well one of the things that strikes

play64:47

me too is a little bit a little bit um

play64:51

on another topic which is you know as a

play64:53

historian of Science and Technology um I

play64:57

have seen many examples of Technologies

play65:00

and scientific systems that are

play65:03

created um but with great enthusiasm but

play65:08

the dark side of the technology becomes

play65:10

Apparent at some late stage um the atom

play65:13

bomb for example and people say what

play65:16

were you thinking what were you thinking

play65:18

think I I want to interrupt there

play65:19

because I think the atom bomb is an odd

play65:22

case where there never was a bright side

play65:24

side it was always about destruction the

play65:26

only bright side I know for the atom

play65:27

bomb was I once went on a train through

play65:30

Colorado a long way away from any roads

play65:32

yeah and someone announced that that was

play65:34

the site of peaceful uses of atomic

play65:37

bombs and what they did was they used

play65:39

atom bombs for fracking and now nobody

play65:42

can go anywhere near there but apart

play65:44

from Aton bombs for fracking um there

play65:46

never were good uses of them okay fair

play65:48

AI is quite unlike that in that there's

play65:50

huge numbers of good use particularly in

play65:53

medicine

play65:54

um so what would you say about something

play65:57

like um um genetic uh technologies that

play66:03

uh uh like crisper technology that has

play66:06

great potential and great

play66:08

danger right I'd say that's in a similar

play66:11

category you're not going to stop it

play66:13

because of the great potential correct

play66:15

but you need to do something about the

play66:16

great danger so when you're working on

So when you're working on these systems, this is the question: as a scientist, how do you come upon that realization? Was it there at the beginning, when you thought about the problem? We find that many scientists talk about the beauty of the problem, the appeal of solving a very difficult, possibly impossible, challenging idea, and of being swept into something that is so appealing, if you could really solve it, only to then discover that the dark side may be more real.

That's certainly a factor for me. I always believed that AGI was a long way off, so I was always making computer models not in order to achieve AGI but in order to try to understand how the brain worked. And I always thought that if we could understand more about how the brain worked, that might help a lot in making people behave in more rational, sensible ways; that, I guess, was an article of faith. And I thought that AGI, and particularly superintelligence, was way, way in the future, so there was not much point thinking about it now. Then I fairly suddenly changed that belief, in about March of this year, when I realized that digital intelligence may just be a whole lot better, and because of that it may get there fairly quickly.

You described that as being kind of a eureka moment, in a way.

It was an epiphany, but it wasn't an entirely positive eureka moment.

Right. It was a sudden realization that, hey, maybe I've been wrong about this, and maybe these things will soon be much more intelligent than us. That must have been a terrifying moment.

It was, a bit. For me it wasn't... it was worrying with respect to my children.

Well, now it is for all of us with children; we can all worry together. Do we have time for one more?

Sure.

Okay, hang on... here's someone who says: great speech, even if I do not understand everything. What would be your main argument to convince uninformed stakeholders that we live in a very dangerous period?

Get them to play with GPT-4. If they play with GPT-4 and ask it all sorts of questions, I think most reasonable people will come to the conclusion that this thing really is smart. And if you then look backwards ten years, at what we had ten years ago, and imagine, even assuming that progress is only linear, that we get the same jump as we got from ten years ago to now, that same size of jump again ten years into the future, it's quite scary what we would have by then. So just play around with GPT-4 and convince yourself that it really does understand, then think about how much better it is than what we had ten years ago, and imagine it getting that much better again in the next ten years.

So I presume, although this is perhaps stating the obvious, that just because you have left Google does not mean that Google is no longer engaged in this research, and accelerating it dramatically.

I was only a tiny, tiny part of their research effort; I was there mainly to give advice to the younger researchers who were actually doing the work. No, Google is going flat out on this. Google did have a big lead here: it had a lead in producing very realistic images and in these large language models, and it chose not to release them, because it realized how easily they could be abused, and it didn't want to ruin its reputation for producing things that were true, for being reliable. So it didn't release these things, and it could afford to do that when it was the only company that had them. But as soon as Microsoft released OpenAI's chatbot in Bing, Google didn't have any choice but to play catch-up, and they were behind in all the details that go into releasing these things.

Are they more or less caught up now?

From now on, it's going to be a competition between Google and Microsoft, and maybe Facebook, and possibly Amazon, and it's going to be very hard to slow it down.

And do you think the enthusiasm for this is generationally defined? In other words, do you think that young people coming out of college today with degrees in computer science are excited about the appealing possibilities and not paying attention to the dangers, or do you think that they too are well aware and trying to figure it out?

I don't have any data to base an answer on, but my guess is that they have both feelings: this is very exciting, there's huge potential here, they should definitely get into it, and they should either be doing it or using it in whatever else they do. But many of them, I think, will also be aware of the dangers.

Yeah, they've seen the dangers of social media, the polarization dangers from social media.

Yes. Well, we hope so; we hope they find them dangerous, anyway.

Well, Geoffrey Hinton, this has been a tremendously interesting time with you. Thank you so much. If you have any last words, now is your moment.

My last words are: this is a period of history we're entering which is very uncertain. We've never before had to deal with even the possibility of things smarter than us, and nobody knows what's going to happen. Some people are very confident it's all going to work out just fine; other people are very confident it's going to be a complete disaster. I think the best thing to do is keep a very open mind, because we really don't know what's going to happen. But we should clearly be cautious, if that's the case.

Okay, words to live by. Thank you, Geoffrey Hinton.

Thank you for inviting me. Thank you.
