Harvard Professor Explains Algorithms in 5 Levels of Difficulty | WIRED

WIRED
8 Nov 2023, 25:47

Summary

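TLDR: Through a series of interviews and examples, the video explores how widespread and important algorithms are in everyday life. Guests from different fields, from a computer science professor to a data scientist, share their understanding of algorithms, including search algorithms, machine learning, and deep learning. The video also discusses where algorithms are headed and how they affect our lives and work. It emphasizes the link between algorithmic fundamentals and advancing technology, and the importance of thinking critically about the potential benefits and harms of these technologies.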

Takeaways

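  • 📚 Algorithms are everywhere and play an important role in both the physical and the virtual world.
  • 👨‍🏫 Harvard computer science professor David J. Malan stresses that algorithms represent an opportunity to solve problems.
  • 🖥️ A computer's core components include the CPU (central processing unit) and memory (RAM).
  • 🔍 An algorithm can be a set of step-by-step instructions that tells a person or a robot how to complete a specific task.
  • 🥪 The sandwich-making example shows that an algorithm needs precise instructions to be carried out correctly.
  • 🔎 Search algorithms, such as those behind Google and Bing, are key to finding information quickly and efficiently.
  • 📈 In data science and machine learning, algorithms are used to optimize models and solve business problems.
  • 🤖 Robot learning combines robotics and machine learning to develop intelligent systems.
  • 🧠 Advances in artificial intelligence and machine learning are making algorithms ever more widespread in daily life.
  • 🚀 Researching and inventing algorithms involves thinking about efficiency and connecting ideas across different fields.
  • 🌐 Progress in algorithms and our understanding of them are only loosely coupled with our ability to use and advance technology.
  • 💡 Despite advances such as large language models, learning fundamental algorithms remains essential for understanding technology.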

Q & A

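  • Why are algorithms important?

    -Algorithms are important because they are everywhere, in both the physical and the virtual world. They represent an opportunity to solve problems, and everyone has problems to solve.

  • How is a computer defined in the video?

    -In the first conversation, the child describes a computer as an electronic device, like a phone but rectangular, that you can type on.

  • What roles do the CPU and memory (RAM) play in a computer?

    -The CPU is the computer's brain: it responds to instructions and performs operations such as arithmetic. Memory (RAM) stores the programs and data that are currently in use.

  • What is the difference between a hard drive (HDD or SSD) and memory (RAM)?

    -Memory (RAM) is volatile storage that holds the programs and data currently in use. A hard drive (HDD or SSD) is non-volatile storage that keeps data and documents permanently, so the information survives even if the power goes out.

  • How can algorithms help us work more efficiently?

    -Algorithms solve problems through precise, step-by-step instructions, and a better algorithm needs fewer steps. In searching, for example, repeatedly halving the search range finds the target far faster than checking every entry in turn.

  • What is a recursive algorithm?

    -A recursive algorithm is an algorithm that uses itself to solve a problem, each time working on a smaller piece of the same problem until the whole problem is solved.

  • What role do algorithms play in data science?

    -In data science, algorithms are mainly used to optimize models and find the best description of a data set, or they sit at the core of a data product that helps solve newsroom and business problems.

  • What role do large language models (LLMs) play in artificial intelligence?

    -Large language models are a specialized area of artificial intelligence that mainly predict the next word or sequence of symbols. They can be used to generate text and power chatbots, and they are continually trained and optimized to improve performance.

  • Why is our understanding of algorithms said to be loosely coupled with our progress?

    -Because even without understanding an algorithm's underlying principles, we can use high-level tools and libraries to build complex functionality. A deep understanding of algorithms nonetheless remains essential for developing and optimizing more efficient ones.

  • Where do algorithms appear in everyday life?

    -Algorithms are used very widely in everyday life, including search engines, recommender systems (such as YouTube and Netflix), and social media feeds (such as TikTok's For You page). They help us find and organize information more effectively.

  • What does the arrival of large language models (LLMs) mean for students or adults who want to enter computer science?

    -The arrival of large language models (LLMs) does not mean we should avoid computer science. On the contrary, they open up new opportunities for learning and research, such as exploring how to train, fine-tune, and apply these models to real problems.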

Outlines

00:00

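📚 The basics and importance of algorithms

This section introduces computer science professor David J. Malan's explanation of algorithms and stresses that they are everywhere, in both the physical and the virtual world. Using everyday examples such as the steps for making a sandwich, he explains what an algorithm is and points out that algorithms are fundamentally about solving problems. He also covers two key hardware components of a computer, the CPU and memory, and explains what each does.

05:00

🔍 The efficiency and optimization of search algorithms

This section discusses the efficiency of search algorithms, using the example of looking up a contact in a phone book to show how much different approaches can differ. It starts with slow linear search, then considers checking two pages at a time, and finally introduces divide and conquer: open to the middle and discard the half that cannot contain the name. This is far faster than linear search because the search range is halved at every step.

10:01

📈 Recursion and more complex algorithms

This section discusses a more sophisticated kind of algorithm, the recursive algorithm. A recursive algorithm calls itself, solving a problem by breaking it into smaller pieces. A conversation with the guest Patricia covers how algorithms are used in computer science and data science, and an example of sorting magnetic numbers on a chalkboard shows how simple steps can solve a seemingly complex problem.

15:01

🤖 Machine learning and algorithm research

This section explores the relationship between machine learning and algorithm research. The guest, a fourth-year PhD student, shares research experience in robot learning and machine learning, and stresses that algorithm research is about spotting inefficiencies and connecting threads between different fields. The section also discusses how machine learning is used in recommender systems, search engines, and social media, and how algorithms affect our daily lives.

20:02

🌐 The role of algorithms in data science and industry

In this section, Professor Chris Wiggins discusses how algorithms are applied in data science and industry. He explains how data scientists use optimization algorithms to find the best model of a data set, and how algorithms become data products. He also mentions applications to newsroom and business problems and how data scientists integrate algorithms into their work. Finally, he discusses the rise of artificial intelligence and how it is changing the way people think about algorithms.

25:04

🚀 The future of algorithms and their impact

The final section discusses where algorithms may be headed and how they will affect our lives. The speakers suggest that algorithms will play an ever larger role in daily life and may show up in unexpected places. They also discuss the risks that big data and machine learning algorithms may bring, and how to balance the benefits of algorithms against their potential harms.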

Keywords

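💡 Algorithm

An algorithm is a well-defined sequence of instructions for solving a specific problem or performing a specific task. In the video, algorithms are described as tools found everywhere, in both the physical and the virtual world. Their importance lies in the opportunity they offer to solve problems, and they are closely tied to many everyday activities. The video mentions search algorithms, sorting algorithms, and machine learning algorithms, among others.

💡 Computer

A computer is an electronic device that receives and processes data and carries out computations and tasks. It typically contains hardware such as a central processing unit (CPU) and memory (RAM), which interpret and execute instructions. In the video, a computer is described as a rectangular device that you interact with by typing on a keyboard.

💡 Central processing unit (CPU)

The central processing unit, or CPU, is one of a computer's core pieces of hardware and is responsible for interpreting and executing the instructions in a program. The CPU is the computer's brain and handles computations such as arithmetic and logical decisions.

💡 Random access memory (RAM)

Random access memory, or RAM, is another important piece of hardware, used to temporarily store running programs and the data currently in use. Working together with the CPU, RAM lets the computer access and process information quickly.

💡 Precision

Precision means being accurate and correct when performing a task or solving a problem. It is especially important in algorithms and computer programs, because even a tiny error can lead to a completely different result.

💡 Divide and conquer

Divide and conquer is an algorithm design strategy that breaks a large problem into smaller problems and then solves those smaller problems, often recursively. It is commonly used to solve problems more efficiently.

💡 Recursive algorithm

A recursive algorithm is an algorithm that solves a problem by calling itself. It is typically used for problems that break down naturally into smaller, simpler subproblems.

💡 Machine learning

Machine learning is a branch of artificial intelligence that lets computer systems improve through experience. It usually involves training a model on large amounts of data so that the model can recognize patterns and make predictions or decisions.

💡 Data science

Data science is an interdisciplinary field that uses statistics, machine learning, and software engineering to analyze, process, and interpret large amounts of data in order to uncover valuable information and support decisions.

💡 Optimization algorithm

An optimization algorithm searches for the best solution under given constraints. These algorithms are especially important in data science because they help find the model or description that best fits a data set.

💡 Artificial intelligence

Artificial intelligence refers to techniques that let computer systems simulate intelligent human behavior, including learning, reasoning, self-improvement, perception, and language understanding.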

Highlights

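Harvard computer science professor David J. Malan explains why algorithms matter and how widely they are used in the real and virtual worlds.

Algorithms are an opportunity to solve problems, and everyone in life has problems to solve.

A computer is described as an electronic, rectangular device, like a phone, that responds to instructions and performs computations.

Introduces the computer's core hardware: the CPU (central processing unit) and memory (RAM).

An algorithm is defined as a list of instructions that tells a person or a robot what to do.

Gives examples of everyday algorithms, such as a bedtime routine and the steps for making a sandwich.

The importance of precision in an algorithm's instructions, and how it applies to real-world problem solving.

Shows, through a live interaction, how to create a simple algorithm to solve a specific problem.

Discusses search algorithms, with the example of looking up a contact in a phone book.

Introduces divide and conquer and how breaking a problem into smaller pieces makes solving it more efficient.

Defines a recursive algorithm and how it solves a problem by calling itself on smaller versions of the same problem.

Discusses how algorithms are used in social media and search engines, and how they affect many aspects of daily life.

A PhD student at NYU discusses how algorithms are researched and invented, and the role of machine learning in their development.

Chris Wiggins, associate professor at Columbia University and chief data scientist at the New York Times, discusses how algorithms are used in data science.

Explores the connection between artificial intelligence and data science and how together they drive technological progress.

Discusses the impact of large language models (LLMs) and tools such as ChatGPT on computer science, and how they are changing the way people think about AI.

Emphasizes that even when advanced algorithms seem out of reach, they can eventually be understood and mastered through study and research.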

Transcripts

play00:00

- Hello world.

play00:00

My name is David J. Malan

play00:02

and I'm a professor of computer science

play00:03

at Harvard University.

play00:05

Today, I've been asked to explain algorithms

play00:07

in five levels of increasing difficulty.

play00:10

Algorithms are important

play00:11

because they really are everywhere,

play00:13

not only in the physical world,

play00:14

but certainly in the virtual world as well.

play00:16

And in fact, what excites me about algorithms

play00:18

is that they really represent an opportunity

play00:20

to solve problems.

play00:21

And I dare say, no matter what you do in life,

play00:24

all of us have problems to solve.

play00:28

So, I'm a computer science professor,

play00:30

so I spend a lot of time with computers.

play00:31

How would you define a computer for them?

play00:33

- Well, a computer is electronic,

play00:37

like a phone but it's a rectangle,

play00:40

and you can type like tick, tick, tick.

play00:44

And you work on it.

play00:45

- Nice. Do you know any of the parts

play00:48

that are inside of a computer?

play00:50

- No.

play00:51

- Can I explain a couple of them to you?

play00:53

- Yeah.

play00:53

- So, inside of every computer is some kind of brain

play00:56

and the technical term for that is CPU,

play00:59

or central processing unit.

play01:00

And those are the pieces of hardware

play01:02

that know how to respond to those instructions.

play01:06

Like moving up or down, or left or right,

play01:09

knows how to do math like addition and subtraction.

play01:12

And then there's at least one other type of

play01:13

hardware inside of a computer called memory

play01:16

or RAM, if you've heard of this?

play01:18

- I know of memory because you have to memorize stuff.

play01:21

- Yeah, exactly.

play01:21

And computers have even different types of memory.

play01:23

They have what's called RAM, random access memory,

play01:26

which is where your games, where your programs

play01:29

are stored while they're being used.

play01:31

But then it also has a hard drive,

play01:32

or a solid state drive, which is where your data,

play01:35

your high scores, your documents,

play01:37

once you start writing essays and stories in the future.

play01:40

- It stays there.

play01:41

- Stays permanently.

play01:42

So, even if the power goes out,

play01:43

the computer can still remember that information.

play01:45

- It's still there because

play01:47

the computer can't just like delete all of the words itself.

play01:52

- Hopefully not.

play01:53

- Because your fingers could only do that.

play01:55

Like you have to use your finger to delete

play01:58

all of the stuff. - Exactly.

play02:00

- You have to write.

play02:01

- Yeah, have you heard of an algorithm before?

play02:02

- Yes. Algorithm is a list of instructions to tell people

play02:07

what to do or like a robot what to do.

play02:09

- Yeah, exactly.

play02:10

It's just step by step instructions for doing something,

play02:14

for solving a problem, for instance.

play02:16

- Yeah, so like if you have a bedtime routine,

play02:18

then at first you say, "I get dressed, I brush my teeth,

play02:23

I read a little story, and then I go to bed."

play02:25

- All right.

play02:26

Well how about another algorithm?

play02:27

Like what do you tend to eat for lunch?

play02:30

Any types of sandwiches you like?

play02:31

- I eat peanut butter.

play02:33

- Let me get some supplies from the cupboard here.

play02:35

So, should we make an algorithm together?

play02:36

- Yeah.

play02:37

- Why don't we do it this way?

play02:39

Why don't we pretend like I'm a computer

play02:41

or maybe I'm a robot, so I only understand your instructions

play02:44

and so I want you to feed me, no pun intended, an algorithm.

play02:48

So, step-by-step instructions for solving this problem.

play02:51

But remember, algorithms, you have to be precise,

play02:54

you have to give...

play02:55

- The right instructions.

play02:57

- [David] The right instructions.

play02:58

Just do it for me. So, step one was what?

play03:00

- Open the bag.

play03:02

- [David] Okay. Opening the bag of bread.

play03:04

- Stop. - [David] Now what?

play03:05

- Grab the bread and put it on the plate.

play03:07

- [David] Grab the bread and put it on the plate.

play03:11

- Take all the bread back and put it back in there.

play03:13

[David laughing]

play03:14

- So, that's like an undo command.

play03:16

- Yeah.

play03:16

- Little control Z? Okay.

play03:18

- Take one bread and put it on the plate.

play03:20

- Okay.

play03:21

- Take the lid off the peanut butter.

play03:23

- [David] Okay, take the lid off the peanut butter.

play03:24

- Put the lid down.

play03:25

- [David] Okay. - Take the knife.

play03:28

- [David] Take the knife.

play03:29

- [Addison] Put the blade inside the peanut butter

play03:32

and spread the peanut butter on the bread.

play03:35

- I'm going to take out some peanut butter

play03:37

and I'm going to spread the peanut butter on the bread.

play03:40

- I put a lot of peanut butter on

play03:42

because I love peanut butter.

play03:43

- Oh, apparently. I thought I was messing with you here...

play03:44

- No, no it's fine.

play03:45

But I think you're apparently happy with this.

play03:47

- [Addison] Put the knife down,

play03:49

and then grab one bread and put it on top

play03:51

of the second bread, sideways.

play03:55

- Sideways.

play03:57

- Like put it flat on.

play03:58

- Oh, flat ways, okay.

play04:00

- [Addison] And now, done. You're done with your sandwich.

play04:02

- Should we take a delicious bite?

play04:04

- Yep. Let's take a bite.

play04:06

- [David] Okay, here we go.

play04:08

What would the next step be here?

play04:10

- Clean all this mess up.

play04:11

[David laughing]

play04:11

- Clean all this mess up, right.

play04:13

We made an algorithm, step by step instructions

play04:16

for solving some problem.

play04:17

And if you think about now,

play04:18

how we made peanut butter and jelly sandwiches,

play04:20

sometimes we were imprecise and you didn't give me

play04:23

quite enough information to do the algorithm correctly,

play04:25

and that's why I took out so much bread.

play04:27

Precision, being very, very correct with your instructions

play04:31

is so important in the real world

play04:33

because for instance, when you're using the World Wide Web

play04:35

and you're searching for something on Google or Bing...

play04:38

- You want to do the right thing.

play04:40

- [David] Exactly.

play04:41

- So, like if you type in just Google,

play04:43

then you won't find the answer to your question.

play04:46

- Pretty much everything we do in life is an algorithm,

play04:49

even if we don't use that fancy word to describe it.

play04:51

Because you and I are sort of following instructions

play04:53

either that we came up with ourselves

play04:55

or maybe our parents told us how to do these things.

play04:59

And so, those are just algorithms.

play05:00

But when you start using algorithms in computers,

play05:03

that's when you start writing code.
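
The point about precision can be sketched in code. This is a minimal illustration rather than anything shown in the video: the function name `grab_bread`, the `count` parameter, and the ten-slice bag are all assumptions made for the example. A literal-minded robot given no count takes every slice, while a precise instruction takes exactly one.

```python
def grab_bread(bag, count=None):
    """Move slices from the bag to a plate.

    With no count given, the instruction is imprecise, so a robot that
    follows it literally takes every slice in the bag.
    """
    if count is None:
        count = len(bag)
    plate = bag[:count]
    remaining = bag[count:]
    return plate, remaining


bag = ["slice"] * 10  # hypothetical bag of ten slices

# Imprecise: "Grab the bread and put it on the plate."
plate, left_in_bag = grab_bread(bag)
print(len(plate), len(left_in_bag))  # 10 0

# Precise: "Take one bread and put it on the plate."
plate, left_in_bag = grab_bread(bag, count=1)
print(len(plate), len(left_in_bag))  # 1 9
```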

play05:05

[upbeat music]

play05:09

What do you know about algorithms?

play05:11

- Nothing really, at all honestly.

play05:13

I think it's just probably a way to store information

play05:16

in computers.

play05:17

- And I dare say, even though you might not have

play05:19

put this word on it, odds are you executed as a human,

play05:22

multiple algorithms today even before you came here today.

play05:26

Like what were a few things that you did?

play05:28

- I got ready.

play05:29

- Okay. And get ready. What does that mean?

play05:30

- Brushing my teeth, brushing my hair.

play05:32

- [David] Okay.

play05:33

- Getting dressed.

play05:33

- Okay, so all of those, frankly, if we really

play05:35

dove more deeply, could be broken down into

play05:39

step-by-step instructions.

play05:41

And presumably your mom, your dad, someone in the past

play05:44

sort of programmed you as a human to know what to do.

play05:47

And then after that, as a smart human,

play05:48

you can sort of take it from there

play05:49

and you don't need their help anymore.

play05:51

But that's kind of what we're doing

play05:52

when we program computers.

play05:53

Something maybe even more familiar nowadays,

play05:55

like odds are you have a cell phone.

play05:57

Your contacts or your address book.

play05:59

But let me ask you why that is.

play06:01

Like why does Apple or Google or anyone else

play06:03

bother alphabetizing your contacts?

play06:05

- I just assumed it would be easier to navigate.

play06:07

- What if your friend happened to be at the very bottom

play06:10

of this randomly organized list?

play06:12

Why is that a problem? Like he or she's still there.

play06:15

- I guess it would take a while to get to

play06:16

while you're scrolling.

play06:17

- That, in and of itself, is kind of a problem

play06:19

or it's an inefficient solution to the problem.

play06:21

So, it turns out that back in my day,

play06:22

before there were cell phones, everyone's numbers

play06:25

from my schools were literally printed in a book,

play06:27

and everyone in my town and my city, my state

play06:29

was printed in an actual phone book.

play06:31

Even if you've never seen this technology before,

play06:33

how would you propose verbally to find John

play06:36

in this phone book? - Or I would just flip through

play06:37

and just look for the J's I guess.

play06:39

- Yeah. So, let me propose that we start that way.

play06:41

I could just start at the beginning

play06:42

and step by step I could just look at each page,

play06:45

looking for John, looking for John.

play06:48

Now even if you've never seen this here technology before,

play06:51

it turns out this is exactly what your phone could be doing

play06:54

in software, like someone from Google or Apple or the like,

play06:57

they could write software that uses a technique

play06:59

in programming known as a loop,

play07:01

and a loop, as the word implies,

play07:02

is just sort of do something again and again.
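
The first, page-by-page approach can be sketched as a loop. This is an illustrative sketch, not the code any phone actually runs; the `phone_book` data and the name `linear_search` are made up for the example, and each page is modeled as a list of names.

```python
def linear_search(pages, name):
    """Check every page in order; return the page index, or -1 if absent."""
    for index, page in enumerate(pages):
        if name in page:
            return index
    return -1


# Hypothetical four-page phone book
phone_book = [["Alice", "Bob"], ["Carol", "Dave"], ["John", "Kim"], ["Zoe"]]
print(linear_search(phone_book, "John"))  # 2
```

In the worst case the loop looks at every page, so the work grows in step with the size of the book.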

play07:04

What if instead of starting from the beginning

play07:06

and going one page at a time,

play07:07

what if I, or what if your phone goes like two pages

play07:09

or two names at a time?

play07:11

Would this be correct do you think?

play07:12

- Well you could skip over John, I think.

play07:15

- In what sense?

play07:15

- If he's in one of the middle pages that you skipped over.

play07:17

- Yeah, so sort of accidentally and frankly

play07:20

with like 50/50 probability,

play07:21

John could get sandwiched in between two pages.

play07:24

But does that mean I have to throw

play07:25

that algorithm out altogether?

play07:27

- Maybe you could use that strategy until you get close

play07:30

to the section and then switch to going one by one.

play07:31

- Okay, that's nice.

play07:32

So, you could kind of like go twice as fast

play07:34

but then kind of pump the brakes as you near your exit

play07:37

on the highway, or in this case near the J section

play07:39

of the book.

play07:40

- Exactly.

play07:41

- And maybe alternatively, if I get to like

play07:42

A, B, C, D, E, F, G, H, I, J, K,

play07:44

if I get to the K section,

play07:46

then I could just double back like one page

play07:49

just to make sure John didn't get sandwiched

play07:50

between those pages.

play07:51

So, the nice thing about that second algorithm

play07:53

is that I'm flying through the phone book

play07:55

like two pages at a time.

play07:56

So, 2, 4, 6, 8, 10, 12.

play07:57

It's not perfect, it's not necessarily correct

play08:00

but it is if I just take one extra step.
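
The two-at-a-time idea, including the extra step that makes it correct, might look like the sketch below. It assumes the phone book is a sorted list of unique names (the names and the function `skip_search` are hypothetical) and doubles back one entry whenever it overshoots.

```python
def skip_search(names, target):
    """Search a sorted list two entries at a time, doubling back once.

    Returns the index of target, or -1 if it is not present.
    """
    i = 0
    while i < len(names):
        if names[i] == target:
            return i
        if names[i] > target:
            # Overshot: the target can only be the one entry we skipped.
            if i > 0 and names[i - 1] == target:
                return i - 1
            return -1
        i += 2
    # Ran off the end: the last entry may have been skipped.
    if names and names[-1] == target:
        return len(names) - 1
    return -1


names = ["Alice", "Bob", "Carol", "Dave", "John", "Kim", "Zoe"]
print(skip_search(names, "Kim"))  # 5, found by doubling back from "Zoe"
```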

play08:02

So, I think it's fixable,

play08:03

but what your phone is probably doing

play08:05

and frankly what I and like my parents and grandparents

play08:08

used to do back in the day was we'd probably go roughly

play08:09

to the middle of the phone book here,

play08:11

and just intuitively, if this is an alphabetized phone book

play08:14

in English, what section am I probably going to

play08:16

find myself in roughly?

play08:17

- K?

play08:18

- Okay. So, I'm in the K section.

play08:20

Is John going to be to the left or to the right?

play08:21

- To the left.

play08:22

- Yeah.

play08:23

So, John is going to be to the left or the right

play08:24

and what we can do here, though your phone

play08:25

does something smarter, is tear the problem in half,

play08:27

throw half of the problem away,

play08:30

being left with just 500 pages now.

play08:32

But what might I next do?

play08:34

I could sort of naively just start at the beginning again,

play08:36

but we've learned to do better.

play08:38

I can go roughly to the middle here.

play08:39

- And you can do it again. - Yeah, exactly.

play08:41

So, now maybe I'm in the E section,

play08:43

which is a little to the left.

play08:45

So, John is clearly going to be to the right,

play08:47

so I can again tear the problem poorly in half,

play08:51

throw this half of the problem away,

play08:53

and I claim now that if we started with a thousand pages,

play08:56

now we've gone to 500, 250,

play08:57

now we're really moving quickly.

play08:59

- Yeah.

play09:00

- [David] And so, eventually I'm hopefully dramatically

play09:01

left with just one single page

play09:04

at which point John is either on that page

play09:06

or not on that page, and I can call him.
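
The tear-it-in-half approach is commonly known as binary search. A minimal sketch follows, assuming a sorted list of names; the data and the function name are illustrative. Instead of tearing pages, the code narrows a `low`/`high` range.

```python
def binary_search(names, target):
    """Repeatedly halve a sorted list; return the target's index or -1."""
    low, high = 0, len(names) - 1
    while low <= high:
        middle = (low + high) // 2
        if names[middle] == target:
            return middle
        if names[middle] < target:
            low = middle + 1   # target is to the right: discard the left half
        else:
            high = middle - 1  # target is to the left: discard the right half
    return -1


names = ["Alice", "Bob", "Carol", "Dave", "John", "Kim", "Zoe"]
print(binary_search(names, "John"))  # 4
```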

play09:08

Roughly how many steps might this third algorithm take

play09:12

if I started with a thousand pages

play09:14

then went to 500, 250, 125,

play09:17

how many times can you divide 1,000 in half? Maybe?

play09:21

- 10.

play09:21

- That's roughly 10.

play09:22

Because in the first algorithm,

play09:23

looking again for someone like Zoe in the worst case

play09:26

might have to go all the way through a thousand pages.

play09:28

But the second algorithm you said was 500,

play09:30

maybe 501, essentially the same thing.

play09:32

So, twice as fast.

play09:34

But this third and final algorithm is sort of fundamentally

play09:37

faster because you're sort of dividing and conquering it

play09:40

in half, in half, in half,

play09:41

not just taking one or two bites out of it at a time.
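
The three step counts quoted above (1,000, about 501, and roughly 10) can be checked with a few lines of arithmetic. This is a sketch under simple assumptions: one step per page looked at, and the helper name `worst_case_steps` is made up for the example.

```python
def worst_case_steps(pages):
    """Worst-case step counts for the three phone book algorithms."""
    one_at_a_time = pages
    two_at_a_time = pages // 2 + 1  # half the pages, plus one step to double back

    # Count how many times the book can be halved until one page is left.
    halvings = 0
    remaining = pages
    while remaining > 1:
        remaining = (remaining + 1) // 2  # keep the larger half in the worst case
        halvings += 1
    return one_at_a_time, two_at_a_time, halvings


print(worst_case_steps(1000))  # (1000, 501, 10)
```

Doubling the book to 2,000 pages doubles the first two counts but adds only one more halving, which is why the third algorithm is fundamentally faster.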

play09:45

So, this of course is not how we used to use phone books

play09:47

back in the day since otherwise they'd be single use only.

play09:49

But it is how your phone is actually searching for Zoe,

play09:53

for John, for anyone else, but it's doing it in software.

play09:56

- Oh, that's cool.

play09:57

- So, here we've happened to focus on searching algorithms,

play09:59

looking for John in the phone book.

play10:00

But the technique we just used

play10:01

can indeed be called divide and conquer,

play10:03

where you take a big problem and you divide and conquer it,

play10:06

that is you try to chop it up into smaller,

play10:08

smaller, smaller pieces.

play10:09

A more sophisticated type of algorithm,

play10:11

at least depending on how you implement it,

play10:13

is something known as a recursive algorithm.

play10:15

A recursive algorithm is essentially an algorithm

play10:18

that uses itself to solve the exact same problem

play10:21

again and again, but chops it smaller, and smaller,

play10:24

and smaller ultimately.
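
As a sketch of that idea, here is the phone book search written recursively: the function calls itself on a smaller range each time until nothing is left. The names and data are illustrative assumptions, and the list must already be sorted.

```python
def recursive_search(names, target, low=0, high=None):
    """Search a sorted list by calling this same function on a smaller range."""
    if high is None:
        high = len(names) - 1
    if low > high:
        return -1  # base case: nothing left to search
    middle = (low + high) // 2
    if names[middle] == target:
        return middle
    if names[middle] < target:
        # Same problem, smaller input: only the right half remains.
        return recursive_search(names, target, middle + 1, high)
    # Same problem, smaller input: only the left half remains.
    return recursive_search(names, target, low, middle - 1)


names = ["Alice", "Bob", "Carol", "Dave", "John", "Kim", "Zoe"]
print(recursive_search(names, "Zoe"))  # 6
```

The base case matters: without the `low > high` check, a search for a missing name would never stop.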

play10:26

[upbeat music]

play10:29

- Hi, my name's Patricia.

play10:30

- Patricia, nice to meet you.

play10:31

Where are you a student at?

play10:32

- I am starting my senior year now at NYU.

play10:34

- Oh nice. And what have you been studying

play10:35

the past few years?

play10:36

- I studied computer science and data science.

play10:38

- If you were chatting with a non-CS,

play10:40

non-data science friend of yours,

play10:41

how would you explain to them what an algorithm is?

play10:44

- Some kind of systematic way of solving a problem,

play10:47

or like a set of steps to kind of solve

play10:50

a certain problem you have.

play10:52

- So, you probably recall learning topics

play10:54

like binary search versus linear search, and the like.

play10:56

- Yeah.

play10:57

So, I've come here complete with an

play10:58

actual chalkboard with some magnetic numbers on it here.

play11:02

How would you tell a friend to sort these?

play11:04

- I think one of the first things we learned was

play11:07

something called bubble sort.

play11:08

It was kind of like focusing on smaller bubbles

play11:12

I guess I would say of the problem,

play11:14

like looking at smaller segments rather than

play11:16

the whole thing at once.

play11:17

- What is I think very true about what you're hinting at

play11:20

is that bubble sort really focuses on local, small problems

play11:25

rather than taking a step back trying to fix

play11:26

the whole thing, let's just fix the obvious problems

play11:29

in front of us. So, for instance, when we're trying to get

play11:30

from smallest to largest,

play11:32

and the first two things we see are eight followed by one,

play11:34

this looks like a problem 'cause it's out of order.

play11:37

So, what would be the simplest fix,

play11:39

the least amount of work we can do

play11:40

to at least fix one problem?

play11:41

- Just switch those two numbers

play11:43

'cause one is obviously smaller than eight.

play11:45

- Perfect. So, we just swap those two then.

play11:47

- You would switch those again.

play11:48

- Yeah, so that further improves the situation

play11:51

and you can kind of see it,

play11:52

that the one and the two are now in place.

play11:54

How about eight and six?

play11:55

- [Patricia] Switch it again.

play11:55

- Switch those again. Eight and three?

play11:57

- Switch it again.

play11:58

[fast forwarding]

play12:00

- And conversely now the one and the two are closer to,

play12:03

and coincidentally are exactly where we want them to be.

play12:07

So, are we done?

play12:08

- No.

play12:09

- Okay, so obviously not, but what could we do now

play12:12

to further improve the situation?

play12:14

- Go through it again but you don't need

play12:17

to check the last one anymore because we know

play12:19

that number is bubbled up to the top.

play12:21

- Yeah, because eight has indeed bubbled all the way

play12:23

to the top. So, one and two?

play12:25

- [Patricia] Yeah, keep it as is.

play12:26

- Okay, two and six?

play12:27

- [Patricia] Keep it as is.

play12:28

- Okay, six and three?

play12:29

- Then you switch it.

play12:30

- Okay, we'll switch or swap those.

play12:31

Six and four?

play12:32

- [Patricia] Swap it again.

play12:33

- Okay, so four, six and seven?

play12:35

- [Patricia] Keep it.

play12:36

- Okay. Seven and five?

play12:37

- [Patricia] Swap it.

play12:38

- Okay. And then I think per your point,

play12:39

we're pretty darn close.

play12:41

Let's go through once more.

play12:43

One and two? - [Patricia] Keep it.

play12:45

- Two three? - [Patricia] Keep it.

play12:46

- Three four? - [Patricia] Keep it.

play12:47

- Four six? - [Patricia] Keep it.

play12:48

- Six five?

play12:49

- [Patricia] And then switch it.

play12:49

- All right, we'll switch this. And now to your point,

play12:51

we don't need to bother with the ones

play12:53

that already bubbled their way up.

play12:54

Now we are a hundred percent sure it's sorted.
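
The passes walked through above can be sketched as a short bubble sort. The starting order `[8, 1, 2, 6, 3, 4, 7, 5]` is inferred from the swaps described in the conversation (part of the first pass is fast-forwarded), so treat it as an assumption. The sketch also skips the tail that has already bubbled up, as Patricia suggests.

```python
def bubble_sort(numbers):
    """Return a sorted copy by repeatedly swapping adjacent out-of-order pairs."""
    items = list(numbers)
    unsorted_end = len(items) - 1  # everything after this index is already in place
    while unsorted_end > 0:
        swapped = False
        for i in range(unsorted_end):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        if not swapped:
            break  # a full pass with no swaps means the list is sorted
        unsorted_end -= 1  # the largest remaining value has bubbled to the top
    return items


print(bubble_sort([8, 1, 2, 6, 3, 4, 7, 5]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```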

play12:56

- Yeah.

play12:57

- And certainly the search engines of the world,

play12:58

Google and Bing and so forth,

play12:59

they probably don't keep webpages in sorted order

play13:03

'cause that would be a crazy long list

play13:04

when you're just trying to search the data.

play13:06

But there's probably some algorithm underlying what they do

play13:08

and they probably similarly, just like we,

play13:10

do a bit of work upfront to get things organized

play13:14

even if it's not strictly sorted in the same way

play13:16

so that people like you and me and others

play13:18

can find that same information.

play13:20

So, how about social media?

play13:22

Can you envision where the algorithms are in that world?

play13:25

- Maybe for example like TikTok, like the For You page,

play13:28

'cause those are like recommendations, right?

play13:31

It's sort of like Netflix recommendations

play13:33

except more constant because it's just every video

play13:36

you scroll, it's like that's a new recommendation basically.

play13:39

And it's based on what you've liked previously,

play13:41

what you've saved previously, what you search up.

play13:43

So, I would assume there's some kind of algorithm there

play13:45

kind of figuring out like what to put on your For You page.

play13:48

- Absolutely. Just trying to keep you presumably

play13:50

more engaged.

play13:51

So, the better the algorithm is,

play13:52

the better your engagement is,

play13:54

maybe the more money the company then makes on the platform

play13:57

and so forth.

play13:58

So, it all sort of feeds together.

play13:59

But what you're describing really is more

play14:00

artificially intelligent, if I may,

play14:03

because presumably there's not someone at TikTok

play14:05

or any of these social media companies saying,

play14:07

"If Patricia likes this post, then show her this post.

play14:10

If she likes this post, then show her this other post."

play14:13

Because the code would sort of grow infinitely long

play14:15

and there's just way too much content for a programmer

play14:17

to be having those kinds of conditionals,

play14:20

those decisions being made behind the scenes.

play14:23

So, it's probably a little more artificially intelligent.

play14:26

And in that sense you have topics like neural networks,

play14:29

and machine learning which really describe

play14:31

taking as input things like what you watch,

play14:33

what you click on, what your friends watch,

play14:35

what they click on, and sort of trying to infer

play14:37

from that instead, what should we show Patricia

play14:39

or her friends next?
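
The contrast with hand-written conditionals can be illustrated with a toy sketch. It is not how TikTok, Netflix, or any real platform works: the tag-counting score, the video titles, and the function name `recommend` are all assumptions made for the example. The point is only that the ranking is inferred from what was liked, with no per-video `if` statements.

```python
from collections import Counter


def recommend(liked_videos, candidates, top_n=2):
    """Rank candidates by how often their tags appear among liked videos."""
    tag_weights = Counter(tag for tags in liked_videos.values() for tag in tags)

    def score(item):
        _, tags = item
        return sum(tag_weights[tag] for tag in tags)

    ranked = sorted(candidates.items(), key=score, reverse=True)
    return [title for title, _ in ranked[:top_n]]


# Hypothetical viewing history and candidate videos
liked = {"cat tricks": ["cats", "funny"], "kitten naps": ["cats", "cute"]}
candidates = {
    "cat fails": ["cats", "funny"],
    "stock tips": ["finance"],
    "puppy naps": ["dogs", "cute"],
}
print(recommend(liked, candidates))  # ['cat fails', 'puppy naps']
```

Real systems learn far richer signals with neural networks, but the shape is similar: behavior goes in as input and a ranking comes out.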

play14:41

- Oh, okay. Yeah. Yeah.

play14:42

That makes like the distinction more...

play14:44

Makes more sense now.

play14:45

- Nice. - Yeah.

play14:46

[upbeat music]

play14:49

- I am currently a fourth year PhD student at NYU.

play14:52

I do robot learning, so that's half and half

play14:55

robotics and machine learning.

play14:56

- Sounds like you've dabbled with quite a few algorithms.

play14:59

So, how does one actually research algorithms

play15:01

or invent algorithms?

play15:02

- The most important way is just trying to think about

play15:04

inefficiencies, and also think about connecting threads.

play15:07

The way I think about it is that algorithm for me

play15:11

is not just about the way of doing something,

play15:12

but it's about doing something efficiently.

play15:14

Learning algorithms are practically everywhere now.

play15:17

Google, I would say for example,

play15:18

is learning every day about like,

play15:21

"Oh what articles, what links might be better than others?"

play15:24

And re-ranking them.

play15:25

There are recommender systems all around us, right?

play15:29

Like content feeds and social media,

play15:31

or you know, like YouTube or Netflix.

play15:34

What we see is in large part determined by these kinds of

play15:37

learning algorithms.

play15:39

- Nowadays there's a lot of concerns

play15:40

around some applications of machine learning

play15:43

like deep fakes where it can kind of learn how I talk

play15:46

and learn how you talk and even how we look,

play15:48

and generate videos of us.

play15:50

We're doing this for real, but you could imagine

play15:52

a computer synthesizing this conversation eventually.

play15:54

- Right.

play15:55

- But how does it even know what I sound like

play15:57

and what I look like, and how to replicate that?

play15:59

- All of these learning algorithms that we talk about, right?

play16:01

A lot, like what goes in there is just

play16:05

lots and lots of data.

play16:06

So, data goes in, something else comes out.

play16:08

What comes out is whatever objective function

play16:10

that you optimize for.
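
A minimal sketch of "data goes in, and what comes out depends on the objective function": the code below fits a single number `w` so that `w * x` matches the data, using gradient descent on mean squared error. The choice of objective, the learning rate, the step count, and the data are all assumptions for illustration, not anything specified in the conversation.

```python
def fit_slope(xs, ys, learning_rate=0.01, steps=1000):
    """Find w minimizing the mean of (w * x - y) ** 2 by gradient descent."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Derivative of the mean squared error with respect to w.
        gradient = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * gradient  # step downhill on the objective
    return w


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # made-up data where y is exactly 2 * x
print(round(fit_slope(xs, ys), 3))  # 2.0
```

Changing the objective changes what is learned, even with the same data.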

play16:11

- Where is the line between algorithms that

play16:14

play games with and without AI?

play16:17

- I think when I started off my undergrad,

play16:19

AI and machine learning

play16:21

were not very much synonymous.

play16:23

- Okay.

play16:24

- And even in my undergraduate, in the AI class,

play16:27

they learned a lot of classical algorithms for gameplay.

play16:29

Like for example, A* search, right?

play16:31

That's a very simple example of how you can play a game

play16:34

without having anything learned.
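
For readers who have not met A* search, here is a compact sketch on a small grid maze rather than a real game. The grid layout, the unit move cost, and the Manhattan-distance heuristic are assumptions chosen for the example. Nothing is learned from data: the algorithm simply explores the most promising positions first.

```python
import heapq


def a_star(grid, start, goal):
    """Return the length of the shortest path on a grid, or -1 if none exists.

    grid is a list of equal-length strings where '#' marks a wall. Moves go
    up, down, left, or right and cost 1 each.
    """
    def heuristic(cell):
        # Manhattan distance never overestimates on this kind of grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(heuristic(start), 0, start)]  # (estimated total, cost so far, cell)
    best_cost = {start: 0}
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cost > best_cost[cell]:
            continue  # stale entry: a cheaper route to this cell was found later
        if cell == goal:
            return cost
        row, col = cell
        for r, c in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
                continue
            if grid[r][c] == "#":
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get((r, c), float("inf")):
                best_cost[(r, c)] = new_cost
                heapq.heappush(frontier, (new_cost + heuristic((r, c)), new_cost, (r, c)))
    return -1


maze = [
    "..#.",
    "..#.",
    "....",
]
print(a_star(maze, (0, 0), (0, 3)))  # 7
```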

play16:37

This is very much, oh you are at a game state,

play16:40

you just search down, see what are the possibilities

play16:43

and then you pick the best possibility that it can see,

play16:46

versus what you think about when you think about,

play16:48

ah yes, gameplay like AlphaZero for example,

play16:53

or AlphaStar, or there are a lot of, you know,

play16:55

like fancy new machine learning agents that are

play16:58

even learning very difficult games like Go.

play17:01

And those are learned agents, as in they are getting better

play17:05

as they play more and more games.

play17:07

And as they get more games, they kind of

play17:09

refine their strategy based on the data that they've seen.

play17:12

And once again, this high level abstraction

play17:15

is still the same.

play17:16

You see a lot of data and you'll learn from that.

play17:18

But the question is what is the objective function

play17:21

that you're optimizing for?

play17:22

Is it winning this game?

play17:23

Is it forcing a tie or is it, you know,

play17:26

opening a door in a kitchen?

play17:27

- So, if the world is very much focused on supervised,

play17:30

unsupervised, and reinforcement learning now,

play17:32

what comes next five, ten years, where is the world going?

play17:35

- I think that this is just going to be more and more,

play17:39

I don't want to use the word encroachment,

play17:41

but that's what it feels like of algorithms

play17:43

into our everyday life.

play17:44

Like even when I was taking the train here, right?

play17:46

The trains are being routed with algorithms,

play17:48

but this has existed for you know, like 50 years probably.

play17:52

But as I was coming here, as I was checking my phone,

play17:54

those are different algorithms,

play17:56

and you know, they're kind of getting all around us,

play17:59

getting there with us all the time.

play18:01

They're making our life better most places, most cases.

play18:05

And I think that's just going to be a continuation

play18:07

of all of those.

play18:08

- And it feels like they're even in places

play18:09

you wouldn't expect, and there's just so much data

play18:11

about you and me and everyone else online

play18:13

and this data is being mined and analyzed,

play18:15

and influencing things we see and hear it would seem.

play18:18

So, there is sort of a counterpoint which might be good

play18:20

for the marketers, but not necessarily good for you and me

play18:22

as individuals.

play18:23

- We are human beings, but for someone

play18:26

we might be just a pair of eyes who are

play18:28

carrying a wallet, and are there to buy things.

play18:32

But there is so much more potential for these algorithms

play18:35

to just make our life better without

play18:37

changing much about our life.

play18:39

[upbeat music]

play18:43

- I'm Chris Wiggins. I'm an associate professor

play18:44

of Applied Mathematics at Columbia.

play18:46

I'm also the chief data scientist of the New York Times.

play18:48

The data science team at the New York Times

play18:49

develops and deploys machine learning

play18:51

for newsroom and business problems.

play18:53

But I would say the things that we do mostly, you don't see,

play18:55

but it might be things like personalization algorithms,

play18:58

or recommending different content.

play19:00

- And do data scientists, which is rather distinct

play19:02

from the phrase computer scientists.

play19:04

Do data scientists still think in terms of algorithms

play19:06

as driving a lot of it?

play19:08

- Oh absolutely, yeah.

play19:09

In fact, so in data science in academia,

play19:11

often the role of the algorithm is

play19:13

the optimization algorithm that helps you find the best

play19:16

model or the best description of a data set.

play19:17

And in data science in industry, the goal,

play19:20

often it's centered around an algorithm

play19:22

which becomes a data product.

play19:24

So, a data scientist in industry might be

play19:26

developing and deploying the algorithm,

play19:28

which means not only understanding the algorithm

play19:30

and its statistical performance,

play19:32

but also all of the software engineering

play19:34

around systems integration, making sure that that algorithm

play19:37

receives input that's reliable and has output that's useful,

play19:41

as well as I would say the organizational integration,

play19:44

which is how does a community of people

play19:46

like the set of people working at the New York Times

play19:47

integrate that algorithm into their process?
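
As an illustration of the academic role described above, where an optimization algorithm finds the best model for a data set, here is a minimal sketch. It assumes the "model" is a straight line, the data are five made-up points, and "best" means lowest mean squared error. Plain gradient descent is used as one example of an optimization algorithm; it is not claimed to be what any particular team uses.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step downhill on the error surface.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data generated from the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

The learning rate and step count are assumptions chosen so this small example converges; real data would typically need them tuned.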

play19:50

- Interesting. And I feel like AI based startups

play19:52

are all the rage and certainly within academia.

play19:54

Are there connections between AI

play19:55

and the world of data science?

play19:57

- Oh, absolutely.

play19:57

- The algorithms that they're in,

play19:58

can you connect those dots for...

play19:59

- You're right that AI as a field has really exploded.

play20:01

I would say particularly many people experienced a chatbot

play20:05

that was really, really good.

play20:06

Today, when people say AI,

play20:08

they're often thinking about large language models,

play20:10

or they're thinking about generative AI,

play20:12

or they might be thinking about a chatbot.

play20:14

One thing to keep in mind is a chatbot is a special case

play20:17

of generative AI, which is a special case of using

play20:19

large language models, which is a special case of using

play20:21

machine learning generally,

play20:23

which is what most people mean by AI.

play20:25

You may have moments that are what John McCarthy called,

play20:28

"Look Ma, no hands," results,

play20:30

where you do some fantastic trick and you're not quite sure

play20:32

how it worked.

play20:33

I think it's still very much early days.

play20:34

Large language models are still at the point of

play20:37

what might be called alchemy, in that people are building

play20:39

large language models without a real clear,

play20:41

a priori sense of what the right design is

play20:43

for a right problem.

play20:45

Many people are trying different things out,

play20:46

often in large companies where they can afford

play20:48

to have many people trying things out,

play20:49

seeing what works, publishing that,

play20:52

instantiating it as a product.

play20:53

- And that itself is part of the scientific process

play20:55

I would think too.

play20:56

- Yeah, very much. Well, science and engineering,

play20:58

because often you're building a thing

play20:59

and the thing does something amazing.

play21:01

To a large extent we are still looking for

play21:04

basic theoretical results around why

play21:06

deep neural networks generally work.

play21:09

Why are they able to learn so well?

play21:11

They're huge, billion-parameter models

play21:13

and it's difficult for us to interpret

play21:15

how they're able to do what they do.

play21:17

- And is this a good thing, do you think?

play21:19

Or an inevitable thing that we, the programmers,

play21:21

we, the computer scientists, the data scientists

play21:22

who are inventing these things,

play21:24

can't actually explain how they work?

play21:27

Because I feel like friends of mine in industry,

play21:28

even when it's something simple and relatively familiar

play21:31

like autocomplete, they can't actually tell me

play21:33

why that name is appearing at the top of the list.

play21:35

Whereas years ago when these algorithms were more

play21:38

deterministic and more procedural,

play21:39

you could even point to the line that made that name

play21:42

bubble up to the top. - [Chris] Absolutely.

play21:44

- So, is this a good thing, a bad thing,

play21:45

that we're sort of losing control perhaps in some sense

play21:47

of the algorithm?

play21:48

- It has risks.

play21:49

I don't know that I would say that it's good or bad,

play21:51

but I would say there's lots of scientific precedent.

play21:53

There are times when an algorithm works really well

play21:55

and we have finite understanding of why it works

play21:57

or a model works really well

play21:59

and sometimes we have very little understanding

play22:01

of why it works the way it does.

play22:03

- In classes I teach, I certainly spend a lot of time on

play22:04

fundamentals, algorithms that have been taught in classes

play22:07

for decades now, whether it's binary search,

play22:09

linear search, bubble sort, selection sort or the like,

play22:12

but if we're already at the point where I can pull up

play22:15

ChatGPT, copy-paste a whole bunch of numbers or words

play22:18

and say, "Sort these for me,"

play22:20

does it really matter how ChatGPT is sorting it?

play22:24

Does it really matter to me as the user

play22:25

how the software is sorting it?

play22:27

Do these fundamentals become more dated and less important

play22:30

do you think?
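
For readers who have not met the fundamentals named above, here is a minimal sketch of two of them, linear search and binary search. The function names and the comparison counters are illustrative additions, included to show why the choice of algorithm can matter even when the answer is the same.

```python
def linear_search(items, target):
    """Check each item in order. Returns (index, comparisons made)."""
    for comparisons, value in enumerate(items, start=1):
        if value == target:
            return comparisons - 1, comparisons
    return -1, len(items)


def binary_search(items, target):
    """Repeatedly halve a sorted list. Returns (index, comparisons made)."""
    lo, hi = 0, len(items) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid, steps
        if items[mid] < target:
            lo = mid + 1   # target can only be in the upper half
        else:
            hi = mid - 1   # target can only be in the lower half
    return -1, steps


# A sorted list of 1,024 numbers; look for the last one.
items = list(range(1024))
print(linear_search(items, 1023))
print(binary_search(items, 1023))
```

Both functions find the same index, but on this list the linear search looks at every element while the binary search needs only about a dozen checks. Binary search relies on the list already being sorted.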

play22:30

- Now you're talking about the ways in which code

play22:33

and computation is a special case of technology, right?

play22:36

So, for driving a car, you may not necessarily need

play22:39

to know much about organic chemistry,

play22:41

even though the organic chemistry is how the car works.

play22:45

So, you can drive the car and use it in different ways

play22:47

without understanding much about the fundamentals.

play22:49

So, similarly with computation, we're at a point

play22:51

where the computation is so high level, right?

play22:54

You can import scikit-learn and you can go from zero

play22:56

to machine learning in 30 seconds.

play22:58

It's depending on what level you want to understand

play23:00

the technology, where in the stack, so to speak,

play23:03

it's possible to understand it and make wonderful things

play23:05

and advance the world without understanding it

play23:08

at the particular level of somebody who actually might have

play23:10

originally designed the actual optimization algorithm.

play23:13

I should say though, for many of the optimization

play23:15

algorithms, there are cases where an algorithm

play23:16

works really well and we publish a paper,

play23:18

and there's a proof in the paper,

play23:21

and then years later people realize

play23:22

actually that proof was wrong and we're really

play23:23

still not sure why that optimization works,

play23:25

but it works really well or it inspires people

play23:28

to make new optimization algorithms.

play23:29

So, I do think that the goal of understanding algorithms

play23:34

is loosely coupled to our progress

play23:36

in advancing great algorithms, but they don't always

play23:39

necessarily have to require each other.

play23:40

- And for those students especially,

play23:42

or even adults who are thinking of now steering into

play23:44

computer science, into programming,

play23:46

who were really jazzed about heading in that direction

play23:49

up until, for instance, November of 2022,

play23:51

when all of a sudden for many people

play23:53

it looked like the world was now changing

play23:55

and now maybe this isn't such a promising path,

play23:58

this isn't such a lucrative path anymore.

play23:59

Are LLMs, are tools like ChatGPT, reason not to perhaps

play24:04

steer into the field?

play24:05

- Large language models are a particular architecture

play24:06

for predicting, let's say the next word,

play24:09

or a set of tokens more generally.

play24:11

The algorithm comes in when you think about

play24:13

how is that LLM to be trained or also how to be fine-tuned.

play24:18

So, the P of GPT is a pre-trained algorithm.

play24:22

The idea is that you train a large language model

play24:25

on some corpus of text, could be encyclopedias,

play24:28

or textbooks, or what have you.

play24:30

And then you might want to fine-tune that model

play24:34

around some particular task or

play24:36

some particular subset of texts.

play24:38

So, both of those are examples of training algorithms.
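
The pre-train then fine-tune idea can be sketched with a deliberately tiny stand-in. This is not how a real large language model works: those are neural networks trained with gradient-based optimization, whereas the sketch below just counts which word follows which (a bigram model). The corpora, the function names, and the `weight` parameter are all made up for illustration.

```python
from collections import defaultdict


def new_model():
    """model[previous_word][next_word] -> accumulated count."""
    return defaultdict(lambda: defaultdict(float))


def train(model, text, weight=1.0):
    """Update next-word counts from a text. Used for both stages:
    pre-training on a general corpus and fine-tuning on a narrower one."""
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        model[prev][nxt] += weight
    return model


def predict_next(model, word):
    """Return the most frequently seen follower of `word`, or None."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return max(followers, key=followers.get)


# Stage 1: "pre-train" on a hypothetical general corpus.
general_corpus = "the cat sat on the mat . the cat ate the fish ."
model = train(new_model(), general_corpus)
before = predict_next(model, "the")

# Stage 2: "fine-tune" on a small task-specific text, weighted more heavily.
task_corpus = "the algorithm sorts the list . the algorithm halts ."
train(model, task_corpus, weight=3.0)
after = predict_next(model, "the")

print(before, after)
```

The same training routine runs in both stages; fine-tuning simply continues from the pre-trained counts on a narrower text, which shifts what the model predicts after "the".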

play24:41

So, I would say people's perception

play24:43

of artificial intelligence has really changed a lot

play24:45

in the last six months, particularly around November of 2022

play24:51

when people experienced a really good chatbot.

play24:53

The technology though had been around already before.

play24:55

Academics had already been working with GPT-3

play24:58

before that, and GPT-2 and GPT-1.

play25:01

And for many people it sort of opened up this conversation

play25:03

about what is artificial intelligence

play25:05

and what could we do with this?

play25:06

And what are the possible good and bad, right?

play25:08

Like any other piece of technology.

play25:10

Kranzberg's first law of technology,

play25:11

technology is neither good, nor bad, nor is it neutral.

play25:14

Every time we have some new technology,

play25:15

we should think about its capabilities

play25:17

and the good, and the possible bad.

play25:19

- [David] As with any area of study,

play25:21

algorithms offer a spectrum from the most basic

play25:23

to the most advanced.

play25:25

And even if right now, the most advanced of those algorithms

play25:28

feels out of reach because you just

play25:30

don't have that background,

play25:31

with each lesson you learn, with each algorithm you study,

play25:34

that end game becomes closer and closer

play25:36

such that it will, before long, be accessible to you

play25:39

and you will be at the most advanced end of that spectrum.


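Related tags
Algorithm principles, computer science, professor interview, Harvard University, data science, machine learning, AI applications, programming education, technological progress, future trends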