What Is an "Algorithm"? A Harvard Professor Teaches the Fundamentals and the Path Toward the Cutting Edge | 5 Levels | WIRED Japan

WIRED.jp
14 Jun 2024 · 25:47

Summary

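TLDR: This transcript explains why algorithms matter and how varied they are. David J. Malan, a professor of computer science, describes algorithms as opportunities to solve problems and gives examples of algorithms hidden in everyday life. It also touches on a computer's internal components, such as the CPU and RAM, and on more advanced topics including sorting algorithms, search algorithms, machine learning, and deep learning. Finally, it discusses how algorithms are researched and developed and the ethical questions that come with them.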

Takeaways

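  • 😀 David J. Malan, a professor of computer science at Harvard University, explains why algorithms matter.
  • 😀 Algorithms are widely used to solve problems, in the virtual world as well as the physical one.
  • 😀 He describes a computer's basic hardware components: the CPU, RAM, and the hard drive.
  • 😀 An algorithm is defined as a list of step-by-step instructions for solving a particular problem.
  • 😀 Malan and a child build an algorithm together for making a peanut butter sandwich.
  • 😀 He stresses that precision matters in an algorithm: the instructions have to be exact.
  • 😀 He shows how binary search finds a name in a phone book.
  • 😀 He explains "divide and conquer": splitting a problem into smaller pieces in order to solve it.
  • 😀 Students and researchers discuss how algorithms are invented and researched.
  • 😀 The video describes how modern AI and machine learning algorithms have worked their way into daily life.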

Q & A

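  • Explain what an algorithm is.

    -An algorithm is a sequence of steps or procedures for solving a problem. A bedtime routine and the process of making a sandwich are both examples.

  • What does a computer's CPU refer to?

    -The CPU, or central processing unit, is the computer's "brain": the piece of hardware that knows how to respond to instructions. It carries out basic operations such as addition and subtraction, or moving up, down, left, or right.

  • How do a computer's memory and RAM relate to each other?

    -RAM (random access memory) is one type of computer memory: it is where programs and games are stored while they are being used. A hard drive or solid-state drive is where data stays permanently, so it is not lost even if the power goes out.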

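  • What is the basic idea behind the "bubble sort" algorithm?

    -Bubble sort focuses on small, local problems. It compares adjacent elements and swaps them if they are out of order; repeating these passes gradually puts the whole list in order, here from smallest to largest. Each pass "bubbles" the largest remaining value to the end, so later passes no longer need to check it.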
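As a minimal sketch of the bubble sort idea described in this answer, the following Python function repeatedly swaps adjacent out-of-order values. The function name `bubble_sort` and the variable `unsorted_end` are illustrative choices, and the starting order of the numbers is inferred from the swaps spoken in the transcript rather than stated outright.

```python
def bubble_sort(values):
    """Return a new list sorted from smallest to largest using bubble sort."""
    items = list(values)        # copy, so the caller's list is left untouched
    unsorted_end = len(items)   # items[unsorted_end:] are already in final position
    while unsorted_end > 1:
        swapped = False
        for i in range(unsorted_end - 1):
            # Fix one small, local problem: an adjacent pair that is out of order.
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        # The largest remaining value has bubbled to the end of the unsorted part.
        unsorted_end -= 1
        if not swapped:
            break               # a full pass with no swaps means the list is sorted
    return items


# Starting order inferred from the chalkboard example in the transcript.
print(bubble_sort([8, 1, 2, 6, 3, 4, 7, 5]))
```

Shrinking `unsorted_end` after every pass mirrors the student's remark that the last number no longer needs checking once it has bubbled to the top.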

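  • What is the advantage of the "binary search" algorithm?

    -Binary search uses "divide and conquer": it splits a large, sorted problem in half and throws away the half that cannot contain the target. In the video's phone book example, 1,000 pages shrink to 500, then 250, and so on, so the search takes roughly 10 steps instead of up to 1,000.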
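The halving described in this answer can be sketched in Python as follows. This is an illustrative implementation, not the code any phone actually runs; the function name `binary_search` and the sample names are assumptions made for the example.

```python
def binary_search(sorted_names, target):
    """Search an alphabetized list; return (index, steps), with index -1 if absent."""
    low, high = 0, len(sorted_names) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2          # open the "book" roughly in the middle
        if sorted_names[mid] == target:
            return mid, steps
        if sorted_names[mid] < target:
            low = mid + 1                # target is to the right: discard the left half
        else:
            high = mid - 1               # target is to the left: discard the right half
    return -1, steps


names = ["Alice", "Bob", "Emma", "John", "Kate", "Zoe"]   # hypothetical contacts
print(binary_search(names, "John"))                       # prints (3, 3)
```

The list has to be sorted first, which is the reason the transcript gives for alphabetizing contacts.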

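  • What does "recursive algorithm" mean?

    -A recursive algorithm is one that calls itself to solve the same problem again and again on smaller and smaller pieces, until a piece is small enough to solve directly. Combined with divide and conquer, splitting a problem into smaller subproblems this way can make the solution efficient.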
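To make the idea of an algorithm that "uses itself" concrete, here is a hedged sketch of the phone book search written recursively in Python. The name `find_page` and its parameters are hypothetical; the video describes recursion only in words.

```python
def find_page(pages, target, low=0, high=None):
    """Recursively search the sorted list `pages`; return the index of target or -1."""
    if high is None:
        high = len(pages) - 1
    if low > high:
        return -1                        # base case: no pages left to search
    mid = (low + high) // 2
    if pages[mid] == target:
        return mid                       # base case: found it
    if pages[mid] < target:
        # Same problem, smaller input: search only the right half.
        return find_page(pages, target, mid + 1, high)
    # Same problem, smaller input: search only the left half.
    return find_page(pages, target, low, mid - 1)


print(find_page(["Alice", "Bob", "Emma", "John", "Kate", "Zoe"], "John"))  # prints 3
```

The two base cases are what stop the function from calling itself forever.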

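  • Explain how TikTok's "For You" page, cited as an example of an algorithm in social media, works.

    -As described in the video, the "For You" page uses an algorithm that recommends new posts based on what a user has previously liked, saved, or searched for. The presumed goal is to keep users more engaged.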
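The following toy Python sketch illustrates the general idea of ranking new posts by a user's past likes. It is not TikTok's system, which the video does not describe in detail; the post structure, the `tags` field, and the `recommend` function are all assumptions invented for illustration.

```python
from collections import Counter


def recommend(liked_posts, candidate_posts, top_n=2):
    """Rank candidates by how often their tags appear in posts the user liked."""
    # Infer the user's interests from data instead of hand-written if/else rules.
    tag_weights = Counter(tag for post in liked_posts for tag in post["tags"])

    def score(post):
        # A Counter returns 0 for tags the user has never liked.
        return sum(tag_weights[tag] for tag in post["tags"])

    ranked = sorted(candidate_posts, key=score, reverse=True)
    return [post["id"] for post in ranked[:top_n]]


liked = [{"id": 1, "tags": ["cats", "music"]}, {"id": 2, "tags": ["cats"]}]
candidates = [
    {"id": "a", "tags": ["cooking"]},
    {"id": "b", "tags": ["cats"]},
    {"id": "c", "tags": ["music", "cats"]},
]
print(recommend(liked, candidates))  # prints ['c', 'b']
```

Real recommender systems learn from far richer signals, as the transcript notes when it moves on to neural networks and machine learning.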

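  • What is the difference between machine learning algorithms and statistical algorithms?

    -Machine learning algorithms learn from data and improve their performance, while statistical algorithms are optimized to find the best model for a particular dataset. Machine learning refers to learning from data in a broader sense, whereas statistical algorithms focus on a specific dataset.

  • What kind of technique is "divide and conquer"?

    -Divide and conquer splits a large problem into smaller ones, solves each subproblem, and then combines the results into a solution to the original problem. It is widely used to improve efficiency.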
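The split, solve, and combine pattern in this answer can be sketched with merge sort. Merge sort is not mentioned in the video; it is used here only as a standard textbook instance of divide and conquer, and the function names are illustrative.

```python
def merge_sort(values):
    """Sort a list by splitting it, sorting each half, and merging the results."""
    if len(values) <= 1:
        return list(values)              # a list of 0 or 1 items is already sorted
    mid = len(values) // 2
    left = merge_sort(values[:mid])      # divide: solve the left half
    right = merge_sort(values[mid:])     # divide: solve the right half
    return merge(left, right)            # combine the two partial solutions


def merge(left, right):
    """Combine two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])              # at most one of these two has items left
    merged.extend(right[j:])
    return merged


print(merge_sort([8, 1, 2, 6, 3, 4, 7, 5]))
```

Unlike binary search, which discards half of the problem, merge sort has to solve both halves and then combine them.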

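  • What matters most in researching and developing algorithms?

    -In algorithm research and development, what matters most is efficiency and finding the best way to solve a problem. It is also important to understand an algorithm's theoretical foundations: how it works and why it is effective.

  • As algorithms are applied in more areas, how is personal privacy affected?

    -As algorithms are applied more widely, more personal data is collected and analyzed, and concerns about privacy are growing. This can benefit marketers while subjecting individuals to unwanted targeting and intrusive advertising.

  • How important is understanding algorithms in light of recent advances in AI?

    -Understanding algorithms remains important as AI advances. Algorithms are the foundation of AI, and understanding advanced techniques such as machine learning and deep learning requires knowledge of basic algorithms.

  • What is the "black box" problem of algorithms?

    -The "black box" problem refers to a situation in which an algorithm's internal workings or decision process cannot be understood. With complex algorithms such as deep learning in particular, it can be hard to explain why a model produces a specific result.

  • What should you watch out for when applying "divide and conquer"?

    -When applying divide and conquer, the problem must be split appropriately, each subproblem must be solvable on its own, and the partial results must be combined correctly. The choice of how to split and how to recombine also needs care.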

Outlines

00:00

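👨‍🏫 Fundamentals and importance of algorithms

David J. Malan, a professor of computer science at Harvard University, explains algorithms at five levels of increasing difficulty. Algorithms represent opportunities to solve problems, in the virtual world as well as the physical one. A computer is an electronic device in which the CPU and RAM play key roles. An algorithm is a set of step-by-step instructions, illustrated here with a bedtime routine and with making a sandwich for lunch. Precision is very important, and web search is given as one example.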

05:00

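🔍 The evolution of search algorithms

As an application of algorithms, the video discusses how to search a phone book. The first algorithm is a linear search that checks one page at a time. Going two pages at a time is about twice as fast, provided the search doubles back one page in case the name was skipped. Binary search is then introduced: it repeatedly splits the problem in half, which solves it far more efficiently. The same technique is used for contact search on modern phones.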
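The first two phone book strategies can be sketched in Python as follows, counting how many pages each one looks at. The function names and the use of page numbers in place of names are assumptions made for the example; the video walks through these strategies verbally only.

```python
def linear_search(pages, target):
    """Check every page in order; return (index, pages_checked)."""
    for checked, name in enumerate(pages, start=1):
        if name == target:
            return checked - 1, checked
    return -1, len(pages)


def two_at_a_time_search(pages, target):
    """Check every second page of a sorted book, doubling back one page if needed."""
    checked = 0
    i = 0
    while i < len(pages):
        checked += 1
        if pages[i] == target:
            return i, checked
        if pages[i] > target:
            # Overshot: the target can only be on the page that was skipped.
            if i > 0:
                checked += 1
                if pages[i - 1] == target:
                    return i - 1, checked
            return -1, checked
        i += 2
    # Ran off the end: with an even page count, the last page was skipped.
    if pages and len(pages) % 2 == 0:
        checked += 1
        if pages[-1] == target:
            return len(pages) - 1, checked
    return -1, checked


book = list(range(1000))                  # stand-in for 1,000 alphabetized pages
print(linear_search(book, 999))           # prints (999, 1000)
print(two_at_a_time_search(book, 999))    # prints (999, 501)
```

The worst cases of 1,000 and 501 page checks match the figures quoted in the transcript, and both grow in proportion to the size of the book.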

10:02

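🤖 Recursive algorithms and sorting

A recursive algorithm is a more sophisticated kind of algorithm that uses itself to break a problem into smaller pieces and solve it. Bubble sort is presented as a sorting example: it fixes small, local problems one at a time until the whole list is in order. The section also touches on algorithms in machine learning and deep learning, with the recommendation system behind TikTok's "For You" page as an example of how algorithms have entered daily life.

15:03

🧠 Machine learning and algorithm research

Researching and developing algorithms is about finding inefficiencies and offering solutions to them. Machine learning has blended into daily life, as in Google's search engine and Netflix's recommendation system. Deep learning learns from huge amounts of data and has varied applications, from playing games to generating fake videos (deepfakes). Algorithm research pursues theoretical foundations while also leading to solutions for real problems.

20:04

📊 Data science and algorithms in practice

This section covers data science and algorithms in practice. Chris Wiggins, chief data scientist at The New York Times, applies machine learning to newsroom and business problems. Algorithms here act as optimization algorithms for finding the best model, and software engineering and system integration matter as well as statistical performance. Applications of algorithms are also spreading across AI startups and academia, creating connections between the two.

25:05

🤖 The evolution of AI and the role of algorithms

This section discusses the evolution of AI and the role of algorithms. Large language models involve algorithms for training and fine-tuning, and the understanding of their design and problem-solving ability is still evolving. As algorithms become more of a black box, however, it can also be said that full control is being lost. Understanding algorithms is closely tied to technical progress, and learners are encouraged to build from basic algorithms toward advanced ones.

🛠️ Technology's good and bad sides and the future of algorithms

This section discusses the good and bad sides of technology and considers the potential and the risks of algorithms. It points out that technology is not neutral and that new technology has both good and bad aspects. Learning algorithms progresses from basic to advanced levels, until the learner can understand and apply them. Algorithm research will continue to evolve along with technology.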

Keywords

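💡Algorithm

An algorithm is a sequence of steps or procedures for solving a problem, and a foundational concept in computer science. The video stresses that algorithms are everywhere, in the virtual world as well as the physical one, and that they represent opportunities to solve problems. A bedtime routine and the process of making a sandwich for lunch are described as kinds of algorithms.

💡Computer

A computer is an electronic device; the child in the video describes it as a rectangle you can type on. The video explains that a computer contains parts such as the CPU (central processing unit) and RAM (random access memory). The CPU is the hardware that responds to instructions and does math, and RAM is the memory where programs and games are stored while they are in use.

💡CPU

The CPU is the computer's "brain"; the technical term is short for "central processing unit". The video explains that the CPU responds to instructions, such as moving in a direction or doing addition and subtraction.

💡RAM

RAM is short for "random access memory" and is one form of computer memory. The video explains that RAM is where programs are stored temporarily while they are in use.

💡Hard drive

A hard drive or solid-state drive is where a computer's data is stored permanently. The video explains that the information is still there even if the power goes out.

💡Binary search

Binary search is an algorithm for searching efficiently in an array or list whose data is sorted, for example alphabetically. The video introduces it through the example of looking for John in a phone book and presents it as "divide and conquer": solving the problem by repeatedly splitting it in half.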
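The "roughly 10" steps quoted in the transcript for a 1,000-page phone book can be checked with a short Python sketch. The function name `halving_steps` is an illustrative choice, and rounding up at each halving is an assumption about how a leftover odd page is counted.

```python
def halving_steps(pages):
    """Count how many times `pages` must be halved (rounding up) to reach one page."""
    steps = 0
    while pages > 1:
        pages = (pages + 1) // 2   # keep the larger half when the count is odd
        steps += 1
    return steps


print(halving_steps(1000))         # prints 10
print(halving_steps(1_000_000))    # prints 20
```

Multiplying the size of the book by a thousand only doubles the number of steps, which is why the transcript calls this approach fundamentally faster.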

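💡Bubble sort

Bubble sort is an iterative process that compares adjacent elements and swaps them if they are out of order. The video demonstrates it by sorting a row of numbers and describes it as putting the whole list in order by focusing on small, local problems.

💡Deep learning

Deep learning is one type of algorithm used in artificial intelligence. The video describes it as a form of learning algorithm that learns patterns from large amounts of data and can then make new predictions and decisions.

💡Machine learning

Machine learning is the process by which a computer learns from data and gains the ability to make decisions and predictions. The video presents it as an application of algorithms, used for example in personalizing content at The New York Times and in the recommendation systems behind social media "For You" pages.

💡Optimization algorithm

An optimization algorithm is an algorithm designed to find the best solution to a particular problem. The video explains that in data science, optimization algorithms are used to find the model that best explains the data, and it stresses that theoretical understanding of algorithms and practical application go together.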
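As a small illustration of an optimization algorithm that finds the model that best explains some data, the Python sketch below fits a straight line by gradient descent on mean squared error. The choice of gradient descent, the function name `fit_line`, the step size, and the sample data are all assumptions; the video does not name a specific method.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = slope * x + intercept by gradient descent on mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the objective (mean squared error) for each parameter.
        grad_slope = sum(2 * (slope * x + intercept - y) * x for x, y in zip(xs, ys)) / n
        grad_intercept = sum(2 * (slope * x + intercept - y) for x, y in zip(xs, ys)) / n
        # Step a little way downhill on the objective.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Hypothetical data generated from y = 2x + 1.
slope, intercept = fit_line([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
print(round(slope, 3), round(intercept, 3))
```

The mean squared error plays the role of the objective function mentioned in the transcript: changing the objective changes what counts as the best model.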

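💡Learning agent

A learning agent is an algorithm that learns through experience and improves its strategy for a game or task. The video cites AlphaZero and AlphaStar as examples: these agents learn as they play and refine their strategies based on the data they have seen.

💡Data scientist

A data scientist is a specialist who combines statistics, computer science, and domain knowledge to draw insights from data. The video presents data scientists as people who develop algorithms and apply them to business and newsroom problems.

💡Large language model

A large language model is a machine learning model that is pretrained on large amounts of text and then fine-tuned for particular tasks or subsets of text. The video discusses large language models as a recent technology and describes their training and fine-tuning as an application of algorithms.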

Highlights

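Professor David J. Malan emphasizes that algorithms are everywhere, in both the physical and the virtual world, and that they represent opportunities to solve problems.

A computer is defined as an electronic device with hardware components such as a central processing unit (CPU) and random access memory (RAM).

An algorithm is a series of instructions for solving a problem, such as a bedtime routine or the steps for making a sandwich.

Precision is essential in algorithms; imprecise instructions can lead to the wrong result.

Search algorithms such as binary search show how divide and conquer locates information quickly.

A recursive algorithm solves smaller versions of the same problem by calling itself.

Sorting algorithms such as bubble sort improve the whole by fixing local problems step by step.

Social media algorithms such as TikTok's recommendation system deliver personalized content based on user behavior and preferences.

Learning algorithms are increasingly used in everyday life, for example in search engines and recommender systems.

Deepfake technology shows how learning algorithms can imitate human voices and appearance.

Machine learning is combined with classical algorithms, as in AlphaZero's application to Go.

Algorithm research involves looking for inefficiencies and connecting threads, as well as new approaches to optimization problems.

Data scientists use algorithms to optimize models and to solve newsroom and business problems.

Large language models (LLMs) are architectures that predict the next word; the algorithms are what train and fine-tune these models.

Although how large language models work is not fully understood, they perform well in practice.

Understanding and progress do not necessarily require each other; they are loosely coupled.

As technology develops, the need for basic algorithm knowledge may decrease, but algorithms remain the foundation of technical progress.

Algorithms span a wide range from basic to advanced; even if the most advanced algorithms are hard to understand now, they become more accessible as learning deepens.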

Transcripts

play00:00

hello world my name is David J. Malan and

play00:02

I'm a professor of computer science at

play00:04

Harvard University today I've been asked

play00:05

to explain algorithms in five levels of

play00:08

increasing difficulty algorithms are

play00:11

important because they really are

play00:12

everywhere not only in the physical

play00:14

world but certainly in the virtual world

play00:16

as well and in fact what excites me

play00:17

about algorithms is that they really

play00:19

represent an opportunity to solve

play00:21

problems and I dare say no matter what

play00:23

you do in life all of us have problems

play00:25

to

play00:27

solve so I'm a computer science

play00:29

professor so I spend a lot of time with

play00:31

computers how would you define a

play00:32

computer for them well a computer is

play00:36

electronic like a phone but it's um a

play00:39

rectangle and you like could type like

play00:43

tick tick tick and you work on it nice

play00:46

do you know any of the parts that are

play00:48

inside of a computer um no can I explain

play00:52

a couple of them to you yeah so like

play00:53

inside of every computer is some kind of

play00:56

brain and the technical term for that is

play00:58

CPU or Central processing unit and those

play01:01

are the pieces of Hardware that know how

play01:04

to respond to those instructions like

play01:06

moving up or down or left or right knows

play01:09

how to do math like addition and

play01:11

subtraction and then there's at least

play01:12

one other type of Hardware inside of a

play01:15

computer called memory or Ram if you've

play01:17

heard of this I know memory because you

play01:19

have to memorize stuff yeah exactly and

play01:21

computers have even different types of

play01:23

memory they have what's called Ram

play01:25

random access memory which is where your

play01:27

games where your programs are stored

play01:29

while they're being used but then it

play01:31

also has a a hard drive or a solid state

play01:33

drive which is where your data your high

play01:36

scores your documents once you start

play01:38

writing essays and and stories in the

play01:40

future stays stays permanently so even

play01:42

if the power goes out the computer can

play01:44

still remember that information it's

play01:46

still there because the computer can't

play01:49

just like delete all the words itself

play01:53

because your fingers can only do that

play01:56

like you have to use your finger to

play01:58

delete all the stuff exactly have you

play02:00

heard of an algorithm before um yes

play02:02

algorithm is a list of instructions to

play02:06

tell people what to do or like a robot

play02:09

what to do yeah exactly it's so it's

play02:11

just step-by-step instructions for doing

play02:14

something for solving a problem for yeah

play02:16

so like if you have a bedtime routine

play02:18

then first you say I get dressed I brush

play02:22

my teeth I read a little story and then

play02:25

I go to bed all right well how about

play02:27

another algorithm like um what do you

play02:28

tend to eat for for lunch any types of

play02:30

sandwiches you like uh I eat peanut

play02:33

butter let me get some supplies from the

play02:34

cupboard here so should we make an

play02:36

algorithm together for why don't we do

play02:39

it this way why don't we pretend like

play02:40

I'm a computer or maybe I'm a robot so I

play02:42

only understand your instructions and so

play02:45

I want you to feed me no pun intended

play02:47

an algorithm so step-by-step

play02:49

instructions for solving this problem

play02:51

but remember algorithms you have to be

play02:53

precise you have to give the right

play02:56

instructions the right instructions just

play02:58

do it for me so step step one was what

play03:01

open the bag okay opening the bag of

play03:03

bread stop grab the bread and put it on

play03:07

the plate grab the bread and put it on

play03:09

the

play03:10

plate take all the bread back and put it

play03:13

back in there so that's like an undo

play03:15

command little control Z okay take one

play03:19

bread and put it on the plate take the

play03:21

lid off the peanut butter okay take the

play03:23

lid off the peanut butter put the lid

play03:26

down okay take the knife take the knife

play03:29

put put the blade inside the peanut

play03:32

butter and spread the peanut butter on

play03:34

the bread I'm going to take out some

play03:36

peanut butter and I'm going to spread

play03:38

the peanut butter on the bread I put a

play03:41

lot of peanut butter on because I love

play03:42

peanut butter apparently I thought I was

play03:44

messing with you here but I think you're

play03:46

happy with this Put The Knife down and

play03:49

then grab one bread and put it on top of

play03:52

the second bread

play03:53

sideways

play03:56

sideways like put it flat on oh flat

play03:59

ways okay and now done you're done with

play04:01

your sandwich should we take a delicious

play04:03

bite yep let's take a bite okay here we

play04:07

go what would be the next step for you

play04:09

here clean all this mess up clean all

play04:12

this mess up right we made an algorithm

play04:15

step-by-step instructions for solving

play04:17

some problem and if you think about now

play04:18

how we made peanut butter and jelly

play04:20

sandwiches sometimes we were imprecise

play04:22

you didn't give me quite enough

play04:24

information to do the algorithm

play04:25

correctly and that's why I took out so

play04:26

much bread Precision being very very

play04:29

very correct with your instructions is

play04:31

so important in the real world because

play04:33

for instance when you're using the

play04:35

worldwide web and you're searching for

play04:36

something on Google or Bing you want to do

play04:39

the right thing so like if you type in

play04:42

just Google then you won't find the

play04:45

answer to your question pretty much

play04:47

everything we do in life is an algorithm

play04:49

even if we don't use that fancy word to

play04:51

describe it because you and I are sort

play04:52

of following instructions either that we

play04:55

came up with ourselves or maybe our

play04:56

parents told us how to do these things

play04:59

and so those are just algorithms but

play05:00

when you start using algorithms in uh

play05:02

computers that's when you start writing

play05:08

code what do you know about algorithms

play05:11

nothing really um at all honestly I

play05:13

think it's just probably a way to store

play05:15

information um in computers and I dare

play05:18

say even though you might not have put

play05:19

this word on it odds are you executed as

play05:21

a human multiple algorithms today even

play05:25

before you came here today like what

play05:26

were a few things that you did I got

play05:28

ready okay and get ready what does that

play05:30

mean brushing my teeth brushing my hair

play05:32

okay getting dressed okay so all of

play05:34

those frankly if we really um dove more

play05:37

deeply could be broken down into

play05:39

step-by-step instructions and presumably

play05:41

your mom your dad someone in the past

play05:44

sort of programmed you as a human to

play05:46

know what to do and then after that as a

play05:48

smart human you can sort of take it from

play05:49

there and you don't need their help

play05:50

anymore but that's kind of what we're

play05:51

doing when we program computers

play05:54

something may be even more familiar

play05:55

nowadays like odds are you have a cell

play05:57

phone your contacts or your address book

play05:59

let me ask you why that is like why does

play06:02

Apple or Google or anyone else bother

play06:03

alphabetizing your contacts I just

play06:05

assumed it would be easier to navigate

play06:07

what if your friend happened to be at

play06:10

the very bottom of this randomly

play06:11

organized list like why is that a

play06:13

problem like he or she is still there I

play06:15

guess it would take a while to get to

play06:16

while you're scrolling that in and of itself

play06:17

is kind of a problem or it's an

play06:19

inefficient solution to the problem so

play06:21

it turns out that back in my day before

play06:22

there were cell phones like everyone's

play06:24

numbers from my schools like were

play06:26

literally printed in a book and everyone

play06:28

in my town and my city my state was

play06:29

printed in an actual phone book even if

play06:31

you've never seen this technology before

play06:33

how would you propose verbally to find

play06:35

John in this phone book or I would just

play06:37

flip through and just look for the J I

play06:39

guess yeah so let me propose that we

play06:40

start that way I could just start at the

play06:42

beginning and step by step I could just

play06:44

look at each page looking for John

play06:47

looking for John now even if you've

play06:49

never seen this here technology before

play06:51

it turns out this is exactly what your

play06:52

phone could be doing in software like

play06:55

someone from Google or Apple or the like

play06:57

they could write software that uses a

play06:59

technique in programming known as a

play07:00

loop and a loop as the word implies is

play07:02

just sort of do something again and

play07:04

again what if instead of starting from

play07:05

the beginning and going one page at a

play07:07

time what if I or what if your phone

play07:08

goes like two pages or two names at a

play07:10

time would this be correct do you think

play07:13

well you could skip over John I think in

play07:15

what sense if he's in one of the middle

play07:16

pages that you skipped over yeah so sort

play07:18

of accidentally and frankly with like

play07:20

50/50 probability John could get

play07:22

sandwiched in between two pages but does

play07:24

that mean I have to throw that algorithm

play07:26

out altogether maybe you could use that

play07:28

strategy and until you get close to the

play07:30

section and then switch to going one by

play07:31

one okay that's nice so you could kind

play07:33

of like go twice as fast but then kind

play07:35

of pump the brakes as you near your exit

play07:37

on the highway or in this case near the

play07:38

J section of the book exactly and maybe

play07:41

alternatively if I get to like A B C D E

play07:43

F G H I J K if I get to the K section

play07:46

then I could just double back like one

play07:48

page just to make sure John didn't get

play07:50

sandwiched between those pages so the

play07:52

nice thing about that second algorithm

play07:53

is that I'm flying through the phone

play07:54

book like two pages at a time so 2 4 6 8

play07:57

10 12 it's not perfect it's not necessarily

play07:59

correct but it is if I just take like

play08:01

one extra step so I think it's fixable

play08:03

but what your phone is probably doing

play08:05

and frankly what I and like my parents

play08:07

and grandparents used to do back in the

play08:08

day is we'd probably go roughly to the

play08:10

middle of the phone book here and just

play08:12

intuitively if this is an alphabetized

play08:13

phone book in English what section am I

play08:15

probably going to find myself in roughly

play08:18

okay okay so I'm in the K section is

play08:20

John going to be to the left or to the

play08:21

right to the left yeah so John is going

play08:23

to be to the left or the right and what

play08:24

we can do here though your phone does

play08:26

something smarter is tear the problem in

play08:28

half throw half of the problem away

play08:30

being left with just 500 pages now but

play08:33

what might I next do I could sort of

play08:35

naively just start at the beginning

play08:36

again but we've learned to do better I

play08:38

can go roughly to the middle here do it

play08:40

again yeah exactly so now maybe I'm in

play08:42

the E section which is a little to the

play08:45

left so John is clearly going to be to

play08:47

the right so I can again tear the

play08:49

problem poorly in half throw this half

play08:52

of the problem away and I claim now that

play08:54

if we started with 1,000 Pages now we've

play08:56

gone to 500 250 now we're really moving

play08:59

quickly

play09:00

eventually I'm left hopefully dramatically

play09:02

with just one single page at which point John is

play09:05

either on that page or not on that page

play09:07

and I can call him roughly how many

play09:10

steps might this third algorithm take if

play09:12

I started with 1,000 pages then went

play09:14

to 500 250 125 like how many times can

play09:18

you divide 1,000 in half maybe 10

play09:21

that's roughly 10 because in the first

play09:23

algorithm looking again for someone like

play09:25

Zoe in the worst case might have to go

play09:26

all the way through thousand pages but

play09:28

the second algorithm you said was 500

play09:30

maybe 501 essentially the same thing so

play09:33

twice as fast but this third and final

play09:35

algorithm is sort of fundamentally

play09:37

faster because you're you're sort of

play09:39

dividing and conquering it in half and

play09:41

half and half not just taking one or two

play09:43

bites out of it at a time so this of

play09:45

course is not how we used to use phone

play09:47

books back in the day since otherwise

play09:48

they'd be single use only but it is how

play09:50

your phone is actually searching for Zoe

play09:53

for John for anyone else but it's doing

play09:55

it in software oh that's cool so here

play09:57

we've happened to focus on searching

play09:58

algorithms looking for John in the phone

play10:00

book but the technique we just used can

play10:02

indeed be called divide and conquer

play10:03

where you take a big problem and you

play10:05

divide and conquer that is you try to

play10:07

chop it up into smaller smaller smaller

play10:09

pieces a more sophisticated type of

play10:11

algorithm at least depending on how you

play10:12

implement it something known as a

play10:14

recursive algorithm recursive algorithm

play10:16

is essentially an algorithm that uses

play10:19

itself to solve the exact same problem

play10:21

again and again but chops it smaller and

play10:24

smaller and smaller

play10:28

ultimately hi my name is Patricia

play10:30

Patricia nice to meet you where are you

play10:31

a student at I'm starting my senior year

play10:33

now at NYU oh nice and what have you

play10:35

been studying the past few years I study

play10:37

computer science and data science if you

play10:38

were chatting with a non-cs non-data

play10:40

science friend of yours like how would

play10:41

you explain to them what an algorithm is

play10:44

some kind of like systematic way of like

play10:46

solving a problem or like a set of like

play10:49

steps to kind of solve a certain like

play10:51

problem you have so you probably recall

play10:53

learning topics like binary search

play10:55

versus linear search and like so I've

play10:57

come here uh complete with a chalkboard

play11:00

with some magnetic numbers on it here

play11:02

like how would you tell a friend to sort

play11:04

these I think one of the first things we

play11:06

learned was something called bubble

play11:08

sort it was kind of like focusing on

play11:10

like smaller like bubbles I guess I

play11:12

would say like of the problem like

play11:14

looking at like smaller segments rather

play11:16

than like the whole thing at once what

play11:17

is I think very true about what you're

play11:19

hinting at is that bubble sort really

play11:21

focuses on like local small problems

play11:25

rather than taking a step back trying to

play11:26

fix the whole thing let's just fix the

play11:28

obvious problems in front of us so for

play11:29

instance when we're trying to get from

play11:31

smallest to largest and the first two

play11:32

things we see are eight followed by one

play11:35

this looks like a problem cuz it's out

play11:36

of order so what would be the simplest

play11:38

fix the least amount of work we can do

play11:40

to at least fix one problem just like

play11:41

switch those two numbers cuz one is

play11:43

obviously smaller than eight perfect so

play11:45

we just swap those two then you would

play11:47

switch those again yeah so that further

play11:50

improves the situation and you can kind

play11:51

of see it that the one and the two are

play11:53

now in place how about 8 and six switch

play11:55

it again switch those again 8 and three

play11:57

switch it again

play12:00

and conversely now the one and the two

play12:03

are closer to and coincidentally are

play12:04

exactly where we want them to be so are

play12:07

we done no okay so obviously not but

play12:10

what could we do now to further improve

play12:14

the situation go through it again but

play12:16

like you don't need to check the last

play12:18

one anymore because we know like that

play12:20

number is bubbled up to the top Yeah

play12:22

because it has indeed bubbled all the

play12:23

way to the top so one and two yeah keep

play12:26

it as is okay two and six keep it as is

play12:28

okay and three then you switch it okay

play12:30

we switch or swap those six and four

play12:32

swap it again okay so four and uh six

play12:35

and seven uh keep it okay seven and five

play12:37

swap it okay and then I think per your

play12:39

point we're pretty darn close let's go

play12:41

through once more one and two keep it

play12:45

two three keep it 3 four keep it 4 six

play12:48

keep it 6 five and then switch it all

play12:49

right we'll switch this and now to your

play12:51

point we don't need to bother with the

play12:52

ones that already bubbled their way up

play12:54

now we're 100% sure it's sorted yeah and

play12:57

certainly the search engines of the

play12:58

world Google and and so forth they

play13:00

probably don't keep web pages in sorted

play13:02

order cuz that would be a crazy long

play13:04

list when you're just trying to search

play13:05

the data but there's probably some

play13:07

algorithm underlying what they do and

play13:08

they probably similarly just like we do

play13:11

a bit of work upfront to get things

play13:13

organized even if it's not strictly

play13:15

sorted in the same way so that people

play13:16

like you and me and others can find that

play13:19

same information so how about social

play13:21

media can you envision where the

play13:23

algorithms are in that world like maybe

play13:26

for example like TikTok like the For

play13:27

You page it's kind of like cuz those are

play13:29

like Rec like recommendations right it's

play13:31

like sort of like Netflix

play13:33

recommendations except more constant

play13:35

because it's just like every video you

play13:37

scroll it's like that's a new

play13:38

recommendation basically and it's like

play13:39

based on like what you've liked

play13:40

previously what you've like saved

play13:42

previously what you search up so I would

play13:44

assume there's some kind of algorithm

play13:45

there kind of figuring out like what to

play13:47

put on your For You page absolutely just

play13:49

trying to keep you presumably more

play13:50

engaged so the better the algorithm is

play13:52

the better your engagement is maybe the

play13:54

more money the company then makes on the

play13:56

platform and so forth so it all sort of

play13:58

feeds together but what you're

play13:59

describing really is more artificially

play14:01

intelligent if I may because presumably

play14:04

there's not someone at TikTok or any of

play14:05

these social media companies saying if

play14:07

Patricia likes this post then show her

play14:10

this post if she likes this post then

play14:12

show her this other post because the

play14:13

code would sort of grow infinitely long

play14:15

and there's just way too much content

play14:17

for a programmer to be having those

play14:19

kinds of conditionals those those

play14:21

decisions being made behind the scenes

play14:24

so it's probably a little more

play14:25

artificially intelligent and in that

play14:27

sense you have topics like neural

play14:28

networks and machine learning which

play14:31

really describe taking as input things

play14:33

like what you watch what you click on

play14:34

what your friends watch what they click

play14:36

on and sort of trying to infer from that

play14:38

instead what should we show Patricia or

play14:40

her friends next okay yeah yeah that

play14:42

makes like the distinction more makes

play14:44

more sense now

play14:48

yeah I am currently a fourth year PhD

play14:51

student at NYU I do robot learning so

play14:54

that's half and half robotics and

play14:55

machine learning sounds like you've

play14:57

dabbled with quite a few algorithms so

play14:59

how does one actually research

play15:00

algorithms or invent algorithms the most

play15:02

important was just trying to think about

play15:04

inefficiencies and also think about

play15:06

Connecting Threads the way I think about

play15:08

it is that algorithm for me is not just

play15:11

about the way of doing something but

play15:13

it's about doing something efficiently

play15:14

learning algorithms are practically

play15:16

everywhere now 'cause Google I would say for

play15:18

example is learning every day about like

play15:21

oh what what articles what links might

play15:23

be better than others and reranking them

play15:26

um there are recommender systems all

play15:28

around us right like content feeds and

play15:31

social media or you know like YouTube or

play15:33

Netflix what we see is in a large part

play15:36

determined by this kind of learning

play15:38

algorithms nowadays there's a lot of

play15:40

concerns around some applications of

play15:42

machine learning and like deep fakes

play15:44

where it can kind of learn how I talk

play15:46

and learn how you talk and even how we

play15:48

look and generate videos of us we're

play15:50

doing this for real but you could

play15:51

imagine a computer synthesizing this

play15:53

conversation eventually but how does it

play15:55

even know what I sound like and what I

play15:57

look like and how to replicate that all

play15:59

of this learning algorithms that we talk

play16:01

about right uh a lot like what goes in

play16:04

there is just lots and lots of data so

play16:06

data goes in something else comes out

play16:08

what comes out is whatever objective

play16:10

function that you optimize for like

play16:11

where is the line between algorithms

play16:13

that like play games with and without AI

play16:17

I think when I started off my undergrad

play16:19

the current AI machine learning was not

play16:23

very much synonymous okay and even in my

play16:25

undergraduate in the AI class they

play16:27

learned a lot of classical algorithms

play16:28

for game plays like for example the A*

play16:31

search right that's a very simple

play16:33

example of how you can play a game

play16:35

without having anything learned this is

play16:38

very much oh you are at a game State you

play16:40

just search down see what are the

play16:42

possibilities and then you pick the best

play16:45

possibility that it can see versus what

play16:47

you think about when you think about AI

play16:49

gameplay like AlphaZero for example

play16:53

or AlphaStar or there are a lot of you

play16:55

know like fancy new machine learning

play16:57

agents that are you know even like

play16:59

learning very difficult games like go

play17:01

and those are learned agents as in they

play17:04

are getting better as they play more and

play17:07

more games and as they get more games

play17:09

they kind of refine their strategy based

play17:11

on the data that they seen and once

play17:13

again this high level abstraction is

play17:15

still the same you see a lot of data and

play17:17

you learn from that right but the

play17:19

question is what is objective function

play17:21

that you're optimizing for is it winning

play17:23

this game is it forcing a tie or is it

play17:25

you know like opening a door in a

play17:27

kitchen so if the world is very focused

play17:29

on supervised unsupervised reinforcement

play17:31

learning now like what comes next 5 10

play17:33

years where's the world going I think

play17:36

that this is just going to be more and

play17:39

more I don't want to use the word

play17:41

encroachment but that's what it feels

play17:42

like of algorithms into our everyday

play17:44

life like even when I was taking the

play17:46

train here right the trains are being

play17:47

routed with algorithms but this has

play17:49

existed for you know like 50 years

play17:51

probably but as I was coming here as I

play17:54

was checking my phone those are

play17:55

different algorithms and you know

play17:57

they're kind of getting

play17:59

all around us they're with us

play18:01

all the time they're making our life

play18:02

better in most places most cases and I

play18:05

think that's just going to be

play18:06

a continuation of all of those and it

play18:08

feels like they're even in places you

play18:09

wouldn't expect and there's just so much

play18:11

data about you and me and everyone else

play18:13

online and this data is being mined and

play18:14

analyzed and influencing things we see

play18:16

and hear it would seem so there is sort

play18:18

of a counterpoint which might be good

play18:20

for the marketers but not necessarily

play18:22

good for you and me as individuals you

play18:24

know like we're human beings but for

play18:25

someone we might be just a pair of eyes

play18:28

who are

play18:29

you know carrying a wallet and are there

play18:31

to buy things but there is so much more

play18:33

potential for these algorithms to just

play18:35

make our life better without you know

play18:38

like changing much about our

play18:42

life I'm Chris Wiggins I'm an associate

play18:44

professor of Applied Mathematics at

play18:45

Columbia I'm also the chief data

play18:47

scientist of the New York Times the data

play18:48

science team at the New York Times

play18:50

develops and deploys machine learning

play18:51

for newsroom and business problems but I

play18:53

would say the things that we do mostly

play18:55

you don't see but it might be things

play18:56

like personalization algorithms or

play18:58

recommending different content and do

play19:00

data scientists which is rather distinct

play19:02

from the phrase computer scientist do

play19:04

data scientists still think in terms of

play19:06

algorithms as driving a lot of it oh

play19:08

absolutely yeah in fact so in data

play19:10

science and Academia often the role of

play19:12

the algorithm is the optimization

play19:14

algorithm that helps you find the best

play19:16

model or the best description of a data

play19:17

set okay in data science and industry

play19:20

the goal often it's centered around an

play19:22

algorithm which becomes a data product

play19:24

okay so a data scientist in Industry

play19:26

might be developing and deploying the

play19:28

algorithm which means not only

play19:30

understanding the algorithm and its

play19:31

statistical performance but also all of

play19:33

the software engineering around systems

play19:35

integration making sure that that

play19:37

algorithm receives input that's reliable

play19:40

and has output that's useful as well as

play19:42

I would say the organizational

play19:43

integration which is how does a

play19:45

community of people like the set of

play19:46

people working at the New York Times

play19:48

integrate that algorithm into their

play19:49

process interesting and I feel like AI

play19:51

based startups are all the rage and

play19:53

certainly within Academia are there

play19:54

connections between AI and the world of

play19:56

data science absolutely the algorithms

play19:58

that are therein connect those dots for

play19:59

us you're right that AI as a field has

play20:01

really exploded I would say particularly

play20:03

many people experienced a chatbot that

play20:05

was really really good today when people

play20:06

say AI they're often thinking about

play20:09

large language models or they're

play20:11

thinking about generative AI or they

play20:13

might be thinking about a chatbot one

play20:14

thing to keep in mind is a chatbot is a

play20:16

special case of generative AI which is a

play20:18

special case of using large language

play20:20

models which is a special case of using

play20:22

machine learning generally which is what

play20:24

most people mean by AI you may have

play20:26

moments that are um what John McCarthy

play20:28

called look Ma no hands results where you

play20:30

do some fantastic trick and you're not

play20:32

quite sure how it worked I think it's

play20:33

still very much early days large

play20:35

language models are still at the point of

play20:37

what might be called alchemy that people

play20:39

are building large language models

play20:40

without a real clear a priori sense of

play20:42

what the right design is for a right

play20:44

problem many people are trying different

play20:46

things out often in large companies

play20:47

where they can afford to have many

play20:49

people trying things out seeing what

play20:50

works publishing that instantiating it

play20:52

as a product and that itself is part of

play20:54

the scientific process I would think too

play20:56

yeah very much well science and

play20:57

engineering because often you're

play20:59

building a thing and the thing does

play21:01

something amazing to a large extent we are

play21:03

still looking for basic theoretical

play21:05

results around why deep neural networks

play21:08

generally work why are they able to

play21:10

learn so well they're huge billions of

play21:12

parameter models and it's difficult for

play21:14

us to interpret how they are able to do

play21:16

what they do and is this a good thing do

play21:18

you think or an inevitable thing that we

play21:20

the programmers we the computer

play21:21

scientists the data scientists who are

play21:23

inventing these things can't actually

play21:25

explain how they work because I feel

play21:27

like friends of mine in industry even

play21:29

when it's something simple and

play21:30

relatively familiar like autocomplete

play21:32

they can't actually tell me like why

play21:33

that name is appearing at the top of the

play21:35

list whereas years ago when these

play21:37

algorithms were more deterministic and

play21:39

more procedural you could even point to

play21:41

the line that made that name bubble up

play21:43

to the top so is this a good thing a bad

play21:45

thing that we're sort of losing control

play21:47

perhaps in some sense of the algorithms

play21:48

it has risks I don't know that I would

play21:50

say that it's good or bad but I would

play21:51

say there's lots of scientific precedent

play21:53

there are times when an algorithm works

play21:54

really well and we have finite

play21:56

understanding of why it works or a model

play21:58

works really well and sometimes we have

play22:00

very little understanding of why it

play22:02

works the way it does in classes I teach

play22:04

I certainly spend a lot of time on

play22:05

fundamentals algorithms that have been

play22:07

taught in classes for decades now

play22:08

whether it's binary search linear search

play22:10

bubble sort selection sort or the like
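
[Editor's sketch, not part of the spoken conversation: two of the fundamentals named above, linear search and binary search, might look roughly like this in Python. The function names and the sample list are hypothetical, chosen only for illustration; binary search assumes the input is already sorted.]

```python
from typing import Sequence


def linear_search(items: Sequence[str], target: str) -> int:
    """Check every element in order; return its index, or -1 if absent."""
    for index, item in enumerate(items):
        if item == target:
            return index
    return -1


def binary_search(sorted_items: Sequence[str], target: str) -> int:
    """Repeatedly halve a sorted sequence; return the index, or -1 if absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        middle = (low + high) // 2
        if sorted_items[middle] == target:
            return middle
        if sorted_items[middle] < target:
            low = middle + 1   # target can only be in the right half
        else:
            high = middle - 1  # target can only be in the left half
    return -1


# Hypothetical sample data, already in sorted order.
names = ["Alice", "Bob", "Carol", "Dave", "Eve"]
print(linear_search(names, "Dave"), binary_search(names, "Dave"))
```

[Both functions return the same index here; the difference is that linear search may look at every element, while binary search discards half of the remaining elements at each step.]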

play22:13

but if we're already at the point

play22:15

where I can pull up ChatGPT copy paste

play22:17

a whole bunch of numbers or words and

play22:19

say sort these for me does it really

play22:22

matter how ChatGPT is sorting it does

play22:24

it really matter to me as the user how

play22:26

the software is sorting it like do these

play22:28

fundamentals become more dated and less

play22:30

important do you think now you're

play22:31

talking about the ways in which code and

play22:33

computation is a special case of

play22:35

Technology right so for driving a car

play22:38

you may not necessarily need to know

play22:40

much about organic chemistry even though

play22:41

the organic chemistry is how the

play22:44

car works right so you can drive the car

play22:46

and use it in different ways without

play22:48

understanding much about the

play22:49

fundamentals so similarly with

play22:50

computation we're at a point where the

play22:52

computation is so high level right as

play22:54

you know you can import scikit-learn

play22:56

and you can go from zero to machine

play22:57

learning in 30 seconds it's depending on

play22:59

what level you want to understand the

play23:00

technology where in the stack so to

play23:02

speak um it's possible to understand it

play23:05

and make wonderful things and Advance

play23:06

the world without understanding it at

play23:08

the particular level of somebody who

play23:10

actually might have originally designed

play23:11

the actual optimization algorithm I

play23:13

should say though for many of the

play23:14

optimization algorithms there are cases

play23:16

where an algorithm works really well and

play23:18

we publish a paper and there's a proof

play23:20

in the paper and then years later people

play23:22

realize actually that proof was wrong

play23:23

and we're really still not sure why that

play23:24

optimization works but it works really

play23:26

well or it inspires people to make new

play23:28

optimization algorithms so I I do think

play23:32

that the the goal of understanding

play23:34

algorithms is loosely coupled to our

play23:36

progress in advancing great algorithms

play23:38

but they don't always necessarily have

play23:39

to require each other and for those

play23:41

students especially or even adults who

play23:43

are thinking of now steering into

play23:45

computer science into programming who

play23:47

were really jazzed about heading in that

play23:49

direction up until for instance November

play23:50

of 2022 when all of a sudden for many

play23:53

people it looked like the world was now

play23:54

changing and now maybe this isn't such a

play23:57

promising path this isn't such a

play23:58

lucrative path anymore are LLMs are

play24:02

tools like ChatGPT reason not to

play24:03

perhaps steer into the field large

play24:05

language models are a particular

play24:06

architecture for predicting let's say

play24:08

the next word or a set of tokens more

play24:10

generally the algorithm comes in when

play24:12

you think about how is that LLM to be

play24:15

trained or also how to be fine-tuned so

play24:19

the P of GPT is a pre-trained algorithm

play24:23

the idea is that you train a large

play24:24

language model on some corpus of text

play24:27

could be encyclopedias or textbooks or

play24:30

what have you and then you might want to

play24:32

fine-tune that model around some

play24:35

particular task or some particular

play24:37

subset of texts so both of those are

play24:40

examples of training algorithms so I

play24:42

would say people's perception of

play24:43

artificial intelligence has really

play24:45

changed a lot in the last 6 months

play24:47

particularly around November of 2022

play24:51

when people experienced a really good

play24:52

chatbot the technology though had been

play24:54

around already before academics had

play24:56

already been working with GPT-3

play24:58

before that and GPT-2 and GPT-1 and

play25:01

for many people it sort of opened up

play25:03

this conversation about what is

play25:04

artificial intelligence and what could

play25:05

we do with this and what are the

play25:07

possible good and bad right like any

play25:08

other piece of technology Kranzberg's

play25:10

first law of technology technology is

play25:12

neither good nor bad nor is it neutral

play25:14

every time we have some new technology

play25:15

we should think about its capabilities

play25:17

and the good and the possible bad as

play25:20

with any area of study algorithms offer

play25:22

a spectrum from the most basic to the

play25:24

most advanced and even if right now the

play25:26

most advanced of those algorithms feels

play25:28

out of reach because you just don't have

play25:30

that background with each lesson you

play25:32

learn with each algorithm you study that

play25:34

endgame becomes closer and closer such

play25:36

that it will before long be accessible

play25:38

to you and you will be at the end of

play25:40

that most advanced spectrum
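
[Editor's sketch, not part of the spoken conversation: earlier, classical game playing without learning was described as being at a game state, searching down the possibilities, and picking the best one. The speaker named A* search; the sketch below instead uses plain exhaustive game-tree search (minimax in its negamax form), which is a swapped-in technique chosen because it fits a two-player game directly. The toy game (a pile of stones, each player removes 1 to 3, whoever takes the last stone wins) and all names are assumptions made for illustration.]

```python
from functools import lru_cache

# Assumed rules of the toy game: a player may remove 1, 2, or 3 stones,
# and the player who takes the last stone wins.
MOVES = (1, 2, 3)


@lru_cache(maxsize=None)
def best_outcome(stones: int) -> int:
    """Return +1 if the player to move can force a win, otherwise -1."""
    if stones == 0:
        # The previous player took the last stone, so the player to move lost.
        return -1
    # Search every legal move; the opponent's outcome is the negation of ours.
    return max(-best_outcome(stones - move) for move in MOVES if move <= stones)


def best_move(stones: int) -> int:
    """Pick the move whose resulting state is worst for the opponent."""
    legal_moves = [move for move in MOVES if move <= stones]
    return max(legal_moves, key=lambda move: -best_outcome(stones - move))


print(best_move(5), best_move(7))
```

[Nothing here is learned from data: the program only enumerates future states and chooses among them, which is the contrast drawn in the conversation with learned agents such as AlphaZero.]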
