Harvard Professor Explains Algorithms in 5 Levels of Difficulty | WIRED

WIRED
8 Nov 2023 | 25:47

Summary

TL;DR: In this video, David J. Malan, a professor of computer science at Harvard University, explains what algorithms are and why they appear throughout daily life. He presents algorithms as precise, step-by-step instructions for solving problems in both the physical and virtual worlds. The conversation moves through increasing levels of difficulty, from simple tasks like making a sandwich to searching for a contact in a phone book, and then to the role of algorithms in computer science, data science, and artificial intelligence, including machine learning, recommender systems, and large language models. It also covers how algorithms personalize social media, the difficulty of understanding and controlling complex AI systems, and where algorithm development is heading. Throughout, the video stresses that algorithms keep evolving, that they bring real benefits, and that their applications and consequences deserve critical thought.

Takeaways

  • 📚 Algorithms are step-by-step instructions for solving problems, and they are everywhere in both the physical and virtual worlds.
  • 💡 A computer's CPU is its 'brain': the hardware that executes instructions such as addition, subtraction, and moving something up, down, left, or right.
  • 🧠 Memory, or RAM, holds programs and data only while they are in use; a hard drive or solid state drive keeps data permanently, even when the power is off.
  • 🍞 Algorithms can be as simple as a bedtime routine or making a sandwich, but they require precise, step-by-step instructions.
  • 🔍 Search engines like Google use algorithms to organize and retrieve information efficiently from vast databases.
  • 🔢 Sorting algorithms, such as bubble sort, are foundational in computer science; bubble sort works by fixing small local problems, one adjacent pair at a time.
  • 🤖 Machine learning and AI are intertwined, with algorithms being used to improve and personalize content in various applications.
  • 🚄 Algorithms are used in many aspects of life, from social media feeds to train routing, improving efficiency and personalization.
  • 🧐 Data scientists and AI researchers are still exploring why certain algorithms, like deep neural networks, work as well as they do.
  • 📈 The role of algorithms in industry is not just about creating them but also about integrating them into systems and processes effectively.
  • 🌐 As algorithms become more integrated into daily life, it's important to consider their implications, both positive and negative, for society.

Q & A

  • What is an algorithm, and why are algorithms important?

    -An algorithm is a list of step-by-step instructions for solving a problem or performing a task. Algorithms are important because they are ubiquitous: they represent opportunities to solve problems not only in the physical world but also in the virtual world, and they are fundamental to how we interact with technology in our daily lives.

  • How would you define a computer?

    -A computer is an electronic device, typically rectangular in shape, that allows for input through typing and other forms of interaction. It contains a CPU (central processing unit) which acts as its 'brain' and is capable of executing instructions, as well as memory (RAM) for temporary data storage and a hard drive or solid state drive for long-term data storage.

  • What is the role of memory in a computer?

    -Memory, or RAM (Random Access Memory), is a type of hardware inside a computer where active data is stored while the computer is running programs or games. It allows for quick access to this data. Additionally, computers have a hard drive or solid state drive for permanent data storage, which retains information even when the power is off.

  • How does the concept of precision relate to algorithms?

    -Precision is critical in algorithms because it ensures that the correct steps are taken to achieve the desired outcome. An imprecise algorithm can lead to incorrect results or inefficiencies. For instance, when searching for information on the internet, precise instructions (search terms) are necessary to find the correct information.

  • What is a recursive algorithm and how does it work?

    -A recursive algorithm is one that calls itself to solve smaller instances of the same problem. It repeatedly breaks a problem into smaller and smaller sub-problems until one is simple enough to solve directly (the base case). The video describes it as a more sophisticated type of algorithm, and it is a natural way to implement divide and conquer strategies.

  • How do sorting algorithms like bubble sort operate?

    -Bubble sort is a simple sorting algorithm that repeatedly steps through the list to be sorted, compares each pair of adjacent items, and swaps them if they are in the wrong order. The pass through the list is repeated until no swaps are needed, which indicates that the list is sorted.

  • What is the significance of machine learning in today's world?

    -Machine learning is a subset of artificial intelligence that provides systems the ability to learn and improve from experience without being explicitly programmed. It is significant because it is used in recommender systems, search engines, and various applications that require making predictions or decisions based on data.

  • How do large language models (LLMs) like ChatGPT work?

    -Large language models like ChatGPT are trained on a large corpus of text to predict the next token (roughly, the next word or word fragment) in a sequence. They use machine learning to capture patterns in language and generate human-like text based on the input provided to them.

  • What is the role of data scientists in developing algorithms?

    -Data scientists play a crucial role in developing and deploying algorithms. They not only understand the algorithm and its statistical performance but also handle software engineering aspects, such as systems integration, ensuring reliable input, and useful output. They also consider organizational integration, which involves how the algorithm fits into the workflow of a company or institution.

  • Why is it important to understand the fundamentals of algorithms even when high-level tools are available?

    -Understanding the fundamentals of algorithms is important because it provides a deeper comprehension of how technology works, which can be beneficial for troubleshooting, optimizing performance, and innovating new solutions. Even though high-level tools abstract away the complexity, knowing the basics can help in leveraging these tools more effectively and understanding their limitations.

  • How do you perceive the future of algorithms and their role in our daily lives?

    -The future of algorithms is likely to see even more integration into our daily lives, from routing trains to personalizing content on our phones. Algorithms will continue to make our lives more efficient in many ways, but it's important to be aware of their influence and to consider the ethical implications of their use.

  • What are the ethical considerations when developing and using algorithms?

    -Ethical considerations include ensuring transparency in how algorithms make decisions, especially when they impact people's lives significantly. There are concerns about privacy, as algorithms often rely on large amounts of personal data. Additionally, there's the risk of bias in algorithms if they are trained on biased data, which can lead to unfair outcomes.

Outlines

00:00

📚 Introduction to Algorithms and Computer Science

David J. Malan, a Harvard University professor, introduces the concept of algorithms as fundamental to solving problems in both the physical and virtual worlds. He explains that a computer's CPU is its 'brain,' executing instructions, and that memory, or RAM, is used for temporary storage of programs and games. The hard drive or solid state drive is for permanent data storage. Malan also touches on the importance of precision in algorithms, using the example of making a peanut butter sandwich to illustrate step-by-step instructions.

05:00

🔍 Searching Algorithms and Efficiency

The discussion moves to searching algorithms, emphasizing the importance of efficiency in finding information. An example is given about finding a contact in an alphabetized list, comparing the efficiency of linear search to more advanced methods like binary search, which uses a divide and conquer strategy. The concept of a loop in programming is introduced as a technique to repeat tasks, and the idea of a recursive algorithm is briefly mentioned as a more sophisticated type of algorithm.
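
The difference between the two search strategies can be sketched in Python. This is a minimal illustration, assuming the contacts are stored as a sorted list of strings; the function names `linear_search` and `binary_search` and the sample contact list are hypothetical and do not come from the video.

```python
from typing import Optional, Sequence


def linear_search(names: Sequence[str], target: str) -> Optional[int]:
    """Check every entry in order, like flipping one page at a time."""
    for index, name in enumerate(names):  # the loop repeats the same check
        if name == target:
            return index
    return None


def binary_search(names: Sequence[str], target: str) -> Optional[int]:
    """Repeatedly halve a sorted list, like tearing the phone book in half."""
    low, high = 0, len(names) - 1
    while low <= high:
        middle = (low + high) // 2
        if names[middle] == target:
            return middle
        if names[middle] < target:
            low = middle + 1   # target sorts later: discard the left half
        else:
            high = middle - 1  # target sorts earlier: discard the right half
    return None


# Illustrative data only; binary search requires the list to be sorted.
contacts = sorted(["Zoe", "John", "Addison", "Patricia", "Chris", "David"])
print(linear_search(contacts, "John"), binary_search(contacts, "John"))
```

Both functions return the same position, but the linear version may inspect every entry, while the binary version discards half of the remaining entries at each step.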

10:01

🤖 Recursive Algorithms and Problem-Solving

Patricia, an NYU senior studying computer science and data science, explains algorithms as a systematic way of solving problems. The conversation covers different types of sorting algorithms, such as bubble sort, which addresses local issues within a data set. The concept of divide and conquer is further explored, and recursive algorithms are introduced as algorithms that call themselves to solve smaller instances of the same problem. Patricia also discusses the role of algorithms in content recommendation systems like those used by social media platforms.
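
Bubble sort, as mentioned in this part of the conversation, can be sketched as follows. This is a minimal version, assuming a list of integers; the function name `bubble_sort` and the sample numbers are illustrative choices, not taken from the video.

```python
from typing import List


def bubble_sort(values: List[int]) -> List[int]:
    """Return a sorted copy by repeatedly swapping out-of-order neighbours."""
    items = list(values)  # copy so the caller's list is left untouched
    swapped = True
    while swapped:  # keep making passes until a pass needs no swaps
        swapped = False
        for i in range(len(items) - 1):
            if items[i] > items[i + 1]:
                # fix one "local" problem: a single adjacent pair
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
    return items


print(bubble_sort([5, 2, 7, 4, 1, 6, 3, 0]))
```

Each pass only ever compares neighbours, which is the sense in which the algorithm solves local problems; the list is sorted once a full pass makes no swaps.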

15:01

🧠 Machine Learning and Algorithmic Research

A PhD student discusses the process of researching and inventing algorithms, focusing on identifying inefficiencies and making connections. Machine learning is highlighted as a field where algorithms learn from data, with examples given such as Google's search algorithms and recommender systems. The student also touches on concerns related to the application of machine learning, like deep fakes, and the importance of data in training algorithms. The conversation concludes with thoughts on the future of algorithms and their increasing presence in everyday life.

20:02

🚀 The Role of Algorithms in Data Science and AI

Chris Wiggins, an associate professor and chief data scientist at the New York Times, discusses the role of algorithms in data science. He explains that algorithms are crucial for optimizing models and creating data products. The conversation covers the distinction between data scientists and computer scientists, the importance of software engineering in deploying algorithms, and the impact of AI and machine learning on various industries. Wiggins also addresses the rise of AI-based startups and the relationship between AI and data science.

25:04

🤖 The Impact of Large Language Models

The discussion focuses on large language models (LLMs) and their impact on the perception of artificial intelligence. It's noted that while LLMs have changed public perception, particularly after the release of advanced chatbots, the underlying technology has been in development for some time. The conversation explores the concept of generative AI and the use of machine learning to create models that can perform tasks like sorting data or generating text. There's also a debate on whether the rise of such advanced tools diminishes the importance of understanding fundamental algorithms, with the argument that a high-level understanding can be sufficient for practical use without delving into the intricacies of the algorithms themselves.

🌟 The Spectrum of Algorithms and Their Future

David J. Malan concludes the discussion by emphasizing the spectrum of algorithms, from basic to advanced, and encouraging continuous learning. He suggests that as one studies and understands algorithms, they become more accessible and can be applied to a wide range of problems. The potential for both good and bad outcomes with technology is acknowledged, reflecting on how algorithms, as part of technology, are neither inherently positive nor negative but have the power to influence outcomes based on their application.

Keywords

💡 Algorithm

An algorithm is a step-by-step procedure to solve a problem or achieve a desired outcome. In the video, algorithms are presented as fundamental to computer science and integral to solving problems in both the physical and virtual worlds. The script uses examples such as sorting routines and searching techniques to illustrate how algorithms work in practice.

💡 Computer

A computer is an electronic device used for processing and storing data, typically in binary form. It is defined in the script as a rectangle-shaped device that you can type on, highlighting its interactive nature. The video emphasizes the role of computers in executing algorithms through their hardware components.

💡 CPU (Central Processing Unit)

The CPU is often referred to as the 'brain' of a computer. It is the hardware component responsible for executing instructions, such as arithmetic and logical operations. In the context of the video, the CPU's role is crucial for running algorithms that perform various tasks and solve problems.

💡 Memory

Memory, specifically RAM (Random Access Memory), is a type of hardware in a computer used for storing data temporarily. The video explains that RAM is where active programs and games are stored while in use, distinguishing it from long-term storage like hard drives. Memory is essential for the functioning of algorithms as it allows for data to be retained and manipulated during processing.

💡 Divide and Conquer

Divide and conquer is a problem-solving strategy where a complex problem is broken down into smaller, more manageable sub-problems. The video uses the analogy of searching for a name in a phone book to illustrate how this technique can make algorithms more efficient by reducing the amount of work needed to reach a solution.
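
To see why halving is so much faster than paging through one entry at a time, the number of halvings can simply be counted. The sketch below assumes a phone book measured in pages and rounds up at each step to model the worst case; the helper name `halving_steps` is hypothetical.

```python
def halving_steps(pages: int) -> int:
    """Count how many times `pages` can be halved before one page remains."""
    steps = 0
    while pages > 1:
        pages = (pages + 1) // 2  # round up: keep the larger half (worst case)
        steps += 1
    return steps


for size in (1_000, 1_000_000):
    print(size, "pages ->", halving_steps(size), "halvings")
```

A 1,000-page book needs about 10 halvings, matching the estimate given in the transcript, and a book a thousand times larger needs only about 10 more.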

💡 Recursive Algorithm

A recursive algorithm is a type of algorithm that solves a problem by solving smaller instances of the same problem. The video explains that this method involves the algorithm calling itself with a reduced problem size until a base condition is met. It is a more sophisticated approach that is central to many advanced programming techniques.
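
As a hedged illustration of an algorithm calling itself on a reduced problem until a base condition is met, here is a recursive search over a sorted list of names. The function name `find`, its parameters, and the sample names are assumptions made for this sketch rather than anything shown in the video.

```python
from typing import Optional, Sequence


def find(names: Sequence[str], target: str, low: int = 0,
         high: Optional[int] = None) -> Optional[int]:
    """Recursively search the sorted range names[low..high] for target."""
    if high is None:
        high = len(names) - 1
    if low > high:                 # base case: nothing left to search
        return None
    middle = (low + high) // 2
    if names[middle] == target:    # base case: found it
        return middle
    if names[middle] < target:
        return find(names, target, middle + 1, high)  # same problem, right half
    return find(names, target, low, middle - 1)       # same problem, left half


print(find(["Addison", "Chris", "David", "John", "Patricia", "Zoe"], "Zoe"))
```

Each recursive call handles roughly half as many names as the call before it, so the recursion bottoms out after only a handful of calls.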

💡 Machine Learning

Machine learning is a subset of artificial intelligence that involves the development of algorithms capable of learning from data and improving over time. The video discusses how machine learning algorithms are used in recommender systems on platforms like Netflix and YouTube, shaping the content that users see based on their interactions.

💡 Data Science

Data science is a field that involves the extraction of knowledge and insights from data, often using algorithms to analyze and interpret complex datasets. In the video, data science is presented as a discipline that relies heavily on algorithms for tasks such as personalization and content recommendation, emphasizing its role in modern analytics.

💡 Neural Networks

Neural networks are computing systems inspired by the human brain that are used in machine learning to recognize patterns and solve complex problems. The video touches on neural networks in the context of AI, highlighting their role in processing large amounts of data and making predictions or decisions without explicit programming.

💡 Large Language Models (LLMs)

Large language models are advanced AI systems capable of understanding and generating human-like text based on vast amounts of data. The video discusses LLMs in the context of AI advancements, noting their use in creating chatbots and the ethical considerations surrounding their ability to generate deceptive content like deep fakes.
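
The idea of predicting the next word from patterns in text can be shown with a deliberately tiny toy. This is not how real large language models work internally (they use neural networks trained on tokens); it is only a word-pair frequency table, and the names `train_bigrams`, `predict_next`, and the sample sentence are invented for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Optional


def train_bigrams(text: str) -> Dict[str, Counter]:
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    following: Dict[str, Counter] = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(model: Dict[str, Counter], word: str) -> Optional[str]:
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy "corpus": a single made-up sentence, far smaller than any real dataset.
corpus = "take the knife and spread the peanut butter on the bread and take the bread"
model = train_bigrams(corpus)
print(predict_next(model, "peanut"))
```

Even this crude table "learns" from its data rather than following hand-written rules, which is the core contrast with the classical algorithms discussed earlier.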

💡 Optimization Algorithm

An optimization algorithm is used to find the best or most efficient solution to a problem from a set of available solutions. The video mentions optimization algorithms in the context of data science, where they are employed to find the most accurate model or representation of a dataset, underscoring their importance in statistical analysis and model development.
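
One common optimization algorithm is gradient descent, shown here fitting a straight line to a few points. The video summary does not say which optimization method is meant, so this is only an assumed example; the function name `fit_line`, the learning rate, the step count, and the data are all illustrative.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float], learning_rate: float = 0.01,
             steps: int = 5000) -> Tuple[float, float]:
    """Fit y = slope * x + intercept by gradient descent on mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # prediction error for every data point under the current model
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        grad_slope = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2 / n * sum(errors)
        # step "downhill" to reduce the error a little
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Made-up points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
print(round(slope, 2), round(intercept, 2))
```

The loop keeps adjusting the two parameters in whichever direction lowers the error, which is the sense in which the algorithm searches for the most accurate model of the data.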

Highlights

David J. Malan, a professor of computer science at Harvard University, explains algorithms across five levels of increasing difficulty.

Algorithms are ubiquitous, important in both the physical and virtual worlds, and represent opportunities to solve problems.

A computer is defined as an electronic device with a CPU (central processing unit) and memory for processing instructions and storing data.

RAM (Random Access Memory) holds programs and data while they are in use, whereas a hard drive or solid state drive stores data permanently.

Algorithms are described as step-by-step instructions for solving problems, similar to a bedtime routine or making a sandwich.

Precision in algorithms is crucial, as demonstrated by the sandwich-making example where imprecision led to unnecessary steps.

The concept of a loop in programming is introduced as a technique to repeat actions, making processes more efficient.

Divide and conquer is a strategy for solving problems by breaking them down into smaller, more manageable parts.

Recursive algorithms call themselves to solve smaller instances of the same problem.

Sorting algorithms, such as bubble sort, focus on local problems and make incremental improvements to achieve a desired outcome.

Social media platforms use algorithms to personalize content and recommendations, aiming to increase user engagement.

Machine learning algorithms are used in recommender systems to predict and curate content based on user behavior and preferences.

The line between classical algorithms and AI is blurred, with AI often involving learning from data to refine strategies and predictions.

The future of algorithms is expected to involve deeper integration into everyday life, improving efficiency and personalization.

Data scientists and AI researchers are still trying to understand why certain complex models, like deep neural networks, work so well.

The role of algorithms in industry often revolves around creating data products that require software engineering and systems integration.

Despite the rise of AI and large language models, understanding the fundamentals of algorithms remains important for innovation and advancement in the field.

The impact of algorithms on society raises questions about control, transparency, and the ethical use of technology.

For those interested in computer science and programming, the evolving landscape of AI presents opportunities to contribute to the development and application of algorithms.

Transcripts

play00:00

- Hello world.

play00:00

My name is David J. Malan

play00:02

and I'm a professor of computer science

play00:03

at Harvard University.

play00:05

Today, I've been asked to explain algorithms

play00:07

in five levels of increasing difficulty.

play00:10

Algorithms are important

play00:11

because they really are everywhere,

play00:13

not only in the physical world,

play00:14

but certainly in the virtual world as well.

play00:16

And in fact, what excites me about algorithms

play00:18

is that they really represent an opportunity

play00:20

to solve problems.

play00:21

And I dare say, no matter what you do in life,

play00:24

all of us have problems to solve.

play00:28

So, I'm a computer science professor,

play00:30

so I spend a lot of time with computers.

play00:31

How would you define a computer for them?

play00:33

- Well, a computer is electronic,

play00:37

like a phone but it's a rectangle,

play00:40

and you can type like tick, tick, tick.

play00:44

And you work on it.

play00:45

- Nice. Do you know any of the parts

play00:48

that are inside of a computer?

play00:50

- No.

play00:51

- Can I explain a couple of them to you?

play00:53

- Yeah.

play00:53

- So, inside of every computer is some kind of brain

play00:56

and the technical term for that is CPU,

play00:59

or central processing unit.

play01:00

And those are the pieces of hardware

play01:02

that know how to respond to those instructions.

play01:06

Like moving up or down, or left or right,

play01:09

knows how to do math like addition and subtraction.

play01:12

And then there's at least one other type of

play01:13

hardware inside of a computer called memory

play01:16

or RAM, if you've heard of this?

play01:18

- I know of memory because you have to memorize stuff.

play01:21

- Yeah, exactly.

play01:21

And computers have even different types of memory.

play01:23

They have what's called RAM, random access memory,

play01:26

which is where your games, where your programs

play01:29

are stored while they're being used.

play01:31

But then it also has a hard drive,

play01:32

or a solid state drive, which is where your data,

play01:35

your high scores, your documents,

play01:37

once you start writing essays and stories in the future.

play01:40

- It stays there.

play01:41

- Stays permanently.

play01:42

So, even if the power goes out,

play01:43

the computer can still remember that information.

play01:45

- It's still there because

play01:47

the computer can't just like delete all of the words itself.

play01:52

- Hopefully not.

play01:53

- Because your fingers could only do that.

play01:55

Like you have to use your finger to delete

play01:58

all of the stuff. - Exactly.

play02:00

- You have to write.

play02:01

- Yeah, have you heard of an algorithm before?

play02:02

- Yes. Algorithm is a list of instructions to tell people

play02:07

what to do or like a robot what to do.

play02:09

- Yeah, exactly.

play02:10

It's just step by step instructions for doing something,

play02:14

for solving a problem, for instance.

play02:16

- Yeah, so like if you have a bedtime routine,

play02:18

then at first you say, "I get dressed, I brush my teeth,

play02:23

I read a little story, and then I go to bed."

play02:25

- All right.

play02:26

Well how about another algorithm?

play02:27

Like what do you tend to eat for lunch?

play02:30

Any types of sandwiches you like?

play02:31

- I eat peanut butter.

play02:33

- Let me get some supplies from the cupboard here.

play02:35

So, should we make an algorithm together?

play02:36

- Yeah.

play02:37

- Why don't we do it this way?

play02:39

Why don't we pretend like I'm a computer

play02:41

or maybe I'm a robot, so I only understand your instructions

play02:44

and so I want you to feed me, no pun intended, an algorithm.

play02:48

So, step-by-step instructions for solving this problem.

play02:51

But remember, algorithms, you have to be precise,

play02:54

you have to give...

play02:55

- The right instructions.

play02:57

- [David] The right instructions.

play02:58

Just do it for me. So, step one was what?

play03:00

- Open the bag.

play03:02

- [David] Okay. Opening the bag of bread.

play03:04

- Stop. - [David] Now what?

play03:05

- Grab the bread and put it on the plate.

play03:07

- [David] Grab the bread and put it on the plate.

play03:11

- Take all the bread back and put it back in there.

play03:13

[David laughing]

play03:14

- So, that's like an undo command.

play03:16

- Yeah.

play03:16

- Little control Z? Okay.

play03:18

- Take one bread and put it on the plate.

play03:20

- Okay.

play03:21

- Take the lid off the peanut butter.

play03:23

- [David] Okay, take the lid off the peanut butter.

play03:24

- Put the lid down.

play03:25

- [David] Okay. - Take the knife.

play03:28

- [David] Take the knife.

play03:29

- [Addison] Put the blade inside the peanut butter

play03:32

and spread the peanut butter on the bread.

play03:35

- I'm going to take out some peanut butter

play03:37

and I'm going to spread the peanut butter on the bread.

play03:40

- I put a lot of peanut butter on

play03:42

because I love peanut butter.

play03:43

- Oh, apparently. I thought I was messing with you here...

play03:44

- No, no it's fine.

play03:45

But I think you're apparently happy with this.

play03:47

- [Addison] Put the knife down,

play03:49

and then grab one bread and put it on top

play03:51

of the second bread, sideways.

play03:55

- Sideways.

play03:57

- Like put it flat on.

play03:58

- Oh, flat ways, okay.

play04:00

- [Addison] And now, done. You're done with your sandwich.

play04:02

- Should we take a delicious bite?

play04:04

- Yep. Let's take a bite.

play04:06

- [David] Okay, here we go.

play04:08

What would the next step be here?

play04:10

- Clean all this mess up.

play04:11

[David laughing]

play04:11

- Clean all this mess up, right.

play04:13

We made an algorithm, step by step instructions

play04:16

for solving some problem.

play04:17

And if you think about now,

play04:18

how we made peanut butter and jelly sandwiches,

play04:20

sometimes we were imprecise and you didn't give me

play04:23

quite enough information to do the algorithm correctly,

play04:25

and that's why I took out so much bread.

play04:27

Precision, being very, very correct with your instructions

play04:31

is so important in the real world

play04:33

because for instance, when you're using the worldwide web

play04:35

and you're searching for something on Google or Bing...

play04:38

- You want to do the right thing.

play04:40

- [David] Exactly.

play04:41

- So, like if you type in just Google,

play04:43

then you won't find the answer to your question.

play04:46

- Pretty much everything we do in life is an algorithm,

play04:49

even if we don't use that fancy word to describe it.

play04:51

Because you and I are sort of following instructions

play04:53

either that we came up with ourselves

play04:55

or maybe our parents told us how to do these things.

play04:59

And so, those are just algorithms.

play05:00

But when you start using algorithms in computers,

play05:03

that's when you start writing code.

play05:05

[upbeat music]

play05:09

What do you know about algorithms?

play05:11

- Nothing really, at all honestly.

play05:13

I think it's just probably a way to store information

play05:16

in computers.

play05:17

- And I dare say, even though you might not have

play05:19

put this word on it, odds are you executed as a human,

play05:22

multiple algorithms today even before you came here today.

play05:26

Like what were a few things that you did?

play05:28

- I got ready.

play05:29

- Okay. And get ready. What does that mean?

play05:30

- Brushing my teeth, brushing my hair.

play05:32

- [David] Okay.

play05:33

- Getting dressed.

play05:33

- Okay, so all of those, frankly, if we really

play05:35

dove more deeply, could be broken down into

play05:39

step-by-step instructions.

play05:41

And presumably your mom, your dad, someone in the past

play05:44

sort of programmed you as a human to know what to do.

play05:47

And then after that, as a smart human,

play05:48

you can sort of take it from there

play05:49

and you don't need their help anymore.

play05:51

But that's kind of what we're doing

play05:52

when we program computers.

play05:53

Something maybe even more familiar nowadays,

play05:55

like odds are you have a cell phone.

play05:57

Your contacts or your address book.

play05:59

But let me ask you why that is.

play06:01

Like why does Apple or Google or anyone else

play06:03

bother alphabetizing your contacts?

play06:05

- I just assumed it would be easier to navigate.

play06:07

- What if your friend happened to be at the very bottom

play06:10

of this randomly organized list?

play06:12

Why is that a problem? Like he or she's still there.

play06:15

- I guess it would take a while to get to

play06:16

while you're scrolling.

play06:17

- That, in of itself, is kind of a problem

play06:19

or it's an inefficient solution to the problem.

play06:21

So, it turns out that back in my day,

play06:22

before there were cell phones, everyone's numbers

play06:25

from my schools were literally printed in a book,

play06:27

and everyone in my town and my city, my state

play06:29

was printed in an actual phone book.

play06:31

Even if you've never seen this technology before,

play06:33

how would you propose verbally to find John

play06:36

in this phone book? - Or I would just flip through

play06:37

and just look for the J's I guess.

play06:39

- Yeah. So, let me propose that we start that way.

play06:41

I could just start at the beginning

play06:42

and step by step I could just look at each page,

play06:45

looking for John, looking for John.

play06:48

Now even if you've never seen this here technology before,

play06:51

it turns out this is exactly what your phone could be doing

play06:54

in software, like someone from Google or Apple or the like,

play06:57

they could write software that uses a technique

play06:59

in programming known as a loop,

play07:01

and a loop, as the word implies,

play07:02

is just sort of do something again and again.

play07:04

What if instead of starting from the beginning

play07:06

and going one page at a time,

play07:07

what if I, or what if your phone goes like two pages

play07:09

or two names at a time?

play07:11

Would this be correct do you think?

play07:12

- Well you could skip over John, I think.

play07:15

- In what sense?

play07:15

- If he's in one of the middle pages that you skipped over.

play07:17

- Yeah, so sort of accidentally and frankly

play07:20

with like 50/50 probability,

play07:21

John could get sandwiched in between two pages.

play07:24

But does that mean I have to throw

play07:25

that algorithm out altogether?

play07:27

- Maybe you could use that strategy until you get close

play07:30

to the section and then switch to going one by one.

play07:31

- Okay, that's nice.

play07:32

So, you could kind of like go twice as fast

play07:34

but then kind of pump the brakes as you near your exit

play07:37

on the highway, or in this case near the J section

play07:39

of the book.

play07:40

- Exactly.

play07:41

- And maybe alternatively, if I get to like

play07:42

A, B, C, D, E, F, G, H, I, J, K,

play07:44

if I get to the K section,

play07:46

then I could just double back like one page

play07:49

just to make sure John didn't get sandwiched

play07:50

between those pages.

play07:51

So, the nice thing about that second algorithm

play07:53

is that I'm flying through the phone book

play07:55

like two pages at a time.

play07:56

So, 2, 4, 6, 8, 10, 12.

play07:57

It's not perfect, it's not necessarily correct

play08:00

but it is if I just take one extra step.

play08:02

So, I think it's fixable,

play08:03

but what your phone is probably doing

play08:05

and frankly what I and like my parents and grandparents

play08:08

used to do back in the day was we'd probably go roughly

play08:09

to the middle of the phone book here,

play08:11

and just intuitively, if this is an alphabetized phone book

play08:14

in English, what section am I probably going to

play08:16

find myself in roughly?

play08:17

- K?

play08:18

- Okay. So, I'm in the K section.

play08:20

Is John going to be to the left or to the right?

play08:21

- To the left.

play08:22

- Yeah.

play08:23

So, John is going to be to the left or the right

play08:24

and what we can do here, though your phone

play08:25

does something smarter, is tear the problem in half,

play08:27

throw half of the problem away,

play08:30

being left with just 500 pages now.

play08:32

But what might I next do?

play08:34

I could sort of naively just start at the beginning again,

play08:36

but we've learned to do better.

play08:38

I can go roughly to the middle here.

play08:39

- And you can do it again. - Yeah, exactly.

play08:41

So, now maybe I'm in the E section,

play08:43

which is a little to the left.

play08:45

So, John is clearly going to be to the right,

play08:47

so I can again tear the problem poorly in half,

play08:51

throw this half of the problem away,

play08:53

and I claim now that if we started with a thousand pages,

play08:56

now we've gone to 500, 250,

play08:57

now we're really moving quickly.

play08:59

- Yeah.

play09:00

- [David] And so, eventually I'm hopefully dramatically

play09:01

left with just one single page

play09:04

at which point John is either on that page

play09:06

or not on that page, and I can call him.

play09:08

Roughly how many steps might this third algorithm take

play09:12

if I started with a thousand pages

play09:14

then went to 500, 250, 125,

play09:17

how many times can you divide 1,000 in half? Maybe?

play09:21

- 10.

play09:21

- That's roughly 10.

play09:22

Because in the first algorithm,

play09:23

looking again for someone like Zoe in the worst case

play09:26

might have to go all the way through a thousand pages.

play09:28

But the second algorithm you said was 500,

play09:30

maybe 501, essentially the same thing.

play09:32

So, twice as fast.

play09:34

But this third and final algorithm is sort of fundamentally

play09:37

faster because you're sort of dividing and conquering it

play09:40

in half, in half, in half,

play09:41

not just taking one or two bites out of it at a time.

play09:45

So, this of course is not how we used to use phone books

play09:47

back in the day since otherwise they'd be single use only.

play09:49

But it is how your phone is actually searching for Zoe,

play09:53

for John, for anyone else, but it's doing it in software.

play09:56

- Oh, that's cool.
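The three phone book strategies described above can be compared with a minimal sketch. It assumes a 1,000-page book modeled as a sorted list of integers standing in for names; the function names and the step counting are illustrative only and are not taken from the video or from any phone's actual software.

```python
def linear_search(pages, target):
    """First algorithm: check one page at a time."""
    steps = 0
    for i, name in enumerate(pages):
        steps += 1
        if name == target:
            return i, steps
    return None, steps


def two_at_a_time(pages, target):
    """Second algorithm: skip ahead two pages, doubling back one page if we overshoot."""
    steps = 0
    i = 0
    while i < len(pages):
        steps += 1
        if pages[i] == target:
            return i, steps
        if pages[i] > target:
            # Overshot: the target may be sandwiched on the page we skipped.
            steps += 1
            if i > 0 and pages[i - 1] == target:
                return i - 1, steps
            return None, steps
        i += 2
    # Ran off the end: the final page may have been skipped.
    if 0 <= i - 1 < len(pages):
        steps += 1
        if pages[i - 1] == target:
            return i - 1, steps
    return None, steps


def binary_search(pages, target):
    """Third algorithm: open to the middle and throw half of the problem away."""
    lo, hi = 0, len(pages) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if pages[mid] == target:
            return mid, steps
        if pages[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, steps


pages = list(range(1000))  # hypothetical stand-in for 1,000 alphabetized pages
last = pages[-1]           # worst case for the first two algorithms, like "Zoe"
print(linear_search(pages, last)[1])   # 1000
print(two_at_a_time(pages, last)[1])   # 501
print(binary_search(pages, last)[1])   # 10
```

Under these assumptions the worst-case counts line up with the numbers in the conversation: 1,000 steps, then 500 or 501, then roughly 10.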

play09:57

- So, here we've happened to focus on searching algorithms,

play09:59

looking for John in the phone book.

play10:00

But the technique we just used

play10:01

can indeed be called divide and conquer,

play10:03

where you take a big problem and you divide and conquer it,

play10:06

that is you try to chop it up into smaller,

play10:08

smaller, smaller pieces.

play10:09

A more sophisticated type of algorithm,

play10:11

at least depending on how you implement it,

play10:13

is something known as a recursive algorithm.

play10:15

A recursive algorithm is essentially an algorithm

play10:18

that uses itself to solve the exact same problem

play10:21

again and again, but chops it smaller, and smaller,

play10:24

and smaller ultimately.
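As a rough illustration of that idea, the phone book search can be written so that the function calls itself on a smaller and smaller slice of the book. This is a sketch under assumptions: the name list and the function name `search` are made up for the example.

```python
def search(pages, target, lo=0, hi=None):
    """Recursive divide and conquer: solve the same problem on half the pages."""
    if hi is None:
        hi = len(pages) - 1
    if lo > hi:
        return None  # base case: no pages left, so the name is not in the book
    mid = (lo + hi) // 2
    if pages[mid] == target:
        return mid
    if pages[mid] < target:
        # Target is to the right: recurse on the right half only.
        return search(pages, target, mid + 1, hi)
    # Target is to the left: recurse on the left half only.
    return search(pages, target, lo, mid - 1)


names = ["Alice", "Bob", "Eve", "John", "Kim", "Zoe"]  # hypothetical sorted book
print(search(names, "John"))     # 3
print(search(names, "Mallory"))  # None
```

The base case (`lo > hi`) is what stops the function from calling itself forever once the problem has been chopped down to nothing.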

play10:26

[upbeat music]

play10:29

- Hi, my name's Patricia.

play10:30

- Patricia, nice to meet you.

play10:31

Where are you a student at?

play10:32

- I am starting my senior year now at NYU.

play10:34

- Oh nice. And what have you been studying

play10:35

the past few years?

play10:36

- I studied computer science and data science.

play10:38

- If you were chatting with a non-CS,

play10:40

non-data science friend of yours,

play10:41

how would you explain to them what an algorithm is?

play10:44

- Some kind of systematic way of solving a problem,

play10:47

or like a set of steps to kind of solve

play10:50

a certain problem you have.

play10:52

- So, you probably recall learning topics

play10:54

like binary search versus linear search, and the like.

play10:56

- Yeah.

play10:57

- So, I've come here complete with an

play10:58

actual chalkboard with some magnetic numbers on it here.

play11:02

How would you tell a friend to sort these?

play11:04

- I think one of the first things we learned was

play11:07

something called bubble sort.

play11:08

It was kind of like focusing on smaller bubbles

play11:12

I guess I would say of the problem,

play11:14

like looking at smaller segments rather than

play11:16

the whole thing at once.

play11:17

- What is I think very true about what you're hinting at

play11:20

is that bubble sort really focuses on local, small problems

play11:25

rather than taking a step back trying to fix

play11:26

the whole thing, let's just fix the obvious problems

play11:29

in front of us. So, for instance, when we're trying to get

play11:30

from smallest to largest,

play11:32

and the first two things we see are eight followed by one,

play11:34

this looks like a problem 'cause it's out of order.

play11:37

So, what would be the simplest fix,

play11:39

the least amount of work we can do

play11:40

to at least fix one problem?

play11:41

- Just switch those two numbers

play11:43

'cause one is obviously smaller than eight.

play11:45

- Perfect. So, we just swap those two then.

play11:47

- You would switch those again.

play11:48

- Yeah, so that further improves the situation

play11:51

and you can kind of see it,

play11:52

that the one and the two are now in place.

play11:54

How about eight and six?

play11:55

- [Patricia] Switch it again.

play11:55

- Switch those again. Eight and three?

play11:57

- Switch it again.

play11:58

[fast forwarding]

play12:00

- And conversely now the one and the two are closer to,

play12:03

and coincidentally are exactly where we want them to be.

play12:07

So, are we done?

play12:08

- No.

play12:09

- Okay, so obviously not, but what could we do now

play12:12

to further improve the situation?

play12:14

- Go through it again but you don't need

play12:17

to check the last one anymore because we know

play12:19

that number is bubbled up to the top.

play12:21

- Yeah, because eight has indeed bubbled all the way

play12:23

to the top. So, one and two?

play12:25

- [Patricia] Yeah, keep it as is.

play12:26

- Okay, two and six?

play12:27

- [Patricia] Keep it as is.

play12:28

- Okay, six and three?

play12:29

- Then you switch it.

play12:30

- Okay, we'll switch or swap those.

play12:31

Six and four?

play12:32

- [Patricia] Swap it again.

play12:33

- Okay, so four, six and seven?

play12:35

- [Patricia] Keep it.

play12:36

- Okay. Seven and five?

play12:37

- [Patricia] Swap it.

play12:38

- Okay. And then I think per your point,

play12:39

we're pretty darn close.

play12:41

Let's go through once more.

play12:43

One and two? - [Patricia] Keep it.

play12:45

- Two three? - [Patricia] Keep it.

play12:46

- Three four? - [Patricia] Keep it.

play12:47

- Four six? - [Patricia] Keep it.

play12:48

- Six five?

play12:49

- [Patricia] And then switch it.

play12:49

- All right, we'll switch this. And now to your point,

play12:51

we don't need to bother with the ones

play12:53

that already bubbled their way up.

play12:54

Now we are a hundred percent sure it's sorted.

play12:56

- Yeah.
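The passes walked through above can be sketched in code. The starting order of the magnets is never stated in full, so the list below is a reconstruction inferred from the swaps called out in the conversation, and should be read as an assumption; the function name and the pass counter are likewise illustrative.

```python
def bubble_sort(values):
    """Repeatedly fix adjacent out-of-order pairs; the largest value bubbles to the end."""
    items = list(values)  # work on a copy so the caller's list is untouched
    n = len(items)
    passes = 0
    while n > 1:
        swapped = False
        for i in range(n - 1):
            if items[i] > items[i + 1]:
                # Local fix: swap the two neighbors that are out of order.
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        passes += 1
        if not swapped:
            break  # a full pass with no swaps means the list is sorted
        n -= 1     # the last item of this pass is in place; skip it next time
    return items, passes


board = [8, 1, 2, 6, 3, 4, 7, 5]  # assumed starting order, reconstructed from the dialogue
print(bubble_sort(board))  # ([1, 2, 3, 4, 5, 6, 7, 8], 4)
```

With this input, the first three passes each make at least one swap, matching the three passes in the conversation; the code then makes a fourth pass with no swaps, which is how a program confirms what a person can see at a glance.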

play12:57

- And certainly the search engines of the world,

play12:58

Google and Bing and so forth,

play12:59

they probably don't keep webpages in sorted order

play13:03

'cause that would be a crazy long list

play13:04

when you're just trying to search the data.

play13:06

But there's probably some algorithm underlying what they do

play13:08

and they probably similarly, just like we,

play13:10

do a bit of work upfront to get things organized

play13:14

even if it's not strictly sorted in the same way

play13:16

so that people like you and me and others

play13:18

can find that same information.

play13:20

So, how about social media?

play13:22

Can you envision where the algorithms are in that world?

play13:25

- Maybe for example like TikTok, like the For You page,

play13:28

'cause those are like recommendations, right?

play13:31

It's sort of like Netflix recommendations

play13:33

except more constant because it's just every video

play13:36

you scroll, it's like that's a new recommendation basically.

play13:39

And it's based on what you've liked previously,

play13:41

what you've saved previously, what you search up.

play13:43

So, I would assume there's some kind of algorithm there

play13:45

kind of figuring out like what to put on your For You page.

play13:48

- Absolutely. Just trying to keep you presumably

play13:50

more engaged.

play13:51

So, the better the algorithm is,

play13:52

the better your engagement is,

play13:54

maybe the more money the company then makes on the platform

play13:57

and so forth.

play13:58

So, it all sort of feeds together.

play13:59

But what you're describing really is more

play14:00

artificially intelligent, if I may,

play14:03

because presumably there's not someone at TikTok

play14:05

or any of these social media companies saying,

play14:07

"If Patricia likes this post, then show her this post.

play14:10

If she likes this post, then show her this other post."

play14:13

Because the code would sort of grow infinitely long

play14:15

and there's just way too much content for a programmer

play14:17

to be having those kinds of conditionals,

play14:20

those decisions being made behind the scenes.

play14:23

So, it's probably a little more artificially intelligent.

play14:26

And in that sense you have topics like neural networks,

play14:29

and machine learning which really describe

play14:31

taking as input things like what you watch,

play14:33

what you click on, what your friends watch,

play14:35

what they click on, and sort of trying to infer

play14:37

from that instead, what should we show Patricia

play14:39

or her friends next?

play14:41

- Oh, okay. Yeah. Yeah.

play14:42

That makes like the distinction more...

play14:44

Makes more sense now.

play14:45

- Nice. - Yeah.
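To make the contrast with hand-written conditionals concrete, here is a toy sketch that infers recommendations from what other people liked rather than from per-user rules. It is a simple overlap count, far simpler than a neural network, and it is not how TikTok or any real platform works; the data, user names, and function name are all hypothetical.

```python
from collections import Counter


def recommend(likes, user, top_n=2):
    """Suggest videos liked by users whose tastes overlap with this user's."""
    mine = likes[user]
    scores = Counter()
    for other, theirs in likes.items():
        if other == user:
            continue
        overlap = len(mine & theirs)  # how many likes the two users share
        if overlap == 0:
            continue
        for video in theirs - mine:   # only videos this user has not liked yet
            scores[video] += overlap
    # Highest score first; ties broken alphabetically so the output is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [video for video, _ in ranked[:top_n]]


likes = {
    "patricia": {"cats", "cooking"},
    "friend_a": {"cats", "cooking", "hiking"},
    "friend_b": {"cats", "chess"},
    "friend_c": {"cars"},
}
print(recommend(likes, "patricia"))  # ['hiking', 'chess']
```

No line of this code mentions a specific video; the suggestions come from the data, which is the distinction being drawn in the conversation.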

play14:46

[upbeat music]

play14:49

- I am currently a fourth year PhD student at NYU.

play14:52

I do robot learning, so that's half and half

play14:55

robotics and machine learning.

play14:56

- Sounds like you've dabbled with quite a few algorithms.

play14:59

So, how does one actually research algorithms

play15:01

or invent algorithms?

play15:02

- The most important way is just trying to think about

play15:04

inefficiencies, and also think about connecting threads.

play15:07

The way I think about it is that algorithm for me

play15:11

is not just about the way of doing something,

play15:12

but it's about doing something efficiently.

play15:14

Learning algorithms are practically everywhere now.

play15:17

Google, I would say for example,

play15:18

is learning every day about like,

play15:21

"Oh what articles, what links might be better than others?"

play15:24

And re-ranking them.

play15:25

There are recommender systems all around us, right?

play15:29

Like content feeds and social media,

play15:31

or you know, like YouTube or Netflix.

play15:34

What we see is in large part determined by these kinds of

play15:37

learning algorithms.

play15:39

- Nowadays there's a lot of concerns

play15:40

around some applications of machine learning

play15:43

like deep fakes where it can kind of learn how I talk

play15:46

and learn how you talk and even how we look,

play15:48

and generate videos of us.

play15:50

We're doing this for real, but you could imagine

play15:52

a computer synthesizing this conversation eventually.

play15:54

- Right.

play15:55

- But how does it even know what I sound like

play15:57

and what I look like, and how to replicate that?

play15:59

- All of these learning algorithms that we talk about, right?

play16:01

A lot, like what goes in there is just

play16:05

lots and lots of data.

play16:06

So, data goes in, something else comes out.

play16:08

What comes out is whatever objective function

play16:10

that you optimize for.
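One way to picture "data goes in, and what comes out depends on the objective function" is a tiny line-fitting sketch. It assumes mean squared error as the objective and plain gradient descent as the optimizer; the data, learning rate, and function name are chosen for illustration and do not come from the video.

```python
def fit_line(xs, ys, lr=0.01, steps=5000):
    """Find w and b that minimize the mean squared error of y = w * x + b."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the objective (mean squared error) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step downhill: adjust the parameters to reduce the objective.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]  # hypothetical data generated from y = 2x + 1
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))  # 2.0 1.0
```

Swapping in a different objective (for example, absolute error) would change what the same data produces, which is the point being made about choosing what to optimize for.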

play16:11

- Where is the line between algorithms that

play16:14

play games with and without AI?

play16:17

- I think when I started off my undergrad,

play16:19

the current AI and machine learning

play16:21

were not very much synonymous.

play16:23

- Okay.

play16:24

- And even in my undergraduate, in the AI class,

play16:27

we learned a lot of classical algorithms for gameplay.

play16:29

Like for example, the A* search, right?

play16:31

That's a very simple example of how you can play a game

play16:34

without having anything learned.

play16:37

This is very much, oh you are at a game state,

play16:40

you just search down, see what are the possibilities

play16:43

and then you pick the best possibility that it can see,

play16:46

versus what you think about when you think about,

play16:48

ah yes, gameplay like AlphaZero for example,

play16:53

or AlphaStar, or there are a lot of, you know,

play16:55

like fancy new machine learning agents that are

play16:58

even learning very difficult games like Go.

play17:01

And those are learned agents, as in they are getting better

play17:05

as they play more and more games.

play17:07

And as they get more games, they kind of

play17:09

refine their strategy based on the data that they've seen.

play17:12

And once again, this high level abstraction

play17:15

is still the same.

play17:16

You see a lot of data and you'll learn from that.

play17:18

But the question is what is the objective function

play17:21

that you're optimizing for?

play17:22

Is it winning this game?

play17:23

Is it forcing a tie or is it, you know,

play17:26

opening a door in a kitchen?
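For a sense of what a classical, non-learned search looks like, here is a minimal A* sketch. It finds a shortest path on a small grid rather than playing a game, and the grid, the Manhattan-distance heuristic, and the function name are assumptions made for the example. Nothing in it is learned from data: it explores possibilities from the current state and keeps the best one it can see.

```python
import heapq


def a_star(grid, start, goal):
    """Return the number of moves on a shortest path from start to goal, or None.

    grid is a list of equal-length strings; '#' marks a wall.
    """
    def heuristic(cell):
        # Manhattan distance never overestimates with 4-way unit moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(heuristic(start), 0, start)]  # (estimated total, cost so far, cell)
    best_cost = {start: 0}
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            return cost
        if cost > best_cost[cell]:
            continue  # stale queue entry; a cheaper route here was already found
        row, col = cell
        for nxt in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            r, c = nxt
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] != "#":
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    heapq.heappush(frontier, (new_cost + heuristic(nxt), new_cost, nxt))
    return None  # frontier exhausted: the goal is unreachable


grid = [
    "S.#.",
    "..#.",
    "...G",
]
print(a_star(grid, (0, 0), (2, 3)))     # 5
print(a_star(["S#G"], (0, 0), (0, 2)))  # None
```

A learned agent, by contrast, would adjust its behavior from the games it has played instead of following a fixed search procedure like this one.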

play17:27

- So, if the world is very much focused on supervised,

play17:30

unsupervised, and reinforcement learning now,

play17:32

what comes next five, ten years, where is the world going?

play17:35

- I think that this is just going to be more and more,

play17:39

I don't want to use the word encroachment,

play17:41

but that's what it feels like of algorithms

play17:43

into our everyday life.

play17:44

Like even when I was taking the train here, right?

play17:46

The trains are being routed with algorithms,

play17:48

but this has existed for you know, like 50 years probably.

play17:52

But as I was coming here, as I was checking my phone,

play17:54

those are different algorithms,

play17:56

and you know, they're kind of getting all around us,

play17:59

getting there with us all the time.

play18:01

They're making our life better most places, most cases.

play18:05

And I think that's just going to be a continuation

play18:07

of all of those.

play18:08

- And it feels like they're even in places

play18:09

you wouldn't expect, and there's just so much data

play18:11

about you and me and everyone else online

play18:13

and this data is being mined and analyzed,

play18:15

and influencing things we see and hear it would seem.

play18:18

So, there is sort of a counterpoint which might be good

play18:20

for the marketers, but not necessarily good for you and me

play18:22

as individuals.

play18:23

- We are human beings, but for someone

play18:26

we might be just a pair of eyes who are

play18:28

carrying a wallet, and are there to buy things.

play18:32

But there is so much more potential for these algorithms

play18:35

to just make our life better without

play18:37

changing much about our life.

play18:39

[upbeat music]

play18:43

- I'm Chris Wiggins. I'm an associate professor

play18:44

of Applied Mathematics at Columbia.

play18:46

I'm also the chief data scientist of the New York Times.

play18:48

The data science team at the New York Times

play18:49

develops and deploys machine learning

play18:51

for newsroom and business problems.

play18:53

But I would say the things that we do mostly, you don't see,

play18:55

but it might be things like personalization algorithms,

play18:58

or recommending different content.

play19:00

- And do data scientists, which is rather distinct

play19:02

from the phrase computer scientists.

play19:04

Do data scientists still think in terms of algorithms

play19:06

as driving a lot of it?

play19:08

- Oh absolutely, yeah.

play19:09

In fact, so in data science in academia,

play19:11

often the role of the algorithm is

play19:13

the optimization algorithm that helps you find the best

play19:16

model or the best description of a data set.

play19:17

And in data science in industry, the goal,

play19:20

often it's centered around an algorithm

play19:22

which becomes a data product.

play19:24

So, a data scientist in industry might be

play19:26

developing and deploying the algorithm,

play19:28

which means not only understanding the algorithm

play19:30

and its statistical performance,

play19:32

but also all of the software engineering

play19:34

around systems integration, making sure that that algorithm

play19:37

receives input that's reliable and has output that's useful,

play19:41

as well as I would say the organizational integration,

play19:44

which is how does a community of people

play19:46

like the set of people working at the New York Times

play19:47

integrate that algorithm into their process?

play19:50

- Interesting. And I feel like AI based startups

play19:52

are all the rage and certainly within academia.

play19:54

Are there connections between AI

play19:55

and the world of data science?

play19:57

- Oh, absolutely.

play19:57

- The algorithms that are therein,

play19:58

can you connect those dots for...

play19:59

- You're right that AI as a field has really exploded.

play20:01

I would say particularly many people experienced a ChatBot

play20:05

that was really, really good.

play20:06

Today, when people say AI,

play20:08

they're often thinking about large language models,

play20:10

or they're thinking about generative AI,

play20:12

or they might be thinking about a ChatBot.

play20:14

One thing to keep in mind is a ChatBot is a special case

play20:17

of generative AI, which is a special case of using

play20:19

large language models, which is a special case of using

play20:21

machine learning generally,

play20:23

which is what most people mean by AI.

play20:25

You may have moments that are what John McCarthy called,

play20:28

"Look Ma, no hands," results,

play20:30

where you do some fantastic trick and you're not quite sure

play20:32

how it worked.

play20:33

I think it's still very much early days.

play20:34

Large language models are still at the point of

play20:37

what might be called alchemy, in that people are building

play20:39

large language models without a real clear,

play20:41

a priori sense of what the right design is

play20:43

for a right problem.

play20:45

Many people are trying different things out,

play20:46

often in large companies where they can afford

play20:48

to have many people trying things out,

play20:49

seeing what works, publishing that,

play20:52

instantiating it as a product.

play20:53

- And that itself is part of the scientific process

play20:55

I would think too.

play20:56

- Yeah, very much. Well, science and engineering,

play20:58

because often you're building a thing

play20:59

and the thing does something amazing.

play21:01

To a large extent we are still looking for

play21:04

basic theoretical results around why

play21:06

deep neural networks generally work.

play21:09

Why are they able to learn so well?

play21:11

They're huge, billions of parameter models

play21:13

and it's difficult for us to interpret

play21:15

how they're able to do what they do.

play21:17

- And is this a good thing, do you think?

play21:19

Or an inevitable thing that we, the programmers,

play21:21

we, the computer scientists, the data scientists

play21:22

who are inventing these things,

play21:24

can't actually explain how they work?

play21:27

Because I feel like friends of mine in industry,

play21:28

even when it's something simple and relatively familiar

play21:31

like auto complete, they can't actually tell me

play21:33

why that name is appearing at the top of the list.

play21:35

Whereas years ago when these algorithms were more

play21:38

deterministic and more procedural,

play21:39

you could even point to the line that made that name

play21:42

bubble up to the top. - [Chris] Absolutely.

play21:44

- So, is this a good thing, a bad thing,

play21:45

that we're sort of losing control perhaps in some sense

play21:47

of the algorithm?

play21:48

- It has risks.

play21:49

I don't know that I would say that it's good or bad,

play21:51

but I would say there's lots of scientific precedent.

play21:53

There are times when an algorithm works really well

play21:55

and we have finite understanding of why it works

play21:57

or a model works really well

play21:59

and sometimes we have very little understanding

play22:01

of why it works the way it does.

play22:03

- In classes I teach, certainly spend a lot of time on

play22:04

fundamentals, algorithms that have been taught in classes

play22:07

for decades now, whether it's binary search,

play22:09

linear search, bubble sort, selection sort or the like,

play22:12

but if we're already at the point where I can pull up

play22:15

ChatGPT, copy paste a whole bunch of numbers or words

play22:18

and say, "Sort these for me,"

play22:20

does it really matter how ChatGPT is sorting it?

play22:24

Does it really matter to me as the user

play22:25

how the software is sorting it?

play22:27

Do these fundamentals become more dated and less important

play22:30

do you think?

play22:30

- Now you're talking about the ways in which code

play22:33

and computation is a special case of technology, right?

play22:36

So, for driving a car, you may not necessarily need

play22:39

to know much about organic chemistry,

play22:41

even though the organic chemistry is how the car works.

play22:45

So, you can drive the car and use it in different ways

play22:47

without understanding much about the fundamentals.

play22:49

So, similarly with computation, we're at a point

play22:51

where the computation is so high level, right?

play22:54

You can import scikit-learn and you can go from zero

play22:56

to machine learning in 30 seconds.

play22:58

It's depending on what level you want to understand

play23:00

the technology, where in the stack, so to speak,

play23:03

it's possible to understand it and make wonderful things

play23:05

and advance the world without understanding it

play23:08

at the particular level of somebody who actually might have

play23:10

originally designed the actual optimization algorithm.

play23:13

I should say though, for many of the optimization

play23:15

algorithms, there are cases where an algorithm

play23:16

works really well and we publish a paper,

play23:18

and there's a proof in the paper,

play23:21

and then years later people realize

play23:22

actually that proof was wrong and we're really

play23:23

still not sure why that optimization works,

play23:25

but it works really well or it inspires people

play23:28

to make new optimization algorithms.

play23:29

So, I do think that the goal of understanding algorithms

play23:34

is loosely coupled to our progress

play23:36

in advancing great algorithms, but they don't always

play23:39

necessarily have to require each other.

play23:40

- And for those students especially,

play23:42

or even adults who are thinking of now steering into

play23:44

computer science, into programming,

play23:46

who were really jazzed about heading in that direction

play23:49

up until, for instance, November of 2022,

play23:51

when all of a sudden for many people

play23:53

it looked like the world was now changing

play23:55

and now maybe this isn't such a promising path,

play23:58

this isn't such a lucrative path anymore.

play23:59

Are LLMs, are tools like ChatGPT, a reason not to perhaps

play24:04

steer into the field?

play24:05

- Large language models are a particular architecture

play24:06

for predicting, let's say the next word,

play24:09

or a set of tokens more generally.

play24:11

The algorithm comes in when you think about

play24:13

how is that LLM to be trained or also how to be fine tuned.

play24:18

So, the P of GPT is a pre-trained algorithm.

play24:22

The idea is that you train a large language model

play24:25

on some corpus of text, could be encyclopedias,

play24:28

or textbooks, or what have you.

play24:30

And then you might want to fine tune that model

play24:34

around some particular task or

play24:36

some particular subset of texts.

play24:38

So, both of those are examples of training algorithms.
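The two training stages just described can be imitated with a deliberately tiny stand-in. The sketch below is not a large language model: it only counts which word follows which, whereas a real LLM is a neural network trained by optimization. The corpus text and function names are hypothetical, and the only thing it aims to show is the shape of "train on a general corpus, then train further on a narrower one" for next-word prediction.

```python
from collections import Counter, defaultdict


def train(model, text):
    """Update next-word counts from a piece of text."""
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        model[current][following] += 1
    return model


def predict_next(model, word):
    """Return the most frequently seen next word, or None if the word is unknown."""
    counts = model.get(word.lower())
    if not counts:
        return None
    # Iterate in alphabetical order so ties resolve the same way every run.
    return max(sorted(counts), key=lambda w: counts[w])


model = defaultdict(Counter)

# "Pre-training" on a general (hypothetical) corpus.
train(model, "the cat sat on the mat and the cat slept")
before = predict_next(model, "the")
print(before)  # cat

# "Fine-tuning" on a narrower (hypothetical) set of texts shifts the predictions.
train(model, "the dog barked and the dog ran and the dog slept")
after = predict_next(model, "the")
print(after)   # dog
```

The same function does both jobs here; what differs between the stages is the text it is given, which loosely mirrors the pre-train and fine-tune split in the conversation.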

play24:41

So, I would say people's perception

play24:43

of artificial intelligence has really changed a lot

play24:45

in the last six months, particularly around November of 2022

play24:51

when people experienced a really good ChatBot.

play24:53

The technology though had been around already before.

play24:55

Academics had already been working with GPT-3

play24:58

before that and GPT-2 and GPT-1.

play25:01

And for many people it sort of opened up this conversation

play25:03

about what is artificial intelligence

play25:05

and what could we do with this?

play25:06

And what are the possible good and bad, right?

play25:08

Like any other piece of technology.

play25:10

Kranzberg's first law of technology,

play25:11

technology is neither good, nor bad, nor is it neutral.

play25:14

Every time we have some new technology,

play25:15

we should think about its capabilities

play25:17

and the good, and the possible bad.

play25:19

- [David] As with any area of study,

play25:21

algorithms offer a spectrum from the most basic

play25:23

to the most advanced.

play25:25

And even if right now, the most advanced of those algorithms

play25:28

feels out of reach because you just

play25:30

don't have that background,

play25:31

with each lesson you learn, with each algorithm you study,

play25:34

that end game becomes closer and closer

play25:36

such that it will, before long, be accessible to you

play25:39

and you will be at the most advanced end of that spectrum.
