Intro to Algorithms: Crash Course Computer Science #13

CrashCourse
24 May 2017 · 11:44

Summary

TLDR: This episode of CrashCourse Computer Science introduces the basics of algorithms. Host Carrie Anne first recaps earlier episodes on high-level programming languages such as Python and Java, covering the different types of programming statements (assignments, conditionals, and loops) and how statements are grouped into functions that perform a computation, such as calculating an exponent. She then stresses the importance of algorithms, the specific steps used to complete a computation, and notes that even when algorithms produce the same result, some are better than others. The episode also covers the origin of the word "algorithm" and how the crafting of efficient algorithms gave rise to a whole science surrounding computation, which evolved into modern computer science. Special attention goes to sorting, one of the most famous problems in computer science. Using a simple example, the episode introduces Selection Sort, walks through its steps, and shows its process in pseudo-code. Selection Sort's time complexity is N squared, which is inefficient for large datasets. The episode then turns to Merge Sort, a more efficient algorithm with a time complexity of N times log N, using diagrams to show how the array is split into smaller pieces that are then merged back together in sorted order. Finally, the episode explores graph search, in particular Dijkstra's algorithm, a classic solution for finding the shortest path between nodes in a graph, explained through an example of finding the fastest route between Highgarden and Winterfell. The episode closes by emphasizing how ubiquitous and important algorithms are in the modern world and encourages viewers to explore the world of algorithms further.

Takeaways

  • 📚 High-level programming languages let us perform computations using different statement types, such as assignments, conditionals (if), and loops.
  • 🔢 A function is a block of code that performs a computation, such as calculating an exponent, but functions that produce the same result can use different algorithms.
  • ⏱️ An algorithm's efficiency depends on how many steps it takes to complete the computation, and also on how much memory it uses.
  • 🌟 The word "algorithm" comes from the Persian mathematician Muḥammad ibn Mūsā al-Khwārizmī, one of the fathers of algebra.
  • 💻 Computer science is the science surrounding computation; it grew out of the crafting of efficient algorithms.
  • 🔍 Sorting, such as sorting names or numbers, is a classic problem in computer science.
  • 📈 Selection Sort is a simple sorting algorithm that works by repeatedly finding the smallest element and moving it to the front of the array.
  • 🔽 Selection Sort has a time complexity of O(N^2), which means its running time grows quadratically as the array gets larger.
  • 🔄 Merge Sort is a more efficient sorting algorithm that works by splitting the array into smaller pieces and progressively merging them; its time complexity is O(N log N).
  • 🛣️ Graph search is another classic problem in computer science: finding a path through a graph, a network of nodes and edges.
  • 🔬 Dijkstra's algorithm finds the shortest path between nodes in a weighted graph.
  • 🚀 Algorithms are everywhere in the modern world, from sorting to pathfinding; they are a core part of computer science.

Q & A

  • Which high-level programming languages are mentioned as examples in this video transcript?

    - The transcript mentions Python and Java as examples of high-level programming languages.

  • What does the transcript say about possible solutions for the exponent-calculating function?

    - The transcript notes that there are multiple possible solutions for the exponent function; even if they produce the same numerical result, they can use different algorithms.

  • Where does the word "algorithm" come from, and which historical figure is it associated with?

    - The word "algorithm" comes from Persia and is associated with the Persian mathematician Muḥammad ibn Mūsā al-Khwārizmī, one of the fathers of algebra.

  • Which sorting algorithms does the transcript cover?

    - The transcript covers two sorting algorithms: Selection Sort and Merge Sort.

  • What is the time complexity of the Selection Sort algorithm?

    - Selection Sort's time complexity is N squared, i.e. O(N^2).

  • What is the time complexity of the Merge Sort algorithm?

    - Merge Sort's time complexity is N times log N, i.e. O(N log N).

  • What elements does a graph search problem typically involve?

    - Graph search problems typically involve nodes (cities) and edges (roads), where edges can carry a cost or weight (travel time).

  • How does Dijkstra's algorithm determine the shortest path?

    - Dijkstra's algorithm starts from the lowest-cost node and works outward, computing the running cost to every connected node until it finds the path with the lowest total cost to the target node.

  • What is the complexity of the original Dijkstra's algorithm mentioned in the transcript?

    - The original Dijkstra's algorithm has a complexity of the number of nodes in the graph squared.

  • What is the complexity of the improved Dijkstra's algorithm?

    - The improved Dijkstra's algorithm has a complexity of the number of nodes times the log of the number of nodes, plus the number of edges.

  • What kind of algorithm do services like Google Maps use to find the best route?

    - The transcript says that services like Google Maps use algorithms much like Dijkstra's to compute the best route.

  • What important computer science concept does the transcript emphasize?

    - The transcript emphasizes the concept of the algorithm, which is everywhere in the modern world and essential for solving problems efficiently.

Outlines

00:00

📚 Algorithm Basics and Sorting Algorithms

In this CrashCourse Computer Science episode, Carrie Anne introduces the basics of high-level programming languages, including different statement types such as assignments, conditionals, and loops. She emphasizes the importance of algorithms, the specific steps used to complete a computation, and the question of algorithmic efficiency. By comparing algorithms that produce the same result, she introduces sorting and uses Selection Sort as an example of how an algorithm works and how efficient it is. She also introduces Big O notation for expressing an algorithm's complexity, and illustrates Selection Sort's efficiency problem by comparing sorting times for arrays of different sizes.

05:00

🔍 Merge Sort and Graph Search Algorithms

Carrie Anne continues with Merge Sort, an algorithm more efficient than Selection Sort. She shows how it works by repeatedly splitting the array down to single elements and then merging the pieces back together pair by pair. Merge Sort's complexity is N times log N, far more efficient than Selection Sort's N squared. She then introduces graph search through an example of finding the fastest route, presenting Dijkstra's algorithm, which finds the best path by recording the lowest known cost from the start to each node and always expanding from the cheapest unvisited node. She also mentions the later improvement to Dijkstra's algorithm and its real-world applications in navigation services such as Google Maps.

10:04

🌐 Real-World Applications and Importance of Algorithms

At the end of the episode, Carrie Anne emphasizes how ubiquitous and important algorithms are in the modern world. She mentions the original and improved versions of Dijkstra's algorithm and the difference in their efficiency on large problems. She points out that although many different graph search algorithms exist, Dijkstra's algorithm and its variants are used very widely in practice, for example to compute the best routes in services like Google Maps. She encourages viewers to keep exploring the world of algorithms and looks forward to seeing everyone in the next episode.

Keywords

💡Algorithm

An algorithm is a core concept in computer science: a well-defined set of steps for solving a particular problem. In the video, algorithms are compared by efficiency, such as Selection Sort versus Merge Sort. Efficiency is usually measured by time complexity; for example, Selection Sort's time complexity is N squared, while Merge Sort's is N times log N.

💡Selection Sort

Selection Sort is a simple sorting algorithm: it repeatedly scans the array for the smallest (or largest) remaining element and swaps it with the element at the current position. The video demonstrates the steps of Selection Sort on an example array of 8 elements.

💡Merge Sort

Merge Sort is a divide-and-conquer algorithm: it first splits the array into two halves, recursively sorts each half, and finally merges the two sorted halves into one ordered array. The video walks through the splitting, recursion, and merging steps on an 8-element array.

💡Time Complexity

Time complexity measures how an algorithm's running time relates to the size of its input, usually written in Big O notation. For example, Selection Sort is O(N^2) while Merge Sort is O(N log N). The video compares the time complexities of different sorting algorithms to show why efficiency matters.
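
To make these growth rates concrete, here is a small illustrative Python snippet (not from the video) that tabulates both step counts for a few input sizes:

```python
import math

# Rough step counts for an O(N^2) sort versus an O(N log N) sort.
for n in [8, 80, 800, 8000]:
    print(f"N={n:>4}  N^2={n * n:>10,}  N*log2(N)={n * math.log2(n):>10,.0f}")
```

At N = 8 the two counts are close (64 versus 24), but by N = 8000 the quadratic algorithm needs hundreds of times more steps.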

💡Big O Notation

Big O notation is a mathematical notation describing an upper bound on an algorithm's running time. It is an essential tool of algorithm analysis, used to estimate how running time grows as the input gets larger. The video introduces Big O notation and uses it to compare the efficiency of different sorting algorithms.

💡Bubble Sort

Bubble Sort is a simple sorting algorithm that repeatedly passes over the list, compares adjacent elements, and swaps them if they are out of order, until no more swaps are needed. The video mentions Bubble Sort by name as one of the many sorting algorithms computer scientists have invented.

💡Graph Search

Graph search algorithms find particular paths through a graph, a structure made of nodes (vertices) and the edges (lines) connecting them. In the video, graph search is used to find the shortest path from one node to another, such as the fastest route from Highgarden to Winterfell.

💡Dijkstra's Algorithm

Dijkstra's algorithm finds the shortest paths from a single source node to the other vertices in a weighted graph. The video uses a graph of 6 cities and 9 routes to show how Dijkstra's algorithm finds the fastest path from Highgarden to Winterfell.

💡Node

In graph theory, a node (or vertex) is a basic element of a graph, representing a single point. In the video, nodes represent cities such as Highgarden and King's Landing, which are connected to one another by routes (edges).

💡Edge

An edge is the other basic element of graph theory: a line connecting two nodes. In the video, edges represent routes between cities, each carrying a cost or weight corresponding to travel time.

💡Divide and Conquer

Divide and conquer is an algorithm-design strategy: break a large problem that is hard to solve directly into smaller instances of the same problem, solve those recursively, and combine their solutions to solve the original problem. Merge Sort in the video is an example of divide and conquer.

Highlights

Programming languages offer different statement types, such as assignments, conditionals, and loops, which can be grouped into functions that perform computations, like calculating an exponent.

Even when they produce the same result, some algorithms are better than others; generally, the fewer steps an algorithm takes, the better.

The word "algorithm" comes from the Persian mathematician Muḥammad ibn Mūsā al-Khwārizmī, one of the fathers of algebra.

The problem of crafting efficient algorithms existed long before modern computers; it gave rise to a whole science surrounding computation that evolved into modern computer science.

Sorting, as in sorting names or numbers, is one of the most storied and fundamental algorithmic problems in computer science.

Computer scientists have invented many sorting algorithms, with names like Bubble Sort and Spaghetti Sort.

Selection Sort is a simple sorting algorithm that repeatedly scans the array for the smallest value and swaps it into the correct position.

Selection Sort's running time grows with N squared, which makes it inefficient for large datasets.

Merge Sort is a more efficient sorting algorithm, with a running time of N times log N.

Merge Sort works by splitting the array into smaller pieces and then merging them back together until the whole array is sorted.

Graph search is another classic category of algorithmic problems: finding paths through a graph, a network of nodes and edges.

Dijkstra's algorithm is a classic graph search algorithm for finding the shortest path between two nodes in a graph.

Dijkstra's algorithm works by recording the running cost to each node and continually updating the shortest known paths.

The original version of Dijkstra's algorithm was conceived in 1956 and was later improved for better efficiency.

The improved Dijkstra's algorithm runs in time proportional to the number of nodes times the log of the number of nodes, plus the number of edges.

Algorithms are everywhere; the modern world would not be possible without them.

A computer scientist's job includes leveraging existing algorithms and writing new ones when needed.

Transcripts

00:03 Hi, I’m Carrie Anne, and welcome to CrashCourse Computer Science!
00:05 Over the past two episodes, we got our first taste of programming in a high-level language,
00:10 like Python or Java.
00:11 We talked about different types of programming language statements – like assignments,
00:14 ifs, and loops – as well as putting statements into functions that perform a computation,
00:18 like calculating an exponent.
00:20 Importantly, the function we wrote to calculate exponents is only one possible solution.
00:24 There are other ways to write this function – using different statements in different
00:27 orders – that achieve exactly the same numerical result.
00:30 The difference between them is the algorithm, that is the specific steps used to complete
00:34 the computation.
00:35 Some algorithms are better than others even if they produce equal results.
00:38 Generally, the fewer steps it takes to compute, the better it is, though sometimes we care
00:42 about other factors, like how much memory it uses.
00:45 The term algorithm comes from Persian polymath Muḥammad ibn Mūsā al-Khwārizmī who was
00:49 one of the fathers of algebra more than a millennium ago.
00:52 The crafting of efficient algorithms – a problem that existed long before modern computers
00:56 – led to a whole science surrounding computation, which evolved into the modern discipline of…
01:00 you guessed it!
01:01 Computer Science!
01:02 INTRO
01:11 One of the most storied algorithmic problems in all of computer science is sorting… as
01:15 in sorting names or sorting numbers.
01:18 Computers sort all the time.
01:19 Looking for the cheapest airfare, arranging your email by most recently sent, or scrolling
01:23 your contacts by last name -- those all require sorting.
01:27 You might think “sorting isn’t so tough… how many algorithms can there possibly be?”
01:30 The answer is: a lot.
01:32 Computer Scientists have spent decades inventing algorithms for sorting, with cool names like
01:36 Bubble Sort and Spaghetti Sort.
01:38 Let’s try sorting!
01:39 Imagine we have a set of airfare prices to Indianapolis.
01:42 We’ll talk about how data like this is represented in memory next week, but for now, a series
01:47 of items like this is called an array.
01:49 Let’s take a look at these numbers to help see how we might sort this programmatically.
01:52 We’ll start with a simple algorithm.
01:55 First, let’s scan down the array to find the smallest number.
01:58 Starting at the top with 307.
01:59 It’s the only number we’ve seen, so it’s also the smallest.
02:03 The next is 239, that’s smaller than 307, so it becomes our new smallest number.
02:09 Next is 214, our new smallest number.
02:11 250 is not, neither is 384, 299, 223 or 312.
02:19 So we’ve finished scanning all numbers, and 214 is the smallest.
02:22 To put this into ascending order, we swap 214 with the number in the top location.
02:27 Great.
02:28 We sorted one number!
02:29 Now we repeat the same procedure, but instead of starting at the top, we can start one spot
02:33 below.
02:34 First we see 239, which we save as our new smallest number.
02:38 Scanning the rest of the array, we find 223 is the next smallest, so we swap this with
02:43 the number in the second spot.
02:44 Now we repeat again, starting from the third number down.
02:47 This time, we swap 239 with 307.
02:51 This process continues until we get to the very last number, and voila, the array is
02:55 sorted and you’re ready to book that flight to Indianapolis!
02:58 The process we just walked through is one way – or one algorithm – for sorting an
03:02 array.
03:03 It’s called Selection Sort -- and it’s pretty basic.
03:05 Here’s the pseudo-code.
03:06 This function can be used to sort 8, 80, or 80 million numbers - and once you've written
03:11 the function, you can use it over and over again.
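
The pseudo-code shown on screen is not reproduced in the transcript; a minimal Python sketch of the selection sort described above, using the episode's airfare numbers, might look like this:

```python
def selection_sort(arr):
    """Sort arr in place, in ascending order (roughly N^2 steps)."""
    n = len(arr)
    for i in range(n):                  # the position to fill next
        smallest = i
        for j in range(i + 1, n):       # scan the rest for the minimum
            if arr[j] < arr[smallest]:
                smallest = j
        arr[i], arr[smallest] = arr[smallest], arr[i]  # swap it into place
    return arr

fares = [307, 239, 214, 250, 384, 299, 223, 312]
print(selection_sort(fares))  # [214, 223, 239, 250, 299, 307, 312, 384]
```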

03:14 With this sort algorithm, we loop through each position in the array, from top to bottom,
03:17 and then for each of those positions, we have to loop through the array to find the smallest
03:21 number to swap.
03:22 You can see this in the code, where one FOR loop is nested inside of another FOR loop.
03:26 This means, very roughly, that if we want to sort N items, we have to loop N times,
03:30 inside of which, we loop N times, for a grand total of roughly N times N loops...
03:35 Or N squared.
03:36 This relationship of input size to the number of steps the algorithm takes to run characterizes
03:41 the complexity of the Selection Sort algorithm.
03:44 It gives you an approximation of how fast, or slow, an algorithm is going to be.
03:48 Computer Scientists write this order of growth in something known as – no joke – “big
03:52 O notation”.
03:53 N squared is not particularly efficient.
03:56 Our example array had n = 8 items, and 8 squared is 64.
04:00 If we increase the size of our array from 8 items to 80, the running time is now 80
04:05 squared, which is 6,400.
04:07 So although our array only grew by 10 times - from 8 to 80 – the running time increased
04:11 by 100 times – from 64 to 6,400!
04:16 This effect magnifies as the array gets larger.
04:18 That’s a big problem for a company like Google, which has to sort arrays with millions
04:22 or billions of entries.
04:23 So, you might ask, as a burgeoning computer scientist, is there a more efficient sorting algorithm?
04:28 Let’s go back to our old, unsorted array and try a different algorithm, merge sort.
04:33 The first thing merge sort does is check if the size of the array is greater than 1.
04:36 If it is, it splits the array into two halves.
04:39 Since our array is size 8, it gets split into two arrays of size 4.
04:43 These are still bigger than size 1, so they get split again, into arrays of size 2, and
04:47 finally they split into 8 arrays with 1 item in each.
04:50 Now we are ready to merge, which is how “merge sort” gets its name.
04:53 Starting with the first two arrays, we read the first – and only – value in them,
04:57 in this case, 307 and 239.
05:00 239 is smaller, so we take that value first.
05:03 The only number left is 307, so we put that value second.
05:07 We’ve successfully merged two arrays.
05:09 We now repeat this process for the remaining pairs, putting them each in sorted order.
05:13 Then the merge process repeats.
05:15 Again, we take the first two arrays, and we compare the first numbers in them.

05:19 This time it’s 239 and 214.

05:22 214 is lowest, so we take that number first.
05:25 Now we look again at the first two numbers in both arrays: 239 and 250.
05:31 239 is lower, so we take that number next.
05:34 Now we look at the next two numbers: 307 and 250.
05:37 250 is lower, so we take that.
05:40 Finally, we’re left with just 307, so that gets added last.
05:44 In every case, we start with two arrays, each individually sorted, and merge them into a
05:48 larger sorted array.
05:50 We repeat the exact same merging process for the two remaining arrays of size two.
05:54 Now we have two sorted arrays of size 4.
05:56 Just as before, we merge, comparing the first two numbers in each array, and taking the
06:00 lowest.
06:01 We repeat this until all the numbers are merged, and then our array is fully sorted again!
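
A compact Python sketch of the merge sort just walked through (one reasonable implementation, not the episode's on-screen code) could look like this:

```python
def merge_sort(arr):
    """Return a new, ascending-sorted copy of arr (roughly N log N steps)."""
    if len(arr) <= 1:                   # an array of size 1 is already sorted
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])        # split into halves and sort each...
    right = merge_sort(arr[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):  # ...then merge, always taking
        if left[i] <= right[j]:              # the smaller front item
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])             # append whichever side has leftovers
    merged.extend(right[j:])
    return merged

print(merge_sort([307, 239, 214, 250, 384, 299, 223, 312]))
```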

06:05 The bad news is: no matter how many times we sort these, you’re still going to have
06:08 to pay $214 to get to Indianapolis.
06:11 Anyway, the “Big O” computational complexity of merge sort is N times the Log of N.
06:16 The N comes from the number of times we need to compare and merge items, which is directly
06:20 proportional to the number of items in the array.
06:22 The Log N comes from the number of merge steps.
06:25 In our example, we broke our array of 8 items into 4, then 2, and finally 1.
06:29 That’s 3 splits.
06:31 Splitting in half repeatedly like this has a logarithmic relationship with the number
06:34 of items - trust me!
06:36 Log base 2 of 8 equals 3 splits.
06:38 If we double the size of our array to 16 – that's twice as many items to sort – it only increases
06:42 the number of split steps by 1 since log base 2 of 16 equals 4.
06:46 Even if we increase the size of the array more than a thousand times, from 8 items to
06:50 8000 items, the number of split steps stays pretty low.
06:54 Log base 2 of 8000 is roughly 13.
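
If you would rather not just trust her, a quick Python check confirms the split counts quoted here:

```python
import math

# The number of "split in half" steps grows logarithmically with array size.
for n in [8, 16, 8000]:
    print(f"log2({n}) = {math.log2(n):.2f} splits")
# log2(8) = 3.00, log2(16) = 4.00, log2(8000) = 12.97 (roughly 13)
```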

06:56 That's more, but not much more than 3 -- about four times larger – and yet we’re sorting
07:01 a lot more numbers.
07:02 For this reason, merge sort is much more efficient than selection sort.
07:06 And now I can put my ceramic cat collection in name order MUCH faster!
07:10 There are literally dozens of sorting algorithms we could review, but instead, I want to move
07:14 on to my other favorite category of classic algorithmic problems: graph search!
07:18 A graph is a network of nodes connected by lines.
07:20 You can think of it like a map, with cities and roads connecting them.
07:23 Routes between these cities take different amounts of time.
07:26 We can label each line with what is called a cost or weight.
07:29 In this case, it’s weeks of travel.
07:31 Now let’s say we want to find the fastest route for an army at Highgarden to reach the
07:35 castle at Winterfell.
07:36 The simplest approach would just be to try every single path exhaustively and calculate
07:41 the total cost of each.
07:42 That’s a brute force approach.
07:43 We could have used a brute force approach in sorting, by systematically trying every
07:47 permutation of the array to check if it’s sorted.
07:50 This would have an N factorial complexity - that is the number of items, times one less,
07:56 times one less than that, and so on until 1.
07:58 Which is way worse than even N squared.
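
For a sense of just how much worse N factorial is than N squared, here is a quick illustrative comparison in Python:

```python
import math

# Brute-force sorting would test every permutation of the array: N! of them.
for n in [8, 12, 16]:
    print(f"N={n:>2}  N^2={n * n:>4}  N!={math.factorial(n):,}")
# At N=8, N! is already 40,320 versus 64; by N=16 it is over 20 trillion.
```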

08:00 But, we can be way more clever!
08:03 The classic algorithmic solution to this graph problem was invented by one of the greatest
08:06 minds in computer science practice and theory, Edsger Dijkstra, so it’s appropriately named
08:11 Dijkstra's algorithm.
08:12 We start in Highgarden with a cost of 0, which we mark inside the node.
08:16 For now, we mark all other cities with question marks - we don’t know the cost of getting
08:20 to them yet.
08:21 Dijkstra's algorithm always starts with the node with lowest cost.
08:24 In this case, it only knows about one node, Highgarden, so it starts there.
08:28 It follows all paths from that node to all connecting nodes that are one step away, and
08:32 records the cost to get to each of them.
08:34 That completes one round of the algorithm.
08:36 We haven’t encountered Winterfell yet, so we loop and run Dijkstra's algorithm again.
08:40 With Highgarden already checked, the next lowest cost node is King's Landing.
08:44 Just as before, we follow every unvisited line to any connecting cities.
08:47 The line to The Trident has a cost of 5.
08:50 However, we want to keep a running cost from Highgarden, so the total cost of getting to
08:54 The Trident is 8 plus 5, which is 13 weeks.
08:57 Now we follow the offroad path to Riverrun, which has a high cost of 25, for a total of 33.
09:02 But we can see inside of Riverrun that we’ve already found a path with a lower cost of
09:06 just 10.
09:07 So we disregard our new path, and stick with the previous, better path.
09:10 We’ve now explored every line from King's Landing and didn’t find Winterfell, so we
09:14 move on.
09:15 The next lowest cost node is Riverrun, at 10 weeks.
09:18 First we check the path to The Trident, which has a total cost of 10 plus 2, or 12.
09:23 That’s slightly better than the previous path we found, which had a cost of 13, so
09:27 we update the path and cost to The Trident.
09:29 There is also a line from Riverrun to Pyke with a cost of 3.
09:31 10 plus 3 is 13, which beats the previous cost of 14, and so we update Pyke's path and
09:36 cost as well.
09:37 That’s all paths from Riverrun checked... so... you guessed it, Dijkstra's algorithm
09:41 loops again.
09:42 The node with the next lowest cost is The Trident and the only line from The Trident
09:46 that we haven’t checked is a path to Winterfell!
09:49 It has a cost of 10, plus we need to add in the cost of 12 it takes to get to The Trident,
09:53 for a grand total cost of 22.
09:55 We check our last path, from Pyke to Winterfell, which sums to 31.
09:59 Now we know the lowest total cost, and also the fastest route for the army to get there,
10:03 which avoids King’s Landing!
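
A compact Python sketch of Dijkstra's algorithm with a priority queue is shown below. The edge weights are reconstructed from the narration above; the on-screen map is not part of the transcript, so treat the graph itself as an assumption.

```python
import heapq

def dijkstra(graph, start, goal):
    """Return (total cost, path) of the cheapest route from start to goal."""
    frontier = [(0, start, [start])]    # (cost so far, node, path taken)
    visited = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)  # cheapest node first
        if node == goal:
            return cost, path
        if node in visited:
            continue                    # already expanded via a cheaper path
        visited.add(node)
        for neighbor, weight in graph.get(node, []):
            heapq.heappush(frontier, (cost + weight, neighbor, path + [neighbor]))
    return float("inf"), []

# Weights inferred from the walkthrough (weeks of travel); an approximation.
graph = {
    "Highgarden":     [("King's Landing", 8), ("Riverrun", 10), ("Pyke", 14)],
    "King's Landing": [("The Trident", 5), ("Riverrun", 25)],
    "Riverrun":       [("The Trident", 2), ("Pyke", 3)],
    "The Trident":    [("Winterfell", 10)],
    "Pyke":           [("Winterfell", 18)],
}
print(dijkstra(graph, "Highgarden", "Winterfell"))
# (22, ['Highgarden', 'Riverrun', 'The Trident', 'Winterfell'])
```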

10:05 Dijkstra's original algorithm, conceived in 1956, had a complexity of the number of nodes
10:09 in the graph squared.
10:10 And squared, as we already discussed, is never great, because it means the algorithm can’t
10:14 scale to big problems - like the entire road map of the United States.
10:18 Fortunately, Dijkstra's algorithm was improved a few years later to take the number of nodes
10:22 in the graph, times the log of the number of nodes, PLUS the number of lines.
10:26 Although this looks more complicated, it’s actually quite a bit faster.
10:30 Plugging in our example graph, with 6 cities and 9 lines, proves it.
10:33 Our algorithm drops from 36 loops to around 14.
10:36 As with sorting, there are innumerable graph search algorithms, with different pros and cons.
10:41 Every time you use a service like Google Maps to find directions, an algorithm much like
10:45 Dijkstra's is running on servers to figure out the best route for you.
10:48 Algorithms are everywhere and the modern world would not be possible without them.
10:52 We touched only the very tip of the algorithmic iceberg in this episode, but a central part
10:57 of being a computer scientist is leveraging existing algorithms and writing new ones when
11:01 needed, and I hope this little taste has intrigued you to SEARCH further.
11:05 I’ll see you next week.


Related Tags
Computer Science, Algorithms, Sorting, Selection Sort, Merge Sort, Graph Search, Dijkstra's Algorithm, Efficiency, Programming Languages, Data Structures, Computational Complexity, Optimization