Data Structures: Crash Course Computer Science #14
Summary
TLDR这段视频脚本介绍了计算机科学中使用的基本数据结构,包括数组、字符串、矩阵、结构体、链表、队列、栈、树和图。数组是存储在内存中的一系列值,可以通过索引访问。字符串是字符的数组,以空字符结尾。矩阵是数组的数组,用于存储二维数据。结构体允许将多个变量捆绑在一起存储。链表是一种灵活的数据结构,可以通过改变节点的指针来动态地添加或删除节点。队列遵循先进先出(FIFO)的原则,而栈遵循后进先出(LIFO)的原则。树是一种层级结构,其中每个节点可能有多个子节点。图是一种更为通用的数据结构,允许任意节点之间的连接。不同的数据结构适用于不同的计算任务,选择正确的数据结构可以大大简化编程工作。大多数编程语言都提供了内置的数据结构库,如C++的标准模板库和Java的类库,这使得程序员可以不必从头开始实现这些结构,而是利用它们进行更高层次的抽象和更有趣的编程工作。
Takeaways
- 📚 数据结构对于算法的执行至关重要,它们决定了数据在计算机内存中的存储方式。
- 🔍 数组是基本的数据结构,可以存储一系列值,并通过索引来访问特定的值。
- 🔑 字符串实际上是字符的数组,以null字符(二进制0)结束,以标识字符串在内存中的结束。
- 🖇️ 矩阵可以视为数组的数组,用于存储二维数据,如电子表格中的网格或屏幕上的像素。
- 🏢 结构体(Struct)允许将相关的变量组合在一起存储,如银行账户号和余额。
- 🔗 指针是一种特殊变量,指向内存中的一个位置,用于创建链表等灵活的数据结构。
- 🔄 链表是一种可以动态扩展或缩短的数据结构,通过改变节点的指针来添加或删除元素。
- 📈 队列(Queue)和栈(Stack)是基于链表的更复杂的数据结构,分别遵循FIFO和LIFO原则。
- 🌳 树是一种数据结构,由节点组成,每个节点有零个或多个子节点,用于表示层次关系。
- 🔍 二叉树是特殊的树,每个节点最多有两个子节点,常用于各种算法中。
- 🔗 图(Graph)是一种更为通用的数据结构,允许节点之间有任意的连接,包括循环。
- 🛠️ 编程语言通常提供内置的或库中的数据结构,如C++的STL和Java的JCL,以简化实现复杂数据结构的工作。
Q & A
数据结构在计算机科学中的作用是什么?
-数据结构在计算机科学中用于组织和存储数据,以便数据可以被高效地检索和读取。它们允许算法以一种结构化的方式运行,从而提高程序的性能和效率。
数组(Array)在内存中是如何存储的?
-数组在内存中是连续存储的。它们是一系列值,按照内存地址的连续性存放。例如,一个包含7个数字的数组可能被存储在内存位置1000开始的地方,这些数字一个接一个地存储在连续的内存地址中。
如何通过索引访问数组中的特定值?
-通过指定索引来访问数组中的特定值。几乎所有的编程语言都从索引0开始计数,使用方括号语法来表示数组访问。例如,要访问数组中索引为5的元素,程序会跳转到内存位置1000加上5的偏移量。
字符串(String)是如何存储在内存中的?
-字符串实际上是字符的数组。它们在内存中存储时,每个字符占用一个位置,字符串以一个特殊的null字符(二进制值0)结束,这表示字符串的结束。
矩阵(Matrix)可以如何表示?
-矩阵可以被看作是数组的数组。例如,一个3x3的矩阵实际上是一个包含3个元素的数组,每个元素本身也是一个包含3个元素的数组。
如何通过索引访问矩阵中的特定值?
-访问矩阵中的特定值需要指定两个索引。例如,要访问矩阵中索引为(2,1)的元素,意味着要查找子数组2中索引为1的元素。
结构体(Struct)在数据存储中有什么作用?
-结构体允许将相关的变量组合在一起存储。它们可以创建复合数据结构,能够一次性存储多个数据片段,如银行账户的账号和余额。
链表(Linked List)与数组相比有什么优势?
-链表是一种灵活的数据结构,可以通过改变节点的指针来动态地扩展或缩短。与数组相比,链表可以轻松地插入、删除、排序、分割和反转,非常适合需要频繁变动的数据结构。
队列(Queue)和栈(Stack)是如何基于链表实现的?
-队列遵循先进先出(FIFO)的原则,像邮局排队一样,通过改变链表的头指针来实现入队(enqueue)和出队(dequeue)。栈遵循后进先出(LIFO)的原则,像堆叠的煎饼,通过压栈(push)和弹栈(pop)操作来管理数据。
树(Tree)和图(Graph)数据结构的主要区别是什么?
-树是一种有层级结构的数据结构,其中每个节点除了有一个父节点外,还可以有多个子节点,但树中的节点没有循环。而图是一种更为通用的数据结构,允许任意两个节点之间的连接,包括循环。
为什么选择合适的数据结构对于编程很重要?
-选择合适的数据结构可以显著提高程序的性能和开发效率。不同的数据结构有不同的特性,适用于不同的计算需求。大多数编程语言都提供了丰富的数据结构库,如C++的Standard Template Library和Java的Java Class Library,使得程序员可以不必从零开始实现数据结构。
Outlines
📚 数据结构基础
Carrie Anne 在这一段落中介绍了计算机科学中数据结构的重要性和基本概念。她首先提到了算法运行时数据存储的问题,并强调了数据组织的必要性。接着,她回顾了数组(或称为列表、向量)这一基本数据结构,解释了数组是如何在内存中存储一系列值的,并介绍了如何通过索引来访问数组中的特定值。此外,她还提到了字符串、矩阵、结构体(Struct)等数据结构,以及它们在内存中的存储方式和使用场景。最后,她讨论了如何使用结构体构建更复杂的数据结构,如链表,以及链表的灵活性和如何通过改变指针来动态地扩展或缩短链表。
🔗 链表、队列、栈和树
在第二段落中,Carrie Anne 深入探讨了链表的工作原理,包括它们是如何通过指针连接各个节点来形成一个灵活的数据结构。她解释了链表的动态特性,如如何通过改变指针来插入新节点或重新排序链表。此外,她还介绍了队列和栈这两种基于链表的更复杂的数据结构,包括它们的FIFO(先进先出)和LIFO(后进先出)原则。最后,她讨论了树结构,包括二叉树的概念,以及如何通过增加指针来构建更复杂的树结构。她还提到了图结构,用于表示任意连接的数据。最后,她强调了选择合适的数据结构对于简化编程工作的重要性,并提到了大多数编程语言都提供了内置的数据结构库,如C++的Standard Template Library和Java的Java Class Library。
Mindmap
Keywords
💡数据结构
💡数组
💡字符串
💡矩阵
💡结构体
💡链表
💡队列
💡栈
💡树
💡图
💡标准库
Highlights
计算机科学中,数据结构用于存储算法运行所需的数据,以保持数据的有序性,便于检索和读取。
数组(Array)是一种基本的数据结构,它允许存储一系列值,并通过索引来访问特定的值。
几乎所有编程语言都从索引0开始数组,并使用方括号语法来表示数组访问。
数组在内存中的存储是连续的,可以通过内存地址和偏移量来访问数组中的元素。
字符串(String)实际上是字符的数组,以null字符(二进制值0)结束,表示字符串在内存中的结束。
矩阵(Matrix)可以视为数组的数组,用于存储二维数据,如电子表格中的数字网格或计算机屏幕上的像素。
结构体(Struct)允许将相关的变量组合在一起存储,例如将银行账户号码和余额捆绑在一起。
链表(Linked List)是一种灵活的数据结构,可以通过改变节点的指针来动态地扩展或缩短。
队列(Queue)是一种遵循先进先出(FIFO)原则的数据结构,类似于邮局的排队系统。
栈(Stack)是一种后进先出(LIFO)的数据结构,数据被压入栈顶并从栈顶弹出。
树(Tree)是一种数据结构,其中每个节点可以有零个或多个子节点,顶层节点称为根节点,没有子节点的称为叶节点。
二叉树是一种特殊类型的树,其中每个节点最多有两个子节点。
图(Graph)是一种数据结构,用于表示复杂的关系,其中节点可以指向任何其他节点,没有根和叶的概念。
不同的数据结构具有不同的属性,适用于特定的计算任务,选择合适的数据结构可以大大简化工作。
大多数编程语言都带有包含现成数据结构的库,如C++的标准模板库(STL)和Java的类库(JCL)。
程序员可以利用数据结构的抽象能力,进行更高级别的操作,而不必从零开始实现数据结构。
Transcripts
Hi, I'm Carrie Anne, and welcome to Crash Course Computer Science!
Last episode, we discussed a few example classic algorithms, like sorting a list of numbers
and finding the shortest path in a graph.
What we didn’t talk much about, is how the data the algorithms ran on was stored in computer
memory.
You don’t want your data to be like John Green’s college dorm room, with food, clothing
and papers strewn everywhere.
Instead, we want our data to be structured, so that it’s organized, allowing things
to be easily retrieved and read.
For this, computer scientists use Data Structures!
INTRO
We already introduced one basic data structure last episode, Arrays, also called lists or
Vectors in some languages.
These are a series of values stored in memory.
So instead of just a single value being saved into a variable, like ‘j equals 5’, we
can define a whole series of numbers, and save that into an array variable.
To be able to find a particular value in this array, we have to specify an index.
Almost all programing languages start arrays at index 0, and use a square bracket syntax
to denote array access.
So, for example, if we want to add the values in the first and third spots of our array
‘j’, and save that into a variable ‘a’, we would write a line of code like this.
How an array is stored in memory is pretty straightforward.
For simplicity, let’s say that the compiler chose to store ours at memory location 1,000.
The array contains 7 numbers, and these are stored one after another in memory, as seen here.
So when we write “j index of 0”, the computer goes to memory location 1,000, with an offset
of 0, and we get the value 5.
If we wanted to retrieve “j index of 5”, our program goes to memory location 1000,
plus an offset of 5, which in this case, holds a value of 4.
It’s easy to confuse the fifth number in the array with the number at index 5.
They are not the same.
Remember, the number at index 5 is the 6th number in the array because the first number
is at index 0.
Arrays are extremely versatile data structures, used all the time, and so there are many functions
that can handle them to do useful things.
For example, pretty much every programming language comes with a built-in sort function,
where you just pass in your array, and it comes back sorted.
So there’s no need to write that algorithm from scratch.
Very closely related are Strings, which are just arrays of characters, like letters, numbers,
punctuation and other written symbols.
We talked about how computers store characters way back in Episode 4.
Most often, to save a string into memory, you just put it in quotes, like so.
Although it doesn’t look like an array, it is.
Behind the scenes, the memory looks like this.
Note that the string ends with a zero in memory.
It’s not the character zero, but the binary value 0.
This is called the null character, and denotes the end of the string in memory.
This is important because if I call a function like “print quote”, which writes the string
to the screen, it prints out each character in turn starting at the first memory location,
but it needs to know when to stop!
Otherwise, it would print out every single thing in memory as text.
The zero tells string functions when to stop.
Because computers work with text so often, there are many functions that specifically
handle strings.
For example, many programming languages have a string concatenation function, or “strcat”,
which takes in two strings, and copies the second one to the end of the first.
We can use arrays for making one dimensional lists, but sometimes you want to manipulate
data that is two dimensional, like a grid of numbers in a spreadsheet, or the pixels
on your computer screen.
For this, we need a Matrix.
You can think of a Matrix as an array of arrays!
So a 3 by 3 matrix is really 2 an array of size 3, with each index storing an array of
size 3.
We can initialize a matrix like so.
In memory, this is packed together in order like this.
To access a value, you need to specify two indexes, like “J index of 2, then index
of 1” - this tells the computer you’re looking for the item in subarray 2 at position 1.
And this would give us the value 12.
The cool thing about matrices is we’re not limited to 3 by 3 -- we can make them any
size we want -- and we can also make them any number of dimensions we want.
For example, we can create a five dimensional matrix and access it like this.
That’s right, you now know how to access a five dimensional matrix -- tell your friends!
So far, we’ve been storing individual numbers or letters into our arrays or matrices.
But often it’s useful to store a block of related variables together.
Like, you might want to store a bank account number along with its balance.
Groups of variables like these can be bundled together into a Struct.
Now we can create variables that aren’t just single numbers, but are compound data
structures, able to store several pieces of data at once.
We can even make arrays of structs that we define, which are automatically bundled together
in memory.
If we access, for example, J index of 0, we get back the whole struct stored there, and
we can pull the specific account number and balance data we want.
This array of structs, like any other array, gets created at a fixed size that can’t
be enlarged to add more items.
Also, arrays must be stored in order in memory, making it hard to add a new item to the middle.
But, the struct data structure can be used for building more complicated data structures
that avoid these restrictions.
Let’s take a look at this struct that’s called a “node”.
It stores a variable, like a number, and also a pointer.
A pointer is a special variable that points, hence the name, to a location in memory.
Using this struct, we can create a linked list, which is a flexible data structure that
can store many nodes.
It does this by having each node point to the next node in the list.
Let’s imagine we have three node structs saved in memory, at locations 1000, 1002 and 1008.
They might be spaced apart, because they were created at different times, and other data
can sit between them.
So, you see that the first node contains the value 7, and the location 1008 in its “next”
pointer.
This means that the next node in the linked list is located at memory location 1008.
Looking down the linked list, to the next node, we see it stores the value 112 and points
to another node at location 1002.
If we follow that, we find a node that contains the value 14 and points back to the first
node at location 1000.
So this linked list happened to be circular, but it could also have been terminated by
using a next pointer value of 0 -- the null value -- which would indicate we’ve reached
the end of the list.
When programmers use linked lists, they rarely look at the memory values stored in the next
pointers.
Instead, they can use an abstraction of a linked list, that looks like this, which is
much easier to conceptualize.
Unlike an array, whose size has to be pre-defined, linked lists can be dynamically extended or
shortened.
For example, we can allocate a new node in memory, and insert it into this list, just
by changing the next pointers.
Linked Lists can also easily be re-ordered, trimmed, split, reversed, and so on.
Which is pretty nifty!
And pretty useful for algorithms like sorting, which we talked about last week.
Owing to this flexibility, many more-complex data structures are built on top of linked lists
The most famous and universal are queues and stacks.
A queue – like the line at your post office – goes in order of arrival.
The person who has been waiting the longest, gets served first.
No matter how frustrating it is that all you want to do is buy stamps and the person in
front of you seems to be mailing 23 packages.
But, regardless, this behavior is called First-In First-Out, or FIFO.
That’s the first part.
Not the 23 packages thing.
Imagine we have a pointer, named “post office queue”, that points to the first node in
our linked list.
Once we’re done serving Hank, we can read Hank’s next pointer, and update our “post
office queue” pointer to the next person in the line.
We’ve successfully dequeued Hank -- he’s gone, done, finished.
If we want to enqueue someone, that is, add them to the line, we have to traverse down
the linked list until we hit the end, and then change that next pointer to point to
the new person.
With just a small change, we can use linked lists as stacks, which are LIFO…
Last-In First-Out.
You can think of this like a stack of pancakes... as you make them, you add them to the top
of stack.
And when you want to eat one, you take them from the top of the stack.
Delicious!
Instead of enqueueing and dequeuing, data is pushed onto the stack and popped from the stacks.
Yep, those are the official terms!
If we update our node struct to contain not just one, but two pointers, we can build trees,
another data structure that’s used in many algorithms.
Again, programmers rarely look at the values of these pointers, and instead conceptualize
trees like this: The top most node is called the root.
And any nodes that hang from other nodes are called children nodes.
As you might expect, nodes above children are called parent nodes.
Does this example imply that Thomas Jefferson is the parent of Aaron Burr?
I’ll leave that to your fanfiction to decide.
And finally, any nodes that have no children -- where the tree ends -- are called Leaf Nodes.
In our example, nodes can have up to two children, and for that reason, this particular data
structure is called a binary tree.
But you could just as easily have trees with three, four or any number of children by modifying
the data structure accordingly.
You can even have tree nodes that use linked lists to store all the nodes they point to.
An important property of trees – both in real life and in data structures – is that
there’s a one-way path from roots to leaves.
It’d be weird if roots connected to leaves, that connected to roots.
For data that links arbitrarily, that include things like loops, we can use a graph data
structure instead.
Remember our graph from last episode of cities connected by roads?
This can be stored as nodes with many pointers, very much like a tree, but there is no notion
of roots and leaves, and children and parents…
Anything can point to anything!
So that’s a whirlwind overview of pretty much all of the fundamental data structures
used in computer science.
On top of these basic building blocks, programmers have built all sorts of clever variants, with
slightly different properties -- data structures like red-black trees and heaps, which we don’t
have time to cover.
These different data structures have properties that are useful for particular computations.
The right choice of data structure can make your job a lot easier, so it pays off to think
about how you want to structure your data before you jump in.
Fortunately, most programming languages come with libraries packed full of ready-made data structures.
For example, C++ has its Standard Template Library, and Java has the Java Class Library.
These mean programmers don’t have to waste time implementing things from scratch, and
can instead wield the power of data structures to do more interesting things, once again
allowing us to operate at a new level of abstraction!
I’ll see you next week.
تصفح المزيد من مقاطع الفيديو ذات الصلة
快速上手了解智能合約(NFT標準) | TON Blockchain, TEP62
Intro to Algorithms: Crash Course Computer Science #13
Representing Numbers and Letters with Binary: Crash Course Computer Science #4
Artificial Intelligence Explained Simply in 1 Minute! ✨
How to write more flexible game code
Save and persist data with UserDefaults | Todo List #4
5.0 / 5 (0 votes)