Matrix multiplication as composition | Chapter 4, Essence of linear algebra

3Blue1Brown
8 Aug 2016 · 10:03

Summary

TL;DR: This educational video script delves into the concept of linear transformations and their representation through matrices. It explains how these transformations can be visualized as altering space while maintaining grid lines' parallelism and even spacing, with the origin fixed. The script emphasizes that a linear transformation is determined by its effect on basis vectors, like i-hat and j-hat in two dimensions. It then illustrates how matrix multiplication represents applying one transformation after another, highlighting the importance of order in such operations. The script uses examples to clarify the geometric interpretation of matrix multiplication and introduces the concept of matrix composition, demonstrating its computation and reinforcing the idea with visual aids. It concludes by advocating a conceptual understanding of matrix operations over rote memorization, promising to extend these concepts beyond two dimensions in a future video.

Takeaways

  • 📚 Linear transformations are represented by matrices, which map vectors to vectors while preserving the structure of space.
  • 📏 Linear transformations can be visualized as 'smooshing' space without altering the parallelism and spacing of grid lines, with the origin fixed.
  • 🎯 The outcome of a linear transformation on any vector is determined by the transformation's effect on the basis vectors (i-hat and j-hat in 2D).
  • 📈 A vector's new position after a transformation is x times the transformed i-hat plus y times the transformed j-hat, where (x, y) are its original coordinates.
  • 📋 The coordinates where the basis vectors land after a transformation are recorded as the columns of a matrix, which defines matrix-vector multiplication (see the sketch after this list).
  • 🔄 The composition of two linear transformations results in a new transformation that can be represented by a matrix formed by the final positions of i-hat and j-hat.
  • 🔢 Matrix multiplication geometrically represents the sequential application of one transformation followed by another, read from right to left.
  • 🔄 Understanding matrix multiplication through the lens of transformations can simplify complex concepts and proofs, such as associativity.
  • 🧩 The order of matrix multiplication matters because different sequences of transformations can lead to distinct outcomes.
  • 📉 The example of applying a shear followed by a rotation versus a rotation followed by a shear demonstrates the importance of the order of operations.
  • 🌐 Associativity in matrix multiplication is intuitively understood as applying a series of transformations in the same sequence, regardless of grouping.
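
To make the column picture in these takeaways concrete, here is a minimal plain-Python sketch (an illustration, not code from the video; the function name `transform` is ours):

```python
# A 2x2 matrix stores where i-hat and j-hat land, as its columns.
# Transforming a vector means scaling those columns by the vector's
# coordinates and adding the results.

def transform(i_hat, j_hat, v):
    """Apply the transformation whose matrix has columns i_hat and j_hat."""
    x, y = v
    return (x * i_hat[0] + y * j_hat[0],
            x * i_hat[1] + y * j_hat[1])

# A 90-degree counterclockwise rotation: i-hat lands at (0, 1), j-hat at (-1, 0).
print(transform(i_hat=(0, 1), j_hat=(-1, 0), v=(3, 2)))  # -> (-2, 3)
```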

Q & A

  • What is a linear transformation?

    -A linear transformation is a function that takes vectors as inputs and produces vectors as outputs, preserving the operations of vector addition and scalar multiplication. It can be visually thought of as 'smooshing' space in a way that grid lines remain parallel and evenly spaced, with the origin fixed.

  • How are linear transformations represented using matrices?

    -Linear transformations are represented using matrices by recording the coordinates where the basis vectors (like i-hat and j-hat in two dimensions) land after the transformation. These coordinates become the columns of the matrix, and matrix-vector multiplication is used to compute the transformation of any vector.

  • Why are basis vectors important in linear transformations?

    -Basis vectors are important because any vector in the space can be described as a linear combination of these basis vectors. Knowing where the basis vectors land after a transformation allows us to determine where any other vector will land.

  • What is the geometric consequence of linear transformations keeping grid lines parallel and evenly spaced?

    -The geometric consequence is that after a transformation, a vector with coordinates (x, y) will land at x times the new coordinates of i-hat plus y times the new coordinates of j-hat, maintaining the linear relationship between input and output vectors.

  • How does matrix multiplication relate to applying one linear transformation after another?

    -Matrix multiplication geometrically represents the composition of two linear transformations. You first apply the transformation represented by the matrix on the right, and then apply the transformation represented by the matrix on the left.

  • What is the composition matrix and how is it formed?

    -The composition matrix is a matrix that captures the overall effect of applying one linear transformation after another as a single action. It is formed by determining the final locations of the basis vectors after both transformations and using these locations as the columns of the new matrix. (A code sketch after this Q&A list illustrates the column-by-column construction.)

  • How does the order of matrix multiplication affect the result?

    -The order of matrix multiplication is significant because it represents applying one transformation after another in a specific sequence. Changing the order can result in a completely different transformation, as the effect of each transformation on the basis vectors will be different.

  • Why is matrix multiplication considered associative?

    -Matrix multiplication is associative because a product like A times B times C means applying C first, then B, then A; grouping it as (AB)C or A(BC) doesn't change that sequence. This is intuitive when thinking of transformations being applied one after another.

  • What does it mean to multiply a matrix by a vector?

    -Multiplying a matrix by a vector means applying the linear transformation represented by the matrix to the vector. Computationally, each entry of the result is the dot product of a row of the matrix with the vector; equivalently, the result is the sum of the matrix's columns scaled by the vector's coordinates. Either way, the output is a new vector representing the transformed input.

  • How can understanding matrix multiplication as transformations help with learning?

    -Understanding matrix multiplication as applying one transformation after another provides a conceptual framework that makes the properties of matrix multiplication more intuitive and easier to grasp, rather than just memorizing the mathematical process.

  • What is the significance of the convention of recording the coordinates of i-hat and j-hat landing as matrix columns?

    -Recording the coordinates where i-hat and j-hat land as matrix columns standardizes the representation of linear transformations. It allows for a systematic way to compute the transformation of any vector through matrix-vector multiplication.
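
As referenced in the composition answer above, here is a minimal plain-Python sketch of building a composition matrix column by column (illustrative only; the helper names `matvec` and `compose` are ours, and the rotation and shear matrices are the ones from the video's example):

```python
def matvec(m, v):
    """Multiply a 2x2 matrix (given as a list of rows) by a 2D vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

def compose(left, right):
    """Matrix for 'apply `right` first, then `left`'.

    Each column of the product is the left matrix applied to the
    corresponding column of the right matrix (i.e., to wherever
    i-hat and j-hat land after the first transformation).
    """
    new_i = matvec(left, (right[0][0], right[1][0]))  # follow i-hat
    new_j = matvec(left, (right[0][1], right[1][1]))  # follow j-hat
    return [[new_i[0], new_j[0]],
            [new_i[1], new_j[1]]]

rotation = [[0, -1], [1, 0]]  # 90-degree counterclockwise rotation
shear    = [[1,  1], [0, 1]]  # fixes i-hat, slides j-hat to the right

# Rotation first, then shear -- the example from the video:
print(compose(shear, rotation))  # [[1, -1], [1, 0]]: i-hat -> (1,1), j-hat -> (-1,0)
```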

Outlines

00:00

📚 Recap on Linear Transformations and Matrices

This paragraph begins with a quick recap of linear transformations, emphasizing their importance and suggesting viewers revisit the previous video for a full understanding. Linear transformations are functions that take vectors as inputs and produce vectors as outputs. They can be visualized as distorting space while keeping grid lines parallel and evenly spaced, with the origin fixed. A linear transformation is completely determined by its effect on the basis vectors of the space, which in two dimensions are i-hat and j-hat. Any vector can be expressed as a linear combination of these basis vectors. The transformed coordinates of i-hat and j-hat, recorded as matrix columns, allow the landing spot of any vector after the transformation to be computed through matrix-vector multiplication. The paragraph then transitions into discussing the composition of transformations, where one transformation is applied after another, resulting in a new linear transformation that can be represented by a matrix formed by the final positions of i-hat and j-hat.

05:00

🔄 Understanding Matrix Multiplication through Transformation Composition

The second paragraph delves into the concept of matrix multiplication as a means to represent the composition of two linear transformations. It uses the example of a rotation followed by a shear to illustrate how the resulting transformation can be encapsulated in a single matrix. The process involves determining the new positions of the basis vectors i-hat and j-hat after each transformation and using these positions to form the columns of the new matrix. The paragraph clarifies that matrix multiplication represents the application of one transformation followed by another, and it introduces the concept of reading matrix multiplication from right to left, which is attributed to function notation. It further provides an example using two specific matrices, M1 and M2, to demonstrate how to find the composition matrix by multiplying the matrices in sequence. The summary highlights the importance of understanding the geometric meaning behind matrix multiplication rather than just memorizing the algorithm. It concludes with a discussion on the importance of order in matrix multiplication due to the different effects of applying transformations in different sequences and touches on the associative property of matrix multiplication, explaining it through the concept of applying transformations in sequence.
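
The M1 and M2 example from this section can be verified numerically. A small NumPy check (ours, not from the video), using the columns stated in the transcript:

```python
import numpy as np

# Matrices from the video, written with the stated columns.
M1 = np.array([[1, -2],
               [1,  0]])  # columns: i-hat -> (1, 1), j-hat -> (-2, 0)
M2 = np.array([[0,  2],
               [1,  0]])  # columns: i-hat -> (0, 1), j-hat -> (2, 0)

# Applying M1 first, then M2, is the product M2 @ M1 (read right to left).
print(M2 @ M1)
# [[ 2  0]
#  [ 1 -2]]
# Columns (2, 1) and (0, -2), matching the walkthrough in the video.
```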

Keywords

💡Linear Transformations

Linear transformations are mathematical functions that map vectors to vectors while preserving the operations of vector addition and scalar multiplication. In the context of the video, linear transformations are visualized as operations that 'smoosh' space while keeping grid lines parallel and evenly spaced, with the origin fixed. The script emphasizes that these transformations are fully characterized by where they take the basis vectors of the space, such as i-hat and j-hat in two dimensions.

💡Matrices

A matrix is a rectangular array of numbers arranged in rows and columns. In the video, matrices are used to represent linear transformations. The script explains that by recording the new coordinates of the basis vectors (i-hat and j-hat) as columns in a matrix, one can compute the image of any vector under the transformation through matrix-vector multiplication.

💡Basis Vectors

Basis vectors are fundamental vectors in a vector space that can be combined through linear operations to describe any vector in that space. The script uses i-hat and j-hat to represent the standard basis vectors in two-dimensional space, which are essential for defining the effect of a linear transformation.

💡Matrix-Vector Multiplication

Matrix-vector multiplication is the process of multiplying a matrix by a vector, resulting in a new vector. The video script describes this process as the computational way to apply a linear transformation to a vector, using the transformed basis vectors recorded in the matrix.

💡Transformation Composition

Transformation composition refers to the process of applying one linear transformation after another. The script illustrates this concept by describing a scenario where a rotation is followed by a shear, resulting in a new linear transformation that can be represented by a single matrix.

💡Matrix Product

The matrix product is the result of multiplying two matrices together. In the video, this is related to the geometric concept of applying one transformation after another. The script explains that the order of matrix multiplication corresponds to the order of applying transformations, which is crucial for capturing the correct composite effect.

💡Geometric Interpretation

Geometric interpretation involves understanding mathematical concepts by visualizing them in a geometric context. The video script encourages viewers to think about matrix multiplication geometrically, as a way to apply one transformation after another, which helps in understanding the properties of matrix operations.

💡Associativity

Associativity is a property of an operation stating that how the terms are grouped does not change the result. In the context of the video, matrix multiplication is shown to be associative, meaning that the placement of parentheses when multiplying three or more matrices does not matter, as long as the left-to-right order of the matrices remains the same.

💡Shear Transformation

A shear transformation is a type of linear transformation that slides points parallel to a fixed axis, by an amount proportional to their distance from that axis. The script uses the example of a shear that fixes the i-hat vector and slides the j-hat vector to the right, illustrating how different orders of transformations can lead to different results.

💡Rotation Transformation

A rotation transformation is a linear transformation that turns a shape around a fixed point without changing its size or shape. The video script describes a 90-degree counterclockwise rotation as an example of a transformation that, when composed with a shear, results in a different outcome depending on the order of application.
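
The order dependence described in the shear and rotation entries above can be checked numerically. A short NumPy sketch (ours, using the standard matrices for these two transformations):

```python
import numpy as np

rotation = np.array([[0, -1],
                     [1,  0]])  # 90-degree counterclockwise rotation
shear = np.array([[1, 1],
                  [0, 1]])      # fixes i-hat, slides j-hat to (1, 1)

# Shear first, then rotate (products read right to left):
print(rotation @ shear)  # [[ 0 -1], [ 1  1]]: i-hat -> (0,1),  j-hat -> (-1,1)

# Rotate first, then shear:
print(shear @ rotation)  # [[ 1 -1], [ 1  0]]: i-hat -> (1,1),  j-hat -> (-1,0)

# Different matrices: the order of multiplication changes the transformation.
```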

Highlights

Linear transformations can be represented using matrices.

Linear transformations are functions with vectors as inputs and outputs.

Transformations visually 'smoosh' space while keeping grid lines parallel and evenly spaced.

A linear transformation is determined by where it takes the basis vectors.

For 2D, basis vectors are i-hat and j-hat, and any vector can be described as a linear combination of them.

Because grid lines stay parallel and evenly spaced after a transformation, a vector's landing coordinates can be computed from where the basis vectors land.

Matrix columns record where i-hat and j-hat land after a transformation.

Matrix-vector multiplication is defined by the sum of scaled matrix columns.

Matrix composition represents the effect of applying one transformation after another.

Composition of transformations results in a new linear transformation.

Matrix multiplication geometrically means applying one transformation then another.

Order of matrix multiplication matters as it changes the overall transformation effect.

Matrix multiplication is read from right to left, following function notation.

Associative property of matrix multiplication is intuitive when thought of as sequential transformations (see the numerical check after this list).

Visualizing transformations can help understand matrix multiplication properties without computation.

The video encourages exploring matrix multiplication through the lens of transformations.

Upcoming video content will extend these concepts beyond two dimensions.
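
As noted in the associativity highlight above, grouping never changes the result. A quick NumPy check (illustrative, with arbitrary random matrices):

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C = (rng.integers(-5, 5, size=(2, 2)) for _ in range(3))

# Grouping does not matter; the right-to-left order C, then B, then A does.
print(np.array_equal((A @ B) @ C, A @ (B @ C)))  # True
```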

Transcripts

00:10

Hey everyone, where we last left off, I showed what linear transformations look like and how to represent them using matrices. This is worth a quick recap because it's just really important, but of course if this feels like more than just a recap, go back and watch the full video. Technically speaking, linear transformations are functions with vectors as inputs and vectors as outputs, but I showed last time how we can think about them visually as smooshing around space in such a way that grid lines stay parallel and evenly spaced, and so that the origin remains fixed. The key takeaway was that a linear transformation is completely determined by where it takes the basis vectors of the space, which for two dimensions means i-hat and j-hat. This is because any other vector could be described as a linear combination of those basis vectors. A vector with coordinates x, y is x times i-hat plus y times j-hat. After going through the transformation, this property that grid lines remain parallel and evenly spaced has a wonderful consequence. The place where your vector lands will be x times the transformed version of i-hat plus y times the transformed version of j-hat. This means if you keep a record of the coordinates where i-hat lands and the coordinates where j-hat lands, you can compute that a vector which starts at x, y must land on x times the new coordinates of i-hat plus y times the new coordinates of j-hat. The convention is to record the coordinates of where i-hat and j-hat land as the columns of a matrix, and to define this sum of the scaled versions of those columns by x and y to be matrix-vector multiplication. In this way, a matrix represents a specific linear transformation, and multiplying a matrix by a vector is what it means computationally to apply that transformation to that vector.

01:58

Alright, recap over, on to the new stuff. Oftentimes, you find yourself wanting to describe the effects of applying one transformation and then another. For example, maybe you want to describe what happens when you first rotate the plane 90 degrees counterclockwise, then apply a shear. The overall effect here, from start to finish, is another linear transformation, distinct from the rotation and the shear. This new linear transformation is commonly called the composition of the two separate transformations we applied. And like any linear transformation, it can be described with a matrix all of its own by following i-hat and j-hat. In this example, the ultimate landing spot for i-hat after both transformations is 1,1, so let's make that the first column of a matrix. Likewise, j-hat ultimately ends up at the location negative 1,0, so we make that the second column of the matrix. This new matrix captures the overall effect of applying a rotation then a shear, but as one single action, rather than two successive ones.

03:03

Here's one way to think about that new matrix. If you were to take some vector and pump it through the rotation, then the shear, the long way to compute where it ends up is to first multiply it on the left by the rotation matrix, then take whatever you get and multiply that on the left by the shear matrix. This is, numerically speaking, what it means to apply a rotation then a shear to a given vector. But whatever you get should be the same as just applying this new composition matrix that we just found by that same vector, no matter what vector you chose, since this new matrix is supposed to capture the same overall effect as the rotation then shear action. Based on how things are written down here, I think it's reasonable to call this new matrix the product of the original two matrices, don't you? We can think about how to compute that product more generally in just a moment, but it's way too easy to get lost in the forest of numbers. Always remember that multiplying two matrices like this has the geometric meaning of applying one transformation then another. One thing that's kind of weird here is that this has us reading from right to left. You first apply the transformation represented by the matrix on the right, then you apply the transformation represented by the matrix on the left. This stems from function notation, since we write functions on the left of variables, so every time you compose two functions, you always have to read it right to left. Good news for the Hebrew readers, bad news for the rest of us.

04:29

Let's look at another example. Take the matrix with columns 1,1 and negative 2,0, whose transformation looks like this. And let's call it M1. Next, take the matrix with columns 0,1 and 2,0, whose transformation looks like this. And let's call that guy M2. The total effect of applying M1 then M2 gives us a new transformation, so let's find its matrix. But this time, let's see if we can do it without watching the animations, and instead just using the numerical entries in each matrix. First, we need to figure out where i-hat goes. After applying M1, the new coordinates of i-hat, by definition, are given by that first column of M1, namely 1,1. To see what happens after applying M2, multiply the matrix for M2 by that vector 1,1. Working it out, the way I described last video, you'll get the vector 2,1. This will be the first column of the composition matrix. Likewise, to follow j-hat, the second column of M1 tells us that it first lands on negative 2,0. Then, when we apply M2 to that vector, you can work out the matrix-vector product to get 0, negative 2, which becomes the second column of our composition matrix.

05:56

Let me talk through that same process again, but this time I'll show variable entries in each matrix, just to show that the same line of reasoning works for any matrices. This is more symbol-heavy and will require some more room, but it should be pretty satisfying for anyone who has previously been taught matrix multiplication the more rote way. To follow where i-hat goes, start by looking at the first column of the matrix on the right, since this is where i-hat initially lands. Multiplying that column by the matrix on the left is how you can tell where the intermediate version of i-hat ends up after applying the second transformation. So the first column of the composition matrix will always equal the left matrix times the first column of the right matrix. Likewise, j-hat will always initially land on the second column of the right matrix. So multiplying the left matrix by this second column will give its final location, and hence that's the second column of the composition matrix.
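
For reference (the narration describes this formula verbally; it is not written out in the transcript text itself), the general two-by-two product built column by column is:

```latex
% Column 1 = left matrix applied to (e, g), where i-hat first lands.
% Column 2 = left matrix applied to (f, h), where j-hat first lands.
\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
\begin{pmatrix} e & f \\ g & h \end{pmatrix}
=
\begin{pmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{pmatrix}
\]
```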

07:00

Notice there's a lot of symbols here, and it's common to be taught this formula as something to memorize, along with a certain algorithmic process to help remember it. But I really do think that before memorizing that process, you should get in the habit of thinking about what matrix multiplication really represents, applying one transformation after another. Trust me, this will give you a much better conceptual framework that makes the properties of matrix multiplication much easier to understand.

07:27

For example, here's a question. Does it matter what order we put the two matrices in when we multiply them? Well, let's think through a simple example, like the one from earlier. Take a shear, which fixes i-hat and smooshes j-hat over to the right, and a 90 degree rotation. If you first do the shear, then rotate, we can see that i-hat ends up at 0,1 and j-hat ends up at negative 1,1. Both are generally pointing close together. If you first rotate, then do the shear, i-hat ends up over at 1,1, and j-hat is off in a different direction at negative 1,0, and they're pointing, you know, farther apart. The overall effect here is clearly different, so evidently, order totally does matter. Notice, by thinking in terms of transformations, that's the kind of thing that you can do in your head by visualizing. No matrix multiplication necessary.

08:21

I remember when I first took linear algebra, there was this one homework problem that asked us to prove that matrix multiplication is associative. This means that if you have three matrices, A, B, and C, and you multiply them all together, it shouldn't matter if you first compute A times B, then multiply the result by C, or if you first multiply B times C, then multiply that result by A on the left. In other words, it doesn't matter where you put the parentheses. Now, if you try to work through this numerically, like I did back then, it's horrible, just horrible, and unenlightening for that matter. But when you think about matrix multiplication as applying one transformation after another, this property is just trivial. Can you see why? What it's saying is that if you first apply C, then B, then A, it's the same as applying C, then B, then A. I mean, there's nothing to prove. You're just applying the same three things one after the other, all in the same order. This might feel like cheating, but it's not. This is an honest-to-goodness proof that matrix multiplication is associative, and even better than that, it's a good explanation for why that property should be true.

09:31

I really do encourage you to play around more with this idea, imagining two different transformations, thinking about what happens when you apply one after the other, and then working out the matrix product numerically. Trust me, this is the kind of playtime that really makes the idea sink in. In the next video, I'll start talking about extending these ideas beyond just two dimensions. See you then!


Related Tags
Linear Algebra, Transformations, Matrix Multiplication, Visual Geometry, Vector Analysis, Educational Content, Mathematics Tutorial, Space Visualization, Basis Vectors, Geometric Composition