Singular Value Decomposition (SVD): Mathematical Overview

Steve Brunton
19 Jan 2020 · 12:50

Summary

TL;DR: This video delves into Singular Value Decomposition (SVD), a fundamental matrix factorization technique in numerical linear algebra. It explains how to compute and interpret SVD in the context of data matrices, like those derived from images or physical systems. The presenter introduces the concept of decomposing a data matrix X into three matrices: U, Sigma, and V^T, highlighting the properties of unitary matrices and the significance of the diagonal, hierarchically ordered singular values. The video promises to explore the computation, interpretation, and applications of SVD for data approximation, modeling, and analysis in future segments.

Takeaways

  • πŸ“š Singular Value Decomposition (SVD) is a crucial matrix decomposition technique in numerical linear algebra.
  • 🌟 The data matrix X can be decomposed into the product of three matrices: U, Sigma, and V transpose (X = U Ξ£ V^T); see the sketch just after this list.
  • πŸ” U and V are unitary (orthogonal) matrices, meaning their columns are orthonormal and provide a complete basis for the vector space.
  • πŸ“Š Sigma is a diagonal matrix with non-negative, hierarchically ordered singular values, indicating the importance of each component.
  • πŸ–ΌοΈ Examples of data matrices include collections of images reshaped into column vectors or snapshots of a physical system over time.
  • 🧬 The columns of U, known as left singular vectors, can represent 'eigenfaces' in image data or 'eigenflows' in physical systems.
  • πŸ”„ V transpose represents the mixtures or time series of these eigenvectors, showing how they combine to form the original data points.
  • πŸ”‘ SVD is computationally efficient and can be easily performed in various programming languages like MATLAB, Python, and R.
  • πŸ” The decomposition is guaranteed to exist and is unique up to the sign of the singular vectors, which can be flipped without affecting the result.
  • πŸ› οΈ SVD has applications in approximating data matrices, building models, linear regression, and principal components analysis.
  • πŸ“ˆ The ordering of singular values allows for the identification and potential omission of less significant components for data simplification and noise reduction.
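
The factorization in these takeaways can be verified in a few lines. A minimal NumPy sketch (matrix contents and sizes are made up for illustration; the video only names MATLAB, Python, and R as options):

    import numpy as np

    # Hypothetical data matrix: n = 1000 measurements, m = 20 columns.
    X = np.random.randn(1000, 20)

    # Economy SVD: U is n-by-m, S holds the m singular values, Vt is m-by-m.
    U, S, Vt = np.linalg.svd(X, full_matrices=False)

    # The three factors reconstruct X (up to floating-point error) ...
    assert np.allclose(X, U @ np.diag(S) @ Vt)

    # ... and the singular values come back hierarchically ordered.
    assert np.all(S[:-1] >= S[1:])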

Q & A

  • What is the Singular Value Decomposition (SVD)?

    -Singular Value Decomposition is a matrix decomposition technique used in numerical linear algebra, where a given matrix X is decomposed into the product of three matrices: U, Ξ£ (Sigma), and V transpose.

  • What are the types of matrices involved in SVD?

    -In SVD, U and V are unitary (or orthogonal) matrices, and Ξ£ is a diagonal matrix containing the singular values.

  • How can a data matrix X be represented in terms of column vectors?

    -A data matrix X can be represented as a collection of column vectors, such as X1, X2, ..., XM, where each column vector corresponds to an individual data point or measurement.

  • What is an example of how a data matrix X could be formed using images?

    -A data matrix X can be formed by reshaping images of faces into tall skinny column vectors, where each face image is converted into a column vector with a million dimensions (assuming a megapixel image).
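
    A hedged NumPy sketch of this reshaping (the image stack, shapes, and names below are hypothetical stand-ins, kept small so the example runs quickly):

        import numpy as np

        # Hypothetical stack of m = 50 grayscale images, each 100x100 pixels.
        # The video's example uses megapixel images; the idea is identical.
        faces = np.random.rand(50, 100, 100)     # (m, height, width)

        # Reshape each image into a tall, skinny column vector and stack the
        # columns: X is n-by-m with n = 100 * 100 = 10,000 pixels per column.
        X = faces.reshape(faces.shape[0], -1).T
        print(X.shape)                           # (10000, 50)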

  • How can the concept of a data matrix be applied to a physical system?

    -In a physical system, such as fluid flow, the data matrix X can be formed by reshaping snapshots of the flow field at different times into column vectors, representing the state of the system as it evolves.

  • What is the significance of the columns of U in the context of SVD?

    -The columns of U, also known as the left singular vectors, are orthonormal and hierarchically ordered by their ability to describe the variance in the columns of the data matrix X.

  • What does the diagonal matrix Ξ£ represent in SVD?

    -The diagonal matrix Ξ£ contains the singular values of the data matrix X, which are non-negative and ordered in decreasing magnitude, representing the importance of the corresponding columns of U and V.

  • How are the columns of V interpreted in SVD?

    -The columns of V are the right singular vectors. Each column of V transpose specifies the mixture of the columns of U, scaled by the singular values, that reconstructs the corresponding data vector in X.
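
    A small NumPy check of this interpretation (random stand-in data): column k of V transpose, scaled elementwise by the singular values, gives the mixture of the columns of U that rebuilds the k-th data vector.

        import numpy as np

        X = np.random.randn(500, 12)             # hypothetical data matrix
        U, S, Vt = np.linalg.svd(X, full_matrices=False)

        # Mix the columns of U using the first column of V^T, scaled by
        # the singular values; this reproduces the first data vector x1.
        x1_rebuilt = U @ (S * Vt[:, 0])
        assert np.allclose(x1_rebuilt, X[:, 0])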

  • What is the computational advantage of SVD?

    -SVD is computationally efficient and can be easily computed in various programming languages like MATLAB, Python, and R using built-in functions.
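
    For instance, in Python with NumPy (the video shows the MATLAB call [U, S, V] = svd(X); the matrix below is a made-up stand-in):

        import numpy as np

        X = np.random.randn(300, 40)

        # Full SVD: U is 300x300, Vt is 40x40, S holds min(300, 40) = 40 values.
        U, S, Vt = np.linalg.svd(X)

        # Economy SVD, often preferable for tall, skinny data matrices;
        # U is then 300x40 (the analogue of MATLAB's svd(X, 'econ')).
        U, S, Vt = np.linalg.svd(X, full_matrices=False)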

  • Why is SVD guaranteed to exist and what does its uniqueness imply?

    -SVD is guaranteed to exist for any matrix X, and it is unique up to sign: flipping the sign of a column of U together with the corresponding column of V leaves the product U Ξ£ V^T unchanged.
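
    A quick NumPy check of this sign ambiguity (random stand-in matrix): negating a column of U together with the matching column of V leaves the product unchanged.

        import numpy as np

        X = np.random.randn(50, 5)
        U, S, Vt = np.linalg.svd(X, full_matrices=False)

        # Flip the sign of the first left singular vector AND the matching
        # right singular vector; the reconstruction is identical.
        U2, Vt2 = U.copy(), Vt.copy()
        U2[:, 0] *= -1
        Vt2[0, :] *= -1
        assert np.allclose(U @ np.diag(S) @ Vt, U2 @ np.diag(S) @ Vt2)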

  • How can SVD be used in practical applications such as linear regression or principal components analysis?

    -SVD can be used to approximate the data matrix X, build predictive models, perform linear regression, and conduct principal components analysis by focusing on the dominant singular values and their corresponding vectors.
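
    A sketch of PCA via SVD in NumPy, assuming the video's convention that the columns of X are samples (the mean-centering step and variance formula are standard PCA practice, not spelled out in the video):

        import numpy as np

        X = np.random.randn(200, 30)             # hypothetical data matrix

        # Center each row (feature) across the samples, then take the SVD:
        # the columns of U are the principal directions, and S**2 / (m - 1)
        # is the variance captured by each component.
        Xc = X - X.mean(axis=1, keepdims=True)
        U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
        explained_variance = S**2 / (X.shape[1] - 1)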

Outlines

00:00

πŸ“š Introduction to Singular Value Decomposition

The video script begins with an introduction to singular value decomposition (SVD), a fundamental matrix decomposition technique in numerical linear algebra. The speaker explains that SVD can be used to decompose a data matrix X, represented as a collection of column vectors, into the product of three matrices: U, Sigma, and V^T. Examples are given to illustrate how such a matrix X can be constructed from images of human faces or from snapshots of a physical system like fluid flow. The U and V matrices are described as unitary or orthogonal matrices, while Sigma is a diagonal matrix with non-negative, hierarchically ordered singular values. The speaker emphasizes the importance of these matrices in representing and approximating the original data matrix X.

05:00

πŸ” Understanding the Components of SVD

In the second paragraph, the speaker delves deeper into the components of SVD. The U and V matrices are unitary, meaning their transposes are their inverses and their columns are orthonormal. Sigma is a diagonal matrix whose singular values are non-negative and ordered in decreasing magnitude. The columns of U are referred to as eigenfaces or eigenflows, depending on the context, and are hierarchically arranged by their importance in describing the variations in the data matrix X. The speaker also explains that the columns of U and V can be thought of as eigenvectors of correlation matrices formed from the rows and columns of X, and that the singular values determine the relative significance of the corresponding columns of U and V.
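
A short NumPy check of the correlation-matrix interpretation (random stand-in data): the right singular vectors are eigenvectors of X^T X, the left singular vectors are eigenvectors of X X^T, and the eigenvalues are the squared singular values.

    import numpy as np

    X = np.random.randn(40, 10)
    U, S, Vt = np.linalg.svd(X, full_matrices=False)

    # Columns of V (rows of Vt) are eigenvectors of X^T X; columns of U
    # are eigenvectors of X X^T. Both have eigenvalues S**2.
    assert np.allclose(X.T @ X @ Vt.T, Vt.T * S**2)
    assert np.allclose(X @ X.T @ U, U * S**2)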

10:01

πŸ’» Computing and Applications of SVD

The final paragraph focuses on the practical aspects of computing SVD and its applications. The speaker mentions that SVD is easy to compute using common programming languages like MATLAB, Python, and R, and that it is guaranteed to exist and is unique up to the sign of the vectors. The speaker also discusses how SVD can be used to approximate the data matrix X by focusing on the dominant columns of U and V and their corresponding singular values. Applications of SVD in linear regression, principal components analysis, and modeling are briefly mentioned, with the promise of further exploration in future videos.

Keywords

πŸ’‘Singular Value Decomposition (SVD)

Singular Value Decomposition (SVD) is a mathematical technique used in linear algebra to decompose a matrix into three other matrices. In the context of the video, SVD is used to break down a data matrix X into the product of three matrices: U, Sigma, and V^T (V transpose). This decomposition is crucial for understanding the underlying structure of the data and for various applications like data compression and noise reduction. The video script explains how SVD can be computed and interpreted, emphasizing its utility in handling matrices derived from images or physical systems.

πŸ’‘Data Matrix X

The data matrix X is a collection of column vectors, each representing a different observation or measurement. In the video, examples include reshaping images of faces into column vectors or capturing snapshots of a fluid flow. The matrix X is central to the discussion as it is the input for the SVD process. The script uses the example of a megapixel image being reshaped into a million-by-one vector, illustrating how high-dimensional data can be represented in a matrix form.

πŸ’‘Unitary Matrices

Unitary matrices, also known as orthogonal matrices, are matrices whose columns are orthonormal vectors. This means that the vectors are orthogonal (perpendicular to each other) and have unit length. In the video, both U and V matrices are described as unitary, indicating that they provide a complete basis for the vector space in which the data matrix X resides. The script emphasizes that the transpose of these matrices is equal to their inverse, which is a key property of unitary matrices.
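
A minimal NumPy check of this property on a made-up square matrix:

    import numpy as np

    X = np.random.randn(30, 30)
    U, S, Vt = np.linalg.svd(X)

    # Orthonormal columns: the transpose of each factor is its inverse.
    I = np.eye(30)
    assert np.allclose(U.T @ U, I) and np.allclose(U @ U.T, I)
    assert np.allclose(Vt.T @ Vt, I) and np.allclose(Vt @ Vt.T, I)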

πŸ’‘Diagonal Matrix Sigma

The diagonal matrix Sigma is a key component of the SVD, containing the singular values of the data matrix X. These singular values are non-negative and hierarchically ordered, with the largest value representing the most significant variation in the data. The script explains that Sigma is crucial for understanding the importance of different features in the data, as the magnitude of the singular values indicates the relative importance of the corresponding columns in U and V.

πŸ’‘Eigenfaces

In the context of the video, 'eigenfaces' refers to the columns of the U matrix when the data matrix X consists of images of faces. These eigenfaces are hierarchically arranged and represent different aspects of the faces, with the first eigenface being the most significant in terms of variation. The script uses this term to illustrate how SVD can be used to capture essential features in image data, making it a powerful tool in facial recognition and image analysis.

πŸ’‘Eigenflows

The term 'eigenflows' is used in the video to describe the columns of the U matrix when the data matrix X represents snapshots of a physical system, such as fluid flow. These eigenflows capture the essential patterns in the flow field, with each eigenflow corresponding to a different mode of variation. The script explains how the eigenflows can be used to understand and predict the behavior of the physical system over time.

πŸ’‘Singular Values

Singular values are the non-negative numbers that appear on the diagonal of the Sigma matrix in the SVD. They represent the magnitude of the variations captured by the corresponding columns in the U and V matrices. The script highlights that these values are hierarchically ordered, with the largest singular value indicating the most significant variation in the data. The singular values play a crucial role in determining the importance of the features represented by the eigenvectors.

πŸ’‘V Transpose

The V transpose matrix, denoted as V^T in the video, is a key component of the SVD. It contains the right singular vectors that, when combined with the singular values, describe how the original data matrix X can be reconstructed. The script explains that the columns of V^T represent the mixtures of eigenfaces or eigenflows needed to reconstruct each column of X, making it a critical part of the decomposition process.

πŸ’‘Approximation

In the context of the video, approximation refers to the process of reconstructing the data matrix X using only a subset of the columns from the U and V matrices and the corresponding singular values from Sigma. This is possible because some singular values may be very small and can be ignored without significantly affecting the accuracy of the reconstruction. The script discusses how this can be used for data compression and noise reduction, highlighting the practical applications of SVD.
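
A hedged NumPy sketch of this truncation (random stand-in data; the error identity is the Eckart-Young theorem, which the video itself does not name):

    import numpy as np

    X = np.random.randn(400, 100)
    U, S, Vt = np.linalg.svd(X, full_matrices=False)

    # Keep only the r dominant singular values and their vectors.
    r = 10
    X_r = U[:, :r] @ np.diag(S[:r]) @ Vt[:r, :]

    # Best rank-r approximation in the 2-norm; the error equals the
    # first discarded singular value.
    err = np.linalg.norm(X - X_r, 2)
    assert np.isclose(err, S[r])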

πŸ’‘Computational Ease

The video script emphasizes the ease of computing the SVD, mentioning that it can be done using simple commands in MATLAB, Python, R, and other programming languages. This ease of computation is important because it makes SVD accessible for a wide range of applications, from data analysis to physical system modeling. The script assures viewers that SVD is not only conceptually powerful but also practically easy to implement.

Highlights

Singular value decomposition (SVD) is a crucial matrix decomposition technique in numerical linear algebra.

The SVD can be computed for, and interpreted in terms of, data matrices such as collections of column vectors representing images or physical systems.

Examples of data matrices include megapixel images reshaped into column vectors and fluid flow systems represented over time.

The data matrix X is decomposed into three matrices: U, Sigma, and V transpose, where U and V are unitary and Sigma is diagonal.

U and V matrices are orthogonal, meaning their transposes are their inverses, providing a complete basis for the vector space.

Sigma matrix contains non-negative, hierarchically ordered singular values, indicating the importance of corresponding columns in U and V.

The columns of U can be thought of as 'eigenfaces' or principal components that capture the essence of the data in a hierarchical manner.

SVD allows for the approximation of the original data matrix X by considering only the dominant columns of U and V and their corresponding singular values.

The columns of V represent the mixtures or combinations of principal components needed to reconstruct the original data vectors.

SVD is easily computed in various programming languages, including MATLAB, Python, and R, and is guaranteed to exist and be unique up to sign flips.

The technique is applicable in various fields, such as image processing, where it can be used to analyze and reconstruct faces, and in physical sciences for analyzing flow fields.

SVD can be used for building models, linear regression, and principal components analysis, providing a versatile tool for data analysis.

The video promises further exploration of how to compute SVD matrices and their interpretations in the context of correlation matrices.

The significance of the ordering of singular values and their corresponding vectors in capturing the most information from the data matrix is emphasized.

The practical applications of SVD in approximating large datasets and reducing dimensionality are highlighted, showcasing its utility in data compression and noise reduction.

The video concludes with a teaser for upcoming content that will delve deeper into the computation and interpretation of SVD for various applications.

Transcripts

[Music]

00:06

Welcome back. We're talking about the singular value decomposition, which is one of the most useful matrix decompositions in numerical linear algebra, and right now I'm going to walk you through how it's computed and how it's interpreted in terms of a data matrix X. We're going to start with the data matrix X, which I define as a collection of column vectors x1, x2, and so on up to xm. Let me give you a couple of examples of where this matrix X could come from. For example, I could take a bunch of pictures of people's faces. I could take person 1 and reshape that face into a tall, skinny column vector x1. Let's say it's a megapixel image; then I would reshape it into a million-by-one column vector x1. That's person 1. I could do the same thing for person 2, and so on and so forth up to xm. So let's say I have m people: I get m column vectors, and in this case I'm going to say that xk is in R^n, where n is the dimension of my measurements. For a megapixel image, n would be a million. And I might have m of these reshaped vectors, where m might be, say, a thousand people's faces: a thousand columns, each column with a million pixels, n equal to a million.

01:44

Another example of this data matrix that I think about a lot is when these vectors x come from some physical system, some real-world system in science or engineering. For example, maybe I have fluid flow past a circle, giving the nice Kármán vortex street we're used to seeing, and I have this flow field as it evolves in time. Again, I can take the first snapshot and reshape it into a tall, skinny vector x1, the second snapshot into x2, dot dot dot, up to xm. In the case of a physical system, the columns have the additional interpretation that they are the state of the system as it evolves in time.

02:56

So there are lots of ways you can build a data matrix, from a library of human faces, from a flow field evolving in time, or from some other physical process you're measuring that evolves in time. What the singular value decomposition allows us to do is take this matrix X and decompose it, or represent it, as the product of three other matrices: X = U Ξ£ V^T. I'm going to tell you all about these matrices over the next few videos: what they mean, how to compute them, and how they're useful for approximating this matrix X. What I want to tell you right now is that the U matrix and the V matrix are what are called unitary matrices, or orthogonal matrices, and Ξ£ is a diagonal matrix. Writing this out in matrix form again: U has columns u1, u2, all the way up to un, and is an n-by-n square matrix; it multiplies the Ξ£ matrix, which is diagonal with entries Οƒ1, Οƒ2, and so on. Now, because there are only m columns of X, there will only be m nonzero singular values, and everything below them will be 0; I'll explain this more in a bit. The last matrix is this V matrix, with columns v1, v2, up to vm, transposed. So we're taking our data matrix X and representing it as the product U Ξ£ V^T, where each factor has its own intuitive and physical interpretation: what the columns of U and V mean, and what the entries of Ξ£ mean.

04:40

A few things that are important to note. The columns of U have the same shape as a column of X: if x is a million-by-one vector, then the columns of U are million-by-one vectors. The way I think about these, in the face example, is that they are my eigenfaces. They are hierarchically arranged, so u1 is somehow more important than u2, and so on, in terms of their ability to describe the variance in the columns of X. In the case of faces these are eigenfaces; in the case of flow fields these are eigenflows, again ranked by how well they describe my original data matrix X.

05:31

A couple of things I need to point out. U and V are unitary, and essentially what that means is that U^T U equals the identity, and so does U U^T. The columns of U are orthonormal: they're all orthogonal, they have unit length, and they provide a complete basis for R^n, the n-dimensional vector space where the columns of my data live. So U is orthogonal, and its transpose is its inverse. Same thing with V: V V^T equals V^T V equals the identity matrix. U is n-by-n and V is m-by-m. So that's important.

06:25

The other important thing to note is that Ξ£ is diagonal, which I've already described, and it's non-negative and hierarchically ordered, in decreasing magnitude: Οƒ1 β‰₯ Οƒ2 β‰₯ Οƒ3, and so on, down to Οƒm, which is greater than or equal to 0. They're all non-negative, and some of them could be 0, but Οƒ1 is always greater than or equal to Οƒ2, which is greater than or equal to Οƒ3, and so on. What this means is that the first column of U and the first column of V, the ones corresponding to Οƒ1, are somehow more important than the second columns, and those are more important than the third columns, and so on, in describing the information in the data matrix X; the relative importance of these columns of U and V is given by the corresponding singular value. So here is what we call these: U is the matrix of left singular vectors, V is the matrix of right singular vectors, and the diagonal matrix Ξ£ is the matrix of singular values. Each Οƒ is a singular value of X corresponding to a singular vector u and a singular vector v. At some point, if some of these Οƒ's are really, really small, this is going to allow us to ignore them, to chop them off and approximate the matrix X using only the first few dominant columns of U and V and the dominant singular values Οƒ. I'll talk all about that later, but the important concept is that these are ordered by importance, and that ordering carries with it the ordering of the columns of U and V as well.

08:39

So in the case of eigenfaces, of a data matrix X consisting of columns that are reshaped human faces, these column vectors of U have the exact same size as a vector x, and so they can be reshaped into faces as well: reshaped back into these eigenfaces, and so on. So they have a clear interpretation; the entries of Ξ£ are hierarchically organized, which has a clear interpretation; and these columns of V also have an interpretation. This one is a little trickier to explain, because it's V transpose, so I'm going to do my best to say this in a way that's clear, and I'm actually going to start with the eigenflows case. If our data came from a flow field evolving in time, then the columns of U would be eigenflows, hierarchically organized; the amount of energy each of these column vectors captures of the flow would be given by the corresponding decreasing Οƒ's; and v1 would essentially be the time series for how the first mode u1 evolves in the flow. Each snapshot has a certain amount of u1 in it, and how that amount varies in time is given by V. So these are kind of eigen-time-series, in the case of physics. In the case of human faces, you would take this big V^T matrix, with rows v1^T, v2^T, and so on, and essentially the first column of this V^T matrix tells me the mixture of all of those eigenfaces, the exact mixture of the columns of U I have to add up to equal x1. The second column of this V^T matrix is the mixture of these columns that adds up to x2, and so on and so forth, scaled, of course, by the singular values. So you can think of these as mixtures of u's that make x's: the first column makes x1, the second column makes x2, and so on. Very interpretable, very easy to understand. The columns of U have the same size as a column of X, so if these are people, these are eigen-people; if these are flow fields, these are eigen-flow-fields; and they're hierarchically organized and arranged based on these singular values Οƒ.

11:29

The last things I want to tell you: these are very, very easy to compute. In MATLAB you just say [U, S, V] = svd(X). I called Ξ£ "S" because I didn't want to type Sigma, but that's U, Ξ£, V equals the SVD of X. It's just as easy in Python and R and almost every other language. It's also guaranteed to exist; that's really important. And it's unique up to flipping the sign of these vectors: I can make this negative u1 and this negative v1 and nothing changes, but up to that, this decomposition is unique, and it's guaranteed to exist.

12:16

Good. In the next few videos we're going to talk about how you actually compute these matrices U, Ξ£, and V; how we interpret U and V as essentially eigenvectors of correlation matrices between the rows and columns of X; and then how we can use the singular value decomposition to approximate the data matrix X, to build models, to do linear regression and principal components analysis. I could, for example, build models on these eigen-time-series if I have a physical system. We're going to talk about all of that soon. Thank you.
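
As a supplement to the MATLAB one-liner above, here is a NumPy sketch of the shape bookkeeping for a tall, skinny X (the sizes are made-up stand-ins): in the full factorization, Ξ£ is n-by-m with at most m nonzero singular values on its diagonal, exactly as described in the transcript.

    import numpy as np

    # Tall, skinny X: n = 8 rows, m = 3 columns, mirroring the video's setup.
    X = np.random.randn(8, 3)

    # Full SVD: U is 8x8 and Vt is 3x3. NumPy returns only the
    # min(n, m) = 3 singular values; in the full factorization, Sigma
    # is 8x3 with zeros everywhere below its top 3x3 diagonal block.
    U, S, Vt = np.linalg.svd(X, full_matrices=True)
    Sigma = np.zeros((8, 3))
    np.fill_diagonal(Sigma, S)
    assert np.allclose(X, U @ Sigma @ Vt)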

Related Tags

Singular Value Decomposition, Data Matrix, Eigenfaces, Matrix Decomposition, Numerical Linear Algebra, Unitary Matrices, Orthogonal Matrices, Eigenvectors, Data Approximation, Eigenflows