
Chapter 28 – Part II Matrix Operations

- Gaussian elimination
- LU factorization
- Gaussian elimination with partial pivoting
- LUP factorization
- Error analysis
- Complexity of matrix multiplication & inversion
- SVD and PCA
- SVD and PCA applications

Project 2 – image compression using SVD

Eigenvalues & Eigenvectors
Eigenvectors (for a square m × m matrix S): $Sv = \lambda v$, where $\lambda$ is the eigenvalue and $v \neq 0$ is the (right) eigenvector.
How many eigenvalues are there at most? $Sv = \lambda v \iff (S - \lambda I)v = 0$ only has a non-zero solution if $|S - \lambda I| = 0$. This is an m-th order equation in λ, which can have at most m distinct solutions (the roots of the characteristic polynomial); they can be complex even though S is real.
Some of these slides are adapted from notes of Stanford CS276 & CMU CS385.
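As a small illustration of the characteristic polynomial, the sketch below solves $|S - \lambda I| = 0$ in closed form for a 2 × 2 matrix. The helper name `eig2x2` and both test matrices are assumptions chosen for illustration, not taken from the slides. The rotation matrix shows that a real S can have complex eigenvalues.

```python
import cmath

def eig2x2(S):
    """Eigenvalues of a 2x2 matrix from its characteristic polynomial.

    |S - lambda*I| = lambda^2 - trace(S)*lambda + det(S) = 0
    """
    (a, b), (c, d) = S
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)  # complex-safe square root
    return (trace + root) / 2, (trace - root) / 2

# Hypothetical examples (not from the slides)
symmetric = [[2, 1], [1, 2]]   # real symmetric: both eigenvalues are real
rotation = [[0, -1], [1, 0]]   # 90-degree rotation: eigenvalues are complex

sym_eigs = eig2x2(symmetric)
rot_eigs = eig2x2(rotation)
print(sym_eigs)
print(rot_eigs)
```

A second-order polynomial has at most two roots, which matches the "at most m distinct solutions" bound for m = 2.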

Matrix-vector multiplication
The matrix S has eigenvalues 3, 2, 0 with corresponding eigenvectors $v_1, v_2, v_3$. On each eigenvector, S acts as a multiple of the identity matrix, but as a different multiple on each. Any vector x can be viewed as a combination of the eigenvectors, for example $x = 2v_1 + 4v_2 + 6v_3$.

Matrix-vector multiplication
Thus a matrix-vector multiplication such as Sx can be rewritten in terms of the eigenvalues/vectors:
$Sx = S(2v_1 + 4v_2 + 6v_3) = 2Sv_1 + 4Sv_2 + 6Sv_3 = 2\lambda_1 v_1 + 4\lambda_2 v_2 + 6\lambda_3 v_3$
Even though x is an arbitrary vector, the action of S on x is determined by the eigenvalues/vectors.
Suggestion: the effect of "small" eigenvalues is small.
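The rewriting of Sx can be checked numerically. The sketch below assumes the simplest matrix with eigenvalues 3, 2, 0, namely the diagonal matrix whose eigenvectors are the standard basis vectors; that concrete choice, and the helper name `matvec`, are assumptions for illustration only.

```python
def matvec(S, x):
    """Plain matrix-vector product for lists of rows."""
    return [sum(s * xi for s, xi in zip(row, x)) for row in S]

# Assumed for illustration: diagonal S, so the eigenvectors are e1, e2, e3
S = [[3, 0, 0], [0, 2, 0], [0, 0, 0]]
eigenvalues = [3, 2, 0]
eigenvectors = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
coeffs = [2, 4, 6]  # x = 2*v1 + 4*v2 + 6*v3

# Build x from its eigenvector expansion
x = [sum(c * v[i] for c, v in zip(coeffs, eigenvectors)) for i in range(3)]

# Sx computed directly, and as the sum of c_i * lambda_i * v_i
direct = matvec(S, x)
via_eigen = [
    sum(c * lam * v[i] for c, lam, v in zip(coeffs, eigenvalues, eigenvectors))
    for i in range(3)
]
print(direct, via_eigen)
```

The component along the eigenvector with eigenvalue 0 vanishes entirely, which is the extreme case of the "small eigenvalues have a small effect" suggestion.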

Eigenvalues & Eigenvectors
- For symmetric matrices, eigenvectors for distinct eigenvalues are orthogonal.
- All eigenvalues of a real symmetric matrix are real.
- All eigenvalues of a positive semidefinite matrix are non-negative.

Example
Let S be a real, symmetric matrix whose eigenvalues are 1 and 3 (nonnegative, real). Its eigenvectors are orthogonal (and real).

Eigen/diagonal Decomposition
Let S be a square m × m matrix with m linearly independent eigenvectors.
Theorem: There exists an eigen decomposition $S = U \Lambda U^{-1}$.
Columns of U are eigenvectors of S.
Diagonal elements of Λ are eigenvalues of S.

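Diagonal decomposition (2)
Let U have the eigenvectors as columns: $U = [v_1 \ \cdots \ v_m]$.
Then SU can be written $SU = S[v_1 \ \cdots \ v_m] = [\lambda_1 v_1 \ \cdots \ \lambda_m v_m] = [v_1 \ \cdots \ v_m]\,\mathrm{diag}(\lambda_1, \ldots, \lambda_m)$.
Thus $SU = U\Lambda$, or $U^{-1}SU = \Lambda$, and $S = U \Lambda U^{-1}$.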

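Diagonal decomposition - example
Recall the symmetric example S with eigenvalues 1 and 3. Its eigenvectors form the columns of U. Inverting U gives $U^{-1}$. Then $S = U \Lambda U^{-1}$, with the eigenvalues 1 and 3 on the diagonal of Λ.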

Example continued
Let's divide U (and multiply $U^{-1}$) by the length of its columns, so that every column has unit length. Then $S = Q \Lambda Q^T$, where $Q^{-1} = Q^T$.
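The decomposition steps can be sketched end to end on a concrete matrix. The matrix S = [[2, 1], [1, 2]] is an assumption: it is one real symmetric matrix with eigenvalues 1 and 3, picked here for illustration, with eigenvectors (1, -1) and (1, 1). The helpers `matmul` and `transpose` are likewise hypothetical names.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

# Assumed example matrix (real, symmetric, eigenvalues 1 and 3)
S = [[2.0, 1.0], [1.0, 2.0]]
U = [[1.0, 1.0], [-1.0, 1.0]]      # columns: eigenvectors (1,-1) and (1,1)
U_inv = [[0.5, -0.5], [0.5, 0.5]]  # inverse of U, worked out by hand
Lam = [[1.0, 0.0], [0.0, 3.0]]     # eigenvalues in the same column order

# S = U * Lambda * U^-1
S_from_U = matmul(matmul(U, Lam), U_inv)

# Normalize the columns of U (each has length sqrt(2)) to get orthogonal Q
r = math.sqrt(2.0)
Q = [[u / r for u in row] for row in U]
QtQ = matmul(transpose(Q), Q)                    # should be the identity
S_from_Q = matmul(matmul(Q, Lam), transpose(Q))  # S = Q * Lambda * Q^T
print(S_from_U, QtQ, S_from_Q)
```

Normalizing the columns is what turns the general inverse $U^{-1}$ into the much cheaper transpose $Q^T$.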

Symmetric Eigen Decomposition
If S is a symmetric m × m matrix:
Theorem: There exists a (unique) eigen decomposition $S = Q \Lambda Q^T$, where Q is orthogonal:
- $Q^{-1} = Q^T$
- Columns of Q are normalized eigenvectors
- Columns are orthogonal
- (everything is real)

Exercise
Examine the symmetric eigen decomposition, if any, for each of the following matrices:

Singular Value Decomposition mmmmmnmnV is n  n For an m  n matrix A of rank r there exists a factorization (Singular Value Decomposition = SVD) as follows: The columns of U are orthogonal eigenvectors of AA T. The columns of V are orthogonal eigenvectors of A T A. Singular values. Eigenvalues 1 … r of AA T are the eigenvalues of A T A.

Singular Value Decomposition
[Figure: illustration of SVD dimensions and sparseness]

SVD example
Let A be a 3 × 2 matrix, thus m = 3, n = 2. Its SVD is … Typically, the singular values are arranged in decreasing order.
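A minimal sketch of how singular values follow from the eigenvalues of $A^TA$, for a 3 × 2 matrix. The matrix A below is an assumption chosen for illustration, and the closed-form eigen solver only covers the 2 × 2 symmetric case with a non-zero off-diagonal entry. The result is the thin SVD: U has only two columns.

```python
import math

# Assumed 3x2 example matrix (m = 3, n = 2)
A = [[1.0, -1.0], [0.0, 1.0], [1.0, 0.0]]

# G = A^T A is 2x2 and symmetric
G = [[sum(row[i] * row[j] for row in A) for j in range(2)] for i in range(2)]
a, b, d = G[0][0], G[0][1], G[1][1]

# Eigenvalues of a symmetric 2x2 matrix in closed form, largest first
mean = (a + d) / 2
radius = math.sqrt(((a - d) / 2) ** 2 + b * b)
eigenvalues = [mean + radius, mean - radius]
singular_values = [math.sqrt(lam) for lam in eigenvalues]

def unit_eigenvector(lam):
    """(b, lam - a) solves (G - lam*I) v = 0 when b != 0."""
    v = [b, lam - a]
    n = math.hypot(v[0], v[1])
    return [v[0] / n, v[1] / n]

# Right singular vectors v_i, then left singular vectors u_i = A v_i / sigma_i
V = [unit_eigenvector(lam) for lam in eigenvalues]
U = [
    [sum(row[j] * v[j] for j in range(2)) / s for row in A]
    for v, s in zip(V, singular_values)
]

# Rebuild A as the sum of sigma_i * u_i * v_i^T
A_rebuilt = [
    [sum(s * u[i] * v[j] for s, u, v in zip(singular_values, U, V)) for j in range(2)]
    for i in range(3)
]
print(singular_values)
```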

Low-rank Approximation
SVD can be used to compute optimal low-rank approximations.
Approximation problem: Find $A_k$ of rank k such that
$A_k = \arg\min_{X:\ \mathrm{rank}(X) = k} \|A - X\|_F$
$A_k$ and X are both m × n matrices. Typically, want k << r.
Here $\|\cdot\|_F$ is the Frobenius norm, $\|A\|_F = \sqrt{\sum_{i,j} a_{ij}^2}$, i.e. the 2-norm of the entries of A treated as one long vector.

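Low-rank Approximation
Solution via SVD: set the smallest r − k singular values to zero:
$A_k = U\,\mathrm{diag}(\sigma_1, \ldots, \sigma_k, 0, \ldots, 0)\,V^T$
Column notation (sum of k rank-1 matrices): $A_k = \sum_{i=1}^{k} \sigma_i u_i v_i^T$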

Approximation error
How good (bad) is this approximation? It's the best possible, measured by the Frobenius norm of the error:
$\min_{X:\ \mathrm{rank}(X)=k} \|A - X\|_F = \|A - A_k\|_F = \sqrt{\sigma_{k+1}^2 + \cdots + \sigma_r^2}$
where the $\sigma_i$ are ordered such that $\sigma_i \ge \sigma_{i+1}$. (Measured in the spectral 2-norm instead, the error is $\sigma_{k+1}$.) This suggests why the Frobenius error drops as k is increased.
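The error of the truncated sum can be sketched numerically. The example below builds a 3 × 3 matrix directly from chosen singular triplets, so no SVD routine is needed; the singular values 3, 2, 1, the rotation angle, and all helper names are assumptions for illustration. It checks only the size of the error, not that no other rank-k matrix does better.

```python
import math

def rank1(sigma, u, v):
    """The rank-1 matrix sigma * u * v^T as a list of rows."""
    return [[sigma * ui * vj for vj in v] for ui in u]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def frobenius(A):
    return math.sqrt(sum(a * a for row in A for a in row))

c, s = math.cos(0.3), math.sin(0.3)
# Orthonormal left vectors: columns of a rotation about the z-axis
us = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
# Orthonormal right vectors: columns of a rotation about the x-axis
vs = [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]
sigmas = [3.0, 2.0, 1.0]  # assumed singular values, in decreasing order

def truncated(k):
    """A_k: the sum of the first k rank-1 terms."""
    A = [[0.0] * 3 for _ in range(3)]
    for i in range(k):
        A = add(A, rank1(sigmas[i], us[i], vs[i]))
    return A

A = truncated(3)
errors = []
for k in range(4):
    Ak = truncated(k)
    diff = [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, Ak)]
    errors.append(frobenius(diff))
print(errors)
```

Each entry of `errors` equals the square root of the sum of the squared discarded singular values, so the error shrinks with every term kept.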

SVD Application
Image compression

SVD example
Eigenvalues of A'A? Eigenvectors? Matrix U?

SVD example: action of A on a unit circle
[Figure: the unit circle transformed in stages, applying $V^T$ to x, then Σ, then U]

PCA
Principal Components Analysis

Data Presentation
Example: 53 blood and urine measurements (wet chemistry) from 65 people (33 alcoholics, 32 non-alcoholics).
- Matrix Format
- Spectral Format

Data Presentation
[Figures: univariate, bivariate, and trivariate plots]

- Better presentation than ordinate axes?
- Do we need a 53-dimensional space to view the data?
- How to find the 'best' low-dimensional space that conveys maximum useful information?
- One answer: Find "Principal Components"

Principal Components
- First PC is the direction of maximum variance from the origin
- Subsequent PCs are orthogonal to the 1st PC and describe maximum residual variance

The Goal
We wish to explain/summarize the underlying variance-covariance structure of a large set of variables through a few linear combinations of these variables.

Applications
Uses:
- Data Visualization
- Data Reduction
- Data Classification
- Trend Analysis
- Factor Analysis
- Noise Reduction
Examples:
- How many unique "sub-sets" are in the sample?
- How are they similar / different?
- What are the underlying factors that influence the samples?
- Which time / temporal trends are (anti)correlated?
- Which measurements are needed to differentiate?
- How to best present what is "interesting"?
- To which "sub-set" does this new sample rightfully belong?

Trick: Rotate Coordinate Axes
Suppose we have a population measured on p random variables $X_1, \ldots, X_p$. Note that these random variables represent the p axes of the Cartesian coordinate system in which the population resides. Our goal is to develop a new set of p axes (linear combinations of the original p axes) in the directions of greatest variability. This is accomplished by rotating the axes.

PCA: General
From k original variables $x_1, x_2, \ldots, x_k$, produce k new variables $y_1, y_2, \ldots, y_k$:
$y_1 = a_{11} x_1 + a_{12} x_2 + \cdots + a_{1k} x_k$
$y_2 = a_{21} x_1 + a_{22} x_2 + \cdots + a_{2k} x_k$
...
$y_k = a_{k1} x_1 + a_{k2} x_2 + \cdots + a_{kk} x_k$
such that:
- the $y_i$ are uncorrelated (orthogonal)
- $y_1$ explains as much as possible of the original variance in the data set
- $y_2$ explains as much as possible of the remaining variance
- etc.

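[Figure: 1st principal component $y_1$ and 2nd principal component $y_2$]

[Figure: PCA scores, showing original coordinates $x_{i1}, x_{i2}$ and scores $y_{i,1}, y_{i,2}$]

[Figure: PCA eigenvalues $\lambda_1, \lambda_2$]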

PCA: Another Explanation
From k original variables $x_1, x_2, \ldots, x_k$, produce k new variables $y_1, y_2, \ldots, y_k$:
$y_1 = a_{11} x_1 + a_{12} x_2 + \cdots + a_{1k} x_k$
$y_2 = a_{21} x_1 + a_{22} x_2 + \cdots + a_{2k} x_k$
...
$y_k = a_{k1} x_1 + a_{k2} x_2 + \cdots + a_{kk} x_k$
such that:
- the $y_i$ are uncorrelated (orthogonal)
- $y_1$ explains as much as possible of the original variance in the data set
- $y_2$ explains as much as possible of the remaining variance
- etc.
The $y_i$ are the Principal Components.

Example
Raw data:
            Apple   Samsung   Nokia
student1      1        2        1
student2      4        2       13
student3      7        8        1
student4      8        4        5
Mean          5        4        5
Data minus mean (B):
            Apple   Samsung   Nokia
student1     -4       -2       -4
student2     -1       -2        8
student3      2        4       -4
student4      3        0        0
Sample covariance matrix: S = B'B/(n-1) =
    10    6    0
     6    8   -8
     0   -8   32
PCA of B → finding eigenvalues and eigenvectors of S: S = VDV'
V =
     0.5686   0.8193  -0.0740
    -0.7955   0.5247  -0.3030
    -0.2094   0.2312   0.9501
D =
     1.6057    0         0
     0        13.8430    0
     0         0        34.5513
1st principal component: [-0.0740, -0.3030, 0.9501]', y1 = -0.0740A - 0.3030S + 0.9501N
2nd principal component: [0.8193, 0.5247, 0.2312]', y2 …
3rd principal component: [0.5686, -0.7955, -0.2094]', y3 …
Total variance = tr(D) = 50 = tr(S)
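The covariance matrix and the first principal component of this example can be reproduced with a short sketch. Power iteration is used here only because it needs no linear-algebra library; it is an assumed stand-in, not necessarily how the slide's V and D were computed, and the function name `power_iteration` is hypothetical.

```python
import math

# Raw data from the example: rows are students, columns Apple, Samsung, Nokia
data = [[1, 2, 1], [4, 2, 13], [7, 8, 1], [8, 4, 5]]
n, p = len(data), len(data[0])

# Center each column, then form S = B'B / (n - 1)
means = [sum(row[j] for row in data) / n for j in range(p)]
B = [[row[j] - means[j] for j in range(p)] for row in data]
S = [[sum(r[i] * r[j] for r in B) / (n - 1) for j in range(p)] for i in range(p)]

def power_iteration(M, steps=200):
    """Dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    v = [1.0] * len(M)
    for _ in range(steps):
        w = [sum(m * x for m, x in zip(row, v)) for row in M]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Mv = [sum(m * x for m, x in zip(row, v)) for row in M]
    lam = sum(a * b for a, b in zip(v, Mv))  # Rayleigh quotient
    return lam, v

lam1, pc1 = power_iteration(S)
total_variance = sum(S[i][i] for i in range(p))  # trace of S
print(S)
print(lam1, pc1, total_variance)
```

The ratio `lam1 / total_variance` is the share of the total variance explained by the first principal component, roughly 69% here.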

New view of the data
y1 = -0.0740A - 0.3030S + 0.9501N, …
C = BV =
              y3        y2        y1
student1    0.1542   -5.2514   -2.8984
student2   -0.6528   -0.0191    8.2808
student3   -1.2072    2.8126   -5.1604
student4    1.7058    2.4579   -0.222
[Figure: the data plotted against the original axes x1, x2 and the principal axes y1, y2]
Original data, for comparison:
            Apple   Samsung   Nokia
student1      1        2        1
student2      4        2       13
student3      7        8        1
student4      8        4        5
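The scores C = BV can be recomputed from the mean-centered data and the eigenvector matrix quoted in the example. In the sketch below, B is derived from the raw table by subtracting the column means (5, 4, 5), and V is copied as rounded on the slide, so the results agree only to about three decimals.

```python
# Mean-centered data (raw table minus column means 5, 4, 5)
B = [[-4, -2, -4], [-1, -2, 8], [2, 4, -4], [3, 0, 0]]

# Eigenvectors as columns, in the slide's order (y3, y2, y1)
V = [
    [0.5686, 0.8193, -0.0740],
    [-0.7955, 0.5247, -0.3030],
    [-0.2094, 0.2312, 0.9501],
]

# Scores: each row of B projected onto each eigenvector
C = [[sum(b * V[k][j] for k, b in enumerate(row)) for j in range(3)] for row in B]

# Sample variance of each score column; the column means are zero
n = len(B)
variances = [sum(row[j] ** 2 for row in C) / (n - 1) for j in range(3)]
print(C)
print(variances)
```

The variances of the score columns come out close to the eigenvalues 1.6057, 13.8430, 34.5513, which is what "y1 explains as much as possible of the original variance" means in numbers.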

Another look at the data: make all variances = 1
B (raw data):
            Apple   Samsung   Nokia
student1      1        2        1
student2      4        2       13
student3      7        8        1
student4      8        4        5
Var^-1*B (the values match each student's row divided by that row's sample standard deviation, e.g. student1: (1, 2, 1)/0.5774):
              A        S        N
student1    1.7321   3.4641   1.7321
student2    0.6827   0.3413   2.2186
student3    1.8489   2.1131   0.2641
student4    3.8431   1.9215   2.4019
Minus the column means:
              A        S        N
student1   -0.2946   1.5041   0.077925
student2   -1.344   -1.6187   0.564425
student3   -0.1778   0.1531  -1.39008
student4    1.8164  -0.0385   0.747725
S = VDV', lambda = [0.5873; 1.4916; 2.2370]
C =
              y3        y2        y1
student1    0.8888    0.9583   -0.8043
student2    0.2289   -0.5312    2.1001
student3   -0.9428    1.048    -0.01
student4   -0.1749   -1.4751   -1.2858
[Figure: the data plotted on the axes y1, y2]
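The scaled and centered tables can be reproduced under one reading of the numbers: each student's row is divided by that row's sample standard deviation, and the column means are then subtracted. This reading is an inference from the values, not something the slide states.

```python
import statistics

# Raw data: rows are students, columns Apple, Samsung, Nokia
data = [[1, 2, 1], [4, 2, 13], [7, 8, 1], [8, 4, 5]]

# Divide each row by its own sample standard deviation (n - 1 denominator)
scaled = [[x / statistics.stdev(row) for x in row] for row in data]

# Subtract the column means of the scaled data
col_means = [statistics.mean(col) for col in zip(*scaled)]
centered = [[x - m for x, m in zip(row, col_means)] for row in scaled]

for row in centered:
    print([round(x, 4) for x in row])
```

Scaling each column (variable) by its standard deviation instead would give every variable unit variance, which is the more usual way to standardize before PCA.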
