EE3J2 Data Mining, Lecture 9: Data Analysis. Martin Russell.


Slide 1: EE3J2 Data Mining, Lecture 9: Data Analysis (Martin Russell)

Slide 2: Objectives
• To review basic data analysis
• To review the notions of mean, variance and covariance
• To explain Principal Components Analysis (PCA)

Slide 3: Example from speech processing
Plot of high-frequency energy vs low-frequency energy for 25 ms speech segments, sampled every 10 ms

Slide 4: Basic statistics
[Figure: plot annotated with the sample mean, the sample variance of x and of y, and the minimum and maximum of x and of y]

Slide 5: Basic statistics
• Denote samples by $X = x_1, x_2, \dots, x_T$, where $x_t = (x_{t,1}, x_{t,2}, \dots, x_{t,N})$
• The sample mean $\mu(X)$ is given by:
$$\mu(X) = \frac{1}{T} \sum_{t=1}^{T} x_t$$
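
As a small illustration of the sample mean, the sketch below averages each of the N components over the T samples. The function name `sample_mean` and the four 2-D data points are hypothetical and chosen only for this example; they are not taken from the lecture.

```python
def sample_mean(samples):
    """Component-wise mean of T sample vectors, each of length N."""
    T = len(samples)
    N = len(samples[0])
    return [sum(x[n] for x in samples) / T for n in range(N)]


# Hypothetical 2-D data points (x, y), invented for illustration.
X = [(1.0, 2.0), (2.0, 4.0), (3.0, 3.0), (6.0, 7.0)]

mu = sample_mean(X)
print(mu)  # [3.0, 4.0]
```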

Slide 6: More basic statistics
• The sample variance $\sigma^2(X)$ is given by:
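
The equation for this slide is not preserved in the transcript. One common definition, assumed here, divides the summed squared deviations by T, so that the variance of component n is $\frac{1}{T}\sum_{t=1}^{T}(x_{t,n}-\mu_n)^2$; dividing by T − 1 instead gives the unbiased estimate, and the lecture may have used either. The sketch below supports both conventions. The function name `sample_variance`, the `unbiased` flag and the data are hypothetical.

```python
def sample_variance(samples, unbiased=False):
    """Component-wise variance of T sample vectors.

    Divides by T by default (an assumed convention), or by T - 1
    when unbiased=True.
    """
    T = len(samples)
    N = len(samples[0])
    mu = [sum(x[n] for x in samples) / T for n in range(N)]
    denom = T - 1 if unbiased else T
    return [sum((x[n] - mu[n]) ** 2 for x in samples) / denom
            for n in range(N)]


# Hypothetical 2-D data points (x, y), invented for illustration.
X = [(1.0, 2.0), (2.0, 4.0), (3.0, 3.0), (6.0, 7.0)]

var = sample_variance(X)                       # divide by T
var_unbiased = sample_variance(X, unbiased=True)  # divide by T - 1
print(var)  # [3.5, 3.5]
```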

Slide 7: Covariance
• As the x value increases, the y value also increases
• This is (positive) covariance
• If y decreases as x increases, the result is negative covariance

Slide 8: Definition of covariance
• The covariance between the m-th and n-th components of the sample data is defined by:
• In practice it is useful to subtract the mean $\mu(X)$ from each of the data points $x_t$. The sample mean is then 0 and
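
The two equations on this slide are not preserved in the transcript. A minimal sketch of the idea, assuming the 1/T normalisation (the lecture may have used 1/(T − 1)): after the mean is subtracted, the covariance between components m and n reduces to the average of the products $x_{t,m}\,x_{t,n}$. The helper names `centre` and `covariance_matrix` and the data are hypothetical.

```python
def centre(samples):
    """Subtract the sample mean from every data point."""
    T = len(samples)
    N = len(samples[0])
    mu = [sum(x[n] for x in samples) / T for n in range(N)]
    return [[x[n] - mu[n] for n in range(N)] for x in samples]


def covariance_matrix(centred):
    """N x N covariance matrix of zero-mean data (assumed 1/T convention)."""
    T = len(centred)
    N = len(centred[0])
    return [[sum(x[m] * x[n] for x in centred) / T for n in range(N)]
            for m in range(N)]


# Hypothetical 2-D data points (x, y), invented for illustration.
X = [(1.0, 2.0), (2.0, 4.0), (3.0, 3.0), (6.0, 7.0)]

C = covariance_matrix(centre(X))
print(C)  # [[3.5, 3.25], [3.25, 3.5]]
```

The diagonal entries are the variances of x and y, and the positive off-diagonal entry reflects the fact that y tends to increase with x in this made-up data.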

Slide 9: Data with mean subtracted
Implies positive covariance

Slide 10: Sample data rotated through 2θ
• Implies negative covariance

Slide 11: Data with covariance removed

Slide 12: Principal Components Analysis
• PCA is the technique which I used to diagonalise the sample covariance matrix
• The first step is to write the covariance matrix $C$ in the form:
$$C = U D U^{T}$$
where $D$ is diagonal and $U$ is a matrix corresponding to a rotation
• This can be done using SVD (see Lecture 8) or eigenvalue decomposition
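
For 2-D data the decomposition can be written in closed form, which makes a compact illustration. The sketch below is not the SVD route mentioned on the slide: it uses the standard closed-form rotation angle for a symmetric 2 x 2 matrix, θ = ½·atan2(2c₁₂, c₁₁ − c₂₂), and then checks that U D Uᵀ rebuilds the original matrix. The matrix values and the names `diagonalise_2x2`, `matmul` and `transpose` are hypothetical; for general N a library eigen-solver would be the usual choice.

```python
import math


def diagonalise_2x2(C):
    """Return (U, D, theta) with C = U D U^T for a symmetric 2x2 matrix C."""
    a, b, c = C[0][0], C[0][1], C[1][1]
    # Rotation angle that makes the off-diagonal term of U^T C U vanish.
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    cs, sn = math.cos(theta), math.sin(theta)
    U = [[cs, -sn], [sn, cs]]  # columns are the directions e1 and e2
    d11 = a * cs * cs + 2.0 * b * sn * cs + c * sn * sn
    d22 = a * sn * sn - 2.0 * b * sn * cs + c * cs * cs
    D = [[d11, 0.0], [0.0, d22]]
    return U, D, theta


def matmul(A, B):
    """Plain-Python matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


# Hypothetical sample covariance matrix, invented for illustration.
C = [[3.5, 3.25], [3.25, 3.5]]

U, D, theta = diagonalise_2x2(C)
C_rebuilt = matmul(matmul(U, D), transpose(U))
print(D[0][0], D[1][1], theta)
```

With this choice of angle, the first diagonal entry of D is always the larger of the two variances.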

Slide 13: PCA continued
• $U$ implements rotation through angle $\theta$
• $e_1$ is the first column of $U$; $d_{11}$ is the variance in the direction $e_1$
• $e_2$ is the second column of $U$; $d_{22}$ is the variance in the direction $e_2$
[Figure: data plot with the directions $e_1$ and $e_2$ marked]
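
The interpretation of d₁₁ and d₂₂ as variances along e₁ and e₂ can be checked numerically. The sketch below projects zero-mean 2-D data onto the two directions and measures the variance of each projection and the covariance between them. It assumes the 1/T normalisation and the closed-form 2 x 2 rotation angle; the data points and variable names are hypothetical.

```python
import math

# Hypothetical zero-mean 2-D data points, invented for illustration.
centred = [(-2.0, -2.0), (-1.0, 0.0), (0.0, -1.0), (3.0, 3.0)]
T = len(centred)


def cov(u, v):
    """Covariance of two zero-mean sequences (assumed 1/T convention)."""
    return sum(p * q for p, q in zip(u, v)) / T


xs = [p[0] for p in centred]
ys = [p[1] for p in centred]
a, b, c = cov(xs, xs), cov(xs, ys), cov(ys, ys)

# Rotation angle and the two directions (the columns of U).
theta = 0.5 * math.atan2(2.0 * b, a - c)
e1 = (math.cos(theta), math.sin(theta))
e2 = (-math.sin(theta), math.cos(theta))

# Coordinates of each data point along e1 and e2.
z1 = [p[0] * e1[0] + p[1] * e1[1] for p in centred]
z2 = [p[0] * e2[0] + p[1] * e2[1] for p in centred]

d11 = cov(z1, z1)  # variance in the direction e1
d22 = cov(z2, z2)  # variance in the direction e2
c12 = cov(z1, z2)  # covariance after rotation, expected to be close to 0
print(d11, d22, c12)
```

In the rotated coordinates the covariance is removed (up to rounding), and the total variance d₁₁ + d₂₂ equals the total variance of the original x and y components.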

Slide 14: Summary
• Basic data analysis
• Means, variance and covariance
• Principal Components Analysis

