Published by Brian Mooney. Modified over 2 years ago.

1
Data Analysis, Lecture 8
Tijl De Bie

2
Dimensionality reduction
How to deal with high-dimensional data?
– How to visualize it?
– How to explore it?
Dimensionality reduction is one way…

3
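Projections in vector spaces
Meaning of the inner product $w^\top x$:
– $w^\top x = \|w\|\,\|x\|\cos\theta$, where $\theta$ is the angle between $w$ and $x$
– For unit-norm $w$: the (signed) length of the projection of $x$ on $w$
– To express hyperplanes: $w^\top x = b$
– To express halfspaces: $w^\top x > b$
All these interpretations are relevant.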
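The readings of the inner product listed on this slide can be checked numerically. The following is a minimal sketch using NumPy; the vectors `w` and `x` and the offset `b` are made-up values chosen for illustration, not taken from the lecture.

```python
import numpy as np

# Hypothetical example vectors (not from the lecture).
w = np.array([3.0, 4.0])   # ||w|| = 5
x = np.array([2.0, 1.0])

# Algebraic inner product w'x = 3*2 + 4*1.
inner = w @ x

# Geometric reading: ||w|| * ||x|| * cos(theta), with theta measured
# independently as the difference of the two polar angles.
theta = np.arctan2(x[1], x[0]) - np.arctan2(w[1], w[0])
geometric = np.linalg.norm(w) * np.linalg.norm(x) * np.cos(theta)

# For a unit-norm direction, the inner product is the signed length
# of the projection of x on that direction.
w_unit = w / np.linalg.norm(w)
proj_length = w_unit @ x

# Hyperplane w'x = b and halfspace w'x > b, for an assumed offset b.
b = 10.0
on_hyperplane = bool(np.isclose(w @ x, b))
in_halfspace = bool(w @ x > b)
```

With these values `x` lies exactly on the hyperplane, so it is not inside the strict halfspace.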

4
Projections in vector spaces [Some drawings…]

5
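Variance of a projection
$w^\top x = x^\top w$ is the projection of $x$ on $w$ (for unit-norm $w$)
Let $X$ contain many points $x_i$ as its rows
Projection of all points in $X$ is:
– $Xw = (x_1^\top w,\ x_2^\top w,\ \ldots,\ x_n^\top w)^\top$
Variance of the projection on $w$ (up to a constant factor $1/n$, for centred data):
– $\sum_i \left( x_i^\top w / \|w\| \right)^2 = \dfrac{w^\top X^\top X w}{w^\top w}$
– Or, if $\|w\| = 1$, this is: $\sum_i (x_i^\top w)^2 = w^\top X^\top X w$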
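The identity between the sum of squared projections and the quadratic form can be verified on random data. This is a sketch under assumptions: the data matrix `X` and direction `w` are arbitrary illustrative values, and the data is centred first so that the sum of squares equals `n` times the variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 points in 3 dimensions, one point per row.
X = rng.normal(size=(200, 3))
X = X - X.mean(axis=0)            # centre each feature

w = np.array([1.0, 2.0, -1.0])    # arbitrary direction, not unit norm

# Left-hand side: sum of squared projections on w / ||w||.
proj = X @ w / np.linalg.norm(w)
sum_sq = np.sum(proj ** 2)

# Right-hand side: (w' X'X w) / (w'w).
quad = (w @ X.T @ X @ w) / (w @ w)

# For centred data, the sum of squares is n times the variance.
n_times_var = len(X) * np.var(proj)
```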

6
Principal Component Analysis
Direction / unit vector $w$ with largest variance?
– $\max_w \; w^\top X^\top X w$ subject to $w^\top w = 1$
Lagrangian:
– $L(w) = w^\top X^\top X w - \lambda\,(w^\top w - 1)$
Gradient w.r.t. $w$ equal to zero:
– $2 X^\top X w = 2 \lambda w$
– $(X^\top X)\, w = \lambda w$
Eigenvalue problem!

7
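Principal Component Analysis
Find $w$ as the dominant eigenvector of $X^\top X$!
– For a unit-norm eigenvector, the variance $w^\top X^\top X w$ equals its eigenvalue $\lambda$, so the largest eigenvalue gives the largest variance
Then we can project the data on this $w$
No other projection direction gives a larger variance
This projection is the best 1-D representation of the data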
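As an illustration of the eigenvector recipe, here is a minimal NumPy sketch. The synthetic data (stretched along the direction (1, 1)) and all variable names are assumptions made for this example; `np.linalg.eigh` is used because $X^\top X$ is symmetric, and it returns eigenvalues in ascending order.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D data with most of its spread along the direction (1, 1).
z = rng.normal(size=(500, 2)) * np.array([3.0, 0.5])
R = np.array([[1.0, -1.0],
              [1.0,  1.0]]) / np.sqrt(2.0)   # rotation by 45 degrees
X = z @ R.T
X = X - X.mean(axis=0)                       # centre the data

S = X.T @ X                                  # the matrix X'X from the slides

# eigh handles symmetric matrices; eigenvalues come back in ascending order.
eigvals, eigvecs = np.linalg.eigh(S)
w = eigvecs[:, -1]                           # dominant eigenvector (unit norm)
lam = eigvals[-1]                            # largest eigenvalue

t = X @ w                                    # 1-D representation of the data
```

The sign of `w` is arbitrary: both `w` and `-w` are valid dominant eigenvectors and give the same variance.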

8
Principal Component Analysis
Best 1-D representation: given by the projection on the dominant eigenvector
Second best $w$: the eigenvector with the second-largest eigenvalue
and so on…
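Going beyond one dimension amounts to keeping several eigenvectors, ordered by decreasing eigenvalue. The sketch below assumes synthetic data and a hypothetical target dimension `k = 2`; it is one way to organise the computation, not necessarily how the lecture implements it.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 4-D data with a different spread along each axis.
X = rng.normal(size=(300, 4)) * np.array([4.0, 2.0, 1.0, 0.5])
X = X - X.mean(axis=0)                 # centre the data

eigvals, eigvecs = np.linalg.eigh(X.T @ X)

# Reorder so that the largest eigenvalue comes first.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
W = eigvecs[:, order]                  # columns: 1st, 2nd, ... eigenvector

k = 2                                  # assumed number of dimensions to keep
T = X @ W[:, :k]                       # n x k low-dimensional representation
```

Each column of `T` has a sum of squares equal to the corresponding eigenvalue, so the columns are ordered from most to least variance.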

9
Technical but important…
I haven't mentioned:
– The data should be centred
– That is: the mean of each of the features should be 0
– If that is not the case: subtract from each feature its mean (centring)
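Centring is a one-line operation when the points are stored as the rows of a matrix. A small sketch with made-up numbers:

```python
import numpy as np

# Hypothetical data: 3 points (rows), 2 features (columns).
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 60.0]])

feature_means = X.mean(axis=0)   # mean of each feature (column)
Xc = X - feature_means           # broadcasting subtracts it from every row
```

After this step every column of `Xc` has mean 0, which is what the PCA derivation assumes.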

10
Clustering
Another way to make sense of high-dimensional data
Find coherent groups in the data
Points that are:
– close to one another within a cluster, but
– distant from points in other clusters

11
Distances between points
Distance between points: $\|x_i - x_j\|$
Can we assign points to $K$ different clusters
– each of which is coherent
– distant from each other?
Define the clusters by means of cluster centres $m_k$ with $k = 1, 2, \ldots, K$

12
K-means cost function
Ideal clustering:
– $\|x_i - m_{k(i)}\|$ small for all $x_i$, where $m_{k(i)}$ is the cluster centre of $x_i$
– $\sum_i \|x_i - m_{k(i)}\|^2$ small
Unfortunately: hard to minimise…
Simultaneous optimisation of:
– $k(i)$ (which cluster centre for which point)
– $m_k$ (where are the cluster centres)
Iterative strategy!
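To make the cost function concrete, the following sketch evaluates it for a good and a bad assignment of four points to two centres. The function name `kmeans_cost` and the toy data are hypothetical, introduced only for this example.

```python
import numpy as np

def kmeans_cost(X, centres, assign):
    """Sum over points of the squared distance to the assigned centre.

    X: (n, d) data, centres: (K, d), assign: (n,) integer array with k(i).
    """
    diffs = X - centres[assign]        # row i holds x_i - m_{k(i)}
    return float(np.sum(diffs ** 2))

# Hypothetical toy data: two pairs of points, far apart.
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
centres = np.array([[0.0, 1.0], [10.0, 1.0]])

good = np.array([0, 0, 1, 1])          # each point to its nearby centre
bad = np.array([0, 1, 0, 1])           # two points sent to the far centre

cost_good = kmeans_cost(X, centres, good)
cost_bad = kmeans_cost(X, centres, bad)
```

The good assignment has a much lower cost than the bad one, which is exactly what the cost function is meant to capture.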

13
K-means clustering
Iteratively optimise centres and cluster assignments
K-means algorithm:
– Start with random choices of $K$ centres $m_k$
– Set $k(i) = \arg\min_k \|x_i - m_k\|^2$
– Set $m_k = \mathrm{mean}(\{x_i : k(i) = k\})$
– Repeat the last two steps until the assignments no longer change
Do this for many different random starts, and pick the best result (with lowest cost)
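The steps above can be sketched in a few lines of NumPy. This is an illustrative implementation under assumptions, not the lecture's own code: the function names, the choice to initialise the centres at randomly picked data points, the handling of empty clusters, and the stopping rule are all choices made for this example.

```python
import numpy as np

def kmeans_once(X, K, rng, max_iter=100):
    """One run of K-means from a single random start."""
    # Assumed initialisation: K distinct data points picked at random.
    centres = X[rng.choice(len(X), size=K, replace=False)].copy()
    assign = None
    for _ in range(max_iter):
        # Squared distance from every point to every centre, shape (n, K).
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        new_assign = d2.argmin(axis=1)             # k(i) = argmin_k ||x_i - m_k||^2
        if assign is not None and np.array_equal(new_assign, assign):
            break                                  # assignments stable: stop
        assign = new_assign
        for k in range(K):
            members = X[assign == k]
            if len(members) > 0:                   # empty cluster keeps its old centre
                centres[k] = members.mean(axis=0)  # m_k = mean of its points
    cost = float(((X - centres[assign]) ** 2).sum())
    return centres, assign, cost

def kmeans(X, K, n_starts=10, seed=0):
    """Run several random starts and keep the result with the lowest cost."""
    rng = np.random.default_rng(seed)
    runs = [kmeans_once(X, K, rng) for _ in range(n_starts)]
    return min(runs, key=lambda run: run[2])

# Hypothetical data: two well-separated blobs of 50 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=(0.0, 0.0), scale=0.5, size=(50, 2)),
               rng.normal(loc=(5.0, 5.0), scale=0.5, size=(50, 2))])

centres, assign, cost = kmeans(X, K=2)
```

Each of the two update steps can only lower the cost or leave it unchanged, so a single run settles in a local optimum; the random restarts are what guard against a poor one.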
