Survey on ICA
Technical Report, Aapo Hyvärinen, 1999. http://www.icsi.berkeley.edu/~jagota/NCS
Outline
- 2nd-order methods: PCA / factor analysis
- Higher-order methods: projection pursuit / blind deconvolution
- ICA: definitions, criteria for identifiability, relations to other methods
- Applications
- Contrast functions
- Algorithms
General model
x = As + n
- x: observations
- A: mixing matrix
- n: noise
- s: latent variables (factors, independent components)
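As a quick illustration (my own sketch, not from the survey), the model can be simulated in a few lines of NumPy; the Laplacian sources, the mixing matrix, and the noise level are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent, non-Gaussian latent sources (Laplacian), 1000 samples.
s = rng.laplace(size=(2, 1000))

# An arbitrary full-rank mixing matrix and additive Gaussian noise.
A = np.array([[1.0, 0.5],
              [0.3, 2.0]])
n = 0.1 * rng.standard_normal((2, 1000))

# Observed mixtures: x = As + n.
x = A @ s + n
```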
Find a transformation s = f(x); consider only linear transformations:
s = Wx
Principal component analysis
- Find direction(s) w (of unit norm) where the variance of w^T x is maximized
- Equivalent to finding the eigenvectors of C = E(xx^T) corresponding to the k largest eigenvalues
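A minimal sketch of this eigendecomposition view of PCA (NumPy; the function name and the variables-in-rows data layout are my own choices):

```python
import numpy as np

def pca(x, k):
    """Top-k principal directions of data x (one variable per row)."""
    xc = x - x.mean(axis=1, keepdims=True)   # center the data
    C = xc @ xc.T / xc.shape[1]              # covariance C = E(xx^T)
    eigvals, eigvecs = np.linalg.eigh(C)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]    # pick the k largest
    return eigvecs[:, order], eigvals[order]
```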
Principal component analysis (figure)
Factor analysis
- Closely related to PCA: x = As + n
- Method of principal factors:
–Assumes the covariance matrix of the noise, E(nn^T), is known
–PCA on C = E(xx^T) − E(nn^T)
- Factors are not defined uniquely, but only up to a rotation
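A minimal sketch of the method of principal factors under these assumptions (NumPy; the function name and the convention that the noise covariance is supplied by the caller are my own framing):

```python
import numpy as np

def principal_factors(x, noise_cov, k):
    """Method of principal factors: PCA on E(xx^T) - E(nn^T).

    noise_cov is the (assumed known) covariance matrix of the noise.
    """
    xc = x - x.mean(axis=1, keepdims=True)
    C = xc @ xc.T / xc.shape[1] - noise_cov   # noise-corrected covariance
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1][:k]     # k largest eigenvalues
    return eigvecs[:, order], eigvals[order]
```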
Higher-order methods
- Projection pursuit
- Redundancy reduction
- Blind deconvolution
- These require the assumption that the data are not Gaussian
Projection pursuit
- Find a direction w such that w^T x has an 'interesting' distribution
- It has been argued that the interesting directions are those whose distributions are the least Gaussian
Differential entropy
H(y) = −∫ f(y) log f(y) dy
- Among densities of fixed variance, H is maximized when f is a Gaussian density
- Minimize H(w^T x) to find projection pursuit directions (y = w^T x)
- Drawback: the density of w^T x is difficult to estimate
Example: projection pursuit (figure)
Blind deconvolution
- Observe a filtered version of s(t): x(t) = g(t) * s(t)
- Find a filter h(t) such that s(t) = h(t) * x(t)
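The blind problem is to estimate h(t) from x(t) alone, exploiting the non-Gaussianity of s. As a sanity check of the model itself, here is a non-blind sketch (my own, not from the survey) where g is known and inverted in the frequency domain; circular convolution is used for simplicity and the signals are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# A spiky (non-Gaussian) source and a smoothing filter g.
s = rng.laplace(size=256)
g = np.array([1.0, 0.7, 0.3, 0.1])
G = np.fft.fft(g, n=len(s))

# Observation x = g * s (circular convolution, for simplicity).
x = np.real(np.fft.ifft(np.fft.fft(s) * G))

# With g known, the inverse filter is H(f) = 1/G(f); blind deconvolution
# must find such an h from x alone.
s_hat = np.real(np.fft.ifft(np.fft.fft(x) / G))
print(np.allclose(s, s_hat))  # True: s is recovered
```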
Example: blind deconvolution
- In seismic processing, known as "statistical deconvolution"
Blind deconvolution (3) (figure: the filter g(t) and the source s(t) plotted against time t)
Blind deconvolution (4) (figure)
ICA definitions
Definition 1 (general definition): ICA of a random vector x consists of finding a linear transformation s = Wx so that the components s_i are as independent as possible, in the sense of maximizing some function F(s_1, ..., s_m) that measures independence.
ICA definitions (2)
Definition 2 (noisy ICA): ICA of a random vector x consists of estimating the model x = As + n, where the latent variables s_i are assumed independent.
Definition 3 (noise-free ICA): the same model without the noise term: x = As.
Statistical independence
- ICA requires statistical independence
- Distinguish between statistically independent and uncorrelated variables
- Statistically independent: p(y_1, y_2) = p(y_1) p(y_2)
- Uncorrelated: E(y_1 y_2) = E(y_1) E(y_2)
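Uncorrelatedness does not imply independence. A quick numerical illustration (my own example): take y_1 symmetric around zero and y_2 = y_1^2; the two are uncorrelated yet y_2 is fully determined by y_1:

```python
import numpy as np

rng = np.random.default_rng(0)

# y1 is symmetric around zero, so E(y1 * y2) = E(y1**3) = 0 = E(y1)E(y2):
# the variables are uncorrelated, but y2 depends entirely on y1.
y1 = rng.standard_normal(100_000)
y2 = y1**2

print(np.corrcoef(y1, y2)[0, 1])     # ~0: uncorrelated
print(np.corrcoef(y1**2, y2)[0, 1])  # 1: clearly not independent
```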
Identifiability of the ICA model
- All the independent components, except possibly one, must be non-Gaussian
- The number of observed mixtures must be at least as large as the number of independent components: m >= n
- The matrix A must be of full column rank
- Note: with m < n, A may still be identifiable
Relations to other methods
- Redundancy reduction
- Noise-free case:
–Find 'interesting' projections
–Special case of projection pursuit
- Blind deconvolution
- Factor analysis for non-Gaussian data
- Related to non-linear PCA
Relations to other methods (2) (figure)
Applications of ICA
- Blind source separation
–Cocktail party problem
- Feature extraction
- Blind deconvolution
Blind source separation (figure)
Objective (contrast) functions
- ICA method = objective function + optimization algorithm
- Multi-unit contrast functions
–Find all independent components at once
- One-unit contrast functions
–Find one independent component (at a time)
Mutual information
I(y_1, ..., y_m) = ∫ f(y) log [ f(y) / ∏_i f_i(y_i) ] dy
- Mutual information is zero if and only if the y_i are independent
- Difficult to estimate, but approximations exist
Mutual information (2)
Alternative definition: I(y_1, ..., y_m) = Σ_i H(y_i) − H(y_1, ..., y_m)
Mutual information (3) (figure: Venn diagram relating H(X), H(Y), the conditional entropies H(X|Y) and H(Y|X), and the mutual information I(X,Y))
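For discrete variables, the diagram's identity I(X,Y) = H(X) + H(Y) − H(X,Y) is easy to verify numerically; a sketch with a hypothetical joint distribution chosen for illustration:

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a probability array, in bits."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# A hypothetical joint distribution of two binary variables.
pxy = np.array([[0.4, 0.1],
                [0.1, 0.4]])
px, py = pxy.sum(axis=1), pxy.sum(axis=0)

# I(X,Y) = H(X) + H(Y) - H(X,Y); zero iff X and Y are independent.
mi = entropy(px) + entropy(py) - entropy(pxy)
print(mi)  # > 0 here, since the joint is not the product of the marginals
```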
Non-linear PCA
- Add a non-linear function g(·) in the formula for PCA
One-unit contrast functions
- Find one vector w so that w^T x equals one of the independent components s_i
- Related to projection pursuit
- Prior knowledge of the number of independent components is not needed
Negentropy
J(y) = H(y_Gauss) − H(y)
- The difference between the differential entropy of a Gaussian variable with the same variance as y and the differential entropy of y
- If the y_i are uncorrelated, the mutual information can be expressed as I(y_1, ..., y_m) = C − Σ_i J(y_i), where C is a constant
- J(y) can be approximated by higher-order cumulants, but the estimation is sensitive to outliers
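One such cumulant-based approximation, standard in this literature, is J(y) ≈ E[y^3]^2/12 + kurt(y)^2/48 for standardized y; a minimal sketch (function name is mine):

```python
import numpy as np

def negentropy_cumulant(y):
    """Approximate J(y) via 3rd/4th cumulants: E[y^3]^2/12 + kurt(y)^2/48.

    Standardizes y to zero mean and unit variance first. Sensitive to
    outliers, since it relies on higher-order moments.
    """
    y = (y - y.mean()) / y.std()
    skew_term = np.mean(y**3) ** 2 / 12.0
    kurt = np.mean(y**4) - 3.0          # excess kurtosis
    return skew_term + kurt**2 / 48.0

rng = np.random.default_rng(0)
print(negentropy_cumulant(rng.standard_normal(100_000)))  # ~0 for Gaussian
print(negentropy_cumulant(rng.laplace(size=100_000)))     # clearly > 0
```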
Algorithms
- Have x = As; want to find s = Wx
- Preprocessing:
–Centering of x
–Sphering (whitening) of x: find a transformation v = Qx such that E(vv^T) = I; found via PCA / SVD
- Sphering alone does not solve the problem
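A minimal whitening sketch via eigendecomposition of the covariance (assuming the covariance is full rank; the function name is my own):

```python
import numpy as np

def whiten(x):
    """Sphering: return v = Qx with E(vv^T) = I, plus the matrix Q."""
    xc = x - x.mean(axis=1, keepdims=True)    # centering
    C = xc @ xc.T / xc.shape[1]               # covariance matrix
    d, E = np.linalg.eigh(C)                  # C = E diag(d) E^T
    Q = E @ np.diag(1.0 / np.sqrt(d)) @ E.T   # Q = C^(-1/2)
    return Q @ xc, Q
```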
Algorithms (2)
- Non-linear decorrelation
–Jutten-Herault: cancel non-linear cross-correlations
–The non-diagonal terms of W are updated using products of non-linear functions of the outputs, and the y_i are computed iteratively as y = (I + W)^(-1) x
- Non-linear PCA
- FastICA, ..., etc.
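A sketch of the one-unit FastICA fixed-point iteration on whitened data, using the common g = tanh non-linearity (details such as the initialization and the convergence test are my own choices):

```python
import numpy as np

def fastica_one_unit(v, n_iter=100, seed=0):
    """One-unit FastICA on whitened data v (shape d x N): w^T v -> one IC.

    Fixed-point update: w <- E[v g(w^T v)] - E[g'(w^T v)] w, with g = tanh.
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(v.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        y = w @ v                                 # current projection
        w_new = (v * np.tanh(y)).mean(axis=1) - (1 - np.tanh(y)**2).mean() * w
        w_new /= np.linalg.norm(w_new)            # keep w on the unit sphere
        if np.abs(np.abs(w_new @ w) - 1) < 1e-9:  # converged (up to sign)
            return w_new
        w = w_new
    return w
```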
Summary
- Definitions of ICA
- Conditions for identifiability of the model
- Relations to other methods
- Contrast functions
–One-unit / multi-unit
–Mutual information / negentropy
- Applications of ICA
- Algorithms
Future research
- Noisy ICA
- Tailor-made methods for certain applications
- Use of time correlations when x is a stochastic process
- Time delays / echoes in the cocktail-party problem
- Non-linear ICA