Survey on ICA
Technical Report, Aapo Hyvärinen, 1999. http://www.icsi.berkeley.edu/~jagota/NCS
Outline
- 2nd-order methods: PCA / factor analysis
- Higher-order methods: projection pursuit / blind deconvolution
- ICA: definitions, criteria for identifiability, relations to other methods
- Applications
- Contrast functions
- Algorithms
General model
x = As + n
- x: observations
- A: mixing matrix
- n: noise
- s: latent variables (factors, independent components)
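As a quick illustration (my own sketch, not from the survey), the model can be simulated in a few lines of NumPy; the Laplacian sources, the mixing matrix, and the noise level are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent, non-Gaussian latent sources (Laplacian), 1000 samples.
s = rng.laplace(size=(2, 1000))

# An arbitrary full-rank mixing matrix and additive Gaussian noise.
A = np.array([[1.0, 0.5],
              [0.3, 2.0]])
n = 0.1 * rng.standard_normal((2, 1000))

# Observed mixtures: x = As + n.
x = A @ s + n
```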
Find a transformation s = f(x); consider only linear transformations:
s = Wx
Principal component analysis
- Find direction(s) w (of unit norm) where the variance of w^T x is maximized
- Equivalent to finding the eigenvectors of C = E(xx^T) corresponding to the k largest eigenvalues
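A minimal sketch of this eigendecomposition view of PCA (NumPy; the function name and the variables-in-rows data layout are my own choices):

```python
import numpy as np

def pca(x, k):
    """Top-k principal directions of data x (one variable per row)."""
    xc = x - x.mean(axis=1, keepdims=True)   # center the data
    C = xc @ xc.T / xc.shape[1]              # covariance C = E(xx^T)
    eigvals, eigvecs = np.linalg.eigh(C)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]    # pick the k largest
    return eigvecs[:, order], eigvals[order]
```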
Principal component analysis (figure)
Factor analysis
- Closely related to PCA: x = As + n
- Method of principal factors:
–Assumes the covariance matrix of the noise, E(nn^T), is known
–PCA on C = E(xx^T) − E(nn^T)
- Factors are not defined uniquely, but only up to a rotation
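A minimal sketch of the method of principal factors under these assumptions (NumPy; the function name and the convention that the noise covariance is supplied by the caller are my own framing):

```python
import numpy as np

def principal_factors(x, noise_cov, k):
    """Method of principal factors: PCA on E(xx^T) - E(nn^T).

    noise_cov is the (assumed known) covariance matrix of the noise.
    """
    xc = x - x.mean(axis=1, keepdims=True)
    C = xc @ xc.T / xc.shape[1] - noise_cov   # noise-corrected covariance
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1][:k]     # k largest eigenvalues
    return eigvecs[:, order], eigvals[order]
```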
Higher-order methods
- Projection pursuit
- Redundancy reduction
- Blind deconvolution
- These require the assumption that the data are not Gaussian
Projection pursuit
- Find a direction w such that w^T x has an 'interesting' distribution
- It has been argued that the interesting directions are those whose distributions are the least Gaussian
Differential entropy
H(y) = −∫ f(y) log f(y) dy
- Among densities of fixed variance, H is maximized when f is a Gaussian density
- Minimize H(w^T x) to find projection pursuit directions (y = w^T x)
- Drawback: the density of w^T x is difficult to estimate
Example: projection pursuit (figure)
Blind deconvolution
- Observe a filtered version of s(t): x(t) = g(t) * s(t)
- Find a filter h(t) such that s(t) = h(t) * x(t)
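The blind problem is to estimate h(t) from x(t) alone, exploiting the non-Gaussianity of s. As a sanity check of the model itself, here is a non-blind sketch (my own, not from the survey) where g is known and inverted in the frequency domain; circular convolution is used for simplicity and the signals are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# A spiky (non-Gaussian) source and a smoothing filter g.
s = rng.laplace(size=256)
g = np.array([1.0, 0.7, 0.3, 0.1])
G = np.fft.fft(g, n=len(s))

# Observation x = g * s (circular convolution, for simplicity).
x = np.real(np.fft.ifft(np.fft.fft(s) * G))

# With g known, the inverse filter is H(f) = 1/G(f); blind deconvolution
# must find such an h from x alone.
s_hat = np.real(np.fft.ifft(np.fft.fft(x) / G))
print(np.allclose(s, s_hat))  # True: s is recovered
```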
Example: blind deconvolution
- In seismic processing, known as "statistical deconvolution"
Blind deconvolution (3) (figure: the filter g(t) and the source s(t) plotted against time t)
Blind deconvolution (4) (figure)
ICA definitions
Definition 1 (general definition): ICA of a random vector x consists of finding a linear transformation s = Wx so that the components s_i are as independent as possible, in the sense of maximizing some function F(s_1, ..., s_m) that measures independence.
ICA definitions (2)
Definition 2 (noisy ICA): ICA of a random vector x consists of estimating the model x = As + n, where the latent variables s_i are assumed independent.
Definition 3 (noise-free ICA): the same model without the noise term: x = As.
Statistical independence
- ICA requires statistical independence
- Distinguish between statistically independent and uncorrelated variables
- Statistically independent: p(y_1, y_2) = p(y_1) p(y_2)
- Uncorrelated: E(y_1 y_2) = E(y_1) E(y_2)
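Uncorrelatedness does not imply independence. A quick numerical illustration (my own example): take y_1 symmetric around zero and y_2 = y_1^2; the two are uncorrelated yet y_2 is fully determined by y_1:

```python
import numpy as np

rng = np.random.default_rng(0)

# y1 is symmetric around zero, so E(y1 * y2) = E(y1**3) = 0 = E(y1)E(y2):
# the variables are uncorrelated, but y2 depends entirely on y1.
y1 = rng.standard_normal(100_000)
y2 = y1**2

print(np.corrcoef(y1, y2)[0, 1])     # ~0: uncorrelated
print(np.corrcoef(y1**2, y2)[0, 1])  # 1: clearly not independent
```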
Identifiability of the ICA model
- All the independent components, except possibly one, must be non-Gaussian
- The number of observed mixtures must be at least as large as the number of independent components: m >= n
- The matrix A must be of full column rank
- Note: with m < n, A may still be identifiable
Relations to other methods
- Redundancy reduction
- Noise-free case:
–Find 'interesting' projections
–Special case of projection pursuit
- Blind deconvolution
- Factor analysis for non-Gaussian data
- Related to non-linear PCA
Relations to other methods (2) (figure)
Applications of ICA
- Blind source separation
–Cocktail party problem
- Feature extraction
- Blind deconvolution
Blind source separation (figure)
Objective (contrast) functions
- ICA method = objective function + optimization algorithm
- Multi-unit contrast functions
–Find all independent components at once
- One-unit contrast functions
–Find one independent component (at a time)
Mutual information
I(y_1, ..., y_m) = ∫ f(y) log [ f(y) / ∏_i f_i(y_i) ] dy
- Mutual information is zero if and only if the y_i are independent
- Difficult to estimate, but approximations exist
Mutual information (2)
Alternative definition: I(y_1, ..., y_m) = Σ_i H(y_i) − H(y_1, ..., y_m)
Mutual information (3) (figure: Venn diagram relating H(X), H(Y), the conditional entropies H(X|Y) and H(Y|X), and the mutual information I(X,Y))
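For discrete variables, the diagram's identity I(X,Y) = H(X) + H(Y) − H(X,Y) is easy to verify numerically; a sketch with a hypothetical joint distribution chosen for illustration:

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a probability array, in bits."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# A hypothetical joint distribution of two binary variables.
pxy = np.array([[0.4, 0.1],
                [0.1, 0.4]])
px, py = pxy.sum(axis=1), pxy.sum(axis=0)

# I(X,Y) = H(X) + H(Y) - H(X,Y); zero iff X and Y are independent.
mi = entropy(px) + entropy(py) - entropy(pxy)
print(mi)  # > 0 here, since the joint is not the product of the marginals
```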
Non-linear PCA
- Add a non-linear function g(·) in the formula for PCA
One-unit contrast functions
- Find one vector w so that w^T x equals one of the independent components s_i
- Related to projection pursuit
- Prior knowledge of the number of independent components is not needed
Negentropy
J(y) = H(y_Gauss) − H(y)
- The difference between the differential entropy of a Gaussian variable with the same variance as y and the differential entropy of y
- If the y_i are uncorrelated, the mutual information can be expressed as I(y_1, ..., y_m) = C − Σ_i J(y_i), where C is a constant
- J(y) can be approximated by higher-order cumulants, but the estimation is sensitive to outliers
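One such cumulant-based approximation, standard in this literature, is J(y) ≈ E[y^3]^2/12 + kurt(y)^2/48 for standardized y; a minimal sketch (function name is mine):

```python
import numpy as np

def negentropy_cumulant(y):
    """Approximate J(y) via 3rd/4th cumulants: E[y^3]^2/12 + kurt(y)^2/48.

    Standardizes y to zero mean and unit variance first. Sensitive to
    outliers, since it relies on higher-order moments.
    """
    y = (y - y.mean()) / y.std()
    skew_term = np.mean(y**3) ** 2 / 12.0
    kurt = np.mean(y**4) - 3.0          # excess kurtosis
    return skew_term + kurt**2 / 48.0

rng = np.random.default_rng(0)
print(negentropy_cumulant(rng.standard_normal(100_000)))  # ~0 for Gaussian
print(negentropy_cumulant(rng.laplace(size=100_000)))     # clearly > 0
```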
Algorithms
- Have x = As; want to find s = Wx
- Preprocessing:
–Centering of x
–Sphering (whitening) of x: find a transformation v = Qx such that E(vv^T) = I; found via PCA / SVD
- Sphering alone does not solve the problem
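A minimal whitening sketch via eigendecomposition of the covariance (assuming the covariance is full rank; the function name is my own):

```python
import numpy as np

def whiten(x):
    """Sphering: return v = Qx with E(vv^T) = I, plus the matrix Q."""
    xc = x - x.mean(axis=1, keepdims=True)    # centering
    C = xc @ xc.T / xc.shape[1]               # covariance matrix
    d, E = np.linalg.eigh(C)                  # C = E diag(d) E^T
    Q = E @ np.diag(1.0 / np.sqrt(d)) @ E.T   # Q = C^(-1/2)
    return Q @ xc, Q
```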
Algorithms (2)
- Non-linear decorrelation
–Jutten-Herault: cancel non-linear cross-correlations
–The non-diagonal terms of W are updated using products of non-linear functions of the outputs, and the y_i are computed iteratively as y = (I + W)^(-1) x
- Non-linear PCA
- FastICA, ..., etc.
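A sketch of the one-unit FastICA fixed-point iteration on whitened data, using the common g = tanh non-linearity (details such as the initialization and the convergence test are my own choices):

```python
import numpy as np

def fastica_one_unit(v, n_iter=100, seed=0):
    """One-unit FastICA on whitened data v (shape d x N): w^T v -> one IC.

    Fixed-point update: w <- E[v g(w^T v)] - E[g'(w^T v)] w, with g = tanh.
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(v.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        y = w @ v                                 # current projection
        w_new = (v * np.tanh(y)).mean(axis=1) - (1 - np.tanh(y)**2).mean() * w
        w_new /= np.linalg.norm(w_new)            # keep w on the unit sphere
        if np.abs(np.abs(w_new @ w) - 1) < 1e-9:  # converged (up to sign)
            return w_new
        w = w_new
    return w
```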
Summary
- Definitions of ICA
- Conditions for identifiability of the model
- Relations to other methods
- Contrast functions
–One-unit / multi-unit
–Mutual information / negentropy
- Applications of ICA
- Algorithms
Future research
- Noisy ICA
- Tailor-made methods for certain applications
- Use of time correlations when x is a stochastic process
- Time delays / echoes in the cocktail-party problem
- Non-linear ICA