
1 PCA Networks: Unsupervised Learning Networks

2 PCA is a representation network useful for signal, image, and video processing.

3 PCA Networks. To analyze multi-dimensional input vectors, the representation that retains maximum information is principal component analysis (PCA). Per component: extract the most significant features; between components: avoid duplication or redundancy among the neurons.

4 An estimate of the autocorrelation matrix is obtained by taking the time average over the sample vectors: $R_x \approx \hat{R}_x = \frac{1}{M}\sum_t x(t)\,x^T(t)$, with eigen-decomposition $R_x = U\Lambda U^T$.
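A minimal numpy sketch of this estimate (the function name and the sample layout, one x(t) per row, are illustrative assumptions):

```python
import numpy as np

def autocorrelation_estimate(X):
    """Estimate R_x = (1/M) * sum_t x(t) x(t)^T from M sample vectors.

    X: array of shape (M, n), one sample vector x(t) per row.
    """
    M = X.shape[0]
    R_hat = X.T @ X / M                          # time-averaged outer products
    # Eigen-decomposition R_x = U Lambda U^T, eigenvalues sorted descending
    eigvals, U = np.linalg.eigh(R_hat)
    order = np.argsort(eigvals)[::-1]
    return R_hat, eigvals[order], U[:, order]
```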

5 The optimal matrix W is formed by the first m singular vectors (principal eigenvectors) of $R_x$, so that $x(t) \approx W\,a(t)$. The errors of this optimal estimate are [Jain89]: matrix 2-norm error $= \lambda_{m+1}$; least-mean-square error $= \sum_{i=m+1}^{n} \lambda_i$.
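A small numerical check of the least-mean-square formula, using synthetic data and the estimate above (sizes and scalings are assumptions of this sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, M = 8, 3, 5000
X = rng.standard_normal((M, n)) @ np.diag(np.linspace(3.0, 0.5, n))

R_hat = X.T @ X / M
eigvals, U = np.linalg.eigh(R_hat)
order = np.argsort(eigvals)[::-1]
eigvals, U = eigvals[order], U[:, order]

W = U[:, :m]                           # first m principal eigenvectors
X_rec = (X @ W) @ W.T                  # optimal rank-m reconstruction x ~ W a
mse = np.mean(np.sum((X - X_rec) ** 2, axis=1))
print(mse, eigvals[m:].sum())          # both equal the sum of trailing eigenvalues
```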

6 First PC. To enhance the correlation between the input x(t) and the extracted component a(t), it is natural to use a Hebbian-type rule: $a(t) = w(t)^T x(t)$, $w(t+1) = w(t) + \beta\,x(t)a(t)$.

7 Oja Learning Rule. The Oja learning rule is equivalent to a normalized Hebbian rule (show the procedure!): $\Delta w(t) = \beta\,[x(t)a(t) - w(t)\,a(t)^2]$.
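A minimal sketch of the single-unit Oja rule (function name, step size, and epoch count are illustrative assumptions):

```python
import numpy as np

def oja_first_pc(X, beta=0.001, epochs=50, seed=0):
    """Single-unit Oja rule: a = w^T x,  w <- w + beta * (x * a - w * a^2)."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(epochs):
        for x in X:
            a = w @ x                        # extracted component a(t)
            w += beta * (x * a - w * a * a)  # normalized Hebbian (Oja) update
    return w                                 # converges toward e_1 (up to sign)
```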

8

9 Convergence theorem: Single Component. By the Oja learning rule, w(t) converges asymptotically (with probability 1) to $w = w(\infty) = e_1$, where $e_1$ is the principal eigenvector of $R_x$.

10 Proof: Starting from $\Delta w(t) = \beta\,[x(t)a(t) - w(t)a(t)^2]$ and substituting $a(t) = w(t)^T x(t)$, $\Delta w(t) = \beta\,[x(t)x^T(t)w(t) - a(t)^2 w(t)]$. Taking the average over a block of data and re-denoting $\check{t}$ as the block time index (with $\sigma(\check{t})$ the block average of $a(t)^2$): $\Delta w(\check{t}) = \beta\,[R_x - \sigma(\check{t})I]\,w(\check{t}) = \beta\,[U\Lambda U^T - \sigma(\check{t})I]\,w(\check{t}) = \beta\,U[\Lambda - \sigma(\check{t})I]\,U^T w(\check{t})$, so $\Delta[U^T w(\check{t})] = \beta\,[\Lambda - \sigma(\check{t})I]\,U^T w(\check{t})$, i.e. $\Delta\Theta(\check{t}) = \beta\,[\Lambda - \sigma(\check{t})I]\,\Theta(\check{t})$.

11 Convergence Rates. With $\Theta(\check{t}) = [\theta_1(\check{t})\ \theta_2(\check{t})\ \ldots\ \theta_n(\check{t})]^T$, each eigen-component is enhanced/dampened by $\theta_i(\check{t}+1) = [1 + \beta'\lambda_i - \beta'\sigma(\check{t})]\,\theta_i(\check{t})$. The relative dominance of the principal component grows, with the i-th component changing relative to the first by the ratio $(1 + \beta'[\lambda_i - \sigma(\check{t})])\,/\,(1 + \beta'[\lambda_1 - \sigma(\check{t})])$.
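A toy iteration of this diagonal recursion (the eigenvalues, step size, and the choice of $\sigma$ as the output power are assumptions of the sketch), showing the non-principal components decaying relative to $\theta_1$:

```python
import numpy as np

# theta_i(t+1) = [1 + beta*(lambda_i - sigma(t))] * theta_i(t), in the eigenbasis;
# sigma(t) is taken as the output power lambda^T theta^2 (an assumption of this toy).
lambdas = np.array([3.0, 2.0, 1.0, 0.5])    # assumed eigenvalues of R_x
theta = np.full(4, 0.5)                     # initial coordinates Theta(0)
beta = 0.05
for _ in range(300):
    sigma = float(lambdas @ theta**2)       # roughly the averaged a(t)^2 = w^T R_x w
    theta = (1.0 + beta * (lambdas - sigma)) * theta
print(theta / np.linalg.norm(theta))        # approaches [1, 0, 0, 0]: first PC dominates
```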

12 Simulation: Decay Rates of PCs

13 How to Extract Multiple Principal Components

14 Let W denote an $n \times m$ weight matrix: $\Delta W(t) = \beta\,[x(t) - W(t)a(t)]\,a(t)^T$. Concern: duplication/redundancy between components.
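A minimal sketch of this multi-unit update (names and step size assumed). Note that this symmetric rule by itself only recovers the principal subspace, not the ordered individual components, which is exactly the duplication/redundancy concern that the deflation and APEX schemes below address:

```python
import numpy as np

def oja_subspace(X, m, beta=0.001, epochs=50, seed=0):
    """Multi-unit rule: a = W^T x,  Delta W = beta * (x - W a) a^T."""
    rng = np.random.default_rng(seed)
    W = 0.1 * rng.standard_normal((X.shape[1], m))
    for _ in range(epochs):
        for x in X:
            a = W.T @ x                          # output vector a(t)
            W += beta * np.outer(x - W @ a, a)   # subspace update
    return W                                     # columns span the principal subspace
```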

15 Deflation Method. Assume that the first component is already obtained; then the data can be "deflated" by the following transformation: $\tilde{x} = (I - w_1 w_1^T)\,x$.
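A sketch of the deflation scheme, reusing the single-unit Oja update from above; the function name and hyperparameters are illustrative:

```python
import numpy as np

def deflation_pca(X, m, beta=0.001, epochs=50, seed=0):
    """Extract m components one at a time, deflating the data after each."""
    rng = np.random.default_rng(seed)
    X_work = X.copy()
    W = []
    for _ in range(m):
        w = rng.standard_normal(X.shape[1])
        w /= np.linalg.norm(w)
        for _ in range(epochs):
            for x in X_work:
                a = w @ x
                w += beta * (x * a - w * a * a)    # single-unit Oja rule
        W.append(w.copy())
        X_work = X_work - np.outer(X_work @ w, w)  # x_tilde = (I - w w^T) x
    return np.array(W)                             # rows: successive principal components
```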

16 Lateral Orthogonalization Network. The basic idea is to allow the old hidden units to influence the new units so that the new ones do not duplicate information (in full or in part) already provided by the old units. In this way, the deflation process is effectively implemented in an adaptive manner.

17

18 APEX Network (multiple PCs)

19 APEX: Adaptive Principal-component Extractor. The Oja rule for the i-th component (e.g. i = 2): $\Delta w_i(t) = \beta\,[x(t)a_i(t) - w_i(t)\,a_i(t)^2]$. Dynamic Orthogonalization Rule (e.g. i = 2, j = 1): $\Delta\alpha_{ij}(t) = \beta\,[a_i(t)a_j(t) - \alpha_{ij}(t)\,a_i(t)^2]$.
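A two-unit APEX sketch combining these two rules with the laterally corrected output $a_2(t) = x^T(t)w_2(t) - \alpha(t)a_1(t)$ used on slide 21 (function name and hyperparameters assumed):

```python
import numpy as np

def apex_two_units(X, beta=0.001, epochs=50, seed=0):
    """First two components via APEX: Oja rule per unit plus a lateral weight alpha."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    w1 = rng.standard_normal(n); w1 /= np.linalg.norm(w1)
    w2 = rng.standard_normal(n); w2 /= np.linalg.norm(w2)
    alpha = 0.0
    for _ in range(epochs):
        for x in X:
            a1 = w1 @ x
            a2 = w2 @ x - alpha * a1                     # laterally corrected output
            w1 += beta * (x * a1 - w1 * a1 * a1)         # Oja rule, unit 1
            w2 += beta * (x * a2 - w2 * a2 * a2)         # Oja rule, unit 2 (i = 2)
            alpha += beta * (a1 * a2 - alpha * a2 * a2)  # orthogonalization (i = 2, j = 1)
    return w1, w2, alpha
```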

20 Convergence theorem: Multiple Components. The Hebbian weight matrix W(t) in APEX converges asymptotically to a matrix formed by the m largest principal components: with probability 1, $W(\infty) = W$, where W is the matrix formed by the m row vectors $w_i^T$, with $w_i = w_i(\infty) = e_i$.

21 For the second unit, with $a_2(t) = x^T(t)w_2(t) - \alpha(t)a_1(t)$, the update rules are $\Delta w_2(t) = \beta\,[x(t)a_2(t) - w_2(t)a_2(t)^2]$ and $\Delta\alpha(t) = \beta\,[a_1(t)a_2(t) - \alpha(t)a_2(t)^2]$. Premultiplying the first by $w_1^T$ (using $a_1(t) = w_1^T x(t)$): $\Delta[w_1^T w_2(t)] = \beta\,[a_1(t)a_2(t) - w_1^T w_2(t)\,a_2(t)^2]$. Subtracting the $\alpha$ update, $\Delta[w_1^T w_2(t) - \alpha(t)] = -\beta\,[w_1^T w_2(t) - \alpha(t)]\,a_2(t)^2$, i.e. $[w_1^T w_2(t+1) - \alpha(t+1)] = [1 - \beta\sigma(t)][w_1^T w_2(t) - \alpha(t)]$, so $w_1^T w_2(t) - \alpha(t) \to 0$ and $\alpha(t) \to w_1^T w_2(t)$. Then $a_2(t) = x^T(t)w_2(t) - \alpha(t)a_1(t) = x^T(t)[I - w_1 w_1^T]\,w_2(t)$: the deflated output.

22 Learning Rates of APEX. In block form, $[w_1^T w_2(\check{t}+1) - \alpha(\check{t}+1)] = [1 - \beta'\sigma(\check{t})][w_1^T w_2(\check{t}) - \alpha(\check{t})]$. Choosing $\beta' = 1/\sigma(\check{t})$ drives this difference to zero in one block step; equivalently $\beta = 1/[\sum_t a_2(t)^2]$, or with a forgetting factor, $\beta = 1/[\sum_t \gamma^t a_2(t)^2]$.
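A sketch of one way to realize this adaptive step size; the forgetting-factor convention (down-weighting older samples by $\gamma$) is an assumption, since the slide only lists the formula:

```python
import numpy as np

def adaptive_beta(a2_history, gamma=0.99):
    """beta = 1 / sum_t gamma^t * a2(t)^2 over the past unit-2 outputs.

    a2_history: 1-D sequence of a2(t) values, most recent last.
    gamma: assumed forgetting factor (gamma = 1 recovers beta = 1 / sum a2^2).
    """
    a2 = np.asarray(a2_history, dtype=float)
    weights = gamma ** np.arange(len(a2))[::-1]   # older samples weighted down
    denom = float(np.sum(weights * a2**2))
    return 1.0 / max(denom, 1e-12)                # guard against division by zero
```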

23 Other Extensions. PAPEX: Hierarchical Extraction. DCA: Discriminant Component Analysis. ICA: Independent Component Analysis.

