
1 Sparse Coding
Arthur Pece, aecp@cs.rug.nl

2 Outline
- Generative-model-based vision
- Linear, non-Gaussian, over-complete generative models
- The penalty method of Olshausen+Field and Harpur+Prager
- Matching pursuit
- The inhibition method
- An application to medical images
- A hypothesis about the brain

3 Generative-Model-Based Vision
A less-fuzzy definition of model-based vision, built on four basic principles (suggested by the speaker):
- Generative models
- Bayes' theorem (gives an objective function)
- Iterative optimization (for parameter estimation)
- Occam's razor (for model selection)

4 Why?
- A generative model and Bayes' theorem lead to a better understanding of what the algorithm is doing
- When the MAP solution cannot be found analytically, iterating between top-down and bottom-up becomes necessary (as in EM, Newton-like and conjugate-gradient methods)
- Models should not only be likely, but should also lead to precise predictions, hence (one interpretation of) Occam's razor

5 Linear Generative Models
x = A.s + n
- x is the observation vector (n samples/pixels)
- s is the source vector (m sources)
- A is the mixing matrix (n x m)
- n is the noise vector (n dimensions)
- The noise vector is really a part of the source vector
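As a concrete illustration (not from the talk), here is a minimal Python sketch that samples from such a model, assuming Laplacian (super-Gaussian) sources and Gaussian noise; the dimensions and scale parameters are arbitrary.

import numpy as np

rng = np.random.default_rng(0)
n, m = 64, 256                           # n pixels, m sources, over-complete: m > n
sigma, lam = 0.1, 1.0                    # noise std and Laplacian scale (illustrative)

A = rng.normal(size=(n, m))
A /= np.linalg.norm(A, axis=0)           # unit-norm dictionary columns

s = rng.laplace(scale=lam, size=m)       # sparse (super-Gaussian) sources
noise = rng.normal(scale=sigma, size=n)  # Gaussian noise
x = A @ s + noise                        # the linear generative model x = A.s + n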

6 Learning vs. Search/Perception
x = A.s + n
- Learning: given an ensemble X = {x_i}, maximize the posterior probability of the mixing matrix A
- Perception: given an instance x, maximize the posterior probability of the source vector s

7 MAP Estimation
From Bayes' theorem:
log p(A | X) = log p(X | A) + log p(A) - log p(X)   (marginalize over S)
log p(s | x) = log p(x | s) + log p(s) - log p(x)   (marginalize over A)

8 Statistical independence of the sources
- Why is A not the identity matrix?
- Why is p(s) super-Gaussian (lepto-kurtic)?
- Why m > n?

9 Why is A not the identity matrix?
- Pixels are not statistically independent: log p(x) ≠ Σ log p(x_i)
- Sources are (or should be) statistically independent: log p(s) = Σ log p(s_j)
- Thus, the log p.d.f. of an image is equal to the sum of the log p.d.f.'s of the coefficients, NOT to the sum of the log p.d.f.'s of the pixels.

10 Why is A not the identity matrix? (continued)
From the previous slide: Σ log p(c_j) ≠ Σ log p(x_i)
But, statistically, the estimated probability of an image is higher if the estimate is given by the sum of coefficient probabilities rather than by the sum of pixel probabilities:
E[Σ log p(c_j)] > E[Σ log p(x_i)]
This is equivalent to: H[p(c_j)] < H[p(x_i)]
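To spell out the last equivalence: the expected log-probability of a random variable is the negative of its entropy, so, summing over components, E[Σ_j log p(c_j)] = -Σ_j H[p(c_j)] and E[Σ_i log p(x_i)] = -Σ_i H[p(x_i)]. The inequality between the expectations is therefore the statement that the coefficient code has lower total entropy than the pixel code.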

11 Why m > n? Why is p(s) super-Gaussian?
- The image "sources" are edges; ultimately, objects
- Edges can be found at any image location and can have any orientation and intensity profile
- Objects can be found at any location in the scene and can have many different shapes
- There are many more (potential) edges or (potential) objects than pixels
- Most of these potential edges or objects are not found in a specific image

12 Linear non-Gaussian generative model
- Super-Gaussian prior p.d.f. of the sources
- Gaussian prior p.d.f. of the noise
log p(s | x,A) = log p(x | s,A) + log p(s) - log p(x | A)
log p(x | s,A) = log p(x - A.s) = log p(n) = - n.n/(2σ²) - log Z = - ||x - A.s||²/(2σ²) - log Z

13 Linear non-Gaussian generative model (continued)
Example: Laplacian p.d.f. of the sources:
log p(s) = - Σ |s_j| / λ - log Q
log p(s | x,A) = log p(x | s,A) + log p(s) - log p(x | A)
= - ||x - A.s||²/(2σ²) - Σ |s_j| / λ - const.
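A minimal Python sketch of the resulting objective (up to additive constants), assuming the Laplacian-prior model above; the function name and the default values of sigma and lam are illustrative.

import numpy as np

def neg_log_posterior(s, x, A, sigma=0.1, lam=1.0):
    """-log p(s | x, A) up to constants: ||x - A.s||^2/(2 sigma^2) + sum(|s|)/lambda."""
    r = x - A @ s
    return r @ r / (2 * sigma**2) + np.abs(s).sum() / lam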

14 Summary
- Generative-model-based vision
- Learning vs. Perception
- Over-complete expansions
- Sparse prior distribution of sources
- Linear over-complete generative model with Laplacian prior distribution for the sources

15 The Penalty Method: Coding
Gradient-based optimization of the log-posterior probability of the coefficients:
(d/ds) log p(s | x) = Aᵀ.(x - A.s)/σ² - sign(s)/λ
Note: as the noise variance tends to zero, the quadratic term dominates the right-hand side and the MAP estimate could be obtained by solving a linear system. However, if m > n, minimizing a quadratic objective function alone would spread the image energy over non-orthogonal coefficients.
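A minimal sketch of this coding step as (sub)gradient ascent on the log-posterior from slide 13; the step size and iteration count are illustrative, not values from the talk.

import numpy as np

def penalty_code(x, A, sigma=0.1, lam=1.0, step=0.01, n_iter=500):
    """Gradient ascent on log p(s | x): quadratic data term plus sparse penalty."""
    s = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (x - A @ s) / sigma**2 - np.sign(s) / lam
        s += step * grad
    return s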

16 Linear inference from linear generative models with Gaussian prior p.d.f.
- The logarithm of a multivariate Gaussian is a weighted sum of squares
- The gradient of a sum of squares is a linear function
- The MAP solution is the solution of a linear system
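For concreteness (assuming a zero-mean Gaussian prior on s with variance λ², which is not the prior used later in this talk), the linear system has the familiar regularized least-squares solution:
s_MAP = (Aᵀ.A + (σ²/λ²).I)⁻¹.Aᵀ.x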

17 Non-linear inference from linear generative models with non-Gaussian prior p.d.f.
- The logarithm of a multivariate non-Gaussian p.d.f. is NOT a weighted sum of squares
- The gradient of the log of a non-Gaussian p.d.f. is NOT a linear function
- The MAP solution is NOT the solution of a linear system: in general, no analytical solution exists (this is why over-complete bases are not popular)

18 PCA, ICA, SCA
- PCA generative model: multivariate Gaussian -> closed-form solution
- ICA generative model: non-Gaussian -> iterative optimization over the image ensemble
- SCA generative model: over-complete, non-Gaussian -> iterate for each image for perception, and over the image ensemble for learning

19 The Penalty Method: Learning
Gradient-based optimization of the log-posterior probability* of the mixing matrix:
ΔA = - A.(z.cᵀ + I)
where z_j = (d/ds_j) log p(s_j) and c is the MAP estimate of s
* actually the log-likelihood
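A minimal sketch of the alternating learning loop, reusing the penalty_code sketch above and assuming the Laplacian prior, for which z = -sign(c)/λ; the learning rate, epoch count, and column renormalization are illustrative choices rather than details from the talk.

import numpy as np

def learn_dictionary(X, A, sigma=0.1, lam=1.0, lr=0.01, n_epochs=10):
    """Alternate MAP coding of each image with a gradient step on the mixing matrix."""
    for _ in range(n_epochs):
        for x in X:                              # X: iterable of image vectors
            c = penalty_code(x, A, sigma, lam)   # MAP source estimate (slide 15 sketch)
            z = -np.sign(c) / lam                # z_j = (d/ds_j) log p(s_j), Laplacian prior
            A += lr * (-A @ (np.outer(z, c) + np.eye(A.shape[1])))  # ΔA = -A.(z.cᵀ + I)
        A /= np.linalg.norm(A, axis=0)           # keep dictionary columns unit-norm
    return A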

20 Summary
- Generative-model-based vision
- Learning vs. Perception
- Over-complete expansions
- Sparse prior distribution of sources
- Linear over-complete generative model with Laplacian prior distribution for the sources
- Iterative coding as MAP estimation of sources
- Learning an over-complete expansion

21 Vector Quantization
- General VQ: k-means clustering of signals/images
- Shape-gain VQ: clustering on the unit sphere (after a change from Cartesian to polar coordinates)
- Iterated VQ: iterative VQ of the residual signal/image
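As an illustration (not from the talk), a minimal sketch of the shape-gain encoding step, assuming a codebook of unit-norm "shape" codewords; the function name is hypothetical.

import numpy as np

def shape_gain_encode(x, codebook):
    """codebook: (n, K) matrix of unit-norm shape codewords.
    Returns the index of the best-matching shape and the signed gain."""
    proj = codebook.T @ x             # correlation of x with each shape
    k = int(np.argmax(np.abs(proj)))  # best-matching shape (up to sign)
    return k, proj[k]                 # gain = projection onto that shape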

22 Matching Pursuit
Iterative shape-gain vector quantization:
1. Projection of the residual image onto all expansion images
2. Selection of the largest (in absolute value) projection
3. Updating of the corresponding coefficient
4. Subtraction of the updated component from the residual image
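A minimal sketch of this loop, assuming a dictionary A with unit-norm columns as in the earlier sketches; the fixed iteration count stands in for whatever stopping criterion was actually used.

import numpy as np

def matching_pursuit(x, A, n_iter=50):
    """A: (n, m) dictionary with unit-norm columns. Returns coefficients c with x ≈ A.c."""
    c = np.zeros(A.shape[1])
    r = x.copy()                         # residual image
    for _ in range(n_iter):
        proj = A.T @ r                   # project the residual onto all expansion images
        j = int(np.argmax(np.abs(proj))) # largest projection in absolute value
        c[j] += proj[j]                  # update the corresponding coefficient
        r -= proj[j] * A[:, j]           # subtract the updated component from the residual
    return c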

23 Inhibition Method
Similar iteration structure, but more than one coefficient is updated per iteration:
1. Projection of the residual image onto all expansion images
2. Selection of the largest (in absolute value) k projections
3. Selection of orthogonal elements within this reduced set
4. Updating of the corresponding coefficients
5. Subtraction of the updated components from the residual image
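A rough sketch of one such iteration, under the assumption that "orthogonal elements" means dictionary elements selected greedily so that their mutual correlations stay below a threshold; k, the threshold, and the selection rule are illustrative guesses and may differ from the method described in the talk.

import numpy as np

def inhibition_step(r, A, k=10, coherence_thresh=0.1):
    """One iteration: keep up to k large projections whose dictionary elements
    are (nearly) orthogonal to each other, then update the residual."""
    proj = A.T @ r
    candidates = np.argsort(-np.abs(proj))[:k]   # k largest projections (absolute value)
    chosen = []
    for j in candidates:                         # greedy inhibition of overlapping elements
        if all(abs(A[:, j] @ A[:, i]) < coherence_thresh for i in chosen):
            chosen.append(j)
    for j in chosen:
        r = r - proj[j] * A[:, j]                # subtract the updated components
    return {j: proj[j] for j in chosen}, r       # coefficient updates and new residual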

24 MacKay Diagram

25 Selection in Matching Pursuit

26 Selection in the Inhibition Method

27 Encoding natural images: Lena

28 Encoding natural images: a landscape

29 Encoding natural images: a bird

30 Comparison to the penalty method

31 Visual comparisons: JPEG, inhibition method, penalty method

32 Expanding the dictionary

33 An Application to Medical Images
- X-ray images decomposed by means of matching pursuit
- Image reconstruction by optimally re-weighting the components obtained by matching pursuit
- Thresholding to detect micro-calcifications
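A rough sketch of this pipeline, assuming coefficients c obtained with the matching_pursuit sketch above; how the "optimal" weights are chosen is not specified on the slide, so they are left as an input, and the detection threshold is a placeholder.

import numpy as np

def weighted_reconstruction(c, A, weights):
    """Re-weight the matching-pursuit coefficients before reconstructing the image."""
    return A @ (weights * c)

def detect_candidates(recon, threshold):
    """Binary map of pixels whose weighted reconstruction exceeds the threshold."""
    return recon > threshold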

34 Tumor detection in mammograms

35 Residual image after several matching pursuit iterations

36 Image reconstructed from matching-pursuit components

37 Weighted reconstruction

38 Receiver Operating Curve

39 A Hypothesis about the Brain
Some facts:
- All input to the cerebral cortex is relayed through the thalamus: e.g. all visual input from the retina is relayed through the LGN
- Connections between cortical areas and thalamic nuclei are always reciprocal
- Feedback to the LGN seems to be negative
Hypothesis: cortico-thalamic loops minimize prediction error

40 Additional references
- Donald MacKay (1956)
- D. Field (1994)
- Harpur and Prager (1995)
- Lewicki and Olshausen (1999)
- Yoshida (1999)

