
1 Latent Variable / Hierarchical Models in Computational Neural Science Ying Nian Wu UCLA Department of Statistics March 30, 2011

2 Outline
- Latent variable models in statistics
- Primary visual cortex (V1)
- Modeling and learning in V1
- Layered hierarchical models
Joint work with Song-Chun Zhu and Zhangzhang Si

3 Latent variable models: hidden variables and observed data. Learning from examples; inference of the hidden variables.

4 Latent variable models Mixture model Factor analysis

5 Latent variable models
Hidden variables, observed data.
Learning from examples: maximum likelihood, via EM or gradient ascent.
Inference / explaining away: E-step / imputation of the hidden variables.
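As a concrete instance of this learning/inference loop, here is a minimal EM sketch for a two-component Gaussian mixture; the toy data and all variable names are illustrative and not part of the talk.

```python
import numpy as np

def em_gaussian_mixture(y, n_iter=100):
    """EM for a two-component 1-D Gaussian mixture: z is hidden, y is observed."""
    # Crude initialization of mixing weights, means, and variances.
    pi = np.array([0.5, 0.5])
    mu = np.array([y.min(), y.max()])
    var = np.array([y.var(), y.var()])
    for _ in range(n_iter):
        # E-step (inference / imputation): posterior responsibility p(z = k | y).
        dens = np.exp(-0.5 * (y[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        resp = pi * dens
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step (learning): maximum likelihood given the imputed responsibilities.
        nk = resp.sum(axis=0)
        pi = nk / len(y)
        mu = (resp * y[:, None]).sum(axis=0) / nk
        var = (resp * (y[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var

# Toy usage: observed data drawn from two hidden clusters.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 200)])
print(em_gaussian_mixture(y))
```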

6 Computational neural science
Z: internal representation by neurons (hidden).
Y: sensory data from the outside environment (observed).
Parameters: connection weights.
Hierarchical extension: modeling Z by another layer of hidden variables.
Inference / explaining away: explaining Y.

7 Visual cortex: layered hierarchical architecture (source: Scientific American, 1999). V1: primary visual cortex; simple cells and complex cells; bottom-up/top-down connections.

8 Simple V1 cells (Daugman, 1985). Gabor wavelets: localized sine and cosine waves. The family is generated by translation, rotation, and dilation of this function.
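A small sketch of how such a Gabor wavelet (the standard model of a simple-cell receptive field) can be generated; the parameter names and default values below are illustrative choices, not figures from the talk.

```python
import numpy as np

def gabor(size=17, wavelength=6.0, sigma=3.0, theta=0.0, phase=0.0):
    """Localized sine/cosine wave: a Gaussian envelope times a sinusoid.
    Shifting the center, rotating theta, and scaling sigma/wavelength give
    the translated, rotated, and dilated copies of the same function."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates by theta.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    carrier = np.cos(2 * np.pi * xr / wavelength + phase)  # phase=0: cosine; phase=pi/2: sine
    return envelope * carrier

# A small bank of filters at several orientations, as used to model V1 simple cells.
bank = [gabor(theta=k * np.pi / 8) for k in range(8)]
```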

9 From image pixels to V1 simple cells: simple cells respond to edges.

10 Complex V1 cells (Riesenhuber and Poggio, 1999). Image pixels → V1 simple cells → V1 complex cells. Complex cells pool simple-cell responses by a local max (or local sum), giving a larger receptive field that is less sensitive to deformation.
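A sketch of the local max pooling that turns simple-cell response maps into complex-cell responses; the array shapes and window size are assumptions for illustration.

```python
import numpy as np

def complex_cell_responses(simple_maps, pool=3):
    """simple_maps: array of shape (n_orientations, H, W) of simple-cell responses.
    Each complex cell takes a local max over a small spatial window, giving a
    larger receptive field that tolerates small shifts and deformations."""
    n, H, W = simple_maps.shape
    r = pool // 2
    out = np.zeros_like(simple_maps)
    for i in range(H):
        for j in range(W):
            window = simple_maps[:, max(0, i - r):i + r + 1, max(0, j - r):j + r + 1]
            out[:, i, j] = window.max(axis=(1, 2))
    return out
```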

11 Independent Component Analysis (Bell and Sejnowski, 1996): heavy-tailed (Laplacian/Cauchy) source distributions.
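One way to make this concrete is to run ICA on image patches; the sketch below uses scikit-learn's FastICA, with random placeholder data standing in for real natural-image patches.

```python
import numpy as np
from sklearn.decomposition import FastICA

# patches: matrix of flattened image patches, one per row; here a random
# stand-in for, e.g., 50000 patches of size 12x12 sampled from natural images.
rng = np.random.default_rng(0)
patches = rng.standard_normal((50000, 144))
patches -= patches.mean(axis=0)                 # center before ICA

ica = FastICA(n_components=100, max_iter=500, random_state=0)
sources = ica.fit_transform(patches)            # independent, heavy-tailed components
basis = ica.mixing_                              # on real patches, columns resemble Gabor-like edge filters
```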

12 Hyvarinen, 2000

13 Sparse coding (Olshausen and Field, 1996): sparsity-inducing priors on the coefficients, such as Laplacian, Cauchy, or a mixture of Gaussians.

14 Sparse coding / variable selection
Inference: sparsification, which is non-linear (lasso / basis pursuit / matching pursuit); mode and uncertainty of p(C|I); explaining-away, lateral inhibition.
Learning: a dictionary of representational elements (regressors).
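A minimal matching pursuit sketch of the inference step: elements are selected greedily, and each selection explains away part of the image by residual subtraction. The dictionary and image here are placeholders, not the dictionaries learned in the talk.

```python
import numpy as np

def matching_pursuit(image_vec, dictionary, n_select=60):
    """Greedy sparse coding: repeatedly pick the dictionary element best
    correlated with the residual, record its coefficient, and subtract its
    contribution (the explaining-away / lateral-inhibition step).
    dictionary: (d, K) matrix whose columns are unit-norm basis elements."""
    residual = image_vec.astype(float).copy()
    selected, coeffs = [], []
    for _ in range(n_select):
        responses = dictionary.T @ residual
        k = int(np.argmax(np.abs(responses)))
        c = responses[k]
        selected.append(k)
        coeffs.append(c)
        residual -= c * dictionary[:, k]
    return selected, coeffs, residual
```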

15 Olshausen and Field, 1996

16 Restricted Boltzmann Machine (Hinton, Osindero and Teh, 2006). Binary hidden and visible units; P(V|H) and P(H|V) are both factorized, so there is no explaining-away.
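A sketch of the factorized conditionals of a binary RBM, which is exactly the "no explaining-away" property: given V, the hidden units are sampled independently, and vice versa. The weight shapes and variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v, W, b_hidden, b_visible):
    """One block-Gibbs sweep of a binary RBM.
    P(H|V) and P(V|H) both factorize, so each layer is sampled in one shot."""
    p_h = sigmoid(v @ W + b_hidden)               # independent Bernoulli probabilities
    h = (rng.random(p_h.shape) < p_h).astype(float)
    p_v = sigmoid(h @ W.T + b_visible)
    v_new = (rng.random(p_v.shape) < p_v).astype(float)
    return h, v_new

# Toy usage with random weights: 64 visible units, 32 hidden units.
W = 0.01 * rng.standard_normal((64, 32))
v0 = (rng.random(64) < 0.5).astype(float)
h, v1 = gibbs_step(v0, W, np.zeros(32), np.zeros(64))
```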

17 Energy-based model (Teh, Welling, Osindero and Hinton, 2003)
Features, no explaining-away.
Maximum entropy with given marginals; exponential family with sufficient statistics (Zhu, Wu, and Mumford, 1997; Wu, Liu, and Zhu, 2000).
Markov random field / Gibbs distribution.
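For reference, the generic maximum-entropy / exponential-family form alluded to here can be written schematically as follows, with $h_k(I)$ standing for the feature statistics, $q(I)$ for the reference distribution, and $\lambda_k$ chosen so that the model marginals of $h_k$ match the observed ones:

\[
p(I;\lambda) \;=\; \frac{1}{Z(\lambda)}\,
\exp\Big\{ \sum_{k=1}^{K} \langle \lambda_k,\, h_k(I)\rangle \Big\}\, q(I),
\qquad
Z(\lambda) \;=\; \int \exp\Big\{ \sum_{k} \langle \lambda_k,\, h_k(I)\rangle \Big\}\, q(I)\, dI .
\]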

18 Zhu, Wu, and Mumford, 1997 Wu, Liu, and Zhu, 2000

19 Visual cortex: layered hierarchical architecture, bottom-up/top-down (source: Scientific American, 1999). What lies beyond V1? A hierarchical model?

20 Hierarchical ICA / energy-based model? Larger features; nonlinearities must be introduced; purely bottom-up.

21 Hierarchical RBM (Hinton, Osindero and Teh, 2006)
P(V,H) = P(H)P(V|H), with the prior P(H) in turn modeled by another layer, P(V',H).
Unfolding, untying, re-learning; discriminative correction by back-propagation.

22 Hierarchical sparse coding
Attributed sparse coding elements: transformation group, topological neighborhood system.
Layer above: further coding of the attributes of the selected sparse coding elements.

23 Active basis model (Wu, Si, Gong, Zhu, 2010; Zhu, Guo, Wang, Xu, 2005). An n-stroke template, n = 40 to 60, within a 100x100 box.

24 Active basis model (Wu, Si, Gong, Zhu, 2010; Zhu et al., 2005; Yuille, Hallinan, Cohen, 1992). An n-stroke template, n = 40 to 60, within a 100x100 box.

25 Simplicity
Simplest AND-OR graph (Pearl, 1984; Zhu, Mumford, 2006): AND composition and OR perturbations or variations of the basis elements.
Simplest shape model: average + residual.
Simplest modification of the Olshausen-Field model: further sparse coding of the attributes of the sparse coding elements.

26 Bottom layer: sketch against texture
Only a marginal q(c) needs to be pooled as the null hypothesis: natural images → the explicit q(I) of Zhu, Mumford, 1997; this image → the explicit q(I) of Zhu, Wu, Mumford, 1997.
Maximum entropy (Della Pietra, Della Pietra, Lafferty, 1997; Zhu, Wu, Mumford, 1997; Jin, S. Geman, 2006; Wu, Guo, Zhu, 2008).
Special case: density substitution (Friedman, 1987; Jin, S. Geman, 2006): p(C, U) = p(C) p(U|C) ≈ p(C) q(U|C) = p(C) q(U, C)/q(C).

27 Shared sketch algorithm: maximum likelihood learning
Prototype: shared matching pursuit (closed-form computation); finding n strokes to sketch M images simultaneously (n = 60, M = 9).
Step 1: two max operations to explain the images by maximum likelihood; no early decision on edge detection.
Step 2: arg-max for inferring the hidden variables.
Step 3: arg-max explains away, and thus inhibits (matching pursuit; Mallat, Zhang, 1993).
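A highly simplified sketch of the shared-sketch idea: at each step, sum the best (max-over-perturbation) responses across all M training images to pick one element, then let the chosen element inhibit nearby responses so it is not selected again. The array shapes, the perturbation set, and the inhibition rule below are illustrative assumptions, not the exact algorithm in the talk.

```python
import numpy as np

def shared_sketch(responses, n_strokes=60, shift=2, inhibit=3):
    """responses: (M, K, H, W) filter responses for M images and K oriented
    Gabor elements on an H x W grid (a stand-in for real SUM1 maps)."""
    M, K, H, W = responses.shape
    R = responses.copy()
    template = []
    for _ in range(n_strokes):
        # Step 1: local max over small shifts (perturbations), summed over images.
        local_max = np.stack([
            np.roll(R, (dy, dx), axis=(2, 3))
            for dy in range(-shift, shift + 1)
            for dx in range(-shift, shift + 1)
        ]).max(axis=0)
        pooled = local_max.sum(axis=0)               # shared score for each candidate element
        k, y, x = np.unravel_index(np.argmax(pooled), pooled.shape)
        template.append((k, y, x))
        # Steps 2-3: each image would arg-max its own perturbed copy (omitted here);
        # the selected element explains away / inhibits nearby responses.
        R[:, :, max(0, y - inhibit):y + inhibit + 1,
              max(0, x - inhibit):x + inhibit + 1] = 0
    return template
```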

28 Cortex-like sum-max maps: maximum likelihood inference
Bottom-up sum-max scoring (no early edge decision); top-down arg-max sketching.
SUM1 layer: simple V1 cells of Olshausen, Field, 1996. MAX1 layer: complex V1 cells of Riesenhuber, Poggio, 1999. Scan over multiple resolutions.
1. Reinterpreting MAX1: OR-node of an AND-OR graph; MAX stands in for ARG-MAX in the max-product algorithm.
2. Sticking to the Olshausen-Field sparse top-down model: AND-node of the AND-OR graph → active basis; SUM2-layer "neurons" memorize shapes by sparse connections to the MAX1 layer → hierarchical, recursive AND-OR / SUM-MAX.
Architecture: more top-down than bottom-up. Neurons: more representational than operational (OR-neurons/AND-neurons).
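A schematic of the bottom-up pass only, with made-up shapes: SUM1 as Gabor-filter responses, MAX1 as a local max over perturbations, and SUM2 as a sparse weighted sum over the template's selected elements. This is a reading of the slide, not the exact scoring function.

```python
import numpy as np

def sum_max_score(sum1, template, shift=2):
    """sum1: (K, H, W) simple-cell (SUM1) response maps.
    template: list of (k, y, x, weight) sparse connections learned for a shape.
    Returns the SUM2 score: each element contributes its MAX1 response, i.e.
    the max of SUM1 over small local perturbations around its nominal position;
    the arg-max locations would give the top-down sketch."""
    K, H, W = sum1.shape
    score = 0.0
    for k, y, x, w in template:
        y0, y1 = max(0, y - shift), min(H, y + shift + 1)
        x0, x1 = max(0, x - shift), min(W, x + shift + 1)
        max1 = sum1[k, y0:y1, x0:x1].max()   # MAX1: local max = OR over perturbations
        score += w * max1                    # SUM2: weighted sum over selected elements
    return score
```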

29 Bottom-up scoring and top-down sketching
Bottom-up detection: SUM1 → MAX1 → SUM2. Top-down sketching: SUM2 → arg-MAX1 → SUM1.
Sparse selective connections as a result of learning; explaining-away in learning but not in inference.

30

31 Scan over multiple resolutions and orientations (rotating template)

32 Classification based on the log likelihood ratio score (Freund, Schapire, 1995; Viola, Jones, 2004).

33 Adjusting the active basis model by L2-regularized logistic regression (by Ruixun Zhang)
Exponential family model with q(I) negatives → logistic regression for p(class | image), partial likelihood.
Generative learning without negative examples → selected basis elements and inferred hidden variables.
Discriminative adjustment with hugely reduced dimensionality → correcting the conditional independence assumption; L2-regularized logistic regression → re-estimated lambda's.
Conditional on: (1) the selected basis elements and (2) the inferred hidden variables; (1) and (2) come from generative learning.
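A plausible sketch of this adjustment step: keep the generatively selected elements and their arg-max-inferred responses fixed, then re-estimate the element weights (the lambdas) by L2-regularized logistic regression against negative examples. The feature construction and variable names are assumptions; random data stands in for real responses.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# X_pos, X_neg: (n_images, n_elements) matrices of MAX1 responses at the
# template's selected elements, with hidden perturbations already inferred
# by arg-max (placeholder random data stands in for the real features).
rng = np.random.default_rng(0)
X_pos = rng.normal(1.0, 1.0, size=(200, 50))
X_neg = rng.normal(0.0, 1.0, size=(200, 50))

X = np.vstack([X_pos, X_neg])
y = np.concatenate([np.ones(len(X_pos)), np.zeros(len(X_neg))])

# The L2 penalty shrinks the re-estimated lambdas; C controls its strength.
clf = LogisticRegression(penalty="l2", C=1.0, max_iter=1000).fit(X, y)
adjusted_lambdas = clf.coef_.ravel()
```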

34 Active basis templates vs. Adaboost templates, for varying numbers of negative examples.
Active basis: arg-max inference and explaining away, no reweighting; residual images neutralize existing elements; the same set of training examples throughout.
Adaboost: no arg-max inference or explaining-away inhibition; reweighted examples neutralize existing classifiers; the set of examples changes.
Adaboost templates are shown both with double the number of elements and with the same number of elements.

35 Mixture model of active basis templates, fitted by EM / maximum likelihood with random initialization (MNIST, 500 in total).

36 Learning active basis models from non-aligned images: EM-type maximum likelihood learning, initialized by single-image learning.

37 Learning active basis models from non-aligned images.

38

39

40 Hierarchical active basis, by Zhangzhang Si et al.
AND-OR graph: Pearl, 1984; Zhu, Mumford, 2006.
Compositionality and reusability: Geman, Potter, Chi, 2002; L. Zhu, Lin, Huang, Chen, Yuille, 2008.
Part-based method: everyone et al.
Latent SVM: Felzenszwalb, McAllester, Ramanan, 2008.
Constellation model: Weber, Welling, Perona, 2000.
(Examples ranked from low log-likelihood to high log-likelihood.)

41 Simplicity: the simplest and purest recursive two-layer AND-OR graph; the simplest generalization of the active basis model.

42 AND-OR graph and SUM-MAX maps: maximum likelihood inference. Cortex-like, related to Riesenhuber, Poggio, 1999. Bottom-up sum-max scoring; top-down arg-max sketching.

43 Hierarchical active basis by Zhangzhang Si et al.

44

45

46

47 Shape script by composing active basis shape motifs. Elementary geometric shapes (shape motifs) are represented by active bases (Si, Wu, 2010). Geometry = a sketch that can be parametrized.

48 Summary
Bottom layer: Olshausen-Field (foreground) + Zhu-Wu-Mumford (background).
Maximum entropy tilting (Della Pietra, Della Pietra, Lafferty, 1997): white noise → texture (high entropy) → sketch (low and mid entropy), reversing the central-limit-theorem effect of information scaling.
Build up layers: (1) AND-OR, SUM-MAX (top-down arg-MAX); (2) perpetual sparse coding: further coding of the attributes of the current sparse coding elements; (a) residuals of attributes → continuous OR-nodes; (b) mixture model → discrete OR-nodes.

