Learning Inhomogeneous Gibbs Models (Ce Liu)

1 Learning Inhomogeneous Gibbs Models
Ce Liu, celiu@microsoft.com

2 How to Describe the Virtual World

3 Histogram
- Histogram: marginal distribution of image variances
- Non-Gaussian distributed

4 Texture Synthesis (Heeger et al., 1995)
- Image decomposition by steerable filters
- Histogram matching

5 FRAME (Zhu et al., 1997)
- Homogeneous Markov random field (MRF)
- Minimax entropy principle to learn a homogeneous Gibbs distribution
- Gibbs sampling and feature selection

6 Our Problem
- To learn the distribution of structural signals
- Challenges:
  - How to learn non-Gaussian distributions in high dimensions from a small number of observations?
  - How to capture the sophisticated properties of the distribution?
  - How to optimize parameters with global convergence?

7 Inhomogeneous Gibbs Models (IGM)
A framework to learn arbitrary high-dimensional distributions:
- 1D histograms on linear features to describe the high-dimensional distribution
- Maximum entropy principle: Gibbs distribution
- Minimum entropy principle: feature pursuit
- Markov chain Monte Carlo for parameter optimization
- Kullback-Leibler feature (KLF)

8 1D Observation: Histograms
- Feature φ(x): R^d → R
  - Linear feature: φ(x) = w^T x
  - Kernel distance: φ(x) = ||x − x₀|| (distance to a reference point x₀)
- Marginal distribution
- Histogram
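As an illustration of such a 1D observation, here is a minimal sketch (not from the talk; names and array shapes are assumptions) of computing the histogram of a linear feature response:

```python
import numpy as np

def feature_histogram(X, w, n_bins=32):
    """Histogram of the 1D marginal obtained by projecting samples X onto w.

    X : (n_samples, d) array of observations
    w : (d,) linear feature direction, phi(x) = w^T x
    """
    responses = X @ w                        # 1D feature responses
    counts, edges = np.histogram(responses, bins=n_bins)
    hist = counts / counts.sum()             # normalize to a discrete marginal
    return hist, edges

# Toy usage: random 2D samples projected onto an arbitrary unit direction
X = np.random.randn(500, 2)
w = np.array([1.0, 0.5]) / np.linalg.norm([1.0, 0.5])
hist, edges = feature_histogram(X, w)
```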

9 Intuition

10 Learning Descriptive Models

11
- Sufficient features can make the learnt model f(x) converge to the underlying distribution p(x)
- Linear features and histograms are robust compared with other high-order statistics
- Descriptive models

12 Maximum Entropy Principle
- Maximum entropy model
  - To generalize the statistical properties of the observed data
  - To make the learnt model carry no more information than what is available
- Mathematical formulation
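The mathematical formulation can be written, in assumed notation (φ_i are the selected features, H_i^obs their observed marginal histograms), as:

```latex
% Maximum entropy under marginal-histogram constraints
\max_{p}\; -\!\int p(\mathbf{x})\,\log p(\mathbf{x})\,d\mathbf{x}
\quad\text{s.t.}\quad
\mathbb{E}_{p}\!\left[\delta\!\left(z-\varphi_i(\mathbf{x})\right)\right]=H_i^{\mathrm{obs}}(z),
\;\; i=1,\dots,K,
\qquad \int p(\mathbf{x})\,d\mathbf{x}=1 .
```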

13 Intuition of Maximum Entropy Principle

14 Inhomogeneous Gibbs Distribution
- Solution form of the maximum entropy model
- Parameter: Gibbs potential
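The maximum entropy solution takes the Gibbs (exponential-family) form; in the same assumed notation:

```latex
p(\mathbf{x};\Lambda)=\frac{1}{Z(\Lambda)}
\exp\!\Big\{-\sum_{i=1}^{K}\lambda_i\!\big(\varphi_i(\mathbf{x})\big)\Big\},
\qquad
Z(\Lambda)=\int \exp\!\Big\{-\sum_{i=1}^{K}\lambda_i\!\big(\varphi_i(\mathbf{x})\big)\Big\}\,d\mathbf{x},
```

where each Gibbs potential λ_i(·) is a 1D function of the feature response, represented in practice as a vector over the histogram bins, and Λ = {λ_1, ..., λ_K}.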

15 Estimating Potential Functions
- Distribution form
- Normalization
- Maximum likelihood estimation (MLE)
- 1st- and 2nd-order derivatives
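With observed samples x_1, ..., x_M, the log-likelihood and its first derivative take the standard form (assumed notation as above); the gradient vanishes exactly when the model marginals match the observed histograms:

```latex
\ell(\Lambda)=\sum_{m=1}^{M}\log p(\mathbf{x}_m;\Lambda)
=-\sum_{m=1}^{M}\sum_{i=1}^{K}\lambda_i\!\big(\varphi_i(\mathbf{x}_m)\big)-M\log Z(\Lambda),
\qquad
\frac{1}{M}\,\frac{\partial \ell}{\partial \lambda_i(z)}
=\mathbb{E}_{p(\mathbf{x};\Lambda)}\!\left[\delta\!\left(z-\varphi_i(\mathbf{x})\right)\right]-H_i^{\mathrm{obs}}(z).
```

The second derivative is −M times the covariance of the binned feature responses under the model, so ℓ is concave in Λ and gradient methods converge globally.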

16 Parameter Learning
- Monte Carlo integration
- Algorithm
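A minimal sketch, not the authors' implementation, of one such update step: the model expectation in the gradient is approximated with Monte Carlo samples and each potential moves toward matching its observed histogram (names and array shapes are assumptions):

```python
import numpy as np

def update_potentials(lambdas, obs_hists, synth_samples, features, bin_edges, lr=0.1):
    """One gradient-ascent step on the log-likelihood of an inhomogeneous Gibbs model.

    lambdas       : (K, B) array, potential values per feature and histogram bin
    obs_hists     : (K, B) observed (normalized) histograms
    synth_samples : (n, d) samples drawn from the current model (e.g. by MCMC)
    features      : (K, d) linear feature directions
    bin_edges     : (K, B+1) histogram bin edges per feature
    """
    K, B = lambdas.shape
    for i in range(K):
        responses = synth_samples @ features[i]            # phi_i on model samples
        counts, _ = np.histogram(responses, bins=bin_edges[i])
        model_hist = counts / max(counts.sum(), 1)
        # d(log-likelihood)/d(lambda_i) ~ model histogram - observed histogram
        lambdas[i] += lr * (model_hist - obs_hists[i])
    return lambdas
```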

17 Gibbs Sampling
[Figure: 2D illustration of Gibbs sampling, alternately resampling the x and y coordinates]
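A coordinate-wise Gibbs sampler sketch under the same assumptions (each 1D conditional is discretized on a grid to keep the example short; illustrative only):

```python
import numpy as np

def energy(x, lambdas, features, bin_edges):
    """U(x) = sum_i lambda_i(phi_i(x)) with piecewise-constant 1D potentials."""
    U = 0.0
    for lam, w, edges in zip(lambdas, features, bin_edges):
        r = float(w @ x)
        b = int(np.clip(np.searchsorted(edges, r) - 1, 0, len(lam) - 1))
        U += lam[b]
    return U

def gibbs_sweep(x, lambdas, features, bin_edges, grid=np.linspace(-3, 3, 61)):
    """One Gibbs sweep: resample each coordinate from its discretized conditional."""
    x = x.copy()
    for d in range(len(x)):
        energies = []
        for v in grid:
            x[d] = v
            energies.append(energy(x, lambdas, features, bin_edges))
        p = np.exp(-(np.array(energies) - min(energies)))  # unnormalized conditional
        p /= p.sum()
        x[d] = np.random.choice(grid, p=p)
    return x
```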

18 Minimum Entropy Principle
- Minimum entropy principle: to make the learnt distribution close to the observed one
- Feature selection

19 Feature Pursuit
- A greedy procedure to learn the feature set
- Reference model
- Approximate information gain

20 Proposition
The approximate information gain for a new feature, and the optimal energy function for this feature, are given below.
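In the usual minimax-entropy treatment these two quantities take the following form (a hedged reconstruction; H_φ^obs and H_φ^f denote the observed and reference-model marginal histograms of the candidate feature φ, with the potential convention of the Gibbs form above):

```latex
% Approximate information gain of a candidate feature phi, and the optimal
% potential (energy) for that feature, given the current reference model f
d(\varphi)\;\approx\;\mathrm{KL}\!\left(H^{\mathrm{obs}}_{\varphi}\,\big\|\,H^{f}_{\varphi}\right)
=\sum_{z}H^{\mathrm{obs}}_{\varphi}(z)\,\log\frac{H^{\mathrm{obs}}_{\varphi}(z)}{H^{f}_{\varphi}(z)},
\qquad
\lambda_{\varphi}(z)\;=\;\log\frac{H^{f}_{\varphi}(z)}{H^{\mathrm{obs}}_{\varphi}(z)} .
```

Under this approximation, the feature pursued at each step is the one along which the observed and currently synthesized marginals disagree most, which is exactly the Kullback-Leibler feature of the next slide.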

21 Kullback-Leibler Feature
- Kullback-Leibler feature (KLF)
- Pursue the feature by:
  - Hybrid Monte Carlo
  - Sequential 1D optimization
  - Feature selection

22 Acceleration by Importance Sampling
- Gibbs sampling is too slow
- Importance sampling by the reference model
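A sketch of the acceleration under assumed names: samples already drawn from the reference model are reweighted by the newly added potentials, so a marginal under the updated model can be estimated without re-running the sampler:

```python
import numpy as np

def importance_hist(ref_samples, new_lambdas, features, bin_edges, target_idx):
    """Estimate a marginal histogram under the updated model by reweighting
    samples drawn from the reference model f(x).

    ref_samples : (n, d) samples from the reference model
    new_lambdas : list of (potential_vector, feature_index) pairs added since f was sampled
    """
    log_w = np.zeros(len(ref_samples))
    for lam, i in new_lambdas:
        r = ref_samples @ features[i]
        b = np.clip(np.searchsorted(bin_edges[i], r) - 1, 0, len(lam) - 1)
        log_w -= lam[b]                          # p(x) proportional to f(x) exp{-sum of new potentials}
    w = np.exp(log_w - log_w.max())
    w /= w.sum()                                 # self-normalized importance weights
    r = ref_samples @ features[target_idx]
    counts, _ = np.histogram(r, bins=bin_edges[target_idx], weights=w)
    return counts / counts.sum()
```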

23 Flowchart of IGM
[Flowchart: observed samples give observed histograms; the current IGM is sampled by MCMC to give synthesized samples; feature pursuit selects the KL feature; if its KL divergence is below a threshold ε (Y) the model is output, otherwise (N) the feature is added and the loop repeats]
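A compact, self-contained sketch of the loop in this flowchart (names, shapes, and hyperparameters are assumptions; random-walk Metropolis stands in for the Gibbs sampler on the slides to keep the example short):

```python
import numpy as np

rng = np.random.default_rng(0)

def hist(X, w, edges):
    c, _ = np.histogram(X @ w, bins=edges)
    c = c.astype(float) + 1e-6                   # smooth to avoid log(0)
    return c / c.sum()

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

def learn_igm(obs, candidates, edges, eps=0.02, sweeps=20, grad_steps=30, lr=0.5):
    """Pursue KL features until every candidate marginal is matched."""
    chosen, lambdas = [], []
    synth = rng.normal(size=obs.shape)           # initial synthesized samples

    def energy(x):
        # U(x) = sum_i lambda_i(phi_i(x)), piecewise-constant over histogram bins
        return sum(lam[min(max(np.searchsorted(edges, w @ x) - 1, 0), len(lam) - 1)]
                   for lam, w in zip(lambdas, chosen))

    def resample(X):
        # Random-walk Metropolis in place of Gibbs sampling (swap noted above)
        for _ in range(sweeps):
            prop = X + 0.3 * rng.normal(size=X.shape)
            dE = np.array([energy(p) - energy(x) for p, x in zip(prop, X)])
            accept = rng.random(len(X)) < np.exp(-dE)
            X[accept] = prop[accept]
        return X

    while True:
        # Feature pursuit: the KL feature is the candidate whose observed and
        # synthesized marginals disagree the most.
        gains = [kl(hist(obs, w, edges), hist(synth, w, edges)) for w in candidates]
        best = int(np.argmax(gains))
        if gains[best] < eps:                    # KL < eps everywhere: output the model
            return chosen, lambdas, synth
        chosen.append(candidates[best])
        lambdas.append(np.zeros(len(edges) - 1))
        for _ in range(grad_steps):              # alternate MCMC and potential updates
            synth = resample(synth)
            for lam, w in zip(lambdas, chosen):
                lam += lr * (hist(synth, w, edges) - hist(obs, w, edges))
```

With a small set of random unit-norm candidate directions and shared bin edges, a sketch like this could be tried on the 2D toy examples of the next slides.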

24 Toy Problems (1)
[Figure: for a mixture of two Gaussians and a circle distribution, panels show feature pursuit, Gibbs potentials, observed histograms, synthesized histograms, and synthesized samples]

25 Toy Problems (2) Swiss Roll

26 Applied to High Dimensions
- In high-dimensional space:
  - Too many features to constrain every dimension
  - MCMC sampling is extremely slow
- Solution: dimension reduction by PCA
- Application: learning a face prior model
  - 83 landmarks defined to represent a face (166-d)
  - 524 samples
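A minimal PCA sketch for the dimension-reduction step (array shapes and the retained-variance threshold are assumptions; the IGM would then be learned on the low-dimensional coefficients):

```python
import numpy as np

def pca_reduce(X, var_kept=0.98):
    """Project landmark vectors onto the leading principal components.

    X : (n_samples, 166) stacked (x, y) coordinates of 83 face landmarks.
    Returns the low-dimensional coefficients plus what is needed to reconstruct.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = (S ** 2) / (S ** 2).sum()
    k = int(np.searchsorted(np.cumsum(var), var_kept)) + 1   # components to keep
    coeffs = Xc @ Vt[:k].T                                   # learn the IGM on these
    return coeffs, mean, Vt[:k]

def pca_reconstruct(coeffs, mean, components):
    """Map synthesized low-dimensional samples back to 166-d landmark vectors."""
    return coeffs @ components + mean
```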

27 Face Prior Learning (1) Observed face examples; synthesized face samples without any features

28 Face Prior Learning (2) Synthesized with 10 features; synthesized with 20 features

29 Face Prior Learning (3) Synthesized with 30 features; synthesized with 50 features

30 Observed Histograms

31 Synthesized Histograms

32 Gibbs Potential Functions

33 Learning Caricature Exaggeration

34 Synthesis Results

35 Learning 2D Gibbs Process

36

37 Thank you! celiu@csail.mit.edu CSAIL

