Learning Inhomogeneous Gibbs Models (Ce Liu)

1 Learning Inhomogeneous Gibbs Models
Ce Liu, celiu@microsoft.com

2 How to Describe the Virtual World

3 Histogram
- Histogram: marginal distribution of image variances
- Non-Gaussian distributed

4 Texture Synthesis (Heeger et al., 1995)
- Image decomposition by steerable filters
- Histogram matching

5 FRAME (Zhu et al., 1997)
- Homogeneous Markov random field (MRF)
- Minimax entropy principle to learn a homogeneous Gibbs distribution
- Gibbs sampling and feature selection

6 Our Problem
- To learn the distribution of structural signals
- Challenges:
  - How to learn non-Gaussian distributions in high dimensions from a small number of observations?
  - How to capture the sophisticated properties of the distribution?
  - How to optimize parameters with global convergence?

7 Inhomogeneous Gibbs Models (IGM)
A framework to learn arbitrary high-dimensional distributions:
- 1D histograms on linear features to describe the high-dimensional distribution
- Maximum entropy principle: Gibbs distribution
- Minimum entropy principle: feature pursuit
- Markov chain Monte Carlo for parameter optimization
- Kullback-Leibler feature (KLF)

8 1D Observation: Histograms
- Feature φ(x): R^d → R
  - Linear feature: φ(x) = w^T x
  - Kernel distance: φ(x) = ||x − x₀|| (distance to a reference point x₀)
- Marginal distribution
- Histogram
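As an illustration of such a 1D observation, here is a minimal sketch (not from the talk; names and array shapes are assumptions) of computing the histogram of a linear feature response:

```python
import numpy as np

def feature_histogram(X, w, n_bins=32):
    """Histogram of the 1D marginal obtained by projecting samples X onto w.

    X : (n_samples, d) array of observations
    w : (d,) linear feature direction, phi(x) = w^T x
    """
    responses = X @ w                        # 1D feature responses
    counts, edges = np.histogram(responses, bins=n_bins)
    hist = counts / counts.sum()             # normalize to a discrete marginal
    return hist, edges

# Toy usage: random 2D samples projected onto an arbitrary unit direction
X = np.random.randn(500, 2)
w = np.array([1.0, 0.5]) / np.linalg.norm([1.0, 0.5])
hist, edges = feature_histogram(X, w)
```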

9 Intuition

10 Learning Descriptive Models

11
- Sufficient features can make the learnt model f(x) converge to the underlying distribution p(x)
- Linear features and histograms are robust compared with other high-order statistics
- Descriptive models

12 Maximum Entropy Principle
- Maximum entropy model
  - To generalize the statistical properties of the observed data
  - To make the learnt model carry no more information than what is available
- Mathematical formulation
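The mathematical formulation can be written, in assumed notation (φ_i are the selected features, H_i^obs their observed marginal histograms), as:

```latex
% Maximum entropy under marginal-histogram constraints
\max_{p}\; -\!\int p(\mathbf{x})\,\log p(\mathbf{x})\,d\mathbf{x}
\quad\text{s.t.}\quad
\mathbb{E}_{p}\!\left[\delta\!\left(z-\varphi_i(\mathbf{x})\right)\right]=H_i^{\mathrm{obs}}(z),
\;\; i=1,\dots,K,
\qquad \int p(\mathbf{x})\,d\mathbf{x}=1 .
```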

13 Intuition of Maximum Entropy Principle

14 Inhomogeneous Gibbs Distribution
- Solution form of the maximum entropy model
- Parameter: Gibbs potential
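The maximum entropy solution takes the Gibbs (exponential-family) form; in the same assumed notation:

```latex
p(\mathbf{x};\Lambda)=\frac{1}{Z(\Lambda)}
\exp\!\Big\{-\sum_{i=1}^{K}\lambda_i\!\big(\varphi_i(\mathbf{x})\big)\Big\},
\qquad
Z(\Lambda)=\int \exp\!\Big\{-\sum_{i=1}^{K}\lambda_i\!\big(\varphi_i(\mathbf{x})\big)\Big\}\,d\mathbf{x},
```

where each Gibbs potential λ_i(·) is a 1D function of the feature response, represented in practice as a vector over the histogram bins, and Λ = {λ_1, ..., λ_K}.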

15 Estimating Potential Functions
- Distribution form
- Normalization
- Maximum likelihood estimation (MLE)
- 1st- and 2nd-order derivatives
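With observed samples x_1, ..., x_M, the log-likelihood and its first derivative take the standard form (assumed notation as above); the gradient vanishes exactly when the model marginals match the observed histograms:

```latex
\ell(\Lambda)=\sum_{m=1}^{M}\log p(\mathbf{x}_m;\Lambda)
=-\sum_{m=1}^{M}\sum_{i=1}^{K}\lambda_i\!\big(\varphi_i(\mathbf{x}_m)\big)-M\log Z(\Lambda),
\qquad
\frac{1}{M}\,\frac{\partial \ell}{\partial \lambda_i(z)}
=\mathbb{E}_{p(\mathbf{x};\Lambda)}\!\left[\delta\!\left(z-\varphi_i(\mathbf{x})\right)\right]-H_i^{\mathrm{obs}}(z).
```

The second derivative is −M times the covariance of the binned feature responses under the model, so ℓ is concave in Λ and gradient methods converge globally.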

16 Parameter Learning
- Monte Carlo integration
- Algorithm
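A minimal sketch, not the authors' implementation, of one such update step: the model expectation in the gradient is approximated with Monte Carlo samples and each potential moves toward matching its observed histogram (names and array shapes are assumptions):

```python
import numpy as np

def update_potentials(lambdas, obs_hists, synth_samples, features, bin_edges, lr=0.1):
    """One gradient-ascent step on the log-likelihood of an inhomogeneous Gibbs model.

    lambdas       : (K, B) array, potential values per feature and histogram bin
    obs_hists     : (K, B) observed (normalized) histograms
    synth_samples : (n, d) samples drawn from the current model (e.g. by MCMC)
    features      : (K, d) linear feature directions
    bin_edges     : (K, B+1) histogram bin edges per feature
    """
    K, B = lambdas.shape
    for i in range(K):
        responses = synth_samples @ features[i]            # phi_i on model samples
        counts, _ = np.histogram(responses, bins=bin_edges[i])
        model_hist = counts / max(counts.sum(), 1)
        # d(log-likelihood)/d(lambda_i) ~ model histogram - observed histogram
        lambdas[i] += lr * (model_hist - obs_hists[i])
    return lambdas
```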

17 Gibbs Sampling
[Figure: 2D illustration of Gibbs sampling, alternately resampling the x and y coordinates]
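A coordinate-wise Gibbs sampler sketch under the same assumptions (each 1D conditional is discretized on a grid to keep the example short; illustrative only):

```python
import numpy as np

def energy(x, lambdas, features, bin_edges):
    """U(x) = sum_i lambda_i(phi_i(x)) with piecewise-constant 1D potentials."""
    U = 0.0
    for lam, w, edges in zip(lambdas, features, bin_edges):
        r = float(w @ x)
        b = int(np.clip(np.searchsorted(edges, r) - 1, 0, len(lam) - 1))
        U += lam[b]
    return U

def gibbs_sweep(x, lambdas, features, bin_edges, grid=np.linspace(-3, 3, 61)):
    """One Gibbs sweep: resample each coordinate from its discretized conditional."""
    x = x.copy()
    for d in range(len(x)):
        energies = []
        for v in grid:
            x[d] = v
            energies.append(energy(x, lambdas, features, bin_edges))
        p = np.exp(-(np.array(energies) - min(energies)))  # unnormalized conditional
        p /= p.sum()
        x[d] = np.random.choice(grid, p=p)
    return x
```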

18 Minimum Entropy Principle
- Minimum entropy principle: to make the learnt distribution close to the observed one
- Feature selection

19 Feature Pursuit
- A greedy procedure to learn the feature set
- Reference model
- Approximate information gain

20 Proposition
The approximate information gain for a new feature, and the optimal energy function for this feature, are given below.
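In the usual minimax-entropy treatment these two quantities take the following form (a hedged reconstruction; H_φ^obs and H_φ^f denote the observed and reference-model marginal histograms of the candidate feature φ, with the potential convention of the Gibbs form above):

```latex
% Approximate information gain of a candidate feature phi, and the optimal
% potential (energy) for that feature, given the current reference model f
d(\varphi)\;\approx\;\mathrm{KL}\!\left(H^{\mathrm{obs}}_{\varphi}\,\big\|\,H^{f}_{\varphi}\right)
=\sum_{z}H^{\mathrm{obs}}_{\varphi}(z)\,\log\frac{H^{\mathrm{obs}}_{\varphi}(z)}{H^{f}_{\varphi}(z)},
\qquad
\lambda_{\varphi}(z)\;=\;\log\frac{H^{f}_{\varphi}(z)}{H^{\mathrm{obs}}_{\varphi}(z)} .
```

Under this approximation, the feature pursued at each step is the one along which the observed and currently synthesized marginals disagree most, which is exactly the Kullback-Leibler feature of the next slide.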

21 Kullback-Leibler Feature
- Kullback-Leibler feature (KLF)
- Pursue the feature by:
  - Hybrid Monte Carlo
  - Sequential 1D optimization
  - Feature selection

22 Acceleration by Importance Sampling
- Gibbs sampling is too slow
- Importance sampling by the reference model
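A sketch of the acceleration under assumed names: samples already drawn from the reference model are reweighted by the newly added potentials, so a marginal under the updated model can be estimated without re-running the sampler:

```python
import numpy as np

def importance_hist(ref_samples, new_lambdas, features, bin_edges, target_idx):
    """Estimate a marginal histogram under the updated model by reweighting
    samples drawn from the reference model f(x).

    ref_samples : (n, d) samples from the reference model
    new_lambdas : list of (potential_vector, feature_index) pairs added since f was sampled
    """
    log_w = np.zeros(len(ref_samples))
    for lam, i in new_lambdas:
        r = ref_samples @ features[i]
        b = np.clip(np.searchsorted(bin_edges[i], r) - 1, 0, len(lam) - 1)
        log_w -= lam[b]                          # p(x) proportional to f(x) exp{-sum of new potentials}
    w = np.exp(log_w - log_w.max())
    w /= w.sum()                                 # self-normalized importance weights
    r = ref_samples @ features[target_idx]
    counts, _ = np.histogram(r, bins=bin_edges[target_idx], weights=w)
    return counts / counts.sum()
```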

23 Flowchart of IGM
[Flowchart: observed samples give observed histograms; the current IGM is sampled by MCMC to give synthesized samples; feature pursuit selects the KL feature; if its KL divergence is below a threshold ε (Y) the model is output, otherwise (N) the feature is added and the loop repeats]
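A compact, self-contained sketch of the loop in this flowchart (names, shapes, and hyperparameters are assumptions; random-walk Metropolis stands in for the Gibbs sampler on the slides to keep the example short):

```python
import numpy as np

rng = np.random.default_rng(0)

def hist(X, w, edges):
    c, _ = np.histogram(X @ w, bins=edges)
    c = c.astype(float) + 1e-6                   # smooth to avoid log(0)
    return c / c.sum()

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

def learn_igm(obs, candidates, edges, eps=0.02, sweeps=20, grad_steps=30, lr=0.5):
    """Pursue KL features until every candidate marginal is matched."""
    chosen, lambdas = [], []
    synth = rng.normal(size=obs.shape)           # initial synthesized samples

    def energy(x):
        # U(x) = sum_i lambda_i(phi_i(x)), piecewise-constant over histogram bins
        return sum(lam[min(max(np.searchsorted(edges, w @ x) - 1, 0), len(lam) - 1)]
                   for lam, w in zip(lambdas, chosen))

    def resample(X):
        # Random-walk Metropolis in place of Gibbs sampling (swap noted above)
        for _ in range(sweeps):
            prop = X + 0.3 * rng.normal(size=X.shape)
            dE = np.array([energy(p) - energy(x) for p, x in zip(prop, X)])
            accept = rng.random(len(X)) < np.exp(-dE)
            X[accept] = prop[accept]
        return X

    while True:
        # Feature pursuit: the KL feature is the candidate whose observed and
        # synthesized marginals disagree the most.
        gains = [kl(hist(obs, w, edges), hist(synth, w, edges)) for w in candidates]
        best = int(np.argmax(gains))
        if gains[best] < eps:                    # KL < eps everywhere: output the model
            return chosen, lambdas, synth
        chosen.append(candidates[best])
        lambdas.append(np.zeros(len(edges) - 1))
        for _ in range(grad_steps):              # alternate MCMC and potential updates
            synth = resample(synth)
            for lam, w in zip(lambdas, chosen):
                lam += lr * (hist(synth, w, edges) - hist(obs, w, edges))
```

With a small set of random unit-norm candidate directions and shared bin edges, a sketch like this could be tried on the 2D toy examples of the next slides.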

24 Toy Problems (1)
[Figure: for a mixture of two Gaussians and a circle distribution, panels show feature pursuit, Gibbs potentials, observed histograms, synthesized histograms, and synthesized samples]

25 Toy Problems (2) Swiss Roll

26 Applied to High Dimensions
- In high-dimensional space:
  - Too many features to constrain every dimension
  - MCMC sampling is extremely slow
- Solution: dimension reduction by PCA
- Application: learning a face prior model
  - 83 landmarks defined to represent a face (166-d)
  - 524 samples
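A minimal PCA sketch for the dimension-reduction step (array shapes and the retained-variance threshold are assumptions; the IGM would then be learned on the low-dimensional coefficients):

```python
import numpy as np

def pca_reduce(X, var_kept=0.98):
    """Project landmark vectors onto the leading principal components.

    X : (n_samples, 166) stacked (x, y) coordinates of 83 face landmarks.
    Returns the low-dimensional coefficients plus what is needed to reconstruct.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = (S ** 2) / (S ** 2).sum()
    k = int(np.searchsorted(np.cumsum(var), var_kept)) + 1   # components to keep
    coeffs = Xc @ Vt[:k].T                                   # learn the IGM on these
    return coeffs, mean, Vt[:k]

def pca_reconstruct(coeffs, mean, components):
    """Map synthesized low-dimensional samples back to 166-d landmark vectors."""
    return coeffs @ components + mean
```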

27 Face Prior Learning (1) Observed face examples; synthesized face samples without any features

28 Face Prior Learning (2) Synthesized with 10 features; synthesized with 20 features

29 Face Prior Learning (3) Synthesized with 30 features; synthesized with 50 features

30 Observed Histograms

31 Synthesized Histograms

32 Gibbs Potential Functions

33 Learning Caricature Exaggeration

34 Synthesis Results

35 Learning 2D Gibbs Process

36

37 Thank you! celiu@csail.mit.edu CSAIL

