
1 Parameter Learning in MN

2 Outline: CRF learning; CRF for 2-D image segmentation; IPF parameter sharing revisited.

3 Log-linear Markov network (most common representation). A feature is some function φ[D] for some subset of variables D, e.g., an indicator function. A log-linear model over a Markov network H consists of a set of features φ_1[D_1], …, φ_k[D_k], where each D_i is a subset of a clique in H (two φ's can be over the same variables), and a set of weights w_1, …, w_k, usually learned from data. The joint distribution is P(x_1, …, x_n) = (1/Z) exp( Σ_i w_i φ_i[D_i] ).
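
To make the definition concrete, here is a minimal sketch (not from the slides; the variables, features, and weights below are illustrative) of a log-linear Markov network with indicator features and a brute-force partition function:

```python
import itertools
import math

# A log-linear Markov network: P(x) = (1/Z) * exp( sum_i w_i * phi_i(x) ).
# Variables, feature functions, and weights are illustrative.

variables = ["A", "B"]          # two binary variables
values = [0, 1]

# Each feature is an indicator over a subset of variables (here, the clique {A, B}).
features = [
    lambda x: 1.0 if x["A"] == 1 and x["B"] == 1 else 0.0,
    lambda x: 1.0 if x["A"] != x["B"] else 0.0,
]
weights = [1.5, -0.7]           # w_i, usually learned from data

def score(x):
    """Unnormalized log-probability: sum_i w_i * phi_i(x)."""
    return sum(w * f(x) for w, f in zip(weights, features))

# Partition function Z by brute-force enumeration (feasible only for tiny networks).
Z = sum(math.exp(score(dict(zip(variables, assignment))))
        for assignment in itertools.product(values, repeat=len(variables)))

def prob(x):
    return math.exp(score(x)) / Z

print(prob({"A": 1, "B": 1}))   # probability of one joint assignment
```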

4 Generative vs. discriminative classifiers (a review). We want to learn h: X → Y, where X are the features and Y the target classes; the Bayes-optimal classifier uses P(Y|X). A generative classifier, e.g., naive Bayes, assumes some functional form for P(X|Y) and P(Y), estimates their parameters directly from training data, and uses Bayes' rule to calculate P(Y|X = x). This computes P(Y|X) only indirectly, but the model can generate a sample of the data, since P(X) = Σ_y P(y) P(X|y). A discriminative classifier, e.g., logistic regression, assumes some functional form for P(Y|X) and estimates its parameters directly from training data. It learns P(Y|X) directly, but cannot generate a sample of the data, because P(X) is not available.
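
As a worked illustration of the generative route (the numbers below are illustrative, not from the slides), this sketch assumes P(Y) and P(X|Y) and recovers P(Y|X) through Bayes' rule, using P(X) = Σ_y P(y) P(X|y):

```python
# Generative route to P(Y|X): assume P(Y) and P(X|Y), then apply Bayes' rule.
# All probabilities below are illustrative.

p_y = {"spam": 0.3, "ham": 0.7}          # prior P(Y)
p_x_given_y = {                           # likelihood P(X = word_present | Y)
    "spam": 0.8,
    "ham": 0.1,
}

# P(X) = sum_y P(y) P(X|y), which is what makes the model "generative":
p_x = sum(p_y[y] * p_x_given_y[y] for y in p_y)

# Bayes' rule: P(Y|X) = P(X|Y) P(Y) / P(X)
p_y_given_x = {y: p_x_given_y[y] * p_y[y] / p_x for y in p_y}
print(p_y_given_x)   # a discriminative model would parameterize this quantity directly
```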

5 Log-linear CRFs (most common representation). The graph H is only over the hidden variables Y_1, …, Y_P; no assumptions are made about the dependency structure of the observed variables X, but you must always observe all of X. A feature is some function φ[D] for some subset of variables D, e.g., an indicator function. A log-linear model over a CRF H consists of a set of features φ_1[D_1], …, φ_k[D_k], where each D_i is a subset of a clique in H (two φ's can be over the same variables), and a set of weights w_1, …, w_k, usually learned from data. The conditional distribution is P(Y|X) = (1/Z(X)) exp( Σ_i w_i φ_i[D_i] ), where the partition function Z(X) depends on the observed X.
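
The sketch below (illustrative names and features, not from the slides) shows the key difference from the Markov network sketch above: the partition function Z(x) is recomputed for each observed input x, and P(x) itself is never modeled:

```python
import itertools
import math

# A log-linear CRF: P(y | x) = (1/Z(x)) * exp( sum_i w_i * phi_i(y, x) ).
# Hidden variables, features, and weights are illustrative.

hidden_vars = ["Y1", "Y2"]      # hidden labels, e.g., foreground/background
labels = ["f", "b"]

# Features may look at both the hidden labels and the observation x.
features = [
    lambda y, x: 1.0 if y["Y1"] == y["Y2"] else 0.0,                        # label agreement
    lambda y, x: 1.0 if y["Y1"] == "f" and x["pixel1"] > 0.5 else 0.0,      # node/observation compatibility
]
weights = [0.8, 1.2]

def score(y, x):
    return sum(w * f(y, x) for w, f in zip(weights, features))

def prob_y_given_x(y, x):
    # Z(x) depends on the observation and is recomputed per input.
    Z_x = sum(math.exp(score(dict(zip(hidden_vars, assign)), x))
              for assign in itertools.product(labels, repeat=len(hidden_vars)))
    return math.exp(score(y, x)) / Z_x

x_obs = {"pixel1": 0.9, "pixel2": 0.2}   # all of X must be observed
print(prob_y_given_x({"Y1": "f", "Y2": "f"}, x_obs))
```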

6 Example: Image Segmentation. A set of features φ_1[D_1], …, φ_k[D_k]: each D_i is a subset of a clique in H, and two φ's can be over the same variables. [Figure: 3×3 grid of hidden segmentation variables y_1, …, y_9.] We define the features as follows: a node feature that measures the compatibility of a node's color with its segmentation label, and a set of indicator features triggered for each edge labeling pair {ff, bb, fb, bf}. This is allowed since we can define many features over the same subset of variables.
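
A minimal sketch of these two feature types; the compatibility rule used for the node feature is an assumption for illustration (the actual rule in the homework may differ):

```python
# Node feature: compatibility of a node's color with its label, and
# one indicator feature per edge labeling pair in {ff, bb, fb, bf}.

def node_feature(label, pixel_color):
    """Compatibility of a pixel's color with its label ('f' foreground, 'b' background).
    Illustrative rule: bright pixels favor foreground."""
    return pixel_color if label == "f" else 1.0 - pixel_color

EDGE_PAIRS = ["ff", "bb", "fb", "bf"]

def edge_features(label_i, label_j):
    """One indicator per edge labeling pair; exactly one fires for any (y_i, y_j)."""
    pair = label_i + label_j
    return {p: 1.0 if p == pair else 0.0 for p in EDGE_PAIRS}

print(edge_features("f", "b"))   # {'ff': 0.0, 'bb': 0.0, 'fb': 1.0, 'bf': 0.0}
```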

7 Example: Image Segmentation. A set of features φ_1[D_1], …, φ_k[D_k]: each D_i is a subset of a clique in H, and two φ's can be over the same variables. [Figure: 3×3 grid of hidden segmentation variables y_1, …, y_9, with the feature definitions shown on the slide.]

8 Example: Image Segmentation. A set of features φ_1[D_1], …, φ_k[D_k]: each D_i is a subset of a clique in H, and two φ's can be over the same variables. We need to learn the parameters w_m. [Figure: 3×3 grid of hidden segmentation variables y_1, …, y_9.] Now we just need to sum these features over the image, with the shared weights w_m multiplying each feature.
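
A minimal sketch of this summation on an assumed 3×3 grid, with one shared node weight and shared edge weights for the four labeling pairs (all values illustrative):

```python
# Sum shared features into the unnormalized log-score of one segmentation.

GRID = 3
EDGES = ([((r, c), (r, c + 1)) for r in range(GRID) for c in range(GRID - 1)] +
         [((r, c), (r + 1, c)) for r in range(GRID - 1) for c in range(GRID)])

w_node = 2.0
w_edge = {"ff": 1.0, "bb": 1.0, "fb": -1.0, "bf": -1.0}   # shared across all edges

def node_feature(label, color):
    # illustrative compatibility: bright pixels favor foreground 'f'
    return color if label == "f" else 1.0 - color

def segmentation_score(labels, pixels):
    """labels, pixels: dicts mapping (row, col) -> 'f'/'b' and -> color in [0, 1]."""
    s = sum(w_node * node_feature(labels[p], pixels[p]) for p in labels)
    for (i, j) in EDGES:
        pair = labels[i] + labels[j]          # exactly one edge indicator fires
        s += w_edge[pair]                     # contributes sum_m w_m * f_m(y_i, y_j)
    return s

pixels = {(r, c): (0.9 if r == 1 else 0.1) for r in range(GRID) for c in range(GRID)}
labels = {(r, c): ("f" if r == 1 else "b") for r in range(GRID) for c in range(GRID)}
print(segmentation_score(labels, pixels))
```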

9 Example: Image Segmentation. Given N data points (images and their segmentations), the learning update compares, for each feature m, its count in data point n with its expected count under the model; computing that expectation requires inference using the current parameter estimates. [Figure: 3×3 grid of hidden segmentation variables y_1, …, y_9.]
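
A sketch of that update as a gradient computation; `count_feature` and `expected_count` are assumed helper names, and the expected counts are exactly what the inference step must supply:

```python
# CRF gradient for a shared weight w_m:
#   d/dw_m log P(y[n] | x[n]) = C_m(y[n], x[n]) - E[C_m | x[n]],
# summed over the N training images. E[C_m | x[n]] requires inference
# (exact or approximate, e.g., loopy BP) under the current parameters.

def gradient(w, data, count_feature, expected_count):
    """data: list of (x, y) pairs.
    count_feature(m, y, x): count of feature m in the labeled image (x, y).
    expected_count(m, x, w): E[C_m | x] under the current parameters w."""
    grad = {}
    for m in w:
        grad[m] = sum(count_feature(m, y, x) - expected_count(m, x, w)
                      for (x, y) in data)
    return grad
```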

10 Example: Inference for Learning. How do we compute E[C_fb | X[n]], the expected count of edges labeled (f, b) given the observed image? The expectation decomposes into a sum of pairwise marginals over the edges: E[C_fb | X[n]] = Σ_{(i,j) ∈ E} P(y_i = f, y_j = b | X[n]). [Figure: 3×3 grid of hidden segmentation variables y_1, …, y_9.]
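
A minimal sketch of this computation, assuming inference has already produced pairwise edge marginals (the `edge_marginal` helper is hypothetical):

```python
# Expected count of (f, b)-labeled edges: sum of pairwise marginals over all edges.

def expected_fb_count(edges, edge_marginal):
    """edges: list of (i, j) pairs; edge_marginal(i, j) -> dict mapping label pairs
    such as ('f', 'b') to P(y_i, y_j | x) under the current parameters."""
    return sum(edge_marginal(i, j)[("f", "b")] for (i, j) in edges)

# Demo with dummy uniform marginals:
uniform = lambda i, j: {(a, b): 0.25 for a in "fb" for b in "fb"}
print(expected_fb_count([(1, 2), (2, 3)], uniform))   # 0.5 with uniform marginals
```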

11 Example: Inference for Learning. Computing E[C_fb | X[n]], continued from the previous slide. [Figure: 3×3 grid of hidden segmentation variables y_1, …, y_9.]

12 Representation Equivalence. [Figure: 3×3 grid of hidden segmentation variables y_1, …, y_9.] We compare the log-linear representation with the tabular MN representation from HW4.

13 Representation Equivalence. [Figure: 3×3 grid of hidden segmentation variables y_1, …, y_9.] Log-linear representation vs. the tabular MN representation from HW4; now do the same over the edge potential. This is correct because for every assignment to (y_i, y_j) we select exactly one value from the table.

14 Tabular MN representation from HW4; now do the same over the edge potential. This is correct because for every assignment to (y_i, y_j) we select exactly one value from the table. Apply the cheap exp(log(·)) trick, which is just algebra: the selected table entry equals exp(Σ_m log θ_m · f_m(y_i, y_j)). Now combine it over all edges, assuming parameter sharing, and use the same C_m trick: the product over edges becomes exp(Σ_m log θ_m · C_m), where C_m counts the edges labeled with pair m.
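
A small numeric check (illustrative table values) of the exp(log(·)) trick and of the C_m counting trick:

```python
import math

# exp(log(.)) trick: picking one table entry theta(y_i, y_j) equals
# exp( sum_m log(theta_m) * f_m(y_i, y_j) ), since exactly one indicator fires.

theta = {"ff": 2.0, "bb": 2.0, "fb": 0.5, "bf": 0.5}     # tabular edge potential (illustrative)

def log_linear_value(pair):
    return math.exp(sum(math.log(theta[m]) * (1.0 if m == pair else 0.0) for m in theta))

for pair in theta:
    assert abs(theta[pair] - log_linear_value(pair)) < 1e-12

# C_m trick: with shared parameters, the product of table entries over all edges
# equals exp( sum_m log(theta_m) * C_m ), where C_m counts edges labeled with pair m.
edge_labels = ["ff", "ff", "fb", "bb"]                    # labels of four edges (illustrative)
product = math.prod(theta[p] for p in edge_labels)
C = {m: edge_labels.count(m) for m in theta}
assert abs(product - math.exp(sum(math.log(theta[m]) * C[m] for m in theta))) < 1e-12
print("table lookups match the log-linear form")
```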

15 Representation Equivalence. [Figure: 3×3 grid of hidden segmentation variables y_1, …, y_9.] Log-linear representation vs. the tabular MN representation from HW4; now substitute. The two are equivalent, with w_m = log θ_m, where θ_m is the corresponding value in the tabular potential.

16 Outline: CRF learning; CRF for 2-D image segmentation; IPF parameter sharing revisited.

17 Iterative Proportional Fitting (IPF). [Figure: example Markov network over Difficulty, Intelligence, Grade, SAT, Letter, Job, Happy, and Coherence.] Setting the derivative of the log-likelihood to zero gives a fixed-point equation: φ_c^{(t+1)}(x_c) = φ_c^{(t)}(x_c) · P̂(x_c) / P^{(t)}(x_c), i.e., rescale each clique potential by the ratio of the empirical marginal to the current model marginal. Iterate and converge to the optimal parameters; at each iteration we must compute the model marginals P^{(t)}(x_c), which requires inference.
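
A minimal sketch of one IPF update for a single clique potential; the empirical and model marginals are passed in as assumed inputs, and the model marginals would have to be recomputed by inference after each update:

```python
# One IPF update: phi_new(x_c) = phi_old(x_c) * P_hat(x_c) / P_model(x_c).

def ipf_update(phi, empirical_marginal, model_marginal):
    """All three arguments are dicts mapping clique assignments x_c to numbers."""
    return {xc: phi[xc] * empirical_marginal[xc] / model_marginal[xc]
            for xc in phi}

# Illustrative one-step example for a single binary-pair clique:
phi = {("f", "f"): 1.0, ("f", "b"): 1.0, ("b", "f"): 1.0, ("b", "b"): 1.0}
p_hat = {("f", "f"): 0.4, ("f", "b"): 0.1, ("b", "f"): 0.1, ("b", "b"): 0.4}
p_model = {xc: 0.25 for xc in phi}        # marginals from the current model (via inference)
print(ipf_update(phi, p_hat, p_model))
```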

18 Parameter Sharing in your HW. Note that I am using Y for the label. All edge potentials are shared, and we are learning a conditional model. [Figure: 3×3 grid of hidden segmentation variables y_1, …, y_9.]

19 IPF parameter sharing. How do we calculate these quantities using parameter sharing? In total we have 4 parameters (one per edge labeling pair), as opposed to 4 parameters per edge. We can cancel the |E| factor because it appears in both the empirical and the model term of the ratio. [Figure: shared edge between Y_i and Y_j with their observed pixels x_i and x_j.] Run loopy BP and, when it has converged, read off the required marginals. We only have one data point (one image) in this example, so we drop X[n] and write just X.
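
One way to write the shared-parameter update as a sketch; the helper names `empirical_pair` and `model_marginal` are assumptions, and the marginals would come from converged loopy BP under the current parameters:

```python
# Shared-parameter IPF-style update for the four edge parameters {ff, bb, fb, bf}:
# aggregate counts over all edges, then rescale each shared parameter by the ratio
# of empirical to expected counts. The 1/|E| normalization cancels in the ratio.

def shared_ipf_update(theta, edges, empirical_pair, model_marginal):
    """theta: dict mapping pair -> current shared parameter.
    empirical_pair(i, j): observed label pair on edge (i, j) in the training image.
    model_marginal(i, j): dict mapping pairs to P(y_i, y_j | X) from converged loopy BP."""
    new_theta = {}
    for m in theta:
        data_count = sum(1.0 for (i, j) in edges if empirical_pair(i, j) == m)
        expected_count = sum(model_marginal(i, j)[m] for (i, j) in edges)
        new_theta[m] = theta[m] * data_count / expected_count
    return new_theta
```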

20

