ICML 2003 © Sergey Kirshner, UC Irvine Unsupervised Learning with Permuted Data Sergey Kirshner Sridevi Parise Padhraic Smyth School of Information and.


1 ICML 2003 © Sergey Kirshner, UC Irvine
Unsupervised Learning with Permuted Data
Sergey Kirshner, Sridevi Parise, Padhraic Smyth
School of Information and Computer Science, University of California, Irvine
www.datalab.uci.edu

2 Permutation Problem
           x1     x2     x3
vector 1  -0.64   2.76   3.01
vector 2  -0.52   1.05   2.60
vector 3  -0.43   1.60   2.23
vector 4  -0.17  -0.46   0.79
vector 5  -2.86   1.05   0.88
vector 6  -0.54  -1.93  -0.15
vector 7  -0.14  -0.02   1.28

3 Permutation Problem [table of vectors 1 to 7 repeated from slide 2, shown twice]

4 Permutation Problem [table of vectors 1 to 7 repeated from slide 2, each vector marked "?"]

5 Motivational Example VLA FIRST Survey http://sundog.stsci.edu

6 Which Mapping Is the Right One? [figure: candidate labelings of each object's three components, (core, lobe 1, lobe 2) versus (core, lobe 2, lobe 1)]

7 Permutation Problem
Can we learn what permutations were applied?
Can we learn the probability distribution which generated the data?
If the distribution for the permuted data is known, how difficult is it to find the correct permutation?
[table of vectors 1 to 7 repeated from slide 2, each vector marked "?"]
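The setup behind these questions can be sketched in a few lines of Python. This is only an illustration: the helper names `permute` and `make_permuted_data`, the choice of cyclic shifts as the allowed set, and the reading of the slide's numbers as rows (x1, x2, x3) are all assumptions, not something the slides specify.

```python
import random

def permute(vector, perm):
    """Reorder a vector so that output[i] = vector[perm[i]]."""
    return [vector[i] for i in perm]

def make_permuted_data(vectors, allowed_perms, rng):
    """Apply one uniformly chosen allowed permutation to each vector.

    Returns the observed (permuted) vectors and the permutations used.
    In the unsupervised setting only the first list would be visible.
    """
    observed, applied = [], []
    for v in vectors:
        perm = rng.choice(allowed_perms)
        observed.append(permute(v, perm))
        applied.append(perm)
    return observed, applied

rng = random.Random(0)
# First three vectors of the slide's table, assuming each row is (x1, x2, x3).
vectors = [[-0.64, 2.76, 3.01], [-0.52, 1.05, 2.60], [-0.43, 1.60, 2.23]]
# Hypothetical allowed set: the three cyclic shifts of the components.
allowed = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
observed, applied = make_permuted_data(vectors, allowed, rng)
```

The first question on the slide asks for `applied` given only `observed`; the second asks for the distribution that produced `vectors`.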

8 Related Work
Image analysis
  – point correspondence problem (Gold et al., 1995)
  – image transformation learning (Frey & Jojic, 2003)
Information extraction
  – text field positions (McCallum et al., 2000)

9 What’s New?
Previous work
  – problem-specific algorithms
Our contributions
  – analysis of the difficulty of the general problem
      • Bayes error rate for permutations
      • bounded above by classification Bayes error rate (BER)
      • specific results for Gaussian data
  – comments on learning

10 Generative Model p(x)

11 Generative Model p(x)

12 Generative Model

13 Generative Model p(  )

14 Generative Model

15 Generative Model q(x)

16 Generative Model q(x) p(x)

17 Generative Model q(x)

18 Generative Model q(x)

19 Mixture Model
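The mixture view can be written out explicitly. The following is a sketch under assumed notation: ρ denotes a permutation from a finite allowed set {ρ_1, ..., ρ_m}, chosen with probability p(ρ_i), and the unpermuted vector is drawn from p. The symbol ρ is an assumption, since the transcript lost the slide's own symbol.

```latex
q(x) \;=\; \sum_{i=1}^{m} p(\rho_i)\, p\!\left(\rho_i^{-1}(x)\right)
```

Each term is the density of the observed vector x given that ρ_i was applied. A permutation only reorders coordinates, so no Jacobian factor appears, and q is a mixture with one component per allowed permutation.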

20 How Hard is the Problem?
Need a measure of difficulty for a problem
  – How often does the optimal decision rule make a mistake?
Bayes-optimal error rate
Bayes-optimal permutation error rate
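The formulas for these two quantities did not survive in the transcript. As a hedged reconstruction, the standard Bayes-error form with the permutation playing the role of the class label would read as follows. The notation is assumed: ρ is a permutation with prior p(ρ), p is the density of the unpermuted vector, and E*_P is a label chosen here for the permutation error rate.

```latex
\hat{\rho}(x) = \arg\max_{\rho}\; p(\rho)\, p\!\left(\rho^{-1}(x)\right),
\qquad
E^{*}_{P} = 1 - \int \max_{\rho}\; p(\rho)\, p\!\left(\rho^{-1}(x)\right) dx
```

The rule picks the permutation with the largest posterior probability given x, and the error rate is the probability mass that this best guess still leaves to the other permutations.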

21 Marginal Probability Distributions p(x1) p(x2)

22 Corresponding Marginal Problem
Marginal (projection) probability distribution
  – component overlap problem
  – Bayes-optimal error rate

23 Key

24 Main Result
Theorem: If the set of allowed permutations contains a key, and all allowed permutations are equally likely to be selected,
Why is this important?
  – little overlap implies easy permutation problem
  – high overlap?
      • could still have easy permutation problem

25 Special Cases
Consider low-dimensional special cases
  – 2-dimensional Gaussians
Find out what factors into the difficulty
  – μ1, μ2, σ1², σ2² determine the overlap Bayes error rate
  – the permutation Bayes error rate also depends on the correlation cov(x1, x2)/(σ1 σ2)

26 High Overlap Bayes Error Rate

27 High Permutation Bayes Error Rate: correlation approaches -1

28 High Permutation Bayes Error Rate: correlation approaches -1

29 Low Permutation Bayes Error Rate: correlation approaches 1

30 Low Permutation Bayes Error Rate: correlation approaches 1
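The effect of correlation can be checked numerically for a simplified case. The sketch below assumes a 2-dimensional Gaussian with equal variances and two equally likely permutations (identity and swap). Under those assumptions the optimal rule reduces to comparing the order of the observed components with the order of the means, and the error rate has the closed form Φ(-|μ2 - μ1| / (σ·sqrt(2(1 - corr)))). That expression was derived for this example and is not taken from the slides. The function names are hypothetical.

```python
import math
import random

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def closed_form_error(mu1, mu2, sigma, corr):
    """Error of the optimal swap/no-swap rule when both variances equal sigma**2."""
    return normal_cdf(-abs(mu2 - mu1) / (sigma * math.sqrt(2.0 * (1.0 - corr))))

def simulated_error(mu1, mu2, sigma, corr, n, seed=0):
    """Monte Carlo estimate of the same error rate."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n):
        # Draw a correlated Gaussian pair (y1, y2).
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        y1 = mu1 + sigma * z1
        y2 = mu2 + sigma * (corr * z1 + math.sqrt(1.0 - corr ** 2) * z2)
        # Swap the components with probability 1/2.
        swapped = rng.random() < 0.5
        x1, x2 = (y2, y1) if swapped else (y1, y2)
        # Guess "swapped" when the observed order disagrees with the order of the means.
        guess_swapped = (mu2 - mu1) * (x2 - x1) < 0
        errors += guess_swapped != swapped
    return errors / n

for corr in (-0.9, 0.0, 0.9):
    print(corr, closed_form_error(0.0, 1.0, 1.0, corr),
          simulated_error(0.0, 1.0, 1.0, corr, 20000))
```

With the means and variances held fixed, the error in this setting falls as the correlation rises toward 1 and is largest as it approaches -1, which matches the direction stated on the slides.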

31 Learning
Solve as other mixture model problems: Expectation-Maximization (EM)
  – treat permutation as a hidden variable
Identifiability
  – distribution p(x) resulting in a given distribution q(x) may not be unique!
      • e.g., each dimension takes values in {0,1}
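A minimal EM sketch with the permutation as the hidden variable is given below. It is deliberately simplified and is not the authors' model: the components are assumed to be independent Gaussians with known unit variance, every permutation is allowed with equal probability, and only the means are learned. The function name `em_permuted_means` is hypothetical.

```python
import itertools
import math
import random

def em_permuted_means(data, perms, n_iter=30):
    """Estimate component means from vectors whose components were permuted."""
    dim = len(data[0])
    # Symmetry-breaking start: average the sorted vectors.
    mu = [sum(sorted(x)[j] for x in data) / len(data) for j in range(dim)]
    for _ in range(n_iter):
        sums = [0.0] * dim
        for x in data:
            # E-step: log-likelihood of x under each permutation
            # (unit variance and uniform prior, so constants cancel).
            logw = [-0.5 * sum((x[i] - mu[p[i]]) ** 2 for i in range(dim))
                    for p in perms]
            top = max(logw)
            w = [math.exp(v - top) for v in logw]
            total = sum(w)
            # M-step accumulation: observed slot i feeds component p[i].
            for p, wp in zip(perms, w):
                for i in range(dim):
                    sums[p[i]] += (wp / total) * x[i]
        # Each component receives total weight 1 from every vector.
        mu = [s / len(data) for s in sums]
    return mu

rng = random.Random(1)
true_mu = [-2.0, 0.0, 3.0]  # illustrative values, not from the slides
perms = list(itertools.permutations(range(3)))
data = []
for _ in range(500):
    y = [rng.gauss(m, 1.0) for m in true_mu]
    p = rng.choice(perms)
    data.append([y[j] for j in p])  # observed x[i] = y[p[i]]
mu_hat = em_permuted_means(data, perms)
```

When all permutations are allowed and equally likely, any reordering of the means gives the same q(x), so the means can be recovered only up to order. This is one simple instance of the identifiability issue raised on the slide.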

32 Learning to Rotate Galaxies (Kirshner et al., NIPS 2002)

33 Summary
Framework for the permuted data problem
Analysis of the optimal error rate
  – upper bound by optimal error rate of related problem
Special case analysis
  – closed-form expressions
  – importance of correlation
Parameter estimation when parameters are unknown

34 Future Work
Identifiability
  – What distributions are identifiable?
  – Do permutations make the identifiability problem different from that of ordinary mixtures?
What to do with a large number of permutations?
Other applications

35 Acknowledgements
Funding
  – NSF
  – DOE
Datalab @ UCI http://www.datalab.uci.edu
Sapphire Group at LLNL
  – Chandrika Kamath
  – Erick Cantú-Paz

