Discriminative, Unsupervised, Convex Learning Dale Schuurmans Department of Computing Science University of Alberta MITACS Workshop, August 26, 2005.


1 Discriminative, Unsupervised, Convex Learning
Dale Schuurmans, Department of Computing Science, University of Alberta
MITACS Workshop, August 26, 2005

2 Current Research Group
- PhD Tao Wang: reinforcement learning
- PhD Ali Ghodsi: dimensionality reduction
- PhD Dana Wilkinson: action-based embedding
- PhD Yuhong Guo: ensemble learning
- PhD Feng Jiao: bioinformatics
- PhD Jiayuan Huang: transduction on graphs
- PhD Qin Wang: statistical natural language
- PhD Adam Milstein: robotics, particle filtering
- PhD Dan Lizotte: optimization, everything
- PhD Linli Xu: unsupervised SVMs
- PDF Li Cheng: computer vision

3 Current Research Group
- PhD Tao Wang: reinforcement learning
- PhD Dana Wilkinson: action-based embedding
- PhD Feng Jiao: bioinformatics
- PhD Qin Wang: statistical natural language
- PhD Dan Lizotte: optimization, everything
- PDF Li Cheng: computer vision

4 Today I will talk about: One Current Research Direction
Learning Sequence Classifiers (HMMs)
- Discriminative
- Unsupervised
- Convex
EM?

5 Outline
- Unsupervised SVMs
- Discriminative, unsupervised, convex HMMs
- Tao, Dana, Feng, Qin, Dan, Li

6

7 Unsupervised Support Vector Machines
Joint work with Linli Xu

8 Main Idea
- Unsupervised SVMs (and semi-supervised SVMs)
- A harder computational problem than supervised SVMs
- Convex relaxation: a semidefinite program (polynomial time)

9 Background: Two-class SVM
- Supervised classification learning: labeled data → a linear discriminant
- Classification rule: sign(w·x + b)
- Some separating hyperplanes are better than others?

10 Maximum Margin Linear Discriminant
Choose the linear discriminant that maximizes the margin, the distance from the separating hyperplane to the nearest training point.
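The objective on this slide was an image; a standard way to write the maximum margin problem (my reconstruction, not the slide's exact notation) is:

```latex
% Maximize the margin directly ...
\max_{w,\,b}\; \min_i \frac{y_i \,(w^\top x_i + b)}{\|w\|}
% ... or, equivalently, minimize the inverse squared margin:
\min_{w,\,b}\; \tfrac{1}{2}\|w\|^2
\quad \text{s.t.} \quad y_i \,(w^\top x_i + b) \ge 1, \;\; i = 1, \dots, n
```

The second form is the one whose optimal value ("inverse squared margin") reappears in the derivation slides below.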

11 Unsupervised Learning
- Given unlabeled data, how do we infer classifications?
- Organize objects into groups: clustering

12 Idea: Maximum Margin Clustering
- Given unlabeled data, find the maximum margin separating hyperplane
- This clusters the data
- Constraint: class balance, bounding the difference in size between the classes
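To make the idea concrete, here is a toy one-dimensional illustration of my own (not the SDP from the talk): exhaustively search over class-balanced ±1 labelings of points on a line and keep the labeling whose single-threshold separator has the largest margin.

```python
import itertools

def margin_1d(points, labels):
    """Margin of the best single threshold separating +1 from -1 points
    on a line; None if the labeling is not separable by one threshold."""
    pos = [p for p, y in zip(points, labels) if y == 1]
    neg = [p for p, y in zip(points, labels) if y == -1]
    if not pos or not neg:
        return None
    if min(pos) > max(neg):          # all positives to the right
        return (min(pos) - max(neg)) / 2.0
    if min(neg) > max(pos):          # all negatives to the right
        return (min(neg) - max(pos)) / 2.0
    return None                       # classes interleave: not separable

def max_margin_clustering(points, balance_tol=0):
    """Pick the balanced +/-1 labeling with the largest margin (brute force)."""
    best_margin, best_labels = -1.0, None
    for labels in itertools.product((-1, 1), repeat=len(points)):
        if abs(sum(labels)) > balance_tol:   # class balance constraint
            continue
        m = margin_1d(points, labels)
        if m is not None and m > best_margin:
            best_margin, best_labels = m, labels
    return best_margin, best_labels

pts = [0.0, 0.2, 0.4, 3.0, 3.1, 3.3]
m, lab = max_margin_clustering(pts)
print(m, lab)   # best margin is about 1.3, splitting the two tight groups
```

The class balance constraint matters: without it, the trivial labeling that puts one outlier alone in a class can achieve a huge margin. The SDP relaxation on the following slides replaces this exponential search with a convex program.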

13 Challenge
- Find the label assignment that results in a large margin
- This is hard
- Convex relaxation based on semidefinite programming

14 How to Derive an Unsupervised SVM? Two-class case:
1. Start with the supervised algorithm. Given the vector of assignments y, solve the SVM problem whose optimal value is the inverse squared margin. [slide equation omitted]

15 How to Derive an Unsupervised SVM?
2. Think of the inverse squared margin as a function of y.
Goal: choose y to minimize the inverse squared margin.
Problem: it is not a convex function of y.

16 How to Derive an Unsupervised SVM?
3. Re-express the problem with indicators comparing the y labels.
New variable: M, an equivalence relation matrix whose entries indicate which pairs of labels agree.

17 How to Derive an Unsupervised SVM?
3. (continued) If given M, we would then solve the same problem in the new variables.
Note: the objective is a convex function of M, since a maximum of linear functions is convex.

18 How to Derive an Unsupervised SVM?
4. Get a constrained optimization problem: solve for M, subject to M encoding a valid equivalence relation (not convex!) and class balance.

19 How to Derive an Unsupervised SVM?
4. (continued) Solve for M subject to the conditions under which M encodes an equivalence relation.

20 How to Derive an Unsupervised SVM?
5. Relax the indicator variables to obtain a convex optimization problem: solve for M over the relaxed constraint set.

21 How to Derive an Unsupervised SVM?
5. (continued) The relaxed problem is a semidefinite program.
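The slide equations for steps 2-5 were images. As a sketch, following the published maximum margin clustering formulation (Xu, Neufeld, Larson & Schuurmans, NIPS 2004), with signs and constants possibly differing from the slides:

```latex
% Step 2: given labels y, the inverse squared margin is (up to scaling)
% the hard-margin SVM dual value, with kernel matrix K:
\omega(y) \;=\; \max_{\alpha \ge 0}\; 2\,\alpha^\top \mathbf{1} \;-\; \alpha^\top \bigl(K \circ y y^\top\bigr)\,\alpha
% Steps 3-5: substitute M = y y^\top and relax the non-convex set
% \{\, y y^\top : y \in \{-1,+1\}^n \,\} to a convex outer approximation:
\min_{M}\; \max_{\alpha \ge 0}\; 2\,\alpha^\top \mathbf{1} - \alpha^\top (K \circ M)\,\alpha
\quad \text{s.t.} \quad M \succeq 0, \;\; \operatorname{diag}(M) = \mathbf{1}, \;\;
-\epsilon\,\mathbf{1} \;\le\; M\mathbf{1} \;\le\; \epsilon\,\mathbf{1}
```

The last constraint is the class balance bound. For fixed α the term α⊤(K∘M)α is linear in M, so the objective is a pointwise maximum of linear functions of M, hence convex, which is the observation on slide 17.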

22 Multi-class Unsupervised SVM?
1. Start with the supervised algorithm (Crammer & Singer 2001). Given the vector of assignments y, minimize the margin loss.

23 Multi-class Unsupervised SVM?
2. Think of the margin loss as a function of y (Crammer & Singer 2001).
Goal: choose y to minimize the margin loss.
Problem: it is not a convex function of y.

24 Multi-class Unsupervised SVM?
3. Re-express the problem with indicators comparing the y labels.
New variables: M & D.

25 Multi-class Unsupervised SVM?
3. (continued) If given M and D, we would then minimize the margin loss, which is a convex function of M & D.

26 Multi-class Unsupervised SVM?
4. Get a constrained optimization problem: solve for M and D, subject to class balance.

27 Multi-class Unsupervised SVM?
5. Relax the indicator variables to obtain a convex optimization problem: solve for M and D.

28 Multi-class Unsupervised SVM?
5. (continued) The relaxed problem is a semidefinite program.

29 Experimental Results [plot comparing SemiDef, Spectral Clustering, and K-means omitted]

30 Experimental Results

31 Experimental Results: percentage of misclassification errors, digit dataset [table omitted]

32 Extension to a Semi-Supervised Algorithm. Matrix M: [slide figure omitted]

33 Experimental Results: percentage of misclassification errors, face dataset [table omitted]

34 Experimental Results

35

36 Discriminative, Unsupervised, Convex HMMs
Joint work with Linli Xu, with help from Li Cheng and Tao Wang

37 Hidden Markov Model
- A joint probability model over a sequence of "hidden" states and their observations
- Viterbi classifier: must coordinate the local classifiers
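The Viterbi classifier mentioned here can be sketched concretely. A minimal log-domain implementation, with a toy two-state HMM whose states and probability tables are illustrative inventions, not numbers from the talk:

```python
import math

def viterbi(obs, states, start, trans, emit):
    """Most probable hidden state sequence for an HMM, in the log domain."""
    log = math.log
    # initialize with the first observation
    V = [{s: log(start[s]) + log(emit[s][obs[0]]) for s in states}]
    back = []  # back-pointers, one dict per time step after the first
    for o in obs[1:]:
        cur, bp = {}, {}
        for s in states:
            # best predecessor state for s at this step
            prev = max(states, key=lambda p: V[-1][p] + log(trans[p][s]))
            bp[s] = prev
            cur[s] = V[-1][prev] + log(trans[prev][s]) + log(emit[s][o])
        V.append(cur)
        back.append(bp)
    # trace back from the best final state
    path = [max(states, key=lambda s: V[-1][s])]
    for bp in reversed(back):
        path.append(bp[path[-1]])
    path.reverse()
    return path

# Toy two-state HMM: Hot/Cold days emitting ice-cream counts 1-3
states = ("H", "C")
start = {"H": 0.8, "C": 0.2}
trans = {"H": {"H": 0.7, "C": 0.3}, "C": {"H": 0.4, "C": 0.6}}
emit = {"H": {1: 0.2, 2: 0.4, 3: 0.4}, "C": {1: 0.5, 2: 0.4, 3: 0.1}}

print(viterbi([3, 1, 3], states, start, trans, emit))  # ['H', 'H', 'H']
print(viterbi([1, 1, 1], states, start, trans, emit))  # ['C', 'C', 'C']
```

The "coordinate local classifiers" remark on the slide is visible here: the label chosen at each position depends on its neighbors through the transition scores, so the per-position decisions cannot be made independently.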

38 HMM Training: Supervised
- Given labeled sequences (x, y)
- Maximum likelihood: also models the input distribution
- Conditional likelihood: discriminative (CRFs)

39 HMM Training: Unsupervised
- Given only the observation sequences x. Now what? EM!
- Marginal likelihood: maximizes exactly the part we don't care about
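The objectives on these two slides were equation images; the standard formulations they refer to are:

```latex
% Supervised, maximum (joint) likelihood -- also models the input distribution:
\ell_{\text{joint}}(\theta) \;=\; \sum_i \log p_\theta(x_i, y_i)
% Supervised, conditional likelihood -- discriminative (CRFs):
\ell_{\text{cond}}(\theta) \;=\; \sum_i \log p_\theta(y_i \mid x_i)
% Unsupervised, marginal likelihood -- what EM maximizes:
\ell_{\text{marg}}(\theta) \;=\; \sum_i \log \sum_y p_\theta(x_i, y) \;=\; \sum_i \log p_\theta(x_i)
```

The last identity is the point of the slide: the marginal likelihood is exactly the input distribution p(x), the part that is irrelevant to classification.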

40 HMM Training: Unsupervised
- Given only the observation sequences x
The problem with EM:
- Not convex
- Wrong objective
- Too popular
- Doesn't work

41 HMM Training: Unsupervised
The dream:
- Convex training
- Discriminative training
- When will someone invent unsupervised CRFs?

42 HMM Training: Unsupervised
The question: how can we learn effectively without seeing any y's?

43 HMM Training: Unsupervised
The question: how can we learn effectively without seeing any y's?
The answer: that's what we already did! Unsupervised SVMs

44 HMM Training: Unsupervised. The plan:

             supervised   unsupervised
  single     SVM          unsup SVM
  sequence   M3N          ?

45 M3N: Max Margin Markov Nets
- Relational SVMs
- Supervised training: given labeled sequences, solve a factored QP
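The factored QP referred to here is, in the standard max-margin Markov network formulation of Taskar et al. (2003) (my paraphrase, not the slide's exact notation):

```latex
\min_{w,\,\xi} \;\; \tfrac{1}{2}\|w\|^2 + C \sum_i \xi_i
\quad \text{s.t.} \quad
w^\top\!\bigl[\,f(x_i, y_i) - f(x_i, y)\,\bigr] \;\ge\; \Delta(y_i, y) - \xi_i
\qquad \forall i,\; \forall y
```

There are exponentially many candidate labelings y, but for Markov networks the features f and the Hamming loss Δ decompose over edges of the chain, so the constraints can be re-expressed with polynomially many marginal variables, giving a polynomial-size ("factored") QP.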

46 Unsupervised M3Ns
Strategy:
- Start with the supervised M3N QP over y-labels
- Re-express in local M, D equivalence relations
- Impose class balance
- Relax the non-convex constraints
- Then solve a really big SDP (but still polynomial size)

47 Unsupervised M3Ns: the resulting SDP [slide equation omitted]

48 Some Initial Results
- Synthetic HMM
- Protein secondary structure prediction

49

50 Current Research Group
- PhD Tao Wang: reinforcement learning
- PhD Dana Wilkinson: action-based embedding
- PhD Feng Jiao: bioinformatics
- PhD Qin Wang: statistical natural language
- PhD Dan Lizotte: optimization, everything
- PDF Li Cheng: computer vision

51 Brief Research Background
- Sequential PAC Learning
- Linear Classifiers: Boosting, SVMs
- Metric-Based Model Selection
- Greedy Importance Sampling
- Adversarial Optimization & Search
- Large Markov Decision Processes

