Published by Lana Daughtery. Modified about 1 year ago.

1
Machine Learning with Discriminative Methods Lecture 18 – Structured Prediction CS Spring 2015 Alex Berg

2
Today’s class: structured prediction; discuss next reading for deep learning.

3
Structured Prediction: some examples from Ben Taskar (UPenn and University of Washington)…

4
Handwriting Recognition: sequential structure. x: image of a handwritten word [figure not reproduced]; y: the letter sequence "brace".

5
Object Segmentation: spatial structure. x: input; y: segmentation [figure not reproduced].

6
Natural Language Parsing: recursive structure. x: the sentence "The screen was a sea of red"; y: its parse tree [figure not reproduced].

7
Bilingual Word Alignment: combinatorial structure.
x: the sentence pair "What is the anticipated cost of collecting fees under the new proposal?" / "En vertu des nouvelles propositions, quel est le coût prévu de perception des droits?"
y: the word alignment between the tokenized sentences "What is the anticipated cost of collecting fees under the new proposal ?" and "En vertu de les nouvelles propositions, quel est le coût prévu de perception de les droits ?" [figure not reproduced]

8
Protein Structure and Disulfide Bridges Protein: 1IMT AVITGACERDLQCG KGTCCAVSLWIKSV RVCTPVGTSGEDCH PASHKIPFSGQRMH HTCPCAPNLACVQT SPKKFKCLSK

9
Local Prediction: classify using local information. Ignores correlations & constraints! (Handwriting example: "breac")

10
Local Prediction (segmentation example). Labels: building, tree, shrub, ground.

11
Structured Prediction: use local information and exploit correlations. (Handwriting example: "breac")

12
Structured Prediction (segmentation example). Labels: building, tree, shrub, ground.

13
Formulating the problem: There is a rich history, and we are skipping to a post-modern reductionist viewpoint (see the first 5 sections of the Nowozin reading…). In particular we are avoiding a probabilistic formulation, but keep in mind that the following technique also works for a probabilistic model, replacing argmax with maximum likelihood and using sampling where appropriate…

14
Discriminative modeling for structured prediction: We have already seen one example… multiclass classification! There, the number of i’s equals the number of unique y’s. This may be too flexible with a large space of outputs…
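
The multiclass view can be sketched in a few lines. This is only an illustration under assumptions: the slide's own formula is not in this transcript, so the sketch assumes the usual linear form (one weight vector per label, predict the label with the highest score), and `joint_feature` is a hypothetical joint feature map that shows the same predictor written as an argmax over a single weight vector.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def predict_multiclass(W, x):
    """Pick the label i whose weight vector W[i] scores x highest.

    W holds one weight vector per unique label, so the number of
    i's equals the number of unique y's.
    """
    scores = [dot(w_i, x) for w_i in W]
    return max(range(len(W)), key=lambda i: scores[i])

def joint_feature(x, y, num_labels):
    """Hypothetical joint feature map: copy x into the block for label y."""
    psi = [0.0] * (num_labels * len(x))
    psi[y * len(x):(y + 1) * len(x)] = x
    return psi

def predict_structured(w, x, num_labels):
    """The same predictor written as argmax over y of w . psi(x, y)."""
    return max(range(num_labels),
               key=lambda y: dot(w, joint_feature(x, y, num_labels)))

# Toy weights and input, made up for illustration.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
x = [0.2, 0.9]
w_flat = [v for row in W for v in row]  # stack the per-label vectors
label_a = predict_multiclass(W, x)
label_b = predict_structured(w_flat, x, 3)
```

Both calls return the same label. The weight vector grows with the number of distinct outputs, which is why this parameterization becomes impractical when the output space is large.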

15
Discriminative modeling for structured prediction: Can we simplify f? n<

16
Optimization for discriminative learning of structured prediction models: Let’s look at perceptrons first. From Wikipedia: the perceptron and the structured perceptron [update rules not reproduced].
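
A minimal sketch of a structured perceptron for sequence labeling, assuming emission and transition count features and exact Viterbi decoding. The feature map, function names, and toy data are hypothetical and not taken from the slides. The update is the commonly stated one: when the predicted structure is wrong, add the features of the true output and subtract the features of the predicted output.

```python
from collections import defaultdict

def features(x, y):
    """Hypothetical joint feature map psi(x, y): emission and transition counts."""
    psi = defaultdict(float)
    for t, (obs, lab) in enumerate(zip(x, y)):
        psi[("emit", lab, obs)] += 1.0
        if t > 0:
            psi[("trans", y[t - 1], lab)] += 1.0
    return psi

def viterbi(w, x, labels):
    """Exact argmax over label sequences of w . psi(x, y), by dynamic programming."""
    best = {lab: w[("emit", lab, x[0])] for lab in labels}
    back = []
    for obs in x[1:]:
        new_best, pointers = {}, {}
        for lab in labels:
            # Best previous label for ending in `lab` at this position.
            prev = max(labels, key=lambda p: best[p] + w[("trans", p, lab)])
            new_best[lab] = best[prev] + w[("trans", prev, lab)] + w[("emit", lab, obs)]
            pointers[lab] = prev
        best = new_best
        back.append(pointers)
    # Follow back-pointers from the best final label.
    path = [max(labels, key=lambda lab: best[lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

def train_structured_perceptron(data, labels, epochs=10):
    """On each mistake: w += psi(x, y_true) - psi(x, y_predicted)."""
    w = defaultdict(float)
    for _ in range(epochs):
        for x, y in data:
            y_hat = viterbi(w, x, labels)
            if y_hat != list(y):
                for k, v in features(x, y).items():
                    w[k] += v
                for k, v in features(x, y_hat).items():
                    w[k] -= v
    return w

# Toy part-of-speech data, made up for illustration.
data = [
    (["the", "dog", "barks"], ["D", "N", "V"]),
    (["the", "cat", "sleeps"], ["D", "N", "V"]),
]
labels = ["D", "N", "V"]
w = train_structured_perceptron(data, labels)
prediction = viterbi(w, ["the", "dog", "sleeps"], labels)
```

The only change from the ordinary perceptron is that prediction is an argmax over whole structures, so each training step needs an inference routine (here Viterbi) rather than a single dot product.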

17
Optimization for discriminative learning of structured prediction models: More generally we will do empirical risk minimization again… Parameterize f by [symbol not reproduced]. Find the parameters that minimize a regularization penalty plus the loss of f on the training data.
- The possibly annoying bit is that to compute the empirical estimate of the loss we need to compute [expression not reproduced]. There is lots of recent work on approximating this step; as long as training and test are done the same way it is usually OK!
- For [expression not reproduced] the [symbol not reproduced]’s are the [symbol not reproduced]’s, and optimizing them is a convex problem (for reasonable choices of loss and regularization, e.g. hinge loss and L2 on the [symbol not reproduced]’s…)

18
What is the loss for structured prediction? So we want: [formula not reproduced]. We will settle for minimizing a convex surrogate like: [formula not reproduced].
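
The slide's formulas are not available here, so the following is an assumption about the kind of surrogate meant rather than a reconstruction of the slide. One widely used convex surrogate is the margin-rescaled structured hinge loss: max over y of [Δ(y_true, y) + w·ψ(x, y)] minus w·ψ(x, y_true), where Δ is the task loss (Hamming loss in this sketch). The code evaluates it by brute force over a tiny output space; the feature names and weights are hypothetical.

```python
from itertools import product

def hamming_loss(y_true, y_pred):
    """Task loss Delta: number of positions where the labels differ."""
    return sum(a != b for a, b in zip(y_true, y_pred))

def score(w, x, y):
    """Hypothetical linear score w . psi(x, y) with emission and transition features."""
    total = 0.0
    for t, (obs, lab) in enumerate(zip(x, y)):
        total += w.get(("emit", lab, obs), 0.0)
        if t > 0:
            total += w.get(("trans", y[t - 1], lab), 0.0)
    return total

def structured_hinge(w, x, y_true, labels):
    """max_y [Delta(y_true, y) + score(x, y)] - score(x, y_true).

    Brute force over all label sequences: fine for a toy example,
    exponential in the sequence length in general.
    """
    augmented = max(
        hamming_loss(y_true, y) + score(w, x, y)
        for y in product(labels, repeat=len(x))
    )
    return augmented - score(w, x, y_true)

labels = ["D", "N"]
x = ["the", "dog"]
y_true = ("D", "N")
w_zero = {}  # untrained weights
w_good = {("emit", "D", "the"): 2.0, ("emit", "N", "dog"): 2.0}
loss_zero = structured_hinge(w_zero, x, y_true, labels)
loss_good = structured_hinge(w_good, x, y_true, labels)
```

The surrogate is never negative, because y = y_true is among the candidates, and it upper-bounds the task loss of the argmax prediction. It is a maximum of functions linear in w, hence convex in w. The inner maximization (loss-augmented inference) is the step that has to be computed or approximated during training.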

19
Can get probabilistic models too (Conditional Random Fields, CRFs). Boxed algorithms from Jason Eisner (NLP/ML at JHU), proponent of Empirical Risk Minimization and probabilistic models. CRFs (~2000-) had a huge impact on the natural language processing and computer vision communities. Modeling P(y|x) directly, without modeling p(x), p(y), and p(x|y) as intermediate steps, saves much computational and sampling complexity. Some of Ben Taskar’s work (~2004-) addressed max-margin structured prediction with probabilistic models, combining these ideas with the margin on the previous slide.
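
As a rough illustration of modeling P(y|x) directly, the sketch below assumes the standard log-linear form P(y|x) = exp(w·ψ(x, y)) / Z(x), with Z(x) the sum of exp(w·ψ(x, y')) over all outputs y'. The features and weights are hypothetical, and the normalizer is computed by brute-force enumeration, which only works for tiny output spaces. For chain-structured models Z(x) is normally computed by dynamic programming instead.

```python
import math
from itertools import product

def score(w, x, y):
    """Hypothetical linear score w . psi(x, y) with emission and transition features."""
    total = 0.0
    for t, (obs, lab) in enumerate(zip(x, y)):
        total += w.get(("emit", lab, obs), 0.0)
        if t > 0:
            total += w.get(("trans", y[t - 1], lab), 0.0)
    return total

def crf_log_prob(w, x, y, labels):
    """log P(y | x) = score(x, y) - log Z(x), with Z(x) found by enumeration."""
    log_z = math.log(sum(
        math.exp(score(w, x, cand))
        for cand in product(labels, repeat=len(x))
    ))
    return score(w, x, y) - log_z

labels = ["D", "N"]
x = ["the", "dog"]
# Toy weights, made up for illustration.
w = {
    ("emit", "D", "the"): 1.0,
    ("emit", "N", "dog"): 1.0,
    ("trans", "D", "N"): 0.5,
}
probs = {
    y: math.exp(crf_log_prob(w, x, y, labels))
    for y in product(labels, repeat=len(x))
}
most_likely = max(probs, key=probs.get)
```

Nothing in the code models p(x): only the conditional distribution over outputs is normalized. Training such a model by maximum likelihood would maximize `crf_log_prob` summed over the training pairs.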

20
For next class: read the deep learning tutorial slides linked on the course web page: lxmls.pdf
