Machine Learning with Discriminative Methods Lecture 18 – Structured Prediction CS 790-134 Spring 2015 Alex Berg.

Today’s class: structured prediction; discuss the next reading for deep learning.

Structured Prediction: some examples from Ben Taskar (UPenn and University of Washington)…

Handwriting Recognition: x → y, with y = “brace”. Sequential structure.

Object Segmentation: x → y. Spatial structure.

Natural Language Parsing: x = “The screen was a sea of red” → y. Recursive structure.

Bilingual Word Alignment: x = “What is the anticipated cost of collecting fees under the new proposal?” / “En vertu des nouvelles propositions, quel est le coût prévu de perception des droits?” → y = alignment between the tokens of “What is the anticipated cost of collecting fees under the new proposal ?” and “En vertu de les nouvelles propositions, quel est le coût prévu de perception de les droits ?”. Combinatorial structure.

Protein Structure and Disulfide Bridges Protein: 1IMT AVITGACERDLQCG KGTCCAVSLWIKSV RVCTPVGTSGEDCH PASHKIPFSGQRMH HTCPCAPNLACVQT SPKKFKCLSK

Local Prediction Classify using local information → Ignores correlations & constraints! breac

Local Prediction building tree shrub ground

Structured Prediction Use local information Exploit correlations breac

Structured Prediction building tree shrub ground
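The contrast between the local and structured slides can be sketched for a chain of labels. This is a minimal illustration, not code from the lecture: the names `unary` (per-position label scores), `pairwise` (scores for adjacent label pairs), `local_decode` and `viterbi_decode` are all assumptions introduced here.

```python
def local_decode(unary):
    """Local prediction: pick the best label at each position independently."""
    return [max(range(len(scores)), key=scores.__getitem__) for scores in unary]


def viterbi_decode(unary, pairwise):
    """Structured prediction on a chain: maximize the sum of unary scores
    plus pairwise scores between neighbouring labels (Viterbi recursion)."""
    n_labels = len(unary[0])
    best = list(unary[0])          # best[l]: best score of a prefix ending in label l
    backptrs = []
    for scores in unary[1:]:
        new_best, ptr = [], []
        for cur in range(n_labels):
            prev = max(range(n_labels), key=lambda p: best[p] + pairwise[p][cur])
            ptr.append(prev)
            new_best.append(best[prev] + pairwise[prev][cur] + scores[cur])
        best = new_best
        backptrs.append(ptr)
    # Trace the best path backwards from the best final label.
    label = max(range(n_labels), key=best.__getitem__)
    path = [label]
    for ptr in reversed(backptrs):
        label = ptr[label]
        path.append(label)
    return path[::-1]
```

With weak local evidence in the middle position and a pairwise term that rewards neighbouring labels agreeing, the two decoders can disagree: the local decoder follows the noisy middle score, while the chain decoder smooths it out.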

Formulating the problem There is a rich history, and we are skipping to a post-modern reductionist viewpoint (see first 5 sections of Nowozin reading…). In particular we are avoiding a probabilistic formulation, but keep in mind that the following technique works for a probabilistic model, replacing argmax with maximum likelihood, and using sampling where appropriate…

Discriminative modeling for structured prediction We have already seen one example… Multiclass classification! #i’s == #unique y’s May be too flexible with large space of outputs…
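One way to read multiclass classification as structured prediction is to score each (input, output) pair with a single weight vector and a joint feature map, then take the argmax over outputs. The sketch below assumes a block-structured feature map (one block of weights per class); `joint_features` and `predict` are hypothetical names, not taken from the slides.

```python
def joint_features(x, y, n_classes):
    """phi(x, y): copy x into the block belonging to class y, zeros elsewhere."""
    phi = [0.0] * (len(x) * n_classes)
    start = y * len(x)
    phi[start:start + len(x)] = x
    return phi


def predict(w, x, n_classes):
    """f(x) = argmax over y of w . phi(x, y)."""
    def score(y):
        return sum(wi * fi for wi, fi in zip(w, joint_features(x, y, n_classes)))
    return max(range(n_classes), key=score)
```

With this feature map there is one block of parameters per distinct output, which is why the parameter count grows with the size of the output space and becomes impractical when outputs are sequences, trees, or alignments.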

Discriminative modeling for structured prediction Can we simplify f?

Optimization for discriminative learning of structured prediction models Let’s look at perceptrons first. From Wikipedia: Perceptron Structured Perceptron
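A minimal sketch of the structured perceptron for sequence labelling follows. It assumes emission and transition indicator features and, to stay short, finds the argmax by brute-force enumeration of all label sequences (exponential in length; a dynamic program such as Viterbi would replace it in practice). The function names and the feature design are illustrative assumptions.

```python
from collections import Counter
from itertools import product


def features(x, y):
    """Joint feature counts phi(x, y): emission and transition indicators."""
    phi = Counter()
    for t, (obs, label) in enumerate(zip(x, y)):
        phi[("emit", obs, label)] += 1
        if t > 0:
            phi[("trans", y[t - 1], label)] += 1
    return phi


def score(w, x, y):
    """Linear score w . phi(x, y) with sparse weights stored in a dict."""
    return sum(w.get(k, 0.0) * v for k, v in features(x, y).items())


def predict(w, x, labels):
    """Brute-force argmax over all label sequences of the same length as x."""
    return max(product(labels, repeat=len(x)), key=lambda y: score(w, x, y))


def train(data, labels, epochs=5):
    """On each mistake: w += phi(x, y_true) - phi(x, y_predicted)."""
    w = {}
    for _ in range(epochs):
        for x, y in data:
            y_hat = predict(w, x, labels)
            if tuple(y) != y_hat:
                for k, v in features(x, y).items():
                    w[k] = w.get(k, 0.0) + v
                for k, v in features(x, y_hat).items():
                    w[k] = w.get(k, 0.0) - v
    return w
```

The only change from the ordinary multiclass perceptron is that the prediction step is an argmax over whole output structures, and the update uses the difference of joint feature vectors.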

Optimization for discriminative learning of structured prediction models More generally we will do empirical risk minimization again… Parameterize f by θ. Find the parameters θ that minimize a regularization penalty plus the loss of f_θ on the training data. The possibly annoying bit is that to compute the empirical estimate of the loss we need to compute the prediction itself (an argmax over the output space).
- Lots of recent work on approximating this step; as long as training and test are done the same way it is usually OK!
- For a linear f, the θ’s are the weights w, and optimizing them is a convex problem (for reasonable choices of loss and regularization, e.g. hinge loss and L2 on the w’s…)
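The regularized empirical risk described here can be sketched in code. This is an illustration under stated assumptions, not the lecture's formulation: the loss is a margin-rescaled structured hinge with Hamming distance as the task loss, the score is linear in emission and transition indicators, and the inner maximization is done by brute force. All names (`hamming`, `score`, `structured_hinge`, `objective`, `lam`) are hypothetical.

```python
from itertools import product


def hamming(y_true, y_pred):
    """Task loss: number of positions where the two label sequences differ."""
    return sum(a != b for a, b in zip(y_true, y_pred))


def score(w, x, y):
    """Linear score with emission and transition indicator features."""
    s = sum(w.get(("emit", obs, label), 0.0) for obs, label in zip(x, y))
    return s + sum(w.get(("trans", a, b), 0.0) for a, b in zip(y, y[1:]))


def structured_hinge(w, x, y, labels):
    """max over y' of [hamming(y, y') + score(x, y')] minus score(x, y).
    The max is the inference step that makes the loss expensive to evaluate."""
    augmented = max(hamming(y, cand) + score(w, x, cand)
                    for cand in product(labels, repeat=len(x)))
    return augmented - score(w, x, y)


def objective(w, data, labels, lam=0.1):
    """L2 regularization penalty plus the average structured hinge loss."""
    reg = 0.5 * lam * sum(v * v for v in w.values())
    return reg + sum(structured_hinge(w, x, y, labels) for x, y in data) / len(data)
```

Every evaluation of the objective runs one maximization over outputs per training example, which is the step that approximate inference methods try to make cheaper.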

What is the loss for structured prediction? So we want: Will settle for minimizing a convex surrogate like:
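The formulas on this slide did not survive in the transcript. One common way to write the two objectives, given here as an assumption rather than as the slide's exact notation, uses weights $w$, a joint feature map $\phi(x, y)$, a task loss $\Delta$, and a regularization weight $\lambda$:

```latex
\begin{align*}
\text{want:}\quad
  & \min_{w}\ \sum_{i} \Delta\bigl(y_i,\ \arg\max_{y} w^{\top}\phi(x_i, y)\bigr) \\
\text{settle for:}\quad
  & \min_{w}\ \frac{\lambda}{2}\lVert w\rVert^{2}
    + \sum_{i} \Bigl[\max_{y}\bigl(\Delta(y_i, y) + w^{\top}\phi(x_i, y)\bigr)
    - w^{\top}\phi(x_i, y_i)\Bigr]
\end{align*}
```

Each bracketed term is a maximum of functions affine in $w$, so it is convex, and it upper-bounds the task loss of the predicted output for that example. This is the margin-rescaled structured hinge loss.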

Can get probabilistic models too (Conditional Random Fields, CRFs) Boxed algorithms from: Jason Eisner (NLP/ML at JHU), proponent of Empirical Risk Minimization and Probabilistic Models. CRFs (~2000-) had a huge impact on the natural language processing and computer vision communities. Modeling P(y|x) directly, without modeling p(x), p(y) and p(x|y) as intermediate steps, saves much computational and sampling complexity. Some of Ben Taskar’s work (~2004-) addressed max-margin structured prediction with probabilistic models, combining these ideas with the margin on the previous slide.
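For a linear-chain CRF, modeling P(y|x) directly amounts to normalizing the exponentiated sequence score over all label sequences. The sketch below computes that normalizer with the forward algorithm in log space. The score tables `unary` and `pairwise` are assumed to be given (in a real model they would be computed from x and learned weights), and the function names are illustrative.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sequence_score(unary, pairwise, y):
    """Unnormalized log-score of label sequence y."""
    s = sum(unary[t][label] for t, label in enumerate(y))
    return s + sum(pairwise[a][b] for a, b in zip(y, y[1:]))


def log_partition(unary, pairwise):
    """Forward algorithm: log of the sum over all label sequences of exp(score)."""
    n_labels = len(unary[0])
    alpha = list(unary[0])
    for scores in unary[1:]:
        alpha = [scores[cur]
                 + logsumexp([alpha[p] + pairwise[p][cur] for p in range(n_labels)])
                 for cur in range(n_labels)]
    return logsumexp(alpha)


def log_prob(unary, pairwise, y):
    """log P(y | x) = score(y) - log Z(x)."""
    return sequence_score(unary, pairwise, y) - log_partition(unary, pairwise)
```

Training by maximum likelihood would maximize `log_prob` of the observed labelings; the normalizer plays the role that the argmax plays in the margin-based formulation.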

For next class: Read the deep learning tutorial slides linked on the course web page: http://lxmls.it.pt/2014/socher-lxmls.pdf
