Constrained Optimization for Validation-Guided Conditional Random Field Learning Minmin Chen , Yixin Chen , Michael Brent , Aaron Tenney Washington University.


1 Constrained Optimization for Validation-Guided Conditional Random Field Learning. Minmin Chen, Yixin Chen, Michael Brent, Aaron Tenney. Washington University in St. Louis. KDD 2009. Presented by: Qiang Yang, HKUST.

2 Conditional Random Fields. Conditional Random Fields (Lafferty, McCallum & Pereira 2001): a probabilistic model for segmenting and labeling sequential data. [Figure: a labeling sequence (e.g., 1 1 2 ... 1 2) aligned with an observation sequence (e.g., G A T ... C G), connected by features.]
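A linear-chain CRF of this kind scores a label sequence by summing weighted transition features and normalizes over all label paths. A minimal sketch, assuming a hypothetical feature layout where feats[t][(sp, s)] is the feature vector for moving from label sp to label s at position t:

```python
import numpy as np

def crf_score(W, feats, y):
    """Unnormalized log-score of label sequence y: sum of weighted
    transition features along the path."""
    return sum(W @ feats[t][(y[t - 1], y[t])] for t in range(1, len(y)))

def log_partition(W, feats, n_states, T):
    """log Z(x) via the forward algorithm: log-sum-exp over all label paths."""
    alpha = np.zeros(n_states)  # log-forward scores at t = 0
    for t in range(1, T):
        scores = np.array([[alpha[sp] + W @ feats[t][(sp, s)]
                            for sp in range(n_states)] for s in range(n_states)])
        alpha = np.logaddexp.reduce(scores, axis=1)
    return np.logaddexp.reduce(alpha)
```

The conditional log-likelihood of a labeling y is then crf_score(W, feats, y) minus log_partition(W, feats, n_states, T).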

3 Applications. Natural language processing: Lafferty, McCallum & Pereira 2001; Sha & Pereira 2003; Sarawagi & Cohen 2005. Computer vision: Sminchisescu, Kanaujia & Metaxas 2006; Vishwanathan et al. 2006. Bioinformatics: Culotta, Kulp & McCallum 2005; Gross et al. 2007.

4 Challenges and Related Work. Overfitting: likelihood training → overfitting; model flexibility → large feature set → overfitting. Related work: regularization (Sha 2003, Vail 2007); smoothing methods (Newman 1977, Rosenfeld 1996). Regularization with a Gaussian prior works as well as or better than smoothing methods, but the regularization parameter is hard to tune.

5 Motivation of Proposed Work. Cross-validation is often used to estimate the accuracy of a classifier and to select models. Generally, performance on the validation set is strongly correlated with the trained model's performance on unseen data. Constraints prescribing the performance of the trained model on the validation sets are used to guide the learning process and prevent the model from fitting too freely and tightly to the training data.

6 Single Training Multiple Validation (STMV) Framework. [Figure: data layouts for a small data set and a large data set, each split into a training portion, several validation sets V1, V2, V3, and a testing portion.]
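The split in the figure can be sketched as follows. This is an illustrative layout, not the authors' exact partitioning; the fractions and the contiguous slicing are assumptions:

```python
def stmv_split(sequences, n_val=3, val_frac=0.1, test_frac=0.2):
    """Single Training, Multiple Validation (sketch): carve out one test
    portion and n_val disjoint validation sets; the remainder is the
    single training set. Fractions are illustrative assumptions."""
    n = len(sequences)
    n_test = int(n * test_frac)
    n_v = int(n * val_frac)
    test = sequences[:n_test]
    vals = [sequences[n_test + i * n_v : n_test + (i + 1) * n_v]
            for i in range(n_val)]
    train = sequences[n_test + n_val * n_v:]
    return train, vals, test
```

The key difference from k-fold cross-validation is that there is a single training run: all validation sets constrain the same model rather than producing k separately trained models.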

7 Constrained Formulation. Original objective: maximize the log likelihood of the labeling sequence given the observation sequence of the training data. Constraints prescribe the difference between two scores, so that the model takes its performance on the validation sets into consideration: the score of the most likely sequence found by Viterbi under the current model for validation sequence v(j), versus the score of the true labeling sequence for v(j). (The score of a specific labeling sequence y for an observation sequence x is given by the model.)
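The formulation described on this slide can be reconstructed roughly as follows; the tolerance symbols τ_j and the index m are assumptions filled in for illustration, since the transcript does not preserve the exact equation:

```latex
\min_{W}\; -\log p_W\!\left(y \mid x\right)
\quad \text{s.t.} \quad
S\!\left(\hat{y}^{(j)}, v^{(j)}; W\right) - S\!\left(y^{(j)}, v^{(j)}; W\right) \le \tau_j,
\qquad j = 1, \dots, m,
```

where S is the model score, ŷ^(j) is the most likely sequence found by Viterbi under the current model for validation sequence v^(j), and y^(j) is its true labeling. Each constraint caps how far the decoded sequence may outscore the true one on a validation set.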

8 Extended Saddle Point (ESP) Theory. Extended Saddle Point theory (Wah & Chen 2005) introduces a necessary and sufficient condition on the constrained local minimum (CLM) of a constrained nonlinear programming problem in a continuous, discrete, or mixed space. It offers several salient advantages over previous constraint-handling theory: it does not require the constraints to be differentiable or in closed form; it is satisfied over an extended region of penalty values; and the condition is both necessary and sufficient.

9 Extended Saddle Point (ESP) Search Algorithm. Transform the constrained formulation into a penalty form. The outer loop updates the extended penalty values; the inner loop minimizes the penalty function. Challenge: efficient calculation of the gradient of the penalty function. The first term of each constraint is determined by the most likely sequence found by the Viterbi algorithm, and a change of the parameters W can result in a very different decoded sequence → the constraint is non-differentiable.
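The outer/inner loop structure can be sketched as below. This is a generic penalty-based saddle-point search under stated assumptions, not the authors' exact algorithm: the additive penalty update rate, step size, and iteration counts are illustrative, and constraint subgradients are assumed to be supplied by the caller:

```python
def esp_search(grad_f, constraints, W, alpha, step=0.01,
               n_outer=50, n_inner=100, rate=1.0):
    """Extended Saddle Point search (sketch).
    Inner loop: gradient descent on the penalty
        L(W, a) = f(W) + sum_j a_j * max(0, g_j(W));
    outer loop: raise each penalty a_j by its remaining violation.
    constraints(W) returns a list of (violation g_j(W), subgradient of g_j)."""
    for _ in range(n_outer):
        for _ in range(n_inner):
            g = grad_f(W)
            for a_j, (viol, dgdW) in zip(alpha, constraints(W)):
                if viol > 0:          # penalty term is active
                    g = g + a_j * dgdW
            W = W - step * g
        viols = [v for v, _ in constraints(W)]
        if all(v <= 1e-6 for v in viols):
            break
        alpha = [a + rate * max(0.0, v) for a, v in zip(alpha, viols)]
    return W, alpha
```

On a toy problem such as minimizing W² subject to W ≥ 1, the penalties grow until the iterate settles at the constrained optimum W = 1, illustrating the "extended region of penalty values" property: any sufficiently large penalty works.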

10 Approximation of the Discontinuous Gradient. [Figure: trellis diagram showing the highest probability of reaching each state at time t, with transition features f(s1, s2, X).]
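The quantity in the trellis figure, the best score of reaching each state at time t, is what Viterbi decoding computes; the constraint's first term is evaluated on the resulting path. A minimal log-space Viterbi sketch, again assuming a hypothetical feature layout where feats[t][(sp, s)] holds the feature vector for moving from state sp to state s at position t:

```python
import numpy as np

def viterbi(W, feats, n_states, T):
    """Most likely label sequence under weights W (log-space Viterbi)."""
    delta = np.zeros(n_states)            # best log-score ending in each state
    back = []                             # backpointers per time step
    for t in range(1, T):
        scores = np.array([[delta[sp] + W @ feats[t][(sp, s)]
                            for sp in range(n_states)] for s in range(n_states)])
        back.append(scores.argmax(axis=1))
        delta = scores.max(axis=1)
    # trace the best path backwards from the best final state
    y = [int(delta.argmax())]
    for bp in reversed(back):
        y.append(int(bp[y[-1]]))
    return y[::-1]
```

Because argmax is piecewise constant in W, the decoded path (and hence the constraint value) jumps discontinuously as W changes, which is exactly why the gradient must be approximated.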

11 Experimental Results: Gene Prediction. Task: find the protein-coding regions and their associated components. We apply our STMV framework and ESP search algorithm to CONTRAST, a state-of-the-art gene predictor. Data set: fruit fly genome, 27,463 genes, evenly divided into 4 sets. The feature set includes around 33,000 features. We compare performance to the original CONTRAST and to CONTRAST with regularization, using gene-, exon-, and nucleotide-level sensitivity and specificity.

12 Experimental Results: Gene Prediction (cont.). [Table: performance of the original CRF, the regularized CRF (CRFr), and the constrained CRF (CRFc) on CONTRAST.]

13 Experimental Results: Stock Price Prediction. Task: predict whether tomorrow's stock price will rise or fall compared to today's, based on historical data, using preprocessing techniques to smooth out the noise in the raw data. Data set: 1,741 stocks from NASDAQ and NYSE, each containing stock prices from 2002-02-01 to 2007-09-12. The feature set includes 2,000 features. To predict day T+1: training sequence = days 1 to T; validation sequence = days T-V+1 to T (V = 100).
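The day-(T+1) setup above can be sketched as a trailing-window split; note that here the validation window is a suffix of the training range, as the slide specifies:

```python
def sliding_split(prices, T, V=100):
    """Day-(T+1) prediction setup from the slide: train on days 1..T,
    validate on the trailing window of the V most recent days
    (days T-V+1 .. T, 0-indexed internally)."""
    train = prices[:T]          # days 1 .. T
    val = prices[T - V:T]       # days T-V+1 .. T, the most recent V days
    return train, val
```

Using the most recent days as the validation window keeps the constraints focused on the market regime closest to the prediction target.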

14 Experimental Results: Stock Price Prediction (cont.). [Table: results.]

15 Conclusion. A Single Training Multiple Validation (STMV) framework: integrates validation into the training process by modeling validation quality as constraints in the problem formulation, effectively preventing overfitting of CRF models. An approximation scheme: efficiently approximates the discontinuous gradient of our constrained formulation. The Extended Saddle Point (ESP) search algorithm: robustly finds a constrained local minimum of our constrained formulation.

16 Questions? mc15@cec.wustl.edu

