

2 Feature Shaping for Linear SVM Classifiers
George Forman, Martin Scholz, Shyam Rajaram
HP Labs, Palo Alto, CA, USA
© 2008 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

3 Linear SVMs?
In reality:
- High-dimensional data
- Features of varying predictiveness
- Heterogeneous features
- Feature selection is common

4 Example: Useful Non-linear Feature

5 Feature Transformations and SVMs
Change to single feature / Effect:
- Affine transformations: no
- Linear transformation: relative
- Distance between examples: yes
- Non-monotonic transformations: yes

6 Wishlist: Raw Data - Things to Fix
- Detection of irrelevant features
- Appropriate scaling of feature ranges
  − Blood pressure vs. BMI: scale = importance?
- Linear dependence of feature on target
  − FIX: Speeding - death rate doubles every 10 mph
- Monotonic relationships with the target
  − FIX: blood pressure etc. healthy in a specific interval

7 The Transformation Landscape
Raw feature x_i → transformed x_i', ordered by increasing complexity & costs:
- Individual features: Feature Selection (x_i' := w_i x_i, w_i ∈ {0, 1}), Feature Scaling (w_i ∈ R+), …, Feature Shaping
- Feature sets: Non-linear kernels, Feature Construction, Kernel Learning
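A minimal numpy sketch of the per-feature weight view used on this slide: selection restricts w_i to {0, 1}, scaling allows any positive weight. All names and values are illustrative, not taken from the paper.

```python
# Feature selection and scaling as per-feature weights: x_i' := w_i * x_i.
import numpy as np

X = np.array([[1.0, 3.0, 0.5],
              [2.0, 0.0, 1.5]])        # two examples, three raw features

w_select = np.array([1.0, 0.0, 1.0])   # feature selection: w_i in {0, 1}
w_scale  = np.array([0.7, 2.3, 0.1])   # feature scaling:   w_i in R+

X_selected = X * w_select              # second feature is dropped
X_scaled   = X * w_scale               # every feature is reweighted
```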

8 Feature Selection Metrics [Forman CIKM'08]

9 BNS for feature selection [Forman, JMLR'02]
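For reference, Bi-Normal Separation (BNS) scores a binary feature as |F⁻¹(tpr) − F⁻¹(fpr)|, where F⁻¹ is the inverse standard normal CDF. A hedged sketch follows; the clipping constant is an assumption to keep the inverse CDF finite, not a value from the slide.

```python
# BNS = |F^-1(tpr) - F^-1(fpr)|, with F^-1 the inverse standard normal CDF.
from scipy.stats import norm

def bns(tp, fp, pos, neg, eps=0.0005):
    tpr = min(max(tp / pos, eps), 1 - eps)   # true positive rate, clipped
    fpr = min(max(fp / neg, eps), 1 - eps)   # false positive rate, clipped
    return abs(norm.ppf(tpr) - norm.ppf(fpr))

# Example: a word occurring in 40 of 100 positive and 5 of 900 negative documents
print(bns(tp=40, fp=5, pos=100, neg=900))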

10-11 Scaling beats selection [Forman CIKM'08]
(Plots comparing F-measure for BNS scaling of binary features, BNS selection, and IG selection.)

12 Shaping Example

13 Estimating Class Distributions
Input: labeled examples projected to feature x_i
Goal: estimate p_i := P(y | x_i = v)
Large variety of cases:
− Nominal, binary features
− Ordinal features
− Continuous features
Output: p_i : R → [0, 1]
(Compute the blue curve in the plot.)
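As a hedged illustration of this estimation step, the sketch below uses simple quantile binning to estimate P(y=1 | x_i = v) for a continuous feature. The slide covers several feature types and does not prescribe this particular estimator; the helper name and bin count are assumptions.

```python
# Binning sketch for estimating p_i(v) = P(y=1 | x_i = v) from labeled data.
import numpy as np

def estimate_local_probability(x, y, n_bins=10):
    """Return bin edges and the empirical P(y=1) per bin for one feature x."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    bin_idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    p = np.array([y[bin_idx == b].mean() if np.any(bin_idx == b) else 0.5
                  for b in range(n_bins)])
    return edges, p

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = (np.abs(x) < 1).astype(int)          # non-monotonic relationship with the target
edges, p = estimate_local_probability(x, y)
print(p)                                  # high in the middle bins, low at the extremes
```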

14 Reshaping Features
Input: p_i : R → [0, 1]
Goal: make x_i "more linearly dependent"
Local probability (LP) shaper:
− x_i' := p_i(x_i)
− non-monotonic transformation
Monotonic transformations:
− Use rank as new feature value
− Derive values from ROC plots
Output: a function for each i, mapping x_i to x_i'
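A sketch of two of these shapers, building on the binned p_i estimate above: the LP shaper maps each value to its estimated class probability, and a rank transform serves as a simple monotonic alternative. Helper names are hypothetical; the ROC-based variant from the slide is not shown.

```python
import numpy as np
from scipy.stats import rankdata

def lp_shape(x, edges, p):
    """Local-probability shaper: x_i' := p_i(x_i); non-monotonic in general."""
    bin_idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, len(p) - 1)
    return p[bin_idx]

def rank_shape(x):
    """Monotonic shaper: replace each value by its rank, scaled to [0, 1]."""
    return (rankdata(x) - 1) / (len(x) - 1)
```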

15 Coherent Data Processing Blocks
- PDF estimation
- Reshaping Features
- Feature Scaling
- Normalization
- Preserving sparsity

16 Feature Scaling
- Scale of features should reflect importance
- BNS scaling for binary features
- For the continuous case:
  − use the BNS score of the best binary split
- Diffing: scale each feature to [0, |BNS(x_i')|]
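A hedged sketch of this scaling step for a continuous (possibly already shaped) feature, reusing the bns() helper from the slide-9 sketch. Searching candidate thresholds over the unique feature values is an assumption about what "best binary split" means here.

```python
import numpy as np

def bns_scale(x, y):
    """Rescale feature x into [0, |BNS|], where BNS comes from its best binary split."""
    pos, neg = int((y == 1).sum()), int((y == 0).sum())
    best = 0.0
    for t in np.unique(x)[:-1]:                      # candidate split thresholds
        above = x > t
        score = bns(tp=int((above & (y == 1)).sum()),
                    fp=int((above & (y == 0)).sum()),
                    pos=pos, neg=neg)
        best = max(best, score)
    span = x.max() - x.min()
    x01 = (x - x.min()) / span if span > 0 else np.zeros_like(x)
    return x01 * best                                # feature range becomes [0, |BNS|]
```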

17 Normalization
Options tested in our experiments:
- L2 normalization – standard in text mining
- L1 normalization – sparse solutions
- No normalization
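For concreteness, a small sketch of the three per-example normalization options; the function name and signature are illustrative only.

```python
import numpy as np

def normalize_rows(X, norm="l2"):
    """Normalize each example (row) with L1, L2, or no normalization."""
    if norm == "none":
        return X
    p = 1 if norm == "l1" else 2
    lengths = np.linalg.norm(X, ord=p, axis=1, keepdims=True)
    return X / np.where(lengths == 0, 1, lengths)    # avoid division by zero
```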

18 Preserving Sparsity
- Text data usually very sparse
- Substantial impact on complexity
- Discussed transformations: not sparsity-preserving
- Solution:
  − Affine transformation → no effect on SVMs
  − Adapt f_i so that f_i(x_i,m) = 0 if x_i,m is the mode of x_i
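One way to read this adaptation, shown as a hedged sketch: shift each feature's shaper by a constant so that the feature's mode (typically 0 for text counts) maps exactly to 0, keeping transformed vectors sparse while only applying an affine change. Wrapper and argument names are assumptions.

```python
import numpy as np

def sparsity_preserving(f, x_train):
    """Wrap a per-feature shaper f so that the feature's mode maps to 0."""
    values, counts = np.unique(x_train, return_counts=True)
    mode = values[np.argmax(counts)]
    offset = f(np.array([mode]))[0]                  # value the shaper assigns to the mode
    return lambda v: f(v) - offset                   # constant shift only
```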

19 Experiments
Benchmarks:
− Text: news articles, TREC, Web data, …
− UCI: 11 popular datasets, mixed attribute types
− Used as binary classification problems, 50+ positives
Learner:
− Linear SVM (SMO)
− 5x cross-validation to determine C (out of {.01, .1, 1, 10, 100})
− No internal normalization of input
− Logistic scaling activated for output
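A hedged sketch of this learner setup: a linear SVM with C chosen by 5-fold cross-validation over the listed grid. The paper used an SMO-based SVM; scikit-learn's LinearSVC is used here only as a convenient stand-in, and logistic output scaling is omitted.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

def train_linear_svm(X, y):
    """Fit a linear SVM, selecting C by 5-fold cross-validation."""
    grid = GridSearchCV(LinearSVC(),
                        param_grid={"C": [0.01, 0.1, 1, 10, 100]},
                        cv=5)
    grid.fit(X, y)            # no internal normalization of the input here
    return grid.best_estimator_
```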

20 Text: Accuracy vs. training set size

21 UCI data: AUC vs. training set size

22 Overview: All binary UCI tasks

23 Lesion Study on UCI Data
- PDF estimation
- Reshaping Features
- Feature Scaling
- Normalization
- Preserving sparsity

24 Conclusions
- Data representation is crucial in data mining
- "Feature Shaping":
  − expressive, local technique for transforming features
  − generalizes selection and scaling
  − computationally cheap, very practical
  − tuned locally for each feature
- Simplistic implementation → decent improvements
- Case-dependent, smart implementation → ?
Questions?


