A Novel Local Patch Framework for Fixing Supervised Learning Models
Yilei Wang 1, Bingzheng Wei 2, Jun Yan 2, Yang Hu 2, Zhi-Hong Deng 1, Zheng Chen 2
1 Peking University  2 Microsoft Research Asia
Outline Motivation & Background Problem Definition & Algorithm Overview Algorithm Details Experiments - Classification Experiments - Search Ranking Conclusion
Motivation & Background Supervised Learning: the machine learning task of inferring a function from labeled training data. Prediction Error: no matter how strong a learning model is, it will suffer from prediction errors, caused by noise in the training data, dynamically changing data distributions, and weaknesses of the learner. Feedback from Users: a good signal for a learning model to find its limitations and then improve accordingly.
Learning to Fix Errors from Failure Cases Goal: automatically fix model prediction errors using failure cases from feedback data. Input: a well-trained supervised model (which we call the Mother Model) and a collection of failure cases from a feedback dataset. Output: a model that learns to automatically fix the model's bugs from the failure cases. Previous work: model retraining, model aggregation, incremental learning.
Local Patching: from Global to Local Learning models are generally optimized globally, so fixing old prediction errors can introduce new ones. Our key idea: learn to fix the model locally using patches. (Figure: a global fix repairs one failure case but creates a new error elsewhere.)
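The local-patching idea above can be sketched as a routing rule at prediction time: if a test sample falls inside some patch region (within a learned radius of the patch centroid), the patch model handles it; otherwise the Mother Model's behavior is left untouched. This is a minimal illustration, not the paper's exact inference procedure; the `(centroid, radius, patch_model)` triple is a hypothetical representation of a learned patch.

```python
import numpy as np

def predict_with_patches(x, mother_model, patches):
    """Route a sample to a local patch model if it falls inside a
    patch region; otherwise fall back to the global Mother Model.

    `patches` is a list of (centroid, radius, patch_model) triples,
    a hypothetical encoding of the learned patch regions.
    """
    for centroid, radius, patch_model in patches:
        if np.linalg.norm(x - centroid) <= radius:
            return patch_model(x)      # local fix applies
    return mother_model(x)             # global behavior unchanged
```

Because patches only fire inside their regions, repairing the failure cases cannot change predictions for samples far from every patch centroid.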
Problem Definition
Algorithm Overview Failure Case Collection. Learning Patch Regions / Failure Case Clustering: cluster the failure cases into N groups through subspace learning, compute the centroid and range of each group, and use them to define the patches. Learning Patch Models: learn a patch model using only the data samples that are sufficiently close to the patch centroid.
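The region-learning step above can be sketched as follows. Here plain k-means stands in for the paper's subspace clustering, and the range of each patch is taken as the maximum distance from a cluster member to its centroid; both choices are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def learn_patch_regions(failures, n_patches, seed=0):
    """Cluster failure cases into n_patches groups (plain k-means as a
    stand-in for subspace clustering) and return one (centroid, radius)
    pair per cluster; radius = farthest member from the centroid."""
    rng = np.random.default_rng(seed)
    centroids = failures[rng.choice(len(failures), n_patches, replace=False)].copy()
    for _ in range(20):                                   # Lloyd iterations
        dists = np.linalg.norm(failures[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_patches):
            if (labels == k).any():
                centroids[k] = failures[labels == k].mean(axis=0)
    regions = []
    for k in range(n_patches):
        members = failures[labels == k]
        radius = np.linalg.norm(members - centroids[k], axis=1).max()
        regions.append((centroids[k], radius))
    return regions
```

Each returned pair defines one patch region; the patch model for that region is then trained only on samples within the radius.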
Algorithm Details
Learning Patch Region – Key Challenge Failure cases may be distributed diffusely. Small N → large patch ranges → many success cases will be patched. Large N → small patch ranges → high computational complexity. How to make the trade-off?
Solution: Clustered Metric Learning Our solution to the diffusion problem: metric learning. Learn a distance metric, i.e., a subspace, for the failure cases such that similar failure cases aggregate while staying distant from the success cases. (Figure: red circles = failure cases; blue circles = success cases. Left: the cases in the original data space. Middle: the cases mapped to the learned subspace. Right: the failure cases repaired using a single patch.)
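A minimal stand-in for this metric-learning step is sketched below: a diagonal Mahalanobis metric that contracts directions along which the failure cases vary (pulling them together) and stretches directions that separate failures from successes (keeping the two apart). The Fisher-style variance ratio used here is an illustrative assumption, not the paper's actual objective.

```python
import numpy as np

def diagonal_metric(failures, successes, eps=1e-8):
    """Learn per-dimension weights w for a diagonal Mahalanobis metric;
    distance between x and y is ||w * (x - y)||.  Dimensions where
    failures are spread out get small weight (failures aggregate);
    dimensions separating failures from successes get large weight."""
    within = failures.var(axis=0) + eps                        # failure spread
    between = (failures.mean(axis=0) - successes.mean(axis=0)) ** 2
    return np.sqrt(between / within)                           # Fisher-style ratio
```

Under such a metric, diffusely distributed failure cases collapse toward one another, so fewer (and tighter) patches suffice.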
Metric Learning
Clustered Metric Learning
Learning Patch Model
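The patch-model step, learning from only the samples sufficiently close to the patch centroid, can be sketched as below. The neighborhood selection follows the slide; the small from-scratch logistic regression is an assumed patch learner for illustration, and may differ from the paper's choice.

```python
import numpy as np

def fit_patch_model(X, y, centroid, radius, lr=0.5, epochs=300):
    """Keep only training samples within `radius` of the patch centroid,
    then fit a small logistic regression (plain gradient descent) on
    that local neighborhood.  Returns a 0/1 predictor for the patch."""
    mask = np.linalg.norm(X - centroid, axis=1) <= radius
    Xl, yl = X[mask], y[mask]                       # local training set
    Xb = np.hstack([Xl, np.ones((len(Xl), 1))])     # append bias column
    w = np.zeros(X.shape[1] + 1)
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))           # predicted probabilities
        w -= lr * Xb.T @ (p - yl) / len(yl)         # logistic-loss gradient step
    return lambda x: float(1.0 / (1.0 + np.exp(-(np.append(x, 1.0) @ w))) > 0.5)
```

Because distant samples are excluded, the patch model can contradict the Mother Model inside its region without disturbing global behavior.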
Experiments
Experiments - Classification Dataset: 3 randomly selected UCI datasets (Spambase, Waveform, Optical Digit Recognition), converted to binary classification; ~5,000 instances in each dataset, split into 60% training, 20% feedback, 20% test. Baseline algorithms: SVM; Logistic Regression; SVM retrained with training + feedback data; Logistic Regression retrained with training + feedback data; SVM with incremental learning; Logistic Regression with incremental learning.
Classification Accuracy (Charts: classification accuracy on the feedback dataset and on the test dataset for Spambase, Waveform, and Optdigit, comparing SVM, SVM-Retrain, SVM-IL, SVM+LPF, LR, LR-Retrain, LR-IL, and LR+LPF.)
Classification – Case Coverage
Parameter Tuning Number of Patches: the best N is data-sensitive; in our experiments it is 2.
Experiments – Search Ranking Dataset: data from a widely used commercial search engine; ~14,126 pairs with 5-grade labels. Metrics computed at positions {1, 3, 5}. Baseline algorithms: GBDT; GBDT + IL.
Experiment Results – Ranking (Chart: ranking results comparing GBRT, IL, and GBRT + LPF.)
Experiment Results – Ranking (Cont.)
Conclusion We proposed: the local model-fixing problem; a novel patch framework for fixing the failure cases in the feedback dataset from a local view. The experiment results demonstrate the effectiveness of our proposed Local Patch Framework.
Thank you!