Presentation transcript: Knowledge Transfer via Multiple Model Local Structure Mapping — Jing Gao, Wei Fan, Jing Jiang, Jiawei Han
Knowledge Transfer via Multiple Model Local Structure Mapping
Jing Gao, Wei Fan, Jing Jiang, Jiawei Han

Outline: Motivation, Solution Framework, Data Sets, Experiments

Data Sets
- Synthetic data sets
- Spam filtering: public email collection -> personal inboxes (u01, u02, u03) (ECML/PKDD 2006)
- Text classification: same top-level classification problems with different sub-fields in the training and test sets (Newsgroup, Reuters)
- Intrusion detection: two types of intrusions -> a different type of intrusion (KDD Cup'99 data)

Baseline Methods
- Single models: Winnow (WNN), Logistic Regression (LRR), Support Vector Machine (SVM)
- Simple model averaging ensemble (SMA)
- Semi-supervised learning models: Transductive SVM (TSVM)

Motivating Experiment
- Classifier trained on labeled New York Times data, tested on unlabeled New York Times data: 85.5% accuracy
- Classifier trained on labeled Reuters data, tested on unlabeled New York Times data: 64.1% accuracy

Goal
To design learning methods that are aware of the difference between the training and test domains.

Examples
- Spam filtering: public email collection -> personal inboxes
- Intrusion detection: existing types of intrusions -> unknown types of intrusions
- Sentiment analysis: expert review articles -> blog review articles

Related Work
- Sample selection bias correction: reweight training examples or transform the representation
- Transfer learning: adapt the classifier to the new domain
- Multi-task learning: share learning among different tasks

New Problem
Learn from multiple source domains (e.g., Reuters, Newsgroup) and transfer the knowledge to a target domain. Importantly, the target domain does not have any labeled examples (different from some previously proposed methods): the training sets are labeled, while the test set is completely unlabeled.
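The simple model averaging (SMA) baseline above can be sketched in a few lines. The lambda "models" below are toy stand-ins for trained classifiers (not the Winnow/LR/SVM learners from the slides) and are purely illustrative:

```python
# Simple model averaging (SMA) baseline: one base model per source
# domain, combined by uniformly averaging their class-probability
# outputs on a target example.

def sma_predict(models, x):
    """Uniformly average P(y=1|x) over all base models, threshold at 0.5."""
    avg = sum(m(x) for m in models) / len(models)
    return (1 if avg >= 0.5 else 0), avg

# Toy posteriors "learned" on three different source domains.
model_a = lambda x: 0.9 if x > 0.0 else 0.1
model_b = lambda x: 0.8 if x > 0.5 else 0.2
model_c = lambda x: 0.3  # this domain's concept conflicts near x = 1

label, p = sma_predict([model_a, model_b, model_c], 1.0)
# the conflicting model drags the averaged posterior down to ~0.67
```

Because SMA weighs every model equally everywhere, a single source domain whose concept conflicts locally with the target can pull the ensemble toward the wrong class — the failure mode that motivates per-example weighting.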
Framework
[Figure: training sets 1..k, drawn from partially overlapping source domains with conflicting concepts, each yield a base model C1, C2, ..., Ck; on a test example x the models output class probabilities such as 0.9/0.1, 0.4/0.6, 0.8/0.2, and a model consistent with the local structure around x receives a higher weight.]

Take-Away Messages
- Transfer learning: performance degrades when moving from the ideal setting to the realistic setting.
- Transfer from multiple domains: illustrated on a synthetic example.

Goal
To unify the knowledge that is consistent with the test domain from multiple source domains.

Observations
- Each base model may be effective on a subset of the test domain.
- It is hard to select the optimal model, since class labels in the test domain are unknown.

Locally Weighted Ensemble (LWE): Determining the Weights
The optimal weights could be obtained from a regression problem if the true labels were known — but the ground-truth labeling f is unknown!

Approximate Optimal Weights
Assumption: test examples that are closer in the feature space are more likely to share the same class label.

Graph-Based Heuristic
- Map the structures of a model onto the structures of the test domain.
- Weight each model locally according to its consistency with the neighborhood structure around the test example: the more consistent, the higher the weight.

Local Structure Based Adjustment
What if no model is similar to the clustering structure at x? This simply means that the training information conflicts with the true target distribution at x. Solution: ignore the training information and propagate the labels of x's neighbors in the test set to x.

Experiments on Synthetic Data
LWE beats the baselines in terms of prediction accuracy!
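The locally weighted combination and the label-propagation fallback described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the per-example weights are taken as given (their graph-based computation is a separate step), and the threshold `delta` and toy models are assumptions:

```python
def lwe_predict(models, weights_at_x, x, neighbor_preds=(), delta=0.5):
    """Locally weighted ensemble prediction at one test example x.

    weights_at_x[k] measures how consistent model k is with the local
    structure around x.  If no model is locally consistent (all raw
    weights fall below delta), the training information is ignored and
    the labels already assigned to x's test-set neighbours are
    propagated instead.
    """
    if max(weights_at_x) < delta:
        p = sum(neighbor_preds) / len(neighbor_preds)  # label propagation
    else:
        total = sum(weights_at_x)
        p = sum((w / total) * m(x) for w, m in zip(weights_at_x, models))
    return (1 if p >= 0.5 else 0), p

# Two toy base models; near this x, model m1 fits the local structure
# well (weight 0.9) and dominates the combination:
m1 = lambda x: 0.8
m2 = lambda x: 0.3
label, p = lwe_predict([m1, m2], [0.9, 0.1], x=0.0)   # p = 0.75

# No model fits locally -> fall back to the neighbours' labels:
label2, p2 = lwe_predict([m1, m2], [0.1, 0.2], x=0.0,
                         neighbor_preds=(1, 1, 0))
```

Normalizing the weights per example is what makes the ensemble *locally* weighted: the same model can dominate in one region of the test domain and be ignored in another.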
Experiments on Text Data / Experiments on Intrusion Data / Parameter Sensitivity
[Result tables and sensitivity plots not reproduced in this transcript.]

Conclusions
- The locally weighted ensemble framework transfers useful knowledge from multiple source domains.
- The graph-based heuristic makes the framework practical and effective: the weight of a model is proportional to the similarity between its neighborhood graph and the clustering structure around x.

Notes: code and datasets are available at http://ews.uiuc.edu/~jinggao3/kdd08transfer.htm
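One way to realize that graph-similarity weighting is sketched below, assuming a simple kNN neighborhood over 1-D test features, precomputed cluster assignments, and Jaccard overlap as the similarity measure — the paper's actual graph construction, clustering, and similarity may differ:

```python
def local_weight(model_preds, cluster_ids, test_x, idx, k=3):
    """Heuristic weight of one model at test point idx: the Jaccard
    overlap between the model's neighbourhood graph (neighbours it
    predicts into idx's class) and the clustering structure of the
    test set (neighbours lying in idx's cluster)."""
    order = sorted((j for j in range(len(test_x)) if j != idx),
                   key=lambda j: abs(test_x[j] - test_x[idx]))
    nbrs = order[:k]  # k nearest neighbours of idx in the test set
    same_pred = {j for j in nbrs if model_preds[j] == model_preds[idx]}
    same_cluster = {j for j in nbrs if cluster_ids[j] == cluster_ids[idx]}
    union = same_pred | same_cluster
    return len(same_pred & same_cluster) / len(union) if union else 1.0

# A test set with two obvious clusters.  A model whose predictions
# respect the clustering around point 0 gets weight 1.0; a model that
# cuts across it gets weight 0.0.
X = [0.0, 0.1, 0.2, 5.0, 5.1]
clusters = [0, 0, 0, 1, 1]
good_model = [1, 1, 1, 0, 0]   # predicted labels on the test set
bad_model = [1, 0, 0, 0, 0]
```

Low weights for *every* model at some x are exactly the trigger for the label-propagation fallback of the local structure based adjustment.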