A Simple Probabilistic Approach to Learning from Positive and Unlabeled Examples Dell Zhang (BBK) and Wee Sun Lee (NUS)


1 A Simple Probabilistic Approach to Learning from Positive and Unlabeled Examples Dell Zhang (BBK) and Wee Sun Lee (NUS)

2 Problem Supervised Learning

3 Problem Semi-Supervised Learning

4 Problem PU Learning

5 Problem Unlabeled Examples Help

6 Problem PU Learning
To distinguish the interesting instances (the positive class C+) from other instances (the negative class C-) by learning a classifier from a set of positive examples P and a set of unlabeled examples U.
There is no labeled negative example!
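The PU setting above can be simulated from a fully labeled corpus by hiding most of the labels. The following is a minimal sketch, not the authors' experimental protocol: the function name `make_pu_split` and the parameter `labeled_fraction` are hypothetical, and sampling the labeled positives uniformly at random is an assumption.

```python
import random


def make_pu_split(examples, labels, labeled_fraction, seed=0):
    """Build a positive set P and an unlabeled set U from labeled data.

    Assumption (not from the slides): a random fraction of the true
    positives keeps its label and forms P; every other example, positive
    or negative, goes into U with its label hidden.
    """
    rng = random.Random(seed)
    positives = [x for x, y in zip(examples, labels) if y == 1]
    negatives = [x for x, y in zip(examples, labels) if y != 1]

    rng.shuffle(positives)
    n_labeled = int(round(labeled_fraction * len(positives)))

    P = positives[:n_labeled]              # labeled positive examples
    U = positives[n_labeled:] + negatives  # hidden positives mixed with negatives
    rng.shuffle(U)
    return P, U


if __name__ == "__main__":
    docs = list(range(10))          # stand-ins for documents
    truth = [1] * 4 + [0] * 6       # 4 positives, 6 negatives
    P, U = make_pu_split(docs, truth, labeled_fraction=0.5)
    print(len(P), len(U))
```

Note that U still contains positive examples, which is why it cannot simply be treated as the negative class.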

7 Applications
To automatically filter web pages according to a user's preference:
  the browsed or bookmarked pages can be used as positive examples, while unlabeled examples can be easily collected from the web.
To automatically find machine learning literature:
  the ICML papers can be used as positive examples, while unlabeled examples can be easily collected from the ACM or IEEE digital library.
To automatically identify cancer patients:
  the patients known to have cancer can be used as positive examples, while unlabeled examples can be easily collected from the patient database.
To automatically discover future customers for direct marketing:
  the current customers of the company can be used as positive examples, while unlabeled examples can be purchased at a low cost compared with obtaining negative examples.
……

8 Approaches
Existing Approaches
  PNB (Denis et al. 2002); PNCT (Denis et al. 2003)
  S-EM (Liu et al. 2002); RC-SVM (Li & Liu 2003)
  PEBL (Yu et al. 2004); SVMC (Yu 2005)
  PN-SVM (Fung et al. 2005)
  W-LR (Lee & Liu 2003); B-SVM (Liu et al. 2003)
Our Proposed Approach
  B-Pr

9 Our Approach A Probabilistic Model

10 Our Approach

11 Biased PrTFIDF (B-Pr) Estimate PrTFIDF (Joachims 1997) Estimate Maximize On a held-out validation set (Lee & Liu 2003) Linear Time Complexity!
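The formulas on this slide did not survive extraction, so the following is only a sketch of how a parameter might be tuned on a held-out validation set without negative labels. It assumes the criterion is the one usually attributed to Lee & Liu (2003), r^2 / Pr[f(X)=1], where r is the recall measured on held-out positive examples and Pr[f(X)=1] is the fraction of held-out examples predicted positive. The function names, the use of a decision threshold as the tuned parameter, and the candidate grid are all hypothetical and are not taken from the slides.

```python
def pu_validation_score(pred_on_positives, pred_on_unlabeled):
    """Return recall**2 / Pr[f(X)=1], estimated without negative labels.

    pred_on_positives: booleans, classifier output on held-out positives.
    pred_on_unlabeled: booleans, classifier output on held-out unlabeled data.
    """
    if not pred_on_positives or not pred_on_unlabeled:
        raise ValueError("both validation sets must be non-empty")
    recall = sum(pred_on_positives) / len(pred_on_positives)
    frac_predicted_positive = sum(pred_on_unlabeled) / len(pred_on_unlabeled)
    if frac_predicted_positive == 0:
        return 0.0  # nothing predicted positive: treat the score as zero
    return recall ** 2 / frac_predicted_positive


def select_threshold(scores_pos, scores_unl, candidates):
    """Pick the candidate threshold with the highest validation score.

    Assumption: the classifier yields a real-valued score per example and
    predicts positive when the score exceeds the threshold.
    """
    def score_for(t):
        return pu_validation_score(
            [s > t for s in scores_pos],
            [s > t for s in scores_unl],
        )
    return max(candidates, key=score_for)


if __name__ == "__main__":
    scores_pos = [0.9, 0.8, 0.4]        # made-up scores on held-out positives
    scores_unl = [0.7, 0.2, 0.1, 0.3]   # made-up scores on held-out unlabeled data
    print(select_threshold(scores_pos, scores_unl, [0.0, 0.5, 0.85]))
```

The criterion rewards high recall on known positives while penalizing classifiers that label most of the unlabeled set positive, which is what makes it usable when no negative examples are available.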

12 Experiments Reuters
B-Pr > RC-SVM > PEBL (p=0.55)
RC-SVM > B-Pr > PEBL (p=0.85)

13 Experiments 20NewsGroups
B-Pr > W-LR > S-EM (p=0.3)
B-Pr > W-LR > S-EM (p=0.7)

14 Conclusion
A New Approach to Learning from Positive and Unlabeled Examples
As effective as the state-of-the-art approaches
Yet simpler and faster

15 Thank you Questions? Comments? Suggestions? ……

