
1 Support Cluster Machine
Paper from ICML 2007, read by Haiqin Yang, 2007-10-18.
This paper, "Support Cluster Machine", was written by Bin Li, Mingmin Chi, Jianping Fan, and Xiangyang Xue, and published at ICML in 2007.

2 Outline
Background and Motivation
Support Cluster Machine (SCM)
Kernel in SCM
Experiments
An Interesting Application: Privacy-preserving Data Mining
Discussions

3 Background and Motivation
Large-scale classification problem
Decomposition methods
- Osuna et al., 1997
- Joachims, 1999
- Platt, 1999
- Collobert & Bengio, 2001
- Keerthi et al., 2001
Incremental algorithms
- Cauwenberghs & Poggio, 2000
- Fung & Mangasarian, 2002
- Laskov et al., 2006
Parallel techniques
- Collobert et al., 2001
- Graf et al., 2004
Approximate formulations
- Fung & Mangasarian, 2001
- Lee & Mangasarian, 2001
Choose representatives
- Active learning - Schohn & Cohn, 2003
- Cluster-Based SVM - Yu et al., 2003
- Core Vector Machine (CVM) - Tsang et al., 2005
- Clustering SVM - Boley & Cao, 2004

4 Support Cluster Machine - SCM
Given training samples
Procedure
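The formulas on this slide did not survive the transcript. As a hedged sketch consistent with the paper's setup: each class is first clustered into Gaussian components, and the SVM primal is then written over those components rather than over individual samples (here p_i denotes the i-th component, y_i its class label, and C the usual regularization constant):

```latex
\min_{w,\,b,\,\xi}\; \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{N}\xi_i
\qquad \text{s.t.}\quad y_i\big(\langle w,\phi(p_i)\rangle + b\big) \ge 1-\xi_i,\quad \xi_i \ge 0,
```

where \phi maps a Gaussian component p_i = \mathcal{N}(\mu_i,\Sigma_i) into the feature space induced by the probability product kernel of the later slides.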

5 SCM Solution
Dual representation
Decision function
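The dual representation and the decision function were likewise lost in extraction. Assuming the standard soft-margin SVM form with clusters as training units, a sketch is:

```latex
\max_{\alpha}\; \sum_i \alpha_i - \frac{1}{2}\sum_{i,j}\alpha_i\alpha_j\, y_i y_j\, K(p_i,p_j)
\qquad \text{s.t.}\quad 0 \le \alpha_i \le C,\quad \sum_i \alpha_i y_i = 0,
\\[1ex]
f(x) = \operatorname{sign}\Big(\sum_i \alpha_i y_i\, K(p_i,\delta_x) + b\Big),
```

where a test vector x is treated as a degenerate Gaussian \delta_x with zero covariance, which matches the later remark that training units are generative models while testing units are vectors.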

6 Kernel
Probability product kernel
Under the Gaussian assumption, the kernel has a closed form.
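The closed form itself is missing from the transcript, but for two Gaussian densities the probability product kernel with power 1 is known to reduce to one Gaussian evaluated at the other's mean: K(p, p') = ∫ p(x) p'(x) dx = N(μ; μ', Σ + Σ'). A minimal univariate sketch (all numbers are illustrative) checks the closed form against numerical integration:

```python
import numpy as np

def gauss_pdf(x, mu, var):
    # Univariate Gaussian density N(x; mu, var)
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def ppk_gaussian(mu1, var1, mu2, var2):
    # Closed form of the power-1 probability product kernel:
    # integral of N(x; mu1, var1) * N(x; mu2, var2) dx = N(mu1; mu2, var1 + var2)
    return gauss_pdf(mu1, mu2, var1 + var2)

# Numerical check on a fine grid
xs = np.linspace(-50.0, 50.0, 200001)
dx = xs[1] - xs[0]
numeric = np.sum(gauss_pdf(xs, 0.0, 1.0) * gauss_pdf(xs, 2.0, 3.0)) * dx
closed = ppk_gaussian(0.0, 1.0, 2.0, 3.0)
```

The same identity extends to full multivariate Gaussians, which is what makes the kernel cheap to evaluate between clusters.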

7 Kernel
Property I
Decision function
Property II

8 Experiments
Datasets
- Toy data
- MNIST - handwritten digit ('0'-'9') classification
- Adult - privacy-preserving dataset
Clustering algorithms
- Threshold Order Dependent (TOD)
- EM algorithm
Classification methods
- libSVM
- SVMTorch
- SVMlight
- CVM (Core Vector Machine)
- SCM
Model selection
CPU: 3.0 GHz

9 Toy Data
Samples: 2,500 samples per class, generated from a mixture of Gaussian distributions
Clustering algorithm: TOD
Clustering results: 25 positive clusters, 25 negative clusters
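The slide's figure is not reproduced here; as a hedged sketch of how such toy data could be generated (the mixture parameters below are made up for illustration, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_gmm(means, covs, weights, n, rng):
    """Draw n samples from a 2-D Gaussian mixture."""
    comps = rng.choice(len(weights), size=n, p=weights)
    return np.array([rng.multivariate_normal(means[k], covs[k]) for k in comps])

# Two illustrative 2-component mixtures, one per class
pos = sample_gmm([[0.0, 0.0], [3.0, 3.0]], [np.eye(2), np.eye(2)], [0.5, 0.5], 2500, rng)
neg = sample_gmm([[6.0, 0.0], [9.0, 3.0]], [np.eye(2), np.eye(2)], [0.5, 0.5], 2500, rng)
```

A clustering step (TOD or EM in the paper) would then compress each class's 2,500 points down to its cluster-level Gaussians before training.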

10 MNIST
Data description
- 10 classes: handwritten digits '0'-'9'
- Training samples: 60,000, about 6,000 per class
- Testing samples: 10,000
Construct 45 binary classifiers (one per pair of classes)
Results
- 25 clusters for the EM algorithm

11 MNIST
Test results for the TOD algorithm

12 Privacy-preserving Data Mining
Inter-enterprise data mining
- Problem: two parties owning confidential databases wish to build a decision-tree classifier on the union of their databases, without revealing any unnecessary information.
Horizontally partitioned
- Records (users) split across companies
- Example: credit-card fraud detection model
Vertically partitioned
- Attributes split across companies
- Example: associations across websites

13 Privacy-preserving Data Mining
Randomization approach (diagram): original records (e.g., Age | Salary: 50 | 40K, 30 | 70K, ...) pass through a randomizer, producing perturbed records (e.g., 65 | 20K, 25 | 60K, ...); the distributions of Age and Salary are then reconstructed from the perturbed data and fed to the data mining algorithms to build the model.
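A minimal sketch of the randomization step, assuming zero-mean additive noise on the Age attribute (the full reconstruction in the literature uses an iterative Bayesian procedure; here only the preserved aggregate mean is illustrated, and all values are synthetic):

```python
import numpy as np

rng = np.random.default_rng(1)

ages = rng.integers(18, 70, size=10000).astype(float)  # "true" ages (synthetic)
noise = rng.normal(0.0, 15.0, size=ages.size)          # zero-mean perturbation
randomized = ages + noise                              # what the miner receives

# Individual values are hidden, but because the noise has zero mean,
# aggregate statistics of the population survive the perturbation:
est_mean = randomized.mean()
```

This is why the miner can still recover the attribute's distribution well enough to train a model without ever seeing the true records.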

14 Classification Example

15 Privacy-preserving Dataset: Adult
Data description
- Training samples: 30,162
- Testing samples: 15,060
- Percentage of positive samples: 24.78%
Procedure
- Horizontally partition the data into three subsets (parties)
- Cluster each subset with the TOD algorithm
- Obtain three positive and three negative GMMs
- Combine them into one positive and one negative GMM with modified priors
- Classify with SCM
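The combination step above can be sketched as follows; `combine_gmms` and its argument layout are hypothetical, assuming "modified priors" means each party's component weights are rescaled by that party's share of the total sample count:

```python
import numpy as np

def combine_gmms(gmms, counts):
    """Merge per-party GMMs (each a (weights, means, covs) triple) into one GMM.

    Component priors are rescaled by each party's share of the total sample
    count, so the merged weights still sum to 1.
    """
    total = float(sum(counts))
    weights, means, covs = [], [], []
    for (w, mu, cov), n in zip(gmms, counts):
        weights.extend(np.asarray(w) * (n / total))
        means.extend(mu)
        covs.extend(cov)
    return np.array(weights), np.array(means), np.array(covs)

# Three hypothetical parties, each holding a 2-component 1-D GMM
g1 = ([0.5, 0.5], [[0.0], [1.0]], [[[1.0]], [[1.0]]])
g2 = ([0.7, 0.3], [[2.0], [3.0]], [[[1.0]], [[1.0]]])
g3 = ([0.4, 0.6], [[4.0], [5.0]], [[[1.0]], [[1.0]]])
w, m, c = combine_gmms([g1, g2, g3], counts=[100, 200, 100])
```

Only cluster-level statistics (weights, means, covariances) cross party boundaries, which is what hides individual records.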

16 Privacy-preserving Dataset: Adult
Partition results
Experimental results

17 Discussions
Solved problems
- Large-scale problems: downsample by clustering + classifier
- Privacy-preserving problems: hide individual information
Differences from other methods
- Training units are generative models; testing units are vectors
- Training units contain complete statistical information
- Only one parameter for model selection
- Easy implementation
- Generalization ability is not clear, whereas the RBF kernel in SVM has the property that a larger width leads to a lower VC dimension.

18 Discussions
Advantages of using priors and covariances

19 Thank you!

