
1 ICONIP 2005 Improve Naïve Bayesian Classifier by Discriminative Training Kaizhu Huang, Zhangbing Zhou, Irwin King, Michael R. Lyu Oct. 2005

2 ICONIP 2005 Outline
Background
– Classifiers
  » Discriminative classifiers: Support Vector Machines
  » Generative classifiers: Naïve Bayesian Classifiers
Motivation
Discriminative Naïve Bayesian Classifier
Experiments
Discussion
Conclusion

3 ICONIP 2005 Background
Discriminative Classifiers
– Directly maximize a discriminative function or posterior function
– Example: Support Vector Machines (SVM)

4 ICONIP 2005 Background
Generative Classifiers
– Model the joint distribution over features for each class, $P(x \mid C)$, and then use Bayes rule to construct the posterior classifier $P(C \mid x)$, where $C$ is the class label and $x$ the feature vector:
  $P(C \mid x) = P(x \mid C)\,P(C)\,/\,P(x) \propto P(x \mid C)\,P(C)$, since $P(x)$ is constant w.r.t. $C$.
– Example: Naïve Bayesian Classifiers
  » Model the distribution for each class under the assumption that each feature is independent of the other features given the class label. Combining the assumption with Bayes rule gives
    $P(C \mid x) \propto P(C)\,\prod_i P(x_i \mid C)$.
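To make the decision rule concrete, here is a minimal sketch (not from the slides; the function name, array layout, and toy numbers are our own) of NB classification with discrete features:

```python
import numpy as np

def nb_log_posterior(x, class_prior, cond_prob):
    """Unnormalized log-posterior: log P(C) + sum_i log P(x_i | C).

    class_prior: shape (n_classes,), the priors P(C)
    cond_prob:   shape (n_classes, n_features, n_values),
                 cond_prob[c, i, v] = P(x_i = v | C = c)
    x:           integer feature vector of length n_features
    """
    scores = np.log(class_prior).copy()
    n_classes, n_features, _ = cond_prob.shape
    for c in range(n_classes):
        for i in range(n_features):
            scores[c] += np.log(cond_prob[c, i, x[i]])
    return scores  # argmax over classes gives the NB prediction

# Toy usage: 2 classes, 2 binary features
prior = np.array([0.5, 0.5])
cond = np.array([[[0.8, 0.2], [0.3, 0.7]],   # class 0: P(x_0|C=0), P(x_1|C=0)
                 [[0.4, 0.6], [0.9, 0.1]]])  # class 1
print(nb_log_posterior(np.array([0, 1]), prior, cond))
```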

5 ICONIP 2005 Background
Comparison
[Figure: example of missing information. From left to right: original digit, 50% missing digit, 75% missing digit, and occluded digit.]

6 ICONIP 2005 Background
Why are generative classifiers not as accurate as discriminative classifiers?
[Diagram: scheme for generative classifiers in a two-category classification task — the training set is split into subset D1 (labeled Class 1) and subset D2 (labeled Class 2); distribution P1 is estimated to approximate D1 and P2 to approximate D2; Bayes rule is then constructed for classification.]
1. It is incomplete for generative classifiers to approximate only the within-class information.
2. The inter-class discriminative information is discarded, yet it is needed!

7 ICONIP 2005 Background
Why are generative classifiers superior to discriminative classifiers in handling missing-information problems?
– SVM lacks the ability to classify under such uncertainty.
– NB can conduct inference under uncertainty using the estimated distribution: let $A$ be the feature set and $T \subseteq A$ the missing subset, so $A - T$ are the known features. The missing features are marginalized out, leaving
  $P(C \mid x_{A-T}) \propto P(C)\,\prod_{i \in A-T} P(x_i \mid C)$.
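Because each conditional distribution sums to one, marginalizing a missing feature out of the NB joint simply removes its factor. A sketch continuing `nb_log_posterior` above (the set `missing` plays the role of $T$):

```python
def nb_log_posterior_missing(x, class_prior, cond_prob, missing):
    """Like nb_log_posterior, but marginalizes out the features in `missing`
    (the subset T): since sum_v P(x_i = v | C) = 1, those factors vanish and
    only the known features A - T contribute to the score."""
    scores = np.log(class_prior).copy()
    n_classes, n_features, _ = cond_prob.shape
    for c in range(n_classes):
        for i in range(n_features):
            if i not in missing:
                scores[c] += np.log(cond_prob[c, i, x[i]])
    return scores
```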

8 ICONIP 2005 Motivation
– It seems that a good classifier should combine the strategies of discriminative and generative classifiers.
– Our work trains one of the generative classifiers, the Naïve Bayesian Classifier, in a discriminative way.

9 ICONIP 2005 Discriminative Naïve Bayesian Classifier
[Diagram: working scheme of the Naïve Bayesian Classifier — the training set is split into sub-set D1 (labeled Class 1) and sub-set D2 (labeled Class 2); the distributions P1 and P2 are estimated to approximate D1 and D2 separately; Bayes rule is then used for classification. Interaction between the two estimation steps is needed!]
Mathematical explanation of the Naïve Bayesian Classifier: training is a constrained maximum-likelihood problem, easily solved by the Lagrange multiplier method.
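The slide's formula did not survive extraction; the constrained maximum-likelihood problem it refers to is the standard one for NB, which in our notation reads:

```latex
\max_{\theta}\;\sum_{n=1}^{N}\Big(\log P(C^{(n)}) + \sum_{i}\log P\big(x_i^{(n)} \mid C^{(n)}\big)\Big)
\quad\text{s.t.}\quad \sum_{v} P(x_i = v \mid C) = 1 \;\;\forall\, i, C .
```

Attaching one Lagrange multiplier per normalization constraint and setting the derivatives to zero yields the familiar frequency estimates, e.g. $P(x_i = v \mid C) = \#\{x_i = v,\, C\} \,/\, \#\{C\}$.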

10 ICONIP 2005 Discriminative Naïve Bayesian Classifier (DNB)
Optimization function of DNB
– On one hand, minimizing this function tries to approximate the dataset as accurately as possible; on the other hand, it also tries to enlarge the divergence between classes (the divergence term).
– Because the optimization works directly on the joint distribution, DNB inherits the ability of NB to handle missing-information problems.
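The slide's equation was also lost in extraction. Schematically (our notation, not necessarily the paper's exact form), an objective with the two described effects looks like:

```latex
\min_{P_1, P_2}\;\;
\underbrace{-\sum_{x \in D_1}\log P_1(x) \;-\; \sum_{x \in D_2}\log P_2(x)}_{\text{approximate each class's data}}
\;-\; \lambda\, \underbrace{\mathrm{Div}(P_1, P_2)}_{\text{divergence term}}
```

where $\mathrm{Div}$ measures the separation between the two class-conditional joint distributions and $\lambda$ trades off data fit against inter-class separation.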

11 ICONIP 2005 Discriminative Naïve Bayesian Classifier (DNB)
Complete optimization problem
– A nonlinear optimization problem under linear constraints.
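The linear constraints are the usual simplex conditions on the NB parameters (our rendering; the slide's equation block was lost):

```latex
\sum_{v} P(x_i = v \mid C) = 1, \quad P(x_i = v \mid C) \ge 0, \qquad
\sum_{C} P(C) = 1, \quad P(C) \ge 0 .
```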

12 ICONIP 2005 Discriminative Naïve Bayesian Classifier (DNB)
Solving the optimization problem
– Using Rosen's gradient projection method (a sketch follows).
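A minimal sketch (our own, not the authors' Matlab code) of one Rosen gradient-projection step for a problem of this shape, handling only the linear equality constraints:

```python
import numpy as np

def rosen_step(theta, grad, A, lr=0.1):
    """One Rosen gradient-projection step for min f(theta) s.t. A theta = b.

    The negative gradient is projected onto the null space of the equality
    constraints, so the iterate stays on the constraint surface.  (A full
    implementation would also add active inequality constraints, e.g.
    theta >= 0, as rows of A; this sketch handles only equalities.)
    """
    P = np.eye(len(theta)) - A.T @ np.linalg.inv(A @ A.T) @ A  # projection matrix
    d = -P @ grad                                              # feasible descent direction
    return theta + lr * d

# Toy usage: keep a probability vector on the simplex x1 + x2 + x3 = 1.
A = np.ones((1, 3))
theta = np.array([1/3, 1/3, 1/3])
grad = np.array([0.5, -0.2, -0.3])   # stand-in for the DNB objective's gradient
theta = rosen_step(theta, grad, A)
print(theta, theta.sum())            # the sum stays 1
```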

13 ICONIP 2005 Discriminative Naïve Bayesian Classifier (DNB)
Gradient and projection matrix
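The matrix expressions on this slide were garbled by extraction; the standard Rosen projection for active linear constraints $A\theta = b$ (consistent with the step sketched above) is:

```latex
P \;=\; I - A^{\top}\big(A A^{\top}\big)^{-1} A, \qquad d \;=\; -\,P\,\nabla f(\theta),
```

so the search direction $d$ is the negative gradient projected onto the constraint surface.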

14 ICONIP 2005 Extension to Multi-category Classification problems

15 ICONIP 2005 Experimental results
Experimental setup
– Datasets
  » 4 benchmark datasets from the UCI machine learning repository
– Experimental environment
  » Platform: Windows 2000
  » Development tool: Matlab 6.5

16 ICONIP 2005 Without information missing
Observations
– DNB outperforms NB on every dataset.
– Compared with SVM, DNB wins on 2 datasets and loses on the other 2.
– SVM outperforms DNB on Segment and Satimage.

17 ICONIP 2005 With information missing
Scheme
– DNB conducts inference via the marginalization rule, Eq. (5), when information is missing.
– SVM sets the missing features to 0 (the default way to process unknown features in LIBSVM).
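A small illustration of the two schemes (hypothetical data; LIBSVM's sparse input format simply treats absent features as zeros):

```python
import numpy as np

x = np.array([2.0, np.nan, 0.7, np.nan])            # features 1 and 3 are missing
missing = {i for i, v in enumerate(x) if np.isnan(v)}

# SVM scheme: impute zeros for the missing entries before classifying.
x_svm = np.where(np.isnan(x), 0.0, x)

# DNB scheme: keep the index set T of missing features and let inference
# marginalize them out (cf. nb_log_posterior_missing above).
print(x_svm, missing)
```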

18 ICONIP 2005 With information missing
Setup: randomly discard features, gradually from a small percentage to a large percentage.
[Figure: error rate on Iris with missing information.]
[Figure: error rate on Vote with missing information.]

19 ICONIP 2005 With information missing
[Figure: error rate on Satimage with missing information.]
[Figure: error rate on DNA with missing information.]

20 ICONIP 2005 Summary of Experiment Results
Observations
– NB demonstrates a robust ability to handle missing-information problems.
– DNB inherits NB's ability to handle missing information while achieving higher classification accuracy than NB.
– SVM cannot deal with missing-information problems easily.

21 ICONIP 2005 Discussion
Can DNB be extended to a general Bayesian Network (BN) classifier?
– The structure learning problem becomes involved: direct application of DNB encounters difficulties, since the structure is not fixed in unrestricted BNs.
– Finding the optimal general Bayesian Network classifier is an NP-complete problem.
– Discriminative training of constrained Bayesian Network classifiers is, however, possible…

22 ICONIP 2005 Conclusion
We develop a novel model named the Discriminative Naïve Bayesian Classifier.
– It outperforms the Naïve Bayesian Classifier when no information is missing.
– It outperforms SVMs in handling missing-information problems.

