1
Classification with reject option in gene expression data. Blaise Hanczar and Edward R. Dougherty. Bioinformatics, Vol. 24, no. 17, 2008, pages 1889-1895.
2
Outline: Introduction; Theory of classification with reject option; Implementation for bioinformatics; Result and Discussion
3
Introduction Our method consists of defining a rejection region, which contains the examples whose classification is ambiguous. The classifier rejects these examples instead of assigning them a class. The accuracy of the classifier thereby becomes a user-defined parameter of the classification rule. The task of the classification rule is to minimize the rejection region under the constraint that the error rate of the classifier be bounded by the chosen target error.
4
Theory of classification with reject option A classifier is a function which divides the feature space into two regions, R1 and R2. Error rate: Accuracy of a classifier:
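One standard way to write these two quantities, assuming a classifier f, a true label y, and the symbol ε for the error rate (the symbols are assumptions for illustration and are not taken from the slide), is:

```latex
\varepsilon(f) = P\bigl(f(x) \neq y\bigr), \qquad \mathrm{Acc}(f) = 1 - \varepsilon(f)
```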
5
Theory of classification with reject option Bayes classifier: Rejection region R_reject: Rejection rate: Acceptance rate:
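As a rough illustration of a rejection region for a two-class Bayes classifier, the sketch below uses Chow's rule, a standard formulation that rejects an example when its largest posterior probability falls below 1 - t. The threshold t, the function names and the use of None to mark a rejection are assumptions made for illustration, not notation from the slides.

```python
def bayes_reject_decision(p_c1, t):
    """Chow-style rule for two classes.

    p_c1 is the posterior probability of class C1; the example is
    rejected (None) when the largest posterior is below 1 - t.
    """
    p_max = max(p_c1, 1.0 - p_c1)
    if p_max < 1.0 - t:
        return None  # falls in the rejection region
    return "C1" if p_c1 >= 0.5 else "C2"


def rejection_and_acceptance_rates(posteriors, t):
    """Empirical rejection rate and acceptance rate on a sample."""
    decisions = [bayes_reject_decision(p, t) for p in posteriors]
    rejected = sum(d is None for d in decisions)
    rejection_rate = rejected / len(decisions)
    return rejection_rate, 1.0 - rejection_rate
```

With t = 0.5 nothing is rejected, since the largest of two posteriors is never below 0.5; smaller values of t widen the rejection region.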
6
Theory of classification with reject option Two types of error: – –Conditional error : Basic properties:
7
Theory of classification with reject option The error rate decreases monotonically as the rejection rate increases. Different thresholds for each class: –Our approach: Given the conditional error rate of the classifier for each class, design a classifier with the smallest rejection rate.
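The trade-off between error rate and rejection rate can be checked empirically with a small sketch. It assumes a real-valued score per example where low values favour C1 and high values favour C2, two thresholds t1 <= t2 bounding the rejected interval, and labels coded as 1 and 2. All names here are hypothetical. The error rate counts accepted-but-misclassified examples over the whole sample, so it cannot increase when the rejected interval is widened.

```python
def error_and_rejection_rates(scores, labels, t1, t2):
    """Return (error_rate, rejection_rate) over all examples.

    Examples with score <= t1 are assigned to C1 (label 1), those with
    score > t2 to C2 (label 2), and the rest are rejected.
    """
    n = len(scores)
    errors = 0
    rejected = 0
    for s, y in zip(scores, labels):
        if s <= t1:
            errors += int(y != 1)
        elif s > t2:
            errors += int(y != 2)
        else:
            rejected += 1
    return errors / n, rejected / n
```

Calling the function with progressively wider intervals, for example (0, 0), (-0.3, 0.3) and (-0.6, 0.6), shows the rejection rate going up while the error rate stays the same or goes down.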
8
Implementation for bioinformatics Classifier model: –A classifier can be defined via a discriminant function d(x): d(x) ≤ 0 implies f(x) = C1; d(x) > 0 implies f(x) = C2. The distance represents the confidence of the classification.
9
Implementation for bioinformatics
10
Threshold selection –Minimize the rejection rate under the constraints –The problem can be formalized as an optimization problem with three constraints:
11
Implementation for bioinformatics When t1 = t2, the classifier has no rejection option.
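The role of the two thresholds can be sketched as a decision rule on the discriminant value. This is a minimal sketch assuming t1 <= t2, with values of d(x) between the two thresholds treated as ambiguous; the function name and the REJECT label are hypothetical.

```python
REJECT = "reject"  # hypothetical label for rejected examples


def classify_with_reject(d_value, t1, t2):
    """Assign a class from a discriminant value d(x), or reject.

    Assumes t1 <= t2; values in the interval (t1, t2] are rejected.
    """
    if t1 > t2:
        raise ValueError("expected t1 <= t2")
    if d_value <= t1:
        return "C1"
    if d_value > t2:
        return "C2"
    return REJECT
```

Setting t1 = t2 = 0 makes the interval empty and recovers the plain rule d(x) ≤ 0 for C1 and d(x) > 0 for C2.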
12
Implementation for bioinformatics –We propose an iterative procedure: For a given value, let the function give the value of t1. For a given value, let the function give the value of t2. Alternately minimize t1 with respect to the constraint and maximize t2 with respect to the constraint, until t1 cannot be decreased and t2 cannot be increased.
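The sketch below does not reproduce the iterative procedure. It swaps in a plain brute-force search over candidate threshold pairs, which is only practical for small samples but makes the objective explicit: among all pairs with t1 <= t2 that meet both conditional error targets, keep the one with the smallest rejection rate. Two things are assumptions rather than content from the slides: the conditional error of a class is taken to be the fraction of misclassified examples among those assigned to that class, and all function and parameter names are hypothetical.

```python
from itertools import product


def conditional_errors(d_values, labels, t1, t2):
    """Per-class conditional errors and the rejection rate.

    Assumption: the conditional error of class i is the fraction of
    misclassified examples among those assigned to class i.
    Labels are coded 1 (C1) and 2 (C2).
    """
    n1 = e1 = n2 = e2 = 0
    for d, y in zip(d_values, labels):
        if d <= t1:
            n1 += 1
            e1 += int(y != 1)
        elif d > t2:
            n2 += 1
            e2 += int(y != 2)
    rejected = len(d_values) - n1 - n2
    err1 = e1 / n1 if n1 else 0.0
    err2 = e2 / n2 if n2 else 0.0
    return err1, err2, rejected / len(d_values)


def select_thresholds(d_values, labels, target1, target2):
    """Brute-force search for (t1, t2, rejection_rate) with minimal rejection."""
    candidates = sorted(set(d_values))
    # An extra candidate below the minimum allows an empty C1 region.
    candidates = [candidates[0] - 1.0] + candidates
    best = None
    for t1, t2 in product(candidates, repeat=2):
        if t1 > t2:
            continue
        err1, err2, rej = conditional_errors(d_values, labels, t1, t2)
        if err1 <= target1 and err2 <= target2:
            if best is None or rej < best[2]:
                best = (t1, t2, rej)
    return best
```

A feasible pair always exists in this search, because rejecting every example satisfies both error targets trivially.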
13
Implementation for bioinformatics Feature selection –For feature selection we adapt sequential forward search (SFS) to classification with reject option. –We select the feature providing the lowest rejection rate under the conditional error constraints.
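A generic greedy forward search driven by rejection rate might look like the sketch below. It assumes a caller-supplied function, here called rejection_rate_of (a hypothetical name), that trains a classifier on a candidate feature subset, picks thresholds meeting the conditional error targets, and returns the resulting rejection rate. How that function is implemented is outside the sketch.

```python
def sequential_forward_search(n_features, rejection_rate_of, n_select):
    """Greedy forward selection minimising a user-supplied rejection rate.

    rejection_rate_of(subset) is a hypothetical callback that takes a
    list of feature indices and returns the rejection rate obtained
    with those features under the error constraints.
    """
    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < n_select:
        # Add the single feature that gives the lowest rejection rate.
        best_feature = min(
            remaining, key=lambda j: rejection_rate_of(selected + [j])
        )
        selected.append(best_feature)
        remaining.remove(best_feature)
    return selected
```

Passing the criterion in as a callback keeps the search independent of the classifier model, so the same loop could be used with a Fisher discriminant or a linear SVM.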
14
Result and Discussion Synthetic data. Figure legend: error rate of the classifier with rejection option whose target error rate is 0.1; error rate of the classifier with no rejection option.
15
Result and Discussion Error rate of classifier using posterior probabilities.
16
Result and Discussion Number of selected features is fixed to 10.
17
Result and Discussion
18
Real data: Fisher discriminant. The target error of each class is fixed to 0.05.
19
Result and Discussion SVM with linear kernel