Can-CSC-GBE: Developing Cost-sensitive Classifier with Gentleboost Ensemble for breast cancer classification using protein amino acids and imbalanced data.


1 Can-CSC-GBE: Developing Cost-sensitive Classifier with Gentleboost Ensemble for breast cancer classification using protein amino acids and imbalanced data
Source: Computers in Biology and Medicine, 2016, 73:38-46
Authors: Ali S, Majid A, Javed SG
Speaker: Jiefan Tan
Date: 2017/4/6
Pakistan Institute of Applied Sciences

2 Outline
Introduction
Related Work
Proposed method
Experiment
Conclusion

3 Introduction(1/3) Traditional machine learning techniques assume that all classification errors have the same cost, so they try to minimize the number of errors rather than the total cost. In real-world applications, however, different errors often have very different costs, and many practical datasets are imbalanced.

4 Introduction(2/3)- Different cost
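[Diagram: under traditional classification, a healthy person misclassified as having cancer suffers mental pressure, whereas a cancer patient misclassified as healthy may die.]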

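5 Introduction(3/3) Dataset = 1 minority-class instance + 99 majority-class instances.
If a classifier assigns all 100 instances to the majority class, the overall accuracy rate is 99%, but the accuracy on the minority class is 0, and the misclassified patient may die.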
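
The arithmetic on this slide can be checked with a short sketch. The label encoding (1 for the minority class, 0 for the majority class) and the always-majority classifier are assumptions made for illustration only.

```python
# Toy data matching the slide: 1 minority-class instance and 99 majority-class instances.
# Encoding (assumed): 1 = minority class, 0 = majority class.
y_true = [1] + [0] * 99

# A trivial classifier that assigns every instance to the majority class.
y_pred = [0] * 100

# Overall accuracy: fraction of all instances labelled correctly.
correct = sum(t == p for t, p in zip(y_true, y_pred))
overall_accuracy = correct / len(y_true)

# Minority-class accuracy: fraction of minority instances labelled correctly.
minority_preds = [p for t, p in zip(y_true, y_pred) if t == 1]
minority_accuracy = sum(p == 1 for p in minority_preds) / len(minority_preds)

print(overall_accuracy)   # 0.99
print(minority_accuracy)  # 0.0
```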

6 Related Work
Technology                 Methods                          Representative algorithms
Rescaling                  Thresholding                     ETA
                           Sampling                         BFKO, ADSNNHRS
                           Weighting                        C4.5CS
Cost-Sensitive Algorithm   Cost-Sensitive Decision Tree     C4.5CS, CBDSDT
                           Cost-Sensitive Neural Networks   CSBNN, SDAE
                           Cost-Sensitive SVM               CSSVM, CISVM
Ensemble Algorithm         Bagging
                           Boosting                         AdaBoost, gentleBoost
Evaluation Criteria        Based on Cost Matrix             Precision, Recall, F-value, G-mean
                           Curve, Chart                     ROC, AUC, Cost-Curve

7 Proposed method(1/6)-Cost matrix
The cost matrix used to evaluate a two-class problem is shown in the table below. The notation C(i,j) represents the cost of classifying an instance whose actual class is j into the predicted class i.
                   Actual negative   Actual positive
Predict negative   C(0,0), or TN     C(0,1), or FN
Predict positive   C(1,0), or FP     C(1,1), or TP

8 Proposed method(2/6)-CSC(1/2)
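The expected cost R(i|x) of classifying an instance x into class i (by a classifier) can be expressed as: R(i|x) = sum over j of P(j|x) C(i,j), where P(j|x) is the estimated probability that x belongs to class j.
If P(j|x) > 0.5, then x is assigned to class j.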
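
As a minimal sketch of the expected cost, the snippet below evaluates the standard definition R(i|x) = sum over j of P(j|x) C(i,j) for both classes and picks the cheaper one. The cost values and probabilities are hypothetical and are not taken from the paper. The P(j|x) > 0.5 rule corresponds to the special case of equal misclassification costs and zero cost for correct decisions.

```python
def expected_cost(probs, cost, i):
    """Expected cost R(i|x) of predicting class i.

    probs[j]   = P(j|x), the estimated probability of actual class j
    cost[i][j] = C(i, j), the cost of predicting i when the actual class is j
    """
    return sum(p_j * cost[i][j] for j, p_j in enumerate(probs))


# Hypothetical cost matrix (rows = predicted class, columns = actual class).
COST = [
    [0.0, 10.0],  # predict negative: C(0,0) = 0,      C(0,1) = 10 (FN)
    [1.0, 0.0],   # predict positive: C(1,0) = 1 (FP), C(1,1) = 0
]

probs = [0.8, 0.2]  # hypothetical P(0|x), P(1|x)

r_neg = expected_cost(probs, COST, 0)  # 0.8 * 0 + 0.2 * 10
r_pos = expected_cost(probs, COST, 1)  # 0.8 * 1 + 0.2 * 0

# Choose the class with the lower expected cost.
decision = 0 if r_neg < r_pos else 1
print(r_neg, r_pos, decision)
```

With these illustrative numbers the positive class is chosen even though P(1|x) is only 0.2, because a false negative is assumed to be ten times as costly as a false positive.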

9 Proposed method(3/6)-CSC(2/2)
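The classifier classifies an instance x into the positive class if and only if the expected cost of doing so does not exceed that of the negative class: R(1|x) <= R(0|x), that is, P(0|x) C(1,0) + P(1|x) C(1,1) <= P(0|x) C(0,0) + P(1|x) C(0,1).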
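
Comparing the expected costs of the two decisions and solving for P(1|x) gives a probability threshold p* = (C(1,0) - C(0,0)) / (C(1,0) - C(0,0) + C(0,1) - C(1,1)). The sketch below applies that threshold. It assumes that each error costs more than the corresponding correct decision, and the function names and cost values are hypothetical.

```python
def positive_threshold(cost):
    """Probability threshold p* above which predicting positive is cheaper.

    cost[i][j] = C(i, j): cost of predicting i when the actual class is j.
    Assumes C(1,0) > C(0,0) and C(0,1) > C(1,1), so the denominator is positive.
    """
    c00, c01 = cost[0]
    c10, c11 = cost[1]
    return (c10 - c00) / ((c10 - c00) + (c01 - c11))


def classify(p_pos, cost):
    """Return 1 (positive) if P(1|x) reaches the cost-derived threshold, else 0."""
    return 1 if p_pos >= positive_threshold(cost) else 0


# Equal error costs recover the usual 0.5 threshold.
EQUAL_COST = [[0.0, 1.0], [1.0, 0.0]]
# Hypothetical imbalanced costs: a false negative costs ten times a false positive.
SKEWED_COST = [[0.0, 10.0], [1.0, 0.0]]

print(positive_threshold(EQUAL_COST))
print(positive_threshold(SKEWED_COST))
print(classify(0.2, SKEWED_COST), classify(0.05, SKEWED_COST))
```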

10 Proposed method(4/6)-Boosting
[Diagram: the training set yields boosting samples 1, ..., K, ..., T; each sample trains one classifier (Classifier 1, ..., K, ..., T); the classifiers are combined into a boosting ensemble classifier, and the result is decided by voting.]

11 Proposed method(5/6)
[Diagram: Extract features -> Decision trees as base learners -> Ensemble classifier]
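
The slides do not spell out the boosting update, so the following is only a rough sketch of the generic GentleBoost procedure (weighted least-squares fits, additive combination, exponential reweighting). It uses one-split regression stumps as a simplified stand-in for the decision-tree base learners, encodes labels as +1/-1, and leaves out the cost-sensitive step. All function names and the toy data are hypothetical.

```python
import math


def fit_stump(X, y, w):
    """Weighted least-squares regression stump: f(x) = a if x[k] > t else b."""
    best = None
    for k in range(len(X[0])):
        for t in sorted({row[k] for row in X}):
            sides = [row[k] > t for row in X]
            means = {}
            for side in (True, False):
                sw = sum(wi for wi, s in zip(w, sides) if s == side)
                swy = sum(wi * yi for wi, yi, s in zip(w, y, sides) if s == side)
                # Weighted mean of the labels on this side of the split.
                means[side] = swy / sw if sw > 0 else 0.0
            err = sum(wi * (yi - means[s]) ** 2 for wi, yi, s in zip(w, y, sides))
            if best is None or err < best[0]:
                best = (err, k, t, means[True], means[False])
    _, k, t, a, b = best
    return lambda x: a if x[k] > t else b


def gentleboost(X, y, rounds=10):
    """Train an additive model F(x) = sum of stumps; labels y must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n  # start from uniform instance weights
    stumps = []
    for _ in range(rounds):
        f = fit_stump(X, y, w)
        stumps.append(f)
        # Shrink weights of instances the stump gets right, grow the others.
        w = [wi * math.exp(-yi * f(xi)) for wi, yi, xi in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    # Final decision: sign of the summed stump outputs.
    return lambda x: 1 if sum(f(x) for f in stumps) > 0 else -1


# Hypothetical toy data: only the second feature separates the classes.
X_train = [[5.0, 0.0], [3.0, 1.0], [4.0, 2.0], [1.0, 3.0]]
y_train = [-1, -1, 1, 1]
model = gentleboost(X_train, y_train, rounds=5)
print([model(x) for x in X_train])
```

Because each stump outputs a real value between -1 and 1 rather than a hard vote, the weight updates stay bounded, which is the usual motivation for the "gentle" variant over AdaBoost.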

12 Proposed method(6/6)-Evaluation Criteria
Metric                     Formula
Precision                  TP/(TP+FP)
Recall (Sensitivity, Sn)   TP/(TP+FN)
Specificity (Sp)           TN/(TN+FP)
G-mean                     sqrt(Sn x Sp)
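
A minimal sketch of these criteria computed from confusion-matrix counts follows. The G-mean is taken as the geometric mean of sensitivity and specificity, its usual definition for imbalanced data. The counts are made up for illustration, and the function assumes no denominator is zero.

```python
import math


def evaluation_metrics(tp, fp, tn, fn):
    """Compute the evaluation criteria from confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": g_mean,
    }


# Hypothetical counts for an imbalanced test set (10 positives, 90 negatives).
scores = evaluation_metrics(tp=8, fp=2, tn=88, fn=2)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Unlike overall accuracy, the G-mean drops to 0 whenever either class is missed entirely, which is why it suits imbalanced problems.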

13 Experiment

14 Conclusion The classification of imbalanced data has long been an important research problem in machine learning. The proposed Can-CSC-GBE system effectively reduces misclassification costs and thereby improves overall classification performance.

15 Thank you !

