Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario,

1 Active Cost-sensitive Learning (Intelligent Test Strategies) Charles X. Ling, PhD Department of Computer Science University of Western Ontario, Ontario, Canada Joint work with Victor Sheng, Qiang Yang, …

2 Outline Introduction Cost-sensitive decision trees Test strategies Sequential Test Single Batch Test Sequential Batch Test Conclusions and future work

3 Outline Introduction Cost-sensitive decision trees Test strategies Sequential Test Single Batch Test Sequential Batch Test Conclusions and future work

4 Everything has a cost/benefit!
- Materials, products, services
- Disease, working/living condition, waiting, …
- Happiness, love, life, …
- "Money, Sex and Happiness: An Empirical Study", by David G. Blanchflower & Andrew J. Oswald, The Scandinavian Journal of Economics, 106:3: a lasting/happy marriage is worth about $100,000 in happiness
- Utility-based learning: optimization; unifies many issues and is the ultimate goal

5 Everything has a cost/benefit!
In medical diagnosis…
- Tests have costs: temperature ($1), X-ray ($30), biopsy ($900)
- Diseases have costs: flu ($100), diabetes (100k), cancer (10^8)
- Misdiagnosis has (different) costs: cost of false alarm ($500) << cost of missing a cancer ($500,000)
- Doctors: balance the cost of tests and misdiagnosis
- Our goal: to minimize the total cost
- Many other similar applications…
Model this process:
- Cost-sensitive learning
- Intelligent test strategies
[Table: patient records with columns Patient, Test 1, Test 2, …, Test n, Cancer?; test costs $1, $30, ..., $900; FP/FN = 100/300k; some test values are unknown (?); the last row is a new patient whose class is to be predicted]

6 Review of Previous Work
- Cost-sensitive learning: a survey (Turney 2000)
- Active research, also for the imbalanced data problem
- CS meta learning (wrapper): thresholding, sampling, weighting, …
- CS learning algorithms: CSNB, our CS trees
- …but all consider misclassification costs only
- Some work considers test costs only
- A few previous works consider both test costs and misclassification costs (Turney 1995, Zubek and Dietterich 2002, Lizotte et al 2003); all computationally expensive

7 Review of Previous Work
- Active learning: actively seeking extra information
- Pool-based: a pool of unlabeled examples, which ones to label?
- Membership query: is this instance positive?
- Feature value acquisition
  - During training. But “missing is useful!”
  - During testing: our work
- Human learning is active in many ways

8 Review of Previous Work
- Diagnosis: wide applications in medicine, mechanical systems, software, …
- Most previous AI-based diagnosis systems…
  - Are manually built (partially)
  - Do not incorporate costs/benefits
  - Cannot actively suggest the processes
- Our work: cost-sensitive and active; useful for diagnosis and policy setting

9 Outline Introduction Cost-sensitive decision trees Test strategies Sequential Test Single Batch Test Sequential Batch Test Conclusions and future work

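10 Cost-sensitive Decision Tree
[Table: patient records with columns Patient, Test 1, Test 2, …, Test n, Cancer?; test costs $1, $30, ..., $900; FP/FN = 100/300k; some test values are unknown (?)]
[Figure: decision tree built from the table, with internal nodes T1, T6, T2, T3, branch labels Low, Med, <36, >=, a, b, c, and leaves labelled 0 or 1]
- Advantages: tree structure, comprehensibility
- Objective: minimizing the total cost of tests and misclassification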

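11 Attribute Splitting Criteria
- Previous methods: C4.5 reduces the entropy (randomness); performs badly on cost-sensitive tasks
  - A node with entropy E is split by test T into children with entropies E1, E2, E3
  - Choose T such that E – (E1+E2+E3) is max
- New (ICML’04): we reduce the total expected cost
  - A node with cost C is split by test T into children with costs C1, C2, C3
  - Choose T such that C – (C1+C2+C3+C_Test) is max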
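The cost-based criterion on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, TP and TN costs are assumed to be 0, C_Test is assumed to be the per-example test cost times the number of examples at the node, and FP/FN = 100/300k is borrowed from the example table.

```python
def leaf_cost(p, n, fp=100.0, fn=300_000.0, tp=0.0, tn=0.0):
    """Misclassification cost of a node with p positives and n negatives,
    labelled with whichever class is cheaper (assumed costs, see lead-in)."""
    cost_if_positive = p * tp + n * fp
    cost_if_negative = n * tn + p * fn
    return min(cost_if_positive, cost_if_negative)


def cost_reduction(parent, children, test_cost_per_example, **costs):
    """C - (C1 + C2 + ... + C_Test) for one candidate test.

    parent and each child are (positives, negatives) counts.
    """
    p, n = parent
    c = leaf_cost(p, n, **costs)
    c_children = sum(leaf_cost(cp, cn, **costs) for cp, cn in children)
    c_test = test_cost_per_example * (p + n)  # assumption: every example takes the test
    return c - (c_children + c_test)


# A $30 test that separates 10 positives / 90 negatives into (10, 10) and (0, 80).
print(cost_reduction((10, 90), [(10, 10), (0, 80)], 30.0))
```

A test is worth splitting on only when this value is positive; a cheap but uninformative test and an informative but expensive one can both come out negative.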

12 Case Study: Heart Disease
- Predict coronary artery disease
- Class 0: less than 50% artery narrowing; Class 1: more than 50% artery narrowing
- ~300 patients, collected from hospitals
- 13 non-invasive tests on patients

13 13 Tests (Heart Disease)
Test       Cost       Meaning
age        $1         age of the patient
sex        $1         sex
cp         $1         chest pain type
trestbps   $1         resting blood pressure
chol       $7.27      cholesterol in mg/dl
fbs        $5.20      fasting blood sugar
restecg    $15.50     resting electrocardiography results
thalach    $102.90    maximum heart rate
thal       $102.90    maximum heart rate reached
exang      $87.30     exercise induced angina
oldpeak    $87.30     ST depression induced by exercise
slope      $87.30     slope of the peak exercise ST segment
ca         $100.90    number of major vessels colored by fluoroscopy

14 Cost-sensitive tree for Heart Disease
[Figure: cost-sensitive decision tree with nodes thal ($102.9), fbs ($5.2), restecg ($15.5), sex ($1), chol ($7.27), cp ($1), slope ($87.3), restecg ($15.5), age ($1), thal ($102.9) and leaves labelled 0]
- Naturally prefers tests with small cost
- Balances cost and discriminating power
- A local heart-failure specialist thinks this tree is reasonable

15 Considering Group Discount
Test       Cost       Meaning
age        $1         age of the patient
sex        $1         sex
cp         $1         chest pain type
trestbps   $1         resting blood pressure
chol       $7.27      cholesterol in mg/dl
fbs        $5.20      fasting blood sugar
restecg    $15.50     resting electrocardiography results
thalach    $102.90    maximum heart rate
thal       $102.90    finishing heart rate
exang      $87.30     exercise induced angina
oldpeak    $87.30     ST depression induced by exercise
slope      $87.30     slope of the peak exercise ST segment
ca         $100.90    number of major vessels colored by fluoroscopy
Group discounts marked on the table: Discount: $2.10; Discount: $; Discount: $86.30

16 Different trees without/with group discount
[Figure, before: tree with nodes thal ($102.9), fbs ($5.2), restecg ($15.5), sex ($1), chol ($7.27), cp ($1), slope ($87.3), restecg ($15.5), age ($1), thal ($102.9) and leaves labelled 0]
[Figure, after: tree with nodes thal ($102.9), fbs ($5.2), restecg ($15.5), sex ($1), chol ($7.27), cp ($1), slope ($87.3), thalach ($1), age ($1), thal ($102.9) and leaves labelled 0; annotation: individual cost: $102.9]

17 Algorithm of Cost-sensitive Decision Tree
CSDT(Examples, Attributes, TestCosts)
  If all examples are positive, return root with label = +
  If all examples are negative, return root with label = -
  If maximum cost reduction < 0, return root with label according to min(P×TP + N×FP, N×TN + P×FN)
  Let A be an attribute with maximum cost reduction
  root ← A
  Update TestCosts if discount applies
  For each possible value v_i of the attribute A
    Add a new branch A = v_i below root
    Segment the training examples Examples_v_i into the new branch
    Call CSDT(Examples_v_i, Attributes – A, TestCosts) to build subtree
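The recursion above could look roughly like the following in Python. This is a simplified sketch under stated assumptions, not the published implementation: attributes are discrete with no missing values, TP = TN = 0, the FP/FN values are illustrative, the group-discount update is omitted, and a node becomes a leaf whenever no test gives a strictly positive cost reduction (which also covers the all-positive and all-negative cases). The dictionary layout of the tree is hypothetical.

```python
from collections import defaultdict

FP, FN = 100.0, 300_000.0  # illustrative misclassification costs (TP = TN = 0 assumed)


def label_and_cost(examples):
    """Cheapest label (1 = positive, 0 = negative) and its misclassification cost."""
    p = sum(1 for _, y in examples if y == 1)
    n = len(examples) - p
    cost_pos, cost_neg = n * FP, p * FN
    return (1, cost_pos) if cost_pos <= cost_neg else (0, cost_neg)


def split(examples, attr):
    """Group (features, label) pairs by the value of attr."""
    parts = defaultdict(list)
    for x, y in examples:
        parts[x[attr]].append((x, y))
    return parts


def csdt(examples, attributes, test_costs):
    label, cost = label_and_cost(examples)
    best_attr, best_reduction = None, 0.0
    for a in attributes:
        after = sum(label_and_cost(part)[1] for part in split(examples, a).values())
        reduction = cost - (after + test_costs[a] * len(examples))
        if reduction > best_reduction:
            best_attr, best_reduction = a, reduction
    if best_attr is None:  # no test pays for itself: make a leaf
        return {"label": label}
    remaining = [a for a in attributes if a != best_attr]
    children = {v: csdt(part, remaining, test_costs)
                for v, part in split(examples, best_attr).items()}
    # Internal nodes keep a label so a prediction can be made without reaching a leaf.
    return {"test": best_attr, "label": label, "children": children}


# Toy data: t1 ($1) determines the class, t2 ($30) is uninformative.
examples = [
    ({"t1": "high", "t2": "a"}, 1),
    ({"t1": "high", "t2": "b"}, 1),
    ({"t1": "low", "t2": "a"}, 0),
    ({"t1": "low", "t2": "b"}, 0),
]
tree = csdt(examples, ["t1", "t2"], {"t1": 1.0, "t2": 30.0})
print(tree)
```

On the toy data the tree tests only t1, since the $30 test never reduces the total cost.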

18 Outline Introduction Cost-sensitive decision trees Test strategies Sequential Test Single Batch Test Sequential Batch Test Conclusions and future work

19 Three categories of intelligent test strategies
[Table and figure: the patient records (test costs $1, $30, ..., $900; FP/FN = 100/300k), the decision tree built from them, and a new patient (New) whose test values are unknown (?)]
1. Sequential Test: one test, wait, … then predict
2. Single Batch Test: one batch of tests, then predict
3. Sequential Batch Test: batch 1, batch 2, … then predict
- Minimizing the total cost of tests and misclassification is not trivial
- Our methods: utilizing the minimum-cost tree structure

20 Outline Introduction Cost-sensitive decision trees Test strategies Sequential Test Single Batch Test Sequential Batch Test Conclusions and future work

21 Sequential Test
- Use tree structure to guide test sequence
- “Optimal” because tree is (locally) optimal
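A sequential test strategy guided by the tree might be sketched as below. The tree layout (nested dictionaries with "test", "label" and "children" keys), the function name, and the callback that performs a test are all assumptions made for illustration; the tiny tree and the patient are made up, with test costs borrowed from the heart disease table.

```python
def sequential_test(tree, known, perform_test, test_costs):
    """Walk the tree; request a test only when its value is needed and unknown.

    Returns (predicted label, total test cost spent).
    """
    values = dict(known)
    spent = 0.0
    node = tree
    while "test" in node:
        t = node["test"]
        if t not in values:
            values[t] = perform_test(t)  # do one test, wait for the result
            spent += test_costs[t]
        child = node["children"].get(values[t])
        if child is None:  # value not seen in training: predict at this node
            break
        node = child
    return node["label"], spent


# Hypothetical two-level tree.
TREE = {"test": "cp", "label": 0, "children": {
    "typical": {"label": 1},
    "other": {"test": "chol", "label": 0, "children": {
        "high": {"label": 1},
        "normal": {"label": 0},
    }},
}}
COSTS = {"cp": 1.0, "chol": 7.27}
patient = {"cp": "other", "chol": "normal"}  # true values, revealed test by test

label, spent = sequential_test(TREE, {}, patient.__getitem__, COSTS)
print(label, spent)
```

Only the tests on the path actually followed are paid for, which is why this strategy is cheap but requires waiting after every test.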

22 Sequential Test
[Figure: cost-sensitive tree with nodes thal ($102.9), fbs ($5.2), restecg ($15.5), sex ($1), chol ($7.27), cp ($1), slope ($87.3), thalach ($1), age ($1), thal ($102.9) and leaves labelled 0]

23 Experimental Comparison
Using 10 datasets from UCI
Dataset        No. of Attributes, No. of Examples, Class dist. (N/P)
Ecoli          /102
Breast         /239
Heart          816198/163
Thyroid        /238
Australia      /357
Tic-tac-toe    /626
Mushroom       /3916
Kr-vs-kp       /1669
Voting         /124
Cars           /118

24 Comparing Sequential Test
- Eager learning: Sequential Test (OST) (ICML’04)
- Lazy learning: Lazy Sequential Test (LazyOST) (TKDE’05)
- Cost-sensitive Naïve Bayes (CSNB) (ICDM’04)

25 Outline Introduction Cost-sensitive decision trees Test strategies Sequential Test Single Batch Test Sequential Batch Test Conclusions and future work

26 Single Batch Test
- Only one batch: not an easy task
- If too few, important tests are not requested; prediction is not accurate; total cost is high
- If too many, some tests are wasted; total cost is high
- The test example may not be classified by a leaf

27 Single Batch Test
- Expected cost reduction: if a test is done, what are the possible outcomes and the cost reduction?
- R(.): all reachable unknown nodes and leaves
[Figure: node i with its reachable nodes j1, j2, j3]

28 Single Batch Test
- A*-like search algorithm
- Form a candidate list (L) and a batch list (B)
- Choose a test with maximum positive expected cost reduction from L, add it to B
- Update L: add all reachable unknowns to L
- Efficient with tree structure
- Repeat until expected cost reduction is 0

29 Single Batch Test
L = empty /* list of reachable and unknown attributes */
B = empty /* the batch of tests */
u = the first unknown attribute when classifying a test case
Add u into L
Loop
  For each i ∈ L, calculate E(i): E(i) = misc(i) – [c(i) + …]
  E(t) = max E(i) /* t has the maximum cost reduction */
  If E(t) > 0 then add t into B, delete t from L, add r(t) into L
  else exit Loop /* No positive cost reduction */
Until L is empty
Output B as the batch of tests
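The loop above can be sketched in Python as follows. Several parts are assumptions rather than the authors' definitions: the term missing from E(i) is taken to be the probability-weighted misclassification cost of the reachable unknown nodes and leaves, each branch stores the fraction of training examples that follow it, misc values are per-example costs, a test already in the batch is treated as free, and every known value is assumed to have a branch. The tree layout, function names and numbers are hypothetical.

```python
def reachable(node, known):
    """(probability, node) pairs for the unknown tests and leaves first reached below node."""
    found = []
    for prob, child in node["children"].values():
        # Tests whose values are already known are skipped by following their branch.
        while "test" in child and child["test"] in known:
            _, child = child["children"][known[child["test"]]]
        found.append((prob, child))
    return found


def expected_reduction(node, known, batch, test_costs):
    """E(i) = misc(i) - [c(i) + assumed expected misclassification cost after the test]."""
    cost = 0.0 if node["test"] in batch else test_costs[node["test"]]
    after = sum(prob * j["misc"] for prob, j in reachable(node, known))
    return node["misc"] - (cost + after)


def single_batch(tree, known, test_costs):
    node = tree
    while "test" in node and node["test"] in known:  # find the first unknown attribute u
        _, node = node["children"][known[node["test"]]]
    if "test" not in node:
        return []
    candidates, batch = [node], []  # the lists L and B
    while candidates:
        gains = [expected_reduction(c, known, batch, test_costs) for c in candidates]
        best = max(range(len(candidates)), key=gains.__getitem__)
        if gains[best] <= 0:  # no positive cost reduction
            break
        chosen = candidates.pop(best)
        if chosen["test"] not in batch:
            batch.append(chosen["test"])
        candidates.extend(j for _, j in reachable(chosen, known) if "test" in j)
    return batch


def leaf(misc):
    return {"misc": misc}


TREE = {"test": "cp", "misc": 50.0, "children": {
    "typical": (0.5, {"test": "thal", "misc": 5.0, "children": {
        "normal": (0.5, leaf(0.0)), "defect": (0.5, leaf(0.0))}}),
    "other": (0.5, {"test": "chol", "misc": 40.0, "children": {
        "high": (0.5, leaf(0.0)), "normal": (0.5, leaf(0.0))}}),
}}
COSTS = {"cp": 1.0, "chol": 7.27, "thal": 102.9}

print(single_batch(TREE, {}, COSTS))
```

In this made-up tree the expensive thal test is left out of the batch because it would remove only $5 of expected misclassification cost.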

30 Single Batch Test
[Figure: cost-sensitive tree with nodes thal ($102.9), fbs ($5.2), restecg ($15.5), sex ($1), chol ($7.27), cp ($1), slope ($87.3), thalach ($1), age ($1), thal ($102.9) and leaves labelled 0]

31 Single Batch Test
[Figure: the same tree]
- cp is unknown.
- cp has positive expected cost reduction.
- cp is added to the batch.
- cp’s reachable unknown nodes are added into the candidate list.

32 Single Batch Test
[Figure: the same tree]
- From the candidate list, choose one with maximum positive expected cost reduction.
- Add it to the batch, and update the candidate list.
- Repeat. After 7 steps, expected cost reduction is 0.

33 Single Batch Test
[Figure: the same tree]
- Do all tests in the batch

34 Single Batch Test
[Figure: the same tree; annotation: predict by internal node]
- Make a prediction.
- Some tests are wasted.

35 Comparing Single Batch Tests
- Naïve Single Batch (NSB) (ICML’04)
- Cost-sensitive Naïve Bayes Single Batch (CSNB-SB) (ICDM’04)
- Greedy Single Batch (GSB) (TKDE’05)
- Single Batch Test (OSB) (TKDE’05)

36 Outline Introduction Cost-sensitive decision trees Test strategies Sequential Test Single Batch Test Sequential Batch Test Conclusions and future work

37 Sequential Batch
- Batch 1, batch 2, …, prediction
- Must include the cost of waiting in tests
- Wait cost of a batch: maximum wait cost in the batch (less than the sum)
- Combines Sequential Test and Single Batch Test
- If all waiting costs = 0, it becomes Sequential Test
- If all waiting costs are very large, it becomes Single Batch
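A small arithmetic sketch of the batch wait cost: tests in one batch run in parallel, so the batch pays only the longest wait, while the same tests done one by one pay every wait. The function names and numbers are hypothetical, and adding the wait cost to the test costs is an assumption about how the two are combined.

```python
def batch_cost(batch, test_costs, wait_costs):
    """Test costs of the batch plus the maximum wait cost in the batch."""
    if not batch:
        return 0.0
    return sum(test_costs[t] for t in batch) + max(wait_costs[t] for t in batch)


def sequential_cost(tests, test_costs, wait_costs):
    """The same tests done one at a time: every wait is paid in full."""
    return sum(test_costs[t] + wait_costs[t] for t in tests)


test_costs = {"a": 1.0, "b": 30.0}
wait_costs = {"a": 2.0, "b": 5.0}  # hypothetical wait costs derived from wait time

print(batch_cost(["a", "b"], test_costs, wait_costs))       # one batch
print(sequential_cost(["a", "b"], test_costs, wait_costs))  # one test at a time
```

With all wait costs at zero the two functions agree, which matches the remark that the strategy then reduces to Sequential Test.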

38 Sequential Batch
The wait cost is derived from wait time
[Table: test wait time in hours for age, sex, cp, trestbps, chol, fbs, restecg, thalach, exang, oldpeak, slope, ca, thal]

39 Sequential Batch
- Extending the Single Batch to include the batch cost
- An additional constraint: cumulative ROI
- No more batches!

40 Sequential Batch
Loop
  L = empty /* list of reachable and unknown attributes */
  B = empty /* the batch of tests */
  u = the first unknown attribute when classifying a test case
  Add u into L
  Loop
    For each i ∈ L, calculate E(i): E(i) = misc(i) – [c(i) + …]
    E(t) = max E(i) /* t has the maximum cost reduction */
    If E(t) > 0 & ROI increases then add t into B, delete t from L, add r(t) into L
    else exit Loop /* No positive cost reduction */
  Until L is empty
  If (B is not empty) then
    Output B as the current batch of tests; obtain their values at a cost
    Classify the test example further, until encountering another unknown test
  Else exit the first Loop

41 Comparing Sequential Batch Test

42 Outline Introduction Cost-sensitive decision trees Test strategies Sequential Test Single Batch Test Sequential Batch Test Conclusions and future work

43 Future Work
- Deal with different test examples differently
- Consider more costs: acquiring new examples
  - If $10 for each new example, how many do I need?
  - For $10, tell me if this patient has cancer
- If a test is not accurate (e.g. 90%), how to build trees and how to do tests (will I do it again)?
- From cost-sensitive trees, derive medical policy for expensive/risky or cheap/effective tests

44 Conclusions
- Cost-sensitive decision tree: effective for learning with minimal total cost
- Can be used to model learning from data with costs
- Design and compare various test strategies
  - Sequential Test: one test, wait, …: low cost but long wait
  - Single Batch Test: one batch of tests: quick but higher cost
  - Sequential Batch Test: batch, wait, batch, …: best tradeoff
- Our methods perform better than previous ones
- Can be readily applied to real-world diagnoses

45 References
- C.X. Ling, Q. Yang, J. Wang, and S. Zhang. Decision Trees with Minimal Costs. ICML'2004.
- X. Chai, L. Deng, Q. Yang, and C.X. Ling. Test-Cost Sensitive Naive Bayes Classification. ICDM'2004.
- C.X. Ling, S. Sheng, and Q. Yang. Intelligent Test Strategies for Cost-sensitive Decision Trees. IEEE TKDE, to appear.
- S. Zhang, Z. Qin, C.X. Ling, and S. Sheng. "Missing is Useful": Missing Values in Cost-sensitive Decision Trees. IEEE TKDE, to appear.
- P.D. Turney. Types of Cost in Inductive Concept Learning. Workshop on Cost-Sensitive Learning at ICML’2000.
- V.B. Zubek and T. Dietterich. Pruning Improves Heuristic Search for Cost-sensitive Learning. ICML’2002.
- P.D. Turney. Cost-Sensitive Classification: Empirical Evaluation of a Hybrid Genetic Decision Tree Induction Algorithm. JAIR, 2, 1995.
- D. Lizotte, O. Madani, and R. Greiner. Budgeted Learning of Naïve-Bayes Classifiers. In Uncertainty in AI, 2003.

