Comparing Association Rules and Decision Trees for Disease Prediction


1 Comparing Association Rules and Decision Trees for Disease Prediction
Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology
Advisor: Dr. Hsu
Presenter: Yu-San Hsieh
Author: Carlos Ordonez
Source: CIKM 2006, pp. 17-24

2 Outline
• Motivation
• Objective
• Method
• Experiments
• Conclusions

3 Motivation
• Mining association rules on a medical data set raises several problems:
  ─ Many discovered rules are irrelevant
  ─ Most relevant rules appear only at low support
  ─ The number of discovered rules becomes large at low support
• The large number of rules makes the search slow and makes interpretation by the domain expert difficult.

4 Objective
• We propose search constraints that find only medically significant association rules and make the search more efficient.

5 Method
[Diagram labels: Medical dataset, Transforming, Search Constraints, Support, Confidence, Phase 1, Phase 2; example rule AGE ⇒ LAD]
• Search constraints
  ─ User-specified maximum itemset size κ
  ─ group: A → g, with group(A_j) = g_j
      group(AGE) = 0: AGE is not group-constrained
      group(AL) = 1: AL is constrained to belong to group 1
  ─ group(attribute(a)) ≠ group(attribute(b)) for items a, b in the same itemset
      Example: (-1.0 <= IL < 0.2) and (-1.0 <= LA < 0.2) cannot appear in the same itemset
  ─ ac: A → C, with ac(A_j) = c_j
      ac(AGE) = 1: AGE appears in the antecedent
      ac(LAD) = 2: LAD appears in the consequent
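The three constraints on this slide can be sketched as simple predicates. This is a minimal illustration, not the author's implementation: the `GROUP` and `AC` tables are hypothetical (only group(AGE) = 0, group(AL) = 1, ac(AGE) = 1 and ac(LAD) = 2 come from the slide; placing IL and LA in group 1 is an assumption), and items are modelled as `(attribute, label)` tuples.

```python
from itertools import combinations

# Hypothetical constraint tables. Only group(AGE)=0, group(AL)=1,
# ac(AGE)=1 and ac(LAD)=2 are taken from the slide; the rest is assumed.
GROUP = {"AGE": 0, "AL": 1, "IL": 1, "LA": 1, "LAD": 0}
AC = {"AGE": 1, "AL": 1, "IL": 1, "LA": 1, "LAD": 2}


def attribute(item):
    """An item is an (attribute, label) pair; return its attribute name."""
    return item[0]


def satisfies_size(itemset, kappa):
    """Maximum itemset size constraint: at most kappa items."""
    return len(itemset) <= kappa


def satisfies_group(itemset):
    """No two items may come from the same non-zero group (0 = unconstrained)."""
    for a, b in combinations(itemset, 2):
        ga, gb = GROUP[attribute(a)], GROUP[attribute(b)]
        if ga != 0 and ga == gb:
            return False
    return True


def satisfies_ac(antecedent, consequent):
    """ac = 1 items must stay out of the consequent, ac = 2 out of the antecedent.

    Treating any other ac value as unconstrained is an assumption.
    """
    return (all(AC[attribute(i)] != 2 for i in antecedent)
            and all(AC[attribute(i)] != 1 for i in consequent))


age = ("AGE", "40<AGE<=60")
il = ("IL", "-1.0<=IL<0.2")
la = ("LA", "-1.0<=LA<0.2")
lad = ("LAD", "LAD>=50")

print(satisfies_group([il, la]))      # IL and LA share a group
print(satisfies_ac([age], [lad]))     # AGE => LAD
print(satisfies_ac([lad], [age]))     # LAD => AGE
```

Checking the constraints while candidate itemsets are generated, rather than filtering rules afterwards, is what lets them shrink the search space as well as the output.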

6 Experiments
• The medical data set
  ─ 655 patients and 25 attributes (numeric and categorical)
  ─ Three basic elements for analysis: perfusion defect, coronary stenosis, risk factors
  ─ Default parameter settings: maximum itemset size κ = 4, minimum support = 1%, minimum confidence = 70%
  ─ Constraints used: negation, ac, and group
[Figure labels: Association rules, Decision tree]
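To make the support and confidence thresholds concrete, the sketch below computes both measures for a rule X ⇒ Y and applies the slide's defaults (minimum support 1%, minimum confidence 70%). The five transactions and the item labels are made up for illustration; they are not the 655-patient data set.

```python
def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def support(itemset, transactions):
    """Fraction of transactions containing the itemset."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """count(X and Y) / count(X); 0.0 when X never occurs."""
    n_x = count(antecedent, transactions)
    if n_x == 0:
        return 0.0
    return count(set(antecedent) | set(consequent), transactions) / n_x


def passes_thresholds(antecedent, consequent, transactions,
                      min_support=0.01, min_confidence=0.70):
    """Apply the default thresholds quoted on the slide."""
    both = set(antecedent) | set(consequent)
    return (support(both, transactions) >= min_support
            and confidence(antecedent, consequent, transactions) >= min_confidence)


# Made-up toy transactions (hypothetical item labels).
TOY = [
    frozenset({"AGE>60", "LAD>=50"}),
    frozenset({"AGE>60", "LAD>=50"}),
    frozenset({"AGE>60", "LAD>=50"}),
    frozenset({"AGE>60"}),
    frozenset({"AGE<=60"}),
]

print(support({"AGE>60", "LAD>=50"}, TOY))        # 3 of 5 transactions
print(confidence({"AGE>60"}, {"LAD>=50"}, TOY))   # 3 of the 4 with AGE>60
print(passes_thresholds({"AGE>60"}, {"LAD>=50"}, TOY))
```

With a minimum support as low as 1%, a rule needs to hold for only a handful of the 655 patients, which is why the number of candidate rules grows quickly without further constraints.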

7 Conclusions
• Decision trees are less effective than constrained association rules
  ─ Predicting disease with several related target attributes
  ─ Low confidence factor
  ─ Slight overfitting
  ─ Rule complexity
  ─ Data set fragmentation

8 My opinion
• Advantages
  ─ Produces medically useful rules, reduces the number of discovered rules, and improves running time
• Drawbacks
  ─ Lack of quantitative evaluation
  ─ Mostly analysis of individual rules
• Applications
  ─ Prediction
  ─ Classification

9 Method
• Transformation to binary dimensions
  ─ Numerical data: age becomes 0 < age <= 40 and 40 < age <= 60
  ─ Categorical data: sex becomes sex = Male and sex = Female
• First constraint
  ─ An attribute may have negation: additional items are created, one for each negated categorical value or each negated interval
      Example: not(0 <= LM < 30), not(0 <= LAD < 50), not(0 <= LCX < 50), ...
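The transformation on this slide can be sketched as follows. This is an illustration under stated assumptions: the function names, the label format, and the cutoff list `[0, 40, 60]` (read off the slide's age example) are hypothetical, and negated items are generated only for attributes the caller lists in `negated`.

```python
def bin_numeric(name, value, cutoffs):
    """Return the interval item lo<name<=hi containing value, or None."""
    for lo, hi in zip(cutoffs, cutoffs[1:]):
        if lo < value <= hi:
            return f"{lo}<{name}<={hi}"
    return None


def to_items(record, numeric_cutoffs, categorical_values, negated=()):
    """Turn one patient record into a set of binary items.

    For attributes listed in `negated`, also add not(...) items for every
    interval or categorical value the record does NOT fall into.
    """
    items = set()
    for name, value in record.items():
        if name in numeric_cutoffs:
            cuts = numeric_cutoffs[name]
            labels = [f"{lo}<{name}<={hi}" for lo, hi in zip(cuts, cuts[1:])]
            present = bin_numeric(name, value, cuts)
        else:
            labels = [f"{name}={v}" for v in categorical_values[name]]
            present = f"{name}={value}"
        if present is not None:
            items.add(present)
        if name in negated:
            items.update(f"not({label})" for label in labels if label != present)
    return items


record = {"age": 35, "sex": "Male"}
items = to_items(
    record,
    numeric_cutoffs={"age": [0, 40, 60]},
    categorical_values={"sex": ["Male", "Female"]},
    negated=("age",),
)
print(sorted(items))
```

Negated items let a rule state the absence of a condition, at the cost of adding more items to the search.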

10 Experiments
• Predictive association rules
[Figure labels: healthy, diseased, LCX, LAD, RCA]

11 Experiments
• Predictive decision tree
  ─ Using the CN4.5 decision tree algorithm
  ─ Focused on predicting LAD disease (LAD ≥ 50 as the target class)
  ─ Result: maximal height = 3
[Figure panels: "Numeric dimensions and automatic splits" and "Manually binned variable"; annotations: "Confidence ↓, not useful" and "Confidence ↓"]

