Analysis of Death Causes in the STULONG Data Set Jan Burian, Jan Rauch EuroMISE – Cardio University of Economics Prague.

1 Analysis of Death Causes in the STULONG Data Set
Jan Burian, Jan Rauch
EuroMISE – Cardio, University of Economics Prague

2 Discovery Challenge 2003

DEATH CAUSE               PATIENTS      %
myocardial infarction           80   20.6
coronary heart disease          33    8.5
stroke                          30    7.7
other causes                    79   20.3
sudden death                    23    5.9
unknown                          8    2.0
tumorous disease               114   29.3
general atherosclerosis         22    5.7
TOTAL                          389  100.0

3 Data matrix ENTRY

General characteristics: Marital status, Transport to a job, Physical activity in a job, Activity after a job, Education, Responsibility, Age, Weight, Height
Examinations: Chest pain, Breathlessness, Cholesterol, Urine, Subscapular, Triceps
Vices: Alcohol, Liquors, Beer 10, Beer 12, Wine, Smoking, Former smoker, Duration of smoking, Tea, Sugar, Coffee

4 Analytic questions

Are there strong relations concerning death cause?
  General characteristics (?) ≈ Death cause (?)
  Examinations (?) ≈ Death cause (?)
  Vices (?) ≈ Death cause (?)
  Combinations (?) ≈ Death cause (?)

5 Example of relation: founded implication

Cholesterol & Coffee(3 and more cups) ⇒ 0.63;15 Death cause (tumorous disease)
A = Cholesterol & Coffee(3 and more cups), S = Death cause (tumorous disease)

        S    ¬S
A      15     9    24
¬A     99   266   365
      114   275   389

63% of patients satisfying A also satisfy S
There are 15 patients satisfying both A and S
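The founded implication check can be sketched in Python as follows. This is a minimal illustration, not 4ft-Miner code: the class and function names are hypothetical, the counts are those of the four-fold table on this slide, and the threshold p = 0.5 is borrowed from the task table later in the deck. Note that 15/24 is exactly 0.625, which the slide reports rounded as 0.63.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FourFoldTable:
    """Four-fold (4ft) table of antecedent A and succedent S. Names are illustrative."""
    a: int  # patients satisfying A and S
    b: int  # patients satisfying A and not S
    c: int  # patients satisfying not A and S
    d: int  # patients satisfying neither A nor S


def confidence(t: FourFoldTable) -> float:
    """Share of patients satisfying A that also satisfy S: a / (a + b)."""
    if t.a + t.b == 0:
        raise ValueError("no patient satisfies the antecedent")
    return t.a / (t.a + t.b)


def founded_implication(t: FourFoldTable, p: float, base: int) -> bool:
    """True if at least `base` patients satisfy A and S and a / (a + b) >= p."""
    return t.a >= base and confidence(t) >= p


# Counts from the slide: A = Cholesterol & Coffee(3 and more cups),
# S = Death cause (tumorous disease).
table = FourFoldTable(a=15, b=9, c=99, d=266)
print(confidence(table))                            # 0.625
print(founded_implication(table, p=0.5, base=15))   # True
```

Raising `base` to 16 makes the check fail, because only 15 patients satisfy both A and S.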

6 Example of relation: above average

Age( 65) ⇒+ 0.76;15 Death cause (general atherosclerosis)
Age( 65) ⇒ 0.1;15 Death cause (general atherosclerosis)
A = Age( 65), S = Death cause (general atherosclerosis)

        S    ¬S
A      15   136   151
¬A      7   231   238
       22   367   389

Relative frequency of S: 22/389 = 0.057
Relative frequency of S if A: 15/151 = 0.099
The relative frequency of S if A is 76 per cent higher than the relative frequency of S
There are 15 patients satisfying both A and S
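The arithmetic behind the above average relation can be reproduced with a short Python sketch. The function names are hypothetical; only the four counts (15, 136, 7, 231) come from the slide, and the formula is the one the slide describes in words: how much higher the relative frequency of S is among patients satisfying A than among all patients.

```python
def above_average_gain(a: int, b: int, c: int, d: int) -> float:
    """Relative increase of the frequency of S given A over the overall frequency of S.

    a: A and S, b: A and not S, c: not A and S, d: neither.
    """
    n = a + b + c + d
    freq_s = (a + c) / n           # relative frequency of S among all patients
    freq_s_given_a = a / (a + b)   # relative frequency of S among patients satisfying A
    return freq_s_given_a / freq_s - 1.0


def above_average(a: int, b: int, c: int, d: int, p: float, base: int) -> bool:
    """True if at least `base` patients satisfy A and S and the gain is at least p."""
    return a >= base and above_average_gain(a, b, c, d) >= p


# Counts from the slide: A is the Age condition, S = Death cause (general atherosclerosis).
gain = above_average_gain(15, 136, 7, 231)
print(f"{gain:.3f}")                                      # 0.756
print(above_average(15, 136, 7, 231, p=0.75, base=15))    # True
```

The gain of about 0.756 is what the slide rounds to 76 per cent.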

7 Example of task

For which combinations of vices is the relative frequency of some death cause at least 55 per cent higher than the relative frequency of the same death cause among all patients? We require at least 15 patients who have the particular death cause and satisfy the particular condition.

Vices(?) ⇒+ 0.55;15 Death cause (?)

Examples of candidate relations:
  Liquors(?) & Smoking(?) ⇒+ 0.55;15 Death cause(?)
  Alcohol(?) & Tea(?) ⇒+ 0.55;15 Death cause(?)
  Beer 12(?) & Wine(?) ⇒+ 0.55;15 Death cause(?)
  Liquors(?) & Smoking(?) & Coffee(?) & Beer 12(?) ⇒+ 0.55;15 Death cause(?)
  ????? ⇒+ 0.55;15 Death cause(?)
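A brute-force version of such a task can be sketched in Python. This is only an illustration of the question being asked, not a description of how 4ft-Miner works internally: the function name, the record layout, and the ten toy records are all made up for the example, and the minimum count is lowered to 3 so that the tiny data set can produce a result.

```python
from itertools import combinations, product


def mine_above_average(records, antecedent_attrs, target, p, base, max_len=2):
    """Enumerate conjunctions of attribute values and keep those for which the
    relative frequency of a target value is at least p higher than overall."""
    n = len(records)
    target_values = sorted({r[target] for r in records})
    found = []
    for k in range(1, max_len + 1):
        for attrs in combinations(antecedent_attrs, k):
            value_sets = [sorted({r[a] for r in records}) for a in attrs]
            for values in product(*value_sets):
                covered = [r for r in records
                           if all(r[a] == v for a, v in zip(attrs, values))]
                if not covered:
                    continue
                for cause in target_values:
                    both = sum(1 for r in covered if r[target] == cause)
                    if both < base:
                        continue  # too few patients satisfy condition and cause
                    freq_all = sum(1 for r in records if r[target] == cause) / n
                    gain = (both / len(covered)) / freq_all - 1.0
                    if gain >= p:
                        found.append((dict(zip(attrs, values)), cause, gain, both))
    return found


def rec(beer12, wine, cause):
    """Build one hypothetical patient record."""
    return {"Beer 12": beer12, "Wine": wine, "Death cause": cause}


# Toy data, invented for illustration only (not taken from STULONG).
records = (
    [rec("yes", "yes", "tumorous disease")] * 3
    + [rec("yes", "no", "stroke")] * 2
    + [rec("no", "yes", "stroke")] * 2
    + [rec("no", "no", "tumorous disease")] * 1
    + [rec("no", "no", "stroke")] * 2
)

rules = mine_above_average(records, ["Beer 12", "Wine"], "Death cause", p=0.55, base=3)
for antecedent, cause, gain, count in rules:
    print(antecedent, "->", cause, f"gain={gain:.2f}", f"count={count}")
```

On the toy data the only relation found is Beer 12(yes) & Wine(yes) with tumorous disease; each single attribute on its own stays below the 0.55 threshold.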

8 4ft-Miner application

Vices(?) ⇒+ 0.55;15 Death cause (?)
Vices(?) = Antecedent ⇒+ 0.75;15 Death cause(?)

9 Dealing with attributes

An example – Age
  Predefined intervals of length 10: Age<40,50), Age<50,60), …, Age<70,80)
  Predefined intervals of length 5: Age<40,45), Age<45,50), …, Age<70,75)
  Sliding window of length 10
  Sliding window of length 5
  Sliding window of length 2
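The predefined intervals listed for Age can be generated with a few lines of Python. This is a sketch under the assumption that each interval is half-open, closed on the left and open on the right, as the notation Age<40,50) suggests; the function names are hypothetical.

```python
def predefined_intervals(start: int, stop: int, length: int) -> list[tuple[int, int]]:
    """Split [start, stop) into consecutive half-open intervals [lo, lo + length)."""
    if length <= 0 or (stop - start) % length != 0:
        raise ValueError("length must be positive and divide the range evenly")
    return [(lo, lo + length) for lo in range(start, stop, length)]


def interval_of(value: int, intervals: list[tuple[int, int]]):
    """Return the interval containing value, or None if no interval does."""
    for lo, hi in intervals:
        if lo <= value < hi:
            return (lo, hi)
    return None


decades = predefined_intervals(40, 80, 10)    # Age<40,50), ..., Age<70,80)
five_year = predefined_intervals(40, 75, 5)   # Age<40,45), ..., Age<70,75)
print(decades)                     # [(40, 50), (50, 60), (60, 70), (70, 80)]
print(len(five_year))              # 7
print(interval_of(65, five_year))  # (65, 70)
```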

10 Sliding window length 5

44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, …, 67, 68, 69, 70, 71, 72, 73, 74
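A sliding window over the ages can be sketched in Python as follows. The sketch assumes that a window of length 5 means five consecutive age values and that the window moves one value at a time; the slide shows only the age sequence from 44 to 74, so both the step and the function name are assumptions.

```python
def sliding_windows(values, length: int) -> list[list[int]]:
    """Return every run of `length` consecutive values, moving one value at a time."""
    ordered = sorted(values)
    return [ordered[i:i + length] for i in range(len(ordered) - length + 1)]


ages = range(44, 75)   # 44, 45, ..., 74 as shown on the slide
windows = sliding_windows(ages, 5)
print(len(windows))    # 27
print(windows[0])      # [44, 45, 46, 47, 48]
print(windows[-1])     # [70, 71, 72, 73, 74]
```

Unlike predefined intervals, neighbouring windows overlap, so each age value can take part in several candidate conditions.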

11 Dealing with attributes

Another example – Marital status
  Marital status(divorced) – 39 patients
  Marital status(single) – 28 patients
  Chart shares: 81.5 %, 10.0 %, 7.2 %, 1.3 %

12 Dealing with attributes

Some further examples
Predefined intervals, sliding windows:
  Cholesterol
  Subscapular
  Height
  Weight
  …
Particular values:
  Activity after job
  Physical activity in a job
  Education
  Transport
  Responsibility
  …

13 4ft-Miner result example

Beer 12(yes) & Wine(yes) ⇒+ 0.55;15 Death cause (tumorous disease)

14 Tasks: Antecedent ≈ Death cause (?)

Antecedent                              ≈             rules   verifications
General characteristics (9 attributes)  ⇒ 0.5;15          6          70 422
                                        ⇒+ 0.75;15        3          58 685
Examinations (6 attributes)             ⇒ 0.5;15          1           5 754
                                        ⇒+ 0.5;15         5          16 836
Vices (5 attributes)                    ⇒ 0.5;15          0          22 755
                                        ⇒+ 0.55;15        9          20 610
Combinations 1 general + 1 other        ⇒ 0.5;15         11         186 690
                                        ⇒+ 0.75;15       22         294 288

Solution time in all cases ≤ 8 s on an Intel Pentium at 3 GHz with 512 MB RAM

15 Conclusions
  Only 389 patients with death code
  Some potentially interesting rules
  Fast work with 4ft-Miner
  Possibility of tuning work with attributes: predefined intervals, sliding windows, …

