1 Mining for Patterns Based on Contingency Tables by KL-Miner: First Experience. Jan Rauch, Milan Šimůnek (PhD student), Václav Lín (student), University of Economics, Prague

2 FDM … KL-Miner, First Experience: KL-Miner basic features; Application example; Implementation principles; Scalability; Concluding remarks

3 FDM KL-Miner -- Data and Patterns

Data: data matrix M

  M    A1   A2   …   AP
  o1   2    12   …   1
  o2   1    5    …   4
  …    …    …    …   …
  on   3    9    …   2

Patterns, i.e. KL-hypotheses: $R \sim C / \gamma$
  row attribute $R \in \{A_1, \dots, A_P\}$, with possible values (categories) $r_1, \dots, r_K$
  column attribute $C \in \{A_1, \dots, A_P\}$, with possible values (categories) $c_1, \dots, c_L$
  $\gamma$: Boolean attribute derived from the other attributes $A_1, \dots, A_P$
  $\sim$: KL-quantifier, i.e. a condition imposed on the contingency table of R and C
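
To make the pattern concrete, here is a minimal sketch of how the contingency table of R and C under a condition could be counted from a data matrix. The function name `contingency_table`, the dictionary-based rows, and the toy values are assumptions made for illustration; they do not come from the slides.

```python
from collections import Counter


def contingency_table(rows, r_attr, c_attr, r_cats, c_cats, condition=lambda row: True):
    """Return the K x L table n[k][l]: how many rows satisfying `condition`
    have category r_cats[k] in attribute r_attr and c_cats[l] in c_attr."""
    counts = Counter(
        (row[r_attr], row[c_attr]) for row in rows if condition(row)
    )
    return [[counts[(r, c)] for c in c_cats] for r in r_cats]


# Toy data matrix (made-up values): one dict per object o_i.
data = [
    {"A1": 2, "A2": 1, "A3": "x"},
    {"A1": 1, "A2": 1, "A3": "x"},
    {"A1": 3, "A2": 2, "A3": "y"},
    {"A1": 2, "A2": 2, "A3": "x"},
    {"A1": 2, "A2": 1, "A3": "x"},
]

# R = A1, C = A2, condition: A3 has the value "x".
table = contingency_table(
    data, "A1", "A2", r_cats=[1, 2, 3], c_cats=[1, 2],
    condition=lambda row: row["A3"] == "x",
)
print(table)
```

A KL-quantifier is then simply a predicate evaluated on `table`.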

4 FDM KL-quantifiers
Contingency table of R and C:
Examples of quantifiers:
  Simple aggregate function:
  Kendall's quantifier: e.g. $|\tau_b| \ge p$

5 FDM Kendall's quantifier
Kendall's quantifier: e.g. $|\tau_b| \ge p$ (or another comparison of $|\tau_b|$ with $p$)
Kendall's coefficient $\tau_b \in \langle -1; 1 \rangle$:
  $\tau_b > 0$ … positive ordinal dependence
  $\tau_b < 0$ … negative ordinal dependence
  $\tau_b = 0$ … ordinal independence
  $|\tau_b| = 1$ … C is a function of R
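
The coefficient can be computed directly from a contingency table. The sketch below uses the standard tie-corrected form, $\tau_b = 2(P - Q) / \sqrt{(n^2 - \sum_k n_{k,*}^2)(n^2 - \sum_l n_{*,l}^2)}$, where P and Q count concordant and discordant pairs of objects. That KL-Miner uses exactly this form is an assumption, and the names `kendall_tau_b` and `kendall_quantifier` are illustrative.

```python
from math import sqrt


def kendall_tau_b(table):
    """Kendall's tau-b of a K x L contingency table with ordered categories."""
    K, L = len(table), len(table[0])
    n = sum(map(sum, table))
    concordant = discordant = 0
    for k in range(K):
        for l in range(L):
            for i in range(k + 1, K):
                # Cells below and to the right agree in order with (k, l).
                concordant += table[k][l] * sum(table[i][l + 1:])
                # Cells below and to the left disagree in order with (k, l).
                discordant += table[k][l] * sum(table[i][:l])
    row_sq = sum(sum(row) ** 2 for row in table)
    col_sq = sum(sum(col) ** 2 for col in zip(*table))
    denom = sqrt((n * n - row_sq) * (n * n - col_sq))
    if denom == 0:
        raise ValueError("tau_b is undefined when all objects fall in one row or one column")
    return 2 * (concordant - discordant) / denom


def kendall_quantifier(table, p):
    """True when the table satisfies |tau_b| >= p."""
    return abs(kendall_tau_b(table)) >= p


print(kendall_tau_b([[5, 0], [0, 5]]))
print(kendall_tau_b([[0, 5], [5, 0]]))
print(kendall_tau_b([[1, 1], [1, 1]]))
```

The three example tables illustrate the cases listed on the slide: perfect positive dependence, perfect negative dependence, and ordinal independence.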

6 FDM KL-Miner application example STULONG Project, 1419 patients, entry examination See

7 FDM STULONG attributes examples (1) Systolic blood pressure Smoking Group of patients

8 FDM STULONG attributes examples (2) Skinfold above musculus triceps (mm); Beer – amount / day. 219 attributes in total, 38 of them ordinal; we use 17 ordinal attributes.

9 FDM Example - analytic question: Are there any ordinal dependencies among attributes under some conditions? Requirements: at least 50 patients; $|\tau_b| \ge 0.75$; relevant conditions:

10 FDM Example – relevant condition specification (1) Group of patients (normal), Group of patients (risk), …; Beer 10(yes), Beer 12(yes), …, Beer 10(yes) $\wedge$ Beer 12(yes); sliding windows …

11 FDM Example – relevant condition specification (2) Sliding window:
  4, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15, …, 43, 44, 45, 46, 47, 48, 49
  5, 6, 7, 9, 10, 11, 12, 13, 14, 15, …, 43, 44, 45, 46, 47, 48, 49, 50
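
A sliding window over an ordered list of categories can be sketched as follows. The window width of 3 and the short category list are assumptions chosen for illustration; the slide does not state the widths actually used.

```python
def sliding_windows(categories, width):
    """Yield every run of `width` consecutive categories, in order."""
    for start in range(len(categories) - width + 1):
        yield categories[start:start + width]


# Ordered categories of a hypothetical attribute (note that 8 does not occur).
cats = [4, 5, 6, 7, 9, 10, 11]
windows = list(sliding_windows(cats, 3))
for w in windows:
    print(w)
```

Each window defines one candidate condition of the form "the attribute value lies in this run of categories".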

12 FDM Example – output overview: 2 min 1 sec; verifications; 25 hypotheses; 3.06 GHz, 512 MB DDR SDRAM

13 FDM Example – output detail (1) $\tau_b = 0.82$ (i.e. strong positive ordinal dependence)

14 FDM Example – output detail (2) $\tau_b = 0.78$ (i.e. strong positive ordinal dependence)

15 FDM Implementation principles (1)
Attributes are represented by cards of categories, i.e. strings of bits.

       Attributes           Cards of categories of A1
  M    A1   A2   …   AP     A1[1]   A1[2]   A1[3]
  o1   2    12   …   1      0       1       0
  o2   1    5    …   4      1       0       0
  …    …    …    …   …      …       …       …
  on   3    9    …   2      0       0       1

16 FDM Implementation principles (2)
CARD[$\gamma$] = bit string representation of the Boolean attribute $\gamma$
CARD[Group of patients(normal) $\wedge$ Beer 10(yes) $\wedge$ Beer 12(yes)] = Group of patients[normal] $\wedge$ Beer 10[yes] $\wedge$ Beer 12[yes]
Count( · ) – number of 1s in the bit string
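
The card idea can be sketched with Python integers used as bit strings, where bit i belongs to object i. It is assumed here that the condition is a conjunction, so its card is the bitwise AND of the cards of its parts; `make_cards`, `count`, and the four toy patients are illustrative and not taken from the slides.

```python
def make_cards(values):
    """Map each category to an int whose bit i is 1 iff object i has that category."""
    cards = {}
    for i, v in enumerate(values):
        cards[v] = cards.get(v, 0) | (1 << i)
    return cards


def count(card):
    """Number of 1 bits in the bit string."""
    return bin(card).count("1")


# Four hypothetical patients; one list per attribute, one entry per patient.
group = make_cards(["normal", "risk", "normal", "normal"])
beer10 = make_cards(["yes", "yes", "no", "yes"])
beer12 = make_cards(["yes", "no", "no", "yes"])

# Card of the conjunction = bitwise AND of the category cards.
card_gamma = group["normal"] & beer10["yes"] & beer12["yes"]
print(format(card_gamma, "04b"), count(card_gamma))
```

Only patients 0 and 3 satisfy all three parts, so the resulting card has two bits set.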

17 FDM Implementation principles (3) $n_{1,1} = \mathrm{Count}(R[r_1] \wedge C[c_1] \wedge \mathrm{CARD}[\gamma])$
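
Extending the same formula to every cell gives the whole contingency table. The sketch below assumes cards are stored as Python integers with bit i standing for object i; `contingency_from_cards` and the six-object example are hypothetical.

```python
def count(card):
    """Number of 1 bits in the bit string."""
    return bin(card).count("1")


def contingency_from_cards(r_cards, c_cards, cond_card):
    """n[k][l] = Count(R[r_k] AND C[c_l] AND CARD[condition])."""
    return [[count(r & c & cond_card) for c in c_cards] for r in r_cards]


# Six hypothetical objects, numbered 0..5 from the least significant bit.
r_cards = [0b000111, 0b111000]  # R[r1]: objects 0-2, R[r2]: objects 3-5
c_cards = [0b010101, 0b101010]  # C[c1]: objects 0, 2, 4; C[c2]: objects 1, 3, 5
cond_card = 0b011111            # condition holds for objects 0-4

table = contingency_from_cards(r_cards, c_cards, cond_card)
print(table)
```

Because each cell costs only a few bitwise operations and one bit count, the table is obtained without rescanning the data matrix.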

18 FDM Scalability verifications approximately linear

19 FDM Concluding remarks: KL-Miner gives practically interesting results; suitable for interactive work; further quantifiers; combinations with further mining procedures

