HASAR : Mining Sequential Association Rules for Atherosclerosis Risk Factor Analysis Laurent Brisson, Nicolas Pasquier, Céline Hebert, Martine Collard.


1 HASAR: Mining Sequential Association Rules for Atherosclerosis Risk Factor Analysis. Laurent Brisson, Nicolas Pasquier, Céline Hebert, Martine Collard. I3S Laboratory, University of Nice-Sophia Antipolis; GREYC Laboratory, University of Caen

2 Contents 1. Analytic question & Objectives 2. Model & Data Preparation 3. Algorithms 4. Results

3 Analytic Question: Are there any differences in the development of risk factors and other characteristics between men of the risk group who came down with the observed cardiovascular diseases and those who stayed healthy?

4 Objectives: Evolution of risk factors according to behavioural changes. Comparisons: groups RG versus PG and NG; healthy patients (NCVD) versus those with cardiovascular diseases (CVD); groups based on patient education level and job

5 Sequential Rules: IDE_itemset → BEH_time_itemset → RF_time_item. IDE_itemset: static identification attributes: age of the patient; educational level of the patient; alcohol consumption at the beginning of the study

6 Sequential Rules: IDE_itemset → BEH_time_itemset → RF_time_item. BEH_time_itemset: behavioural change attributes: consumption of cigarettes a day; physical activity after job; physical activity in a job; different kinds of diet; medicine for cholesterol; medicine for blood pressure

7 Sequential Rules: IDE_itemset → BEH_time_itemset → RF_time_item. RF_time_item: risk factor change attribute: cholesterol level; HDL cholesterol level; LDL cholesterol level; triglycerides level; obesity; …

8 Model: IDE_itemset → BEH_time_itemset → RF_time_item. Action period: a period in which at least one control occurs. Latency period: a waiting time before observing effects. Observation period: a period in which exactly one control occurs
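The three periods can be pictured as consecutive windows over one patient's ordered list of controls. The following is only a minimal sketch: the function name `split_windows`, its parameters, and the choice to measure every period in numbers of controls are assumptions made for illustration, not details given in the slides.

```python
def split_windows(controls, action_start, action_len, latency_len):
    """Split an ordered list of controls into action, latency and observation parts.

    Assumption: periods are counted in controls (not in months or years).
    Returns None when the patient has too few controls for the requested windows.
    """
    if action_start < 0 or latency_len < 0:
        raise ValueError("action_start and latency_len must be non-negative")
    if action_len < 1:
        raise ValueError("the action period must contain at least one control")

    action_end = action_start + action_len
    observation_index = action_end + latency_len
    if observation_index >= len(controls):
        return None

    return {
        "action": controls[action_start:action_end],        # at least one control
        "latency": controls[action_end:observation_index],  # waiting time, may be empty
        "observation": controls[observation_index],         # exactly one control
    }
```

For a patient with five controls, `split_windows(controls, 0, 2, 1)` would use the first two controls as the action period, skip the third, and observe the fourth.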

9 Data Preparation: creation of change variables. Example: BEH_OBESITY, computed from Entry.height = h and Control.weight = w at control N and control N+1

BEH_OBESITY   id   Control N          Control N+1
Unknown       -2   h or w = *         h or w = Unknown
Unknown       -2   h or w = Unknown   h or w = *
Stay_normal    1   w / h² <= 25       w / h² <= 25
Decreased      2   w / h² > 25        w / h² <= 25
Increased      3   w / h² <= 25       w / h² > 25
Stay_high      4   w / h² > 25        w / h² > 25
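The BEH_OBESITY encoding above can be sketched in a few lines. The labels, ids and the threshold of 25 come from the slide; the function names, the use of `None` for an unknown measurement, and the units (metres and kilograms) are assumptions for illustration.

```python
from typing import Optional, Tuple

BMI_THRESHOLD = 25.0  # from the slide: w / h² <= 25 is the "normal" side


def bmi(height_m: Optional[float], weight_kg: Optional[float]) -> Optional[float]:
    """Return w / h², or None when a measurement is unknown (assumed to be None)."""
    if height_m is None or weight_kg is None or height_m <= 0:
        return None
    return weight_kg / (height_m ** 2)


def beh_obesity(height_m: Optional[float],
                weight_n: Optional[float],
                weight_n1: Optional[float]) -> Tuple[str, int]:
    """Encode the change between control N and control N+1 as (label, id)."""
    before = bmi(height_m, weight_n)
    after = bmi(height_m, weight_n1)
    if before is None or after is None:
        return ("Unknown", -2)

    high_before = before > BMI_THRESHOLD
    high_after = after > BMI_THRESHOLD
    if not high_before and not high_after:
        return ("Stay_normal", 1)
    if high_before and not high_after:
        return ("Decreased", 2)
    if not high_before and high_after:
        return ("Increased", 3)
    return ("Stay_high", 4)
```

The other behavioural and risk factor change variables would presumably follow the same pattern with their own thresholds, which the slides do not list.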

10 Data Preparation: flattening operation. Initial table (1 row per control): ID Control | ID Patient | BEH 1 … BEH j | RF 1 … RF k

11 Data Preparation: flattening operation. Flattened table (1 row per patient): ID Patient | IDE 1 … IDE j (static attributes) | BEH 1 … BEH k, RF 1 … RF m (control 1) | … | BEH 1 … BEH k, RF 1 … RF m (control n)
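The flattening step, from one row per control to one row per patient, can be sketched with plain dictionaries. The key names (`patient_id`, `control_no`) and the `_c1`, `_c2`, … column suffixes are hypothetical; the slides only describe the shape of the two tables.

```python
from collections import defaultdict


def flatten(controls, static):
    """Turn per-control rows into one flat row per patient.

    controls: list of dicts holding 'patient_id', 'control_no' and BEH/RF values.
    static:   dict mapping patient_id to a dict of static IDE attributes.
    """
    by_patient = defaultdict(list)
    for row in controls:
        by_patient[row["patient_id"]].append(row)

    flat = {}
    for pid, rows in by_patient.items():
        out = {"patient_id": pid}
        out.update(static.get(pid, {}))  # static attributes appear once per patient
        ordered = sorted(rows, key=lambda r: r["control_no"])
        for position, row in enumerate(ordered, start=1):
            for key, value in row.items():
                if key in ("patient_id", "control_no"):
                    continue
                out[f"{key}_c{position}"] = value  # one column group per control
        flat[pid] = out
    return flat
```

Patients with fewer controls simply end up with fewer column groups in this sketch; a real pipeline would need to decide how to pad or mark the missing ones.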

12 Evolutionary Approach: a genetic algorithm searching for temporal rules. Fixed-length chromosome: Identification | Behaviours | Risk factor

13 Evolutionary Approach: one gene for each static identification attribute. Chromosome: IDE 1 … IDE j | Behaviours | Risk factor

14 Evolutionary Approach: one gene for each kind of behavioural change. Chromosome: Identification | BEH 1 … BEH k (action period) | Risk factor

15 Evolutionary Approach: one gene to describe a risk factor. Chromosome: Identification | Behaviours (action period) | RF i (observation period)

16 Evolutionary Approach. Chromosome: Identification | Behaviours (action period) | latency period | RF i (observation period). Fitness function: support * confidence * lift
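The fitness function can be illustrated with the standard definitions of support, confidence and lift for a rule A → C. This is a sketch under assumptions: the rule is represented as two dictionaries of attribute/value conditions over flattened patient rows, and the names `matches` and `fitness` are hypothetical. How HASAR decodes a chromosome into those conditions is not shown in the slides.

```python
def matches(row, conditions):
    """True when the row satisfies every attribute == value condition."""
    return all(row.get(attr) == value for attr, value in conditions.items())


def fitness(rows, antecedent, consequent):
    """Return support * confidence * lift of the rule antecedent -> consequent."""
    n = len(rows)
    n_a = sum(1 for r in rows if matches(r, antecedent))
    n_c = sum(1 for r in rows if matches(r, consequent))
    n_ac = sum(1 for r in rows
               if matches(r, antecedent) and matches(r, consequent))
    if n == 0 or n_a == 0 or n_c == 0:
        return 0.0  # rule cannot be evaluated on this data

    support = n_ac / n               # P(A and C)
    confidence = n_ac / n_a          # P(C | A)
    lift = confidence / (n_c / n)    # P(C | A) / P(C)
    return support * confidence * lift
```

Multiplying the three measures rewards rules that are frequent, reliable, and more informative than the consequent's base rate at the same time; a rule that fails on any one of them scores low.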

17 Genetic Algorithm Optimization: a CLOSE-based approach for initialization. The CLOSE algorithm improves: extraction efficiency, by reducing the search space (use of generators and frequent closed itemsets); relevance of results, by suppressing redundant rules (generation of bases)
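Closed itemsets rest on a closure operator: the closure of an itemset is the set of items shared by every transaction that contains it, and an itemset is closed when it equals its own closure. The sketch below shows only that operator on a toy dataset; it is not the CLOSE algorithm itself, and the function names are chosen for illustration.

```python
def support_count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def closure(itemset, transactions):
    """Intersection of all transactions containing the itemset.

    Returns None when no transaction contains the itemset.
    """
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return None
    common = set(covering[0])
    for t in covering[1:]:
        common &= t
    return frozenset(common)


def is_closed(itemset, transactions):
    """An itemset is closed when it is equal to its own closure."""
    return closure(itemset, transactions) == frozenset(itemset)
```

Because an itemset and its closure always have the same support, keeping only closed itemsets loses no support information while shrinking the set of candidates, which is the search-space reduction the slide refers to.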

18 Results: patient classes comparison. Best rules on PG versus NG and RG

19 Results: patient classes comparison. Best rules on CVD versus NCVD

20 Results: initialization methods. Comparison on RG group

21 Conclusion: different tendencies among groups; confirmation of prior medical knowledge; contradictions with some "assumptions"; further investigations with assistance of medical experts

22 Future Research: to analyse relationships between time windows and various risk factors; to develop new evaluation criteria; to integrate physicians' prior knowledge; to apply the HASAR approach to other temporal datasets

