1 Solving ILP Problems in the EELA infrastructure Inês Dutra Departamento de Ciência de Computadores Universidade do Porto, Portugal.

1 Solving ILP Problems in the EELA infrastructure
Inês Dutra
Departamento de Ciência de Computadores
Universidade do Porto, Portugal

2 Outline
Introduction
  – ILP
  – Examples
  – Motivation
Experiments
Conclusions
Future Work

3 Introduction
EELA selected application
Task 3.3: additional applications

4 Introduction
What is ILP?
  – It is NOT Instruction Level Parallelism
  – It is NOT Integer Linear Programming
So, what is it?

5 Introduction
It is Inductive Logic Programming
  – Data mining
  – Machine learning
  – Knowledge/information extraction
Where:
  – Given:
      Set of observations (positive and negative)
      Background knowledge (descriptions)
      Language bias
  – Find:
      A hypothesis (in a first-order language) that best explains all positive observations and none of the negatives.
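
The "Find" criterion above can be illustrated with a small Python sketch. This is not how an ILP system searches for hypotheses; it only checks a candidate hypothesis against examples. The function names and the toy integer examples are hypothetical and chosen purely for illustration.

```python
from typing import Callable, Sequence, Tuple, TypeVar

Example = TypeVar("Example")


def coverage(
    hypothesis: Callable[[Example], bool],
    positives: Sequence[Example],
    negatives: Sequence[Example],
) -> Tuple[int, int]:
    """Return (number of positives covered, number of negatives covered)."""
    pos_covered = sum(1 for e in positives if hypothesis(e))
    neg_covered = sum(1 for e in negatives if hypothesis(e))
    return pos_covered, neg_covered


def explains_all_and_only_positives(
    hypothesis: Callable[[Example], bool],
    positives: Sequence[Example],
    negatives: Sequence[Example],
) -> bool:
    """True if the hypothesis covers every positive and no negative."""
    pos_covered, neg_covered = coverage(hypothesis, positives, negatives)
    return pos_covered == len(positives) and neg_covered == 0


# Hypothetical toy observations (not from the slides).
positives = [2, 4, 6]
negatives = [1, 3, 5]


def is_even(n: int) -> bool:
    return n % 2 == 0


def greater_than_one(n: int) -> bool:
    return n > 1


if __name__ == "__main__":
    print(explains_all_and_only_positives(is_even, positives, negatives))
    print(coverage(greater_than_one, positives, negatives))
```

In practice a hypothesis rarely separates the examples perfectly, which is why the slide says "best explains": the pair returned by `coverage` is the kind of quantity a scoring function could trade off.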

6 Introduction
Advantages:
  – Use of an understandable description language
  – Relational knowledge

7 Introduction: example
[Figure: trains going east | trains going west]

8 Introduction: example
    short(car_12). closed(car_12).
    long(car_11). long(car_13). short(car_14).
    open_car(car_11). open_car(car_13). open_car(car_14).
    shape(car_11,rectangle). shape(car_12,rectangle).
    shape(car_13,rectangle). shape(car_14,rectangle).
    load(car_11,rectangle,3). load(car_12,triangle,1).
    load(car_13,hexagon,1). load(car_14,circle,1).
    wheels(car_11,2). wheels(car_12,2).
    wheels(car_13,3). wheels(car_14,2).
    has_car(east1,car_11). has_car(east1,car_12).
    has_car(east1,car_13). has_car(east1,car_14).

9 Introduction: example
[Figure: trains going east | trains going west]

10 Introduction: example
    eastbound(T) IF has_car(T,C) AND short(C) AND closed(C)
[Figure: trains going east | trains going west]
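
As a rough illustration of what the learned rule says, the sketch below encodes the relevant facts for train east1 from the example slide as Python sets and evaluates the rule against them. The Python encoding (set and dictionary names, the `witnesses` helper) is an assumption made for this sketch; an ILP system such as Aleph would evaluate the rule directly in Prolog.

```python
# Facts for train east1, transcribed from the example slide.
SHORT = {"car_12", "car_14"}
CLOSED = {"car_12"}
HAS_CAR = {"east1": ["car_11", "car_12", "car_13", "car_14"]}


def witnesses(train: str) -> list:
    """Cars C such that has_car(train, C), short(C) and closed(C) all hold."""
    return [
        car
        for car in HAS_CAR.get(train, [])
        if car in SHORT and car in CLOSED
    ]


def eastbound(train: str) -> bool:
    """eastbound(T) IF has_car(T,C) AND short(C) AND closed(C)."""
    return len(witnesses(train)) > 0


if __name__ == "__main__":
    print(eastbound("east1"), witnesses("east1"))
```

A train with no recorded facts is treated as not eastbound, which mirrors the closed-world reading used by Prolog.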

11 Another less “toyish” example: extracting knowledge from mammograms
    is_malignant(A) if
        'BIRADS_category'(A,b5),
        'MassPAO'(A,present),
        'Age'(A,age6570),
        previous_finding(A,B,C),
        'MassesShape'(B,none),
        'Calc_Punctate'(B,notPresent),
        previous_finding(A,C),
        'BIRADS_category'(C,b3).
This rule states that finding (A) IS malignant IF it:
  – was classified as BI-RADS 5
  – AND had a mass present
in a patient who:
  – was between the ages of 65 and 70
  – had two prior mammograms (B, C)
and prior mammogram (B):
  – had no mass shape described
  – had no punctate calcifications
and prior mammogram (C):
  – was classified as BI-RADS 3

12 Introduction: Motivation
Applications:
  – Link discovery
  – Social Network Analysis
  – Equivalent identities
  – Drug design
  – Protein unfolding
  – Protein metabolism
  – Why not? Classifying grid failures
  – And... many others!

13 Introduction: Motivation
Why does ILP need a grid?
  – Search space can become large very quickly
  – Need many experiments to obtain statistically significant results
      Cross-validation
      Training, tuning, testing
  – Can combine classifiers: ensembles

14 Introduction: Motivation
Assume we want to run a task for one domain: find a “good” hypothesis that describes the positive examples
Assume we run 5x4-fold cross-validation (5 repetitions of 4 folds = 20 folds)
Assume we have 100 classifiers per fold
# of experiments: 5 x 4 x 100 = 2,000

15 Introduction: Motivation
Now assume each experiment takes 1 hour to run
How long would it take to generate the 2,000 classifiers to be combined if they run one after another?
~ 83 days!!! (2,000 hours / 24 hours per day)
If we also vary learning parameters and learning algorithms, this number can become really big!!
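
The arithmetic on the two motivation slides can be reproduced with a short Python sketch. The experiment counts and the 1-hour runtime come from the slides; the `wall_clock_days` helper and its `n_workers` parameter are hypothetical and assume an idealised grid in which every worker is always available and all jobs take the same time.

```python
import math
from itertools import product

REPETITIONS = 5              # "5x" in 5x4-fold cross-validation
FOLDS = 4                    # folds per repetition
CLASSIFIERS_PER_FOLD = 100   # ensemble members learned per fold
HOURS_PER_EXPERIMENT = 1.0   # runtime assumed on the slide


def enumerate_experiments(reps=REPETITIONS, folds=FOLDS,
                          classifiers=CLASSIFIERS_PER_FOLD):
    """One (repetition, fold, classifier) triple per independent job."""
    return list(product(range(reps), range(folds), range(classifiers)))


def wall_clock_days(n_experiments, hours_each, n_workers):
    """Idealised elapsed time when jobs run in rounds of n_workers."""
    rounds = math.ceil(n_experiments / n_workers)
    return rounds * hours_each / 24


experiments = enumerate_experiments()
sequential_days = len(experiments) * HOURS_PER_EXPERIMENT / 24

if __name__ == "__main__":
    print(len(experiments), "experiments")
    print(round(sequential_days, 1), "days if run sequentially")
    print(round(wall_clock_days(len(experiments), 1.0, 100), 2),
          "days with 100 hypothetical workers")
```

Because each triple is independent of the others, the experiments map naturally onto separate grid jobs, which is the point the slides are making.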

16 Experiment
Predict carcinogenicity in rodents
  – Difficult task
  – Large search space!
  – Important problem
Phase 1:
  – Tuning using 5x4-fold cross-validation
  – Generating ensembles of up to 100 classifiers
Aleph: well-known ILP system
YAP: Yet Another Prolog

17 Experiment: one of the classifiers
    active(A) if
        atom(A,_,n,32,B), B ≤ -0.401,
        has_property(A,cytogen_sce,n),
        methyl(A,_).
Sister Chromatid Exchange (SCE): SCE is used for the determination of mutagenicity

18 Experiment
2 submissions:
  – From LA (Latin America)
  – From EU

19 Submitting jobs from LA...

20 Experiment
EELA resources utilised
    Resource        # of jobs
    CERN                1,160
    CIEMAT                279
    CETA-CIEMAT           173
    UniCan                 98
    LIP                    10
    INFN                   38
    UNAM                   16
    BIOF.UFRJ             159
    IF.UFRJ                 8
    UFCG                   28
    Total               1,969
~ 300 resources in LA
211 jobs in LA
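
The totals in the table can be cross-checked with a few lines of Python. The per-resource job counts are taken from the table; the grouping of UNAM, BIOF.UFRJ, IF.UFRJ and UFCG as the Latin American sites is an assumption made here (their counts add up to the 211 LA jobs reported), and the figure of 2,000 submitted jobs is the experiment count from the motivation slides.

```python
# Completed jobs per resource, as listed in the table.
JOBS_PER_RESOURCE = {
    "CERN": 1160,
    "CIEMAT": 279,
    "CETA-CIEMAT": 173,
    "UniCan": 98,
    "LIP": 10,
    "INFN": 38,
    "UNAM": 16,
    "BIOF.UFRJ": 159,
    "IF.UFRJ": 8,
    "UFCG": 28,
}

# Assumption: these four resources are the Latin American ones.
LATIN_AMERICAN_SITES = {"UNAM", "BIOF.UFRJ", "IF.UFRJ", "UFCG"}

SUBMITTED = 2000  # experiments planned on the motivation slides

total_jobs = sum(JOBS_PER_RESOURCE.values())
la_jobs = sum(n for site, n in JOBS_PER_RESOURCE.items()
              if site in LATIN_AMERICAN_SITES)
eu_jobs = total_jobs - la_jobs
failed_jobs = SUBMITTED - total_jobs
failure_rate = failed_jobs / SUBMITTED

if __name__ == "__main__":
    print("total:", total_jobs, "LA:", la_jobs, "EU:", eu_jobs)
    print("failed:", failed_jobs, "rate: {:.2%}".format(failure_rate))
```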

21 Experiments
Why 1,969 out of 2,000???
2 reasons:
  – Proxy expiration:
      On submission (takes loooooong!!!)
      On execution
  – Use of dynamic libraries

22 Submitting jobs from EU...
From a non-EELA site, BUT using the EELA VO:
  – Jobs run only on EU resources...
Reasons:
  – Misconfiguration?
  – Closer brokers with more machines?

23 Conclusions
Happiness: EELA is working!!!
We can run thousands of experiments!
Frida is happy!!! (see the Condor introductory tutorials if you are curious about Frida)
Experiment showed good utilization of EELA resources in LA and EU
Low failure rate (roughly 1.5%: 31 of 2,000 jobs)
Failures caused by:
  – Dynamic libs not available on the remote machine
  – Proxy expiration

24 Future work
More detailed analysis of jobs and logs
Full ILP experiment
More domains
Other kinds of experiments based on Statistical Relational Learning
And, do not forget: ILP can help to model and diagnose errors in the grid environment!

25 Collaborators
Fernando Silva (DCC-UPorto)
Vítor Santos Costa (DCC-UPorto)
Rui Camacho (FE-UPorto)
Nuno Fonseca (IBMC/IBMEC, Porto)
Beth Burnside (UW-Madison hospital)
David Page (UW-Madison)
Jesse Davis (UWashington)

26 Thanks!!! Questions??

