1 www.eu-eela.org
E-science grid facility for Europe and Latin America
E2GRIS1
Gustavo Miranda Teixeira, Ricardo Silva Campos
Laboratório de Fisiologia Computacional - UFJF
Itacuruça (Brazil), 2-15 November 2008
Heart Simulator Final Report

2 Introduction
www.eu-eela.eu Itacuruça (Brazil), E2GRIS1, 2.11.2008 – 15.11.2008
InvCell
– Genetic algorithm (GA) that estimates the parameters of a set of equations that simulate cardiac cells
Heart Simulator
– Forward problem
– Solves the bidomain equations, which describe how the voltage varies in cardiac tissue
InvTissue
– GA that estimates the parameters of the bidomain equations
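To make the idea of GA-based parameter estimation concrete, here is a minimal sketch in Python. It is not the InvCell code: the toy model `a*t + b`, the function names, and all numeric settings are assumptions chosen for illustration. In a real inverse problem the fitness evaluation would run a cardiac cell simulation, and that is the expensive step a master would hand out to slaves.

```python
import random

def fitness(params, data):
    # Sum of squared errors between the toy model a*t + b and the data.
    # (Assumption: a real inverse problem would run a cell model here.)
    a, b = params
    return sum((a * t + b - y) ** 2 for t, y in data)

def evolve(data, pop_size=30, generations=60, sigma=0.3, seed=0):
    rng = random.Random(seed)
    # Random initial population of candidate parameter pairs (individuals)
    pop = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda p: fitness(p, data))
        parents = pop[: pop_size // 2]  # selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from one of the two parents
            child = tuple(rng.choice(pair) for pair in zip(p1, p2))
            # Gaussian mutation on every gene
            child = tuple(g + rng.gauss(0, sigma) for g in child)
            children.append(child)
        pop = parents + children
    return min(pop, key=lambda p: fitness(p, data))

# Synthetic "measurements" generated with a = 2, b = 1
data = [(t, 2.0 * t + 1.0) for t in range(10)]
best = evolve(data)
print("estimated parameters:", best)
```

Because the fitter half of the population is always kept, the best individual never gets worse from one generation to the next.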

3 InvCell
This application was implemented using MPI
Cluster with 6 nodes
The first step was to modify the algorithm's parallelism strategy

4 InvCell
Old strategy of parallelism:
– Synchronous
– The master splits all individuals equally among the slaves
The new one:
– Asynchronous
– The more efficient slaves do more work
[Diagram: a master and its slaves; individuals wait for a free slave; a slave tells the master "I finished, send me more work!"]

5 InvCell
Why asynchronous instead of synchronous?
– The synchronous method works well in a cluster of identical computers
– The grid infrastructure is very heterogeneous
– So the asynchronous method avoids wasting computational resources
– The master and the other slaves don't have to wait for the slowest slave to finish its task
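The difference between the two strategies can be illustrated with a small simulation. This is only a sketch under stated assumptions: the task costs and worker speeds are made up, communication time is ignored, and the functions `static_makespan` and `dynamic_makespan` are hypothetical names, not part of InvCell.

```python
import heapq

def static_makespan(task_costs, speeds):
    # Synchronous strategy: individuals are split evenly (round-robin)
    # up front, so everyone waits for the slowest slave.
    loads = [0.0] * len(speeds)
    for i, cost in enumerate(task_costs):
        w = i % len(speeds)
        loads[w] += cost / speeds[w]
    return max(loads)

def dynamic_makespan(task_costs, speeds):
    # Asynchronous strategy: the next individual goes to whichever
    # slave becomes free first ("I finished, send me more work!").
    free_at = [(0.0, w) for w in range(len(speeds))]
    heapq.heapify(free_at)
    finish = 0.0
    for cost in task_costs:
        t, w = heapq.heappop(free_at)
        t += cost / speeds[w]
        finish = max(finish, t)
        heapq.heappush(free_at, (t, w))
    return finish

tasks = [1.0] * 60                         # 60 individuals of equal cost
speeds = [1.0, 1.0, 1.0, 1.0, 1.0, 0.25]   # one slave is four times slower

print("synchronous :", static_makespan(tasks, speeds))
print("asynchronous:", dynamic_makespan(tasks, speeds))
```

With identical speeds both functions give the same total time, which matches the observation that the synchronous method is fine on a homogeneous cluster.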

6 InvCell
During the porting...
– We had problems compiling the application on the user interface:
  • Lots of libraries
  • We compiled it statically, but the static executable file is very big and the grid takes more time to deal with it
  • So our tutor installed the libraries on the worker nodes

7 InvCell
We fixed some problems in the code
Next step:
– Put all parameters in the JDL file
Problems:
– Some parameters were often lost
– Access permissions on the parameter files
Solutions:
– Use a script to copy and register the parameter files in the Storage Element (SE)
– Use a script to copy the executable file to each machine used in the execution
– Change some access permissions
The forward problem had the same problems

8 The Simulator Porting
We had already tried porting to GILDA before E2GRIS1
We created a JDL file with JobType MPICH and the number of nodes to run on
The job was submitted successfully, but its status remained “running” until the proxy expired
We tried the same thing at E2GRIS1
– The same problem happened
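For readers unfamiliar with JDL, the sketch below generates a job description of the kind mentioned on this slide. It assumes the usual gLite attribute names (`JobType`, `NodeNumber`, `Executable`, `InputSandbox`, and so on); the file names and the helper `render_jdl` are hypothetical and do not come from the presentation.

```python
def render_jdl(executable, node_number, input_files, output_files):
    # Format a Python list as a JDL list: {"a", "b"}
    def jdl_list(items):
        return "{" + ", ".join(f'"{i}"' for i in items) + "}"

    lines = [
        'Type = "Job";',
        'JobType = "MPICH";',               # MPI job, as on the slide
        f"NodeNumber = {node_number};",     # number of nodes to run on
        f'Executable = "{executable}";',
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        f"InputSandbox = {jdl_list(input_files)};",
        f"OutputSandbox = {jdl_list(output_files)};",
    ]
    return "\n".join(lines) + "\n"

# Hypothetical file names, for illustration only
jdl = render_jdl("simulator", 6, ["simulator"], ["std.out", "std.err"])
print(jdl)
```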

9 The Simulator Porting
The tutors showed us where the problem could be
Many problems
– Too many parameters
– Parameter files too big
Use the Storage Elements (SE)
– Send the parameters to the SE
– Retrieve the files before the execution
– Every worker node copies the files to its local disk

10 The Simulator Porting
The shell script
– Used on every worker node to copy files from the storage element
– Changes the binary's permissions
– Runs the binary file
– Copies the results back to the storage element
The JDL file was changed to run the shell script instead of the binary
Each worker node then ran a sequential version of the binary file
– The MPICH parameters that set the master and slaves were lost
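The steps of that script can be sketched as a list of commands, written here in Python rather than shell. The presentation does not say which tools were used; this sketch assumes the lcg-utils commands `lcg-cp` (copy from the SE) and `lcg-cr` (copy and register in the SE). The VO name, LFN directory, SE host, and file names are all hypothetical, and the function only builds the commands without executing them.

```python
import os

def staging_commands(vo, lfn_dir, se_host, inputs, binary, outputs, workdir=None):
    workdir = workdir or os.getcwd()
    cmds = []
    # 1. Copy the parameter files and the binary from the storage element
    for name in list(inputs) + [binary]:
        cmds.append(["lcg-cp", "--vo", vo,
                     f"lfn:{lfn_dir}/{name}", f"file:{workdir}/{name}"])
    # 2. Make the binary executable
    cmds.append(["chmod", "+x", f"{workdir}/{binary}"])
    # 3. Run the binary
    cmds.append([f"{workdir}/{binary}"])
    # 4. Copy and register the results back in the storage element
    for name in outputs:
        cmds.append(["lcg-cr", "--vo", vo, "-d", se_host,
                     "-l", f"lfn:{lfn_dir}/{name}", f"file:{workdir}/{name}"])
    return cmds

# All values below are placeholders, not taken from the presentation
plan = staging_commands(
    vo="gilda", lfn_dir="/grid/gilda/heart", se_host="se.example.org",
    inputs=["params.txt"], binary="simulator", outputs=["result.dat"],
    workdir="/tmp/job",
)
for argv in plan:
    print(" ".join(argv))
```

Each entry in `plan` is an argument list that could be passed to `subprocess.run(argv, check=True)` on a worker node.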

11 The Simulator Porting
The solution: use MPI hooks and some environment variables
A set of environment variables captures the parameters used by MPICH to run in parallel
Code can be included to execute before the application starts and after it finishes
– Pre-run hooks
  • Here the parameter files were copied from the SEs
– Post-run hooks
  • The result files were copied to the SE

12 The Simulator Porting
The job still showed the status “running” until the proxy expired
The executable binary was too big to be copied with the InputSandbox
– 50+ MB
Same solution as for the parameter files
– Copy and register the binary in the storage element
– Copy the binary to the WN in the pre-run hooks

13 Conclusion
E2GRIS1 helped us gridify 2 of the 3 applications we intended to
– InvCell is gridified
– The Simulator is gridified
– InvTissue is not
InvTissue could not be ported because we couldn't contact the developer to fix some issues that are not related to the grid
We believe the same gridification process used for the other applications would work for it

14 Questions …

