RISICO on the GRID architecture First implementation Mirko D'Andrea, Stefano Dal Pra.


1 RISICO on the GRID architecture First implementation Mirko D'Andrea, Stefano Dal Pra

2 Outline of the presentation ➲ Porting features; ➲ Jobs management; ➲ Implementation tests and results; ➲ Conclusions and further development.

3 Porting features ➲ Implemented entirely in Python. ➲ Uses the same executable as the RISICO system (no changes needed). ➲ Easily configurable through a configuration file.

4 The RISICO system ➲ Italy: 310,000 km² ➲ Current system: 300k regular cells, 1 km side. ➲ Grid version (after gridification): 30M regular cells, 0.1 km side.
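The cell counts above follow from the area and the cell side. A minimal sketch of that arithmetic, assuming square cells that tile the whole area exactly (the function name `cell_count` is illustrative, not part of RISICO); the slide rounds the results to 300k and 30M:

```python
ITALY_AREA_KM2 = 310_000  # area quoted on the slide


def cell_count(area_km2: float, side_km: float) -> int:
    """Number of square cells with the given side needed to cover the area."""
    return round(area_km2 / (side_km ** 2))


current_cells = cell_count(ITALY_AREA_KM2, 1.0)  # 1 km cells
grid_cells = cell_count(ITALY_AREA_KM2, 0.1)     # 0.1 km cells

# A tenfold finer side means a hundredfold more cells.
ratio = grid_cells // current_cells
```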

5 RISICO vs GRID-RISICO
➲ RISICO: Get input from database; Run RISICO; Write output to database.
➲ GRID-RISICO (after gridification): Get input from database; Upload input into catalog; Create n jobs; Collect outputs from catalog; Write outputs to database.
➲ Job i (i = 1 ... n): Get input from catalog; Run RISICO on dataset i; Write output i to catalog.

6 Job submission ➲ A RISICO job is fully defined by a JDL (Job Description Language) file and by a parameter file. ➲ Each submitted job must terminate successfully within a defined time. The job activity is monitored by a software module called JobMonitor. ➲ The job submission procedure is handled by a JobSubmitter, which creates a set of jobs and associates a JobMonitor with each job.
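As an illustration of what "defined by a JDL file and a parameter file" could look like, here is a minimal sketch that generates the JDL text for one job. The attribute names (`Executable`, `Arguments`, `StdOutput`, `StdError`, `InputSandbox`, `OutputSandbox`) are common JDL attributes; the wrapper script `run_risico.sh`, the file names, and the function `make_jdl` are assumptions, not taken from the presentation.

```python
def make_jdl(job_index: int, params_file: str,
             executable: str = "run_risico.sh") -> str:
    """Build the JDL text for one job (hypothetical file names)."""
    tag = f"{job_index:02d}"
    # Files shipped to the worker node together with the job.
    sandbox_in = ", ".join(f'"{name}"' for name in (executable, params_file))
    lines = [
        f'Executable = "{executable}";',
        f'Arguments = "{params_file}";',
        f'StdOutput = "risico_{tag}.out";',
        f'StdError = "risico_{tag}.err";',
        f'InputSandbox = {{{sandbox_in}}};',
        f'OutputSandbox = {{"risico_{tag}.out", "risico_{tag}.err"}};',
    ]
    return "\n".join(lines) + "\n"


jdl_text = make_jdl(1, "params_01.cfg")
```

A JobSubmitter in this style would call such a function once per job and hand each resulting file to the grid submission command.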

7 Job Monitoring ➲ Each job is monitored by an instance of a module called JobMonitor. ➲ The JobMonitor: checks the job status during execution; retrieves the job output from the catalog; tries to resubmit the job if it fails; logs the error if the job fails to run correctly.
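The monitoring behaviour described above (poll the status, enforce a time limit, resubmit on failure, log errors) can be sketched as follows. This is not the authors' JobMonitor: the grid calls are replaced by injected callables, and the state names `"DONE"` and `"FAILED"`, the retry count, and the timeout values are all assumptions.

```python
import logging
import time
from typing import Callable, Optional

log = logging.getLogger("job_monitor")


class JobMonitor:
    """Follow one job: poll it, resubmit on failure, log every error."""

    def __init__(self, submit: Callable[[], str],
                 status: Callable[[str], str],
                 fetch_output: Callable[[], bytes],
                 max_attempts: int = 3,
                 timeout_seconds: float = 600.0,
                 poll_seconds: float = 10.0) -> None:
        self.submit = submit              # submits the job, returns its id
        self.status = status              # maps a job id to a state string
        self.fetch_output = fetch_output  # reads the output from the catalog
        self.max_attempts = max_attempts
        self.timeout_seconds = timeout_seconds
        self.poll_seconds = poll_seconds

    def _wait(self, job_id: str) -> str:
        """Poll until the job finishes or the time limit expires."""
        deadline = time.monotonic() + self.timeout_seconds
        while time.monotonic() < deadline:
            state = self.status(job_id)
            if state in ("DONE", "FAILED"):
                return state
            time.sleep(self.poll_seconds)
        return "TIMEOUT"

    def run(self) -> Optional[bytes]:
        """Return the job output, or None once every attempt has failed."""
        for attempt in range(1, self.max_attempts + 1):
            job_id = self.submit()
            state = self._wait(job_id)
            if state == "DONE":
                return self.fetch_output()
            log.error("job %s ended as %s (attempt %d of %d)",
                      job_id, state, attempt, self.max_attempts)
        return None
```

Passing the grid operations in as callables keeps the retry logic testable without access to a grid.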

8 Workflow: job creation, submission and data collection ➲ Downloads the input from the remote meteorological-data database, creates an archive and uploads it to the catalog; ➲ Creates a JDL file and a parameter file for each job; ➲ Submits the jobs; ➲ Waits for the job outputs; ➲ Gets the job outputs from the catalog and aggregates them.
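The fan-out and aggregation pattern of this workflow can be sketched locally. In the sketch below, threads stand in for grid jobs and a dictionary stands in for the catalog; `run_workflow` and the `run_job` callback are hypothetical names, and the real system submits jobs through the grid middleware instead.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Hashable, List


def run_workflow(shared_input: float,
                 domains: Dict[str, List[Hashable]],
                 run_job: Callable[[float, List[Hashable]], dict],
                 max_workers: int = 4) -> dict:
    """Run one job per spatial subset and merge the per-cell outputs."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Every job receives the same input and its own subset of cells.
        futures = {name: pool.submit(run_job, shared_input, cells)
                   for name, cells in domains.items()}
        merged: dict = {}
        for name, future in futures.items():
            # result() blocks until the job ends and re-raises its errors.
            merged.update(future.result())
    return merged
```

Because the subsets are disjoint, merging the outputs is a plain dictionary update with no conflict handling.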

9 Job definition (1) ➲ Each job works with a specific dataset defining a spatial domain (subset). ➲ These subsets are created off-line and stored in the catalog. ➲ A parameter file states the association between a job and a dataset. ➲ Each job produces an output whose path in the catalog is known a priori.

10 Job definition (2)
Job 1:
Domain: celle/celle_01.tar.bz2
Status: celle/stato0_01.tar.bz2
Input: input/input_20070119.tar.bz2
Output: output/output_01_20071119.tar.bz2
➲ Each job has its own domain. ➲ The job domain, status information and output all refer to the same geographical domain. ➲ All jobs share the same input file.

11 Job definition (3)
Catalog contents:
Job 1:
Domain: celle/celle_01.tar.bz2
Status: celle/stato0_01.tar.bz2
Input: input/input_20070119.tar.bz2
Output: output/output_01_20071119.tar.bz2
Job 2:
Domain: celle/celle_02.tar.bz2
Status: celle/stato0_02.tar.bz2
Input: input/input_20070119.tar.bz2
Output: output/output_02_20071119.tar.bz2
Job n:
Domain: celle/celle_nn.tar.bz2
Status: celle/stato0_nn.tar.bz2
Input: input/input_20070119.tar.bz2
Output: output/output_nn_20071119.tar.bz2
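Since every catalog path follows a fixed pattern, the parameter set for job i can be derived from its index. A minimal sketch, assuming two-digit zero-padded job indices (inferred from `01`, `02`, `nn`) and taking the input and output date stamps as separate arguments, exactly as they appear on the slides; `job_paths` is a hypothetical helper:

```python
from typing import Dict


def job_paths(job_index: int, input_stamp: str,
              output_stamp: str) -> Dict[str, str]:
    """Catalog paths for one job, following the naming on the slides."""
    tag = f"{job_index:02d}"
    return {
        "domain": f"celle/celle_{tag}.tar.bz2",
        "status": f"celle/stato0_{tag}.tar.bz2",
        # The input archive is shared by all jobs, so it carries no job tag.
        "input": f"input/input_{input_stamp}.tar.bz2",
        "output": f"output/output_{tag}_{output_stamp}.tar.bz2",
    }


job1 = job_paths(1, "20070119", "20071119")
```

This is what makes the output path known before the job runs: the aggregation step can compute where to look without querying the job.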

12 Final version ➲ Estimated performance on the complete data set (30M cells): Total CPU time: about 2 hours and 30 minutes; Optimal number of jobs: about 30 (5-10 minutes of CPU time for each job); Storage: 30 GB/day.

13 Test Results ➲ The porting has been tested with a subset (1M cells) of the final RISICO working set. ➲ 10 parallel jobs were used. ➲ Performance: Job CPU time: 30 seconds; Grid overhead: 2 minutes.
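The test figures can be related to the estimate for the full data set with a short calculation. The sketch below assumes that CPU time scales linearly with the number of cells and that the 2-minute grid overhead applies per job; neither assumption is stated in the presentation. Under them, the test extrapolates to 150 minutes of total CPU time, which matches the estimate of about 2 hours and 30 minutes, or 5 minutes for each of 30 jobs.

```python
TEST_CELLS = 1_000_000
TEST_JOBS = 10
TEST_JOB_CPU_S = 30       # CPU seconds per job in the test
FULL_CELLS = 30_000_000
FULL_JOBS = 30            # "optimal job number" from the estimate
GRID_OVERHEAD_S = 120     # assumed to be paid once per job

# Total CPU seconds spent on the 1M-cell test.
test_cpu_s = TEST_JOBS * TEST_JOB_CPU_S

# Linear extrapolation to the full 30M-cell data set (integer arithmetic).
full_cpu_s = test_cpu_s * FULL_CELLS // TEST_CELLS
full_cpu_minutes = full_cpu_s / 60
per_job_minutes = full_cpu_minutes / FULL_JOBS

# Share of a job's wall time taken by grid overhead, in the test and at scale.
test_overhead_share = GRID_OVERHEAD_S / (GRID_OVERHEAD_S + TEST_JOB_CPU_S)
full_overhead_share = GRID_OVERHEAD_S / (GRID_OVERHEAD_S + per_job_minutes * 60)
```

The overhead dominates the short test jobs but shrinks to under a third of each job at full scale, which is one way to read the recommendation of jobs lasting 5-10 minutes.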

14 Conclusions ➲ RISICO is a feasible and significant test case. ➲ The grid architecture provides valuable benefits to operational activities.

