November 16, 2007 Dominique Boutigny – CC-IN2P3 Grids: Tools for e-Science DoSon AC GRID School.

1 November 16, 2007 Dominique Boutigny – CC-IN2P3 Grids: Tools for e-Science DoSon AC GRID School

2 Main characteristics of a Grid
A grid is an architecture and a set of software tools designed to federate distributed computing resources
Resources are, in principle, heterogeneous
Each node of the grid is administered locally, but central coordination is needed to keep the system coherent
An information system (even a very light one) should be present to match computing tasks to the computing environment
The underlying network is crucial
A security and authorization system should be present
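The matching role of the information system can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the `Site` and `Task` fields, the site names and the ranking rule are hypothetical and do not describe any particular grid middleware.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """What a site publishes to the information system (hypothetical fields)."""
    name: str
    free_cores: int
    memory_gb_per_core: float
    operating_system: str


@dataclass
class Task:
    """Requirements attached to a computing task (hypothetical fields)."""
    cores: int
    memory_gb_per_core: float
    operating_system: str


def match(task, sites):
    """Return the names of the sites able to run the task, most free cores first."""
    candidates = [
        s for s in sites
        if s.free_cores >= task.cores
        and s.memory_gb_per_core >= task.memory_gb_per_core
        and s.operating_system == task.operating_system
    ]
    candidates.sort(key=lambda s: s.free_cores, reverse=True)
    return [s.name for s in candidates]


# Illustrative, made-up site descriptions
sites = [
    Site("site-a", free_cores=16, memory_gb_per_core=2.0, operating_system="linux"),
    Site("site-b", free_cores=4, memory_gb_per_core=1.0, operating_system="linux"),
    Site("site-c", free_cores=64, memory_gb_per_core=2.0, operating_system="windows"),
]
task = Task(cores=8, memory_gb_per_core=2.0, operating_system="linux")
print(match(task, sites))
```

Only `site-a` satisfies all three requirements here: `site-b` has too few free cores and `site-c` runs a different operating system.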

3 Different kinds of production Grids
Computing Grid
Data Grid
Both Computing and Data
Example applications: molecular docking, medical imaging, astronomical data, LHC data processing

4 Grids are a good way to increase the computing power available to a scientific community by pooling resources
Grids federate scientific communities and help to build them
But grids are often complicated to manage: a large grid requires strong coordination between the participating sites

5 The LHC Computing Grid (LCG)

6 We have got a problem with data
4 LHC experiments → 15 PetaBytes of data per year
[Figure: CD stack with 1 year of LHC data (~20 km) compared with Mt. Blanc (4.8 km), Concorde (15 km) and a balloon (30 km)]
100 million SpecInt2000
– This is ~5000 of today's 8-core computers → ~15 M$
– Relatively easy to set up: each CPU core is independent of the others
15 PetaBytes of data per year
– Today, this is ~20 M$ if you want to put them on disk
– You also need to store the Monte Carlo simulation
– Data need to be stored securely for the whole life of the experiments
– Complicated architecture, as the data have to move worldwide
– Each LHC contributor should be able to access any data
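The orders of magnitude on this slide can be checked with a few lines of arithmetic. The CD capacity (700 MB) and thickness (1.2 mm) below are assumptions, not figures from the slide, so the result is only a rough cross-check.

```python
# Figures quoted on the slide
data_per_year_bytes = 15e15   # 15 PetaBytes per year
total_si2k = 100e6            # 100 million SpecInt2000
n_computers = 5000            # ~5000 computers
cores_per_computer = 8        # 8 cores each

# Assumptions (not from the slide): a standard CD holds ~700 MB and is 1.2 mm thick
cd_capacity_bytes = 700e6
cd_thickness_m = 1.2e-3

n_cds = data_per_year_bytes / cd_capacity_bytes
stack_height_km = n_cds * cd_thickness_m / 1000
si2k_per_core = total_si2k / (n_computers * cores_per_computer)

print(f"CDs needed per year: {n_cds / 1e6:.1f} million")
print(f"Stack height: {stack_height_km:.0f} km")
print(f"Implied power per core: {si2k_per_core:.0f} SpecInt2000")
```

With these assumed CD dimensions the stack comes out at a few tens of kilometres, the same order of magnitude as the ~20 km quoted, and the computing figures imply about 2500 SpecInt2000 per core.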

7 A Hierarchical Grid Architecture in an International Framework
[Diagram: tiered architecture]
T0
T1 (11): CC-IN2P3, FZK, PIC, NDGF, NIKHEF, ASCC, Brookhaven, Fermilab, TRIUMF, RAL, CNAF
T2 (52)
T3 (many)
French sites shown: Île de France, Clermont, Nantes, Strasbourg, Marseille, Lyon (CC-IN2P3), Annecy

8 LCG vs EGEE
In Europe, the LHC Computing Grid is based on the multidisciplinary project EGEE
→ Middleware
→ Grid operation infrastructure
The Grid was a necessity for LHC computing
It was a very good opportunity for other disciplines
EGEE also provides a very sophisticated operational framework
– Monitoring
– Ticketing system
EGEE-II: 90 partners – 32 countries – 32 M€
→ Crucial for the success of the project

9 LCG vs EGEE

10 (no text on this slide)

11 Interoperability
3 grid infrastructures are being used for LHC computing
– EGEE in Europe
– NorduGrid in the Nordic countries
– OSG in the US
They are based on different middleware
These 3 infrastructures are now able to interoperate
– Job submission
– Operation
Developments on interoperability (developed within the OGF framework)
– Short term: GIN (Grid Interoperability Now)
– Longer term: SAGA / JSDL, etc.

12 GRID Services for the LHC
Computing services
– Computing Element (CE)
– Worker nodes (WN): SL4
– Workload Management System
Storage, based on SRM
– dCache
– Castor
– StoRM
– DPM
File management
– Transfer: FTS
– Cataloguing: LFC
Database replication: 3D project
VOMS: Virtual Organization Management
Specific experiment services: VO Boxes
Will be used for priority management
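The cataloguing service listed above can be pictured as a mapping from a logical file name to the physical copies held at different storage sites. The sketch below is a toy model under that assumption only: the class, its methods and the `srm://...example` URLs are hypothetical and are not the LFC API.

```python
from collections import defaultdict


class ReplicaCatalogue:
    """Toy catalogue mapping a logical file name to its physical replicas."""

    def __init__(self):
        self._replicas = defaultdict(set)

    def register(self, logical_name, storage_url):
        """Record that a copy of logical_name exists at storage_url."""
        self._replicas[logical_name].add(storage_url)

    def replicas(self, logical_name):
        """Return all known replicas, sorted for reproducible output."""
        return sorted(self._replicas.get(logical_name, ()))

    def choose(self, logical_name, preferred_site):
        """Prefer a replica whose URL mentions preferred_site, else take any replica."""
        urls = self.replicas(logical_name)
        if not urls:
            raise KeyError(logical_name)
        for url in urls:
            if preferred_site in url:
                return url
        return urls[0]


# Hypothetical logical name and storage URLs
catalogue = ReplicaCatalogue()
catalogue.register("/grid/demo/run001.root", "srm://se.site-a.example/data/run001.root")
catalogue.register("/grid/demo/run001.root", "srm://se.site-b.example/data/run001.root")
print(catalogue.choose("/grid/demo/run001.root", preferred_site="site-b"))
```

A job running at one site can thus ask for a file by its logical name and be directed to a nearby copy without knowing where the data were originally written.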

13 The LHC Optical Private Network

14 LCG and emerging countries
The grid is a complex environment, but it is essential for providing the huge computing resources necessary for the LHC
– The learning curve is steep!
Complexity… but…
– It provides a framework in which all the data will be available to every collaborator everywhere
→ This is a unique opportunity for laboratories in emerging countries to participate fully in the physics analysis

15 Lightweight Grids

16 BOINC
[Diagram labels: Network, Main server]
BOINC provides a framework for a lightweight Grid targeting CPU-intensive applications running on small datasets

17 BOINC / Einstein@home
Gravitational wave detection
Data analysis from the giant interferometers LIGO and GEO – search for gravitational waves generated by pulsars
Fast Fourier transforms are computed on many chunks of the best data-taking periods
→ Search for gravitational wave signals in 30 000 directions spread over the sky
→ Huge combinatorial problem
Use of individual PCs
→ Big success: > 160 000 participants
→ Contribution to scientific outreach
http://einstein.phys.uwm.edu/
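The idea of computing Fourier transforms on many independent chunks, each of which could be handed to a different PC, can be sketched as follows. This is not the Einstein@home pipeline: the textbook radix-2 FFT, the chunk size and the synthetic sine signal are all assumptions chosen to keep the example self-contained.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def chunks(samples, size):
    """Split samples into consecutive chunks of `size`, dropping an incomplete tail."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def peak_bin(chunk):
    """Index of the strongest positive-frequency bin in one chunk (DC excluded)."""
    spectrum = fft(chunk)
    half = len(chunk) // 2
    return max(range(1, half), key=lambda k: abs(spectrum[k]))


# Synthetic signal: a pure sine completing 5 cycles per 64-sample chunk
chunk_size = 64
signal = [math.sin(2 * math.pi * 5 * i / chunk_size) for i in range(4 * chunk_size)]

# Each chunk is an independent "work unit" that could run on a separate PC
work_units = chunks(signal, chunk_size)
print([peak_bin(unit) for unit in work_units])
```

Because no chunk depends on another, the work units can be processed in any order and on any machine, which is what makes this kind of analysis suitable for volunteer PCs.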

18 BOINC
BOINC provides a framework for a lightweight Grid that can federate the use of distributed PCs
Standalone usage is possible in many domains – BOINC is already used by several teams working in biology
Certainly an avenue to explore for laboratories with limited computing resources

19 Java Job Submission (JJS)
Developed at CC-IN2P3 by Pascal Calvat
Java Job Submission is a very simple User Interface for submitting jobs to the Grid
– Works on Mac, Windows and Linux
– Direct submission to the Computing Element
– Very efficient, especially for short jobs
– Includes a learning system that dynamically builds a list of the "best" submission sites based on their response time
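One way to picture a learning system that ranks submission sites by response time is a smoothed average per site. The slide does not describe how JJS actually does this; the exponential smoothing, the `SiteRanker` class and the site names below are assumptions used purely as an illustration (and written in Python rather than Java).

```python
class SiteRanker:
    """Keeps a smoothed response time per site and ranks sites, fastest first."""

    def __init__(self, smoothing=0.5):
        # smoothing close to 1 reacts quickly to new measurements
        self.smoothing = smoothing
        self.average = {}

    def record(self, site, response_time_s):
        """Update the smoothed response time of a site with a new measurement."""
        previous = self.average.get(site)
        if previous is None:
            self.average[site] = response_time_s
        else:
            a = self.smoothing
            self.average[site] = a * response_time_s + (1 - a) * previous

    def best(self, n=3):
        """Return the n sites with the lowest smoothed response time."""
        return sorted(self.average, key=self.average.get)[:n]


# Hypothetical measurements (site name, response time in seconds)
ranker = SiteRanker()
for site, seconds in [("ce-a", 12.0), ("ce-b", 4.0), ("ce-c", 30.0), ("ce-a", 2.0)]:
    ranker.record(site, seconds)
print(ranker.best(2))
```

Here `ce-a` improves from 12 s to a smoothed 7 s after its second measurement but still ranks behind `ce-b` at 4 s, so the list adapts gradually instead of jumping on a single fast answer.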

20 SRB: an example of a data Grid
Developed at the San Diego Supercomputer Center

21 SRB: a Data Grid middleware (1)
Many scientific applications are based on data production and analysis
[Figure: example data, including a DNA sequence]

22 SRB: a Data Grid middleware (2)
User wants the complexity to be hidden
[Diagram labels: Put data, Get data, SRB, DB, SRB Metadata Catalog]
Inspired by: http://legacy-web.nbirn.net/Resources_rd/Educational/Tutorials/SRB/021202SRBTutorial/021202SRBIntroBIRN.ppt
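The principle of hiding complexity behind a metadata catalogue can be sketched with a toy model: the user puts and gets data by logical name, and the catalogue remembers where each file really lives. The `DataGrid` class, its placement rule and the storage names are hypothetical and are not the SRB client interface.

```python
class DataGrid:
    """Toy data grid: a metadata catalogue hides where each file is stored."""

    def __init__(self, storage_names):
        self.storages = {name: {} for name in storage_names}
        self.catalog = {}  # logical name -> (storage name, metadata)

    def put(self, logical_name, content, **metadata):
        """Store content on the least-loaded storage and record it in the catalogue."""
        target = min(self.storages, key=lambda name: len(self.storages[name]))
        self.storages[target][logical_name] = content
        self.catalog[logical_name] = (target, metadata)

    def get(self, logical_name):
        """Fetch content by logical name; the caller never names a storage."""
        storage, _ = self.catalog[logical_name]
        return self.storages[storage][logical_name]

    def find(self, **criteria):
        """Return the logical names whose metadata match every criterion."""
        return sorted(
            name for name, (_, meta) in self.catalog.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        )


# Hypothetical storage resources and files
grid = DataGrid(["disk-site-a", "tape-site-b"])
grid.put("/demo/scan-001", b"image bytes", subject="s01", kind="mri")
grid.put("/demo/scan-002", b"other bytes", subject="s02", kind="mri")
print(grid.get("/demo/scan-001"))
print(grid.find(kind="mri"))
```

The two files end up on different storage resources, yet `get` and `find` are called in exactly the same way for both.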

23 Biomedical applications using SRB
[Diagram: MRI (Siemens MAGNETOM Sonata Maestro Class, 1.5 T); acquisition control PC; DICOM push; export PC (DICOM server, SRB client)]

24 The BIRN Project
Biomedical Informatics Research Network
Brain imaging – study of brain diseases
http://www.nbirn.net/

25 SRB application in HEP
SuperNovae Factory project
Data acquisition in Hawaii, remotely controlled from France
Data are exported to CC-IN2P3 and made available to physicists through SRB
BaBar data distribution has been using SRB for several years
Hundreds of TB of data have been transferred and referenced

26 Grid5000: a research grid
Grid5000 is a project to build a 5000-node grid dedicated to research on grid technologies
9 French sites currently host 3166 Grid5000 nodes
Sites are interconnected by a 10 Gb/s backbone
A booking system makes it possible to reserve nodes to run experiments. A complete software stack, from the OS up to the applications, can be installed and deployed on all the nodes
A network connection was recently established between Grid5000 and the Japanese grid NAREGI
A close collaboration between Research Grids and Production Grids is essential
– Research Grids will develop the future software for the production grids
– Production Grids will provide the framework to test new developments

27 Networks and the Digital Divide (1)
ICFA Standing Committee on Interregional Connectivity
R. Les Cottrell and Shahryar Khan
http://www.slac.stanford.edu/xorg/icfa/icfa-net-paper-jan07/
PingER system running on 649 sites – 128 countries – 11 world regions

28 Networks and the Digital Divide (2)
Behind Europe:
– 6 years: Russia, Latin America
– 7 years: Mid-East, SE Asia
– 8-9 years: South Asia
– 11 years: Central Asia
– 12 years: Africa

29 The ORIENT / TEIN2 network
Internet connection difficulties are often related to the "last mile problem". The following are often a problem:
→ Institutes' local networks
→ Institute connection to the main country backbone
→ etc.
[Map labels: Hong Kong is also connected to GLORIAD; 45 Mb/s; 622 Mb/s, to be upgraded to 2x2.5 Gb/s]
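To give a feeling for what these link speeds mean in practice, the sketch below estimates how long a dataset would take to cross each link. The 1 TB dataset size is an assumption, and the calculation ignores protocol overhead and competing traffic, so real transfers would be slower.

```python
def transfer_time_hours(size_bytes, link_bits_per_s):
    """Ideal transfer time in hours for a link used at full nominal speed."""
    return size_bytes * 8 / link_bits_per_s / 3600


# Assumption (not from the slide): a 1 TB dataset
dataset_bytes = 1e12

# Link speeds quoted on the slide
links = {"45 Mb/s": 45e6, "622 Mb/s": 622e6, "2x2.5 Gb/s": 5e9}

for label, rate in links.items():
    hours = transfer_time_hours(dataset_bytes, rate)
    print(f"{label:>10}: {hours:7.1f} h")
```

Under these idealised assumptions the same dataset needs about two days on the slowest link and well under an hour on the upgraded one.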

30 Conclusions
Different kinds of grid systems have been presented
– They are adapted to different kinds of research
– They can be very light (BOINC) or much more complicated (LCG)
There are different ways to do Grid computing
– Can be very simple (a single User Interface)
– Can be more sophisticated (by deploying a complete Grid node)
But in any case the network quality is crucial!
– Emerging countries should focus on network development
The Grid is nothing by itself: only scientific applications matter!

