A Grid Computing Use case Datagrid Jean-Marc Pierson.

DataGrid: a European effort
- 9.8 million euros
- researchers involved
- CERN, ESA, CNRS, INFN…
- objective: share huge amounts of distributed data over the network infrastructure
- developed over Globus Toolkit
(most figures and material from

Applications Domain
- High Energy Physics (HEP), led by CERN, for LHC data
- Biology and Medical Image processing, led by CNRS (France)
- Earth Observations (EO), led by the European Space Agency

Data Grid middleware
Five work packages:
- Workload Scheduling and Management
- Data Management
- Grid Monitoring Services
- Mass Storage Management
- Local Fabric Management

Workload Scheduling and Management (1)
- the problems:
  - dynamic relocation of data
  - very large numbers of schedulable components in the system (computers and files)
  - large number of simultaneous users submitting work to the system
  - different access policies applied at different sites and in different countries

Workload Scheduling and Management (2)
- A need for:
  - planning job decomposition
  - and planning task distribution
- Planning based on knowledge of the availability and proximity of computational capacity and the required data.
- A need for cost estimation tools (delays, data migration, caching...)
- Extension of job description languages (JSL) to express data dependencies.
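The idea of planning from "availability and proximity of computational capacity and the required data" can be illustrated with a toy cost estimate. This is only a sketch under simple assumptions, not the DataGrid scheduler: the `Site` fields, the cost formula (queue delay plus time to stage missing files) and all numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical description of a grid site seen by the planner."""
    name: str
    queue_delay_s: float      # estimated wait before the job can start
    bandwidth_mb_s: float     # inbound bandwidth available for staging data
    local_files: set = field(default_factory=set)


def estimate_cost(site, required_files, file_sizes_mb):
    """Estimated seconds until the job can run: queue delay + data migration."""
    missing = [f for f in required_files if f not in site.local_files]
    transfer_s = sum(file_sizes_mb[f] for f in missing) / site.bandwidth_mb_s
    return site.queue_delay_s + transfer_s


def choose_site(sites, required_files, file_sizes_mb):
    """Pick the site with the lowest estimated cost."""
    return min(sites, key=lambda s: estimate_cost(s, required_files, file_sizes_mb))


sizes = {"run1.dat": 1000.0, "run2.dat": 500.0}
sites = [
    # Busy site that already holds the data.
    Site("site_a", queue_delay_s=600.0, bandwidth_mb_s=10.0,
         local_files={"run1.dat", "run2.dat"}),
    # Idle site that would have to fetch everything over a slow link.
    Site("site_b", queue_delay_s=60.0, bandwidth_mb_s=2.0),
]
best = choose_site(sites, ["run1.dat", "run2.dat"], sizes)
print(best.name)
```

With these made-up figures the busy site that already holds the data is preferred, because staging 1500 MB over the slow link costs more than waiting in the queue.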

Data management
- goals:
  - to permit secure access of massive amounts of data in a universal global name space
  - to move and replicate data at high speed from one geographical site to another
  - to manage the synchronisation of remote data copies
- tools:
  - dynamic automated wide-area data caching and distribution
  - generic interface to different mass storage
  - performance and reliability issues associated with the use of tertiary storage will be addressed
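A global name space with replicated copies can be pictured as a catalog that maps one logical file name to several physical locations. The sketch below is an assumption-laden illustration, not the DataGrid replica catalog: the class, its methods and the example names are all hypothetical.

```python
class ReplicaCatalog:
    """Toy catalog: logical file name -> {site: physical location}."""

    def __init__(self):
        self._replicas = {}

    def register(self, lfn, site, pfn):
        """Record that a copy of `lfn` is stored at `site` under `pfn`."""
        self._replicas.setdefault(lfn, {})[site] = pfn

    def locate(self, lfn, preferred_sites):
        """Return (site, pfn) for the first preferred site holding a copy."""
        replicas = self._replicas.get(lfn, {})
        for site in preferred_sites:
            if site in replicas:
                return site, replicas[site]
        raise KeyError(f"no replica of {lfn} at any preferred site")

    def sites(self, lfn):
        """All sites currently holding a copy, in sorted order."""
        return sorted(self._replicas.get(lfn, {}))


catalog = ReplicaCatalog()
catalog.register("lfn:hep/run42.dat", "cern", "/store/cern/run42.dat")
catalog.register("lfn:hep/run42.dat", "lyon", "/store/lyon/run42.dat")

# A job running in Lyon asks for the closest copy first.
site, pfn = catalog.locate("lfn:hep/run42.dat", ["lyon", "cern"])
print(site, pfn)
```

Users refer only to the logical name; which physical copy is read depends on the caller's site preference, which is where caching and replication policies would plug in.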

Monitoring the datagrid
- goals:
  - to enable transparent monitoring of the use of distributed resources at a large scale
  - to assess finely the interplay between computer fabrics, networking and mass storage
- tools:
  - local monitoring of other middleware
  - local monitoring of the applications themselves
  - developing short-term and long-term monitoring information (real time + archiving)
  - developing effective means of visual presentation of the multivariate data
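The "real time + archiving" split can be sketched as a store that keeps a short sliding window for live views and an append-only history for later analysis. This is a minimal illustration, assuming a hypothetical `MetricStore` class and made-up CPU load samples; it is not the DataGrid monitoring service.

```python
from collections import deque


class MetricStore:
    """Keeps the last `window` samples for live views plus a full archive."""

    def __init__(self, window=3):
        self.recent = deque(maxlen=window)  # short-term, real-time view
        self.archive = []                   # long-term history

    def record(self, timestamp, resource, value):
        sample = (timestamp, resource, value)
        self.recent.append(sample)   # oldest sample drops out automatically
        self.archive.append(sample)

    def recent_mean(self, resource):
        """Mean of the recent samples for one resource, or None if absent."""
        values = [v for _, r, v in self.recent if r == resource]
        return sum(values) / len(values) if values else None


store = MetricStore(window=3)
for t, load in enumerate([20, 40, 60, 80]):
    store.record(t, "cpu_percent", load)

print(store.recent_mean("cpu_percent"))  # mean over the last three samples
print(len(store.archive))                # every sample is still archived
```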

Local fabric management
- goals:
  - information publication concerning resource availability and performance
  - mapping of authentication and resource allocation mechanisms to the local environment
  - self healing: dynamic configuration changes and error recovery strategies
- difficulty to scale well: tens of thousands of components
- tools:
  - automatic fault detection and isolation, automatic reconfiguration of the fabric and re-running the tasks
  - automatic incorporation of new or updated components
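Fault detection, isolation and re-running of tasks can be sketched as a retry loop that sets aside any node reporting a failure and tries the next one. Everything here is hypothetical (the `NodeFailure` exception, `run_with_recovery`, the node names and the simulated executor); a real fabric manager would detect faults through monitoring rather than exceptions.

```python
class NodeFailure(Exception):
    """Raised by the (hypothetical) executor when a node is faulty."""


def run_with_recovery(task, nodes, execute):
    """Run `task` on the first healthy node; return (result, isolated_nodes)."""
    isolated = []
    for node in nodes:
        try:
            result = execute(task, node)
        except NodeFailure:
            isolated.append(node)  # take the faulty node out of service
            continue               # and re-run the task elsewhere
        return result, isolated
    raise RuntimeError(f"task {task!r} failed on every node")


def flaky_execute(task, node):
    """Simulated executor in which node1 is broken."""
    if node == "node1":
        raise NodeFailure(node)
    return f"{task} done on {node}"


result, isolated = run_with_recovery(
    "reco-job", ["node1", "node2", "node3"], flaky_execute
)
print(result)
print(isolated)
```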

Mass Storage Management
- goals:
  - to introduce standards for handling LHC data so that they can be exchanged
  - to spread the work to other application fields
- tools:
  - uniform interface to the very different systems used at different sites
  - provide interchange of data and metadata between sites
  - develop appropriate resource allocation and information publishing functions
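A uniform interface over very different storage systems is commonly expressed as an abstract base class with one implementation per backend. The sketch below assumes a hypothetical three-method interface (`put`, `get`, `metadata`) and two in-memory stand-ins for a disk pool and a tape system; it does not reproduce any actual DataGrid API.

```python
from abc import ABC, abstractmethod


class MassStorage(ABC):
    """Hypothetical uniform interface that every site backend implements."""

    @abstractmethod
    def put(self, name: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, name: str) -> bytes: ...

    @abstractmethod
    def metadata(self, name: str) -> dict: ...


class DiskStore(MassStorage):
    def __init__(self):
        self._files = {}

    def put(self, name, data):
        self._files[name] = data

    def get(self, name):
        return self._files[name]

    def metadata(self, name):
        return {"name": name, "size": len(self._files[name]), "backend": "disk"}


class TapeStore(MassStorage):
    """Simulated tertiary storage: files are staged to a cache on first read."""

    def __init__(self):
        self._tape = {}
        self._staged = {}

    def put(self, name, data):
        self._tape[name] = data

    def get(self, name):
        if name not in self._staged:
            self._staged[name] = self._tape[name]  # simulated staging step
        return self._staged[name]

    def metadata(self, name):
        return {"name": name, "size": len(self._tape[name]), "backend": "tape"}


def interchange(src: MassStorage, dst: MassStorage, name: str) -> dict:
    """Copy a file between sites without knowing either backend type."""
    dst.put(name, src.get(name))
    return dst.metadata(name)


tape, disk = TapeStore(), DiskStore()
tape.put("event-0001.raw", b"\x00" * 2048)
copied = interchange(tape, disk, "event-0001.raw")
print(copied)
```

Because `interchange` only depends on the abstract interface, adding another site-specific system means writing one more subclass rather than changing the transfer code.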

Conclusion
- Globus, and all its services, had to be extended!
- Datagrid: a first effort for handling huge amounts of data
- Collaborative work!
- Some key issues are not really treated:
  - data security is basic
  - cache management does not use data semantics
- Useful for raw data intensive computation and management, not for semantically strong data: the Medigrid project!