Presentation on theme: "EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE www.eu-egee.org Summary report:An EGEE Comparative study: Grids and Clouds – evolution or revolution?"— Presentation transcript:
EGEE-III INFSO-RI Enabling Grids for E-sciencE
Summary report: An EGEE Comparative study: Grids and Clouds – evolution or revolution?
Marc-Elian Bégin, Six² Sàrl, Switzerland
Session: Exploring Cloud Computing, OGF23, Barcelona, Spain, June 2, 2008

Content
Context of comparative study
Grid: EGEE/gLite
Cloud: Amazon Web Services
Comparison summary
Conclusions
Recommendations

Context of comparative study
This presentation is a summary of the report:
  – An EGEE Comparative study: Grids and Clouds – evolution or revolution?
  – https://edms.cern.ch/file/925013/3/EGEE-Grid-Cloud.pdf
Objective:
  – As cloud computing gains popularity and traction, grid computing needs to be positioned with respect to it
  – Compare real implementations and production offerings: the EGEE/gLite grid production service and Amazon Web Services, with focus on EC2 and S3
Outcome:
  – Identified convergence paths
  – Recommendations for managing convergence going forward

Acknowledgment
Many people provided comments, suggestions and feedback
Special thanks go to:
  – Bob Jones, CERN
  – James Casey, CERN
  – Charles Loomis, CNRS and Six² partner

EGEE – What does it deliver?
Infrastructure operation
  – Sites distributed across many countries
  – Large quantity of CPUs and storage
  – Continuous monitoring of grid services & automated site configuration/management
  – Support for multiple Virtual Organisations from diverse research disciplines
Middleware
  – Production middleware distributed under a business-friendly open source licence
  – Implements a service-oriented architecture that virtualises resources
  – Adheres to recommendations on web service interoperability and is evolving towards emerging standards
User Support: managed process from first contact through to production usage
  – Training
  – Expertise in grid-enabling applications
  – Online helpdesk
  – Networking events (User Forum, Conferences etc.)

Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences, …
>250 sites, 48 countries, >50,000 CPUs, >20 PetaBytes, >10,000 users, >150 VOs, >150,000 jobs/day

Users and resources distribution

European Grid Initiative
Need to prepare permanent, common Grid infrastructure
Ensure the long-term sustainability of the European e-Infrastructure independent of short project funding cycles
Coordinate the integration and interaction between National Grid Infrastructures (NGIs)
Operate the production Grid infrastructure on a European level for a wide range of user communities

Grid: EGEE/gLite
EGEE highlights:
Federated but separately administered resources (multiple sites, countries and continents)
Heterogeneous resources
Distributed, multiple research user communities grouped in Virtual Organisations (VO)
Mostly publicly funded at local, national and international levels
Range of data models, from massive, hard-to-replicate data sources to transient datasets composed of files of varied sizes

Grid: EGEE/gLite (2)
Provided services:
Basic services (focus of comparison with AWS)
  – Computing Element (CE)
  – Storage Element (SE)
Higher-level services
  – Workload Management System (WMS)
  – File & Metadata Catalog Services
  – File Transfer Service (FTS)
  – Virtual Organization Management Service (VOMS)
For more info:
  – Bob Jones, EGEE Project Director, CERN

Amazon Web Services
EC2 (Elastic Compute Cloud) is the computing service of Amazon
  – Based on hardware virtualisation (Xen)
  – Users request virtual machine instances, pointing to an image (public or private) stored in S3
  – Users have full control over each instance (e.g. access as root, if required)
  – Requests can be issued via SOAP and REST
S3 (Simple Storage Service) is a service for storing and accessing data on the Amazon cloud
  – From a user's point of view, S3 is independent from the other Amazon services
  – Data is organised in a hierarchical fashion, grouped in buckets (i.e. containers) and objects
  – Data is accessible via SOAP, REST and BitTorrent

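As a rough illustration of the bucket/object addressing and the REST and BitTorrent access paths listed for S3, the sketch below only builds URLs and makes no network calls. The bucket and key names are hypothetical. The virtual-hosted URL form and the `?torrent` suffix reflect how S3 exposed these features at the time of the talk, so treat this as a sketch under those assumptions rather than a client.

```python
from urllib.parse import quote

S3_ENDPOINT = "s3.amazonaws.com"  # public S3 endpoint host


def s3_object_url(bucket: str, key: str, torrent: bool = False) -> str:
    """Return the HTTPS URL of an S3 object (bucket = container, key = object name)."""
    # Keys may contain '/' to mimic folders, so '/' is left unescaped.
    url = f"https://{bucket}.{S3_ENDPOINT}/{quote(key, safe='/')}"
    # Appending '?torrent' asked S3 for a .torrent file instead of the data itself.
    return url + "?torrent" if torrent else url


# Hypothetical bucket and key, for illustration only.
print(s3_object_url("example-vo-data", "run 42/events.root"))
print(s3_object_url("example-vo-data", "run 42/events.root", torrent=True))
```

A plain HTTP GET on the first URL would fetch the object, subject to its access control settings; the second form is what a BitTorrent client would be pointed at.
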
Amazon Web Services (2)
Other AWS services:
  – SQS (Simple Queue Service)
  – SimpleDB
  – Billing services: DevPay
  – Elastic IP (Static IPs for Dynamic Cloud Computing)
  – Multiple Locations

Costs
Cost study for computing upgrade at CERN for LHC (by Ian Bird, Tony Cass, Bernd Panzer-Steindel and Les Robertson)
Cost summary for providing 40 MSI2000 of computing:
  – Custom data centre construction: 4.4 MCHF (~2.7 M€)
  – Using EC2: 92 MCHF (~56.9 M€)
The 4.4 MCHF figure doesn't include software licence and manpower costs
Comparison is made difficult by the choice of reference Amazon uses for its EC2 Compute Unit
  – e.g. one EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor
Our calculation for 40 MSI2000 on EC2: 57 MCHF (~35.3 M€)

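To make the size of the gap explicit, the short sketch below computes how many times more expensive each EC2 estimate is than the custom data centre, using only the MCHF figures quoted on this slide. The function and constant names are made up for illustration. The caveat stated on the slide still applies: the 4.4 MCHF figure excludes software licence and manpower costs.

```python
# Figures quoted on the slide, in millions of Swiss francs (MCHF).
CUSTOM_DATA_CENTRE_MCHF = 4.4  # excludes software licence and manpower costs
EC2_CERN_STUDY_MCHF = 92.0     # EC2 estimate from the CERN cost study
EC2_REPORT_MCHF = 57.0         # the report's own EC2 calculation


def cost_ratio(cloud_mchf: float, in_house_mchf: float) -> float:
    """Return how many times more expensive the cloud option is."""
    return cloud_mchf / in_house_mchf


for label, cost in [("CERN study", EC2_CERN_STUDY_MCHF),
                    ("report calculation", EC2_REPORT_MCHF)]:
    ratio = cost_ratio(cost, CUSTOM_DATA_CENTRE_MCHF)
    print(f"EC2 ({label}): {ratio:.1f}x the custom data centre")
```
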
Costs: EGEE workload in 2007
CPU: 114 million hours
Data: 25 PB stored, 11 PB transferred
Estimated cost if performed with Amazon's EC2 and S3: ~38 M (17/05/08 $)
http://calculator.s3.amazonaws.com/calc5.html?

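The structure of such an estimate can be sketched as compute plus storage plus transfer. The code below is a minimal model, not the calculation behind the figure on the slide: the unit prices are placeholders rather than Amazon's tariffs, and the assumption that the full data volume is stored for twelve months is mine. Only the workload numbers (114 million CPU hours, 25 PB, 11 PB) come from the slide.

```python
HOURS_PER_YEAR = 365 * 24  # 2007 was not a leap year
GB_PER_PB = 1_000_000      # decimal units assumed


def average_busy_cores(cpu_hours: float) -> float:
    """Average number of cores kept busy all year by the given CPU hours."""
    return cpu_hours / HOURS_PER_YEAR


def yearly_cost(cpu_hours: float, stored_gb: float, transferred_gb: float, *,
                per_cpu_hour: float, per_gb_month: float,
                per_gb_transfer: float) -> float:
    """Sum compute, storage and transfer costs for one year."""
    compute = cpu_hours * per_cpu_hour
    storage = stored_gb * per_gb_month * 12  # assumes the volume is stored all year
    transfer = transferred_gb * per_gb_transfer
    return compute + storage + transfer


print(f"Average busy cores in 2007: {average_busy_cores(114e6):,.0f}")

# PLACEHOLDER unit prices: substitute the provider's published rates.
total = yearly_cost(114e6, 25 * GB_PER_PB, 11 * GB_PER_PB,
                    per_cpu_hour=1.0, per_gb_month=1.0, per_gb_transfer=1.0)
print(f"Total with placeholder prices: {total:,.0f} currency units")
```

Because the prices are placeholders, the printed total is not comparable to the estimate on the slide; the point is only to show which terms drive the cost.
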
High-level deployment of LCG grid resources
Where could the cloud be? Since transferring data across the cloud border costs!

Can BitTorrent Help?
With BitTorrent, transfers between peers requesting the same files are not metered by the cloud
Where could the cloud be? Since transferring data across the cloud border costs!

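The potential saving can be sketched with an idealised model in which several grid sites request the same file. Without BitTorrent, each site pulls its own copy across the cloud border. With BitTorrent, the model assumes the cloud seeds a single full copy and the sites exchange the rest among themselves. Real swarms usually pull more than one copy from the seed, so this is a lower bound, and the function name is hypothetical.

```python
def metered_egress_gb(file_gb: float, n_sites: int, use_bittorrent: bool) -> float:
    """Data volume (GB) crossing the cloud border in an idealised model."""
    if n_sites <= 0:
        return 0.0
    # Idealised assumption: with BitTorrent the cloud serves one full copy
    # and the peers share the remaining pieces among themselves.
    copies_from_cloud = 1 if use_bittorrent else n_sites
    return file_gb * copies_from_cloud


# Example: a 100 GB dataset requested by 10 sites.
print(metered_egress_gb(100, 10, use_bittorrent=False))
print(metered_egress_gb(100, 10, use_bittorrent=True))
```
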
EC2, S3 bandwidth performance summary
The conclusion regarding EC2 -> EC2 transfers is that we're basically getting a full gigabit between the instances.
Performance
Test type     Transfer (MB/sec)   Remarks
EC2 -> EC2    75.0                Using curl on 1-2 GB files, without SSL
S3 -> EC2     49.8                Using 8 x curl on 1 GB files, with SSL
S3 -> EC2     51.5                Using 8 x curl on 1 GB files, without SSL
EC2 -> S3     53.8                Using 12 x curl on 1 GB files, with SSL

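To put the measured rates next to link speeds and dataset sizes, the sketch below converts MB/sec to Gbit/s and estimates how long one terabyte would take at each rate. It assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^6 MB) and a sustained rate, neither of which is stated on the slide. The helper names are made up.

```python
# Transfer rates from the table, in MB/sec.
MEASURED_MB_PER_S = {
    "EC2 -> EC2 (curl, no SSL)": 75.0,
    "S3 -> EC2 (8 x curl, SSL)": 49.8,
    "S3 -> EC2 (8 x curl, no SSL)": 51.5,
    "EC2 -> S3 (12 x curl, SSL)": 53.8,
}


def to_gbit_per_s(mb_per_s: float) -> float:
    """Convert megabytes per second to gigabits per second (decimal units)."""
    return mb_per_s * 8 / 1000


def hours_to_move(terabytes: float, mb_per_s: float) -> float:
    """Hours needed to move the given volume at a sustained rate."""
    return terabytes * 1_000_000 / mb_per_s / 3600


for test, rate in MEASURED_MB_PER_S.items():
    print(f"{test}: {to_gbit_per_s(rate):.2f} Gbit/s, "
          f"{hours_to_move(1, rate):.1f} h per TB")
```
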
Performance (2)
Like AWS, CERN has opted for a storage / compute farms separation
CERN can deliver a sustained 70 GB/s data throughput between the storage and compute farms
A large-scale performance analysis is not available for AWS

Scale
Is EC2 (Elastic Compute Cloud) really elastic?
The scale of EGEE is already established and well documented
The scale of AWS is unknown, although the latest experiments seem to indicate good scaling
Both systems now have SLAs in place, including penalties (partial refund) from Amazon when not honoured
Elastic IP and Multiple Locations provide building blocks for users to deploy resilient services, while EGEE is already massively distributed (>250 sites)

AWS Cloud interfaces
No middleware!!
Resource-side grid middleware?

Ease of Use
Key to the success of AWS is the choice of technologies
  – HTTP(S)/REST and support for ROA (Resource Oriented Architecture)
  – Hardware virtualisation (Xen based)
  – X.509 certificates
This backs up the claim from Amazon that AWS requires no middleware (for the user!)
However, the level of service provided by AWS is lower than that of EGEE
For EGEE/gLite, several MB of client software are required to use the grid

Service Mapping
Ease of use comes at a cost: the cost of simplicity
The basic constructs that the EC2 and S3 services offer do not currently meet all the requirements of grid users and do not replace the high-level services provided by gLite, e.g.:
  – File Transfer Service (FTS)
  – Workload Management System (WMS)
  – Grid catalogues such as ARDA Metadata Catalogue (AMGA), LCG File Catalog (LFC) or GANGA
Are all users using the grid the same way?
Should we revisit the way the grid is used and accessed?
Who should be responsible for providing different levels of functionality?

Collaboration and Virtual Organisations
Grids are used by large and/or distributed communities of collaborators
Virtual Organisations support this concept, with services such as VOMS
AWS provides only primitive ACLs; can we bridge the gap?
Scientific collaborations include the need for resources to be contributed and connected to the grid. Can the cloud be augmented by custom data centres?

Application Software Deployment
Grid application software is often required to be installed at data centres for jobs to execute successfully
Several operating systems and platforms are required to host grid jobs
Hardware virtualisation could alleviate these burdens
  – Grid application software can be baked into a virtual image
  – Data centres do not have to provide a specific operating system, since it is defined at the level of the VM
Hardware virtualisation provides a high level of control to the user (e.g. root) and high control and security for hosts

Interoperability
Assuming that several cloud computing providers come to be…
Which interface matters?
BOTH!!!

Standards
Since simple is beautiful, if the interfaces proposed by cloud services like AWS become popular with grid users, they might change the standardisation landscape
HTTP, REST, Xen and BitTorrent are already largely standardised
What is left at that level?
  – REST access to storage
  – Virtual Image formats
  – Instantiation API (perhaps based on REST)
  – Metering interfaces (including monitoring)
A reference open source implementation is missing
What about higher-level services? Which ones?

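One way to picture the items still to be standardised is as a small set of provider-neutral interfaces. The sketch below is purely hypothetical: the class and method names are invented for illustration and do not correspond to any existing standard, AWS API or gLite component. Virtual image formats are left out because they are a file format question rather than a call interface.

```python
from typing import Dict, Protocol, Tuple


class ObjectStore(Protocol):
    """Storage access, as it might sit behind a REST endpoint."""
    def put(self, bucket: str, key: str, data: bytes) -> None: ...
    def get(self, bucket: str, key: str) -> bytes: ...


class InstanceLauncher(Protocol):
    """Instantiation API: start and stop a VM from an image identifier."""
    def run_instance(self, image_id: str) -> str: ...
    def terminate(self, instance_id: str) -> None: ...


class Meter(Protocol):
    """Metering and monitoring: resource usage per instance."""
    def usage(self, instance_id: str) -> Dict[str, float]: ...


class InMemoryStore:
    """Toy ObjectStore, only to show that the interface can be implemented."""

    def __init__(self) -> None:
        self._objects: Dict[Tuple[str, str], bytes] = {}

    def put(self, bucket: str, key: str, data: bytes) -> None:
        self._objects[(bucket, key)] = data

    def get(self, bucket: str, key: str) -> bytes:
        return self._objects[(bucket, key)]


store: ObjectStore = InMemoryStore()
store.put("demo-bucket", "hello.txt", b"hello grid")
print(store.get("demo-bucket", "hello.txt"))
```

Code written against such interfaces would not depend on a particular provider, which is the role a reference open source implementation could play.
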
Conclusions
Cloud computing is gaining traction, especially with the Amazon Web Services (AWS) commercial offering
Grid (e.g. EGEE) has a larger scope; however, technological choices and simple interfaces like those of AWS are relevant to the grid world
The question "what is the usage pattern that will emerge in the coming years?" remains unanswered and will have to be carefully tracked
None of the resources contributed to the EGEE grid come from commercial offerings, such as Amazon. Will this change?
Technologies such as REST, HTTP, hardware virtualisation and BitTorrent could displace existing means of access to grid resources

Conclusion (2)
EGEE has an opportunity to lead the next generation e-Infrastructure by integrating new advancements such as cloud computing
Hardware virtualisation could lower the operations cost of large infrastructures
It is important that new development is not a distraction from ensuring current production grid continuity
A roadmap should be defined to include cloud technology in current e-Infrastructures in an incremental and harmonious fashion

Recommendations
1. Promote/support the development of an open source cloud middleware distribution, based on interfaces similar to current commercial offerings
2. Promote the standardisation of the cloud, with the above-mentioned implementation as a potential reference
3. Identify a convergence path between cloud services such as EC2 and S3 and the current EGEE security model based on VOMS
4. Virtualise all key grid services (e.g. information system, metadata catalogues, security service) with the goal of being able to deploy these on EC2-like resources
5. Promote/lobby the need for experiments (i.e. LHC/HEP, Life science) and other grid users to virtualise their applications, with the goal of being able to deploy them on EC2-like resources
6. As a follow-on to point 5, promote/lobby the need for all service dependencies that grid user applications have to also be virtualised
7. Launch/support a feasibility study to verify that monitoring of cloud jobs can be performed at the hypervisor level, such that monitoring is independent from the virtualised applications
8. Upgrade current metadata catalogues to support HTTP(S) endpoints and S3-like metadata
9. Explore feasibility of running BitTorrent on grid sites