
1 CERN, GRID and E-Science
Nils Høimyr, CERN IT
Contents:
– Introduction
– Computing intensive science
– Particle physics and the LHC
– The LHC data challenge
– LCG – the LHC Computing Grid
– EGEE and grid infrastructure
– Grid – Business and Clouds
Includes presentation material from Frédéric Hemmer, Bob Jones, Avner Algom (IGT) and other EGEE project members.

2 IT at CERN – more than the Grid
Physics computing – Grids (this talk!)
Administrative information systems
– Financial and administrative management systems, e-business...
Desktop and office computing
– Windows, Linux and Web infrastructure for day-to-day use
Engineering applications and databases
– CAD/CAM/CAE (Autocad, Euclid, Catia, Cadence, Ansys etc.)
– A number of technical information systems based on Oracle, MySQL
Controls systems
– Process control of accelerators, experiments and infrastructure
Networks and telecom
– European IP hub, security, voice over IP...
More information: http://cern.ch/it

3 Computing intensive science
Science is becoming increasingly digital and needs to deal with increasing amounts of data.
Simulations get ever more detailed:
– Nanotechnology – design of new materials from the molecular scale
– Modelling and predicting complex systems (weather forecasting, river floods, earthquakes)
– Decoding the human genome
Experimental science uses ever bigger sensors to make precise measurements, which means computing a lot of statistics, handling huge amounts of data, and serving a user community around the world.

4 Particle Physics (I)
CERN: the world's largest particle physics laboratory.
Particle physics requires special tools to create and study new particles: accelerators and detectors.
(Aerial photo labels: Mont Blanc (4810 m), downtown Geneva.)
Large Hadron Collider (LHC):
– most powerful instrument ever built to investigate elementary particles
– four experiments: ALICE, ATLAS, CMS, LHCb
– 27 km circumference tunnel
– first beam 10th September 2008

5 Particle physics (II)
Physicists smash particles into each other to:
– identify their components
– create new particles
– reveal the nature of the interactions between them
– create an environment similar to the one present at the origin of our Universe
A particle collision = an event: we need to count, trace and characterize all the particles produced and fully reconstruct the process.
Among all tracks, the presence of "special shapes" is the sign of an interesting interaction.

6 The LHC Data Challenge
Starting from this event, looking for this "signature".
Selectivity: 1 in 10^13 (like looking for a needle in 20 million haystacks).

7 The LHC Computing Grid – November 2007
View of the ATLAS detector (under construction): 150 million sensors deliver data 40 million times per second.

8 LHC Data
40 million collisions per second.
After filtering, 100 collisions of interest per second.
A Megabyte of data for each collision = recording rate of 0.1 Gigabytes/sec.
10^10 collisions recorded each year ≈ 10 Petabytes/year of data.
LHC data correspond to about 20 million CDs each year.
(Illustration: CD stack with 1 year of LHC data ≈ 20 km tall, compared with Concorde at 15 km, a balloon at 30 km, and Mont Blanc at 4.8 km.)
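The recording rate and yearly volume above follow directly from the quoted collision rate and event size; a quick back-of-the-envelope check (Python, purely illustrative):

```python
# Back-of-the-envelope check of the LHC data figures quoted on this slide.
selected_per_second = 100          # collisions of interest per second, after filtering
event_size_bytes = 1e6             # ~1 Megabyte per collision

rate = selected_per_second * event_size_bytes
print(f"Recording rate: {rate / 1e9:.1f} GB/s")        # ~0.1 GB/s

collisions_per_year = 1e10         # collisions recorded each year
volume = collisions_per_year * event_size_bytes
print(f"Yearly volume: {volume / 1e15:.0f} PB/year")   # ~10 PB/year
```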

9 LHC Processing
Simulation: compute what the detector should have seen.
Reconstruction: transform signals from the detector into physical properties (energies, charge of particles, ...).
Analysis: use complex algorithms to extract the physics.
LHC data analysis requires a computing power equivalent to ~100,000 of today's fastest PC processors!

10 LHC Computing Resources
(Chart of required CPU, disk and tape capacity.)

11 Solution: the Grid
Use the Grid to unite the computing resources of particle physics institutes around the world.
The World Wide Web provides seamless access to information stored in many millions of different geographical locations.
The Grid is an infrastructure that provides seamless access to computing power and data storage capacity distributed over the globe.

12 How does the Grid work?
It makes multiple computer centres look like a single system to the end-user.
Advanced software, called middleware, automatically finds the data the scientist needs and the computing power to analyse it.
Middleware balances the load on the different resources. It also handles security, accounting, monitoring and much more.
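The matchmaking role of the middleware can be illustrated with a toy sketch. This is not gLite code; the site names, numbers and the "least-loaded site that holds the data" policy are invented purely to show the idea:

```python
# Toy illustration of grid matchmaking: pick a site that holds the needed
# dataset and currently has the lightest load. Real middleware also handles
# security, accounting and monitoring, which this sketch ignores.

sites = {   # hypothetical computing centres
    "CERN":      {"datasets": {"run123"}, "running_jobs": 950, "slots": 1000},
    "Lyon":      {"datasets": {"run123"}, "running_jobs": 400, "slots": 800},
    "Karlsruhe": {"datasets": {"run456"}, "running_jobs": 100, "slots": 600},
}

def match(dataset_needed):
    """Return the least-loaded site that hosts the requested dataset."""
    candidates = [
        (info["running_jobs"] / info["slots"], name)
        for name, info in sites.items()
        if dataset_needed in info["datasets"]
    ]
    if not candidates:
        raise LookupError(f"no site holds {dataset_needed}")
    load, name = min(candidates)
    return name

print(match("run123"))   # -> "Lyon" (50% loaded, vs CERN at 95%)
```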

13 LHC Computing Grid project (LCG)
More than 140 computing centres.
12 large centres for primary data management: CERN (Tier-0) and eleven Tier-1s.
38 federations of smaller Tier-2 centres.
35 countries involved.
(Map label: NDGF – Nordic countries.)

14 LCG Service Hierarchy
Tier-0 – the accelerator centre:
– data acquisition & initial processing
– long-term data curation
– distribution of data to the Tier-1 centres
Tier-1 – "online" to the data acquisition process (high availability):
– managed mass storage – grid-enabled data service
– data-heavy analysis
– national and regional support
Tier-1 centres: Canada – TRIUMF (Vancouver); France – IN2P3 (Lyon); Germany – Forschungszentrum Karlsruhe; Italy – CNAF (Bologna); Netherlands – NIKHEF/SARA (Amsterdam); Nordic countries – distributed Tier-1; Spain – PIC (Barcelona); Taiwan – Academia Sinica (Taipei); UK – CLRC (Oxford); US – FermiLab (Illinois) and Brookhaven (NY)
Tier-2 – ~140 centres in ~35 countries:
– simulation
– end-user analysis, batch and interactive
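The division of labour between the tiers can be restated as a simple lookup table; the sketch below is illustrative only and is not any real configuration format:

```python
# The LCG tier roles from this slide, restated as a small data structure.
tier_roles = {
    "Tier-0 (CERN, the accelerator centre)": [
        "data acquisition and initial processing",
        "long-term data curation",
        "distribution of data to the Tier-1 centres",
    ],
    "Tier-1 (eleven large centres)": [
        "'online' to the data acquisition process (high availability)",
        "managed mass storage, grid-enabled data service",
        "data-heavy analysis",
        "national and regional support",
    ],
    "Tier-2 (~140 centres in ~35 countries)": [
        "simulation",
        "end-user analysis, batch and interactive",
    ],
}

for tier, roles in tier_roles.items():
    print(tier)
    for role in roles:
        print(f"  - {role}")
```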

15 WLCG Collaboration
The Collaboration:
– 4 LHC experiments
– ~140 computing centres
– 12 large centres (Tier-0, Tier-1)
– 38 federations of smaller "Tier-2" centres
– ~35 countries
Memorandum of Understanding: agreed in October 2005, now being signed.
Resources:
– focuses on the needs of the four LHC experiments
– commits resources each October for the coming year, with a 5-year forward look
– agrees on standards and procedures
Relies on EGEE and OSG (and other regional efforts).

16 WLCG Grid Activity
Data distribution from CERN to Tier-1 sites:
– all experiments exceeded required rates for extended periods, and simultaneously
– 1.3 GB/s target; well above 2 GB/s achievable
– all Tier-1s achieved (or exceeded) their target acceptance rates
– the latest tests in May show that the data rates required for LHC start-up have been reached and can be sustained over long periods
Job workload:
– WLCG ran ~44 million jobs in 2007, and the workload has continued to increase
– average over May: 340k jobs/day (10.5 M total)
– ATLAS average >200k jobs/day; CMS average >100k jobs/day, with peaks up to 200k
– this is the level needed for 2008/9
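The quoted job rates are easy to sanity-check with simple arithmetic:

```python
# Sanity check of the job rates quoted on this slide.
may_total = 10.5e6                        # jobs run over May
print(f"{may_total / 31:,.0f} jobs/day")  # ~339,000, i.e. the ~340k/day quoted

jobs_2007 = 44e6                          # jobs run over 2007
print(f"{jobs_2007 / 365:,.0f} jobs/day average over 2007")  # ~120,000
```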

17 The EGEE project

18 The EGEE project
EGEE:
– started in April 2004, with 91 partners in 32 countries
– 3rd phase (2008-2010) launched in spring
Objectives:
– large-scale, production-quality grid infrastructure for e-Science
– attracting new resources and users from industry as well as science
– maintain and further improve the "gLite" Grid middleware
Norwegian partner: University of Bergen (UiB)

19 EGEE Achievements – Infrastructure
EGEE production Grid infrastructure:
– steady growth over the lifetime of the project
– improved reliability

20 EGEE Achievements – Applications
>270 VOs from several scientific domains:
– Astronomy & Astrophysics
– Civil Protection
– Computational Chemistry
– Computational Fluid Dynamics
– Computer Science/Tools
– Condensed Matter Physics
– Earth Sciences
– Fusion
– High Energy Physics
– Life Sciences
Further applications are under evaluation.
Applications have moved from testing to routine daily usage, with ~80-95% efficiency.

21 Sustainability
Need to prepare for a permanent Grid infrastructure:
– ensure a high quality of service for all user communities
– independent of short project funding cycles
– infrastructure managed in collaboration with National Grid Initiatives (NGIs)
– European Grid Initiative (EGI)

22 Conclusions
Grid deployment provides a powerful tool for science, as well as for applications from other fields.
Applications are already benefiting from Grid technologies.
Open Source is the right approach for publicly funded projects and necessary for fast and wide adoption – example: the World Wide Web!
Access to Grid-based supercomputing facilities gives increasing value to research institutions.
Grids are above all about collaboration at a large scale.
Cloud Computing and the convergence of Web, Grid, SOA and Virtualisation (extra slides, courtesy of A. Algom, IGT/EGEE).

23 More information
Nils.Hoimyr@cern.ch
www.cern.ch/lcg
www.eu-egee.org
www.eu-egi.org/
www.nordugrid.org
www.cern.ch/openlab

24 Cloud is the business implementation of the Grid
Philosophy: "The Network is the Computer" – from the power grid to the compute grid.
Provides on-demand, transparent services; pay per use.

25 Buzzwords and popularity (22 Sep. 2008)

26 Grid and Clouds
Why we need it (the problem):
– Classic Grid computing: to enable the R&D community to achieve its research goals in reasonable time; computation over large data sets, or of parallelized compute-intensive applications.
– Cloud computing: reduce IT costs; on-demand scalability for all applications, including research, development and business applications.
Main target market:
– Classic Grid computing: first academia, second certain industries.
– Cloud computing: mainly industry.
Business model (where the money comes from):
– Classic Grid computing: academia, sponsor-based (mainly government money); industry pays for internal implementations.
– Cloud computing: hosted by commercial companies, paid for by users; based on economies of scale and expertise; only pay for what you need, when you need it (on-demand + pay per use).

27 Cloud Market
"By 2012, 80 percent of Fortune 1000 companies will pay for some cloud computing service, and 30 percent of them will pay for cloud computing infrastructure." – Gartner, 2008

28 EC2 – "Google of the Clouds"
According to Vogels, 370,000 developers have registered for Amazon Web Services since its debut in 2002, and the company now consumes more bandwidth serving developers than it does on e-commerce.
http://www.theregister.co.uk/2008/06/26/amazon_trumpets_web_services/
In the last two months of 2007, usage of Amazon Web Services grew by 40%.
$131 million revenues in Q1 from AWS; 60,000 customers.
The majority of usage comes from banks, pharmaceuticals and other large corporations.

29 Where do you want to Cloud today?
(Diagram: a cloud search engine comparing cost/performance and SLAs across external clouds A, B and C, each offering utility computing and SaaS. Coming up: the Cloud Market.)

32 Why Now?
It's all about the economics of scalability, enabled by technology maturity.

33 Why Now? (Concepts) – Technology Maturity
(Diagram keywords: Virtualization, Grid, SOA, Network Performance; Infrastructure Virtualization, Service-Oriented Infrastructure, Service-Based Logical Computer, Network Resources, Scalability-On-Demand, SLA.)

34 The Virtualization Spectrum

35 Why Now? (Economy)
– CIOs must do more with less (energy costs / the recession will boost this)
– lower cost for scalability
– enterprise IT budgets: 80% spent on maintenance
– on average, we utilize only 15% of our computing resources' capacity
– peak-times economy
– enterprise IT is not the enterprise's core business
– psychology of Internet/Cloud trust (SalesForce, Gmail, Internet banking, etc.)
– ideal for developers & SMBs
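The ~15% utilisation figure is the heart of the pay-per-use argument. The toy comparison below uses entirely made-up prices (not real provider rates) to show why renting capacity only when it is used can beat owning idle capacity:

```python
# Toy cost comparison motivated by the ~15% average utilisation quoted above.
# All prices are invented for illustration; they are not real provider rates.

owned_servers = 100
cost_per_owned_server_hour = 0.50   # assumed all-in cost (hardware, power, staff)
utilisation = 0.15                  # fraction of owned capacity actually used
hours_per_month = 730

useful_server_hours = owned_servers * hours_per_month * utilisation

owned_cost = owned_servers * hours_per_month * cost_per_owned_server_hour
cloud_price_per_hour = 0.80         # assumed pay-per-use rate, higher per hour
cloud_cost = useful_server_hours * cloud_price_per_hour

print(f"Owned capacity (paid for 100% of the time): ${owned_cost:,.0f}/month")
print(f"Pay-per-use for the same useful work:       ${cloud_cost:,.0f}/month")
```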

36 Why Now? (Benefits)
– cost savings, leveraging economies of scale
– pay only for what you use
– resource flexibility
– rapid prototyping and market testing
– increased speed to market
– improved service levels and availability
– self-service deployment
– reduced lock-in and switching costs

37 Challenges
– security and trust
– customer SLAs – comparing cost/performance
– dynamic VM migration – unique universal IP
– cloud interoperability
– data protection & recovery
– standards: security, SLAs, VMs
– management tools
– integration with internal infrastructure
– small, compact, economical applications
– cost/performance prediction and measurement
– keep it transparent and simple (or KISS...)

38 Summary: Grid vs Cloud
Grid computing: running complex CPU-intensive applications across multiple computer centres.
– Users access the grid via dedicated Grid applications or portals for their domain.
– Grid applications and portals are built on top of Grid middleware.
Cloud computing: data services sourced from external providers, somewhere on the Internet "cloud".
– HTTPS and REST protocols.
– Dedicated Internet portals run by commercial providers.
– "Software as a service" offered by commercial providers.
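A minimal sketch of the "data services over HTTPS and REST" point; the endpoint, bucket and token below are hypothetical placeholders, not any real provider's API:

```python
# Illustrative only: fetch an object from a hypothetical cloud storage service
# over HTTPS using a REST-style URL. Endpoint, bucket and token are made up.
import urllib.request

url = "https://storage.example.com/v1/my-bucket/results/run123.dat"
request = urllib.request.Request(
    url,
    headers={"Authorization": "Bearer <access-token>"},  # placeholder credential
)
with urllib.request.urlopen(request) as response:
    data = response.read()
print(f"Downloaded {len(data)} bytes")
```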

