Computing models, facilities, distributed computing

Computing models, facilities, distributed computing
Overview
12th September 2017
Ian Bird

Challenge for HL-LHC computing
- Manage multi-exabyte-per-year data flows and the CPU needs of ~20M cores.
- Improve performance from today to fit within a constrained budget envelope (needs are ~x10 above what technology evolution will bring),
- while optimizing the physics output.
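To make the "~x10" concrete, here is a back-of-envelope sketch (a toy Python calculation; the annual growth rates are illustrative assumptions, not figures from the paper) of how much capacity flat-budget technology evolution might deliver over the roughly ten years to HL-LHC:

```python
# Back-of-envelope: capacity gained from flat-budget technology evolution
# over the ~10 years to HL-LHC. The growth rates are illustrative only.

years = 10
for annual_gain in (0.10, 0.15, 0.20):
    factor = (1 + annual_gain) ** years
    print(f"{annual_gain:.0%}/year for {years} years -> x{factor:.1f}")

# Prints x2.6, x4.0, x6.2. If needs sit ~x10 above what this evolution
# delivers, the rest must come from software and infrastructure changes,
# not from hardware alone.
```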

Main themes
- Allow/help countries or regions to flexibly manage compute and storage resources internally:
  - supporting national/regional consolidation, provisioning resources in a way that makes sense in the local situation;
  - use of federation of resources; integration of public, private, commercial, HPC, etc. resources as necessary.
- Foresee some Tier1/Tier2 boundaries blurring: regions with common funding can federate their facilities in order to optimize and consolidate the resources they provide, in a way that is flexible and not held to a history that is decades old at this point.
- Investigate the "data lakes" concept: keep bulk data (down to derived AODs) in a cloud-like realm (the data lake), plug in processing via traffic-managed networks, with bulk processing close to the data, and:
  - reduce the amount of data replication and distribution;
  - evolve from data placement to data serving (see the sketch after this list);
  - use data lakes as cloud analysis facilities to enable new analysis models (big-data tools, ML, web-based remote analysis with a scalable resource back-end, etc.).
- Review data distribution and delivery technologies, including event streaming, event serving, "FTS", protocols, etc.
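The placement-vs-serving distinction can be sketched in a few lines of plain Python. Everything here is hypothetical (the endpoint URL, file name, and chunk sizes are made up, and this is not an actual WLCG or FTS interface); it only illustrates the two access patterns:

```python
# Sketch: data placement vs. data serving (hypothetical names throughout).
# Placement copies a full replica before processing; serving streams just
# the bytes a job actually needs from a single data-lake endpoint.

import shutil
import urllib.request

LAKE_URL = "https://datalake.example.org/aod/dataset-001.root"  # hypothetical

def analyze(chunk: bytes):
    """Placeholder for the real event-processing code."""
    pass

def process_with_placement(local_path="replica.root"):
    """Placement: replicate the whole file locally, then process it."""
    with urllib.request.urlopen(LAKE_URL) as remote, open(local_path, "wb") as out:
        shutil.copyfileobj(remote, out)  # full copy: storage at every site
    with open(local_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            analyze(chunk)

def process_with_serving(offset=0, length=1 << 20):
    """Serving: stream only the byte range this job needs, no local replica."""
    req = urllib.request.Request(
        LAKE_URL, headers={"Range": f"bytes={offset}-{offset + length - 1}"}
    )
    with urllib.request.urlopen(req) as remote:  # traffic-managed network path
        analyze(remote.read())
```

The serving pattern is what lets the lake reduce replication: sites no longer each hold a copy, at the price of depending on the network path to the lake.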

Possible model for future HEP computing infrastructure
[Diagram: cloud users running analysis, and simulation resources, both connected to a central "HEP data lake" / "HEP data cloud" providing storage and compute.]

Main themes – 2
- Enable adaptation to use very heterogeneous resources: HPC, specialized clusters, opportunistic resources, clouds (commercial or not) -> managing cost and quotas; elasticity vs fixed capacity.
- Need ongoing and continual review and assessment of performance and bottlenecks, to understand where to direct the next investment.
- Software/library adaptation and validation for a wide variety of processor types:
  - many/multi-core, multi-threading, vector units, GPUs, all common CPU types;
  - need the capability to rapidly port to and validate on new architectures, even new processor generations (new instruction sets); a minimal validation sketch follows below.
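As a minimal illustration of what validating on a new architecture can involve, the sketch below (plain Python with NumPy; the kernel and tolerances are illustrative choices, not from the paper) compares a vectorized implementation of a simple kernel against a scalar reference and accepts the port only if the results agree:

```python
# Sketch: validating a vectorized kernel against a scalar reference, the
# kind of check needed when porting to a new architecture or instruction
# set. The kernel and tolerances are illustrative only.

import math
import numpy as np

def invariant_mass_scalar(pt1, eta1, phi1, pt2, eta2, phi2):
    """Scalar reference (massless two-body invariant mass)."""
    return math.sqrt(
        2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    )

def invariant_mass_vector(pt1, eta1, phi1, pt2, eta2, phi2):
    """Vectorized version: same formula applied to whole arrays at once."""
    return np.sqrt(2.0 * pt1 * pt2 * (np.cosh(eta1 - eta2) - np.cos(phi1 - phi2)))

rng = np.random.default_rng(42)
n = 1_000
pt1, pt2 = rng.uniform(20, 200, n), rng.uniform(20, 200, n)
eta1, eta2 = rng.uniform(-2.5, 2.5, n), rng.uniform(-2.5, 2.5, n)
phi1, phi2 = rng.uniform(-np.pi, np.pi, n), rng.uniform(-np.pi, np.pi, n)

reference = np.array([
    invariant_mass_scalar(*args) for args in zip(pt1, eta1, phi1, pt2, eta2, phi2)
])
vectorized = invariant_mass_vector(pt1, eta1, phi1, pt2, eta2, phi2)

# Require agreement within tolerance before accepting the ported kernel.
assert np.allclose(reference, vectorized, rtol=1e-12, atol=1e-9), "port failed"
print("vectorized kernel validated against scalar reference")
```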

Main themes – 3
- Assess the utility of implementing commonality at various layers of the architecture:
  - commonality across well-understood functionalities;
  - interest from experiments in working together on common data management, resource provisioning (partly integrated in facilities), workload and workflow management, frameworks, etc.;
  - room for innovation and change within common frameworks.

Evolution vs change
- The running system of today has to evolve into the system for HL-LHC.
- This does not mean there cannot be major change in components: new facilities alongside old ones, new services phased in and old ones phased out.
- But the current system will never be stopped in order to build a new one from scratch.
- The paradigm is one of managed change and evolution.

Cost
- There is no single "cost model": conditions are very different across countries, sites, funding agencies, etc., and change with time.
- What is needed is a way to continually optimize the system:
  - to understand current performance bottlenecks;
  - a process of continual measurement, review, optimization and change, which requires:
    - a useful set of understandable metrics (which may evolve);
    - a good understanding of the performance;
  - to guide where the next investment of effort and resources will be of greatest benefit: how to balance between CPU, storage, network, etc., and FTEs! (A toy metric sketch follows below.)
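As a toy illustration of metrics that could guide the next investment (all unit costs and utilizations below are hypothetical placeholders, not figures from the paper), one can compare the effective cost per usefully consumed unit of each resource:

```python
# Toy sketch: where does the next unit of budget help most?
# All unit costs and utilizations are hypothetical placeholders.

resources = {
    # name: (unit_cost_per_year, delivered_units, utilization)
    "cpu_core_years": (150.0, 500_000, 0.85),
    "disk_tb_years":  (20.0, 400_000, 0.95),
    "tape_tb_years":  (5.0, 600_000, 0.70),
}

for name, (unit_cost, delivered, utilization) in resources.items():
    # Effective cost per *usefully consumed* unit: idle capacity is wasted money.
    cost_per_useful_unit = unit_cost / utilization
    total = unit_cost * delivered
    print(f"{name:16s} total={total / 1e6:6.1f}M  "
          f"cost/useful-unit={cost_per_useful_unit:7.2f}")

# A resource with low utilization has a high cost per useful unit:
# improving its utilization may beat simply buying more of it.
```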

R&D
- A program of R&D and prototyping is being drafted:
  - covering the main topics called out in the paper;
  - providing testbeds (building on things like Techlab, openlab);
  - a program of in-depth performance understanding and metrics, to optimize the system across CPU, storage and networks;
  - end-to-end performance and overall optimisation of the system;
  - recreating vs keeping datasets, which ties together data management, workflows and data analysis (toy trade-off sketch below);
  - etc.
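In its simplest form, the recreate-vs-keep question compares the cost of storing a dataset for a year against the compute cost of regenerating it each time it is needed. A toy sketch, with entirely hypothetical numbers:

```python
# Toy sketch of the keep-vs-recreate trade-off. Every number here is a
# hypothetical placeholder; the real decision also involves turnaround
# time, network load, and reproducibility guarantees.

def keep_cost(size_tb, disk_cost_per_tb_year):
    """Cost of keeping the dataset on disk for one year."""
    return size_tb * disk_cost_per_tb_year

def recreate_cost(cpu_hours, cpu_cost_per_hour, accesses_per_year):
    """Cost of regenerating the dataset on demand, once per access."""
    return cpu_hours * cpu_cost_per_hour * accesses_per_year

size_tb = 500            # derived dataset size (hypothetical)
cpu_hours = 2_000_000    # CPU hours to regenerate it (hypothetical)
disk_cost = 20.0         # per TB-year (hypothetical)
cpu_cost = 0.01          # per CPU-hour (hypothetical)

for accesses in (0.2, 1, 5):
    keep = keep_cost(size_tb, disk_cost)
    recreate = recreate_cost(cpu_hours, cpu_cost, accesses)
    verdict = "keep" if keep < recreate else "recreate"
    print(f"{accesses:4} accesses/year: keep={keep:8.0f}  "
          f"recreate={recreate:9.0f}  -> {verdict}")
```

The break-even shifts with access frequency: rarely read datasets are cheaper to recreate, frequently read ones are cheaper to keep, which is why the question ties data management, workflows and analysis together.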

Draft documents
- "Computing models, facilities, distributed computing", as discussed here.
- Other white papers on specific topics.
- It is necessary to join the group: hsf-community-white-paper@googlegroups.com