Computing Strategy Victoria White, Associate Lab Director for Computing and CIO Fermilab PAC June 24, 2011.

Presentation transcript:

Computing Strategy
Victoria White, Associate Lab Director for Computing and CIO
Fermilab PAC, June 24, 2011

The experiments you approve
Depend heavily on computing, at all stages from inception to publication and beyond:
- Facilities (power, cooling, space)
- Data storage and distribution
- Compute servers
- Grid services
- Databases
- High-performance networks
- Software frameworks for simulation, processing, and analysis
- Tools such as GEANT, ROOT, Pythia, and GENIE
- General tools to support collaboration, documentation, code management, etc.

Our job in the Computing Sector
Is to enable science and to optimize the support (human and technological) of the scientific programs of the lab, including the experiment program:
- Within funding and resource constraints
- In the face of growing demands
- To meet emerging needs
- To deal with rapidly changing technology
We also have to provide computing to support the lab's operations and provide all the standard services that an organization needs (and often expects 24x7).

Computing Division -> Computing Sector
- Service Management: Business Relationship Management (BSM), ITIL Process Owners, Continuous Service Improvement Program, ISO 20K Certification
- Office of the CIO: Enterprise Architecture (EA) & Configuration Management, Computer Security, Governance and Portfolio Management, Project Management Office, Financial Management

Scientific Computing strategy
- Provide computing, software tools, and expertise to all parts of the Fermilab scientific program, including theory simulations (Lattice QCD and cosmology) and accelerator modeling.
- Work closely with each scientific program, as collaborators (where a scientist from computing is involved) and as valued customers.
- Create a coherent scientific computing program from the many parts and many funding sources, encouraging sharing of facilities, common approaches, and re-use of software wherever possible.

EXPERIMENT COMPUTING STRATEGIES

CMS Tier 1 at Fermilab
The CMS Tier-1 facility at Fermilab and the experienced team who operate it enable CMS to reprocess data quickly and to distribute the data reliably to the user community around the world.
Fermilab also operates:
- LHC Physics Center (LPC)
- Remote Operations Center
- U.S. CMS Analysis Facility

CMS Offline and Computing
Fermilab is a hub for CMS Offline and Computing:
- Ian Fisk is the CMS Computing Coordinator
- Liz Sexton-Kennedy is Deputy Offline Coordinator
- Patricia McBride is Deputy Computing Coordinator
- Leadership roles in many areas in CMS Offline and Computing: frameworks, simulations, data quality monitoring, workload management and data management, data operations, integration, and user support
The Fermilab Remote Operations Center allows US physicists to participate in monitoring shifts for CMS.

Computing Strategy for CMS
- Continue to evolve the CMS Tier 1 center at Fermilab to meet US obligations to CMS and provide the highest level of availability and functionality for the dollar.
- Continue to ensure that the LHC Physics Center and the US CMS physics community are well supported by the Tier 3 (LPC CAF) at Fermilab.
- Plan for evolution of the computing, software, and data access models as the experiment matures; this requires R&D and development:
  - Ever higher bandwidth networks
  - Data on demand
  - Frameworks for multi-core

Any Data, Anywhere, Any Time: Early Demonstrator
ROOT I/O and Xrootd demonstrator: an example of evolving requirements and technology.
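To illustrate the demonstrator's goal (a hedged sketch, not taken from the slide; the redirector host, file path, and tree name below are placeholders): ROOT's I/O layer can open a file served over the Xrootd protocol with the same call used for local files, so an analysis job can read data wherever it happens to live.

```
// Minimal ROOT macro sketch: read an event tree over Xrootd.
// The redirector host, dataset path, and tree name are hypothetical.
#include "TFile.h"
#include "TTree.h"
#include <iostream>

void read_remote()
{
  // TFile::Open dispatches to the Xrootd plugin for root:// URLs,
  // so no data need to be staged to local disk first.
  TFile* f = TFile::Open("root://xrootd.example.org//store/demo/events.root");
  if (!f || f->IsZombie()) { std::cerr << "open failed\n"; return; }

  TTree* tree = nullptr;
  f->GetObject("Events", tree);   // "Events" is an assumed tree name
  if (tree) std::cout << "entries: " << tree->GetEntries() << std::endl;

  f->Close();
  delete f;
}
```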

Run II Computing Strategy
- Production processing and Monte Carlo production capability after the end of data taking:
  - Reprocessing efforts in 2011/2012 aimed at the Higgs
  - Monte Carlo production at the current rate through mid
- Analysis computing capability for at least 5 years, but diminishing after the end of 2012:
  - Push for 2012 conferences for many results; no large drop in computing requirements through this period
- Continued support for up to 5 years for:
  - Code management and science software infrastructure
  - Data handling for production (+MC) and analysis operations
- Curation of the data: > 10 years, with possibly some support for continuing analyses

Tevatron: looking ahead
CDF and D0 expect the publication rate to remain stable for several years.
Analysis activity:
- Expect > 100 (students + postdocs) actively doing analysis in each experiment through
- Expect this number to be much smaller in 2015, though data analysis will still be ongoing.
[Charts: D0 publications each year; CDF publications each year]

"Data Preservation" for Tevatron data
- Data will be stored and migrated to new tape technologies for ~10 years. Eventually 16 PB of data will seem modest.
- If we want to maintain the ability to reprocess and analyze the data, there is a lot of work to be done to keep the entire environment viable: code, access to databases, libraries, I/O routines, operating systems, documentation, ...
- If there is a goal to provide "open data" that scientists outside of CDF and DZero could use, there is even more work to do.
- The 4th Data Preservation Workshop was held at Fermilab in May. This is not just a Tevatron issue.

Intensity Frontier program needs
Many experiments in many different phases of development/operations: MINOS, MiniBooNE, SciBooNE, MINERvA, NOvA, MicroBooNE, ArgoNeuT, Mu2e, g-2, LBNE, and Project X-era experiments.
[Chart: CPU (cores) and disk (TB) requirements per experiment; 1 PB]

Intensity Frontier strategies
- NuComp forum to encourage planning and common approaches where possible.
- A shared analysis facility where we can quickly and flexibly allocate computing to experiments.
- Continue to work to "grid enable" the simulation and processing software:
  - Good success with MINOS, MINERvA, and Mu2e
- All experiments use shared storage services, for data and local disk, so we can allocate resources when needed.
- Hired two associate scientists in the past year and reassigned another scientist.

Budget/resource allocation
- There is always upward pressure for computing:
  - More disk and more CPU lead to faster results and greater flexibility
  - More help with software & operations is always requested
- Within a fixed budget, each experiment can usually optimize between tape drives, tapes, disk, CPU, and servers, assuming basic shared services are provided.
- With so many experiments in so many different stages, we intend to convene a "Scientific Computing Portfolio Management Team" to examine the needs and computing models of the different Fermilab-based experiments and help allocate the finite dollars to optimize scientific output.

Cosmic Frontier experiments
- Continue to curate data for SDSS.
- Support data and processing for Auger, CDMS, and COUPP.
- Will maintain an archive copy of the DES data and provide modest analysis facilities for Fermilab DES scientists:
  - Data management is an NCSA (NSF) responsibility
  - We have the capability to provide computing should this become necessary
- DES uses Open Science Grid resources opportunistically.
- Future initiatives are still in the planning stages.
[Images: SDSS, DES]

DES Analysis Computing at Fermilab
Fermilab plans to host a copy of the DES Science Archive. This consists of two pieces:
- A copy of the Science database
- A copy of the relevant image data on disk and tape
This copy serves a number of different roles:
- Acts as a backup for the primary NCSA archive, enabling collaboration access to the data when the primary is unavailable
- Handles queries by the collaboration, thus supplementing the resources at NCSA
- Enables the Fermilab scientists to effectively exploit the DES data for science analysis
To support the science analysis of the Fermilab scientists, DES will need a modest amount of computing (of order 24 nodes). This is similar to what was supported for the SDSS project.

LSST
- Fermilab recently joined LSST.
- Fermilab expertise in data management, software frameworks, and overall computing, from SDSS and from the entire program, means we could contribute effectively.
- Currently negotiating small roles in:
  - Data acquisition (where it touches data management)
  - Science analysis (where it touches data management)

SOFTWARE IN COLLABORATION

Software tools and frameworks: our strategy
- Develop and maintain core expertise and tools, aiming to support the entire lifecycle of scientific programs:
  - Focus on areas of general applicability with long-term support requirements
  - Work in partnership with individual programs to create scientific applications
  - Participate in projects and collaborations that aim to develop scientific computational infrastructure
- Provide support for concept development to scientific programs in the pre-project phase:
  - Enabled by core expertise and tools
- Reuse expertise and best-in-class tools from partnerships with individual projects and make them available to other projects.

Framework Applications
- Success: a specific application (Run II) leads to a community tool and continuing requests for framework applications from new projects.
- Success: high-quality implementations (most recently, the CMS framework).
[Diagram: Run II offline infrastructure framework feeding LQCD software, LAr, NOvA, CMS, Mu2e, and MiniBooNE]
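To show what a "framework application" means in practice (a hedged sketch based on the public CMS framework interfaces, not taken from the slides; the class, parameter, and label names are invented): physics code is written as small modules that the framework constructs from configuration and schedules over events, so the same infrastructure can be reused across experiments.

```
// Sketch of a CMS-framework analyzer module; DemoAnalyzer and "trackLabel"
// are hypothetical names used only for illustration.
#include "FWCore/Framework/interface/EDAnalyzer.h"
#include "FWCore/Framework/interface/Event.h"
#include "FWCore/ParameterSet/interface/ParameterSet.h"
#include <iostream>
#include <string>

class DemoAnalyzer : public edm::EDAnalyzer {
public:
  explicit DemoAnalyzer(const edm::ParameterSet& cfg)
    : label_(cfg.getParameter<std::string>("trackLabel")) {}

  // The framework calls analyze() once per event; the module never owns the event loop.
  virtual void analyze(const edm::Event& event, const edm::EventSetup&) {
    std::cout << "run " << event.id().run()
              << " event " << event.id().event() << std::endl;
  }

private:
  std::string label_;
};

#include "FWCore/Framework/interface/MakerMacros.h"
DEFINE_FWK_MODULE(DemoAnalyzer);
```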

"CMS framework in excellent shape and well validated*"
*CMS offline coordinators, Dec 2010

Detector Simulation
- GEANT activity: members of the G4 collaboration since 2007; toolkit capability development. Work in critical areas defined by G4 external reviews.
- Simulation development & support activity: provide expertise and support to Fermilab projects and users. Applications in high-priority areas for the Fermilab program. Shifting from an LHC/CMS main focus to the Intensity Frontier.
- Toolkit evolution, in collaboration with other institutions (SLAC, CERN, ...): optimize performance of the existing toolkit; enhance capabilities and improve infrastructure.
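For readers unfamiliar with the toolkit, a minimal sketch of how an experiment builds on Geant4 (illustrative only, not from the slides; the detector, particle gun, and material choices are invented): the application registers its geometry, physics list, and particle source with the run manager, and the toolkit supplies the tracking and physics machinery.

```
// Toy Geant4 application: a 1 m cube of liquid argon hit by 1 GeV muons.
// All class names and choices here are hypothetical placeholders.
#include "G4RunManager.hh"
#include "G4VUserDetectorConstruction.hh"
#include "G4VUserPrimaryGeneratorAction.hh"
#include "G4PhysListFactory.hh"
#include "G4NistManager.hh"
#include "G4Box.hh"
#include "G4LogicalVolume.hh"
#include "G4PVPlacement.hh"
#include "G4ParticleGun.hh"
#include "G4ParticleTable.hh"
#include "G4ThreeVector.hh"
#include "G4SystemOfUnits.hh"

// A one-volume "detector": a cube of liquid argon acting as the world volume.
class ToyDetector : public G4VUserDetectorConstruction {
public:
  G4VPhysicalVolume* Construct() override {
    G4Material* lAr = G4NistManager::Instance()->FindOrBuildMaterial("G4_lAr");
    G4Box* solid = new G4Box("World", 0.5 * m, 0.5 * m, 0.5 * m);
    G4LogicalVolume* logical = new G4LogicalVolume(solid, lAr, "World");
    return new G4PVPlacement(nullptr, G4ThreeVector(), logical, "World",
                             nullptr, false, 0);
  }
};

// Fire 1 GeV muons along +z from the origin.
class ToyGun : public G4VUserPrimaryGeneratorAction {
public:
  ToyGun() : gun_(1) {
    gun_.SetParticleDefinition(
        G4ParticleTable::GetParticleTable()->FindParticle("mu-"));
    gun_.SetParticleEnergy(1.0 * GeV);
    gun_.SetParticleMomentumDirection(G4ThreeVector(0., 0., 1.));
  }
  void GeneratePrimaries(G4Event* event) override {
    gun_.GeneratePrimaryVertex(event);
  }
private:
  G4ParticleGun gun_;
};

int main() {
  G4RunManager runManager;
  // The experiment supplies geometry, a physics list, and a particle source;
  // the toolkit does the rest.
  runManager.SetUserInitialization(new ToyDetector);
  runManager.SetUserInitialization(
      G4PhysListFactory().GetReferencePhysList("FTFP_BERT"));
  runManager.SetUserAction(new ToyGun);
  runManager.Initialize();
  runManager.BeamOn(100);  // simulate 100 events
  return 0;
}
```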

Analysis suites for the community: ROOT
- ROOT is the standard HEP analysis toolkit, used for Run II, the LHC, and the Intensity Frontier.
  - Fermilab is a founding member of the ROOT project.
- Support deployment and operation of ROOT applications by Fermilab users and projects.
- Development emphasis, in collaboration with CERN, to optimize I/O (essential for the LHC) and thread safety (driven by technology evolution and LHC needs).
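To make the I/O emphasis concrete (a small sketch, not from the slides; the file, tree, and branch names are made up): the pattern below, writing structured event data into a ROOT file and looping over it again, is the workload that the I/O optimization work targets.

```
// Write a small TTree to a ROOT file, then read it back and compute a mean.
// File, tree, and branch names are illustrative.
#include "TFile.h"
#include "TTree.h"
#include <iostream>

void tree_io_demo()
{
  // Write phase: one double branch, 1000 entries.
  TFile out("demo.root", "RECREATE");
  TTree* tree = new TTree("Events", "toy event record");  // owned by the file
  double energy = 0.0;
  tree->Branch("energy", &energy, "energy/D");
  for (int i = 0; i < 1000; ++i) {
    energy = 0.1 * i;
    tree->Fill();
  }
  tree->Write();
  out.Close();

  // Read phase: reattach the branch and loop over entries.
  TFile in("demo.root");
  TTree* t = nullptr;
  in.GetObject("Events", t);
  double e = 0.0;
  t->SetBranchAddress("energy", &e);
  double sum = 0.0;
  for (Long64_t i = 0; i < t->GetEntries(); ++i) {
    t->GetEntry(i);
    sum += e;
  }
  std::cout << "mean energy: " << sum / t->GetEntries() << std::endl;
  in.Close();
}
```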

Software: collaborative efforts
- ComPASS, the Accelerator Modeling Tools project
- Lattice QCD project and the USQCD Collaboration
- Open Science Grid: many aspects and some sub-projects such as grid security and workload management
- Grid and data management tools
- Advanced wide area network projects
- dCache collaboration
- Enstore collaboration
- Scientific Linux (with CERN)
- GEANT core development/validation (with the GEANT4 collaboration)
- ROOT development & support (with CERN)
- Cosmological computing
- Data preservation initiative (global HEP)

SHARING STRATEGIES

Why sharing strategies are needed
- Cost
- Coherent technical approaches and architectures
- Support over the entire lifecycle of an experiment/project

Experiment/Project lifecycle and funding
- Early period (R&D, simulations, LOI, proposals): shared services
- Mature phase (construction, operations, analysis): shared services plus experiment- or project-specific resources
- Final data-taking and beyond (final analysis, data preservation and access): shared services plus project-specific resources

Sharing via the Grid: FermiGrid
[Diagram: user login & job submission through the FermiGrid site gateway, with FermiGrid authentication/authorization, infrastructure, and monitoring/accounting services; cluster pools: GRIDFarm 3284 slots, CMS 7485 slots, CDF 5600 slots, D0; external grids: Open Science Grid, TeraGrid, WLCG, NDGF]

Open Science Grid (OSG)
The Open Science Grid (OSG) advances science through open distributed computing. The OSG is a multi-disciplinary partnership to federate local, regional, community, and national cyberinfrastructures to meet the needs of research and academic communities at all scales.
- Total of 95 sites; ½ million jobs a day; 1 million CPU hours/day; 1 million files transferred/day.
- It is cost effective, it promotes collaboration, and it is working!
- The US contribution to and partnership with the LHC Computing Grid is provided through OSG for CMS and ATLAS.

FNAL CPU: core count for science
[Chart]

Data Storage at Fermilab: Tape
[Chart]

Data on tape: total
[Chart: data on tape, including "Other Experiments"]

FermiCloud: virtualization is likely a key component for long-term analysis
- The FermiCloud project is a private cloud facility built to provide a production facility for cloud services.
- A private cloud: on-site access only, for registered Fermilab users.
  - Can be evolved into a hybrid cloud with connections to Magellan, Amazon, or other cloud providers in the future.
- Much of the "data intensive" computing cannot use commercial cloud computing.
- Not cost effective today for permanent use; only for overflow or unexpected needs for simulation.

COMPUTING FOR THEORY AND SIMULATION SCIENCE

High Performance (parallel) Computing is needed for
- Lattice gauge theory calculations (LQCD)
- Accelerator modeling tools and simulations
- Computational cosmology: simulations connect fundamentals with observables
[Figure panels: dark energy and matter; cosmic gas; galaxies]
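To give a flavor of why these workloads need HPC clusters rather than loosely coupled grid jobs (a hedged sketch, not from the talk): lattice and cosmology codes split a simulation volume across many tightly coupled processes that must exchange data every step. The toy below only combines a locally computed quantity across MPI ranks; real LQCD codes also exchange field boundaries between neighboring ranks every iteration.

```
// Toy MPI sketch: each rank works on its own sub-volume of a lattice and the
// ranks combine a global result with a collective reduction. Illustrative only.
#include <mpi.h>
#include <vector>
#include <cstdio>

int main(int argc, char** argv)
{
  MPI_Init(&argc, &argv);
  int rank = 0, size = 1;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  // Pretend each rank owns ~1M lattice sites and computes a local observable.
  std::vector<double> sites(1 << 20, 0.5 + 0.001 * rank);
  double local = 0.0;
  for (double s : sites) local += s;

  // Tight coupling: every step of a real calculation needs results from all ranks.
  double global = 0.0;
  MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

  if (rank == 0) std::printf("global sum over %d ranks: %g\n", size, global);
  MPI_Finalize();
  return 0;
}
```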

Strategies for Simulation Science Computing
- Lattice QCD is the poster child:
  - Coherent, inclusive US QCD collaboration, led by Paul MacKenzie (Fermilab); this allocates HPC resources.
  - LQCD Computing Project (HEP and NP funding); Bill Boroski (Fermilab) is the Project Manager.
  - SciDAC II project to develop the software infrastructure.
- Accelerator modeling:
  - Multi-institutional tools project ComPASS; Panagiotis Spentzouris (Fermilab) is the PI.
  - Also accelerator-project-specific modeling efforts.
- Computational cosmology:
  - Computational Cosmology Collaboration (C^3) for mid-range computing for astrophysics and cosmology.
  - Taskforce (Fermilab, ANL, U of Chicago) to develop strategy.

CORE COMPUTING & INFRASTRUCTURE

Core Computing: a strong base
Scientific Computing relies on Core Computing services and Computing Facility infrastructure:
- Core networking and network services
- Computer rooms, power, and cooling
- Email, videoconferencing, web servers
- Document databases, Indico, calendaring
- Service desk
- Monitoring and alerts
- Logistics
- Desktop support (Windows and Mac)
- Printer support
- Computer security
- ... and more
All of the above is provided through overheads.

Computer Rooms
The computer rooms are the home of all the scientific computing hardware.
- They provide power, space, and cooling for all the systems.
- CD's computer rooms are a critical component of the successful delivery of scientific computing.
[Photos: Feynman Computing Center (FCC), Grid Computing Center (GCC), Lattice Computing Center (LCC)]

Fermilab Computing Facilities
- Lattice Computing Center (LCC): High Performance Computing (HPC); accelerator simulation and cosmology nodes; no UPS.
- Feynman Computing Center (FCC): high-availability services (e.g. core network, email, etc.); tape robotic storage (slot libraries); UPS & standby power generation. ARRA project to upgrade cooling and add a high-availability computing room: completed.
- Grid Computing Center (GCC): high-density computational computing; CMS, Run II, and Grid Farm batch worker nodes; Lattice HPC nodes; tape robotic storage (slot libraries); UPS & taps for portable generators; EPA Energy Star award 2010.

Facilities: more than just space, power, and cooling; continuous planning
- ARRA funded a new high-availability computer room in the Feynman Computing Center.
- Many CMS disks are now housed there.

Reliable high-speed networking is key
[Diagram]

Conclusion
- We have a coherent and evolving scientific computing program that emphasizes sharing of resources, re-use of code and tools, and requirements planning.
- Embedded scientists with deep involvement are also a key strategy for success.
- Fermilab takes on leadership roles in computing in many areas.
- We support projects and experiments at all stages of their lifecycle, but if we want to truly preserve access to Tevatron data long term, much more work is needed.