Assessment of Core Services provided to USLHC by OSG.

Charge: “Please describe the essential services and infrastructure supplied by OSG for LHC computing operations and provide a cost estimate for those services in analogy to the proposed European model which includes identifying the core services currently provided by EGEE to wLCG. Be certain to note commonalities and differences between U.S. ATLAS and U.S. CMS. Please describe the service level agreements with OSG for those services. Identify potential issues associated with a possible transitioning of these services to the LHC Operations Program and suggest possible solutions.”

Introduction Four years into the Open Science Grid project, the LHC experiments, together with a number of other sciences, can rely on the OSG to provide vital services for their scientific computing infrastructure. The computing systems for both ATLAS and CMS consist of a grid of more than a hundred distributed computing centers. The Grid approach helps distribute the huge and complex multi-petabyte data sets and the sophisticated analysis and simulation workloads across a large number of sites. Distributing the computing also reflects the complex funding structure of a truly distributed worldwide collaboration of hundreds of universities and labs. Unlike a company or a single institutional computing center, with top-down management and coordination covering all required services, the LHC experiments are fundamentally a group of researchers that must rely on a loosely affiliated set of computing centers.

OSG - Center of Expertise OSG develops and maintains a center of expertise in High Throughput Computing (HTC) and Grid Computing, which U.S. ATLAS and U.S. CMS leverage as new needs become apparent. OSG is viewed as the right forum for building a community that establishes best practices and shares knowledge about building and evolving the computing infrastructure on the campuses of U.S. institutions participating in LHC data analysis. Specifically, this should include more efficient and professional system administration, which will lead to reduced downtimes, less manual configuration, improved system tuning, reduced costs, and more efficient use of the scientist's time. Conversely, in terms of connecting campuses to the national infrastructure, we require OSG to develop a strategy across all existing work areas to support these connections.

Collaboration Domain and computational scientists work together to solve end-to-end application needs by advancing the principles, methodologies, and frameworks of large-scale computing. In such a distributed environment, the LHC science program cannot rely solely on bilateral agreements with individual sites, even if some of them, like the participating national labs, have the know-how and IT infrastructure to provide a large set of the required services. The OSG consortium addresses the need for a homogeneous approach across all sites on important issues. This includes not only the middleware and interfaces to site services, but also the approaches and procedures concerning computer security, incident response, user registration, etc.

Strategic Importance of OSG OSG has thus become a major strategic component of the U.S. LHC scientific programs, addressing critical needs for LHC computing. OSG also benefits university computing centers and national laboratories that provide computing for science: it allows them to provision and manage their facilities across their broader programs and to capitalize on economies made possible by sharing expertise and support. Although currently funded as a research project, OSG has evolved into a mission-critical facility infrastructure on which USLHC's VO-specific services heavily depend. More than 50% of the personnel are devoted to operations. We therefore consider OSG today a production facility with a research component rather than just a research project. It is vital to the LHC program that the present level of service continues uninterrupted for the foreseeable future, and that all of the services and support structures upon which the LHC program relies today have a clear transition or continuation strategy.

Middleware Architecture A technology working group led by the OSG Technical Director, with participation from U.S. ATLAS, U.S. CMS, LIGO, and OSG, investigates, researches, and clarifies design issues, resolves questions directly, and summarizes technical design trade-offs so that the component project teams can make informed decisions.

Storage Services We expect that future storage developments — which we know will be required to scale up the storage systems to unprecedented sizes in the coming 5 years — will profit from a strong collaborative approach between OSG stakeholders, and that OSG will be ready to provide integration, deployment and operational support for the next generation of storage solutions.

Configuration Management We understand configuration management to mean a distributed tool that makes system (OS and software) deployment and configuration simple and reliable. If a particular server needs to be restored, re-created, or duplicated, the system updates itself according to a centrally defined profile. In the context of OSG, the idea would be a tool that allows administrators to deploy, change, and back up and restore (via versioning) their entire site software configuration.
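
To illustrate the idea only (this is not any specific OSG tool), the Python sketch below applies a centrally defined profile to a node and keeps timestamped snapshots so an earlier configuration can be restored. The profile format, file paths, and package commands are assumptions for the example; a production site would use an established configuration management tool.

```python
import json
import shutil
import subprocess
import time
from pathlib import Path

# Hypothetical locations; a real deployment would define these per site.
PROFILE_PATH = Path("/etc/site/profile.json")        # centrally defined profile
HISTORY_DIR = Path("/var/lib/site-config/history")   # versioned snapshots


def load_profile(path):
    """Read the centrally defined profile (package name -> required version)."""
    return json.loads(path.read_text())


def snapshot_profile(path):
    """Keep a timestamped copy so an earlier configuration can be restored."""
    HISTORY_DIR.mkdir(parents=True, exist_ok=True)
    dest = HISTORY_DIR / f"profile-{int(time.time())}.json"
    shutil.copy(path, dest)
    return dest


def installed_version(package):
    """Query the local package manager (rpm here, purely as an example)."""
    result = subprocess.run(
        ["rpm", "-q", "--qf", "%{VERSION}", package],
        capture_output=True, text=True,
    )
    return result.stdout.strip() if result.returncode == 0 else None


def apply_profile(profile):
    """Bring the node in line with the profile; only act on mismatches."""
    for package, wanted in profile.get("packages", {}).items():
        have = installed_version(package)
        if have != wanted:
            print(f"updating {package}: {have} -> {wanted}")
            subprocess.run(["yum", "-y", "install", f"{package}-{wanted}"], check=False)


if __name__ == "__main__":
    profile = load_profile(PROFILE_PATH)
    snapshot_profile(PROFILE_PATH)
    apply_profile(profile)
```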

Application Support The U.S. LHC program requires that OSG personnel continue to be involved in developing and applying experiment-specific services for LHC data analysis on top of the OSG middleware stack. Examples include scalable workload management systems, such as glideinWMS and PanDA, and high-performance systems for data storage and data access. OSG provides support for integrating these services into the OSG and global Grid infrastructure. For PanDA, for example, OSG provides for the integration of the security infrastructure mandated and deployed by WLCG and OSG, enabling secure and traceable operation of pilot-based multi-user workload management systems.
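
As an illustration of the pilot-based pattern that systems like glideinWMS and PanDA implement (a minimal sketch, not their actual protocols or APIs), a pilot starts on a worker node, asks a central task queue for a real user payload, runs it, and records the user it ran for so that operation stays traceable. The queue URL and payload fields below are hypothetical.

```python
import json
import subprocess
import urllib.request

# Hypothetical central task-queue endpoint; the real systems use their own
# protocols and the WLCG/OSG security infrastructure for authentication.
TASK_QUEUE_URL = "https://example.org/taskqueue/getjob"


def fetch_payload(site_name):
    """Ask the central queue for a payload matched to this site's resources."""
    request = urllib.request.Request(
        TASK_QUEUE_URL,
        data=json.dumps({"site": site_name}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)  # e.g. {"user": "...", "command": [...]}


def run_pilot(site_name):
    """One pilot cycle: fetch a user job, run it, and log who it ran for."""
    payload = fetch_payload(site_name)
    if not payload:
        print("no work available; pilot exits")
        return
    # Traceability: record the end user on whose behalf the payload runs,
    # as required for multi-user pilot operation.
    print(f"running payload for user {payload['user']}")
    result = subprocess.run(payload["command"], capture_output=True, text=True)
    print(f"payload finished with exit code {result.returncode}")


if __name__ == "__main__":
    run_pilot("US_T2_EXAMPLE")
```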

Virtualization From a technology viewpoint, the number of cores per machine will continue to increase in the near future, and the challenge becomes implementing software in ways that exploit them efficiently. The growing number of cores per machine has helped drive the rapid adoption of virtualization. In addition to its benefits for resource consolidation, virtualization creates opportunities for a more flexible approach to offering computing services. Both multicore hardware and virtualization are rapidly maturing, particularly in terms of performance and management tools. Physics applications can benefit from these advances, but computing services need to adapt to support them.

Virtualization - Actions
- Provide infrastructure at the centers for preparing virtual machine (VM) images and delivering them to the groups developing and running LHC data analysis.
- Include the capability to run VM images in virtualized batch systems at Tier-1 and Tier-2 sites.
- Establish procedures for creating images that can be trusted and run at Grid sites; this is needed for Virtual Organizations like U.S. ATLAS and U.S. CMS to be able to run their images on their own facilities and on opportunistic resources (see the sketch after this list).
- Deploy multicore performance and monitoring tools (e.g. KSM, PERFMON) at U.S. LHC facility sites.
- Provide input to initiatives for running multicore jobs Grid-wide, e.g. the MPI (Message Passing Interface) Working Group recommendations.
- Grid interoperability with clouds: prototype a solution to run Grid jobs on academic and commercial cloud resources.
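
As one way such a trust procedure could look (a sketch under assumed conventions, not an agreed OSG or WLCG mechanism), the fragment below checks a VM image against a catalogue of approved checksums before a site allows it to boot. The catalogue path, its format, and the image name are hypothetical.

```python
import hashlib
import json
from pathlib import Path

# Hypothetical catalogue of approved images, assumed to reach the site
# through a channel it already trusts (e.g. published by the VO).
APPROVED_CATALOGUE = Path("/etc/vo-images/approved.json")  # {"name": "sha256hex", ...}


def sha256_of(image_path):
    """Hash the image file in chunks so large images do not exhaust memory."""
    digest = hashlib.sha256()
    with image_path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_is_trusted(image_path):
    """An image is trusted only if its checksum matches the approved catalogue."""
    catalogue = json.loads(APPROVED_CATALOGUE.read_text())
    expected = catalogue.get(image_path.name)
    return expected is not None and expected == sha256_of(image_path)


if __name__ == "__main__":
    image = Path("/var/lib/images/analysis-node.img")  # hypothetical image
    if image_is_trusted(image):
        print(f"{image.name}: checksum verified, eligible to run")
    else:
        print(f"{image.name}: not in the approved catalogue, refusing to boot")
```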

Effort - LHC-specific (WLCG)

Effort - LHC-specific (Ops)

Effort - LHC-specific

Effort - non LHC-specific

Additional Effort (1/2)

Additional Effort (2/2)