Software installation and condition data distribution via CernVM FileSystem in ATLAS
On behalf of the ATLAS collaboration
A De Salvo 1, A De Silva 2, D Benjamin 3, J Blomer 4, P Buncic 4, A Harutyunyan 4, A Undrus 5, Y Yao 6
[1] Istituto Nazionale di Fisica Nucleare, Sezione di Roma1; [2] TRIUMF (CA); [3] Duke University (US); [4] CERN (CH); [5] Brookhaven National Laboratory (US); [6] LBNL (US)

The ATLAS Collaboration manages one of the largest software collections among the High Energy Physics experiments. Traditionally this software has been distributed via rpm or pacman packages and installed at every site and on every user's machine, using more space than needed because the releases could not always share common binaries. As the software grew in size and in number of releases, this approach showed its limits in terms of manageability, disk space and performance. The adopted solution is based on the CernVM FileSystem, a FUSE-based, HTTP-transported, read-only file system which guarantees file de-duplication, on-demand file transfer with caching, scalability and performance.

CernVM-FS
The CernVM File System (CernVM-FS) is a file system used by various HEP experiments for the access and on-demand delivery of software stacks for data analysis, reconstruction and simulation. It consists of web servers and web caches for data distribution to the CernVM-FS clients, which provide a POSIX-compliant read-only file system on the worker nodes. CernVM-FS leverages content-addressable storage: files with the same content are stored only once because they have the same content hash, and data integrity is trivial to verify by re-calculating the cryptographic content hash. Files that carry the name of their cryptographic content hash are immutable, hence easy to cache, since no expiry policy is required, and there is natural support for versioning and file system snapshots. CernVM-FS uses HTTP as its transfer protocol and transfers files and file catalogs only on request, i.e. on an open() or stat() call. Once transferred, files and file catalogs are cached locally.

Changes to a CernVM-FS repository are staged on an "installation box" using a read/write file system interface; there is a dedicated installation box for each repository. The CernVM-FS tools installed on these machines are used to commit changes, process new and updated files, and publish new repository snapshots. The machines act as a pure interface to the master storage and can safely be re-installed if required.
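To make the de-duplication and integrity properties above concrete, the following Python sketch shows the essential idea of a content-addressed object store such as the one a repository snapshot is built from. It is an illustration only: the hash choice, directory layout and function name are simplifying assumptions, not the actual CernVM-FS implementation.

    import hashlib
    import os
    import shutil

    def store_object(src_path, store_dir):
        """Store one file under the name of its content hash: identical files
        are kept only once, and integrity can be verified by re-hashing."""
        h = hashlib.sha1()  # SHA-1 used here purely for illustration
        with open(src_path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        digest = h.hexdigest()
        # Illustrative layout: objects grouped by the first two hex digits.
        obj_path = os.path.join(store_dir, digest[:2], digest[2:])
        if not os.path.exists(obj_path):  # de-duplication: object already present
            os.makedirs(os.path.dirname(obj_path), exist_ok=True)
            shutil.copyfile(src_path, obj_path)
        return digest  # the file catalog maps path names to such digests

A client that knows the catalog entry for a path can then request the corresponding immutable object over HTTP and cache it indefinitely.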
Installed items:
1. Stable software releases
2. Nightly builds
3. Local settings
4. Software tools
5. Conditions data

Stable Software Releases
The deployment is performed via the same installation agents used at the standard Grid sites. The installation agent is currently run manually by a release manager whenever a new release is available, but it can already be automated by running it periodically. The installation parameters are kept in the Installation DB and are queried dynamically by the deployment agent when it starts. The Installation DB is part of the ATLAS Installation System and is updated by the Grid installation team. Once the releases are installed in CernVM-FS, a validation job is sent to each site; for each validated release a tag is added to the site resources, marking them as suitable for use with that release. The Installation System handles both CernVM-FS and non-CernVM-FS sites transparently.

Conditions Data
ATLAS conditions data refers to the time-varying, non-event data needed to operate and debug the ATLAS detector, to reconstruct event data and to perform subsequent analysis. It contains all sorts of slowly evolving data, including detector alignment, calibration, monitoring and slow-controls data. Most of the conditions data is stored in an Oracle database at CERN and accessed remotely using the Frontier service; some of it is stored in external files because of the complexity and size of the data objects. CernVM-FS provides an ideal solution for distributing those files, which would otherwise have to be distributed through the ATLAS distributed data management system, and it simplifies operations at large computing sites by reducing the need for specialized storage areas.

Performance
Measurements of CernVM-FS access times against the file systems most commonly used in ATLAS show that CernVM-FS is comparable to or faster than them once the system caches are filled with pre-loaded data. A metadata-intensive test performed at CNAF, Italy, equivalent to the operations performed by each ATLAS job in real life, compared CernVM-FS with cNFS over GPFS: the average access times with CernVM-FS are ~16 s with a loaded cache and ~37 s with an empty cache, to be compared with the cNFS values of ~19 s and ~49 s with the page cache on and off respectively. CernVM-FS is therefore generally even faster than standard distributed file systems. According to other measurements, CernVM-FS also performs consistently better than AFS on wide area networks, with the additional bonus that the system can work in disconnected mode.

Local Settings
For the local parameters needed before running jobs, ATLAS currently uses a hybrid system: the generic values are kept in CernVM-FS, with a possible override at the site level from a location pointed to by an environment variable, which the site administrators define to point to a local shared disk of the cluster. Periodic installation jobs write the per-site configuration into the shared local area of the sites. A fully CernVM-FS-based option, in which the local parameters are retrieved dynamically from the ATLAS Grid Information System (AGIS), is being tested.
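The hybrid local-settings lookup just described can be pictured with a short Python sketch. The environment variable and file names below (ATLAS_LOCAL_AREA, setup-defaults.sh, the /cvmfs path) are assumptions chosen for illustration; the actual names used by ATLAS may differ.

    import os

    # Hypothetical location of the generic settings shipped on CernVM-FS.
    CVMFS_DEFAULTS = "/cvmfs/atlas.cern.ch/repo/sw/local/setup-defaults.sh"

    def settings_file():
        """Prefer a site-level override on the cluster's shared disk, if the
        site administrators have pointed ATLAS_LOCAL_AREA at one; otherwise
        fall back to the generic values kept in CernVM-FS."""
        override_dir = os.environ.get("ATLAS_LOCAL_AREA")  # hypothetical variable name
        if override_dir:
            candidate = os.path.join(override_dir, "setup-defaults.sh")
            if os.path.isfile(candidate):
                return candidate  # per-site configuration written by installation jobs
        return CVMFS_DEFAULTS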
Software Tools
The availability of Tier3 software tools from CernVM-FS is invaluable for the physicist end-user, both for accessing Grid resources and for analysis activities on locally managed resources, which can range from a grid-enabled data center to a user desktop or a CernVM virtual machine. The software, comprising multiple versions of Grid middleware, front-end Grid job submission tools, analysis frameworks, compilers, data management software and a User Interface with diagnostic tools, is first tested in an integrated environment and then installed on the CernVM-FS server by the same mechanism that allows local disk installation on non-privileged Tier3 accounts. The User Interface integrates the Tier3 software with the stable and nightly ATLAS releases and with the conditions data, the latter two residing on different CernVM-FS repositories.

Nightly Releases
The ATLAS Nightly Build System maintains about 50 branches of multi-platform releases based on recent versions of the software packages. Code developers use nightly releases for testing new packages, verifying patches to existing software, and migrating to new platforms, compilers and external software versions. The ATLAS nightly CernVM-FS server provides about 15 widely used nightly branches, each represented by its seven latest nightly releases. The CernVM-FS installations are fully automatic and tightly synchronized with the Nightly Build System processes: the most recent nightly releases are uploaded daily, within a few hours of build completion, and the status of the CernVM-FS installation is displayed on the Nightly System web pages.

CernVM-FS Infrastructure for ATLAS
The ATLAS experiment maintains three CernVM-FS repositories: one for stable software releases, one for conditions data and one for nightly builds. The nightly builds are the daily builds, provided by the ATLAS Nightly Build System, that reflect the current state of the source code. Each of these repositories is subject to daily changes. The Stratum-0 web service provides the entry point to access the data via HTTP. Hourly replication jobs synchronize the Stratum-0 web service with the Stratum-1 web services, which are operated at the CERN, BNL, RAL, Fermilab and ASGC LHC Tier-1 centers. CernVM-FS clients can connect to any of the Stratum-1 servers and automatically switch to another Stratum-1 service in case of failure. On computer clusters, CernVM-FS clients connect through a cluster-local web proxy; in the case of ATLAS, CernVM-FS mostly piggybacks on the already deployed network of Squid caches of the Frontier service.
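The client-side fail-over between Stratum-1 services can be sketched as follows. The host names below are illustrative examples of Stratum-1 replicas rather than an authoritative list, and a real CernVM-FS client handles this internally through its configuration (e.g. the CVMFS_SERVER_URL parameter) rather than in application code.

    import urllib.request

    # Illustrative Stratum-1 mirrors of an ATLAS repository (example hosts only).
    STRATUM1_URLS = [
        "http://cvmfs-stratum-one.cern.ch/cvmfs/atlas.cern.ch",
        "http://cvmfs.racf.bnl.gov/cvmfs/atlas.cern.ch",
        "http://cernvmfs.gridpp.rl.ac.uk/cvmfs/atlas.cern.ch",
    ]

    def fetch(path, timeout=5):
        """Ask each Stratum-1 in turn and return the first successful reply,
        mimicking the automatic switch-over of the CernVM-FS client."""
        last_error = None
        for base in STRATUM1_URLS:
            try:
                with urllib.request.urlopen(base + path, timeout=timeout) as reply:
                    return reply.read()
            except OSError as exc:  # covers connection failures and HTTP errors
                last_error = exc
        raise RuntimeError("all Stratum-1 servers failed: %s" % last_error)

On a cluster, such requests would additionally pass through the cluster-local web proxy, as described above.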