NA61/NA49 virtualisation: status and plans
Dag Toppe Larsen, CERN, 08.10.2012

Slide 2: Outline
- Quick reminder of CernVM and installation
- Tasks
- Each task in detail
- Roadmap
- Input needed

Slide 3: CernVM
- CernVM is a Linux distribution designed specifically for virtual machines (VMs)
- Based on SLC (currently SLC5)
- Compressed image size ~300 MB
- Both 32-bit and 64-bit versions
- Additional software:
  - "Standard" software via the Conary package manager
  - Experiment software via CVMFS
- Contextualisation: images are adapted to experiment requirements during boot
- Data preservation: all images are permanently preserved

Slide 4: CVMFS
- Distributed read-only file system for CernVM (i.e. the same role as AFS for LXPLUS)
- Can also be used by "real" machines (e.g. LXPLUS, grid)
- Files compressed and distributed via HTTP → global availability
- Central server; site replication via standard HTTP proxies
- Files decompressed and cached on the (CernVM) computer → can run without Internet access if all needed files are cached
- Mainly for experiment software, but also other "static" data (e.g. calibration data)
- Each experiment has a repository storing all versions of its software (see the sketch below)
- Common software (e.g. ROOT) available from the SFT repository
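Because every published version sits in its own directory of the experiment repository, tools such as the data production web interface (slide 19) can discover the available versions with a plain directory listing. A minimal sketch, assuming the NA61 repository is mounted at /cvmfs/na61.cern.ch with one sub-directory per software version (the path and layout are assumptions, not taken from the slides):

```python
import os

# Assumed mount point and layout: /cvmfs/na61.cern.ch/<version>/...
REPO = "/cvmfs/na61.cern.ch"

def list_software_versions(repo=REPO):
    """Return the software versions published in the CVMFS repository."""
    if not os.path.isdir(repo):
        return []  # repository not mounted (no network and nothing cached)
    return sorted(d for d in os.listdir(repo)
                  if os.path.isdir(os.path.join(repo, d)))

if __name__ == "__main__":
    print(list_software_versions())
```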

Slide 5: Data preservation
- As technology evolves, it is no longer possible to run legacy software on modern platforms
- What must be preserved and kept accessible:
  - Experiment data
  - Experiment software
  - Operating environment (operating system, libraries, compilers, hardware)
- Just preserving the data and software is not enough
- Virtualisation may preserve the operating environment

Slide 6: CernVM data preservation
- "Solution":
  - Experiment data stored on Castor
  - Experiment software versions stored on CVMFS → HTTP is a "lasting" technology
  - Operating environments stored as CernVM image versions
- Thus, a legacy version of CernVM can be started as a VM, running a legacy version of the experiment software
- Forward-looking approach (we start preserving now)

Slide 7: CernVM for development
- CernVM makes it possible to run the production version of legacy software/Shine on a laptop without a local install
- It is also possible to compile Shine from SVN on CernVM "out of the box" once the proper NA61 environment is set up

Slide 8: CernVM installation on a laptop
- Install a hypervisor of your choice, e.g. VirtualBox
- Download a matching CernVM desktop image
- Open :8004 in your web browser (user=admin, password=password)
- Select the NA61 and PH-SFT software repositories
- Reboot
- You are now ready to use NA61 software in CernVM on your laptop!
- More information: wOFInstallation (CernVM section)

Slide 9: Tasks
- Make experiment software available
- Facilitate batch processing
- Validate outputs
- On-demand virtual clusters
- Reference cloud cluster
- Data (re)production scripts
- Production reconstruction
- Data production web interface

Slide 10: Make experiment software available
- NA61/NA49 software must be available on CVMFS for CernVM to process data
- NA61:
  - Legacy software chain installed
    - Changes to be fed back to SVN
  - SHINE software installed
    - ROOT and other dependencies provided via CVMFS
    - SVN checkout compiles "out of the box"
    - Using the 32-bit CernVM image
- NA49:
  - Software has been installed

Slide 11: Facilitate batch processing
- LXPLUS uses the PBS batch system; CernVM uses the Condor batch system
- "Philosophical" difference:
  - PBS has one job script per job
  - Condor has a common job description file with parameters for each job
- The existing PBS scripts have been ported to Condor (see the sketch below)
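To illustrate the difference: instead of generating one PBS script per chunk, a Condor port can emit a single submit description that queues every chunk of a run, with $(Process) supplying the per-job parameter. A minimal sketch with hypothetical names (reco_chunk.sh, the logs/ directory, the chunk count); this is not the actual NA61 porting script:

```python
import subprocess

# One submit description for the whole run; $(Process) numbers the chunks.
SUBMIT_TEMPLATE = """\
universe   = vanilla
executable = reco_chunk.sh
arguments  = {run} $(Process)
output     = logs/run{run}_chunk$(Process).out
error      = logs/run{run}_chunk$(Process).err
log        = logs/run{run}.log
queue {nchunks}
"""

def submit_run(run, nchunks):
    """Write a single Condor submit file covering all chunks and submit it."""
    path = "run%d.sub" % run
    with open(path, "w") as f:
        f.write(SUBMIT_TEMPLATE.format(run=run, nchunks=nchunks))
    subprocess.check_call(["condor_submit", path])

if __name__ == "__main__":
    submit_run(8688, 100)  # run 8688 is the validation run from slide 12
```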

Slide 12: Output validation – status
- Run 8688 has been processed on both CernVM/CVMFS and LXPLUS/AFS, using software version v2r7g
- According to the analysis by Grzegorz, the discrepancies are relatively small
- This holds despite the gap TPC not running on CernVM/CVMFS, even with the same set-up file, whereas it works on LXBATCH/CVMFS
- Once this bug has been found, the CernVM/CVMFS, LXBATCH/CVMFS and LXBATCH/AFS comparison should be repeated

Slide 13: On-demand virtual clusters
- A cluster may need VMs of different configurations, depending on the type of jobs (memory, CernVM version, experiment software, etc.)
- Thus, virtual clusters must be created/destroyed dynamically
- A command-line script for creating virtual clusters has been written (see the sketch below)
- Later to be controlled by the data production web interface
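Since slide 17 notes that the cloud is managed through its EC2 interface, such a script could be little more than a thin wrapper around the EC2 API. A minimal sketch using the boto bindings; the endpoint, credentials, image id and instance type are placeholders, not the actual private-cloud or LXCLOUD settings:

```python
import boto
from boto.ec2.regioninfo import RegionInfo

def create_cluster(n_vms, image_id="ami-00cernvm", flavour="m1.small"):
    """Boot n_vms identical CernVM instances on an EC2-compatible cloud."""
    region = RegionInfo(name="private-cloud",
                        endpoint="cloud.example.cern.ch")  # placeholder
    conn = boto.connect_ec2(aws_access_key_id="KEY",       # placeholder
                            aws_secret_access_key="SECRET",
                            region=region, port=8773,
                            path="/services/Cloud", is_secure=False)
    reservation = conn.run_instances(image_id, min_count=n_vms,
                                     max_count=n_vms, instance_type=flavour)
    return reservation.instances

if __name__ == "__main__":
    for vm in create_cluster(4):  # 4 VMs, as currently allocated on LXCLOUD
        print(vm.id)
```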

Slide 14: Test production reconstruction
- To run on the private cloud and on LXCLOUD
- Currently the private cloud has more resources, but LXCLOUD is the final target, so it is important to test on it
- Data can currently be processed "by hand"
- The (re)production scripts have been tested; some modifications are needed
- The output should be compared with/validated against the output of the normal LXBATCH production
- Once this is successful, more LXCLOUD resources will be requested

Slide 15: Private reference cloud cluster
- The virtual machines require a cluster of physical hosts
- A reference cloud cluster has been created:
  - Private cloud
  - Currently 24 cores
- The set-up may be replicated at other sites wishing to provide cloud/CernVM resources

Slide 16: Cloud cluster
- The virtual machines require a cluster of physical hosts
- An LXCLOUD cloud cluster has been created, provided by CERN IT
  - New service, currently "experimental"
- Currently allocated 4 virtual machines
  - May be expanded to include more VMs
  - Will push for this once the complete processing chain is ready

Slide 17: Data processing web interface
- A web interface for processing the data is to be created
- Interfaces with the bookkeeping system to extract the runs/chunks belonging to reactions
- Lists all existing raw/processed data with status (e.g. software versions used for processing)
- Easy selection of data for (re)processing with the selected OS and software version
- A virtual on-demand cluster is created
- After processing, the data are written back to Castor
- Uses the EC2 interface for cloud management, which allows great flexibility in the choice of processing site

Slide 18: Data processing scripts
- A script for submitting a reaction for processing has been created
  - Input: reaction name, software version, global key, (CernVM version)
  - Needs some "tuning" (e.g. better creation of set-up files from the global key)
  - Needs some improvement of the job description files (include SHOE formats, PSD data)
- A script for resubmitting failed jobs has been created (see the sketch below)
  - Failed jobs are identified from: non-existing/empty/small output DSPACK, SHOE or ROOT files; failed/exited/terminated chunks/events
  - After resubmitting a fixed number of times (3?), give up
  - Mostly working OK, but a small number of false positives (short runs with only 1 or 2 "empty" events)
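A minimal sketch of the failure detection and retry policy just described; the file naming, extensions and size threshold are invented for illustration, and the check on failed/exited/terminated chunks/events is not shown:

```python
import os

MAX_RETRIES = 3     # per the slide: give up after a fixed number of tries
MIN_SIZE = 1024     # bytes; below this an output file counts as "small"

def failed_chunks(chunks, outdir):
    """Chunks whose DSPACK/SHOE/ROOT output is missing, empty or small.
    Beware the false positives noted above: very short runs with only
    1-2 "empty" events legitimately produce small files."""
    bad = []
    for chunk in chunks:
        for ext in ("dspack", "shoe", "root"):
            path = os.path.join(outdir, "%s.%s" % (chunk, ext))
            if not os.path.isfile(path) or os.path.getsize(path) < MIN_SIZE:
                bad.append(chunk)
                break
    return bad

def resubmit_pass(chunks, outdir, submit, attempts):
    """One resubmission pass; attempts maps chunk -> resubmits so far."""
    for chunk in failed_chunks(chunks, outdir):
        if attempts.get(chunk, 0) >= MAX_RETRIES:
            continue                   # give up on this chunk
        attempts[chunk] = attempts.get(chunk, 0) + 1
        submit(chunk)                  # e.g. condor_submit for this chunk
```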

Slide 19: Data processing web interface & scripts
- The data processing web interface is a front-end to the data processing scripts
- Reaction list from the bookkeeping system
- Reaction run list from the bookkeeping system
- Software list from the CVMFS directory tree
- Global key list from a local database?
- The user selects data and parameters and clicks "process" (see the sketch below)
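As a rough illustration of this front-end/scripts split, the handler behind the "process" button might do little more than the following; the script names create_cluster.py and submit_reaction.py are hypothetical stand-ins for the command-line tools of slides 13 and 18:

```python
import subprocess

def process(reaction, version, global_key, n_vms=4):
    """What the "process" button would trigger: boot an on-demand
    cluster, then submit the selected reaction with the chosen
    software version and global key."""
    subprocess.check_call(["./create_cluster.py", str(n_vms)])  # hypothetical
    subprocess.check_call(["./submit_reaction.py",              # hypothetical
                           reaction, version, global_key])
```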

Slide 20: Roadmap

Task                          | Status/done                     | Remaining                    | Expected
NA61 software installation    | OK                              | Gap TPC not running          | November?
NA49 software installation    | OK                              | Data validation              | November?
Facilitate batch system       | OK                              |                              | November?
Validate outputs              | In progress                     | Rerun after fixing gap TPC   | November?
On-demand virtual cluster     | OK                              |                              |
Production reconstruction     | Cluster ready                   | Some improvements to scripts | October
Reference cloud cluster       | OK                              | Documentation                | November
Data processing web interface | Scripts for (re)processing done | Create web interface         | November

Slide 21: Next steps
- Parallel task 1:
  - Understand why the gap TPC is not running
  - Rerun the validation
- Parallel task 2:
  - Finalise the data processing scripts
  - Run large-scale processing using the scripts from the command line
  - Request a larger LXCLOUD allocation
  - Transfer to NA61
- Parallel task 3:
  - Create the web interface
  - Test the web interface

Slide 22: Input needed
- NA49 validation
- NA61 gap TPC
- Please keep virtualisation (CernVM/CVMFS) in mind when making plans...