
1 Progress on NA61/NA49 software virtualisation Dag Toppe Larsen Wrocław 05.03.2012

2 Outline
Quick reminder of CERNVM
Tasks
  Each task in detail
Roadmap
Input needed
NA61/NA49 meeting, Wrocław

3 CERNVM
CERNVM is a Linux distribution
  Designed specifically for virtual machines (VMs)
  Based on SLC (currently SLC5)
  Compressed image size ~300MB
  Both 32-bit and 64-bit versions
Additional software
  “Standard” software via Conary package manager
  Experiment software via CVMFS
Contextualisation: images adapted to experiment requirements during boot
Data preservation: all images are permanently preserved

4 CVMFS
Distributed read-only file system for CERNVM (i.e. the same as AFS for LXPLUS)
Can also be used by “real” machines (e.g. LXPLUS, grid)
Files compressed and distributed via HTTP
  Global availability
  Central server, site replication via standard HTTP proxies
Files decompressed and cached on (CERNVM) computer
  Can run without Internet access if all needed files are cached
Mainly for experimental software, but also other “static” data (e.g. calibration data)
Each experiment has a repository to store all versions of software
Common software (e.g. ROOT) available from SFT repository
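
The caching behaviour described on this slide can be illustrated with a toy read-through cache. This is only a sketch of the idea (download compressed, keep a decompressed local copy, keep working offline once everything needed is cached); it is not how CVMFS is implemented. The class name, the `fetch` callback and the file path are hypothetical, and a dictionary stands in for the HTTP server.

```python
import zlib
from typing import Callable, Dict


class ReadOnlyCache:
    """Toy read-through cache: fetch compressed bytes once, keep the decompressed copy."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch            # hypothetical download function (e.g. an HTTP GET)
        self._cache: Dict[str, bytes] = {}
        self.downloads = 0             # how many times the "network" was actually used

    def read(self, path: str) -> bytes:
        if path not in self._cache:
            compressed = self._fetch(path)
            self.downloads += 1
            self._cache[path] = zlib.decompress(compressed)
        return self._cache[path]


# Stand-in "server": a dict of compressed files instead of a real HTTP endpoint.
server = {"/sw/calib.dat": zlib.compress(b"calibration constants")}
cache = ReadOnlyCache(lambda path: server[path])

first = cache.read("/sw/calib.dat")    # downloaded and decompressed
server.clear()                         # simulate losing the Internet connection
second = cache.read("/sw/calib.dat")   # still served from the local cache
```

The second read succeeds even though the stand-in server is empty, which mirrors the "can run without Internet access if all needed files are cached" point.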

5 Data preservation
As technology evolves, it is no longer possible to run legacy software on modern platforms
Must be preserved and accessible:
  Experiment data
  Experiment software
  Operating environment (operating system, libraries, compilers, hardware)
Just preserving data and software is not enough
Virtualisation may preserve operating environment

6 CERNVM data preservation
“Solution”:
  Experiment data stored on Castor
  Experiment software versions stored on CVMFS
    HTTP “lasting” technology
  Operating environments stored as CERNVM image versions
Thus, a legacy version of CERNVM can be started as a VM, running a legacy version of experiment software
Forward-looking approach (we start preserving now)

7 Tasks
Make experiment software available
Facilitate batch processing
Validate outputs
On-demand virtual clusters
Production reconstruction
Reference cloud cluster
Data bookkeeping web interface

8 Make experiment software available
NA61/NA49 software must be available on CVMFS for CERNVM to process data
NA61
  Legacy software chain installed
    Changes to be fed back to SVN
  SHINE
    Preparing to install
    Use ROOT from SFT repository
    Conary package manager to install other dependencies
    Have to create package for XZ, currently not available
    Will there be a 64-bit version of SHINE, or will it always be 32-bit?
    Installation expected to be easier than for legacy chain
    Not “critical” until ready, but good to gain experience, and be prepared
NA49
  SLC4 development machine and repository set up
  Need expert support with actual installation

9 Facilitate batch processing
LXPLUS uses PBS batch system, CERNVM uses Condor
New scripts prepared
“Philosophical” differences
  PBS has separate script for each job
  Condor has common job description file
Installation of legacy NA61 reconstruction chain recently completed
Issues discovered, which require modifications to scripts
  But no big issues
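
The "common job description file" idea can be sketched as a small generator that writes one Condor submit description covering many jobs, instead of one script per job. This is a minimal sketch, not the scripts referred to on the slide: the function name, the executable `reconstruct.sh` and the input file names are hypothetical. The submit keywords used (`universe`, `executable`, `arguments`, `output`, `error`, `log`, `queue`) and the `$(Process)` macro are standard Condor submit-file syntax.

```python
from typing import Iterable


def condor_description(executable: str, inputs: Iterable[str]) -> str:
    """Build one Condor submit description that queues a job per input file."""
    lines = [
        "universe   = vanilla",
        f"executable = {executable}",
        "output     = job_$(Process).out",   # $(Process) is expanded by Condor per job
        "error      = job_$(Process).err",
        "log        = jobs.log",
    ]
    for name in inputs:
        # Each job only changes its arguments; everything above is shared.
        lines.append(f"arguments  = {name}")
        lines.append("queue")
    return "\n".join(lines) + "\n"


# Hypothetical executable and input names, for illustration only.
description = condor_description("reconstruct.sh", ["run1.raw", "run2.raw"])
print(description)
```

The resulting text would be saved to a file and handed to `condor_submit`; with PBS, the equivalent would be a separate job script per input.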

10 Validate outputs
Data processed on CERNVM/CVMFS have to produce same results as from LXPLUS/AFS
A larger data set should be used for this testing
As part of processing the data on CERNVM, one can automatically run ds_diff on the newly reconstructed data and the LXPLUS data copied from Castor
  “Easy” to add to Condor script
  Output from ds_diff must be checked by hand
Make sure the same versions of the reconstruction software are used
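
Running the comparison automatically and saving its output for later inspection by hand could look roughly like the wrapper below. The slides do not say how ds_diff is invoked; the sketch assumes the comparison tool takes the two file paths as positional arguments, which may not match the real tool. The function name and the report naming scheme are hypothetical, and the command is passed in as a parameter so that nothing about ds_diff itself is hard-coded.

```python
import subprocess
from pathlib import Path
from typing import Sequence, Tuple


def compare_outputs(
    diff_cmd: Sequence[str],
    new_file: Path,
    ref_file: Path,
    report_dir: Path,
) -> Tuple[int, Path]:
    """Run a comparison command on two files and store its output as a report."""
    # Assumption: the tool accepts the two paths as positional arguments.
    result = subprocess.run(
        [*diff_cmd, str(new_file), str(ref_file)],
        capture_output=True,
        text=True,
        check=False,   # a non-zero exit code is reported, not raised
    )
    report_dir.mkdir(parents=True, exist_ok=True)
    report = report_dir / f"{new_file.name}.diff.txt"
    report.write_text(result.stdout + result.stderr)
    return result.returncode, report
```

A call such as `compare_outputs(["ds_diff"], new_path, reference_path, Path("reports"))` would then leave one report file per reconstructed file, to be checked by hand as the slide requires.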

11 On-demand virtual clusters
On boot, the VMs are set up (contextualised) with the configurations and software needed by the relevant experiment
  Environment (variables, etc.)
  Version of experimental software
  Version of OS image
  Hardware configuration (e.g. RAM)
VMs can be discarded after the data is processed
A script will create a virtual cluster with head node and a suitable number of worker nodes
Cluster discarded when jobs are finished
Initially command-line script
Later controlled by data bookkeeping web interface
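
The contextualisation items and the head-node/worker layout can be captured in a small data model. This sketch only plans which VMs would be requested; it does not talk to any cloud. All class, field and node names are hypothetical, as are the example values.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Context:
    """What a VM is contextualised with at boot (items from the slide)."""
    os_image: str                    # version of OS image
    software_version: str            # version of experiment software
    ram_mb: int                      # hardware configuration, e.g. RAM
    environment: Dict[str, str] = field(default_factory=dict)


@dataclass
class Node:
    name: str
    role: str                        # "head" or "worker"
    context: Context


def plan_cluster(context: Context, workers: int) -> List[Node]:
    """Return one head node plus the requested number of worker nodes."""
    if workers < 1:
        raise ValueError("a cluster needs at least one worker node")
    nodes = [Node("head", "head", context)]
    nodes += [Node(f"worker{i:02d}", "worker", context) for i in range(1, workers + 1)]
    return nodes


# Hypothetical values, for illustration only.
ctx = Context(os_image="example-image", software_version="example-version", ram_mb=2048)
cluster = plan_cluster(ctx, workers=3)
```

Because every node shares the same `Context`, discarding the cluster after the jobs finish loses nothing that cannot be recreated from the specification.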

12 Production reconstruction
After outputs are validated, production reconstruction is the next step
Cluster of “decent” size needed
  Need to submit ~50 VMs to process a large data set
  Reference cloud too small
Need to negotiate with IT to use LXCLOUD (not-yet-public CERN cloud)
CERN already has a large number of internal virtual machines

13 Reference cloud cluster
The virtual machines require a cluster of physical hosts
A reference cloud cluster has been created
Detailed documentation will simplify the process of replicating it at other sites
Based on OpenNebula (popular cloud framework)
KVM hypervisor
Provides Amazon EC2 interface (de facto standard for cloud management)

14 Data bookkeeping web interface
A web interface for bookkeeping of the data is to be created
  List all existing data with status (e.g. software versions used for processing)
  Easy selection of data for (re)processing with selected OS and software version
    A virtual on-demand cluster is created
    After processing, data written back to Castor
Either based on existing frameworks, or on new development
Likely using EC2 interface for the cloud management
  Can allow for great flexibility of processing site
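
The selection step behind such an interface can be sketched as a filter over bookkeeping records. This is an illustration under assumed names, not a design from the slides: the record fields, the function name and the example entries are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataSet:
    name: str
    software_version: Optional[str]   # None if the data set has not been processed yet
    os_image: Optional[str]           # OS image version used for the last processing


def needs_processing(
    catalogue: List[DataSet], software_version: str, os_image: str
) -> List[DataSet]:
    """Select data sets not yet processed with the requested software and OS image."""
    wanted = (software_version, os_image)
    return [d for d in catalogue if (d.software_version, d.os_image) != wanted]


# Hypothetical catalogue entries, for illustration only.
catalogue = [
    DataSet("run-A", "v2", "image-2"),
    DataSet("run-B", "v1", "image-2"),
    DataSet("run-C", None, None),
]
selected = needs_processing(catalogue, "v2", "image-2")
```

The selected entries are what would be handed to an on-demand cluster; after processing, each record's version fields would be updated.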

15 Roadmap
Task | Status/done | Remaining | Expected
NA61 software installation | Legacy framework | SHINE | End of March?
NA49 software installation | Development machine, software repository | Software installation
Facilitate batch system | Condor job scripts | Modifications/bug fixes | March
Validate outputs | Small data set | Large data set (using batch system) | End of March
On-demand virtual cluster | Cluster creation / destroy scripts
Production reconstruction | Dependencies mostly ready | Remaining tasks, prepare for real reconstruction | April
Reference cloud cluster | Cluster working | Documentation | June/July
Data bookkeeping web interface | Initial planning | Evaluate frameworks, “First” version, “Final” version | End of October

16 Input needed
NA49 software installation
Possible SHINE issues
Possible validation issues
How to practically arrange for production reconstruction
Please keep virtualisation (CERNVM/CVMFS) in mind when making plans ...

