Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks Tools and techniques for managing virtual machine images Andreas.

Slides:



Advertisements
Similar presentations
S.Chechelnitskiy / SFU Simon Fraser Running CE and SE in a XEN virtualized environment S.Chechelnitskiy Simon Fraser University CHEP 2007 September 6 th.
Advertisements

INFSO-RI An On-Demand Dynamic Virtualization Manager Øyvind Valen-Sendstad CERN – IT/GD, ETICS Virtual Node bootstrapper.
A comparison between xen and kvm Andrea Chierici Riccardo Veraldi INFN-CNAF.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks gLite Release Process Maria Alandes Pradillo.
OpenVZ Live Migration Jim Owens. Overview Review of OpenVZ Features Resource management Installation VM creation VM management Checkpointing Migration.
INFSO-RI Enabling Grids for E-sciencE Status of LCG-2 porting Stephen Childs, Brian Coghlan and Eamonn Kenny Grid-Ireland/EGEE October.
October, Scientific Linux INFN/Trieste B.Gobbo – Compass R.Gomezel - T.Macorini - L.Strizzolo INFN - Trieste.
INFSO-RI Enabling Grids for E-sciencE The gLite Software Development Process Alberto Di Meglio CERN.
EGEE is a project funded by the European Union under contract IST Testing processes Leanne Guy Testing activity manager JRA1 All hands meeting,
EGEE is a project funded by the European Union under contract IST Build Infrastructure & Release Procedures Integration.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks C. Loomis (CNRS/LAL) M.-E. Bégin (SixSq.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks perfSONAR deployment over Spanish LHC Tier.
INFSO-RI Enabling Grids for E-sciencE The gLite Software Development Process Alberto Di Meglio EGEE – JRA1 CERN.
INFSO-RI SA1 Service Management Alberto AIMAR (CERN) ETICS 2 Final Review Brussels - 11 May 2010.
1 / 22 AliRoot and AliEn Build Integration and Testing System.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Extensions to the ETICS Build System Client.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks C. Loomis (CNRS/LAL) M.-E. Bégin (SixSq.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Middleware Deployment and Support in EGEE.
1 CloudVS: Enabling Version Control for Virtual Machines in an Open- Source Cloud under Commodity Settings Chung-Pan Tang, Tsz-Yeung Wong, Patrick P. C.
EGEE is a project funded by the European Union under contract IST JRA1-SA1 requirement gathering Maite Barroso JRA1 Integration and Testing.
Predrag Buncic (CERN/PH-SFT) WP9 - Workshop Summary
20-May-2003HEPiX Amsterdam EDG Fabric Management on Solaris G. Cancio Melia, L. Cons, Ph. Defert, I. Reguero, J. Pelegrin, P. Poznanski, C. Ungil Presented.
INFSO-RI Enabling Grids for E-sciencE Strategy for gLite multi-platform support Author:Eamonn Kenny Meeting:SA3 All Hands Meeting.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks CERN status report SA3 All Hands Meeting.
Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Usage of virtualization in gLite certification Andreas Unterkircher.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Stephen Childs Trinity College Dublin &
EGEE’06 Conference Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Testing gLite middleware: overview & status Andreas.
EVGM081 Multi-Site Virtual Cluster: A User-Oriented, Distributed Deployment and Management Mechanism for Grid Computing Environments Takahiro Hirofuchi,
Ian Gable University of Victoria 1 Deploying HEP Applications Using Xen and Globus Virtual Workspaces A. Agarwal, A. Charbonneau, R. Desmarais, R. Enge,
INFSO-RI Enabling Grids for E-sciencE Ganga 4 – The Ganga Evolution Andrew Maier.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE Site Architecture Resource Center Deployment Considerations MIMOS EGEE Tutorial.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks The future of the gLite release process Oliver.
| nectar.org.au NECTAR TRAINING Module 5 The Research Cloud Lifecycle.
2012 Objectives for CernVM. PH/SFT Technical Group Meeting CernVM/Subprojects The R&D phase of the project has finished and we continue to work as part.
INFSO-RI Enabling Grids for E-sciencE ARDA Experiment Dashboard Ricardo Rocha (ARDA – CERN) on behalf of the Dashboard Team.
Maite Barroso - 10/05/01 - n° 1 WP4 PM9 Deliverable Presentation: Interim Installation System Configuration Management Prototype
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks GLite testing status and future Gianni Pucciani.
INFSO-RI Enabling Grids for E-sciencE The gLite Software Development Process Alberto Di Meglio EGEE – JRA1 CERN.
EMI INFSO-RI EMI Quality Assurance Tools Lorenzo Dini (CERN) SA2.4 Task Leader.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks SA3 partner collaboration tasks & process.
INFSO-RI Enabling Grids for E-sciencE Using of GANGA interface for Athena applications A. Zalite / PNPI.
Grid testing using virtual machines Stephen Childs*, Brian Coghlan, David O'Callaghan, Geoff Quigley, John Walsh Department of Computer Science Trinity.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks gLite configuration (plans) Robert Harakaly.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Patch Preparation SA3 All Hands Meeting.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks MSA3.4.1 “The process document” Oliver Keeble.
EGEE-II TCD 22 nd -25 th May 2007 Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Experiences with a distributed.
INFSO-RI Enabling Grids for E-sciencE Ganga 4 Technical Overview Jakub T. Moscicki, CERN.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Implementing product teams Oliver Keeble.
Predrag Buncic (CERN/PH-SFT) CernVM Status. CERN, 24/10/ Virtualization R&D (WP9)  The aim of WP9 is to provide a complete, portable and easy.
EGEE-III INFSO-RI Enabling Grids for E-sciencE JRA1 and SA3 All Hands Meeting December 2009, CERN, Geneva Product Teams –
JRA1 Meeting – 09/02/ Software Configuration Management and Integration EGEE is proposed as a project funded by the European Union under contract.
INFSO-RI Enabling Grids for E-sciencE Software Process Author: Laurence Field (CERN) Presented by David Smith JRA1 All Hands meeting,
INFN/IGI contributions Federated Clouds Task Force F2F meeting November 24, 2011, Amsterdam.
WP5 – Infrastructure Operations Test and Production Infrastructures StratusLab kick-off meeting June 2010, Orsay, France GRNET.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks CYFRONET site report Marcin Radecki CYFRONET.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Towards an Information System Product Team.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks GOCDB4 Gilles Mathieu, RAL-STFC, UK An introduction.
A comparison between xen and kvm Andrea Chierici Riccardo Veraldi INFN-CNAF CCR 2009.
CERN Openlab Openlab II virtualization developments Havard Bjerke.
Virtualization Review and Discussion
Dag Toppe Larsen UiB/CERN CERN,
Dag Toppe Larsen UiB/CERN CERN,
Testing for patch certification
Future Test Activities SA3 All Hands Meeting Dublin
Leanne Guy EGEE JRA1 Test Team Manager
ETICS Services Management
Virtualization in the gLite Grid Middleware software process
Porting LCG to IA64 Andreas Unterkircher CERN openlab May 2004
Module 01 ETICS Overview ETICS Online Tutorials
Presentation transcript:

Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Tools and techniques for managing virtual machine images Andreas Unterkircher, Dimitar Shiyachki CERN Grid Deployment Group Havard Bjerke

Enabling Grids for E-sciencE Andreas Unterkircher Content Motivation Tools developed –VM image generation (libfsimage) –GUI for VM image generation (OSFarm) –Efficient transfer of VM images HEPiX Spring Meeting

Enabling Grids for E-sciencE Andreas Unterkircher gLite certification process GLite uses a continuous release process. Services are updated individually on top of a baseline release. Updates are added via a patch –We use Savannah for bug and patch tracking –Has one or more bugs attached –Can also be used to introduce new features or services –Has all relevant information: OS, architecture, affected services, baseline release, configuration changes, rpm lists etc. Patch is being certified –Update and configure affected services –Run basic and regression tests –Verify if attached bugs are fixed and write regression tests. –Put patch into “certified” or “rejected”. For the former the patch is ready for release to pre production and later for production. HEPiX Spring Meeting

Enabling Grids for E-sciencE Andreas Unterkircher Problems in patch certification Certification of several patches at the same time can cause conflicts. A non functional patch may spoil the whole testbed Patch certification often fails already at an early stage (rpm installation, configuration) A failed patch can pollute a machine. A complete reinstallation is necessary. Many scenarios and interactions have to be considered: gLite 3.1/gLite 3.2, SL4/SL5/Debian, x86/x86_ patches treated in 2008 HEPiX Spring Meeting

Enabling Grids for E-sciencE Andreas Unterkircher Virtualization for patch certification Patch certification can only be done efficiently by heavily using virtualization Already in summer 2006 we ran our own VM management system now known as VNode We collaborate with various groups in CERN IT –Openlab: OSFarm –FIO: Quattor profiles, SLC Xen support –Netops: hostnames for VMs HEPiX Spring Meeting

Enabling Grids for E-sciencE Andreas Unterkircher Certification process We operate a certification grid testbed offering all services. Certifiers bring up virtual machines containing the updated versions of nodes affected by a patch and connect them to the testbed. HEPiX Spring Meeting

Enabling Grids for E-sciencE Andreas Unterkircher Requirements for VM images We want to reproduce what is deployed at a site –Base OS installation + gLite node type Creating a VM images by scratching a physical machine is not practical A VM creation engine should run on an existing OS in a chroot environment VMs with different Linux flavors and architectures: SL(C)*, Debian,… x86 and x86_64 HEPiX Spring Meeting

Enabling Grids for E-sciencE Andreas Unterkircher VM image generation - libsfimage Python library for generating Linux file systems and populating them –Has command line interface –Produces images in a chroot environment Produces a tar.gz file that can be used as an image to boot with Xen Supported distributions: SL(C)3/4/5, Debian, Ubuntu, CentOS, Fedora on x86 and x86_64 Available in the xenvirt module in CERN’s CVS HEPiX Spring Meeting

Enabling Grids for E-sciencE Andreas Unterkircher Example./genfs.py –t SL4_x86_64 –d /tmp/SL4fs –o /tmp/SL4_x86_64_img1.tar.gz –g Base Core –p python openldap-client –w rootpwd HEPiX Spring Meeting

Enabling Grids for E-sciencE Andreas Unterkircher OSFarm HEPiX Spring Meeting Uses libsfimage for creating core images GUI for configuring and adding software to images Optimizes the image creation process by caching shared data

Enabling Grids for E-sciencE Andreas Unterkircher OSFarm HEPiX Spring Meeting Images and their configuration (XML) are stored in the repository for later retrieval Each request is checksummed and compared to existing configurations

Enabling Grids for E-sciencE Andreas Unterkircher Layered image generation Image core: minimal VM image –can be shared between VM image configurations Image base: contains software critical for virtual appliances –Can be shared between virtual appliances Image generation process is optimized by caching and sharing lower stages of an image HEPiX Spring Meeting

Enabling Grids for E-sciencE Andreas Unterkircher Copy-on-write staging HEPiX Spring Meeting layer request Request inferior layer snapshot apply packages layer exists? return layer Base stages are kept in stage Uses LVM snapshots (copy-on-write) for instantaneous staging Tag of a cached image is the checksum of its configuration parameters yes no

Enabling Grids for E-sciencE Andreas Unterkircher Using images on the Grid Many different user communities use the Grid. It is difficult for the Worker Nodes to fulfill all the different requirements. Provide each job on the Grid a virtual machine with a dedicated OS setup With thousands of users image transfer becomes an issue Observation –Images are often similar Content-Based transfer: don’t transfer the whole image, just transfer the difference to what is already available HEPiX Spring Meeting

Enabling Grids for E-sciencE Andreas Unterkircher Content-Based transfer HEPiX Spring Meeting libXlibY libXlibZ binX libY libX Each file starts on a block boundary Identical blocks can be identified with a hash checksum Only 50% of image A needs to be transferred if image B is already at the destination Image A Image B

Enabling Grids for E-sciencE Andreas Unterkircher Image comparisons High commonality can be expected in practice –Different Update versions of same base OS –Similar applications needed by users from same research area SL 3 (343 MB) and SL 4 (762 MB) –Transfer SLC4 to SLC3  22 % common blocks –Transfer SLC3 to SLC4  48 % common blocks HEPiX Spring Meeting

Enabling Grids for E-sciencE Andreas Unterkircher Cost Generating hash tables for source file and target repository – linear cost Accessing hash tables Hash table data overhead –Depends on  Hash function (e.g. SHA is 20 bytes)  Block size –Measured 0.5 – 2.0 % of the image size HEPiX Spring Meeting

Enabling Grids for E-sciencE Andreas Unterkircher Implementation HEPiX Spring Meeting Source machine Hash produce threa d Hash dispatch Target image Block request dispatch Block dispatch Image repository Prototype implementation done By H. Bjerke at openlab Multi-threaded Implemented in Java

Enabling Grids for E-sciencE Andreas Unterkircher Measurements HEPiX Spring Meeting Disk read speed: 35.6 MB/s Network bandwidth: 11.9 MB/s Delta = difference between source and target

Enabling Grids for E-sciencE Andreas Unterkircher DESY Virtualization Workshop 20 Discussion