CERN IT Department CH-1211 Genève 23 Switzerland The CERN internal Cloud Sebastien Goasguen, Belmiro Rodrigues Moreira, Ewan Roche, Ulrich.

Slides:



Advertisements
Similar presentations
Profit from the cloud TM Parallels Dynamic Infrastructure AndOpenStack.
Advertisements

CERN IT Department CH-1211 Genève 23 Switzerland t CERN-IT Plans on Virtualization Ian Bird On behalf of IT WLCG Workshop, 9 th July 2010.
INFSO-RI An On-Demand Dynamic Virtualization Manager Øyvind Valen-Sendstad CERN – IT/GD, ETICS Virtual Node bootstrapper.
European Organization for Nuclear Research Virtualization Review and Discussion Omer Khalid 17 th June 2010.
Virtualization for Cloud Computing
Platform & Engineering Services CERN IT Department CH-1211 Geneva 23 Switzerland t PES Ewan Roche, Ulrich Schwickerath, Manuel Guijarro,
Tier-1 experience with provisioning virtualised worker nodes on demand Andrew Lahiff, Ian Collier STFC Rutherford Appleton Laboratory, Harwell Oxford,
CERN IT Department CH-1211 Genève 23 Switzerland t Next generation of virtual infrastructure with Hyper-V Michal Kwiatek, Juraj Sucik, Rafal.
CERN IT Department CH-1211 Genève 23 Switzerland t Integrating Lemon Monitoring and Alarming System with the new CERN Agile Infrastructure.
Deploying Moodle with Red Hat Enterprise Virtualization Brian McSpadden Director of Network Operations Remote-Learner.net.
Copyright © 2010 Platform Computing Corporation. All Rights Reserved.1 The CERN Cloud Computing Project William Lu, Ph.D. Platform Computing.
 Cloud computing  Workflow  Workflow lifecycle  Workflow design  Workflow tools : xcp, eucalyptus, open nebula.
+ CS 325: CS Hardware and Software Organization and Architecture Cloud Architectures.
Presented by: Sanketh Beerabbi University of Central Florida COP Cloud Computing.
1 Evolution of OSG to support virtualization and multi-core applications (Perspective of a Condor Guy) Dan Bradley University of Wisconsin Workshop on.
Eucalyptus 3 (&3.1). Eucalyptus 3 Product Overview – Govind Rangasamy.
CERN IT Department CH-1211 Genève 23 Switzerland t Tier0 Status - 1 Tier0 Status Tony Cass LCG-LHCC Referees Meeting 18 th November 2008.
Jose Castro Leon CERN – IT/OIS CERN Agile Infrastructure Infrastructure as a Service.
Virtualised Worker Nodes Where are we? What next? Tony Cass GDB /12/12.
CERN IT Department CH-1211 Genève 23 Switzerland t CERN-IT Update Ian Bird On behalf of IT Multi-core and Virtualisation Workshop,
Ian Gable HEPiX Spring 2009, Umeå 1 VM CPU Benchmarking the HEPiX Way Manfred Alef, Ian Gable FZK Karlsruhe University of Victoria May 28, 2009.
Commissioning the CERN IT Agile Infrastructure with experiment workloads Ramón Medrano Llamas IT-SDC-OL
Image Distribution and VMIC (brainstorm) Belmiro Moreira CERN IT-PES-PS.
CERN IT Department CH-1211 Genève 23 Switzerland t Migration from ELFMs to Agile Infrastructure CERN, IT Department.
CERN - IT Department CH-1211 Genève 23 Switzerland t Operating systems and Information Services OIS Proposed Drupal Service Definition IT-OIS.
CERN IT Department CH-1211 Genève 23 Switzerland t Next generation of virtual infrastructure with Hyper-V Juraj Sucik, Michal Kwiatek, Rafal.
1 Cloud Services Requirements and Challenges of Large International User Groups Laurence Field IT/SDC 2/12/2014.
CERN IT Department CH-1211 Genève 23 Switzerland t SL(C) 5 Migration at CERN CHEP 2009, Prague Ulrich SCHWICKERATH Ricardo SILVA CERN, IT-FIO-FS.
HEPiX Virtualisation Working Group Status, February 10 th 2010 April 21 st 2010 May 12 th 2010.
Platform & Engineering Services CERN IT Department CH-1211 Geneva 23 Switzerland t PES Improving resilience of T0 grid services Manuel Guijarro.
Platform & Engineering Services CERN IT Department CH-1211 Geneva 23 Switzerland t PES Agile Infrastructure Project Overview : Status and.
Farming Andrea Chierici CNAF Review Current situation.
Trusted Virtual Machine Images the HEPiX Point of View Tony Cass October 21 st 2011.
WP5 – Infrastructure Operations Test and Production Infrastructures StratusLab kick-off meeting June 2010, Orsay, France GRNET.
The HEPiX Virtualisation Working Group Towards a Grid of Clouds Tony Cass CHEP 2012 May 24 th 2012.
SEMINAR ON.  OVERVIEW -  What is Cloud Computing???  Amazon Elastic Cloud Computing (Amazon EC2)  Amazon EC2 Core Concept  How to use Amazon EC2.
HEPiX Virtualisation working group Andrea Chierici INFN-CNAF Workshop CCR 2010.
CH-1211 Genève 23 Job efficiencies at CERN Review of job efficiencies at CERN status report James Casey, Daniel Rodrigues, Ulrich Schwickerath.
CERN IT Department CH-1211 Genève 23 Switzerland PES Virtualization at CERN – a status report Virtualization at CERN – status report Sebastien.
CERN Openlab Openlab II virtualization developments Havard Bjerke.
CERN IT Department CH-1211 Genève 23 Switzerland Batch virtualization project at CERN The batch virtualization project at CERN Tony Cass,
SUSE Linux Enterprise Server for SAP Applications
A move towards Greener Planet
WLCG IPv6 deployment strategy
Cloud Technology and the NGS Steve Thorn Edinburgh University (Matteo Turilli, Oxford University)‏ Presented by David Fergusson.
Use of HLT farm and Clouds in ALICE
Resource Provisioning Services Introduction and Plans
IT Services Katarzyna Dziedziniewicz-Wojcik IT-DB.
Cloud Challenges C. Loomis (CNRS/LAL) EGI-TF (Amsterdam)
Virtualisation for NA49/NA61
Blueprint of Persistent Infrastructure as a Service
Dag Toppe Larsen UiB/CERN CERN,
Progress on NA61/NA49 software virtualisation Dag Toppe Larsen Wrocław
Dag Toppe Larsen UiB/CERN CERN,
ATLAS Cloud Operations
StratusLab Final Periodic Review
StratusLab Final Periodic Review
BDII Performance Tests
Virtualisation for NA49/NA61
Статус ГРИД-кластера ИЯФ СО РАН.
PES Lessons learned from large scale LSF scalability tests
CPU accounting of public cloud resources
Virtualization in the gLite Grid Middleware software process
FCT Follow-up Meeting 31 March, 2017 Fernando Meireles
CernVM Status Report Predrag Buncic (CERN/PH-SFT).
WLCG Collaboration Workshop;
OS Virtualization.
Managing Clouds with VMM
Virtualization Meetup Discussion
Cloud Computing: Concepts
Presentation transcript:

CERN IT Department CH-1211 Genève 23 Switzerland The CERN internal Cloud Sebastien Goasguen, Belmiro Rodrigues Moreira, Ewan Roche, Ulrich Schwickerath, Romain Wartel HEPIX2010, Cornell university See also related presentations: Related presentations by Tony Cass, Romain Wartel, and myself Virtualization at CERN, HEPiX spring meeting 2010 Batch virtualization at CERN, HEPIX autumn meeting 2009 Virtualization, HEPIX spring meeting 2009 Virtualization vision, GDB 9/9/2009 and HEPIX Batch virtualization at CERN, EGEE09 conference, Barcelona, The CERN internal Cloud status report PES

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 2 Outline IaaS for CERN Principles Components Applications Principles Components Project status Summary

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 3 Principles CERNs internal “cloud” prototype … if we can call it a cloud

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 4 Principles Requirements: the approach must be easy to manage smoothly integrate with the existing infrastructure be scalable be flexible for extensions

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 5 Design decisions: setup of an internal infrastructure for virtualization With the potential to become a public cloud The initial deployment foresees that we deliver IaaS internally to IT service managers NOT (yet?) to end users the first and for now only customer is the batch service Principles

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 6 The hypervisor cluster (lxcloud): Supports XEN and KVM (XEN being dropped now) Is centrally managed with a unique software setup Provides the infrastructure required for IaaS Runs the latest OS version to fully support the hardware Components

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 7 The local image cache: Is a central place holding all approved images Is designed in collaboration with the HEPiX WG Components See the presentation of Romain Wartel on VMIC Internal image distribution Caching of images on the hypervisors Transfer using bit-torrent

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 8 The image provisioning system Receives requests Selects a hypervisor Provisions the machine Controls and monitors it during its lifetime Components

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 9 Components

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 10 Components ISF: InfraStructure sharing facility, Platform computing pVMO (physical virtual machine orchestrator) RIA (resource interface adapter) Native support modules (eg. for VmWare) AC (adaptive cluster) (requires LSF 8.0) ONE: OpenNebula Modular OpenSource solution Now also professional support Cloud interfaces

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 11 ISF and ONE relationship Hypervisor cluster (physical resources) OpenNebula Platform Infrastructure Sharing Facility (ISF) For high level VM management pVMO VM provisioning system(s) pVMO and ONE do approximately the same thing

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 12 Planning: ISF or ONE or both ? ISF version 2.0 preview

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 13 The batch application design Batch resources as application Or: virtualization of batch resources

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 14 The batch application design Requirements transparent to the end users possible to mix virtual machines and physical worker nodes possible to start small and grow scalable able to solve the operational issues

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 15 A Golden node: Is a centrally managed virtual machine Is a clone of physical machines running the same services Does not execute jobs by itself Serves as source for virtual machine creation Application components

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 16 Adding automation and elasticity(*) Limit the active life time of the virtual machines (to 24h) Automatically Monitor the VM population and restart nodes as needed Application components (*) Batch service managers point of view

CERN IT Department CH-1211 Genève 23 Switzerland 17 ISC-cloud 2010 – Frankfurt / Main How to manage intrusive interventions in a fragmented setup ? Draining, runs until job finishes Accepts jobs for 24 H LONG JOB VM Slot #1 Draining, runs until job finishes Accepts jobs for 24 H SHORT JOB VM Slot #2 Draining, runs until job finishesAccepts jobs for 24 H LONG JOB Image AImage B Draining, runs until job finishes Accepts jobs for 24 H SHORT JOB Notes: NEW virtual machines always start with the latest image Image A and Image B can correspond to different OS versions Application components

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 18 Feasibility studies The question is: will it work at the required scale ? Or: will it ever fly ?

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 19 Feasibility studies General infrastructure (eg. ldap) Networking: number of IPs Virtual machine provisioning systems Image distribution efficiency Application level: Virtual batch nodes Batch system scalability → dedicated presentation Scalability at all levels

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 20 Feasibility studies Virtual machine provisioning system tests Time Nodes seen in LSF Peak of ~16,000 VMs

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 21 Feasibility studies OpenNebula: Up to 16,000 VMs started, all available slots filled ISF/pVMO: Up to ~10,000 VMs started Limitation is understood and no longer present

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 22 Feasibility studies Image Distribution SCP wave: 38 min Rtorrent: 17 min See Romain's presentation

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 23 Feasibility studies Currently we have: 12 KVM hypervisors ready for production hypervisors with private IPs And 8 VM slots with public IPs 2 KVM hypervisors deployed into production 16 KVM VMs running as public batch nodes for ~4 weeks Virtual batch nodes with real userjobs

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 24 New feature which is: not seen on physical worker nodes caused by network monitoring tools used at CERN under investigation Virtual machines in public batch

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 25 Changes since last HEPiX Dropped support for SLC4 Favor KVM over XEN KVM is the emerging technology XEN support for in RHES/SL6 is questionable CPU usage monitoring works out of the box with KVM 16 VMs

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 26 KVM in SLC5, impressions Fairly stable Some performance issues CPU performance O(10%) low I/O performance even more Requires tuning, and eventually a newer version KVM processes stay in S mode Utilization turns up as system CPU, not user CPU

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 27 Impementation status

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 28 Implementation status

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 29 Plans and remaining technical issues: VM CPU accounting / CPU factors ISF adaptive cluster may come late Still some development needed for VM renewal Performance considerations For scaling up: Krb5 authentication for root access may not scale May need our own lemon instance ? LSF doesn't scale up to 15k nodes, redesign is needed

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 30 Summary Quite some progress during summer 2010 Only one application so far (lxbatch) 16 VMs under test with real payload Up to 96 by Christmas Scale up Expected for 2011 Depends on experiences

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 31 ? Any questions

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 32 Benchmarking CPU benchmarks: HS06 I/O : iozone No tunings applied Up to 17% penalty in CPU with current KVM Same order or worse for IOzone Needs further investigation and tuning Tunings: kvm kernel module options, CPU affinity

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 33 Benchmarking HS06 KVM, no tuning applied HS06 rating # CPUs / VM Bare metal: 8 Cores, 8 processes: ~96 HS06, so 17% performance penalty

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 34 Benchmarking HS06 rating # VMs/hypervisor Bare metal: ~12 HS06/core, so 12% performance penalty HS06 KVM no tuning

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 35 I/O Benchmarking Iozone read performance VM Bare metal KV M

CERN IT Department CH-1211 Genève 23 Switzerland The internal cloud infrastructure at CERN - a status report - 36 I/O Benchmarking Iozone write performance VM Bare metal KV M