Deployment work at CERN: installation and configuration tasks WP4 workshop Barcelona project conference 5/03 German Cancio CERN IT/FIO.

Slides:



Advertisements
Similar presentations
German Cancio – WP4 developments Partner Logo WP4-install plans WP6 meeting, Paris project conference
Advertisements

WSUS Presented by: Nada Abdullah Ahmed.
Objektorienteret Middleware Presentation 2: Distributed Systems – A brush up, and relations to Middleware, Heterogeneity & Transparency.
ASIS et le projet EU DataGrid (EDG) Germán Cancio IT/FIO.
Distributed components
Copyright 2009 FUJITSU TECHNOLOGY SOLUTIONS PRIMERGY Servers and Windows Server® 2008 R2 Benefit from an efficient, high performance and flexible platform.
Current Status of Fabric Management at CERN, 26/7/2004 Current Status of Fabric Management at CERN CHEP 2004 Interlaken, 27/9/2004 CERN IT/FIO: G. Cancio,
How Clients and Servers Work Together. Objectives Learn about the interaction of clients and servers Explore the features and functions of Web servers.
.NET Mobile Application Development Introduction to Mobile and Distributed Applications.
Automating Linux Installations at CERN G. Cancio, L. Cons, P. Defert, M. Olive, I. Reguero, C. Rossi IT/PDP, CERN presented by G. Cancio.
Understanding and Managing WebSphere V5
Automatic Software Testing Tool for Computer Networks ARD Presentation Adi Shachar Yaniv Cohen Dudi Patimer
Hands-On Microsoft Windows Server 2008 Chapter 1 Introduction to Windows Server 2008.
Hands-On Microsoft Windows Server 2008 Chapter 1 Introduction to Windows Server 2008.
WP4-install task report WP4 workshop Barcelona project conference 5/03 German Cancio.
Technology Overview. Agenda What’s New and Better in Windows Server 2003? Why Upgrade to Windows Server 2003 ?  From Windows NT 4.0  From Windows 2000.
ELFms meeting, 2/3/04 German Cancio, 2/3/04 Proxy servers in CERN-CC.
DataGrid is a project funded by the European Commission under contract IST IT Post-C5, Managing Computer Centre machines with Quattor.
1 Linux in the Computer Center at CERN Zeuthen Thorsten Kleinwort CERN-IT.
Olof Bärring – WP4 summary- 6/3/ n° 1 Partner Logo WP4 report Status, issues and plans
Large Computer Centres Tony Cass Leader, Fabric Infrastructure & Operations Group Information Technology Department 14 th January and medium.
EDG WP4: installation task LSCCW/HEPiX hands-on, NIKHEF 5/03 German Cancio CERN IT/FIO
Olof Bärring – WP4 summary- 4/9/ n° 1 Partner Logo WP4 report Plans for testbed 2
Partner Logo German Cancio – WP4-install LCFG HOW-TO - n° 1 LCFGng configuration examples Updated 10/2002
F. Rademakers - CERN/EPLinux Certification - FOCUS Linux Certification Fons Rademakers.
1 The new Fabric Management Tools in Production at CERN Thorsten Kleinwort for CERN IT/FIO HEPiX Autumn 2003 Triumf Vancouver Monday, October 20, 2003.
Quick Introduction to NorduGrid Oxana Smirnova 4 th Nordic LHC Workshop November 23, 2001, Stockholm.
Quattor-for-Castor Jan van Eldik Sept 7, Outline Overview of CERN –Central bits CDB template structure SWREP –Local bits Updating profiles.
German Cancio – WP4 developments Partner Logo System Management: Node Configuration & Software Package Management
SONIC-3: Creating Large Scale Installations & Deployments Andrew S. Neumann Principal Engineer, Progress Sonic.
Large Farm 'Real Life Problems' and their Solutions Thorsten Kleinwort CERN IT/FIO HEPiX II/2004 BNL.
20-May-2003HEPiX Amsterdam EDG Fabric Management on Solaris G. Cancio Melia, L. Cons, Ph. Defert, I. Reguero, J. Pelegrin, P. Poznanski, C. Ungil Presented.
G. Cancio, L. Cons, Ph. Defert - n°1 October 2002 Software Packages Management System for the EU DataGrid G. Cancio Melia, L. Cons, Ph. Defert. CERN/IT.
Maite Barroso – WP4 Barcelona – 13/05/ n° 1 -WP4 Barcelona- Closure Maite Barroso 13/05/2003
Lemon Monitoring Miroslav Siket, German Cancio, David Front, Maciej Stepniewski CERN-IT/FIO-FS LCG Operations Workshop Bologna, May 2005.
Installing, running, and maintaining large Linux Clusters at CERN Thorsten Kleinwort CERN-IT/FIO CHEP
Distributed database system
SPMA & SWRep: Basic exercises HEPiX hands-on, NIKHEF 5/03 German Cancio
Software Management with Quattor German Cancio CERN/IT.
Olof Bärring – WP4 summary- 4/9/ n° 1 Partner Logo WP4 report Plans for testbed 2 [Including slides prepared by Lex Holt.]
11 CLUSTERING AND AVAILABILITY Chapter 11. Chapter 11: CLUSTERING AND AVAILABILITY2 OVERVIEW  Describe the clustering capabilities of Microsoft Windows.
Managing the CERN LHC Tier0/Tier1 centre Status and Plans March 27 th 2003 CERN.ch.
SONIC-3: Creating Large Scale Installations & Deployments Andrew S. Neumann Principal Engineer Progress Sonic.
German Cancio – WP4 developments Partner Logo WP4 / ATF ATF meeting, 9/4/2002
Cluster Configuration Update Including LSF Status Thorsten Kleinwort for CERN IT/PDP-IS HEPiX I/2001 LAL Orsay Tuesday, December 08, 2015.
6/23/2005 R. GARDNER OSG Baseline Services 1 OSG Baseline Services In my talk I’d like to discuss two questions:  What capabilities are we aiming for.
Fabric Management with ELFms BARC-CERN collaboration meeting B.A.R.C. Mumbai 28/10/05 Presented by G. Cancio – CERN/IT.
LCG LCG-1 Deployment and usage experience Lev Shamardin SINP MSU, Moscow
German Cancio – WP4 developments Partner Logo WP4-install progress CERN, 19/6/2002 for WP4-install.
Maite Barroso - 10/05/01 - n° 1 WP4 PM9 Deliverable Presentation: Interim Installation System Configuration Management Prototype
ASIS + RPM: ASISwsmp German Cancio, Lionel Cons, Philippe Defert, Andras Nagy CERN/IT Presented by Alan Lovell.
LCG CERN David Foster LCG WP4 Meeting 20 th June 2002 LCG Project Status WP4 Meeting Presentation David Foster IT/LCG 20 June 2002.
Distributed Logging Facility Castor External Operation Workshop, CERN, November 14th 2006 Dennis Waldron CERN / IT.
David Foster LCG Project 12-March-02 Fabric Automation The Challenge of LHC Scale Fabrics LHC Computing Grid Workshop David Foster 12 th March 2002.
15-Feb-02Steve Traylen, RAL WP6 Test Bed Report1 RAL/UK WP6 Test Bed Report Steve Traylen, WP6 PPGRID/RAL, UK
Quattor tutorial Introduction German Cancio, Rafael Garcia, Cal Loomis.
Next Generation of Apache Hadoop MapReduce Owen
Platform & Engineering Services CERN IT Department CH-1211 Geneva 23 Switzerland t PES Agile Infrastructure Project Overview : Status and.
Scientific Linux Inventory Project (SLIP) Troy Dawson Connie Sieh.
Fabric Management: Progress and Plans PEB Tim Smith IT/FIO.
Managing Large Linux Farms at CERN OpenLab: Fabric Management Workshop Tim Smith CERN/IT.
Quattor: An administration toolkit for optimizing resources Marco Emilio Poleggi - CERN/INFN-CNAF German Cancio - CERN
Status of Fabric Management at CERN
Germán Cancio CERN IT/FIO LCG workshop, 24/3/04
WP4-install status update
Status and plans of central CERN Linux facilities
German Cancio CERN IT .quattro architecture German Cancio CERN IT.
Distributed computing deals with hardware
Saranya Sriram Developer Evangelist | Microsoft
The Problem ~6,000 PCs Another ~1,000 boxes But! Affected by:
Presentation transcript:

Deployment work at CERN: installation and configuration tasks WP4 workshop Barcelona project conference 5/03 German Cancio CERN IT/FIO

WP4 deployment at CERN CC / - n° 2 Agenda u Why, what, when u CERN enhancements u Issues u Summary

WP4 deployment at CERN CC / - n° 3 Why? u CERN Computer Centre is being scaled up in capacity for handling LHC capacity requirements n O(10K) nodes expected in 2006/7 n Currently, ~ 1.5K n Concentrating on Linux on Intel Hardware u Current management software in production was written (many!) years ago, for a reduced number of heterogeneous RISC UNIX nodes n No central configuration mechanism (>20 databases holding configuration information!) n Insufficient automatization (manual steps required for node installation and configuration) n Scalability issues: eg. software distribution n Outdated technology obsoleted by upcoming de-facto standards (eg. RPM) u CERN’s aim participating in WP4 was to find solutions to this problem!

WP4 deployment at CERN CC / - n° 4 What? u Central Configuration Database n CDB set up for hosting configuration profiles of the main CERN interactive and batch services (LXPLUS and LXBATCH) n Templates created for hardware information (currently: new SEIL nodes): n System information including partition layout n Software packages configuration structured in templates (RH distribution, CERN CC packages, EDG packages) u Software Distribution n SWRep: hosting RPM packages n SPMA: Replaces legacy software update mechanisms (ASIS, rpmupdate, SUE security) u Node configuration n CCM: access to CDB information via NVA API n NCM: will replace the CERN ‘SUE’ node management framework

WP4 deployment at CERN CC / - n° 5 When? u SPM (SPMA and SWRep) n Started pilot for phasing out the CERN SW distribution systems on batch worker nodes for LCG-1 n Using HTTP as package access protocol (scalability) n > 100 nodes currently running it in production n Deployment page: u CDB n Database in production status, holding site-wide, cluster and node specific configurations for ~ 350 clients n Using WP4 Global schema with site specific extensions n Enhancements: GUI (next slide) u CCM n About to be deployed u NCM n As soon as it gets available with components

WP4 deployment at CERN CC / - n° 6 CERN Enhancements (I) u GUI for CDB n PanGUIn: graphical front end for browsing, modifying, validating and committing CDB templates n Generic development (no site dependencies): Can be feed back into EDG u Configuration access shortcuts and migration n CCM provides a ‘low-level’ API for traversing the node configuration profile tree n CCConfig: ‘high-level’ API giving access for recurring information chunks (partitions, package lists, network information, cluster name, etc) n CCConfig allows to access CDB info from the current node configuration system (SUE) u Server clustering solution n For CDB (XML profiles) and SWRep (RPM’s over HTTP) n Replication done with rsync n Load balancing done with simple DNS round-robin n Currently, 3 servers in production (800 MHz, 500MB RAM, FastEthernet)

WP4 deployment at CERN CC / - n° 7

WP4 deployment at CERN CC / - n° 8 CERN enhancements (II) u Release life cycle in CDB n A high-level configuration change may work on 99.9% of the nodes but break on some n Even if CDB keeps history information and is able to roll back, it cannot easily keep multiple configurations in parallel n Solution: keep ‘test’, ‘new’, ‘pro’ and ‘old’ templates simultaneously in CDB; and migration scripts which perform transitions from ‘test’->’new’->’pro’- >’old’. n Most node configs point to ‘pro’; test nodes use ‘test’, certification nodes use ‘new’ n If a node fails with a new ‘pro’ configuration, it is pointed to use ‘old’ while understanding the problem, instead of rolling back the complete CDB

WP4 deployment at CERN CC / - n° 9 CERN enhancements (III) test2newnew2pro test new pro old eg. lxplus001,32,33 lxplus rest of lxplusxxx Nodes (object templates) Template Maintainers old newtest old. test new pro eg. lxplus001,32,33 lxplus rest of lxplusxxx pro

WP4 deployment at CERN CC / - n° 10 Issues (I) u Understand the current system n Identify current dependencies (on shared file systems, legacy services) n We cannot start over from scratch! u Migration considerations n Production environment -> minimal downtimes! n Large computer centre n WP4 not a drop-in, turn-key solution, needs integration with existing environment n Produce ‘glue’ procedures and scripts u Scalability and Performance on large clusters n SWRep: How many servers do we need for n nodes? n CDB: how long does it take to generate O(100), O(1000) configuration profiles? What are the CPU requirements?

WP4 deployment at CERN CC / - n° 11 Issues (II) u Careful planning and execution required n Find appropriate time slots for phase-ins and phase-outs (one at a time!) n Announcements to users when non-transparent changes required n update operation manuals for site administrators and technicians n Problems and failures have much higher impact than on EDG testbeds! u Not all R3 functionality available yet n … some new procedures are harder to follow than old ones! u Site wide deployment (outside computer centre) n Until now, same tools used for farms, servers and desktops n Desktop support kept in mind during WP4 design, but needs additional work u WP4 deployment timescale will exceed the lifetime of EDG n Support and bugfixes n Future developments n Port to new platforms (RH9, IA64, Solaris…)

WP4 deployment at CERN CC / - n° 12 Summary u WP4 software is becoming a key part of the CERN computer centre management u CERN CC has very high requirements on stability and reliability n WP4 is coping with it! (at least up to now..) u Most valuable feedback for EDG, both in terms of testing and functionality enhancements