EDG WP4: installation task LSCCW/HEPiX hands-on, NIKHEF 5/03 German Cancio CERN IT/FIO



HEPiX hands-on / Installation Task / German Cancio CERN

Agenda
Part 1:
- General architectural overview
- Components description and current status
Part 2:
- Exercises on software distribution
Part 3:
- Discussion: differences to other solutions (if time permits)

Disclaimer
- This is not a repetition of the WP4 LCFGng tutorial given last year at CERN. I will describe the proposed replacement for LCFG, developed by EDG WP4-install.
- This is a work in progress. Most of the subsystems presented here are currently under design/development, although some have already been deployed at CERN.
- There are fewer practical exercises than theory slides ;-(
- Your feedback is a most welcome source for improvements!

EDG WP4: reminder
- WP4 is the ‘fabric management’ work package of the EU DataGrid project.
- Objective:
  - To develop system management tools for enabling the deployment of very large computing fabrics […] with reduced sysadmin and operation costs.
- Installation task: solutions for
  - automated from-scratch node installation
  - node configuration/reconfiguration
  - software storage, distribution and installation
- Configuration task: solutions for
  - storing, maintaining and retrieving configuration information.

WP4-install architecture
Subsystems:
- Base Installation:
  - AII (Automated Installation Infrastructure)
- Node Configuration:
  - NCM (Node Configuration Manager)
- Software Distribution:
  - Software Repository (SWRep)
  - Software Package Management Agent (SPMA)

WP4-install arch
[Diagram labels: CDB; client nodes (CCM, Cdispd, NCM components, registration/notification, SPMA, SPMA.cfg, cache); SWRep servers (packages in RPM/PKG, Mgmt API, ACLs; access via nfs, http, ftp); installation server (DHCP handling, PXE handling, KS/JS generator, Mgmt API, ACLs; node install / node (re)install)]

Automated Installation Infrastructure
- DHCP and Kickstart (or JumpStart) are regenerated according to CDB contents
- PXE can be set to reboot or reinstall by operator

Software Repository
- Packages (in RPM or PKG format) can be uploaded into multiple Software Repositories
- Client access is using HTTP, NFS/AFS or FTP
- Management access subject to authentication/authorization

Node Configuration Manager (NCM)
- Configuration management on the node is done by NCM components
- Each component is responsible for configuring a service (network, NFS, sendmail, PBS)
- Components are notified by the Cdispd whenever there was a change in their configuration

Software Package Mgmt Agent (SPMA)
- SPMA manages the installed packages
- Runs on Linux (RPM) or Solaris (PKG)
- SPMA configuration done via an NCM component
- Can use a local cache for pre-fetching packages (simultaneous upgrades of large farms)

Base installation (AII)

AII (Automated Installation Infrastructure)
- Subsystem to automate the node base installation via the network
- Layer on top of existing technologies (base system installer, DHCP, PXE)
- Modules:
  - AII-dhcp: manages the DHCP server for network installation information
  - AII-nbp (network bootstrap program): manages the PXE configuration for each node (boot from HD / start the installation via network)
  - AII-osinstall: manages the OS configuration files required by the OS installation procedure (KickStart, JumpStart)
- More details in AII design document:

AII: current status
- Architectural design finished
- Detailed design and implementation progressing
- First alpha version expected mid-July

Node Configuration (NCM)

Node Configuration Management (NCM)
- Client software running on the node which takes care of “implementing” what is in the configuration profile
- Modules:
  - “Components”
  - Invocation and notification framework
  - Component support libraries

NCM: Components
- “Components” (like SUE “features” or LCFG “objects”) are responsible for updating local config files, and notifying services if needed
- Components register their interest in configuration entries or subtrees, and get invoked in case of changes
- Components only configure the system. Usually, this implies regenerating and/or updating local config files (e.g. /etc/sshd_config)
- Use standard system facilities (SysV scripts) for managing services
  - Components can notify services using SysV scripts when their configuration changes.
- Possible to define configuration dependencies between components
  - E.g. configure network before sendmail
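The register-and-get-invoked idea can be sketched as follows. This is only an illustration in Python, not the NCM implementation: the `Dispatcher` class, its method names and the configuration paths are hypothetical and chosen for the example.

```python
from collections import defaultdict


class Dispatcher:
    """Hypothetical stand-in for a notification framework (not the real cdispd)."""

    def __init__(self):
        # Maps a configuration subtree prefix to the components interested in it.
        self._interest = defaultdict(list)

    def register(self, component, subtree):
        """Record that `component` wants to hear about changes under `subtree`."""
        self._interest[subtree.rstrip("/")].append(component)

    def affected(self, changed_paths):
        """Return the components whose registered subtree contains a changed path."""
        hit = []
        for subtree, components in self._interest.items():
            for path in changed_paths:
                if path == subtree or path.startswith(subtree + "/"):
                    for component in components:
                        if component not in hit:
                            hit.append(component)
                    break  # one matching path is enough for this subtree
        return hit


dispatcher = Dispatcher()
# Illustrative (assumed) configuration paths.
dispatcher.register("network", "/system/network")
dispatcher.register("sendmail", "/software/components/sendmail")
dispatcher.register("sshd", "/software/components/sshd")

# Only components registered for a changed subtree are selected for invocation.
to_run = dispatcher.affected(["/system/network/hostname"])
print(to_run)
```

Matching on whole path segments (the `subtree + "/"` check) keeps a change under a similarly named sibling entry from triggering the wrong component.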

Component example

    sub Configure {
        my ($self) = @_;

        # access configuration information
        my $config = NVA::Config->new();
        my $arch = $config->getValue('/system/architecture');   # NVA API
        $self->Fail("not supported") unless ($arch eq 'i386');

        # (re)generate and/or update local config file(s)
        open(my $myconfig, '<', '/etc/myconfig');
        …

        # notify affected (SysV) services if required
        if ($changed) {
            system('/sbin/service myservice reload');
            …
        }
    }

NCM (contd.)
- cdispd (Configuration Dispatch Daemon):
  - Monitors the config profile, and invokes components via the ncd if there were changes
- ncd (Node Configuration Deployer):
  - Framework and front-end for executing components (via cron, cdispd, or manually)
  - Dependency ordering of components
- Component support libraries:
  - For recurring system mgmt tasks (interfaces to system services, sysinfo), log handling, etc.
- More details in NCM design document
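Dependency ordering of components amounts to a topological sort. A minimal Python sketch, assuming dependencies are declared as "component -> components that must run first"; the declaration format and the component names are illustrative, not the ncd's actual interface:

```python
from graphlib import TopologicalSorter

# Hypothetical dependency declarations: component -> components that must run first.
dependencies = {
    "network": set(),
    "nfs": {"network"},
    "sendmail": {"network"},
    "pbs": {"nfs"},
}


def execution_order(deps):
    """Return components in an order that respects every declared dependency."""
    # static_order() yields each node only after all of its predecessors;
    # it raises graphlib.CycleError if the declarations contain a cycle.
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

Several valid orders may exist; the only guarantee is that a component never runs before the components it depends on (here, network before sendmail).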

NCM architecture (from design doc.)

NCM: Status
- Architectural design finished
- Detailed (class) design progressing
- First version expected mid-July
- Porting/coding of base configuration components to be completed mid-September
  - More than 60 components to be ported for having a complete EDG solution (configuring all EDG middleware services)!
- Pilot deployment on CERN central interactive/batch facilities expected at the end of the year

Software Distribution (SWRep and SPMA)

SPM (Software Package Mgmt) (I)
SWRep (Software Repository):
- Client-server toolsuite for the management of software packages
- Universal repository:
  - Extendable to multiple platforms and package formats (RHLinux/RPM, Solaris/PKG, … others like Debian dpkg)
  - Multiple package versions/releases
- Management (“product maintainers”) interface:
  - ACL-based mechanism to grant/deny modification rights (packages associated to “areas”)
  - Current implementation using SSH
- Client access: via standard protocols
  - HTTP (scalability), but also AFS/NFS, FTP
- Replication: using standard tools (e.g. rsync)
  - Availability, load balancing

SPM (Software Package Mgmt) (II)
Software Package Management Agent (SPMA):
- Runs on every target node
- Multiple repositories can be accessed (e.g. division/experiment specific)
- Plug-in framework allows for portability
  - System packager specific transactional interface (RPMT, PKGT)
- Can manage either all or a subset of packages on the nodes
  - Useful for add-on installations, and also for desktops
  - Configurable policies (partial or full control, mandatory and unwanted packages, conflict resolution…)
- Addresses scalability
  - Packages can be stored ahead in a local cache, avoiding peak loads on software repository servers (simultaneous upgrades of large farms)
  - HTTP protocol allows the use of web proxy hierarchies
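The effect of storing packages ahead in a local cache can be illustrated with a small Python sketch. Everything here is an assumption made for the example: the `prefetch` function, the file names and the `fake_fetch` stand-in for an HTTP download are hypothetical and do not reflect the SPMA's real cache handling.

```python
import os
import tempfile


def prefetch(packages, cache_dir, fetch):
    """Download packages that are not cached yet; return the names actually fetched."""
    fetched = []
    for name in packages:
        target = os.path.join(cache_dir, name)
        if not os.path.exists(target):
            with open(target, "wb") as handle:
                handle.write(fetch(name))
            fetched.append(name)
    return fetched


# Stand-in for an HTTP download; records how often the "repository" is contacted.
requests_seen = []


def fake_fetch(name):
    requests_seen.append(name)
    return b"package payload for " + name.encode()


cache_dir = tempfile.mkdtemp()
wanted = ["foo-1.0-1.i386.rpm", "bar-2.1-3.i386.rpm"]  # made-up package files

first = prefetch(wanted, cache_dir, fake_fetch)   # ahead of the upgrade
second = prefetch(wanted, cache_dir, fake_fetch)  # at upgrade time: cache hits only
print(first, second, len(requests_seen))
```

Because the second pass finds every file locally, the repository servers see no load at the moment a large farm upgrades simultaneously.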

SPM (Software Package Mgmt) (III)
- SPMA functionality:
  1. Compares the packages currently installed on the local node with the packages listed in the configuration
  2. Computes the necessary install/deinstall/upgrade operations
  3. Invokes the packager (rpmt/pkgt) with the right operation transaction set
- The SPMA is driven via a local configuration file
  - For batch/servers: an NCM component generates/maintains this configuration file out of CDB information
  - For desktops: possible to write a GUI for locally editing the configuration file
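Steps 1 and 2 above can be sketched in a few lines of Python. This is an illustration under assumptions, not the SPMA code: it assumes the agent has full control of all packages on the node, represents package lists as simple name-to-version maps, and uses made-up package names and versions. Versions are only compared for equality, so the sketch reports a generic "replace" where the real tool would distinguish upgrade from downgrade.

```python
def plan_operations(installed, desired):
    """Compare installed and desired {name: version} maps and return the operations.

    Versions are compared for equality only; deciding whether a change is an
    upgrade or a downgrade is left to the packager in this sketch.
    """
    ops = []
    for name in sorted(desired):
        if name not in installed:
            ops.append(("install", name, desired[name]))
        elif installed[name] != desired[name]:
            ops.append(("replace", name, desired[name]))
    for name in sorted(installed):
        if name not in desired:
            ops.append(("deinstall", name, installed[name]))
    return ops


# Made-up example data.
installed = {"openssh": "3.1p1-6", "sendmail": "8.11.6-15", "oldtool": "1.0-1"}
desired = {"openssh": "3.5p1-6", "sendmail": "8.11.6-15", "newtool": "2.0-1"}

ops = plan_operations(installed, desired)
for op in ops:
    print(op)
```

The resulting list is what would be handed to the packager as one transaction set (step 3); packages that already match the configuration produce no operation.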

Software Package Manager (SPM): RPMT
- RPMT (RPM transactions) is a small tool on top of the RPM libraries, which allows multiple simultaneous package operations with dependency resolution (unlike RPM)
  - Example: ‘upgrade X, deinstall Y, downgrade Z, install T’ and verify/resolve appropriate dependencies
- Uses only basic RPM library calls, no added intelligence
- Ports available for RPM 3 and 4.0.X
- Will try to feed back to the RPM user community after porting to RPM 4.2
- CERN IT/PS working on equivalent Solaris port (PKGT)
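Why a multi-operation transaction matters can be shown with a toy model in Python. This does not call the RPM libraries and is not how rpmt is written; the package names, the `requires` map and the `apply_transaction` function are hypothetical. The point is only that dependencies are verified against the end state of the whole operation set rather than after each single step.

```python
def apply_transaction(installed, requires, operations):
    """Apply all operations at once, then verify dependencies on the end state.

    installed:  set of package names currently on the node
    requires:   {package: set of packages it needs}
    operations: list of ("install" | "deinstall", package)
    Returns the new package set; raises ValueError and leaves `installed` untouched
    if the end state has unresolved dependencies.
    """
    result = set(installed)
    for action, package in operations:
        if action == "install":
            result.add(package)
        elif action == "deinstall":
            result.discard(package)
        else:
            raise ValueError(f"unknown action: {action}")

    missing = {
        (package, dep)
        for package in result
        for dep in requires.get(package, set())
        if dep not in result
    }
    if missing:
        raise ValueError(f"unresolved dependencies: {sorted(missing)}")
    return result


requires = {"app": {"libnew"}, "oldapp": {"libold"}}
installed = {"oldapp", "libold"}

# Valid as a whole, although some single steps (e.g. removing libold first)
# would break a dependency on their own.
new_state = apply_transaction(
    installed,
    requires,
    [("deinstall", "libold"), ("deinstall", "oldapp"), ("install", "app"), ("install", "libnew")],
)
print(sorted(new_state))
```

Submitting only `("deinstall", "libold")` raises an error in this model, because oldapp would be left without its library.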

SWRep/SPMA architecture
[Diagram labels: Repository A, Repository B (packages, Mgmt API, GUI, CLI); access via http, afs, nfs, ftp (HTTP proxy); client nodes: CDB config, NCM/GUI, SPMA.cfg, SPMA, cache, rpmt, inventory; package formats RPM, PKG]

SPMA & SWRep: current status
- First production version available
- Being deployed in the CERN Computer Centre (next slide)
- Enhanced functionality (package cache management) for mid-October
- Solaris port progressing (cf. M. Guijarro’s talk)

SPMA/SWRep at CERN CC
- Started phasing out legacy SW distribution systems (including ASIS) on the central batch/interactive servers (LXPLUS & LXBATCH)
  - Using HTTP as package access protocol (scalability)
  - > 400 nodes currently running it in production
  - Deployment page:
- Server clustering solution
  - For CDB (XML profiles) and SWRep (RPMs over HTTP)
  - Replication done with rsync
  - Load balancing done with simple DNS round-robin
  - Currently, 3 servers in production (800 MHz, 500 MB RAM, FastEthernet) giving ~3 × 12 MByte/s throughput
  - Future: may include usage of hierarchical web proxies (e.g. using squid)
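A back-of-the-envelope estimate shows why the aggregate server throughput matters for farm-wide upgrades. The node and server counts come from the slide; the figure is read here as 12 MByte per second per server, and the 100 MByte update size per node is purely an assumed value for illustration.

```python
def min_distribution_time(nodes, mbytes_per_node, servers, mbytes_per_sec_per_server):
    """Lower bound (seconds) for all nodes to download their payload at full server rate."""
    total = nodes * mbytes_per_node
    aggregate = servers * mbytes_per_sec_per_server
    return total / aggregate


seconds = min_distribution_time(
    nodes=400,                     # from the slide: > 400 nodes in production
    mbytes_per_node=100,           # assumption: size of one update per node
    servers=3,                     # from the slide
    mbytes_per_sec_per_server=12,  # from the slide, read as MByte/s per server
)
print(f"{seconds / 60:.1f} minutes")
```

With the assumed 100 MByte per node this gives roughly 18.5 minutes as a best case when every node downloads directly from the servers, which is the peak load that local caches and proxy hierarchies are meant to spread out.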

Questions / comments?