quattor architecture
German Cancio, CERN IT
http://cern.ch/quattor

quattor architecture - overview
- Configuration Database: CDB (Central Configuration Database), CCM (Configuration Cache Manager)
- Installation: AII (Automated Installation Infrastructure), NCM (Node Configuration Manager), Software Repository (SWRep), Software Package Management Agent (SPMA)
- quattor home page (under construction): http://cern.ch/quattor

Configuration DB design
[Architecture diagram: Pan templates enter CDB via GUI and CLI; CDB compiles them into XML profiles; Server Modules (SQL/LDAP/HTTP) and node-side caches (CCM, NVA-API) consume the result.]
- Configuration Data Base (CDB): the configuration information store. The information is updated in transactions; it is validated and versioned.
- Pan templates with configuration information are input into CDB via the GUI and CLI, and are compiled into XML profiles.
- Server Modules provide different access patterns (SQL/LDAP/HTTP) to the configuration information.
- HTTP + notifications: nodes are notified about changes to their configuration and fetch the XML profiles via HTTP (sketched below).
- On the node, configuration information is stored in the local cache (CCM) and accessed via the NVA-API.
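To make the node-side flow concrete, here is a minimal sketch, assuming an invented profile URL layout and cache path, of a node fetching its XML profile over HTTP into a local cache. This is not the actual CCM code, which adds notification handling, locking, and the NVA-API on top.

    #!/usr/bin/perl
    # Minimal sketch of the CCM fetch step: download this node's XML
    # profile over HTTP into the local cache. The URL layout and cache
    # path are assumptions for illustration.
    use strict;
    use warnings;
    use LWP::Simple qw(mirror);
    use HTTP::Status qw(is_success);
    use Sys::Hostname;

    my ($node)  = split(/\./, hostname());
    my $profile = "http://cdb-server.example.org/profiles/$node.xml";
    my $cache   = "/var/lib/ccm/profile.xml";

    # mirror() only downloads when the server copy is newer: a rough
    # stand-in for "fetch the profile when notified of a change"
    my $rc = mirror($profile, $cache);
    die "profile fetch failed: HTTP $rc\n"
        unless is_success($rc) || $rc == 304;   # 304 = not modified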

Configuration DB status
- The system is implemented (except for the CLI and Server Modules); most components are at the 1.0 production version.
- Pilot deployment of the complete system for LCG-1, using the "panguin" GUI (screenshot on the next slide).
- In parallel: the system is being consolidated; scalability and security issues are being studied and addressed; Server Modules are under development (SQL).
- More information: http://cern.ch/hep-proj-grid-config/

panguin GUI for managing/editing PAN templates

XML profile generated by PAN (lxplus001)

install design
[Architecture diagram: client nodes run the SPMA (with a local package cache, fed from SWRep servers via HTTP/NFS/FTP), the NCM with its components and the Cdispd, and the CCM; an installation server handles PXE, DHCP and KickStart/JumpStart generation; everything is driven by configuration from the CDB.]

install design: component annotations
- Software Package Mgmt Agent (SPMA): manages the installed packages. Runs on Linux (RPM) or Solaris (PKG). SPMA configuration is done via an NCM component. Can use a local cache for pre-fetching packages (for simultaneous upgrades of large farms).
- Automated Installation Infrastructure (AII): DHCP and KickStart (or JumpStart) configurations are re-generated according to CDB contents; PXE can be set by the operator to reboot or reinstall a node.
- Software Repository (SWRep): packages (in RPM or PKG format) can be uploaded into multiple Software Repositories. Client access uses HTTP, NFS/AFS or FTP; management access is subject to authentication/authorization.
- Node Configuration Manager (NCM): configuration management on the node is done by NCM Components. Each component is responsible for configuring a service (network, NFS, sendmail, PBS). Components are notified by the Cdispd whenever there is a change in their configuration.

AII (Automated Installation Infrastructure)
- Subsystem to automate the base installation of nodes via the network.
- A layer on top of existing technologies (base system installer, DHCP, PXE).
- Modules:
  - AII-dhcp: manages the DHCP server entries carrying network installation information (sketched below).
  - AII-nbp (network bootstrap program): manages the PXE configuration for each node (boot from hard disk / start the installation via the network).
  - AII-osinstall: manages the OS configuration files required by the OS installation procedure (KickStart, JumpStart).
- More details in the AII design document: http://edms.cern.ch/document/374559
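As an illustration of the AII-dhcp idea only: the node records, output path and server names below are invented, and the real module takes this information from CDB, but a generator could render DHCP host entries for network installation along these lines.

    #!/usr/bin/perl
    # Illustrative sketch of AII-dhcp: regenerate DHCP host entries for
    # installation from configuration data. Node records, file path and
    # server names are invented for the example.
    use strict;
    use warnings;

    my @nodes = (
        { name => 'lxplus001', mac => '00:11:22:33:44:55', ip => '137.138.0.10' },
        { name => 'lxplus002', mac => '00:11:22:33:44:56', ip => '137.138.0.11' },
    );

    open(my $fh, '>', '/etc/dhcpd-aii.conf') or die "cannot write: $!";
    for my $n (@nodes) {
        print $fh <<"EOF";
    host $n->{name} {
        hardware ethernet $n->{mac};
        fixed-address $n->{ip};
        next-server install-server.example.org;  # TFTP server for PXE
        filename "pxelinux.0";                   # network bootstrap program
    }
    EOF
    }
    close($fh);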

AII: current status
- Architectural design finished.
- Detailed design and implementation progressing.
- First alpha version expected mid-July.

Node Configuration Management (NCM)
Client software running on the node which takes care of "implementing" what is in the configuration profile.
Modules:
- "Components"
- Invocation and notification framework
- Component support libraries

NCM: Components
- "Components" (like SUE "features" or LCFG "objects") are responsible for updating local config files, and for notifying services if needed.
- Components register their interest in configuration entries or subtrees, and get invoked in case of changes.
- Components only configure the system:
  - Usually, this implies regenerating and/or updating local config files (e.g. /etc/sshd_config).
  - Standard system facilities (SysV scripts) are used for managing services.
  - Components can notify services using SysV scripts when their configuration changes.
- It is possible to define configuration dependencies between components (e.g. configure the network before sendmail).

Component example

    sub Configure {
        my ($self) = @_;

        # access configuration information (NVA API)
        my $config = NVA::Config->new();
        my $arch   = $config->getValue('/system/architecture');
        $self->Fail("not supported") unless ($arch eq 'i386');

        # (re)generate and/or update local config file(s)
        open(myconfig, '>', '/etc/myconfig');
        # ...

        # notify affected (SysV) services if required
        if ($changed) {
            system('/sbin/service myservice reload');
            # ...
        }
    }

NCM (contd.)
- cdispd (Configuration Dispatch Daemon): monitors the configuration profile and invokes components via the ncd when there are changes (a minimal sketch follows).
- ncd (Node Configuration Deployer): framework and front-end for executing components (via cron, cdispd, or manually); handles dependency ordering of components.
- Component support libraries: for recurring system management tasks (interfaces to system services, system information), log handling, etc.
- More details in the NCM design document: http://edms.cern.ch/document/372643
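A minimal sketch of the cdispd idea, in a polling variant (the real daemon also reacts to notifications): watch the cached profile for changes and hand deployment over to ncd. The profile path and the ncd command line are assumptions for illustration.

    #!/usr/bin/perl
    # Sketch of a cdispd-style loop: detect a changed configuration
    # profile and deploy components through ncd. The profile path and
    # the ncd invocation are assumptions, not the real interfaces.
    use strict;
    use warnings;
    use Digest::MD5;

    my $profile = '/var/lib/ccm/profile.xml';   # assumed cache location
    my $last    = '';

    sub checksum {
        open(my $fh, '<', $profile) or return '';
        binmode($fh);
        return Digest::MD5->new->addfile($fh)->hexdigest;
    }

    while (1) {
        my $now = checksum();
        if ($now ne '' && $now ne $last) {
            # the real cdispd invokes only those components whose
            # registered configuration subtrees changed; this sketch
            # simply redeploys everything
            system('ncd', '--configure', 'all') == 0
                or warn "ncd run failed: $?\n";
            $last = $now;
        }
        sleep 60;
    }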

NCM: Status
- Architectural design finished; detailed (class) design progressing.
- First version expected end of July.
- Porting/coding of base configuration components expected to be completed by mid-September.
- More than 60 components need to be ported for a complete EDG solution (configuring all EDG middleware services)!
- Pilot deployment on the CERN central interactive/batch facilities expected at the end of the year.

SPM (Software Package Mgmt) (I)
SWRep (Software Repository): client-server tool suite for the management of software packages.
- Universal repository: extendable to multiple platforms and package formats (RH Linux/RPM, Solaris/PKG, ... others like Debian dpkg).
- Multiple package versions/releases.
- Management ("product maintainers") interface: ACL-based mechanism to grant/deny modification rights (packages associated to "areas"); the current implementation uses SSH.
- Client access: via standard protocols, primarily HTTP (scalability), but also AFS/NFS and FTP.
- Replication: using standard tools (e.g. rsync, see the sketch below), for availability and load balancing.
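For the replication point, a cron-driven sketch can be as simple as the following; the master host, rsync module name and local path are invented, the point being that replication needs only standard tools.

    #!/usr/bin/perl
    # Sketch of SWRep replica synchronisation with rsync. Master host,
    # rsync module name and local directory are invented examples.
    use strict;
    use warnings;

    my @cmd = ('rsync', '-a', '--delete',
               'swrep-master.example.org::swrep/',   # assumed rsync module
               '/var/www/swrep/');                   # assumed local replica
    system(@cmd) == 0 or die "rsync failed: $?\n";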

SPM (Software Package Mgmt) (II)
Software Package Management Agent (SPMA):
- Runs on every target node.
- Multiple repositories can be accessed (e.g. division/experiment specific).
- Plug-in framework allows for portability: packager-specific transactional interfaces (RPMT, PKGT).
- Can manage either all or a subset of the packages on a node: useful for add-on installations, and also for desktops.
- Configurable policies (partial or full control, mandatory and unwanted packages, conflict resolution ...).
- Addresses scalability: packages can be stored ahead of time in a local cache, avoiding peak loads on software repository servers (simultaneous upgrades of large farms); the HTTP protocol allows the use of web proxy hierarchies.

SPM (Software Package Mgmt) (III)
SPMA functionality (sketched below):
- Compares the packages currently installed on the local node with the packages listed in the configuration.
- Computes the necessary install/deinstall/upgrade operations.
- Invokes the packager (rpmt/pkgt) with the resulting transaction set.
The SPMA is driven by a local configuration file:
- For batch nodes/servers: an NCM component generates and maintains this file from CDB information.
- For desktops: a GUI for locally editing the file is possible.
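The core comparison step amounts to a set difference between the installed and desired package lists. The sketch below uses invented package data and keys only by name/version; the real SPMA works per (name, version, release, architecture) and is not this code.

    #!/usr/bin/perl
    # Sketch of the SPMA comparison step: diff the installed package
    # list against the desired list from the local configuration file
    # and derive the install/upgrade/deinstall operations.
    use strict;
    use warnings;

    my %installed = ( openssh => '3.1', sendmail => '8.12' );  # e.g. from 'rpm -qa'
    my %desired   = ( openssh => '3.4', httpd    => '2.0' );   # from the SPMA config file

    my (@install, @upgrade, @deinstall);
    for my $pkg (sort keys %desired) {
        if    (!exists $installed{$pkg})           { push @install, $pkg }
        elsif ($installed{$pkg} ne $desired{$pkg}) { push @upgrade, $pkg }
    }
    for my $pkg (sort keys %installed) {
        push @deinstall, $pkg unless exists $desired{$pkg};
    }

    # the real SPMA hands all operations to rpmt/pkgt as one transaction,
    # so the whole set is dependency-checked and applied together
    print "install: @install\nupgrade: @upgrade\ndeinstall: @deinstall\n";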

Software Package Manager (SPM): RPMT
- RPMT (RPM transactions) is a small tool on top of the RPM libraries which, unlike plain RPM, allows multiple simultaneous package operations with dependency resolution across the whole set.
- Example: 'upgrade X, deinstall Y, downgrade Z, install T', verifying/resolving the appropriate dependencies in one transaction (see the sketch after this list).
- Uses basic RPM library calls; no added intelligence.
- Ports available for RPM 3 and 4.0.x.
- Feedback to the rpm user community planned after porting to RPM 4.2.
- CERN IT/PS is working on the equivalent Solaris tool (PKGT).
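As a purely hypothetical sketch of the transactional interface: the slide does not document rpmt's actual input syntax, so the single-letter operation codes and the stdin hand-off below are assumptions; only the idea of one dependency-checked multi-operation call comes from the source.

    #!/usr/bin/perl
    # Hypothetical sketch: hand one multi-operation transaction
    # ('upgrade X, deinstall Y, downgrade Z, install T') to rpmt in a
    # single call, so dependencies are verified across the whole set.
    # Operation codes and input format are assumptions, not rpmt's
    # documented syntax.
    use strict;
    use warnings;

    my @ops = (
        "u X-1.2-1.i386.rpm",   # upgrade
        "e Y",                  # deinstall (erase)
        "d Z-0.9-1.i386.rpm",   # downgrade
        "i T-2.0-1.i386.rpm",   # install
    );

    open(my $rpmt, '|-', '/usr/bin/rpmt') or die "cannot run rpmt: $!";
    print $rpmt "$_\n" for @ops;
    close($rpmt) or die "rpmt transaction failed\n";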

SPMA & SWRep: current status
- First production version available.
- Being deployed in the CERN Computer Centre (next slide).
- Enhanced functionality (package cache management) expected for mid-October.
- Solaris port progressing.

SPMA/SWRep deployment @ CERN CC
- Legacy SW distribution systems (including ASIS) have been phased out on the central batch/interactive servers (LXPLUS & LXBATCH).
- Using HTTP as the package access protocol (scalability); 1000 nodes currently run it in production.
- Deployment page: http://cern.ch/wp4-install/CERN/deploy
- Server clustering solution for CDB (XML profiles) and SWRep (RPMs over HTTP): replication done with rsync, load balancing with simple DNS round-robin.
- Currently 3 servers in production (800 MHz, 500 MB RAM, Fast Ethernet), giving roughly 3 x 12 MByte/s throughput.
- Future: may include hierarchical web proxies (e.g. using squid).