Disk Server Deployment at RAL Castor F2F RAL - Feb 2009 Martin Bly.


Slide 2: Overview
19 Feb 2009 Disk Server Deployment at RAL - Martin Bly
  General Deployment
  Castor Specifics
  Issues

Slide 3: General Deployments I
PXE/Kickstart
Two variants
  - Version 1, three stages: Kickstart (reboot), Updates, Personality (reboot)
  - Version 2 (scriptlets): Kickstart (reboot), Updates + personality (reboot)
Version 1 is used for disk servers of all types
Kickstart files, updates, personalities and scriptlets are all hand crafted
The update script is ‘standard’ for all OS types and variants
  - Adds repository definitions, brings the OS up to date and installs standard requirements (AFS, SSH keys…)
  - Also removes some unwanted stuff (Bluetooth, Samba, WiFi…)
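The two variants differ only in how the stages are grouped between reboots. The following is a minimal Python sketch of that difference, not the tooling used at RAL: the names Stage, VERSION_1, VERSION_2 and describe are hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step of an install. The class and field names are illustrative only."""
    name: str
    reboot_after: bool


# Version 1: three separate stages, with a reboot after Kickstart and after Personality.
VERSION_1 = (
    Stage("kickstart", reboot_after=True),
    Stage("updates", reboot_after=False),
    Stage("personality", reboot_after=True),
)

# Version 2 (scriptlets): updates and personality are applied together, then one reboot.
VERSION_2 = (
    Stage("kickstart", reboot_after=True),
    Stage("updates+personality", reboot_after=True),
)


def describe(stages):
    """Return a one-line summary of a stage sequence, marking where reboots occur."""
    parts = []
    for stage in stages:
        parts.append(stage.name + (" (reboot)" if stage.reboot_after else ""))
    return " -> ".join(parts)


if __name__ == "__main__":
    print("Version 1:", describe(VERSION_1))
    print("Version 2:", describe(VERSION_2))
```

Both variants reboot twice; Version 2 simply folds the update and personality work into a single stage.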

Slide 4: General Deployments II
Personality scripts
  - Make the system into a BDII, WN, CE, disk server or other specific type
  - Add additional repositories (Castor, gLite, …)
  - Install personality-specific packages
  - Remove additional OS stuff that some variants of the same type might need but that is unnecessary in specific cases
  - Configure the system: add or edit config files

Slide 5: General Deployment III
Kickstart files are hardware specific
Kickstart files are task (personality) specific
All OS variants call the same update script
Then call the task-specific personality script for the specific hardware
System has grown to the way it is over time
  - Now: less than ideal

Slide 6: General Deployment IV
[Diagram: kickstart files, the shared sl4-update script, and the per-type scripts. Labels: Function Specific, Hardware Specific.]
Kickstart files
  - Viglen06 NFS kickstart
  - Viglen06 dCache kickstart
  - Viglen06 Castor kickstart
  - Viglen06 xrootd kickstart
  - Viglen07i Castor kickstart
  - Viglen07a Castor kickstart
  - CV05 Castor kickstart
  - Comp04 Castor kickstart
Update script
  - sl4-update
Type scripts
  - Viglen06 NFS type
  - Viglen06 dCache type
  - Viglen06 Castor type
  - Viglen06 xrootd type
  - Viglen07i Castor type
  - Viglen07a Castor type
  - CV05 Castor type
  - Comp04 Castor type
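The diagram pairs one kickstart file and one type script with each hardware/function combination, all sharing a single update script. As a rough illustration only, the Python sketch below looks up that chain for a given combination. The combinations and the sl4-update name come from the slide; the function deployment_chain and the way the labels are assembled are assumptions, not the real file layout.

```python
# Hardware/function pairs listed on the slide.
COMBINATIONS = [
    ("Viglen06", "NFS"),
    ("Viglen06", "dCache"),
    ("Viglen06", "Castor"),
    ("Viglen06", "xrootd"),
    ("Viglen07i", "Castor"),
    ("Viglen07a", "Castor"),
    ("CV05", "Castor"),
    ("Comp04", "Castor"),
]

# The one update script shared by every combination.
UPDATE_SCRIPT = "sl4-update"


def deployment_chain(hardware, function):
    """Return the ordered labels for one server: kickstart, update, type script.

    Hypothetical helper: the labels mirror the diagram text, not real file paths.
    """
    if (hardware, function) not in COMBINATIONS:
        raise KeyError(f"no kickstart defined for {hardware} {function}")
    return [
        f"{hardware} {function} kickstart",
        UPDATE_SCRIPT,
        f"{hardware} {function} type",
    ]


if __name__ == "__main__":
    for hardware, function in COMBINATIONS:
        print(" -> ".join(deployment_chain(hardware, function)))
```

Every new hardware model or function adds another hand-crafted pair of files, which is one way to read the "less than ideal" remark on the previous slide.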

Slide 7: Castor Specifics I
Castor disk servers and central services hosts augment the standard deployment, using Puppet to configure stagemap.conf etc.
Disk servers are deployed in two stages
Detailed procedure to follow for the complete process: lots of checking
Stage 1: Install to ‘nonProd’ class (per VO)
  - nonProd: holding class for servers ready to go into production; ‘nearline’, but not quite as near as I’d like
  - Careful choreography of changes to Puppet, LSF
  - Uses the Kickstart system for main provisioning
      Master config file (Castor version, VO, Service Class), used by kickstart for VO and service class specific actions
  - Certificates added by hand
  - Registered with rmMaster
  - Pause to check (test!) all is OK
  - At this point, the server is being monitored by Nagios, Ganglia etc.
      In case it needs an intervention
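The slide names three fields in the master config file (Castor version, VO, Service Class) but not its format. The Python sketch below assumes an INI-style layout purely for illustration: the section name, key names, sample values, and the helpers load_master_config and holding_class are all hypothetical.

```python
import configparser

# Hypothetical layout and placeholder values; the real file format is not given in the slides.
SAMPLE = """
[server]
castor_version = example-version
vo = exampleVO
service_class = exampleClass
"""

REQUIRED_KEYS = ("castor_version", "vo", "service_class")


def load_master_config(text):
    """Parse the three fields the slide says the master config file carries."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["server"]
    missing = [key for key in REQUIRED_KEYS if key not in section]
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))
    return {key: section[key] for key in REQUIRED_KEYS}


def holding_class(config):
    """Stage 1 installs into the per-VO 'nonProd' class, whatever the final service class."""
    return ("nonProd", config["vo"])


if __name__ == "__main__":
    config = load_master_config(SAMPLE)
    print("stage 1 class:", holding_class(config))
    print("stage 2 target:", config["service_class"])
```

Keeping the final service class in the same file as the VO means Stage 1 can ignore it while Stage 2 later reads it without re-entering anything by hand.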

Slide 8: Castor Specifics II
Stage 2: Deploy to production service class
  - Configure Puppet to declare Service Class
  - Configure Puppet to add server to LSF production configuration
  - Enable new server on Castor central server
  - Activate Nagios callouts
  - Admire the data flow… (Cacti, Ganglia etc.)
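Stage 2 is an ordered checklist in which each step depends on the previous one. As a hedged sketch only, the Python below runs such a checklist and stops at the first failure. The step descriptions paraphrase the slide; run_checklist and the perform callback are hypothetical and do not call Puppet, LSF, Castor or Nagios.

```python
# Step descriptions paraphrased from the slide; nothing here touches a real system.
STAGE_2_STEPS = [
    "declare service class in Puppet",
    "add server to LSF production configuration in Puppet",
    "enable server on Castor central server",
    "activate Nagios callouts",
]


def run_checklist(steps, perform):
    """Run steps in order, stopping at the first failure.

    perform(step) is a caller-supplied function returning True on success.
    Returns (completed_steps, failed_step), where failed_step is None if all passed.
    """
    completed = []
    for step in steps:
        if not perform(step):
            return completed, step
        completed.append(step)
    return completed, None


if __name__ == "__main__":
    # Dry run: pretend every step succeeds and just print it.
    def dry_run(step):
        print("would do:", step)
        return True

    done, failed = run_checklist(STAGE_2_STEPS, dry_run)
    print("completed", len(done), "steps; failed:", failed)
```

Stopping at the first failed step keeps a half-configured server out of the production service class, which matches the emphasis on careful choreography.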

Slide 9: Issues
Still a significant number of manual steps on the LSF system and central services to add a new server
  - Particularly configuring Puppet manifests
  - Choreography of steps on diverse systems, otherwise ‘the logs fill up’!
Status
  - Replacing spreadsheet with a database
We don’t have a ‘simple’ end-to-end system that can orchestrate the changes needed to (re)deploy a server
  - Do we have the optimum MO for Castor? Will it scale?
  - Working on our Fabric Management options
Need to understand whether the sequence of steps for deployment is the best possible (whether the MO is correct)