Slide 1: VOLDEMORT: VOLatile Distribution of Electronic Media Over Rsync Transport
May 22, 2003, LSCCW
Steven Timm

Slide 2: Introduction
- Disclaimer: any resemblance to characters in the Harry Potter books of J.K. Rowling is pure coincidence.
- Rsync is an open-source package that keeps directories on remote machines synchronized with each other.
- It is a common method at many installations for distributing volatile local files to machines that are already installed.
- It needs a set of supporting scripts to make it a useful tool.
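For orientation, the kind of single rsync transfer that VOLDEMORT wraps might look like the sketch below. The host name and paths are illustrative only; the Kerberized rsh transport is taken from slide 22.

    # Mirror one server-side files subtree onto a client node over
    # Kerberized rsh. Host name and paths are illustrative, not
    # VOLDEMORT's actual layout.
    rsync -av -e /usr/krb5/bin/rsh \
        "$VOLDEMORT_DIR/clusters/fnpce/files/Linux/etc/" fncdf75:/etc/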

Slide 3: VOLDEMORT overview
- Currently deployed on over 700 machines at Fermilab.
- Works on Red Hat Linux 6, 7, 9, and Advanced Server, plus Sun and SGI.
- Written in shell scripts and Perl.
- Two major uses:
  – Keeping production computing farm installations up to date without reinstalling
  – Partitioning US/CMS computing dynamically

Slide 4: Major goals of VOLDEMORT
- Replace NIS with a system that puts passwd files locally on each node.
- Have a unified structure to push new files out to existing nodes and install them on new nodes.
- Have a single place where each volatile file is modified.
- Keep the current capability to have special files for a single farm, subcluster, or node.
- The Production Farms plus US/CMS have at least 13 different hardware configurations.

Slide 5: Components of VOLDEMORT 0.6
- voldemort-push: includes the rsync_push binary and scripts to clone slave servers; installed on all servers.
- voldemort: installed on all clients; includes pullrsync and a number of auxiliary scripts that are called by the pusher and puller.
- A tree file structure, set up in $VOLDEMORT_DIR/clusters.
- Databases that describe the file structure.
- Available as an RPM or in Fermi ups/upd format.
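Pulling together the paths named on the later slides, the tree under $VOLDEMORT_DIR/clusters looks roughly like this. This is a sketch assembled from slides 12 through 19, not an authoritative listing:

    $VOLDEMORT_DIR/clusters/
        common/
            db/
                nodes.conf    # node database (slide 12)
                files.conf    # file dependency database (slide 13)
            templates/        # shared templates, e.g. passwd fragments (slide 13)
        fnpce/                # one directory per cluster
            prescripts/       # run before an RPM or file is installed (slide 14)
            RPMS/             # farm- or hardware-specific RPMs (slide 18)
            files/            # single files pushed to worker nodes (slide 16)
            scripts/          # usually run only by the installer (slide 19)
            tarballs/         # e.g. the /local/ups tree (slide 17)

Each of the five categories then has per-flavor subdirectories (Linux, Linux+2.4, IRIX+6.5, and so on; see slide 15).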

Slide 6: Features
- pullrsync is included in the 7.3 Farms workgroup post-install of Fermi Linux.
- The scripts are tested on all flavors of Linux.
- Includes an option to sync out changes in the Fermi Linux comps file.
- But it is not tied to Fermi Linux: it can and does work with other installation systems such as Rocks or SystemImager.

Slide 7: Why replace NIS?
- NIS was stable on the CDF farms (169 nodes, no timeouts for months), BUT:
- We had to run at least 64 NIS slave servers to accomplish this.
- Pushing to all those slave servers is a network load in itself.
- yppush does not gracefully handle a node that is down.
- The initial configuration with ypinit -s is error prone and cannot be automated during install.

Slide 8: Why replace NIS? (cont'd)
- A malformed map on one slave server can break several nodes.
- NIS generates only a small amount of network traffic, but it is very sensitive to bigger network flows and is disrupted by them.
- On our farms we don't store any real passwords in NIS, and accounts change rarely: an ideal situation for distributing files instead.

Slide 9: Installer vs. on-line changes
- Whenever we made a change to the farm, we had to change it in two places: on the nodes and in the installer. Often one of them was forgotten.
- The method for making installer changes is not straightforward.
- We need a system where any file that goes on the system is changed in only ONE place.

Slide 10: Down-nodes problem
- Right now, if we put extra files on the system, we have to go back later and fix the nodes that were down, manually.
- We need a system that will remember which nodes were down and keep retrying until it gets them all (see the sketch below).
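A minimal sketch of the bookkeeping such a system needs follows. The file name, helper function, and loop are hypothetical; VOLDEMORT's actual retry logic lives in rsync_push -r (slide 22).

    #!/bin/sh
    # Hypothetical retry loop: keep a file of failed nodes and retry
    # each pass until the file is empty. push_to_node is a placeholder
    # for whatever actually performs one push.
    FAILED=/var/adm/voldemort-failed-nodes
    while [ -s "$FAILED" ]; do
        cp "$FAILED" "$FAILED.try"
        : > "$FAILED"
        while read -r node; do
            push_to_node "$node" || echo "$node" >> "$FAILED"
        done < "$FAILED.try"
        sleep 600   # wait before the next retry pass
    done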

Slide 11: Design goals of VOLDEMORT
- Do not put any node-specific info into the Fermi Linux workgroup: we don't want our whole structure (or our account names, groups, or NFS servers) available to the world via anonymous FTP.
- Replace /export/linux/Workgroups/Farms/nodes with a new structure that is used both by online activities and by the installer.
- Keep our current capacity to have node-specific, farm-specific, and subcluster-specific files.

Slide 12: $VOLDEMORT_DIR/clusters/common/db/nodes.conf
- nodes.conf is the database of nodes, read by both rsync_push and pullrsync.
- Example entry: fncdf75:cdffarm1:Linux :i-acd:38400::N:2518
- Fields: node name, cluster name, flavor, disk arrangement, baud rate, APIC used in install, node-specific flag, subclusters.
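As an illustration of how a script might consume this database, here is a hedged shell sketch. The field order is assumed from the slide, and this is not the actual rsync_push code:

    #!/bin/sh
    # Hypothetical sketch: print every node in nodes.conf that matches
    # a given cluster (field 2) and flavor (field 3).
    NODES_CONF="$VOLDEMORT_DIR/clusters/common/db/nodes.conf"
    cluster=$1
    flavor=$2
    awk -F: -v c="$cluster" -v f="$flavor" \
        '$2 == c && $3 == f { print $1 }' "$NODES_CONF"

Called with a cluster name and a flavor, it would list the matching node names, one per line.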

Slide 13: $VOLDEMORT_DIR/clusters/common/db/files.conf
- files.conf is not fully populated yet. Three fields:
  – Full pathname to the file (example: fnsfo/files/Linux /etc/passwd)
  – The files it depends on, e.g. common/templates/Linux /etc/passwd and fnsfo/templates/NULL/yppasswd
  – The command used to make it (cat the above two files together)
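The "command used to make it" for this example could be as simple as the following sketch. The flavor directory name is truncated on the slide, so FLAVOR here is a stand-in:

    #!/bin/sh
    # Hypothetical build rule for the cluster's /etc/passwd: concatenate
    # the common template with the cluster-specific yppasswd fragment.
    # FLAVOR is a stand-in for the flavor directory named on the slide.
    FLAVOR=Linux
    cd "$VOLDEMORT_DIR/clusters" || exit 1
    cat "common/templates/$FLAVOR/etc/passwd" \
        "fnsfo/templates/NULL/yppasswd" \
        > "fnsfo/files/$FLAVOR/etc/passwd"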

Slide 14: $VOLDEMORT_DIR/clusters/fnpce/
- Prescripts: scripts that have to be executed before an RPM or file can be installed
- RPMS
- Files: single files that are pushed out to worker nodes
- Scripts: usually run only by the installer
- Tarballs: mainly for pushing out the /local/ups directory to worker nodes

Slide 15: $VOLDEMORT_DIR/clusters/fnpce/files
- Under each category there is space for more than one flavor. Right now:
  – Linux (731)
  – Linux+2.4 (711)
  – Linux+2.2 (612)
  – IRIX+6.5
- You can also define an arbitrary flavor "foo", as long as the database matches.

Slide 16: $VOLDEMORT_DIR/clusters/fnpce/files/Linux
- Each subdirectory of the files directory gets pushed out independently, governed by .pushdir files.
- Four subdirectories (typically): /etc, /root, /usr/local, /var/adm
- Three types of files:
  – passwd, group, netgroup, auto.*, .k5login
  – Non-standard config files for RPMs in the Red Hat base
  – Hardware-specific or farm-specific files

Slide 17: $VOLDEMORT_DIR/clusters/fnpce/tarballs/Linux
- Currently only one tarball: /local/ups/localups.tar. The structure is the same as for files (.pushdir governs).
- The tarball should be created so that it can be untarred in the directory it is pushed into.
- We had to add this option because pushing a ups/upd tree of 19K files (180 MB) was too slow.
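To make a tarball that untars in place, archive with paths relative to the target directory, for example as below. This is a sketch; the staging path is arbitrary:

    # Build localups.tar with paths relative to /local/ups, so that
    # "tar -xf localups.tar" run inside /local/ups recreates the tree
    # in place. Stage in /tmp so the archive doesn't include itself.
    tar -cf /tmp/localups.tar -C /local/ups .
    mv /tmp/localups.tar /local/ups/localups.tar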

Slide 18: $VOLDEMORT_DIR/clusters/fnpce/RPMS/Linux
- RPMs that go here are either farm-specific or hardware-specific.
- Anything for the whole farm should go into the Farms workgroup instead.

Slide 19: $VOLDEMORT_DIR/clusters/fnpce/[pre]scripts/Linux
- Scripts and prescripts are mainly executed during the install.
- The installer calls /sbin/pullrsync -I, which forces all scripts to run.
- Scripts should be smart enough to detect whether their action has already been done.
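One common way to satisfy that last requirement is a marker file, as in this hedged sketch (the marker path and the action itself are hypothetical):

    #!/bin/sh
    # Hypothetical idempotent prescript: exit early if this action has
    # already been performed on the node.
    MARKER=/var/adm/voldemort-done/example-action
    [ -f "$MARKER" ] && exit 0
    # ... perform the one-time configuration action here ...
    mkdir -p "$(dirname "$MARKER")" && touch "$MARKER"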

Slide 20: Subclusters
- Subclusters can exist in any of the five categories: files, tarballs, RPMS, scripts, prescripts.
- Subcluster membership is determined by the database.
- Convention: all hardware-specific files (Ethernet, lm_sensors) go into a subcluster named after the motherboard type.
- A node can be in more than one subcluster.
- For files and tarballs, there is a .pushdir file at the top level.

Slide 21: Node-specific files
- You can also have files specific to a single node.
- This is enabled by setting the node-specific field in the database to "Y" instead of the default "N".

Slide 22: rsync_push
- rsync_push reads through the database and pushes to every node that matches the command-line options it was called with.
- *IMPORTANT*: the default is to push to everything! There is now an are-you-sure option that warns you what you are about to push.
- rsync_push -r lets you retry nodes that did not push successfully the first time.
- The default transport is Kerberized rsh; others can be used as well.
- To push to a node, the host principal of the server must be in /root/.k5login on the client node.

Slide 23: rsync_push options (1)
- -c: push for a given cluster
- -f: push for a given flavor
- -b: push for a list of nodes
- -B: push for a range of nodes
- -l: push for all the nodes in
- If more than one option is specified, we take the AND.
- Example: rsync_push -c cdffarm -f Linux -B fncdf " "

Slide 24: rsync_push options (2)
- -R: don't push the RPMS
- -F: don't push the files
- -S: don't push the scripts
- -P: don't push the prescripts
- -T: don't push the tarballs
- -L: don't push the Linux /etc/workgroup
- Default is to push everything.

Slide 25: rsync_push options (3)
- -w: specify the workgroup you are pushing (default is Farms)
- -e: use an alternative rsh command besides /usr/krb5/bin/rsh
- -q: quiet; minimal or no output
- -v: verbose; the more v's, the more verbose
- -i: install mode; run new scripts and prescripts when they are pushed out
- -I: install mode; run all scripts and prescripts when they are pushed out
- -C: clear out the RPMS, scripts, and prescripts directories on the worker nodes
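Putting the options from the last three slides together, a typical session might look like this. The cluster name and selection are hypothetical; option meanings are as listed above:

    # Push only the files (skip RPMs, scripts, prescripts, and tarballs)
    # for flavor Linux on cluster fnpce, verbosely.
    rsync_push -c fnpce -f Linux -R -S -P -T -v
    # Then retry whatever nodes failed the first pass (slide 22).
    rsync_push -r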

Slide 26: pullrsync
- Determines the node ID and type either from a local config file or from a database read.
- Runs only if the machine wasn't shut down cleanly, and during the install.
- Options: -h (help), -H, -c, -f, -M, -t, -R, -S, -P, -F, -T, -L, -w, -I, -i, -q, -v (as in rsync_push).
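For reference, slide 19 shows the invocation the installer uses; a manual run is sketched alongside it, with the option semantics assumed to mirror rsync_push:

    # Install mode: force all scripts and prescripts to run (slide 19).
    /sbin/pullrsync -I
    # Hypothetical manual run: pull everything except tarballs, verbosely.
    /sbin/pullrsync -T -v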

Slide 27: Future plans
- Version v0_6 is current; no known bugs right now.
- The next version needs a better and faster database.
- We also need the ability to automatically distribute the push across slave servers.
- A big task: integrating more closely with Rocks and Red Hat.
- oss.fnal.gov/scs/public/farms/doc/voldemort.html