Quick Overview of NPACI Rocks Philip M. Papadopoulos Associate Director, Distributed Computing San Diego Supercomputer Center.


1 Quick Overview of NPACI Rocks
Philip M. Papadopoulos
Associate Director, Distributed Computing
San Diego Supercomputer Center

2 Seed Questions
Do you buy in installation services? From the supplier or a third-party vendor?
–We integrate. It is easier to have the vendor integrate larger clusters.
Do you buy pre-configured systems or build your own configuration?
–Rocks is adaptable to many configurations.
Do you upgrade the full cluster at one time or in rolling mode?
–Suggest all at once (very quick with Rocks); it can be done as a batch job.
–Can support rolling upgrades, if desired.
Do you perform formal acceptance or burn-in tests?
–Unfortunately, no. Need more automated testing.

3 Installation/Management
Need to have a strategy for managing cluster nodes
Pitfalls
–Installing each node “by hand”
  Difficult to keep software on nodes up to date
–Disk imaging techniques (e.g., VA Disk Imager)
  Difficult to handle heterogeneous nodes
  Treats OS as a single monolithic system
–Specialized installation programs (e.g., IBM’s LUI, or RWCP’s multicast installer) – let Linux packaging vendors do their job
Penultimate
–RedHat Kickstart
  Define packages needed for OS on nodes; Kickstart gives a reasonable measure of control.
  Need to fully automate to scale out (Rocks gets you there)

4 Scaling out
Evolve to management of “two” systems
–The front end(s)
  Login host
  Users’ home areas, passwords, groups
  Cluster configuration information
–The compute nodes
  Disposable OS image
  Let software manage node heterogeneity
  Parallel (re)installation
  Data partitions on cluster drives untouched during re-installs
Cluster-wide configuration files derived through reports from a MySQL database (DHCP, hosts, PBS nodes, …)

5 NPACI Rocks Toolkit – rocks.npaci.edu
Techniques and software for easy installation, management, monitoring and update of clusters
Installation
–Bootable CD + floppy which contain all the packages and site configuration info to bring up an entire cluster
Management and update philosophies
–Trivial to completely reinstall any (all) nodes.
–Nodes are 100% automatically configured
  Use of DHCP, NIS for configuration
–Use RedHat’s Kickstart to define the set of software that defines a node.
–All software is delivered in a RedHat Package (RPM)
  Encapsulate configuration for a package (e.g., Myrinet)
  Manage dependencies
–Never try to figure out if node software is consistent
  If you ever ask yourself this question, reinstall the node

6 Rocks Current State – Ver. 2.1
Now tracking RedHat 7.1
–2.4 Kernel
–“Standard Tools” – PBS, MAUI, MPICH, GM, SSH, SSL, …
–Could support other distros … don’t have staff for this.
Designed to take “bare hardware” to cluster in a short period of time
–Linux upgrades are often “forklift-style”. Rocks supports this as the default mode of admin
Bootable CD
–Kickstart file for frontend created from Rocks webpage.
–Use same CD to boot nodes. Automated integration
  “Legacy Unix config files” derived from MySQL database
Re-installation (we have a single HTTP server, 100 Mbit)
–One node: 10 minutes
–32 nodes: 13 minutes
–Use multiple HTTP servers + IP-balancing switches for scale

7 More Rocksisms
Leverage widely-used (standard) software wherever possible
–Everything is in RedHat Packages (RPM)
–RedHat’s “kickstart” installation tool
–SSH, Telnet (only during installation), existing open source tools
Write only the software that we need to write
Focus on simplicity
–Commodity components
  For example: x86 compute servers, Ethernet, Myrinet
–Minimal
  For example: no additional diagnostic or proprietary networks
Rocks is a collection point of software for people building clusters
–It is evolving to include cluster software and packaging from more than just SDSC and UCB

8 Rocks-dist
Integrate RedHat Packages from
–RedHat (mirror) – base distribution + updates
–Contrib directory
–Locally produced packages
–Local contrib (e.g., commercially bought code)
–Packages from rocks.npaci.edu
Produces a single updated distribution that resides on front-end
–Is a RedHat distribution with patches and updates applied
Kickstart (RedHat) file is a text description of what’s on a node. Rocks automatically produces frontend and node files.
Different Kickstart files and different distributions can co-exist on a front-end to add flexibility in configuring nodes.
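
To make "a text description of what's on a node" concrete, here is a minimal Python sketch that renders a Kickstart-style fragment from a package list. It is not how Rocks generates its files; the function name `render_kickstart`, the URL, and the package names are hypothetical, and the output is deliberately incomplete (a real Kickstart file also needs partitioning, network, and authentication directives).

```python
def render_kickstart(dist_url, packages, post_commands=()):
    """Render a partial Kickstart-style description of a node.

    dist_url      -- where the installer fetches the distribution (assumed HTTP)
    packages      -- package names for the %packages section
    post_commands -- optional shell lines for the %post section
    """
    lines = ["install", f"url --url {dist_url}", "", "%packages"]
    # De-duplicate and sort so the same package set always yields the same file.
    lines.extend(sorted(set(packages)))
    if post_commands:
        lines.append("")
        lines.append("%post")
        lines.extend(post_commands)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical compute-node description; names are illustrative only.
    print(render_kickstart(
        "http://frontend.example/dist",
        ["openssh", "mpich", "pbs"],
        post_commands=["echo installed > /tmp/node-installed"],
    ))
```

Keeping one such description per node type is one way several Kickstart files could co-exist on a front-end, as the slide describes.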

9 insert-ethers
Used to populate the “nodes” MySQL table
Parses a file (e.g., /var/log/messages) for DHCPDISCOVER messages
–Extracts MAC addr and, if not in table, adds MAC addr and hostname to table
For every new entry:
–Rebuilds /etc/hosts and /etc/dhcpd.conf
–Reconfigures NIS
–Restarts DHCP and PBS
Hostname is – - -
Configurable to change hostname
–E.g., when adding new cabinets
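
The discovery step described on this slide can be sketched roughly as follows. This is not the actual insert-ethers code: a plain dict stands in for the "nodes" table, the log format is assumed to be the usual ISC dhcpd syslog line ("DHCPDISCOVER from <MAC> via <interface>"), and the `compute-<cabinet>-<rank>` naming is a hypothetical scheme chosen for illustration, since the slide's hostname pattern did not survive in this transcript.

```python
import re

# Assumed dhcpd syslog format, e.g.:
#   "... dhcpd: DHCPDISCOVER from 00:50:8b:aa:bb:01 via eth0"
DISCOVER_RE = re.compile(
    r"DHCPDISCOVER from ((?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2})"
)


def discover_nodes(log_lines, known, basename="compute", cabinet=0):
    """Return (mac, hostname) pairs for MACs not already in `known`.

    `known` maps MAC address -> hostname and is updated in place; it stands
    in for the "nodes" database table. The naming scheme is hypothetical.
    """
    added = []
    prefix = f"{basename}-{cabinet}-"
    for line in log_lines:
        match = DISCOVER_RE.search(line)
        if match is None:
            continue  # not a DHCPDISCOVER line
        mac = match.group(1).lower()
        if mac in known:
            continue  # a booting node sends several DISCOVERs; add it once
        # Next rank = number of nodes already named in this cabinet.
        rank = sum(1 for name in known.values() if name.startswith(prefix))
        hostname = f"{prefix}{rank}"
        known[mac] = hostname
        added.append((mac, hostname))
    return added
```

Each newly added pair is the point at which the slide's follow-up actions (rebuilding /etc/hosts and /etc/dhcpd.conf, reconfiguring NIS, restarting DHCP and PBS) would be triggered.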

10 Configuration Derived from Database
[Diagram: automated node discovery. Node 0, Node 1, … Node N → insert-ethers → MySQL DB; from the DB, makehosts → /etc/hosts, makedhcp → /etc/dhcpd.conf, pbs-config-sql → PBS node list]
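
The idea of deriving configuration files as reports from one database can be sketched as below. This is an illustration only, not the Rocks makehosts or makedhcp tools: an in-memory SQLite database stands in for MySQL, and the table layout (`nodes` with `name`, `mac`, `ip`), the function names, and the sample MAC addresses are all assumptions.

```python
import sqlite3


def make_hosts(conn):
    """Render /etc/hosts-style lines from the nodes table."""
    lines = ["127.0.0.1\tlocalhost"]
    for name, ip in conn.execute("SELECT name, ip FROM nodes ORDER BY id"):
        lines.append(f"{ip}\t{name}")
    return "\n".join(lines) + "\n"


def make_dhcp(conn):
    """Render ISC dhcpd host declarations from the nodes table."""
    blocks = []
    query = "SELECT name, mac, ip FROM nodes ORDER BY id"
    for name, mac, ip in conn.execute(query):
        blocks.append(
            f"host {name} {{\n"
            f"    hardware ethernet {mac};\n"
            f"    fixed-address {ip};\n"
            f"}}\n"
        )
    return "".join(blocks)


def demo_db():
    """Build a throwaway database with two hypothetical nodes."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT, mac TEXT, ip TEXT)"
    )
    conn.executemany(
        "INSERT INTO nodes (name, mac, ip) VALUES (?, ?, ?)",
        [
            ("compute-0-0", "00:50:8b:aa:bb:01", "192.168.254.254"),
            ("compute-0-1", "00:50:8b:aa:bb:02", "192.168.254.253"),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = demo_db()
    print(make_hosts(conn))
    print(make_dhcp(conn))
```

Because every file is regenerated from the same table, adding a node means inserting one row and re-running the reports, rather than editing each file by hand.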

11 Remote re-installation: shoot-node and eKV
Rocks provides a simple method to remotely reinstall a node
–CD/floppy used to install the first time
By default, hard power cycling will cause a node to reinstall itself.
–Addressable PDUs can do this on generic hardware
With no serial (or KVM) console, we are able to watch a node as it installs (eKV), but …
–Can’t see BIOS messages at boot up
Syslog for all nodes sent to a log host (and to local disk)
–Can look at what a node was complaining about before it went offline

12 Remote re-installation: shoot-node and eKV
[Screenshot: remotely starting reinstallation on two nodes, 192.168.254.254 and 192.168.254.253]

13 Monitoring your cluster
PBS has a GUI called xpbsmon. Gives a nice graphical view of up/down state of nodes
SNMP status
–Use the extensive SNMP MIB defined by the Linux community to find out many things about a node
  Installed software
  Uptime
  Load
–Slow
Ganglia (UCB) – IP multicast-based monitoring system
–20+ different health measures
I think we’re still weak here – learning about other activities in this area (e.g., ngop, CERN activities, City Toolkit)

14 CERN
Cern.ch/hep-proj-grid-fabric
Installation tools: wwwinfo.cern.ch/pdp

