1 Open Source Cluster Application Resources (OSCAR)
Presented by Stephen L. Scott, Thomas Naughton, Geoffroy Vallée
Network and Cluster Computing, Computer Science and Mathematics Division

2 Scott_OSCAR_0611 OSCAR: Open Source Cluster Application Resources
- Snapshot of best known methods for building, programming and using clusters.
- Consortium of academic, research and industry members.

3 Over 5 years of OSCAR
- January 2000: Concept first discussed
- April 2000: Organizational meeting
  - Cluster assembly is time consuming and repetitive
  - Nice to offer a toolkit to automate
  - Leverage wealth of open source components
- April 2001: First public release
- Nov 2006: OSCAR V5.0 released at SC06

4 What does OSCAR do?
- Wizard based cluster software installation
  - Operating system
  - Cluster environment
- Automatically configures cluster components
- Increases consistency among cluster builds
- Reduces time to build / install a cluster
- Reduces need for expertise

5 Design goals
Extensibility for new Software and Projects
- Modular meta-package system / API – “OSCAR Packages”
- Keep it simple for package authors
- Open Source to foster reuse and community participation
- Fosters “spin-offs” to reuse OSCAR framework
Leverage “best practices” whenever possible
- Native package systems
- Existing distributions
- Management, system and applications
Reduce overhead for cluster management
- Keep the interface simple
- Provide basic operations of cluster software and node administration
- Enable others to re-use and extend system – deployment tool

6 OSCAR overview
Framework for cluster management
- Simplifies installation, configuration and operation
- Reduces time/learning curve for cluster build
  - Requires: pre-installed headnode with supported Linux distribution
  - Thereafter: wizard guides user through setup/install of entire cluster
Package-based framework
- Content: Software + Configuration, Tests, Docs
- Types:
  - Core: SIS, C3, Switcher, ODA, OPD, APItest, Support Libs
  - Non-core: selected & third-party (PVM, LAM/MPI, Torque/Maui, ...)
- Access: repositories accessible via OPD/OPDer

7 OSCAR packages
- Simple way to wrap software & configuration
  - “Do you offer package Foo version X?”
- Basic design goals
  - Keep simple for package authors
  - Modular packaging (each self contained)
  - Timely release/updates
- Leverage RPM + meta file + scripts, tests, docs, …
  - Recently extended to better support RPM, Debs, etc.
- Repositories for downloading via OPD/OPDer

8 OSCAR – Cluster Installation Wizard
[Figure: wizard flow from Start through Step 1 to Step 8 and Done!, with a cluster deployment monitor]

9 OSCAR components
Administration/Configuration
- SIS, C3, OPIUM, Kernel-Picker & cluster services (dhcp, nfs, ntp, ...)
- Security: Pfilter, OpenSSH
HPC Services/Tools
- Parallel Libs: MPICH, LAM/MPI, PVM, Open MPI
- OpenPBS/MAUI, Torque, SGE
- HDF5
- Ganglia, Clumon
- Other 3rd party OSCAR Packages
Core Infrastructure/Management
- System Installation Suite (SIS), Cluster Command & Control (C3), Env-Switcher
- OSCAR DAtabase (ODA), OSCAR Package Downloader (OPD)

10 C3 Power Tools
- Command-line interface for cluster system administration and parallel user tools
- Parallel execution: cexec
  - Execute across a single cluster or multiple clusters at same time
- Scatter/gather operations: cpush / cget
  - Distribute or fetch files for all node(s)/cluster(s)
- Used throughout OSCAR
  - Mechanism for cluster wide operations
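
The parallel-execution and gather pattern described for cexec can be illustrated with a minimal Python sketch. This is not C3's implementation or its command-line interface: run_on_nodes, fake_runner and the node names are hypothetical, and the runner only simulates remote execution locally.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_on_nodes(nodes: List[str], command: str,
                 runner: Callable[[str, str], str]) -> Dict[str, str]:
    """Run `command` on every node in parallel and gather output per node."""
    if not nodes:
        return {}
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        # Scatter: submit one task per node.
        futures = {node: pool.submit(runner, node, command) for node in nodes}
        # Gather: collect each node's result, keyed by node name.
        return {node: future.result() for node, future in futures.items()}


def fake_runner(node: str, command: str) -> str:
    """Hypothetical stand-in for remote execution (a real tool would use ssh or similar)."""
    return f"{node}: ran '{command}'"


if __name__ == "__main__":
    results = run_on_nodes(["node1", "node2", "node3"], "uptime", fake_runner)
    for node in sorted(results):
        print(results[node])
```

Passing the runner as a parameter keeps the fan-out logic separate from the transport, so the same sketch could be pointed at a real remote-execution function.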

11 Highlights OSCAR v5.0
- Improved distribution and architecture support
- Node installation monitor
- New network setup options and other GUI enhancements
- “Use Your Own Kernel” (UYOK) for SystemImager
- New OSCAR Packages: Open MPI, SC3, SGE, YUME, NetbootMgr
- Supported platforms:
  - x86: fc4, fc5, mdv2006, rhel4, suse10.0 (tentative)
  - x86_64: fc4, fc5, rhel4
In progress
- Diskless OSCAR & LiveCD
- OSCAR on Debian (OoD)
- OSCAR Command Line Interface (CLI)
- Native package management – network installation

12 OSCAR: proven scalability
Selected machines registered at OSCAR website
Cluster        Nodes   CPUs
Endeavor         232    928
ORNL OIC         440    880
McKenzie         264    528
SUN-CLUSTER      128    512
Cacau            205    410
Barossa          184    368
Smalley           66    264
OSCAR-SSC        130    130
Based on data taken on 11/2/2006:
- OSCAR Cluster Registration Page: http://oscar.openclustergroup.org/cluster-register?sort=node_count
- ORNL OIC User Guide: http://oic.ornl.gov/ornlindex.html#architecture
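
As a small worked example on the registration figures above, the CPUs-per-node ratio of each machine can be derived directly from the node and CPU counts. The dictionary below only restates the slide's numbers; the names CLUSTERS and cpus_per_node are illustrative.

```python
# (nodes, CPUs) per cluster, copied from the slide's 11/2/2006 data.
CLUSTERS = {
    "Endeavor": (232, 928),
    "ORNL OIC": (440, 880),
    "McKenzie": (264, 528),
    "SUN-CLUSTER": (128, 512),
    "Cacau": (205, 410),
    "Barossa": (184, 368),
    "Smalley": (66, 264),
    "OSCAR-SSC": (130, 130),
}


def cpus_per_node(nodes: int, cpus: int) -> float:
    """Average number of CPUs per node for one cluster."""
    return cpus / nodes


if __name__ == "__main__":
    for name, (nodes, cpus) in CLUSTERS.items():
        ratio = cpus_per_node(nodes, cpus)
        print(f"{name:<12} {nodes:>4} nodes {cpus:>4} CPUs {ratio:.0f} CPUs/node")
```

On these figures every machine works out to a whole number of CPUs per node: 1, 2 or 4.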

13 More OSCAR information…
- Open Cluster Group: www.OpenClusterGroup.org
- Home Page: oscar.OpenClusterGroup.org
- Development Page: svn.oscar.openclustergroup.org/trac/oscar
- Mailing Lists: oscar-users@lists.sourceforge.net, oscar-devel@lists.sourceforge.net
- OSCAR Symposium: www.csm.ornl.gov/oscar07
Research supported by the Mathematics, Information and Computational Sciences Office, Office of Advanced Scientific Computing Research, Office of Science, U. S. Department of Energy, under contract No. DE-AC05-00OR22725 with UT-Battelle, LLC.

14 OSCAR “flavors”
- HA-OSCAR
- NEC's OSCAR-Pro
- SSS-OSCAR
- SSI-OSCAR

15 HA-OSCAR: RAS Management for HPC cluster: Self-Awareness
- The first known field-grade open source HA Beowulf cluster release
- Self-configuration Multi-head Beowulf system
- HA and HPC clustering techniques to enable critical HPC infrastructure
- Services: Active/Hot Standby
- Self-healing with 3-5 sec automatic failover time

16 NEC's OSCAR-Pro
Presented at OSCAR'06 keynote by Erich Focht (NEC)
- Commercial enhancements
- Leverage open source tool
- Joined project / contributions to OSCAR core
- Integrate additions when applicable
- Feedback and direction based on user needs

17 Scalable System Software
Problems
- Computer centers use incompatible, ad hoc set of systems tools
- Tools are not designed to scale to multi-Teraflop systems
- Duplication of work to try and scale tools
- System growth vs. Administrator growth
Goals
- Define standard interfaces for system components
- Create scalable, standardized management tools
- (Subsequently) reduce costs & improve efficiency at centers
Participants
- DOE Labs: ORNL, ANL, LBNL, PNNL, SNL, LANL, Ames
- Academics: NCSA, PSC, SDSC
- Industry: IBM, Cray, Intel, SGI

18 SSS project overview
Map out functional areas
- Schedulers, job managers
- System monitors
- Accounting and user management
- Checkpoint/restart
- Build and configure systems
Standardize the system interfaces
- Open forum of universities, labs, industry representatives
- Define component interfaces in XML
- Develop communication infrastructure
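
To make the idea of defining component interfaces in XML more concrete, here is a minimal Python sketch of one component building an XML message and another parsing it. The element and attribute names (request, component, action, node) are invented for illustration and are not taken from the SSS interface specifications.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional


def build_request(component: str, action: str, node: str) -> str:
    """Serialize a hypothetical component request as an XML string."""
    root = ET.Element("request", {"component": component, "action": action})
    ET.SubElement(root, "node").text = node
    return ET.tostring(root, encoding="unicode")


def parse_request(message: str) -> Dict[str, Optional[str]]:
    """Parse the hypothetical request back into a plain dictionary."""
    root = ET.fromstring(message)
    return {
        "component": root.get("component"),
        "action": root.get("action"),
        "node": root.findtext("node"),
    }


if __name__ == "__main__":
    message = build_request("scheduler", "query-state", "node1")
    print(message)
    print(parse_request(message))
```

Because both sides agree only on the XML shape, the sender and receiver could be written in different languages, which is the property the slide's language-mixing claim relies on.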

19 Components written in any mixture of C, C++, Java, Perl, and Python can be integrated into the Scalable Systems Software Suite
[Figure: suite components connected through standard XML interfaces. Labels: Node state manager, Meta scheduler, Meta monitor, Meta manager, Meta services, Service directory, Event manager, Allocation management, System and job monitor, Accounting, Node configuration and build manager, Usage reports, Process manager, Checkpoint restart, Hardware infrastructure manager, Authentication, Communication, Scheduler, Job queue manager]

20 SSS-OSCAR Components
Component            Description
Bamboo               Queue/Job Manager
BLCR                 Berkeley Checkpoint/Restart
Gold                 Accounting & Allocation Management System
LAM/MPI (w/ BLCR)    Checkpoint/Restart enabled MPI
MAUI-SSS             Job Scheduler
SSSLib               SSS Communication library (includes: SD, EM, PM, BCM, NSM, NWI)
Warehouse            Distributed System Monitor
MPD2                 MPI Process Manager

21 Single System Image Open Source Application Resources (SSI-OSCAR)
- Easy use thanks to SSI systems
  - SMP illusion
  - High performance
  - Fault Tolerance
- Easy management thanks to OSCAR
  - Automatic cluster install/update

22 Contacts
Thomas Naughton
Network and Cluster Computing, Computer Science and Mathematics Division
(865) 576-4184, naughtont@ornl.gov
Stephen L. Scott
Network and Cluster Computing, Computer Science and Mathematics Division
(865) 574-3144, scottsl@ornl.gov
Geoffroy Vallée
Network and Cluster Computing, Computer Science and Mathematics Division
(865) 574-3152, valleegr@ornl.gov

