Developing a Cluster Strategy
NPACI All Hands Meeting Panel, Feb 11, 2000
David E. Culler
Computer Science Division, University of California, Berkeley

Slide 2: UCB Millennium Cluster of Clusters
x86 + Myrinet platforms with Gigabit Ethernet (GbE) inter-networking.
[Diagram: departmental clusters (Math, Bio, CE, Physics, Astro) ranging from PII 8x2 and PIII 32x2 up to PIII-X 64x4, plus a ½ TB DLIB storage cluster, interconnected by GbE; the federation connects to Ninja services, mobile services, kiosks, the NOW cluster, and external networks (NTON, Internet-2, SuperNet).]
Distributed ownership, allocation, and management.

Slide 3: Vineyard Cluster Architecture
Distributed resource utilization and management in a "Vineyard" of clusters.
[Diagram: a layered stack. Applications/services (ISPACE/Kiosks) on top; tools (REXEC, MPI, VEXEC, PBS, I/O management/monitoring); a stride scheduler (see the sketch below); NT / Linux (2.2.x) over VIA / GM, GbE, and multicast; Rootstock distribution at the base.]
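
The stride scheduler named in the slide is a proportional-share algorithm: each client holds tickets, and each CPU quantum goes to whichever client has the smallest "pass" value. A minimal Python sketch of the core bookkeeping (illustrative, not the Millennium implementation; the names are ours):

```python
# Minimal stride-scheduling sketch: each client holds "tickets" and
# receives CPU quanta in proportion to its ticket count over time.
STRIDE1 = 1 << 20                         # global constant; stride = STRIDE1 / tickets

class Client:
    def __init__(self, name, tickets):
        self.name = name
        self.stride = STRIDE1 // tickets  # fewer tickets -> larger stride
        self.pass_value = self.stride     # virtual time of next service

def schedule(clients, quanta):
    """Yield `quanta` scheduling decisions, always picking the minimum pass."""
    for _ in range(quanta):
        c = min(clients, key=lambda cl: cl.pass_value)
        yield c.name                      # c runs for one quantum
        c.pass_value += c.stride          # advance its virtual time

clients = [Client("sim", 3), Client("indexer", 1)]
# Over 12 quanta, "sim" runs 9 times and "indexer" 3 times: a 3:1 share.
print(list(schedule(clients, 12)))
```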

Slide 4: Clusters "own" HPC

Slide 5: Fundamental Advantages of Clusters
- Cost
- Performance
- Performance / cost
- Track the leading edge of market technology
- Incremental scalability
- Availability
- Tremendous I/O performance
- Wide-area network performance
  - competitive internal network performance too
- Allow specialization of networked services

Slide 6: Fundamental Challenges
- Management: a complete system on every node
  - need scalable administration
- Incremental scalability & availability =>
  - heterogeneity
  - some parts inoperable at any time
- The cluster projects are making great progress in this area
  - e.g., Millennium Rootstock
- Cluster tools are what you want for managing the desktops across your department

Slide 7: CS&E HPC hampered by a "self-centered" usage model
- Have my own application for my studies
- Want the entire machine to myself
- Want it now
Think "services". Think "software". The value is in your application: make it a service, make it available to the scientific community, and put it on a cluster to deliver results 24x7x52.

Slide 8: Example: TCAD Simulation Service
- star formation simulation
- earthquake simulations
- phylogeny, BLAST, ...

Slide 9: Extreme Example
- UCB Millennium / NOW has delivered 70 CPU-years!
- A simple special case, but...
- Engineered for portability, adaptability, availability

Slide 10: What should NPACI do?
To be relevant:
- become a "Center of Expertise" for clusters
- draw expertise toward the center for ease of dissemination
- facilitate and encourage building clusters among the partners
- invest in an interesting cluster "close to home"
  - cheap! Graft Millennium
- invest in people to understand the implications
To lead:
- pioneer widespread computational science and engineering services
- InfiniBand

Slide 11: from e-commerce to ...

Slide 12: Technical Backup Slides

Slide 13: Rootstock Mechanics
[Diagram: a Cluster System Distribution Center leases per-cluster builds over the IP network to K clusters; each cluster bootstraps from a downloaded CS build floppy.]
1. Cluster stock: Rootstock build pages; full current Linux with all fixes and packages; SSL, SSH; cluster drivers; cluster system layers (rexec, mpe, pbs); optional software ($); cluster kernel mods.
2. Make the CS "graft": specify the IP address; package removals (dhcp, dns, nis, ...); sanity check and build (resolv.conf, /etc/hosts, ...); construct the cluster build (lease); download the CS build floppy.
3. CS power-on build: transfer and localize the distribution; add local admin scripts; produce the node build floppy.
4. Node power-on build: local stock from the CS.
5. Cluster update button (future): 2nd dialtone, CFEngine, rolling update.
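
As a rough illustration of step 2, the sketch below is hypothetical (the real Rootstock used web build pages and floppy images); it shows the kind of per-cluster localization a "graft" applies to the generic stock build:

```python
# Hypothetical sketch of Rootstock's "graft" step: specialize the
# generic cluster stock for one cluster before leasing the build.
STOCK_PACKAGES = {"linux-base", "ssl", "ssh", "rexec", "mpe", "pbs",
                  "dhcp", "dns", "nis"}

def graft(cluster, cs_ip, remove_pkgs, nodes):
    """Produce a per-cluster build description (the 'lease')."""
    packages = STOCK_PACKAGES - set(remove_pkgs)      # e.g. drop dhcp/dns/nis
    build = {
        "cluster": cluster,
        "cs_address": cs_ip,                          # cluster server address
        "packages": sorted(packages),
        "resolv.conf": f"nameserver {cs_ip}",
        "/etc/hosts": [f"{ip}\t{name}" for name, ip in nodes.items()],
    }
    assert "ssh" in packages, "sanity check: remote administration must survive"
    return build

lease = graft("physics", "10.0.0.1",
              remove_pkgs=["dhcp", "dns", "nis"],
              nodes={"node0": "10.0.0.10", "node1": "10.0.0.11"})
```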

Slide 14: REXEC / VEXEC
Components: rexecd, rexec & vexecd.
[Diagram: rexecd runs on every node (A, B, C, D); vexecd daemons implementing different policies (Policy A, Policy B) listen on a cluster IP multicast channel. The user runs `%rexec -n 2 -r 3 indexer`; a minimum-price vexecd replies "Nodes A, B", and rexec runs the indexer on nodes A and B at 3 credits/min.]
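
A minimal sketch of the flow above (illustrative only; the real rexec/rexecd/vexecd were daemons exchanging messages over IP multicast): the client announces a request for n nodes at a credit rate, a policy daemon ranks candidate nodes, and the client then starts the job on the chosen nodes.

```python
# Illustrative sketch of the rexec/vexecd node-selection flow; a plain
# function call stands in for the multicast request/reply exchange.

def vexecd_select(request, node_prices, policy="minimum"):
    """Policy daemon: choose nodes for a request heard on the channel."""
    if policy == "minimum":
        ranked = sorted(node_prices, key=node_prices.get)  # cheapest first
    else:
        ranked = list(node_prices)
    return ranked[:request["nodes"]]

def rexec(command, n, rate, node_prices):
    """Client side: announce the request, then run on the chosen nodes."""
    chosen = vexecd_select({"nodes": n, "rate": rate}, node_prices)
    for node in chosen:
        # in the real system: contact rexecd on `node` to spawn the process
        print(f"run {command} on node {node} at {rate} credits/min")
    return chosen

# %rexec -n 2 -r 3 indexer
rexec("indexer", n=2, rate=3, node_prices={"A": 1, "B": 2, "C": 5, "D": 4})
```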

Slide 15: Computational Economy
- Market-based approach to resource allocation
  - optimizes for user value
[Diagram: applications (expressing user value) enter through an economic front end; access modules connect via APIs to resource managers (time share, batch queue) that control the underlying resources.]
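
One simple instance of the market-based idea (a sketch, not the actual Millennium economy): allocate each resource's capacity in proportion to the credits jobs bid, so the allocation tracks expressed user value.

```python
# Sketch of bid-proportional allocation: each job bids credits/min and
# receives a share of capacity proportional to its bid.
def allocate(bids, capacity=1.0):
    """Map {job: bid} to {job: share}; shares sum to `capacity`."""
    total = sum(bids.values())
    return {job: capacity * bid / total for job, bid in bids.items()}

# A job that values its results more bids more and gets more CPU.
print(allocate({"indexer": 3.0, "sim": 1.0}))   # {'indexer': 0.75, 'sim': 0.25}
```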