Cluster Computers

Introduction

Cluster computing
- Standard PCs or workstations connected by a fast network
- Good price/performance ratio
- Exploit existing (idle) machines or use (new) dedicated machines

Cluster computers versus supercomputers
- Processing power is similar: both are based on microprocessors
- Communication performance was the key difference
- Modern networks (Myrinet, Infiniband) have bridged this gap

Overview

Cluster computers at our department
- 128-node Pentium Pro / Myrinet cluster
- 72-node dual-Pentium-III / Myrinet-2000 cluster
- Part of a wide-area system: the Distributed ASCI Supercomputer

Network interface protocols for Myrinet
- Low-level systems software
- Partly runs on the network interface card (firmware)

Distributed ASCI Supercomputer (DAS-1, 1997-2001)

Node configuration (DAS-1)
- 200 MHz Pentium Pro
- 128 MB memory
- 2.5 GB disk
- Fast Ethernet (100 Mbit/s)
- Myrinet (1.28 Gbit/s, full duplex)
- Operating system: Red Hat Linux

DAS-2 cluster (2002-now)
- 72 nodes, each with 2 CPUs (144 CPUs in total)
- 1 GHz Pentium-III, 1 GB memory per node
- 20 GB disk
- Fast Ethernet (100 Mbit/s)
- Myrinet-2000 (2 Gbit/s, crossbar)
- Operating system: Red Hat Linux
- Part of the wide-area DAS-2 system (5 clusters with 200 nodes in total)

[Diagram: each node connects to both an Ethernet switch and a Myrinet switch]

Myrinet

Components:
- 8-port switches
- A network interface card in each node (on the PCI bus)
- Electrical cables: reliable links

Myrinet switches:
- 8 x 8 crossbar switch
- Each port connects to a node (network interface) or to another switch
- Source-based, cut-through routing
- Less than 1 microsecond switching delay
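Source-based routing means the sending host, not the switches, decides the path: the packet starts with one routing byte per hop, and each switch strips the leading byte to choose its output port. The following toy sketch illustrates the idea; the port numbers and header layout are made up for illustration and are not the actual Myrinet encoding.

```python
# Illustrative sketch of source-based routing (hypothetical port numbers,
# not the real Myrinet header format): the sender prepends one routing
# byte per hop; each switch consumes the leading byte to select an
# output port and forwards the rest of the packet unchanged.

def forward(packet):
    """One switch hop: strip the leading route byte, return (port, rest)."""
    out_port, rest = packet[0], packet[1:]
    return out_port, rest

# A packet that crosses two switches before reaching the destination NI:
route = [3, 5]                 # output port to take at each switch
payload = list(b"hello")
packet = route + payload

port1, packet = forward(packet)   # first switch forwards on port 3
port2, packet = forward(packet)   # second switch forwards on port 5
assert (port1, port2) == (3, 5) and bytes(packet) == b"hello"
```

Because the route is fixed in the header, a switch needs no routing tables and can start forwarding as soon as the leading byte arrives, which is what makes sub-microsecond cut-through switching possible.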

24-node DAS-1 cluster

128-node DAS-1 cluster A ring topology would have required: 22 switches (2 ports per switch for the ring, leaving 6 ports for PCs) Poor diameter: 11 Poor bisection width: 2

Topology of the 128-node cluster
- 4 x 8 grid with wrap-around
- Each switch is connected to 4 other switches and 4 PCs
- 32 switches (128/4)
- Diameter: 6
- Bisection width: 8
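The diameter figure can be checked mechanically: a 4 x 8 grid with wrap-around is a torus, and a breadth-first search from every switch gives the longest shortest path. This is a small sketch written for this transcript, not code from the original work.

```python
# BFS check of the slide's numbers for the 4 x 8 wrap-around grid (torus):
# 32 switches, diameter 6.
from collections import deque

def torus_neighbors(x, y, w, h):
    # Each switch links to 4 others; wrap-around via modular arithmetic.
    return [((x + 1) % w, y), ((x - 1) % w, y),
            (x, (y + 1) % h), (x, (y - 1) % h)]

def diameter(w, h):
    nodes = [(x, y) for x in range(w) for y in range(h)]
    worst = 0
    for src in nodes:                    # BFS from every switch
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in torus_neighbors(*u, w, h):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        worst = max(worst, max(dist.values()))
    return worst

print(diameter(4, 8))   # 6, matching the slide
# Bisection width: cutting the torus across the long dimension severs
# 2 wrap-around links per row times 4 rows = 8 links, versus 2 for a ring.
```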

Myrinet interface board

Hardware:
- 40 MHz custom CPU (LANai 4.1)
- 1 MB SRAM
- 3 DMA engines (send, receive, to/from host)
- Full-duplex Myrinet link
- PCI bus interface

Software:
- LANai Control Program (LCP)

Network interface protocols for Myrinet
- Myrinet has a programmable network interface (NI) processor, which gives the protocol designer much flexibility
- An NI protocol is low-level software running partly on the NI and partly on the host
- Used to implement higher-level programming languages and libraries
- Critical for performance: the target is a latency of a few microseconds and a throughput of tens of MB/s
- The NI is mapped into user space to avoid operating-system overhead
- Goal: bring supercomputer communication performance to clusters
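The point of mapping the NI into user space is that the send path needs no system call: the host writes descriptors directly into memory that the NI firmware polls. The toy model below (all names and the queue layout are invented for illustration) shows that send path as plain memory operations.

```python
# Toy model of a user-space send path (hypothetical layout, not the actual
# DAS protocol): the host posts descriptors into a shared ring that the
# NI firmware polls, so sending needs no system call or interrupt.

class SendQueue:
    """Descriptor ring in memory shared by the host and the NI."""
    def __init__(self, size=8):
        self.slots = [None] * size
        self.head = 0   # advanced by the host
        self.tail = 0   # advanced by the NI firmware

    def host_post(self, payload):
        # Plain stores into mapped memory; the OS is not involved.
        self.slots[self.head % len(self.slots)] = payload
        self.head += 1

    def ni_poll(self):
        # The firmware's main loop checks for newly posted work.
        if self.tail == self.head:
            return None
        payload = self.slots[self.tail % len(self.slots)]
        self.tail += 1
        return payload

q = SendQueue()
q.host_post(b"msg0")
q.host_post(b"msg1")
assert q.ni_poll() == b"msg0" and q.ni_poll() == b"msg1" and q.ni_poll() is None
```

On real hardware the "plain stores" are writes over the PCI bus into the NI's SRAM, but the structure is the same: producer/consumer indices and no kernel transition per message.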

Basic NI protocol

Enhancements (see paper)
- Optimize throughput by using programmed I/O instead of DMA
- Make communication reliable using flow control
- Use a combination of polling and interrupts to reduce message-receipt overhead
- Efficient multicast communication in firmware
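The polling/interrupt combination trades CPU time for latency: polling catches a prompt reply cheaply, while an interrupt-style blocking wait avoids burning cycles when the message is slow to arrive. A minimal sketch of that pattern, with the budget and event mechanism chosen arbitrarily for illustration (the paper's actual scheme may differ):

```python
# Hedged sketch of a poll-then-interrupt receive path (thresholds and
# names are made up): poll briefly for an expected message, then fall
# back to a blocking wait, which stands in for an interrupt here.
import threading
import time

def receive(msg_ready, poll_budget_s=0.0001):
    deadline = time.monotonic() + poll_budget_s
    while time.monotonic() < deadline:     # cheap polling: lowest latency
        if msg_ready.is_set():
            return "got message by polling"
    msg_ready.wait()                        # give up the CPU ("interrupt")
    return "got message via interrupt"

ev = threading.Event()
threading.Timer(0.05, ev.set).start()       # message arrives "late"
print(receive(ev))                          # prints: got message via interrupt
```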

Performance on DAS-2
- 9.6 μs one-way null-latency
- 168 MB/s throughput
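These two numbers can be combined with the standard linear cost model t(n) = latency + n / bandwidth to estimate the half-performance message length n_1/2 = latency x bandwidth, i.e. the message size at which half the peak throughput is reached:

```python
# Half-performance message length from the measured DAS-2 numbers,
# using the simple linear cost model t(n) = latency + n / bandwidth.
latency = 9.6e-6           # seconds (one-way null-latency)
bandwidth = 168e6          # bytes/second (peak throughput)

n_half = latency * bandwidth
print(round(n_half))       # 1613 bytes: shorter messages are dominated
                           # by latency, longer ones by bandwidth
```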

MareNostrum: the largest Myrinet cluster in the world
- IBM system at the Barcelona Supercomputing Center
- 4812 PowerPC 970 processors, 9.6 TB memory