
1 CENG 546 Dr. Esma Yıldırım

2 Copyright © 2012, Elsevier Inc. All rights reserved.
What is a computing cluster?
- A computing cluster is a collection of interconnected, stand-alone/complete computers that work cooperatively as a single, integrated computing resource. A cluster exploits parallelism at the job level and supports distributed computing with higher availability.
- A typical cluster:
  - Merges multiple system images into a single-system image (SSI) at certain functional levels
  - Applies low-latency communication protocols
  - Is more loosely coupled than an SMP with an SSI

3
- It is a distributed/parallel computing system
- It is constructed entirely from commodity subsystems; all subcomponents can be acquired commercially and separately
- Computing elements (nodes) are employed as fully operational, standalone mainstream systems
- Two major subsystems: compute nodes and the system area network (SAN)
- Employs industry-standard interfaces for integration
- Uses industry-standard software for the majority of services
- Incorporates additional middleware for interoperability among elements
- Uses software for coordinated programming of elements in parallel

4 Multicomputer Clusters:
- Cluster: a network of computers supported by middleware and interacting by message passing (see the message-passing sketch below)
- PC cluster (most Linux clusters)
- Workstation cluster (NOW, COW)
- Server cluster or server farm
- Cluster of SMPs or ccNUMA systems
- Cluster-structured massively parallel processors (MPP), about 85% of the Top-500 systems
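A minimal sketch of that message-passing style, assuming an MPI installation (e.g., MPICH or Open MPI); the value and tag used are illustrative choices, not anything prescribed by the slides:

/* Hedged sketch: point-to-point message passing between two cluster nodes.
   Compile with mpicc, launch with something like `mpirun -np 2 ./a.out`.     */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;                                   /* data produced on node 0 */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();
    return 0;
}

Each rank runs in its own address space, typically on its own node, so all sharing happens through explicit messages over the cluster network.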

5

6
- System availability (HA): a cluster offers inherently high system availability due to redundancy of hardware, operating systems, and applications
- Hardware fault tolerance: a cluster has some degree of redundancy in most system components, including both hardware and software modules
- OS and application reliability: multiple copies of the OS and applications run concurrently, and reliability improves through this redundancy
- Scalability: servers can be added to a cluster, or more clusters added to a network, as application needs arise
- High performance: running cluster-enabled programs yields higher throughput

7
- The ability to deliver proportionally greater sustained performance through increased system resources
- Strong scaling
  - Fixed-size application problem: application size remains constant as system size increases
- Weak scaling
  - Variable-size application problem: application size scales proportionally with system size (a speedup sketch contrasting strong and weak scaling follows this list)
- Capability computing
  - In its purest form: strong scaling
  - Marketing claims tend toward this class
- Capacity computing
  - Throughput computing; includes job-stream workloads
  - In its simplest form: weak scaling
- Cooperative computing
  - Interacting and coordinating concurrent processes
  - Not a widely used term; also called "coordinated computing"
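A minimal sketch of that contrast, assuming a hypothetical parallel fraction f of the work (the value 0.95 is illustrative): Amdahl's law models strong-scaling speedup on a fixed-size problem, while Gustafson's law models the weak-scaling (scaled) speedup when the problem grows with the machine.

/* Hedged sketch: strong vs. weak scaling for an assumed parallel fraction f.
   Amdahl (fixed problem size):   S(n) = 1 / ((1 - f) + f/n)
   Gustafson (scaled problem):    S(n) = (1 - f) + f * n                      */
#include <stdio.h>

int main(void) {
    const double f = 0.95;                        /* assumed parallel fraction */
    const int sizes[] = { 1, 4, 16, 64, 256 };    /* illustrative system sizes */

    printf("%8s %12s %12s\n", "n", "strong", "weak");
    for (int i = 0; i < 5; i++) {
        int n = sizes[i];
        double strong = 1.0 / ((1.0 - f) + f / n);   /* Amdahl speedup    */
        double weak   = (1.0 - f) + f * n;           /* Gustafson speedup */
        printf("%8d %12.2f %12.2f\n", n, strong, weak);
    }
    return 0;
}

With f = 0.95 the strong-scaling speedup saturates near 1/(1 - f) = 20x no matter how many processors are added, while the weak-scaling speedup keeps growing with n, which is why capability claims lean on strong scaling and capacity/throughput workloads on weak scaling.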

8
- Peak floating-point operations per second (flops); an arithmetic sketch relating peak and sustained flops follows this list
- Peak instructions per second (ips)
- Sustained throughput: average performance over a period of time
  - flops, Mflops, Gflops, Tflops, Pflops (megaflops, gigaflops, teraflops, petaflops)
  - ips, Mips, ops, Mops …
- Cycles per instruction (cpi); alternatively, instructions per cycle (ipc)
- Memory access latency, measured in cycles (or seconds)
- Memory access bandwidth, measured in bytes per second (Bps) or bits per second (bps), e.g., gigabytes per second (GBps, GB/s)
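A minimal arithmetic sketch relating the metrics above; every node parameter here (sockets, cores, clock, flops per cycle, efficiency) is an illustrative assumption, not a figure from the slides:

/* Hedged sketch: peak vs. sustained floating-point performance of an assumed
   compute node. Peak flops is the product of the hardware factors below;
   sustained performance is quoted as a fraction of peak.                     */
#include <stdio.h>

int main(void) {
    double sockets          = 2;       /* assumed CPU sockets per node      */
    double cores_per_socket = 16;      /* assumed cores per socket          */
    double clock_ghz        = 2.5;     /* assumed clock rate in GHz         */
    double flops_per_cycle  = 16;      /* assumed SIMD+FMA flops per cycle  */
    double efficiency       = 0.70;    /* assumed sustained/peak ratio      */

    double peak_gflops = sockets * cores_per_socket * clock_ghz * flops_per_cycle;
    printf("peak:      %8.1f Gflops\n", peak_gflops);
    printf("sustained: %8.1f Gflops (at %.0f%% efficiency)\n",
           peak_gflops * efficiency, efficiency * 100.0);
    return 0;
}

The same multiply-the-factors pattern applies to memory bandwidth (channels x transfer rate x bytes per transfer), reported in Bps or bps.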

9
- I/O interface
- Memory interface
- Cache hierarchy
- Register sets
- Control
- Execution pipeline
- Arithmetic logic units

10
- A general class of system that integrates multiple processors into an interconnected ensemble
- MIMD: Multiple Instruction stream, Multiple Data stream
- Different memory models:
  - Distributed memory: nodes support separate address spaces
  - Shared memory
    - Symmetric multiprocessor: UMA (uniform memory access), cache coherent
    - Distributed shared memory: NUMA (non-uniform memory access), cache coherent
  - PGAS (partitioned global address space): NUMA, not cache coherent
  - Hybrid: ensemble of distributed shared-memory nodes
- Massively parallel processor (MPP)

11
- MPP: general class of large-scale multiprocessor; represents the largest systems
  - IBM BG/L
  - Cray XT3
- Distinguished by memory strategy:
  - Distributed memory
  - Distributed shared memory (cache coherent)
  - Partitioned global address space
- Custom interconnect network
- Potentially heterogeneous: may incorporate accelerators to boost peak performance

12

13

14 IBM BlueGene/L Supercomputer: the world's fastest message-passing MPP when built in 2005. Built jointly by IBM and LLNL teams and funded by the US DoE ASCI research program.

15
- Building block for large MPPs
- Multiple processors: 2 to 32 processors, now multicore
- Uniform memory access (UMA) shared memory: every processor has equal access, in equal time, to all banks of main memory
- Cache coherent: multiple copies of a variable are kept consistent by hardware (see the shared-memory sketch below)
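A minimal shared-memory sketch, assuming an OpenMP-capable C compiler (an assumption, not something the slides prescribe): every thread on the SMP node updates the same location in the single shared address space, and the node's cache-coherence hardware keeps each core's cached copy of it consistent.

/* Hedged sketch: UMA shared-memory programming on one SMP node.
   Compile with, e.g., `cc -fopenmp`; the thread count is an illustrative choice. */
#include <omp.h>
#include <stdio.h>

int main(void) {
    long counter = 0;                  /* lives once, in the shared address space */

    #pragma omp parallel num_threads(8)
    {
        /* All threads see and update the same memory location; cache coherence
           keeps cached copies consistent, and the atomic directive makes each
           read-modify-write race-free.                                          */
        for (int i = 0; i < 1000; i++) {
            #pragma omp atomic
            counter++;
        }
    }

    printf("counter = %ld (8 threads x 1000 increments each)\n", counter);
    return 0;
}

Contrast this with the earlier MPI sketch: there the processes have separate address spaces, so nothing is shared unless it is sent in a message.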

16

17 [Block diagram of an SMP node. Legend: MP = microprocessor; L1, L2, L3 = caches; M1..Mn-1 = memory banks; S = storage; NIC = network interface card. The node also includes a memory controller and PCI-e, Ethernet, USB peripheral, and JTAG connections.]

18 Distributed Shared Memory - Non-Uniform Memory Access (NUMA)

19 [Figure: a 64-processor constellation and a 64-processor commodity cluster, each built around a system area network.]
- An ensemble of N nodes, each comprising p computing elements
- The p elements are tightly coupled via shared memory (e.g., SMP, DSM)
- The N nodes are loosely coupled, i.e., distributed memory
- In a constellation, p is greater than N
- The distinction is which layer gives us the most power through parallelism (see the hybrid sketch below)
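A minimal sketch of those two layers, assuming MPI and OpenMP are available (an assumed toolchain, not one named on this slide): the N loosely coupled nodes appear as MPI ranks with separate address spaces, and the p tightly coupled elements within a node appear as OpenMP threads sharing memory.

/* Hedged sketch: hybrid parallelism over an ensemble of N nodes (MPI ranks),
   each with p shared-memory elements (OpenMP threads). Launch with something
   like `mpirun -np N ./a.out` and OMP_NUM_THREADS=p set on each node.         */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int provided, rank, nranks;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    #pragma omp parallel
    {
        /* Threads in this block share the node's memory; ranks do not. */
        printf("node %d of %d, element %d of %d\n",
               rank, nranks, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}

Whether most of the parallelism lives in the N ranks or in the p threads per node is exactly the cluster-versus-constellation distinction drawn above.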

20 Model of computation, from problem to device:
- Science problems: environmental modeling, physics, computational chemistry, etc.
- Applications: coastal modeling, black hole simulations, etc.
- Algorithms: PDE, Gaussian elimination, 12 Dwarves, etc.
- Program source code
- Programming languages: Fortran, C, C++, UPC, Fortress, X10, etc.
- Compilers: Intel C/C++/Fortran compilers, PGI C/C++/Fortran, IBM XLC, XLC++, XLF, etc.
- Runtime systems: Java runtime, MPI, etc.
- Operating systems: Linux, Unix, AIX, etc.
- Systems architecture: vector, SIMD array, MPP, commodity cluster
- Firmware: motherboard chipset, BIOS, NIC drivers
- Microarchitectures: Intel/AMD x86, SUN SPARC, IBM Power 5/6
- Logic design: RTL
- Circuit design: ASIC, FPGA, custom VLSI
- Device technology: NMOS, CMOS, TTL, optical

21

22

