2. Computer Clusters for Scalable Parallel Computing


2.1 Clustering for Massive Parallelism
A computer cluster is a collection of interconnected stand-alone computers that can work together collectively and cooperatively as a single integrated computing resource pool. Clustering exploits massive parallelism at the job level and achieves high availability (HA) through stand-alone operations.

2.1.1 Cluster Development Trends
Milestone Cluster Systems
Hot research challenges: fast communication, job scheduling, single-system image (SSI), HA

2.1.2 Design Objectives of Computer Clusters

Ch 2-1. Computer Cluster

Scalability

Packaging
Cluster nodes can be packaged in a compact or a slack fashion. In a compact cluster, the nodes are closely packaged in one or more racks sitting in a room. In a slack cluster, the nodes are attached to their usual peripherals, and they may be located in different rooms, different buildings, or even remote regions. Packaging directly affects communication wire length, and thus the selection of the interconnection technology used.

Control
A cluster can be controlled or managed in a centralized or decentralized fashion. A compact cluster normally has centralized control, while a slack cluster can be controlled either way.

Homogeneity

Security
Intracluster communication can be either exposed or enclosed.

Dedicated vs. Enterprise Clusters
An enterprise cluster is mainly used to utilize idle resources in the nodes.

2.1.3 Fundamental Cluster Design Issues

Scalable Performance

Single-System Image (SSI)

Availability Support
Clusters can provide cost-effective HA capability with lots of redundancy in processors, memory, disks, I/O devices, networks, and operating system images.

Cluster Job Management
Job management software is required to provide batching, load balancing, parallel processing, and other functionality.

Internode Communication

Fault Tolerance and Recovery

Cluster Family Classification
Compute clusters: These are clusters designed mainly for collective computation over a single large job.
High-availability clusters: HA clusters are designed to be fault-tolerant and achieve HA of service. HA clusters operate with many redundant nodes to sustain faults or failures.
Load-balancing clusters: These clusters aim for higher resource utilization through load balancing among all participating nodes in the cluster.
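
The load-balancing idea above can be illustrated with a minimal Python sketch. It is not taken from the slides: the function name `assign_jobs`, the node names, and the greedy least-loaded policy are all assumptions chosen for illustration, and real cluster job managers use far richer policies.

```python
import heapq


def assign_jobs(job_costs, node_names):
    """Greedy least-loaded placement (hypothetical helper, not from the slides).

    Each job is sent to the node whose accumulated load is currently smallest.
    Returns (placement, loads): job indices per node and final load per node.
    """
    # Heap entries are (current_load, node_name); ties break on the node name.
    heap = [(0.0, name) for name in node_names]
    heapq.heapify(heap)
    placement = {name: [] for name in node_names}

    for job_id, cost in enumerate(job_costs):
        load, name = heapq.heappop(heap)       # least-loaded node right now
        placement[name].append(job_id)
        heapq.heappush(heap, (load + cost, name))

    loads = {name: load for load, name in heap}
    return placement, loads


if __name__ == "__main__":
    placement, loads = assign_jobs([4, 3, 2, 1], ["n0", "n1"])
    print(placement)
    print(loads)
```

The heap keeps the least-loaded node at the front, so each placement costs O(log n) in the number of nodes. The job costs are assumed to be known in advance, which is rarely true in practice.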

2.1.4 Analysis of the Top 500 Supercomputers

Architectural Evolution
SMP -> MPP -> cluster systems

Speed Improvement over Time
Pflops (petaflops): 10^15 floating-point operations per second
Performance plot of the Top 500 supercomputers from 1993 to 2010 (Fig. 2.2).

Operating System Trends in the Top 500
Linux (82%)

The Top Five Systems in 2010
Table 2.2

Country Share and Application Share
Major increases in supercomputer applications are in the areas of database, research, finance, and information services.

2.2 Computer Cluster and MPP Architecture

2.2.1 Cluster Organization and Resource Sharing
A Basic Cluster Architecture

Figure 2.4 shows the basic architecture of a computer cluster over PCs or workstations.

Resource Sharing in Clusters (Fig. 2.5)
The shared-nothing configuration in Part (a) simply connects two or more autonomous computers via a LAN such as Ethernet. A shared-disk cluster is shown in Part (b). This is what most business clusters desire so that they can enable recovery support in case of node failure. The shared-memory cluster in Part (c) is much more difficult to realize. The nodes could be connected by a Scalable Coherent Interface (SCI) ring, which is connected to the memory bus of each node through an NIC module.

Node Architecture and MPP Packaging
Cluster nodes are classified into two categories: compute nodes and service nodes. Table 2.3 introduces two example compute node architectures: homogeneous design and hybrid node design.

2.2.3 Cluster System Interconnects
High-Bandwidth Interconnects
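
Returning to the resource-sharing configurations of Fig. 2.5, the following Python sketch contrasts what data stays reachable after a node failure in a shared-nothing versus a shared-disk cluster. It is a toy model under stated assumptions: the function `reachable_data` and the `storage` mapping are hypothetical, and the shared-nothing case assumes no replication across nodes.

```python
def reachable_data(storage, failed_node, shared_disk):
    """Return the data items still reachable after one node fails.

    storage maps a node name to the set of data items that node wrote
    (hypothetical model, not from the slides).
    """
    survivors = [n for n in storage if n != failed_node]
    if not survivors:
        return set()

    if shared_disk:
        # Shared-disk: every node attaches to the same disk, so a survivor
        # can read the failed node's data and take over its work.
        return set().union(*storage.values())

    # Shared-nothing (assuming no replication): each disk is private, so the
    # failed node's data is unreachable until that node comes back.
    return set().union(*(storage[n] for n in survivors))


if __name__ == "__main__":
    storage = {"a": {"x", "y"}, "b": {"z"}}
    print(sorted(reachable_data(storage, "a", shared_disk=True)))
    print(sorted(reachable_data(storage, "a", shared_disk=False)))
```

This is the sense in which a shared disk enables recovery support: the surviving node can reach everything the failed node had written.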

Crossbar Switch in Google Search Engine Cluster (Fig. 2.7)

Share of System Interconnects over Time
Distribution of large-scale system interconnects (Fig. 2.8).
The InfiniBand system fabric built in a typical high-performance computer cluster (Fig. 2.9).

2.2.4 Hardware, Software, and Middleware Support
The middleware, OS extensions, and hardware support needed to achieve HA in a typical Linux cluster system (Fig. 2.10).

2.2.5 GPU Clusters for Massive Parallelism
GPU Cluster Components
A GPU cluster is often built as a heterogeneous system consisting of three major components: the CPU host nodes, the GPU nodes, and the cluster interconnect between them.

Academic Cluster Computing Initiative