Lecture 3: Computer Architectures

Basic Computer Architecture: Von Neumann Architecture
[Diagram labels: Memory, instruction, data, Input unit, Output unit, Processor (CU, ALU, Reg.)]

Levels of Parallelism
- Bit level parallelism: within arithmetic logic circuits
- Instruction level parallelism: multiple instructions execute per clock cycle
- Memory system parallelism: overlap of memory operations with computation
- Operating system parallelism: more than one processor; multiple jobs run in parallel on SMP
  - Loop level
  - Procedure level

Levels of Parallelism: Bit Level Parallelism
- Within arithmetic logic circuits

Levels of Parallelism: Instruction Level Parallelism (ILP)
- Multiple instructions execute per clock cycle
- Pipelining (instruction - data)
- Multiple issue (VLIW)
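The benefit of pipelining can be illustrated with a small idealized cycle count. This is a sketch under assumptions that are not in the slides: every stage takes one clock cycle, there are no stalls or hazards, and the function names are hypothetical.

```python
def unpipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed when each instruction finishes all stages before the next starts."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed in an ideal pipeline (assumed: one cycle per stage, no stalls).

    The first instruction needs num_stages cycles to fill the pipeline;
    after that, one instruction completes per cycle.
    """
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)


# Example: 100 instructions on a 5-stage pipeline.
without_pipeline = unpipelined_cycles(100, 5)
with_pipeline = pipelined_cycles(100, 5)
```

Under these assumptions the speedup approaches the number of stages as the instruction count grows; real pipelines fall short of this because of hazards and stalls.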

Levels of Parallelism: Memory System Parallelism
- Overlap of memory operations with computation

Levels of Parallelism: Operating System Parallelism
- There is more than one processor
- Multiple jobs run in parallel on SMP
  - Loop level
  - Procedure level
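Loop level parallelism can be sketched as handing independent loop iterations to a pool of workers. The example below is only an illustration: the function names are hypothetical, and in standard CPython builds threads interleave rather than run CPU-bound work truly in parallel, so it shows the structure rather than a speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def square(x: int) -> int:
    """Loop body: each iteration is independent of the others."""
    return x * x


def sequential_loop(values: List[int]) -> List[int]:
    """Ordinary loop: iterations run one after another."""
    return [square(v) for v in values]


def parallel_loop(values: List[int], workers: int = 4) -> List[int]:
    """Same loop, with iterations distributed over a pool of worker threads.

    Executor.map returns results in input order, so the output matches
    the sequential version.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(square, values))
```

Procedure level parallelism follows the same pattern, except that the units of work handed to the pool are whole, different procedures rather than iterations of one loop.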

Flynn’s Taxonomy
- Single Instruction stream - Single Data stream (SISD)
- Single Instruction stream - Multiple Data stream (SIMD)
- Multiple Instruction stream - Single Data stream (MISD)
- Multiple Instruction stream - Multiple Data stream (MIMD)

Single Instruction stream - Single Data stream (SISD)
- Von Neumann Architecture
[Diagram labels: Memory, instruction, data, Processor (CU, ALU)]

Single Instruction stream - Multiple Data stream (SIMD)
- Instructions of the program are broadcast to more than one processor
- Each processor executes the same instruction synchronously, but using different data
- Used for applications that operate upon arrays of data
[Diagram labels: Memory, instruction, CU, and four PEs, each with its own data]
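As a minimal sketch of the SIMD idea, the Python below models a control unit that broadcasts one instruction at a time to several processing elements, each holding its own data item. The names `ProcessingElement` and `broadcast` are illustrative assumptions, not from the slides, and the lockstep execution is only simulated by a sequential loop.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ProcessingElement:
    """One PE holding a single local data item (hypothetical model)."""
    data: float

    def execute(self, instruction: Callable[[float], float]) -> None:
        # Apply the broadcast instruction to this PE's own data.
        self.data = instruction(self.data)


def broadcast(instruction: Callable[[float], float],
              pes: List[ProcessingElement]) -> None:
    """Control unit: send the same instruction to every PE.

    Real SIMD hardware would run the PEs in lockstep; this loop only
    simulates that behaviour sequentially.
    """
    for pe in pes:
        pe.execute(instruction)


def simd_demo() -> List[float]:
    """Run a two-instruction program over an array of four data items."""
    pes = [ProcessingElement(x) for x in [1.0, 2.0, 3.0, 4.0]]
    broadcast(lambda x: x * 2.0, pes)  # instruction 1: scale every element
    broadcast(lambda x: x + 1.0, pes)  # instruction 2: offset every element
    return [pe.data for pe in pes]
```

Calling `simd_demo()` returns `[3.0, 5.0, 7.0, 9.0]`: there is one instruction stream, and every element sees the same two instructions in the same order.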

Multiple Instruction stream - Multiple Data stream (MIMD)
- Each processor has a separate program
- An instruction stream is generated for each program on each processor
- Each instruction operates upon different data
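The MIMD model can be sketched with threads standing in for processors, each running its own program on its own data. All names below are hypothetical, and in standard CPython builds the threads interleave rather than execute truly in parallel, so this illustrates the structure only.

```python
import threading
from typing import Callable, Dict, List, Tuple

Program = Callable[[List[int]], int]


def sum_program(data: List[int]) -> int:
    """Program for one processor: add up its data."""
    return sum(data)


def max_program(data: List[int]) -> int:
    """A different program for another processor: find the largest item."""
    return max(data)


def run_mimd(jobs: Dict[str, Tuple[Program, List[int]]]) -> Dict[str, int]:
    """Run each (program, data) pair on its own thread, asynchronously."""
    results: Dict[str, int] = {}
    lock = threading.Lock()

    def worker(name: str, program: Program, data: List[int]) -> None:
        value = program(data)      # separate instruction stream per worker
        with lock:                 # protect the shared results dictionary
            results[name] = value

    threads = [
        threading.Thread(target=worker, args=(name, program, data))
        for name, (program, data) in jobs.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def mimd_demo() -> Dict[str, int]:
    return run_mimd({
        "p0": (sum_program, [1, 2, 3]),
        "p1": (max_program, [7, 4, 9]),
    })
```

In contrast to SIMD, the two workers here execute different instructions on different data, with no lockstep between them.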

Multiple Instruction stream - Multiple Data stream (MIMD)
- Shared memory
- Distributed memory

Shared vs Distributed Memory
- Distributed memory
  - Each processor has its own local memory
  - Message-passing is used to exchange data between processors
- Shared memory
  - Single address space
  - All processes have access to the pool of shared memory
[Diagram labels: P, M, Network, Memory, Bus, P]
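The two styles can be contrasted with a small sketch that computes the same sum both ways. Threads stand in for processors, a lock-protected variable stands in for shared memory, and a `queue.Queue` stands in for the network; these stand-ins and all function names are assumptions made for illustration.

```python
import queue
import threading
from typing import List


def split(values: List[int], parts: int) -> List[List[int]]:
    """Divide the data among the workers (round-robin)."""
    return [values[i::parts] for i in range(parts)]


def shared_memory_sum(values: List[int], workers: int = 2) -> int:
    """Shared memory: every worker updates one variable in a single address space."""
    total = {"value": 0}
    lock = threading.Lock()

    def worker(chunk: List[int]) -> None:
        partial = sum(chunk)
        with lock:                      # synchronize access to shared data
            total["value"] += partial

    threads = [threading.Thread(target=worker, args=(c,))
               for c in split(values, workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total["value"]


def message_passing_sum(values: List[int], workers: int = 2) -> int:
    """Distributed memory: workers keep data local and exchange only messages."""
    inbox: "queue.Queue[int]" = queue.Queue()

    def worker(chunk: List[int]) -> None:
        local = list(chunk)             # private copy models local memory
        inbox.put(sum(local))           # send the partial result as a message

    threads = [threading.Thread(target=worker, args=(c,))
               for c in split(values, workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(inbox.get() for _ in threads)
```

Both functions return the same result; the difference is that the first needs a lock around the shared variable, while the second needs explicit send and receive steps.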

Distributed Memory
- Processors cannot directly access another processor’s memory
- Each node has a network interface (NI) for communication and synchronization
[Diagram: four nodes, each with a memory (M), a processor (P), and a network interface (NI), connected by a Network]

Distributed Memory
- Each processor executes different instructions asynchronously, using different data
[Diagram: four nodes, each with a memory (M), a CU, and a PE with its own instruction and data, connected by a Network]

Shared Memory
- Each processor executes different instructions asynchronously, using different data
[Diagram labels: Memory, instruction, and four CU/PE pairs, each with its own data]

Shared Memory
- Uniform memory access (UMA)
  - Each processor has uniform access to memory (symmetric multiprocessor - SMP)
- Non-uniform memory access (NUMA)
  - Time for memory access depends on the location of data
  - Local access is faster than non-local access
  - Easier to scale than SMPs
[Diagram: UMA shows four processors (P) sharing one Memory over a Bus; NUMA shows processor, Bus, and Memory groups joined by a Network]
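The effect of data location on NUMA access time can be sketched with a simple weighted average. The model and the latency figures in the example are made-up assumptions for illustration, not measurements or values from the slides.

```python
def average_access_time(local_fraction: float,
                        local_ns: float,
                        remote_ns: float) -> float:
    """Mean memory access time when a fraction of accesses hit local memory.

    Assumed model: every access is either local or remote, each with a
    fixed latency, so the mean is a weighted average of the two.
    """
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


# Hypothetical latencies: 100 ns local, 300 ns remote.
mostly_local = average_access_time(0.75, 100.0, 300.0)
all_remote = average_access_time(0.0, 100.0, 300.0)
```

With these assumed numbers, keeping three quarters of accesses local halves the average latency compared with the all-remote case, which is why data placement matters on NUMA machines.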

Distributed Shared Memory
- Makes the main memory of a cluster of computers look like a single memory with a single address space
- Shared memory programming techniques can be used
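One way to picture the single address space is as a translation from a global address to a (node, local offset) pair. The class below is a toy sketch under assumptions that are not in the slides: equal-sized node memories, word-granularity addressing, and a hypothetical `DistributedSharedMemory` name. Real DSM systems also need a coherence mechanism, which is omitted here.

```python
from typing import List, Tuple


class DistributedSharedMemory:
    """Toy model: several node-local memories presented as one address space."""

    def __init__(self, num_nodes: int, words_per_node: int) -> None:
        self.words_per_node = words_per_node
        # Each inner list stands in for one computer's local main memory.
        self.nodes: List[List[int]] = [
            [0] * words_per_node for _ in range(num_nodes)
        ]

    def locate(self, address: int) -> Tuple[int, int]:
        """Translate a global address into (node index, local offset)."""
        size = len(self.nodes) * self.words_per_node
        if not 0 <= address < size:
            raise IndexError(f"address {address} outside 0..{size - 1}")
        return divmod(address, self.words_per_node)

    def read(self, address: int) -> int:
        node, offset = self.locate(address)
        return self.nodes[node][offset]

    def write(self, address: int, value: int) -> None:
        node, offset = self.locate(address)
        self.nodes[node][offset] = value


# Usage: the caller sees one memory of 32 words spread over 4 nodes.
dsm = DistributedSharedMemory(num_nodes=4, words_per_node=8)
dsm.write(19, 42)           # lands on node 2, offset 3
value = dsm.read(19)
```

The caller uses plain `read` and `write` on global addresses, as in shared memory programming, while the data actually lives in separate per-node memories.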

Multicore Systems
- Many general purpose processors
- GPU (Graphics Processing Unit)
- GPGPU (General Purpose GPU)
- Hybrid Memory
- The trend is:
  - Board composed of multiple manycore chips sharing memory
  - Rack composed of multiple boards
  - A room full of these racks

Distributed Systems
- Clusters: individual computers that are tightly coupled by software in a local environment to work together on single problems or on related problems
- Grid: many individual systems that are geographically distributed and tightly coupled by software to work together on single problems or on related problems