Published by Lydia Bocook. Modified about 1 year ago.

1
Fundamental of Computer Architecture By Panyayot Chaikan November 01, 2003

2
Chapter 10 - Introduction to Parallel processing (Fundamental of Computer Architecture)

3
Contents
- Introduction to parallel processing architectures
- Multiprocessors
- Vector computers
- Clusters
- Types of interconnection networks
- Introduction to parallel programming

4
High performance computer
- Large computing capacity
- Required to process large amounts of data in a reasonable amount of time
- Often called a supercomputer

5
Supercomputer applications
- Weather forecasting
- Finite element analysis in structural design
- Fluid flow analysis
- Simulation of large, complex physical systems
- Computer-aided design (CAD)

6
Parallel processing (picture)

7
3 ways to construct a supercomputer
- Vector processing
- Multiprocessing
- Distributed computer system

8
Vector supercomputing
- Uses the fastest possible circuits
- Wide paths for accessing a large main memory
- Extensive I/O capability
- Dissipates considerable power and requires expensive cooling arrangements
- Provides excellent performance, but at a very high price

9
Vector supercomputing
- NEC SX5
- CRAY: CRAY1, Y-MP
- Fujitsu VP5000
- Hitachi SR8000

10
Cray supercomputer (picture)

11
Multiprocessor
- Uses a large number of processors designed for the workstation or PC market
- Has an efficient, high-bandwidth medium for communication among the processors, memory, and I/O
- Provides high performance at a lower cost than vector processing

12
Distributed computer system
- Uses many workstations connected by a local area network
- Provides large computing capability at a reasonable cost

13
Multiprocessing performance
- Many computations can proceed in parallel
- Difficulty: the application must be broken down into small tasks that can be assigned to individual processors
- Processors must communicate with each other to exchange data

14
Classification of parallel structures
- Proposed by Flynn [1966]
- 4 types of computation: SISD, SIMD, MIMD, MISD

15
SISD
- Single Instruction stream, Single Data stream
- Used in single-processor computer systems

16
SIMD
- Single Instruction stream, Multiple Data stream
- A single stream of instructions is broadcast to a number of processors
- Each processor operates on its own data
- Each processor has its own memory
- All processors execute the same program but operate on different data
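The SIMD idea can be illustrated with a small sequential simulation. This is only a sketch: the names `ProcessingElement` and `broadcast` are hypothetical and do not come from the slides, and real SIMD hardware would run all processing elements at the same time rather than in a loop.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ProcessingElement:
    """One processor with its own private memory (hypothetical model)."""
    memory: List[int] = field(default_factory=list)

    def execute(self, instruction: Callable[[int], int]) -> None:
        # Apply the broadcast instruction to every value in local memory.
        self.memory = [instruction(value) for value in self.memory]


def broadcast(instruction: Callable[[int], int],
              elements: List[ProcessingElement]) -> None:
    """Central control unit: send the same instruction to every element."""
    for element in elements:
        element.execute(instruction)


# Three processing elements, each holding different data.
elements = [
    ProcessingElement([1, 2]),
    ProcessingElement([10, 20]),
    ProcessingElement([100, 200]),
]

# A single instruction stream: "double", then "add one".
broadcast(lambda value: value * 2, elements)
broadcast(lambda value: value + 1, elements)

simd_result = [element.memory for element in elements]
print(simd_result)  # [[3, 5], [21, 41], [201, 401]]
```

Every element receives the same two instructions, yet each ends up with a different result because it works only on its own memory.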

17
MIMD
- Multiple Instruction stream, Multiple Data stream
- Each of many processors executes a different program and accesses its own sequence of data
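A rough way to picture MIMD is two threads that each run a different function on different data. The sketch below uses hypothetical names (`sum_program`, `max_program`, `run_mimd`) and Python threads as stand-ins for processors. In standard CPython the global interpreter lock means the threads interleave rather than run truly in parallel, so this shows the structure only, not a speedup.

```python
import threading
from typing import List, Optional


def sum_program(data: List[int], results: List[Optional[int]], slot: int) -> None:
    """First instruction stream: add up its own data."""
    results[slot] = sum(data)


def max_program(data: List[int], results: List[Optional[int]], slot: int) -> None:
    """Second instruction stream: find the largest value in its own data."""
    results[slot] = max(data)


def run_mimd() -> List[Optional[int]]:
    results: List[Optional[int]] = [None, None]
    workers = [
        # Different program, different data, for each worker.
        threading.Thread(target=sum_program, args=([1, 2, 3, 4], results, 0)),
        threading.Thread(target=max_program, args=([7, 42, 5], results, 1)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()  # wait until both programs have finished
    return results


mimd_results = run_mimd()
print(mimd_results)  # [10, 42]
```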

18
MISD
- Multiple Instruction stream, Single Data stream
- A common data structure is manipulated by separate processors
- Each processor executes a different program
- This form does not occur often in practice

19
Array processing
- Is the SIMD form of parallel processing
- Instructions are broadcast from a central processor

20
2 types of array processing
- Use a small number of powerful processors
  - ILLIAC-IV: 64 processors, each processor is 64-bit
- Use a large number of very simple processors
  - CM2: processors, each processor is 1-bit
  - MP-1216: processors, each processor is 4-bit
  - Gamma II plus: 4096 processors, each processor is 8-bit

21
Array processing
- Well suited to numerical problems that can be expressed in matrix or vector form

22
The structure of general-purpose multiprocessors
- UMA multiprocessor
- NUMA multiprocessor
- Distributed memory system

23
A UMA multiprocessor

24
A NUMA multiprocessor

25
A distributed memory system

26
Taxonomy of parallel processing

27
Interconnection networks
- Single bus
- Crossbar networks
- Multistage networks
- Hypercube networks
- Mesh networks
- Tree networks
- Ring networks

28
Crossbar interconnection network

29
Multistage shuffle network

30
A 3-dimensional hypercube network
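In a hypercube, nodes are usually labeled with binary numbers, and two nodes are linked when their labels differ in exactly one bit. The following sketch assumes that standard labeling; the function names `hypercube_neighbors` and `hypercube_distance` are hypothetical and chosen for illustration.

```python
from typing import List


def hypercube_neighbors(node: int, dimensions: int = 3) -> List[int]:
    """Return the nodes directly linked to `node` in a hypercube.

    Flipping each of the `dimensions` label bits in turn gives one neighbor.
    """
    if not 0 <= node < 2 ** dimensions:
        raise ValueError("node label out of range for this hypercube")
    return [node ^ (1 << bit) for bit in range(dimensions)]


def hypercube_distance(a: int, b: int) -> int:
    """Minimum number of hops: the count of bit positions where labels differ."""
    return bin(a ^ b).count("1")


print(hypercube_neighbors(0))    # [1, 2, 4]
print(hypercube_neighbors(5))    # [4, 7, 1]
print(hypercube_distance(0, 7))  # 3
```

With 3 dimensions there are 8 nodes, each with 3 links, and no two nodes are more than 3 hops apart.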

31
A 2-dimensional mesh network
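A 2-dimensional mesh links each node to the nodes directly above, below, left, and right of it. The sketch below assumes zero-based (row, column) coordinates and no wraparound links at the edges; the function name `mesh_neighbors` is hypothetical.

```python
from typing import List, Tuple


def mesh_neighbors(row: int, col: int, rows: int, cols: int) -> List[Tuple[int, int]]:
    """Return the grid positions directly linked to (row, col).

    Assumes a plain mesh: edge and corner nodes simply have fewer links.
    """
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


print(mesh_neighbors(0, 0, 4, 4))  # corner node: [(1, 0), (0, 1)]
print(mesh_neighbors(1, 1, 4, 4))  # interior node: [(0, 1), (2, 1), (1, 0), (1, 2)]
```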

32
Four-way tree network

33
Flat tree network

34
Ring network
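In a ring, each node is linked to exactly two others, and the last node connects back to the first. The sketch below assumes a bidirectional ring with nodes numbered from 0; the function names are hypothetical.

```python
from typing import Tuple


def ring_neighbors(node: int, size: int) -> Tuple[int, int]:
    """Return the two nodes linked to `node`; indices wrap around the ring."""
    return ((node - 1) % size, (node + 1) % size)


def ring_distance(a: int, b: int, size: int) -> int:
    """Fewest hops between two nodes, going whichever way round is shorter."""
    clockwise = (b - a) % size
    return min(clockwise, size - clockwise)


print(ring_neighbors(0, 8))    # (7, 1)
print(ring_distance(1, 6, 8))  # 3
```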

35
HP Convex architecture (picture)

36
HP Convex Hypernode (picture)

37
SGI Power Challenge (picture)

38
Clustered supercomputer (picture)

39
Clusters

40
Benefits of clustering
- Incremental scalability
- High availability
- Superior price/performance

41
Parallel programming
- The task must be broken down into small tasks that can be assigned to individual processors at the program level
- Needs operating system support
- Different architectures require different programming methods

42
A sequential program to compute the dot product

    integer array a[1..N], b[1..N]
    integer dot_product

    read a[1..N] from vector_a
    read b[1..N] from vector_b
    dot_product := 0
    do_dot (a, b)
    print dot_product

    do_dot (integer array x[1..N], integer array y[1..N])
        for k := 1 to N
            dot_product := dot_product + x[k] * y[k]
        end for
    end do_dot
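For readers who want to run the sequential version, here is a minimal Python rendering of the same loop. The sample vectors are made up for illustration, and the function returns its result instead of updating a global variable, which is a small departure from the pseudocode.

```python
from typing import Sequence


def do_dot(x: Sequence[int], y: Sequence[int]) -> int:
    """Compute the dot product of x and y with a single sequential loop."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot_product = 0
    for k in range(len(x)):
        dot_product += x[k] * y[k]
    return dot_product


# Hypothetical input data standing in for vector_a and vector_b.
vector_a = [1, 2, 3, 4]
vector_b = [5, 6, 7, 8]

sequential_result = do_dot(vector_a, vector_b)
print(sequential_result)  # 70
```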

43
First attempt at a 2-processor computation

    shared integer array a[1..N], b[1..N]
    shared integer dot_product
    shared lock dot_product_lock
    shared barrier done

    read a[1..N] from vector_a
    read b[1..N] from vector_b
    dot_product := 0
    create_thread (do_dot, a, b)
    do_dot (a, b)
    print dot_product

    do_dot (integer array x[1..N], integer array y[1..N])
        private integer id
        id := mypid()
        for k := (id*N/2)+1 to (id+1)*N/2
            lock (dot_product_lock)
            dot_product := dot_product + x[k] * y[k]
            unlock (dot_product_lock)
        end for
        barrier (done)
    end do_dot
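The first attempt can be sketched in Python with the `threading` module. This is an approximation under stated assumptions: two worker threads are created explicitly (instead of the main thread also running `do_dot`), `join()` plays the role of the barrier, and the name `parallel_dot_naive` is hypothetical. The point to notice is that the lock is taken once per element, so the threads spend much of their time waiting for each other.

```python
import threading
from typing import Sequence


def parallel_dot_naive(a: Sequence[int], b: Sequence[int], num_threads: int = 2) -> int:
    """Dot product where every single update to the shared sum takes the lock."""
    n = len(a)
    shared = {"dot_product": 0}           # shared variable
    dot_product_lock = threading.Lock()   # shared lock

    def do_dot(thread_id: int) -> None:
        # Each thread handles its own contiguous slice of the index range.
        start = thread_id * n // num_threads
        stop = (thread_id + 1) * n // num_threads
        for k in range(start, stop):
            with dot_product_lock:        # lock ... unlock on every iteration
                shared["dot_product"] += a[k] * b[k]

    threads = [threading.Thread(target=do_dot, args=(i,)) for i in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()                     # stands in for barrier (done)
    return shared["dot_product"]


naive_result = parallel_dot_naive([1, 2, 3, 4], [5, 6, 7, 8])
print(naive_result)  # 70
```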

44
An efficient 2-processor computation on a shared-memory machine

    shared integer array a[1..N], b[1..N]
    shared integer dot_product
    shared lock dot_product_lock
    shared barrier done

    read a[1..N] from vector_a
    read b[1..N] from vector_b
    dot_product := 0
    create_thread (do_dot, a, b)
    do_dot (a, b)
    print dot_product

    do_dot (integer array x[1..N], integer array y[1..N])
        private integer local_dot_product
        private integer id
        id := mypid()
        local_dot_product := 0
        for k := (id*N/2)+1 to (id+1)*N/2
            local_dot_product := local_dot_product + x[k] * y[k]
        end for
        lock (dot_product_lock)
        dot_product := dot_product + local_dot_product
        unlock (dot_product_lock)
        barrier (done)
    end do_dot
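A Python sketch of the more efficient version follows, under these assumptions: two explicitly created worker threads, `join()` in place of the barrier, and the hypothetical name `parallel_dot_local`. Each thread accumulates into a private variable and takes the lock only once, when it adds its partial sum to the shared total.

```python
import threading
from typing import Sequence


def parallel_dot_local(a: Sequence[int], b: Sequence[int], num_threads: int = 2) -> int:
    """Dot product where each thread locks only once, to publish its partial sum."""
    n = len(a)
    shared = {"dot_product": 0}           # shared variable
    dot_product_lock = threading.Lock()   # shared lock

    def do_dot(thread_id: int) -> None:
        start = thread_id * n // num_threads
        stop = (thread_id + 1) * n // num_threads
        local_dot_product = 0             # private to this thread, no lock needed
        for k in range(start, stop):
            local_dot_product += a[k] * b[k]
        with dot_product_lock:            # single critical section per thread
            shared["dot_product"] += local_dot_product

    threads = [threading.Thread(target=do_dot, args=(i,)) for i in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()                     # stands in for barrier (done)
    return shared["dot_product"]


local_result = parallel_dot_local([1, 2, 3, 4], [5, 6, 7, 8])
print(local_result)  # 70
```

With N elements and 2 threads, the number of lock acquisitions drops from N to 2, which is where the efficiency gain comes from.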

45
Performance considerations

46
End of Chapter 10
