Published by Lydia Bocook. Modified over 2 years ago.

1
240-208 Fundamental of Computer Architecture
By Panyayot Chaikan (panyayot@coe.psu.ac.th)
November 01, 2003

2
240-208 Fundamental of Computer Architecture
Chapter 10: Introduction to Parallel Processing

3
Contents
- Introduction to parallel processing architectures
- Multiprocessors
- Vector computers
- Clusters
- Types of interconnection networks
- Introduction to parallel programming

4
High performance computer
- Has a large computing capacity
- Needed to process large amounts of data in a reasonable amount of time
- Often called a supercomputer

5
Supercomputer applications
- Weather forecasting
- Finite element analysis in structural design
- Fluid flow analysis
- Simulation of large, complex physical systems
- Computer-aided design (CAD)

6
Parallel processing
Picture from http://www.byte.com/art/9601/img/509029c2.htm

7
Three ways to construct a supercomputer
- Vector processing
- Multiprocessing
- Distributed computer system

8
Vector supercomputing
- Uses the fastest possible circuits
- Has wide paths for accessing a large main memory
- Has extensive I/O capability
- Dissipates considerable power and requires expensive cooling arrangements
- Provides excellent performance, but at a very high price

9
Vector supercomputing
- NEC SX5
- CRAY: CRAY1, Y-MP
- Fujitsu VP5000
- Hitachi SR8000

10
Cray supercomputer
Picture from http://www.meteo.fr/scem/images/cray.gif

11
Multiprocessor
- Uses a large number of processors designed for the workstation or PC market
- Has an efficient, high-bandwidth medium for communication among the processors, memory, and I/O
- Provides high performance at a lower cost than vector processing

12
Distributed computer system
- Uses many workstations connected by a local area network
- Provides large computing capability at a reasonable cost

13
Multiprocessing performance
- Many computations can proceed in parallel
- Difficulty: the application must be broken down into small tasks that can be assigned to individual processors
- Processors must communicate with each other to exchange data
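
One common way to break a loop-style computation into tasks is to give each processor a contiguous block of indices. The sketch below is only an illustration of that idea; the function name `block_ranges` and the choice to hand leftover elements to the first blocks are assumptions, not something the slides specify.

```python
def block_ranges(n, workers):
    """Split indices 0..n-1 into `workers` contiguous, nearly equal blocks."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    base, extra = divmod(n, workers)
    ranges = []
    start = 0
    for w in range(workers):
        # The first `extra` blocks take one leftover element each.
        size = base + (1 if w < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


parts = block_ranges(10, 4)
print([list(r) for r in parts])
```

Each block can then be handed to one processor, which works on it independently until results have to be combined.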

14
Classification of parallel structures
- Proposed by Flynn [1966]
- Four types of computation: SISD, SIMD, MIMD, MISD

15
SISD
- Single Instruction stream, Single Data stream
- Used in single-processor computer systems

16
SIMD
- Single Instruction stream, Multiple Data stream
- A single stream of instructions is broadcast to a number of processors
- Each processor operates on its own data
- Each processor has its own memory
- All processors execute the same program but operate on different data
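
The SIMD idea can be imitated in a few lines of ordinary code. This is a toy simulation, not real SIMD hardware: the `ProcessingElement` class, the two-operation instruction format, and the sample data are all made up for illustration. It shows one instruction stream being broadcast while every element works on its own private memory.

```python
class ProcessingElement:
    """A toy processing element with its own private memory."""

    def __init__(self, a, b):
        self.mem = {"a": a, "b": b, "acc": 0}

    def execute(self, instr):
        op, dst, src1, src2 = instr
        if op == "add":
            self.mem[dst] = self.mem[src1] + self.mem[src2]
        elif op == "mul":
            self.mem[dst] = self.mem[src1] * self.mem[src2]
        else:
            raise ValueError(f"unknown operation: {op}")


def broadcast(program, elements):
    """Send each instruction of the single stream to every element."""
    for instr in program:
        for pe in elements:
            pe.execute(instr)


pes = [ProcessingElement(a, b) for a, b in [(1, 10), (2, 20), (3, 30)]]
program = [("mul", "acc", "a", "b"), ("add", "acc", "acc", "a")]
broadcast(program, pes)
results = [pe.mem["acc"] for pe in pes]
print(results)  # [11, 42, 93]
```

The inner loop runs the elements one after another here; in a real SIMD machine they would all execute the instruction at the same time.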

17
MIMD
- Multiple Instruction stream, Multiple Data stream
- Each of the many processors executes its own program and accesses its own sequence of data

18
MISD
- Multiple Instruction stream, Single Data stream
- A common data structure is manipulated by separate processors
- Each processor executes a different program
- This form does not occur often in practice
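
The four classes differ only in whether there is one or more than one instruction stream and data stream. As a small illustration, the hypothetical helper below (`flynn_class` is not from the slides) derives the class name from those two counts.

```python
def flynn_class(instruction_streams, data_streams):
    """Return the Flynn class name for the given stream counts."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    i = "S" if instruction_streams == 1 else "M"
    d = "S" if data_streams == 1 else "M"
    return f"{i}I{d}D"


print(flynn_class(1, 1))   # SISD: a single-processor system
print(flynn_class(1, 64))  # SIMD: one instruction stream, many data streams
print(flynn_class(8, 8))   # MIMD: many programs, many data streams
print(flynn_class(8, 1))   # MISD: many programs, one data stream
```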

19
Array processing
- The SIMD form of parallel processing
- Instructions are broadcast from a central processor

20
Two types of array processing
- Use a small number of powerful processors
  - ILLIAC-IV: 64 processors, each processor is 64-bit
- Use a large number of very simple processors
  - CM2: 65536 processors, each processor is 1-bit
  - MP-1216: 16384 processors, each processor is 4-bit
  - Gamma II plus: 4096 processors, each processor is 8-bit

21
Array processing
- Well suited to numerical problems that can be expressed in matrix or vector format

22
The structure of general-purpose multiprocessors
- UMA (uniform memory access) multiprocessor
- NUMA (non-uniform memory access) multiprocessor
- Distributed memory system

23
A UMA multiprocessor

24
A NUMA multiprocessor

25
A distributed memory system

26
Taxonomy of parallel processing

27
Interconnection networks
- Single bus
- Crossbar networks
- Multistage networks
- Hypercube networks
- Mesh networks
- Tree networks
- Ring networks

28
Crossbar interconnection network

29
Multistage shuffle network

30
A 3-dimensional hypercube network
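
In an n-dimensional hypercube the 2^n nodes can be numbered with n-bit addresses so that two nodes are linked exactly when their addresses differ in one bit. The sketch below assumes that numbering; the function names are chosen here for illustration and do not come from the slides.

```python
def hypercube_neighbors(node, dim):
    """Neighbors of `node` in a dim-dimensional hypercube: flip one bit each."""
    if not 0 <= node < (1 << dim):
        raise ValueError("node address out of range")
    return [node ^ (1 << bit) for bit in range(dim)]


def hypercube_distance(a, b):
    """Minimum number of hops between a and b: the count of differing bits."""
    return bin(a ^ b).count("1")


print(hypercube_neighbors(0b000, 3))      # [1, 2, 4]
print(hypercube_neighbors(0b101, 3))      # [4, 7, 1]
print(hypercube_distance(0b000, 0b111))   # 3
```

For the 3-dimensional case every node has three links, and the two nodes farthest apart are three hops from each other.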

31
A 2-dimensional mesh network
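
In a 2-dimensional mesh each node is linked to the nodes directly above, below, left, and right of it. The sketch below assumes a plain mesh with no wraparound links at the edges (so edge and corner nodes have fewer neighbors); `mesh_neighbors` and the (row, column) addressing are illustrative choices, not taken from the slides.

```python
def mesh_neighbors(row, col, rows, cols):
    """Neighbors of node (row, col) in a rows x cols mesh without wraparound."""
    if not (0 <= row < rows and 0 <= col < cols):
        raise ValueError("node position out of range")
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    # Keep only positions that actually exist in the mesh.
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


print(mesh_neighbors(0, 0, 4, 4))  # corner node: [(1, 0), (0, 1)]
print(mesh_neighbors(1, 1, 4, 4))  # interior node: [(0, 1), (2, 1), (1, 0), (1, 2)]
```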

32
Four-way tree network

33
Flat tree network

34
Ring network

35
HP Convex architecture
Picture from http://www.byte.com/art/9601/img/509029g2.htm

36
HP Convex Hypernode
Picture from http://www.byte.com/art/9601/img/509029f2.htm

37
SGI Power Challenge
Picture from http://www.byte.com/art/9601/img/509029l2.htm

38
Clustered supercomputer
Picture from http://arstechnica.com/cpu/2q00/klat2/klat2-1.html

39
Clusters

40
Benefits of clustering
- Incremental scalability
- High availability
- Superior price/performance

41
Parallel programming
- A task must be broken down into small tasks that can be assigned to individual processors at the program level
- Needs operating system support
- Different architectures require different programming methods

42
A sequential program to compute the dot product

    integer array a[1..N], b[1..N]
    integer dot_product

    read a[1..N] from vector_a
    read b[1..N] from vector_b
    dot_product := 0
    do_dot (a, b)
    print dot_product

    do_dot (integer array x[1..N], integer array y[1..N])
        for k := 1 to N
            dot_product := dot_product + x[k] * y[k]
        end for
    end do_dot
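
For readers who want to run the sequential version, here is a minimal Python rendering of the pseudocode. It is a sketch rather than a literal translation: it uses 0-based indexing and returns the sum instead of updating a global variable, and the sample vectors are invented for the example.

```python
def dot_product(x, y):
    """Sequential dot product: one loop, one running total."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    total = 0
    for k in range(len(x)):
        total += x[k] * y[k]
    return total


a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
print(dot_product(a, b))  # 1*5 + 2*6 + 3*7 + 4*8 = 70
```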

43
First attempt at a 2-processor computation

    shared integer array a[1..N], b[1..N]
    shared integer dot_product
    shared lock dot_product_lock
    shared barrier done

    read a[1..N] from vector_a
    read b[1..N] from vector_b
    dot_product := 0
    create_thread (do_dot, a, b)
    do_dot (a, b)
    print dot_product

    do_dot (integer array x[1..N], integer array y[1..N])
        private integer id
        id := mypid()
        for k := (id*N/2)+1 to (id+1)*N/2
            lock (dot_product_lock)
            dot_product := dot_product + x[k] * y[k]
            unlock (dot_product_lock)
        end for
        barrier (done)
    end do_dot
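
The first attempt can be approximated with Python's standard `threading` module. This is a sketch under several assumptions: the thread id is passed in as an argument rather than obtained from `mypid()`, indexing is 0-based, the shared total lives in a small dictionary, and the function name `parallel_dot_locked` is invented here. It mirrors the structure of the pseudocode, including taking the lock on every single update.

```python
import threading


def parallel_dot_locked(a, b, workers=2):
    """Dot product where every update of the shared total takes the lock."""
    n = len(a)
    shared = {"dot_product": 0}
    dot_product_lock = threading.Lock()
    done = threading.Barrier(workers)

    def do_dot(pid):
        # Each thread handles its own contiguous block of indices.
        for k in range(pid * n // workers, (pid + 1) * n // workers):
            with dot_product_lock:
                shared["dot_product"] += a[k] * b[k]
        done.wait()  # wait until every thread has finished its block

    threads = [threading.Thread(target=do_dot, args=(pid,))
               for pid in range(1, workers)]
    for t in threads:
        t.start()
    do_dot(0)  # the creating thread acts as processor 0
    for t in threads:
        t.join()
    return shared["dot_product"]


print(parallel_dot_locked([1, 2, 3, 4], [5, 6, 7, 8]))  # 70
```

Because the lock is acquired once per element, the threads spend much of their time waiting for each other, which is the weakness of this first attempt.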

44
An efficient 2-processor computation on a shared-memory machine

    shared integer array a[1..N], b[1..N]
    shared integer dot_product
    shared lock dot_product_lock
    shared barrier done

    read a[1..N] from vector_a
    read b[1..N] from vector_b
    dot_product := 0
    create_thread (do_dot, a, b)
    do_dot (a, b)
    print dot_product

    do_dot (integer array x[1..N], integer array y[1..N])
        private integer local_dot_product
        private integer id
        id := mypid()
        local_dot_product := 0
        for k := (id*N/2)+1 to (id+1)*N/2
            local_dot_product := local_dot_product + x[k] * y[k]
        end for
        lock (dot_product_lock)
        dot_product := dot_product + local_dot_product
        unlock (dot_product_lock)
        barrier (done)
    end do_dot
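
The efficient version can be sketched the same way with Python's `threading` module. The assumptions are the same as for any such rendering: thread ids are passed as arguments instead of coming from `mypid()`, indexing is 0-based, and `parallel_dot_local` is a name made up for this example. Each thread accumulates a private partial sum and takes the lock only once, to add that partial sum to the shared total.

```python
import threading


def parallel_dot_local(a, b, workers=2):
    """Dot product with a private partial sum per thread and one locked update."""
    n = len(a)
    shared = {"dot_product": 0}
    dot_product_lock = threading.Lock()
    done = threading.Barrier(workers)

    def do_dot(pid):
        local_dot_product = 0  # private to this thread, so no lock is needed
        for k in range(pid * n // workers, (pid + 1) * n // workers):
            local_dot_product += a[k] * b[k]
        with dot_product_lock:  # a single locked update per thread
            shared["dot_product"] += local_dot_product
        done.wait()

    threads = [threading.Thread(target=do_dot, args=(pid,))
               for pid in range(1, workers)]
    for t in threads:
        t.start()
    do_dot(0)  # the creating thread acts as processor 0
    for t in threads:
        t.join()
    return shared["dot_product"]


print(parallel_dot_local([1, 2, 3, 4], [5, 6, 7, 8]))  # 70
```

Note that in standard CPython builds the global interpreter lock keeps these threads from executing Python code at the same instant, so this sketch shows the synchronization pattern rather than a measurable speedup.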

45
Performance considerations

46
End of Chapter 10
