
1 2016/1/5Part I1 Models of Parallel Processing

2 Parallel processors come in many different varieties. Thus, we often deal with abstract models of real machines.

3 Development of Early Models (1) Associative processing (AP) was perhaps the earliest form of parallel processing.
–Associative or content-addressable memories (AMs, CAMs) allow memory cells to be accessed based on their contents rather than their physical locations within the memory array.
–AM/AP architectures are essentially based on incorporating simple processing logic into the memory array, which removes the need to transfer large volumes of data through the limited-bandwidth interface between the memory and the processor (the von Neumann bottleneck).
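Content-based access can be illustrated with a minimal Python sketch. The function name `cam_search`, the 8-bit word layout, and the sample data are assumptions made for illustration; they do not come from the slides. A real associative memory compares all cells at once, whereas the loop below only simulates that parallel match.

```python
from typing import List


def cam_search(cells: List[int], key: int, mask: int) -> List[int]:
    """Return the indices of all cells whose masked bits equal the masked key.

    In hardware every cell performs this comparison simultaneously;
    the list comprehension only simulates that behavior sequentially.
    """
    return [i for i, word in enumerate(cells) if (word & mask) == (key & mask)]


# Hypothetical 8-bit words: the high nibble is a tag, the low nibble a value.
memory = [0x1A, 0x2B, 0x1C, 0x3A]

# Find every word whose tag is 0x1, regardless of where it is stored.
matches = cam_search(memory, key=0x10, mask=0xF0)
```

Here `matches` holds the positions of the two words tagged `0x1`; the caller never supplies an address, only the content to look for.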

4 Development of Early Models (2) The AM/AP model has evolved through the incorporation of additional capabilities, so that it is in essence converging with SIMD-type array processors.

5 Development of Early Models (3)
–Neural networks
–Cellular automata

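8 SIMD Vs. MIMD (1) Most early parallel machines had SIMD designs. Within the SIMD category, two fundamental design choices exist:
–Synchronous versus asynchronous SIMD: synchronous SIMD handles conditional computations inefficiently, because processors that do not take a branch sit idle while it is executed. A possible cure is to use the asynchronous version of SIMD, known as SPMD (single program, multiple data).
–Custom- versus commodity-chip SIMD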
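The difference between synchronous SIMD and SPMD on a conditional can be sketched with a toy cost model. The model is an assumption made for illustration, not something stated in the slides: each branch has a fixed cost, synchronous SIMD broadcasts both branches in turn, and SPMD lets every processor follow only its own branch. Synchronization overheads are ignored.

```python
def simd_time(conds, then_cost, else_cost):
    """Synchronous SIMD: each branch that at least one processor needs is
    broadcast in turn; processors not taking it are masked off and idle."""
    total = 0
    if any(conds):
        total += then_cost
    if not all(conds):
        total += else_cost
    return total


def spmd_time(conds, then_cost, else_cost):
    """SPMD: every processor runs its own copy of the program and executes
    only its own branch; the slowest processor sets the finish time."""
    return max(then_cost if c else else_cost for c in conds)


# Four processors, two of which take each branch (hypothetical costs).
conds = [True, False, True, False]
simd = simd_time(conds, then_cost=5, else_cost=3)
spmd = spmd_time(conds, then_cost=5, else_cost=3)
```

Under these assumptions the SIMD machine pays for both branches (8 time units) while the SPMD machine pays only for the longer one (5 time units).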

9 SIMD Vs. MIMD (2) In the 1990s, the MIMD paradigm became more popular. MIMD machines are most effective for medium- to coarse-grain parallel applications, where the computation is divided into relatively large subcomputations or tasks whose executions are assigned to the various processors.

10 SIMD Vs. MIMD (3) Within the MIMD class, three fundamental issues or design choices are subjects of ongoing debate in the research community.
–MPP: massively or moderately parallel processor? Is it more cost-effective to build a parallel processor out of a relatively small number of powerful processors or out of a massive number of very simple processors?
–Tightly versus loosely coupled MIMD: network of workstations (NOW), cluster computing, grid computing
–Explicit message passing versus virtual shared memory

11 Global Vs. Distributed Memory (1) Within the MIMD class of parallel processors, memory can be global or distributed. Global memory may be visualized as being in a central location where all processors can access it with equal ease. Unless the processor-to-memory network has very low latency, memory-latency-hiding techniques must be employed. An example of such methods is the use of multithreading.

13 Global Vs. Distributed Memory (2) Examples for both the processor-to-memory and processor-to-processor networks include:
An abstract model of global-memory computers is known as PRAM.
One approach to reducing the amount of data that must pass through the processor-to-memory interconnection network is to use a private cache memory for each processor. This approach relies on locality of data access and gives rise to the cache coherence problem.
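The cache coherence problem mentioned on this slide can be demonstrated with a small simulation. Everything in it is a hypothetical sketch: the `Processor` class, the write-through policy, and the optional invalidation step are assumptions chosen for brevity, not a description of any particular machine.

```python
class Processor:
    """A processor with a private cache in front of a shared global memory."""

    def __init__(self, memory):
        self.memory = memory   # shared dict standing in for global memory
        self.cache = {}        # private cache: address -> value
        self.peers = []        # other processors sharing the same memory

    def read(self, addr):
        if addr not in self.cache:            # miss: fetch through the network
            self.cache[addr] = self.memory[addr]
        return self.cache[addr]               # hit: served locally

    def write(self, addr, value, invalidate=False):
        self.cache[addr] = value
        self.memory[addr] = value             # write-through to global memory
        if invalidate:                        # hypothetical invalidation step
            for peer in self.peers:
                peer.cache.pop(addr, None)


memory = {"x": 0}
p0, p1 = Processor(memory), Processor(memory)
p0.peers, p1.peers = [p1], [p0]

p0.read("x")                        # p0 now holds a cached copy of x
p1.write("x", 1)                    # no invalidation: p0's copy goes stale
stale = p0.read("x")                # p0 still sees the old value
p1.write("x", 2, invalidate=True)   # peers drop their copies
fresh = p0.read("x")                # p0 misses and reloads the new value
```

The first write leaves `p0` reading an outdated value from its own cache; once writes invalidate the peer copies, the next read goes back to global memory and sees the update.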

15 Global Vs. Distributed Memory (3) Distributed-memory architectures can be conceptually viewed as in Fig. 4.5. In addition to the types of interconnection networks enumerated for shared-memory parallel processors, distributed-memory MIMD architectures can also be interconnected by a variety of direct networks. Such machines are sometimes described as nonuniform memory access (NUMA) architectures.

17 PRAM Shared-Memory Model (1) The theoretical model used for conventional or sequential computers (SISD class) is known as the random-access machine (RAM). The parallel version of RAM (PRAM) constitutes an abstract model of the class of global-memory parallel processors. The abstraction consists of ignoring the details of the processor-to-memory interconnection network and taking the view that each processor can access any memory location in each machine cycle, independent of what other processors are doing.

19 PRAM Shared-Memory Model (2) In the formal PRAM model, a single processor is assumed to be active initially. In each computation step, each active processor can read from and write into the shared memory and can also activate another processor. Even though the global-memory architecture was introduced as a subclass of the MIMD class, the abstract PRAM model depicted in Fig. 4.6 can be SIMD or MIMD.
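The activation rule has a direct consequence: if every active processor activates one more in each step, the number of active processors can at most double per step. The sketch below counts the steps needed to bring p processors to life under that rule; the function name `activation_steps` is a hypothetical helper introduced only for this illustration.

```python
def activation_steps(p: int) -> int:
    """Count the steps needed to activate p processors when one processor is
    active initially and each active processor activates one more per step."""
    if p < 1:
        raise ValueError("p must be at least 1")
    active, steps = 1, 0
    while active < p:
        active = min(p, 2 * active)  # every active processor wakes one more
        steps += 1
    return steps


steps_for_8 = activation_steps(8)
steps_for_1000 = activation_steps(1000)
```

With 8 processors the count is 3 and with 1000 it is 10, that is, the base-2 logarithm of p rounded up. Start-up cost therefore grows only logarithmically with the number of processors.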

21 PRAM Shared-Memory Model (3) In a physical realization, memory accesses must pass through a processor-to-memory interconnection network whose latency grows at least logarithmically with the number p of processors. This implies that each instruction cycle would have to consume Ω(log p) real time. This point is important when we try to compare PRAM algorithms with those for distributed-memory models. An O(log p)-step PRAM algorithm may not be faster than an O(log² p)-step algorithm for a hypercube architecture.
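The comparison can be made explicit with a rough calculation. The constants c_1 through c_4 are hypothetical, and two assumptions are made that the slide does not state: the PRAM algorithm actually uses c_1 log p steps on some input, and each hypercube step takes constant real time c_4.

```latex
\begin{align*}
T_{\mathrm{PRAM}} &\ge (c_1 \log p)\,(c_2 \log p) = c_1 c_2 \log^2 p, \\
T_{\mathrm{hypercube}} &\le (c_3 \log^2 p)\, c_4 = c_3 c_4 \log^2 p .
\end{align*}
```

Under these assumptions both running times grow as log² p in real time, so which algorithm is faster depends on the constants rather than on the step counts alone.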

22 Distributed-Memory or Graph Models (1) Given the internal processor and memory structures in each node, a distributed-memory architecture is characterized primarily by the network used to interconnect the nodes. This network is usually represented as a graph. Important parameters of an interconnection network include:
–Network diameter: the longest of the shortest paths between various pairs of nodes
–Bisection (band)width: the smallest number (total capacity) of links that need to be cut in order to divide the network into two subnetworks of half the size
–Vertex or node degree: the number of communication ports required of each node
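The three parameters can be computed directly from a graph description. The sketch below is a minimal illustration in Python; the helper names and the two sample topologies (a 3-dimensional hypercube and an 8-node ring) are assumptions chosen for the example. The graph is assumed to be connected, and the bisection search is brute force, so it is practical only for small networks.

```python
from collections import deque
from itertools import combinations


def hypercube(d):
    """Adjacency lists of a d-dimensional binary hypercube (2**d nodes)."""
    return {v: [v ^ (1 << k) for k in range(d)] for v in range(2 ** d)}


def ring(n):
    """Adjacency lists of an n-node ring (n >= 3)."""
    return {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}


def diameter(graph):
    """Longest shortest path, via breadth-first search from every node."""
    longest = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in graph[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        longest = max(longest, max(dist.values()))
    return longest


def max_degree(graph):
    """Largest number of links (ports) at any node."""
    return max(len(neighbors) for neighbors in graph.values())


def bisection_width(graph):
    """Fewest links cut by any split of the nodes into two equal halves."""
    nodes = sorted(graph)
    best = None
    for half in combinations(nodes, len(nodes) // 2):
        side = set(half)
        cut = sum(1 for u in side for w in graph[u] if w not in side)
        if best is None or cut < best:
            best = cut
    return best


cube, loop = hypercube(3), ring(8)
cube_params = (diameter(cube), bisection_width(cube), max_degree(cube))
ring_params = (diameter(loop), bisection_width(loop), max_degree(loop))
```

For these 8-node examples the hypercube yields diameter 3, bisection width 4, and degree 3, while the ring yields diameter 4, bisection width 2, and degree 2. The hypercube buys its shorter paths and wider bisection with more ports per node.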

25 Distributed-Memory or Graph Models (2) Even though the distributed-memory architecture was introduced as a subclass of the MIMD class, machines based on networks of the type shown in Fig. 4.8 can be SIMD- or MIMD-type. Arrangements such as the one in Fig. 4.9 are available for reducing bus traffic by taking advantage of the locality of communication within small clusters of processors.
