
1 Lecture 13 Parallel Processing

2 What is Parallel Computing? Traditionally, software has been written for serial computation. Parallel computing is the simultaneous use of multiple compute resources to solve a computational problem.
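
As a minimal sketch of the contrast (assuming a C compiler with OpenMP support; the array and variable names are illustrative, not from the slides), the same summation can run serially on one core or be divided across several cores:

```c
/* Serial vs. parallel summation sketch (compile with: cc -fopenmp sum.c). */
#include <stdio.h>
#include <omp.h>

#define N 1000000

int main(void) {
    static double a[N];
    for (int i = 0; i < N; i++) a[i] = 1.0;      /* fill with sample data */

    /* Serial computation: one instruction stream, one core. */
    double serial_sum = 0.0;
    for (int i = 0; i < N; i++) serial_sum += a[i];

    /* Parallel computation: the iteration space is divided among threads,
       each running on its own core; partial sums are combined at the end. */
    double parallel_sum = 0.0;
    #pragma omp parallel for reduction(+:parallel_sum)
    for (int i = 0; i < N; i++) parallel_sum += a[i];

    printf("serial=%f parallel=%f threads=%d\n",
           serial_sum, parallel_sum, omp_get_max_threads());
    return 0;
}
```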

3 Why Use Parallel Computing?
Saves time (wall clock time)
Cost savings
Overcoming memory constraints
It's the future of computing
Increasingly attractive: economics, technology, architecture, application demand
Increasingly central and mainstream
Parallelism exploited at many levels: instruction-level parallelism, multiprocessor servers, large-scale multiprocessors ("MPPs")
Focus of this class: the multiprocessor level of parallelism
Same story from the memory system perspective: increase bandwidth and reduce average latency with many local memories
A wide range of parallel architectures make sense, with different cost, performance and scalability

4 Architecture Categories
Single instruction, single data stream - SISD
Single instruction, multiple data stream - SIMD
Multiple instruction, single data stream - MISD
Multiple instruction, multiple data stream - MIMD

5 Single Instruction, Single Data Stream - SISD
Single processor
Single instruction stream
Data stored in a single memory
Uni-processor

6 Single Instruction, Multiple Data Stream - SIMD
A single machine instruction controls the simultaneous execution of a number of processing elements on a lockstep basis
Each processing element has an associated data memory
Each instruction is executed on a different set of data by the different processors
Vector and array processors
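
As a concrete illustration (a sketch assuming an x86 CPU with SSE and the `<xmmintrin.h>` intrinsics; the arrays are made up), one packed-add instruction performs four floating-point additions in lockstep, i.e. a single instruction applied to multiple data elements:

```c
/* SIMD sketch: one instruction operates on four floats at a time (SSE). */
#include <xmmintrin.h>
#include <stdio.h>

int main(void) {
    float a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    float b[8] = {10, 20, 30, 40, 50, 60, 70, 80};
    float c[8];

    for (int i = 0; i < 8; i += 4) {
        __m128 va = _mm_loadu_ps(&a[i]);   /* load 4 elements */
        __m128 vb = _mm_loadu_ps(&b[i]);
        __m128 vc = _mm_add_ps(va, vb);    /* one instruction, 4 additions */
        _mm_storeu_ps(&c[i], vc);
    }

    for (int i = 0; i < 8; i++) printf("%.0f ", c[i]);
    printf("\n");
    return 0;
}
```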

7 Multiple Instruction, Single Data Stream - MISD
A sequence of data is transmitted to a set of processors
Each processor executes a different instruction sequence
Never been implemented

8 Multiple Instruction, Multiple Data Stream - MIMD
A set of processors simultaneously executes different instruction sequences on different sets of data
SMPs, clusters and NUMA systems

9 Taxonomy of Parallel Computing Paradigms
Parallel Computer
  Synchronous: Vector/Array, SIMD, Systolic
  Asynchronous: MIMD

10 Flynn's Hardware Taxonomy
Processor organizations:
  Single instruction, single data (SISD) stream: uniprocessor
  Single instruction, multiple data (SIMD) stream: vector processor, array processor
  Multiple instruction, single data (MISD) stream
  Multiple instruction, multiple data (MIMD) stream:
    Shared memory: symmetric multiprocessor (SMP), nonuniform memory access (NUMA)
    Distributed memory: clusters

11 Shared Memory Architecture
All processors access all memory as a single global address space
Data sharing is fast
Lack of scalability between memory and CPUs
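
A sketch of the programming consequence, assuming OpenMP on a shared-memory machine (the array name is illustrative): every thread reads and writes the same array through one global address space, so sharing data is simply accessing it.

```c
/* Shared-memory sketch: all threads see the same global address space. */
#include <stdio.h>
#include <omp.h>

#define N 16

int main(void) {
    int data[N];                          /* one copy, visible to every thread */

    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        data[i] = omp_get_thread_num();   /* each thread writes its own id */

    /* No explicit communication was needed: the writes are visible to the
       master thread simply because the memory is shared. */
    for (int i = 0; i < N; i++) printf("%d ", data[i]);
    printf("\n");
    return 0;
}
```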

12 Distributed Memory
Each processor has its own memory
Scalable, with no overhead for cache coherency
The programmer is responsible for many details of communication between processors
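
By contrast, on a distributed-memory machine each process owns its memory and data must be moved by explicit messages. A minimal MPI sketch (assuming an MPI installation; run with e.g. `mpirun -np 2 ./a.out`):

```c
/* Distributed-memory sketch: explicit message passing with MPI. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int value;
    if (rank == 0) {
        value = 42;                                  /* lives in rank 0's memory */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Rank 1 cannot simply read rank 0's variable: the programmer must
           arrange the communication explicitly. */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```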

13 Tightly Coupled - SMP
Processors share memory and communicate via that shared memory
Symmetric Multiprocessor (SMP):
  Processors share a single memory or pool of memory
  Shared bus to access memory
  Memory access time to a given area of memory is approximately the same for each processor

14 Tightly Coupled - NUMA
Nonuniform memory access
Access times to different regions of memory may differ

15 Distributed Shared Memory Architecture: NUMA
Memory is physically distributed but logically shared
The physical layout is similar to the distributed-memory case
The aggregated memory of the whole system appears as one single address space
Due to the distributed nature, memory access performance varies depending on which CPU accesses which parts of memory ("local" vs. "remote" access)
Example: two locality domains linked through a high-speed connection (HyperTransport)
Advantage: scalability
Disadvantages: locality problems and connection congestion
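
One practical consequence, sketched below under the assumption of a Linux-style "first touch" page placement policy (not something the slide states): initializing data with the same threads and distribution that later use it tends to place pages in each thread's local domain and avoids remote accesses.

```c
/* NUMA locality sketch: parallel ("first touch") initialization so each
   thread's pages tend to land in its own locality domain.
   Compile with OpenMP enabled, e.g. cc -fopenmp numa.c */
#include <stdlib.h>

#define N (1 << 24)

int main(void) {
    double *x = malloc(N * sizeof *x);

    /* Touch the pages from the threads that will use them later
       (instead of from a single thread), so most pages are local. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < N; i++)
        x[i] = 0.0;

    double sum = 0.0;
    #pragma omp parallel for schedule(static) reduction(+:sum)
    for (long i = 0; i < N; i++)
        sum += x[i];          /* same static distribution: mostly local reads */

    free(x);
    return (int)sum;
}
```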

16 Loosely Coupled - Clusters
A collection of independent uniprocessors or SMPs, interconnected to form a cluster
Communication via fixed paths or network connections

17 Symmetric Multiprocessors
A stand-alone computer with the following characteristics:
  Two or more similar processors of comparable capacity
  Processors share the same memory and I/O
  Processors are connected by a bus or other internal connection
  Memory access time is approximately the same for each processor
  All processors share access to I/O, either through the same channels or through different channels giving paths to the same devices
  All processors can perform the same functions (hence symmetric)
  The system is controlled by an integrated operating system providing interaction between processors at the job, task, file and data element levels

18 SMP Advantages
Performance: if some work can be done in parallel
Availability: since all processors can perform the same functions, failure of a single processor does not halt the system
Incremental growth: users can enhance performance by adding additional processors
Scaling: vendors can offer a range of products based on the number of processors

19 Interconnection Networks (IN)
IN topology:
  Distributed memory (static networks): 1-dimensional, 2-dimensional, hypercube
  Shared memory (dynamic networks): single-stage, multi-stage, cross-bar

20 Distributed Memory - Static Networks
Linear array (1-d)
2-dimensional networks: ring, star, tree, mesh

21 Organization Classification
Time-shared or common bus
Multiport memory
Central control unit

22 Time-Shared Bus
Simplest form; structure and interface are similar to a single-processor system
The following features are provided:
  Addressing: distinguish modules on the bus
  Arbitration: any module can be a temporary bus master
  Time sharing: if one module has the bus, others must wait and may have to suspend
There are now multiple processors as well as multiple I/O modules

23 Time-Shared Bus - Advantages
Simplicity
Flexibility
Reliability

24 Time-Shared Bus - Disadvantages
Performance is limited by the bus cycle time
Each processor should have a local cache to reduce the number of bus accesses
This leads to problems with cache coherence, which are solved in hardware (see later)
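
A sketch of how that hardware coherence can surface at the program level (assuming OpenMP, a 64-byte cache line, and illustrative structure names): when adjacent counters share one cache line, the coherence protocol must bounce that line between the processors' caches ("false sharing"); padding each counter onto its own line avoids the traffic.

```c
/* Cache-coherence sketch: padding keeps each thread's counter on its own
   cache line, so per-processor caches do not fight over the same line. */
#include <omp.h>

#define CACHE_LINE 64           /* assumed cache line size in bytes */

struct padded { long count; char pad[CACHE_LINE - sizeof(long)]; };

int main(void) {
    struct padded counters[4] = {{0}};

    #pragma omp parallel num_threads(4)
    {
        int id = omp_get_thread_num();
        for (long i = 0; i < 10000000; i++)
            counters[id].count++;   /* each thread touches only its own line */
    }
    return 0;
}
```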

25 Operating System Issues
Simultaneous concurrent processes
Scheduling
Synchronization
Memory management
Reliability and fault tolerance
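
For the synchronization item, a small sketch (assuming OpenMP; the counters are illustrative): concurrent threads updating shared state must be coordinated, otherwise updates can be lost.

```c
/* Synchronization sketch: without mutual exclusion, concurrent increments
   of a shared counter can be lost; a critical section serializes them. */
#include <stdio.h>
#include <omp.h>

int main(void) {
    long unsafe = 0, safe = 0;

    #pragma omp parallel for
    for (int i = 0; i < 100000; i++) {
        unsafe++;                 /* data race: increments may be lost */

        #pragma omp critical
        safe++;                   /* serialized: always counts correctly */
    }

    printf("unsafe=%ld safe=%ld (expected 100000)\n", unsafe, safe);
    return 0;
}
```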

26 Clusters
An alternative to SMP for high-performance, high-availability server applications
A group of interconnected whole computers working together as a unified resource, giving the illusion of being one machine
Each computer is called a node

27 Cluster Benefits
Absolute scalability
Incremental scalability
High availability
Superior price/performance

