Published by Benjamin Dickerson. Modified over 6 years ago.
Multi-Processing in High Performance Computer Architecture:
What is Multiprocessing?
Enables several programs to run concurrently
Coordinated processing of programs by more than one processor
Use of two or more CPUs within a single computer system
Ability of a system to support more than one processor and to allocate tasks between them
Idealism (Target for Processor Performance):
Memory Hierarchies:
Memory in Modern Processor (L1 Cache?):
Flynn’s Taxonomy for Parallel Machines:
How many instruction streams and data streams?
SISD (Single Instruction, Single Data): 1 instruction stream, 1 data stream — uni-processor
SIMD (Single Instruction, Multiple Data): 1 instruction stream, multiple data streams — vector units / MMX
MISD (Multiple Instruction, Single Data): multiple instruction streams, 1 data stream — streaming processor (e.g., a camera pipeline)
MIMD (Multiple Instruction, Multiple Data): multiple instruction streams, multiple data streams — multi-processor (multi-core)
Why Not Uni-Processors:
Making a wider processor lets it run independent instructions in parallel, but not instructions with dependencies. In a uni-processor, the instructions a = b + c; d = e + f can run in parallel because neither result depends on the other. The instructions b = e + f; a = b + c cannot run in parallel, because a = b + c needs the new value of b. More complex data dependencies exist as well. The parallel parts run fast; the dependent parts run one at a time (stalls) and are slow.
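The dependency argument can be made concrete with a tiny sketch (the values here are hypothetical, chosen only for illustration):

```python
# Sketch of the two instruction sequences from the slide.
b, c, e, f = 1, 2, 3, 4

# Independent: a = b + c and d = e + f read disjoint inputs,
# so a wide processor could issue them in the same cycle.
a = b + c
d = e + f

# Dependent: here a = b + c must wait for the new b = e + f to be
# written (a read-after-write dependency), so the two cannot
# execute in parallel.
b = e + f
a = b + c
print(a, d)  # a now uses the updated value of b
```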
Why Multi-Processors? Uni-Processors Are Already 4 Wide
There are diminishing returns from making the processor wider. Raising frequency requires raising voltage, and dynamic power is P = ½CV²f; since voltage scales roughly with frequency, power grows roughly as f³. But Moore's Law continues: 2x transistors every 2 years, so 2x cores every 2 years, giving 2x performance every 2 years (assuming we can use all the cores).
Multi-Processor needs Parallel Programs (Years to develop):
Sequential (single-thread) code is a lot easier to develop. Debugging parallel code is much more difficult. Performance scaling is much harder to achieve.
Types of Multiprocessors: Uniform Memory Access (UMA)
Centralized shared memory: the distance from memory to every core is approximately the same. Replicate cores and caches to build a Symmetric Multi-Processor (SMP).
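The programming model an SMP exposes can be sketched with threads (a minimal hypothetical example): all threads read and write the same memory directly, with no explicit data transfers, and only need to coordinate their accesses.

```python
# Sketch of the shared-memory model: threads in one process share
# the same address space; communication is implicit through memory.
import threading

counter = 0
lock = threading.Lock()

def work(n):
    global counter
    for _ in range(n):
        with lock:       # coordinate access to the shared location
            counter += 1

threads = [threading.Thread(target=work, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 4000
```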
Issues in Centralized Main Memory:
Memory size: a large memory is slow. Memory bandwidth: cache misses from all cores serially queue multiple requests to main memory, causing serious lag. Works well only for smaller machines, up to about 16 cores.
Distributed (Multicomputer) Memory System (NUMA):
Distributed memory: each core has its own local memory and cache, forming a single-core system with a network interface connected to an interconnection network. A cache miss goes directly to the local processor's memory; a processor accesses another processor's memory only through network message passing. To communicate, data must be sent EXPLICITLY to the other core, so the programmer is forced to be aware of communication between cores and to try to minimize it. Think of it as a set of machines communicating over a network.
Shared Memory vs Message Passing (Hardware vs Software):
Performance Metrics (Message Passing vs Shared Memory):
                          Message Passing        Shared Memory
Communication             Programmer (explicit)  Automatic
Data distribution         Manual                 Automatic
Hardware support          Simple                 Extensive
Programming: correctness  Difficult (deadlocks)  Less difficult
Programming: performance  Difficult              Very difficult
Multithreading as Shared Memory Hardware:
Analyzing Multithreading Performance:
Summary: Multithreaded Categories