Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Lecture 17: Multiprocessors Topics: multiprocessor intro and taxonomy, symmetric shared-memory multiprocessors (Sections 4.1-4.2)

Similar presentations


Presentation on theme: "1 Lecture 17: Multiprocessors Topics: multiprocessor intro and taxonomy, symmetric shared-memory multiprocessors (Sections 4.1-4.2)"— Presentation transcript:

1 1 Lecture 17: Multiprocessors Topics: multiprocessor intro and taxonomy, symmetric shared-memory multiprocessors (Sections 4.1-4.2)

2 2 Multiprocessors So far, we have spoken at length microprocessors. We will now study the multiprocessor, how they work, what are the specific problems that appear

3 3 Taxonomy SISD: single instruction and single data stream: uniprocessor MISD: no commercial multiprocessor: imagine data going through a pipeline of execution engines SIMD: vector architectures: lower flexibility MIMD: most multiprocessors today: easy to construct with off-the-shelf computers, most flexibility

4 4 Multiprocessor types used The first multi SIMD processors were kind, and this architecture is still used for certain specialized machines MIMD type seems to be nowadays choice target for the current application computers: The MIMD are flexible: they can be used as a single user machine, or as multi-programmed machinery The MIMD can be built from existing processors

5 5 In the center of MIMD processors: memory We Can be classified into two classes MIMD processors depending on the number of processors in the machine. Ultimately, it is the memory organization that is affected: - Memory Organization – I:Centralized shared memory - Memory Organization – II:distributed memory

6 6 Memory Organization - I Centralized shared-memory multiprocessor or Symmetric shared-memory multiprocessor (SMP) Multiple processors connected to a single centralized memory – since all processors see the same memory organization  uniform memory access (UMA) We use a bus that connects the processor and memory, with the help of local cache Shared-memory because all processors can access the entire memory address space

7 7 SMPs or Centralized Shared-Memory Processor Caches Processor Caches Processor Caches Processor Caches Main Memory I/O System

8 8 Memory Organization - II For higher scalability, memory is distributed among processors  distributed memory multiprocessors If one processor can directly address the memory local to another processor, the address space is shared  distributed shared-memory (DSM) multiprocessor If memories are strictly local, we need messages to communicate data  cluster of computers or multicomputers Non-uniform memory architecture (NUMA) since local memory has lower latency than remote memory

9 9 Distributed Memory Multiprocessors Processor & Caches MemoryI/O Processor & Caches MemoryI/O Processor & Caches MemoryI/O Processor & Caches MemoryI/O Interconnection network

10 10 Shared-Memory Vs. Message-Passing Shared-memory: Processors communicate via message passing Processors have private memories Focuses attention on costly non-local operations Message-passing: No cache coherence  simpler hardware Processors communicate by memory read/write Explicit communication  easier for the programmer to restructure code The kind that we will focus on today

11 11 Shared-Memory Vs. Message-Passing Shared-memory: Well-understood programming model Communication is implicit and hardware handles protection Hardware-controlled caching Message-passing: No cache coherence  simpler hardware Explicit communication  easier for the programmer to restructure code Sender can initiate data transfer

12 12 SMPs Centralized main memory and many caches  many copies of the same data A system is cache coherent if a read returns the most recently written value for that word Time Event Value of X in Cache-A Cache-B Memory 0 - - 1 1 CPU-A reads X 1 - 1 2 CPU-B reads X 1 1 1 3 CPU-A stores 0 in X 0 1 0

13 13 Cache Coherence A memory system is coherent if: P writes to X; no other processor writes to X; P reads X and receives the value previously written by P P1 writes to X; no other processor writes to X; sufficient time elapses; P2 reads X and receives value written by P1 Two writes to the same location by two processors are seen in the same order by all processors – write serialization The memory consistency model defines “time elapsed” before the effect of a processor is seen by others

14 14 Cache Coherence Protocols Directory-based: A single location (directory) keeps track of the sharing status of a block of memory Snooping: every cache is connected to a common memory bus and continually listen to what's happening.  Write-invalidate: The principle is as follows: following the writing of data, all other copies of that data are marked as invalid.  Write-update: when a processor writes, it updates other shared copies of that block

15 15 Design Issues Invalidate Find data Writeback / writethrough Processor Caches Processor Caches Processor Caches Processor Caches Main Memory I/O System Cache block states Contention for tags Enforcing write serialization

16 16 Example Protocol RequestSourceBlock stateAction Read hitProcShared/exclRead data in cache Read missProcInvalidPlace read miss on bus Read missProcSharedConflict miss: place read miss on bus Read missProcExclusiveConflict miss: write back block, place read miss on bus Write hitProcExclusiveWrite data in cache Write hitProcSharedPlace write miss on bus Write missProcInvalidPlace write miss on bus Write missProcSharedConflict miss: place write miss on bus Write missProcExclusiveConflict miss: write back, place write miss on bus Read missBusSharedNo action; allow memory to respond Read missBusExclusivePlace block on bus; change to shared Write missBusSharedInvalidate block Write missBusExclusiveWrite back block; change to invalid


Download ppt "1 Lecture 17: Multiprocessors Topics: multiprocessor intro and taxonomy, symmetric shared-memory multiprocessors (Sections 4.1-4.2)"

Similar presentations


Ads by Google