1 Multiprocessors
2 Idea: create powerful computers by connecting many smaller ones. Good news: works for timesharing (better than a supercomputer). Bad news: it's really hard to write good concurrent programs, and there have been many commercial failures. Issues: data sharing, synchronization, interconnection.
3 How do parallel processors share data? Single shared address space: processors communicate through shared variables. Comes in two flavors: SMP (symmetric multiprocessors), aka UMA (uniform memory access) multiprocessors, and NUMA (non-uniform memory access) multiprocessors.
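A minimal sketch of communication through a shared variable. Python threads stand in for processors here, and the names `shared`, `ready`, `producer`, and `consumer` are illustrative assumptions, not part of the slides.

```python
import threading

# Shared address space: both threads see the same `shared` object.
shared = {"data": None}
ready = threading.Event()  # signals that the shared variable has been written


def producer():
    shared["data"] = 42  # an ordinary store to a shared variable
    ready.set()


def consumer(out):
    ready.wait()  # wait until the producer has written
    out.append(shared["data"])  # an ordinary load sees the producer's value


received = []
threads = [
    threading.Thread(target=consumer, args=(received,)),
    threading.Thread(target=producer),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

No data is copied or sent explicitly: the consumer simply reads the location the producer wrote.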
4 How do processors share data? (contd.) Distributed memory multiprocessors –Each processor has its own private memory –Processors communicate via message passing
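For contrast with shared variables, message passing can be sketched as follows. This is an illustration only: a thread with purely local state stands in for a node with private memory, and `queue.Queue` stands in for the network. The `worker` function and the queue names are assumptions.

```python
import queue
import threading


def worker(inbox, outbox):
    # Private memory: `local_sum` is visible only inside this node.
    local_sum = 0
    while True:
        msg = inbox.get()  # receive
        if msg is None:  # sentinel: no more messages
            break
        local_sum += msg
    outbox.put(local_sum)  # send the result back


to_worker = queue.Queue()
to_main = queue.Queue()
node = threading.Thread(target=worker, args=(to_worker, to_main))
node.start()

for value in (1, 2, 3):
    to_worker.put(value)  # explicit send
to_worker.put(None)

total = to_main.get()  # explicit receive
node.join()
```

All data moves through explicit sends and receives; neither side reads the other's variables directly.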
5 How are the processors connected? Either by a single bus or through an interconnection network. Reasons why a single bus can be used: each microprocessor is fairly simple and small, so many of them can be placed on one bus, and caches can lower bus traffic. The cost: need to worry about keeping caches and memory consistent.
7 Snooping protocols. On a read miss, all caches check whether they hold a copy of the requested block, and a cache that does supplies the data to the cache that missed. On a write, all other caches either invalidate or update their local copies: write-invalidate vs. write-update (write-broadcast).
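The write-invalidate variant can be sketched as a toy simulation. This is a simplified model under stated assumptions: whole values instead of cache blocks, write-through to memory, and no cache states beyond valid or absent. The `Cache` and `Bus` classes are hypothetical names for illustration.

```python
class Cache:
    def __init__(self, name):
        self.name = name
        self.lines = {}  # address -> value; presence means the copy is valid


class Bus:
    def __init__(self, memory):
        self.memory = memory
        self.caches = []

    def attach(self, cache):
        self.caches.append(cache)

    def read(self, requester, addr):
        if addr in requester.lines:
            return requester.lines[addr]  # hit: no bus traffic
        # Read miss: every other cache snoops the request.
        for cache in self.caches:
            if cache is not requester and addr in cache.lines:
                value = cache.lines[addr]  # a cache supplies the data
                break
        else:
            value = self.memory[addr]  # no cached copy: memory supplies it
        requester.lines[addr] = value
        return value

    def write(self, requester, addr, value):
        # Write-invalidate: all other caches drop their copies.
        for cache in self.caches:
            if cache is not requester:
                cache.lines.pop(addr, None)
        requester.lines[addr] = value
        self.memory[addr] = value  # assumption: write-through for simplicity


bus = Bus(memory={0x10: 1})
a, b = Cache("A"), Cache("B")
bus.attach(a)
bus.attach(b)

first_a = bus.read(a, 0x10)  # miss, filled from memory
first_b = bus.read(b, 0x10)  # miss, supplied by cache A
bus.write(a, 0x10, 7)  # B's copy is invalidated
b_invalidated = 0x10 not in b.lines
second_b = bus.read(b, 0x10)  # miss again, B sees the new value
```

A write-update protocol would instead store the new value into every other cache's copy inside `write`, at the cost of more bus traffic per write.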
8 Synchronization. Need to coordinate processors working on a common task. Typically, a programmer uses lock variables (semaphores) to synchronize processes. Locks are built on an atomic swap operation (an indivisible read-modify-write cycle). Processors “spin” on the lock variable, waiting their turn to enter the critical section. Each can spin on its local cached copy (thanks to the cache coherence protocol), so spinning causes no bus traffic until the lock changes.
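A spin lock built on atomic swap might look like the sketch below. Python has no hardware swap instruction, so the hypothetical `SwapCell` class models one by guarding the read-modify-write with an internal mutex; the rest follows the slide: spin on plain reads, then try the swap.

```python
import threading
import time


class SwapCell:
    """A memory word with an atomic swap (hypothetical model of the hardware)."""

    def __init__(self, value=0):
        self._value = value
        self._hw = threading.Lock()  # models the indivisible bus cycle

    def swap(self, new):
        with self._hw:  # read and write happen as one step
            old, self._value = self._value, new
            return old

    def load(self):
        return self._value  # plain read, like reading the local cached copy

    def store(self, new):
        with self._hw:
            self._value = new


def acquire(lock):
    while True:
        while lock.load() != 0:  # spin on reads while the lock is held
            time.sleep(0)  # yield to other Python threads (not part of the technique)
        if lock.swap(1) == 0:  # the old value 0 means we took the lock
            return


def release(lock):
    lock.store(0)


lock = SwapCell()
state = {"counter": 0}


def work(iterations):
    for _ in range(iterations):
        acquire(lock)
        state["counter"] += 1  # critical section
        release(lock)


threads = [threading.Thread(target=work, args=(200,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Spinning on `load` before attempting `swap` mirrors spinning on the cached copy: only the swap itself would need the bus.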
9 Limitations of a single bus –Conflicting design goals: high bandwidth, low latency, and long length cannot all be achieved at once. –The bandwidth of a single memory module attached to the bus is limited. –Together these put a practical limit on the number of processors that can be connected to a single bus. Another solution: use multiple private memories, which implies explicit communication using sends and receives.
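A back-of-the-envelope sketch of the bandwidth limit. The formula is a simplification that ignores latency and contention, and every number below is a made-up illustration, not a figure from the slides.

```python
import math


def max_processors(bus_bytes_per_sec, refs_per_sec, miss_rate, block_bytes):
    """Processors a bus can feed if each cache miss moves one block over it."""
    demand_per_cpu = refs_per_sec * miss_rate * block_bytes
    return math.floor(bus_bytes_per_sec / demand_per_cpu)


# Hypothetical machine: 1 GB/s bus, 100 million memory references per second
# per processor, 32-byte cache blocks.
with_good_caches = max_processors(1e9, 100e6, miss_rate=0.02, block_bytes=32)
with_poor_caches = max_processors(1e9, 100e6, miss_rate=0.10, block_bytes=32)
```

Under these assumed numbers a 2% miss rate supports 15 processors and a 10% miss rate only 3, which is why caches matter so much on a shared bus.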
10 Distributed Shared Memory (DSM). A software layer is added to provide a single address space on top of sends and receives. It is analogous to a virtual memory system: a uniprocessor uses page tables to decide whether the data is in memory or on the disk. Here, the table tells whether the data is in the local memory, in another processor’s memory, or on the disk. Shared-memory communication must be rare; otherwise most of the time will be spent transferring pages among memories.
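The three-way table lookup can be sketched as below. This is an illustrative model only: the `DsmNode` class, the 4096-byte page size, and the policy of migrating a page to the accessing node are all assumptions, and the actual send/receive traffic is reduced to a counter.

```python
from enum import Enum


class Location(Enum):
    LOCAL = "local memory"
    REMOTE = "another processor's memory"
    DISK = "disk"


class DsmNode:
    def __init__(self, node_id, page_size=4096):
        self.node_id = node_id
        self.page_size = page_size
        self.table = {}  # page number -> (Location, owner node id or None)
        self.remote_fetches = 0

    def map_page(self, page, location, owner=None):
        self.table[page] = (location, owner)

    def access(self, addr):
        """Return where the page was found, then bring it into local memory."""
        page = addr // self.page_size
        location, _owner = self.table[page]
        if location is Location.REMOTE:
            # A real system would exchange send/receive messages with the owner.
            self.remote_fetches += 1
        if location is not Location.LOCAL:
            self.table[page] = (Location.LOCAL, None)  # assumption: page migrates
        return location


node = DsmNode(node_id=0)
node.map_page(0, Location.LOCAL)
node.map_page(1, Location.REMOTE, owner=1)
node.map_page(2, Location.DISK)

hit = node.access(100)  # page 0
first_remote = node.access(4096)  # page 1, fetched from node 1
second_remote = node.access(4100)  # page 1 again, now local
```

Each remote access costs a whole page transfer, so the counter `remote_fetches` is the quantity that has to stay small for DSM to perform well.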
11 Network topologies. Fully connected network. Bus. A wide range of networks exists, representing trade-offs between cost and performance. Examples: multistage networks, a crossbar network, an Omega network.
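To make the cost trade-off concrete, the sketch below counts hardware for three of these topologies and routes a message through an Omega network using destination-tag routing. It assumes the number of nodes `n` is a power of two and an Omega network built from 2x2 switches; the function names are illustrative.

```python
def network_costs(n):
    """Rough hardware counts for n nodes (n assumed to be a power of two)."""
    stages = n.bit_length() - 1  # log2(n)
    return {
        "fully_connected_links": n * (n - 1) // 2,
        "crossbar_crosspoints": n * n,
        "omega_2x2_switches": (n // 2) * stages,
    }


def omega_route(src, dst, n):
    """Destination-tag routing: port position after each stage."""
    bits = n.bit_length() - 1
    pos = src
    path = []
    for stage in range(bits):
        # Perfect shuffle between stages: rotate the position left by one bit.
        pos = ((pos << 1) | (pos >> (bits - 1))) & (n - 1)
        # The 2x2 switch picks its output from the destination bit, MSB first.
        dst_bit = (dst >> (bits - 1 - stage)) & 1
        pos = (pos & ~1) | dst_bit
        path.append(pos)
    return path


costs = network_costs(8)
path = omega_route(2, 5, 8)
```

For 8 nodes this gives 28 links, 64 crosspoints, and 12 switches respectively. The Omega network is cheaper than the crossbar, but two messages can need the same switch output at the same time and block each other.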