
1  Chapter 9: Distributed Shared Memory
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers, 2nd ed., by B. Wilkinson & M. Allen, © 2004 Pearson Education Inc. All rights reserved.
- Introduction
- Implementing Distributed Shared Memory (DSM)
- Achieving Consistent Memory in a DSM System
- DSM Programming Systems
- Implementing a Simple DSM System

2  Introduction
- DSM: making a group of interconnected computers, each with its own memory, appear to have a single shared address space.
- That is, the programmer views memory as grouped together and sharable among the processors.
- Shared memory programming techniques are used in DSM.

3  DSMs: Advantages and Disadvantages
Advantages:
- System is scalable.
- Hides the message passing - the programmer does not explicitly specify sending messages between processes.
- Can use simple extensions to sequential programming.
- Can handle complex and large databases without replication or sending the data to processes.
Disadvantages:
- May incur a performance penalty.
- Must provide for protection against simultaneous access to shared data (locks, etc.).
- Little programmer control over the actual messages being generated.
- Performance of irregular problems in particular may be difficult.

4  Introduction to DSM (cont'd)
- A DSM system is likely to be less efficient than a true shared memory system. Why?
- A DSM system is likely to be less efficient than an explicit message-passing system. Why?
- What facilities must a system provide to implement a DSM?

5  Implementing DSM Systems
1. Hardware
   - Special network interfaces and cache coherence circuits
2. Software: no hardware changes to the cluster
   - Modifying the OS kernel
   - Adding a software layer between the operating system and the application (most convenient way for teaching purposes)
The software layer can be:
- Page based - using the system's virtual memory
- Shared variable approach - using routines to access shared variables
- Object based - shared data held within a collection of objects; access to shared data through an object-oriented discipline (ideally)

6  Page Based DSM Implementation
Disadvantage: the unit of data movement is a complete page, leading to:
- Longer messages than necessary
- False sharing
May not be portable - tied to particular virtual memory hardware and software.

7  Shared Variable and Object-Based DSMs
Shared variable DSMs:
- Only variables declared as shared are transferred, and this is done on demand.
- Software routines (not the paging mechanism) are used to cause the transfers.
- The routines, called by the programmer directly or indirectly, perform the actions.
- If performance is not a key factor, this can be implemented very easily.
Object-based DSMs:
- Can be regarded as an extension of the shared variable approach.
- Shared data are embodied in objects that include the data items and the only methods needed to access the data.
- Relatively easy to implement in OO languages like Java and C++.

8  Managing Shared Data
A processor can get access to shared data through:
- A centralized server
- Multiple copies of the shared data
In the first solution, a centralized server is responsible for all read/write operations on shared data:
- All reads/writes of shared data occur in one place and sequentially.
- Implements a single reader/single writer policy.
- Rarely used because of the server bottleneck.
The second solution implements a multiple reader/single writer policy, allowing simultaneous access to shared data:
- Achieved by replicating data at the required sites.
- Only one site (the owner) is allowed to alter the shared data.
In the multiple reader/single writer solution, one of the following coherence policies can be used when the owner modifies shared data:
- Update policy
- Invalidate policy

9  Achieving Consistent Memory in a DSM System
Addresses when the current value of a shared variable is seen by other processors:
- Strict consistency - processors see the most recent update, i.e. a read returns the most recent write to the location.
- Relaxed consistency - delays making a write visible to reduce messages.
  - Weak consistency - the programmer must use synchronization operations to enforce sequential consistency when necessary.
  - Release consistency - the programmer must use specific synchronization operators, acquire and release.
  - Lazy release consistency - the update is only done at the time of an acquire.

10  Strict Consistency
Every write is immediately visible.
Disadvantages: number of messages, latency, and writes may be propagated unnecessarily.

11  Release Consistency
An extension of weak consistency in which the synchronization operations have been specified:
- acquire operation - used before a shared variable or variables are to be read.
- release operation - used after the shared variable or variables have been altered (written); allows another process to access the variable(s).
Typically acquire is done with a lock operation and release with an unlock operation (although not necessarily).

12  Release Consistency

13  Lazy Release Consistency
Advantage: fewer messages.

14  DSM Programming Systems
Four necessary operations that must be provided in shared memory programming:
1. Process/thread creation (and termination)
2. Shared-data creation
3. Mutual-exclusion synchronization
4. Process/thread and event synchronization
These also have to be provided in a DSM system, typically by user-level library calls.
Some DSM systems: Adsmith, TreadMarks, OpenMP

15  Adsmith and TreadMarks
Adsmith:
- An object-based DSM - memory is seen as a collection of objects that can be shared among processes on different processors.
- A shared-variable system from the user's perspective.
- Written in C++ and built on top of PVM.
TreadMarks:
- One of the most famous page-based DSM systems, developed at Rice University.
- Implements release consistency and multiple-writer protocols.

16  DSM Implementation Projects Using Underlying Message-Passing Software
- A basic shared-variable implementation can be easy to do.
- Can sit on top of message-passing software such as MPI.
Issues in implementing a DSM system:
- Managing shared data - reader/writer policies
- Timing issues - relaxing read/write orders
Reader/writer policies:
- Single reader/single writer policy - simple to do with centralized servers
- Multiple reader/single writer policy - again quite simple to do
- Multiple reader/multiple writer policy - tricky

17  Simple DSM System using a Centralized Server

18  Simple DSM System: Server Code for Shared Variables

do {
    /* wait for a command from any client */
    recv(&command, &shared_x_name, &data, &source, any_source, any_tag);
    find(&shared_x_name, &x);     /* find shared variable, return pointer to it */
    switch (command) {
    case rd:                      /* read routine */
        send(&x, source);         /* no lock needed */
        break;
    case wr:                      /* write routine */
        x = data;
        send(&ack, source);       /* send ack; update done */
        break;
    …
    }
} while (command != terminator);

19  Simple DSM System using Multiple Servers

20  Simple DSM System using Multiple Servers and Multiple Reader Policy

