1 Chapter 9 Distributed Shared Memory

2 Making the main memory of a cluster of computers look as though it is a single memory with a single address space, even though it is physically distributed. Shared memory programming techniques can then be used.

3 DSM System Messages or other mechanisms are still needed to move data to a processor, but these are hidden from the programmer. Interconnect: bus or switch (cluster).

4

5

6 Advantages of DSM
System is scalable
Hides the message passing - the programmer does not explicitly specify sending messages between processes
Can use simple extensions to sequential programming
Can handle complex and large databases without replication or sending the data to processes

7 Disadvantages of DSM
May incur a performance penalty
Must provide for protection against simultaneous access to shared data (locks, etc.)
Little programmer control over the actual messages being generated
Good performance on irregular problems in particular may be difficult to achieve

8 SMP cluster

9 Methods of Achieving DSM
Hardware - special network interfaces and cache coherence circuits
Software - modifying the OS kernel, or adding a software layer between the operating system and the application (the most convenient way for teaching purposes)

10 Software DSM Implementation
Page based - using the system's virtual memory
Shared variable approach - using routines to access shared variables
Object based - shared data within a collection of objects; access to shared data through an object-oriented discipline (ideally)

11 Software Page Based DSM Implementation
Virtual shared memory system
Disadvantages: whole pages are transferred, false sharing effects, and not portable

12 Shared-variable and Object-based Approaches
Shared-variable approach - software routines access the shared variables
Object-based approach - similar to the shared-variable approach

13 Some Software DSM Systems
TreadMarks - page based DSM system; apparently not now available
JIAJIA - C based; obtained at UNC-Charlotte but required significant modifications for our system (in message-passing calls)
Adsmith - object based, C++ library routines; we have this installed on our cluster - chosen for teaching

14 Hardware DSM Implementation In the hardware approach, special network interfaces and cache coherence circuits are added to the system to make a memory reference to a remote memory location look like a reference to a local memory location. The hardware approach should provide a higher level of performance than the software approach. The purely software approach is more attractive than the hardware approach for teaching purposes.

15 Managing Shared Data There are several ways that a processor could be given access to shared data:
–The simplest solution is to have a central server (single reader/single writer policy)
–Multiple servers
–Multiple copies of data (coherence policy, multiple reader/single writer policy)
There are two possibilities for keeping multiple copies coherent:
–Update policy (broadcast)
–Invalidate policy
Multiple reader/multiple writer is also possible.

16 Consistency Models
Strict Consistency - each processor sees the most recent update, i.e. a read returns the most recent write to the location.
Sequential Consistency - the result of any execution is the same as some interleaving of the individual programs.
Relaxed Consistency - delay making a write visible to reduce messages.
Weak Consistency - programmer must use synchronization operations to enforce sequential consistency when necessary.
Release Consistency - programmer must use specific synchronization operators, acquire and release.
Lazy Release Consistency - update only done at the time of acquire.

17 Strict Consistency
Every write is immediately visible.
Disadvantages: large number of messages, latency, possibly unnecessary updates.
An invalidate policy can be used.

18 Consistency Models used on DSM Systems
Release Consistency - an extension of weak consistency in which the synchronization operations have been specified:
acquire operation - used before a shared variable or variables are to be read (like a lock)
release operation - used after the shared variable or variables have been altered (written); allows another process access to the variable(s) (like an unlock)
Typically acquire is done with a lock operation and release with an unlock operation (although not necessarily).

19 Release Consistency

20 Lazy Release Consistency Advantage: fewer messages

21 Adsmith

22 DISTRIBUTED SHARED MEMORY PROGRAMMING PRIMITIVES Same as in shared memory programming.

23 Adsmith
User-level library that creates a distributed shared memory system on a cluster.
Object based DSM - memory seen as a collection of objects that can be shared among processes on different processors.
Written in C++.
Built on top of PVM.
Freely available - installed on the UNCC cluster.
User writes application programs in C or C++ and calls Adsmith routines for creation of shared data and control of its access.

24 Adsmith Routines These notes are based upon material in Adsmith User Interface document.

25 Initialization/Termination Explicit initialization/termination of Adsmith is not necessary.

26 Process To start a new process or processes:
adsm_spawn(filename, count)
Example: adsm_spawn("prog1", 10); starts 10 copies of prog1 (10 processes).
Must use an Adsmith routine to start a new process.
There is also a version of adsm_spawn() with similar parameters to pvm_spawn().

27 Process "join" adsmith_wait(); will cause the process to wait for all its child processes (processes it created) to terminate. Versions are available to wait for specific processes to terminate, using the PVM tid to identify processes. One would then need to use the PVM form of adsm_spawn() that returns the tids of the child processes.

28 Shared Data Creation

29 Access to Shared Data (Objects)
Adsmith uses "release consistency." The programmer explicitly needs to control competing read/write accesses from different processes.
Three types of access in Adsmith, differentiated by the use of the shared data:
Ordinary Accesses - for regular assignment statements accessing shared variables.
Synchronization Accesses - competing accesses used for synchronization purposes.
Non-Synchronization Accesses - competing accesses not used for synchronization.

30 Shared Data Access

31 Ordinary Accesses - Basic Read/Write Actions
Before a read, do: adsm_refresh() to get the most recent value - an "acquire/load."
After a write, do: adsm_flush() to store the result - a "store."
Example:
int *x; // shared variable
adsm_refresh(x);
a = *x + b;

32 Synchronization Accesses To control competing accesses, the following are available:
Semaphores
Mutexes (mutual exclusion variables)
Barriers
All require an identifier to be specified, as all three class instances are shared between processes.

33 Barrier Routines One barrier routine: barrier()
class AdsmBarrier {
public:
  AdsmBarrier( char *identifier );
  void barrier( int count );
};

34 Example
AdsmBarrier barrier1("sample");
.
barrier1.barrier(procno);

35 Features to Improve Performance Overlapping computations with communications. Routines that can be used to overlap messages or reduce the number of messages:
Prefetch
Bulk transfer
Combined routines for critical sections

36 Prefetch adsm_prefetch( void *ptr ) - used before adsm_refresh() to get the data as early as possible. Non-blocking, so the process can continue with other work prior to issuing the refresh.

37 Reducing the Number of Messages

38 DISTRIBUTED SHARED MEMORY PROGRAMMING