Implementing Sequentially Consistent Programs on Processor Consistent Platforms Lisa Higham and Jalal Kawash University of Calgary, Canada American University.


Implementing Sequentially Consistent Programs on Processor Consistent Platforms Lisa Higham and Jalal Kawash University of Calgary, Canada American University of Sharjah, UAE

Outline Memory consistency models –Sequential consistency –P-RAM –Coherence –PC-G Compiling from SC to PC-G –Good news –Bad news –Proof

Multi-Processor’s Computation [Figure: processes P, ..., P, each issuing its own sequence of operations o1, o2, o3, ...; O is the set of all operations]

Sequential Consistency [Figure: processes P, ..., P connected through a switch to a single shared memory]

Sequential Consistency: Example [Figure: three processes P and variables x, y; operations w(x, 2), w(y, 1), w(x, 3), w(y, 4), r(y) 4, r(x) 2; total order shown: w(x, 3) w(y, 4) r(y) 4 w(x, 2) w(y, 1) r(x)]

Sequential Consistency: Definition A computation is sequentially consistent iff $\exists$ a valid total order $L$ on $O$ such that $(O, \xrightarrow{\mathrm{prog}}) \subseteq (O, \xrightarrow{L})$
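The definition can be turned into a brute-force check for small computations. The sketch below is illustrative only and is not taken from the slides: the operation encoding, the function name `find_sc_order`, the initial value 0 for every variable, and the two sample computations are all assumptions. It searches the interleavings that respect program order and accepts one only if every read returns the most recent write.

```python
from typing import Dict, List, Optional, Tuple

Op = Tuple[str, str, int]  # (kind, variable, value); kind is "w" or "r"


def find_sc_order(procs: List[List[Op]]) -> Optional[List[Op]]:
    """Return a valid total order extending program order, or None.

    Assumption: every variable starts at 0.
    """
    def search(pos: Tuple[int, ...], mem: Dict[str, int], acc: List[Op]):
        if all(pos[i] == len(procs[i]) for i in range(len(procs))):
            return list(acc)
        for i, ops in enumerate(procs):
            if pos[i] == len(ops):
                continue  # process i has no operations left
            kind, var, val = ops[pos[i]]
            if kind == "r" and mem.get(var, 0) != val:
                continue  # the read would not be valid at this point
            new_mem = dict(mem)
            if kind == "w":
                new_mem[var] = val
            nxt = pos[:i] + (pos[i] + 1,) + pos[i + 1:]
            acc.append(ops[pos[i]])
            found = search(nxt, new_mem, acc)
            acc.pop()
            if found is not None:
                return found
        return None

    return search(tuple(0 for _ in procs), {}, [])


# Hypothetical two-process computations (not from the slides).
sc_ok = [[("w", "x", 1), ("r", "y", 0)], [("w", "y", 1), ("r", "x", 1)]]
not_sc = [[("w", "x", 1), ("r", "y", 0)], [("w", "y", 1), ("r", "x", 0)]]
```

For `sc_ok` the search finds an order; for `not_sc` no order exists, because each read of 0 would have to precede the other process's write, which creates a cycle with program order.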

P-RAM [Figure: each process P holds its own copy of memory (x, y, z); the processes are connected by FIFO channels]

P-RAM: Example w(x, 1) x  1 w(y, 2) y  2 P P P P PPPP r(y) 2 r(x) 0 r(x) 1 r(y) 0 w(y, 2)w(x, 1) r(x) 1 r(y) 0 r(y) 2 r(x) 0 xy xy xy xy

P-RAM Definition A computation is P-RAM iff for each process $p$, $\exists$ a valid total order $L_p$ such that $(O|p \cup O|w, \xrightarrow{\mathrm{prog}}) \subseteq (O|p \cup O|w, \xrightarrow{L_p})$
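A small sketch of how the P-RAM condition could be checked: for each process p, keep all of p's operations plus only the writes of every other process, then look for a valid total order of that restricted set. The encoding, the function names, and the initial value 0 are assumptions. The sample computation is one reading of the P-RAM example slide (two writers, two readers), and that assignment of operations to processes is itself an assumption.

```python
from typing import Dict, List, Tuple

Op = Tuple[str, str, int]  # (kind, variable, value); kind is "w" or "r"


def has_valid_order(seqs: List[List[Op]]) -> bool:
    """True if some interleaving of seqs makes every read return the
    latest preceding write (variables assumed to start at 0)."""
    def search(pos: Tuple[int, ...], mem: Dict[str, int]) -> bool:
        if all(pos[i] == len(s) for i, s in enumerate(seqs)):
            return True
        for i, s in enumerate(seqs):
            if pos[i] == len(s):
                continue
            kind, var, val = s[pos[i]]
            if kind == "r" and mem.get(var, 0) != val:
                continue
            new_mem = {**mem, var: val} if kind == "w" else mem
            if search(pos[:i] + (pos[i] + 1,) + pos[i + 1:], new_mem):
                return True
        return False

    return search(tuple(0 for _ in seqs), {})


def is_pram(procs: List[List[Op]]) -> bool:
    """Check the P-RAM condition: one valid view per process over
    its own operations plus everybody's writes."""
    for p in range(len(procs)):
        view = [ops if q == p else [o for o in ops if o[0] == "w"]
                for q, ops in enumerate(procs)]
        if not has_valid_order(view):
            return False
    return True


# Assumed reading of the slide's example: two writers, two readers.
example = [
    [("w", "x", 1)],
    [("w", "y", 2)],
    [("r", "x", 1), ("r", "y", 0)],
    [("r", "y", 2), ("r", "x", 0)],
]
# Hypothetical computation in which a reader sees one writer's writes
# out of program order; this should fail the check.
reordered = [[("w", "x", 1), ("w", "x", 2)], [("r", "x", 2), ("r", "x", 1)]]
```

Under these assumptions `is_pram(example)` holds although `has_valid_order(example)` does not, so the computation is P-RAM but not sequentially consistent.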

Coherence [Figure: processes P, ..., P accessing the separate shared variables x, y, z]

Coherence: Example [Figure: processes P and variables x, y, initially 0; operations w(x, 2), r(y) 0, r(x) 0, w(y, 3); afterwards x = 2 and y = 3]

Coherence: Definition A computation is Coherent iff for each variable $x$, $\exists$ a valid total order $L_x$ such that $(O|x, \xrightarrow{\mathrm{prog}}) \subseteq (O|x, \xrightarrow{L_x})$

PC-G: P-RAM and Coherence [Figure: diagram relating P-RAM, Coherence, and PC-G]

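PC-G Definition A computation is PC-G iff for each process $p$, $\exists$ a valid total order $L_p$ such that – $(O|p \cup O|w, \xrightarrow{\mathrm{prog}}) \subseteq (O|p \cup O|w, \xrightarrow{L_p})$ – $\forall$ processes $q$ and $\forall$ variables $x$: $(O|w \cap O|x, \xrightarrow{L_p}) = (O|w \cap O|x, \xrightarrow{L_q})$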
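To make the two conditions concrete, here is a hedged sketch that checks whether a given set of per-process views satisfies them. Nothing here comes from the slides apart from the definition itself: the tuple encoding `(pid, kind, variable, value)`, the function names, and the initial value 0 are assumptions. The sample views are a reading of the "Possible PC-G Views" in the transformation example of this deck, with processes numbered 0 and 1 for convenience.

```python
from typing import List, Tuple

Op = Tuple[int, str, str, int]  # (pid, kind, variable, value)


def is_valid(view: List[Op]) -> bool:
    """Every read returns the latest preceding write (initially 0)."""
    mem = {}
    for _, kind, var, val in view:
        if kind == "w":
            mem[var] = val
        elif mem.get(var, 0) != val:
            return False
    return True


def respects_program(p: int, view: List[Op], procs: List[List[Op]]) -> bool:
    """The view holds exactly p's operations plus all writes, and
    keeps each process's operations in program order."""
    for q, ops in enumerate(procs):
        expected = ops if q == p else [o for o in ops if o[1] == "w"]
        if [o for o in view if o[0] == q] != expected:
            return False
    return True


def writes_to(view: List[Op], var: str) -> List[Op]:
    return [o for o in view if o[1] == "w" and o[2] == var]


def satisfies_pcg(procs: List[List[Op]], views: List[List[Op]]) -> bool:
    variables = {o[2] for ops in procs for o in ops}
    for p, view in enumerate(views):
        if not is_valid(view) or not respects_program(p, view, procs):
            return False
    # Second condition: all views order the writes to each variable alike.
    return all(writes_to(v, x) == writes_to(views[0], x)
               for v in views for x in variables)


procs = [
    [(0, "w", "x", 1), (0, "r", "y", 4)],
    [(1, "w", "y", 4), (1, "w", "y", 2), (1, "r", "x", 0)],
]
views = [
    [(0, "w", "x", 1), (1, "w", "y", 4), (0, "r", "y", 4), (1, "w", "y", 2)],
    [(1, "w", "y", 4), (1, "w", "y", 2), (1, "r", "x", 0), (0, "w", "x", 1)],
]
# Hypothetical views that disagree on the order of two writes to x.
procs2 = [[(0, "w", "x", 1)], [(1, "w", "x", 2)]]
views2 = [[(0, "w", "x", 1), (1, "w", "x", 2)],
          [(1, "w", "x", 2), (0, "w", "x", 1)]]
```

The first pair of views passes both conditions. The second pair is valid view by view but fails the agreement condition, which is what separates PC-G from plain P-RAM.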

PC-G vs. SC Algorithms are designed for SC machines. Some of them work unchanged when run on PC-G machines (e.g. Peterson's 2-process algorithm). Most SC algorithms do not work on PC-G machines (e.g. test&set and the Bakery algorithm).

Can we transform an SC algorithm to an equivalent PC-G algorithm? Can we find a compiler that transforms any SC algorithm to an equivalent PC-G algorithm?

Program Transformation and Interpretation [Figure: transformation α maps program P to program α(P); P is executed on SC and α(P) on PC-G; an interpretation maps D to E, shown alongside C]
C = {Computations of P on SC machines}
D = {Computations of α(P) on M machines}
E = {Interpretations of D on SC machines}

Program Implementation [Figure: transformation α maps program P to program α(P); P is executed on SC and α(P) on PC-G; an interpretation maps D to E, shown alongside C]
C = {SC Computations of P}
D = {PC-G Computations of α(P)}
E = {Interpretations of D}
If α implements P for every program P, then α is a compiler from SC to PC-G

Transformation Function α
m: a new multi-writer variable
Instruction: write(s_y, val)
α(Instruction): write(m, id(y)); write(s_y, val); write(m, id(y))
Instruction: read(s_y)
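The write rule can be sketched as a small program rewriter. This is a minimal illustration, not the authors' implementation. The instruction encoding and the name `alpha` are hypothetical, the value written to m is passed in as `ident` (the slide writes id(y)), and leaving reads unchanged is an assumption, since the read row of the slide's table is incomplete in this transcript.

```python
from typing import List, Tuple

Instr = Tuple  # ("write", var, val) or ("read", var)


def alpha(program: List[Instr], ident) -> List[Instr]:
    """Wrap every write in two writes to the extra variable m.

    ident stands for the slide's id(y); what it denotes exactly is an
    assumption here.
    """
    out: List[Instr] = []
    for instr in program:
        if instr[0] == "write":
            out.append(("write", "m", ident))
            out.append(instr)
            out.append(("write", "m", ident))
        else:
            # Assumption: reads pass through untouched.
            out.append(instr)
    return out


transformed = alpha([("write", "x", 1), ("read", "y")], "P")
```

Each original write therefore costs two extra writes to m, and no extra reads are introduced.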

Results Claim 1: –α implements Lamport’s Bakery algorithm for 2 processes on PC-G Claim 2: –α is a compiler from SC to PC-G for any program, provided there are only 2 processes and only single-writer variables

Transformation Example
Program: one process P executes w(x, 1); r(y). The other process P executes w(y, 4); w(y, 2); r(x).
Under SC: if r(y) returns 4, then r(x) returns 1
Possible PC-G Views:
View of the first process: w(x, 1) w(y, 4) r(y) 4 w(y, 2)
View of the second process: w(y, 4) w(y, 2) r(x) 0 w(x, 1)
Under PC-G: r(y) returns 4 and r(x) may return 0
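The SC claim on this slide can be checked by enumerating every interleaving of the two processes. The sketch below assumes that one process runs w(x, 1); r(y), that the other runs w(y, 4); w(y, 2); r(x), and that x and y start at 0. The function name and encoding are illustrative.

```python
from itertools import combinations
from typing import List, Set, Tuple

P1 = [("w", "x", 1), ("r", "y")]
P2 = [("w", "y", 4), ("w", "y", 2), ("r", "x")]


def sc_outcomes(a: List[tuple], b: List[tuple]) -> Set[Tuple[int, int]]:
    """Run every interleaving of a and b on one shared memory and
    collect the pairs (value read from y, value read from x)."""
    outcomes = set()
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        mem = {}      # shared memory; missing entries read as 0
        reads = {}    # variable -> value returned by its single read
        ia = ib = 0
        for t in range(n):
            if t in slots:
                op = a[ia]
                ia += 1
            else:
                op = b[ib]
                ib += 1
            if op[0] == "w":
                mem[op[1]] = op[2]
            else:
                reads[op[1]] = mem.get(op[1], 0)
        outcomes.add((reads["y"], reads["x"]))
    return outcomes


results = sc_outcomes(P1, P2)
```

No interleaving produces r(y) = 4 together with r(x) = 0, which is exactly the outcome the PC-G views on the slide allow.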

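Transformation Example
Program: one process P executes w(x, 1); r(y). The other process P executes w(y, 4); w(y, 2); r(x).
α(Program): the first process executes w1(m, P); w(x, 1); w2(m, P); r(y). The second process executes w1(m, P); w(y, 4); w2(m, P); w3(m, P); w(y, 2); w4(m, P); r(x).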

Transformation Example If r(y) returns 4, r(x) must return 1 [Figure: the two processes' views of α(Program), each containing w(x, 1), w(y, 4), w(y, 2) together with the writes w1(m, P), w2(m, P) of one process and w1(m, P), ..., w4(m, P) of the other]

Proof Sketch [Figure: the program, the two PC-G views, and the system view built from them; each shows the writes m1, ..., m6 to m and the writes w, w1, w2]

Proof Sketch System view – Contains all reads and writes by both processes –Maintains program order –Is valid

Summary Compiler: Only one additional variable Only writes to that variable Provided: Two processors Single writer variables

Impossibilities For more than 2 processors, there is no compiler from SC to PC-G that: Only adds write instructions (with any number of variables) nor Uses only one additional variable (with any number of reads and writes)

Pros and Cons Restricted –2 processes –Only single writer variables Valuable –Mutual exclusion (ME): Lamport’s Bakery algorithm –Wait-free test&set: no solution was previously known under weak memory consistency

Wait-Free Test&set
Define test&set
  if (s_i = you and s_j ≠ rst) then return 1
  repeat
    s_i ← choose
    case s_j is:
      you, rst: s_i ← me
      me: s_i ← you
      choose: s_i ← random(me, you)
    end case
  until (s_i ≠ s_j)
  if (s_i = me) then return 0 else return 1
m ← i
Define reset
  s_i ← rst

Conclusions α works for any two-process program with single-writer variables. α works for particular programs with more than 2 processes (randomized wait-free n-process test&set). If there is a transformation that works for other cases, it must be more complicated: –Cannot be write-adding –Must use more than one additional variable

Thank You. Questions?
