Shared Memory Consistency Models: A Tutorial

Presentation transcript:

Shared Memory Consistency Models: A Tutorial
Authors: Sarita V. Adve, Kourosh Gharachorloo
Sourced from Adve's Presentation

Overview
-- Memory Consistency Model
-- Implicit Memory Model: Sequential Consistency
-- Relaxed Memory Model (system-centric)
-- Relaxed Memory Model (programmer-centric)

Memory Consistency Model
Definition: the order in which memory operations appear to execute
-- What value can a read return?
-- A read should return the value of the “last” write to the same memory location
Affects the 3 Ps:
-- Programmability (ease of programming)
-- Performance (optimization)
-- Portability (moving software across different systems)

Implicit Memory Model
Sequential consistency (SC) [Lamport]
The result of an execution appears as if:
-- All operations executed in some sequential order
-- Memory operations of each process appear in program order
[Figure: processors P1, P2, P3, Pn connected to a single MEMORY]
Two aspects:
-- Program order
-- Atomicity

Architectures without Caches: example 1
Initially Flag1 = Flag2 = 0

P1                          P2
Flag1 = 1                   Flag2 = 1
if (Flag2 == 0)             if (Flag1 == 0)
    critical section            critical section

Execution:
P1 (Operation, Location, Value)    P2 (Operation, Location, Value)
Write, Flag1, 1                    Write, Flag2, 1
Read, Flag2, 0                     Read, Flag1, 0?

Architectures without Caches: example 1
P1 (Operation, Location, Value)    P2 (Operation, Location, Value)
Write, Flag1, 1                    Write, Flag2, 1
Read, Flag2, 0                     Read, Flag1, 0
Can happen if:
-- Write buffers with read bypassing are used
-- A write followed by a read is overlapped or reordered in hardware or by the compiler
-- Flag1 or Flag2 is allocated in a register
Using a write buffer is a safe optimization on a conventional uniprocessor, but it can violate SC in a multiprocessor system.
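
Example 1 can be sketched in C++ as a rough illustration. This is not from the original slides: the names flag1, flag2, both_read_zero and count_violations are hypothetical, and the sketch assumes a C++11 or later compiler with thread support. With sequentially consistent atomics the language forbids the outcome in which both processors read 0, so the count below should stay at zero.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names mirroring the slide's Flag1 / Flag2.
std::atomic<int> flag1{0}, flag2{0};

// Runs one round of example 1 and reports whether BOTH processors read 0,
// i.e. whether both would have entered the critical section.
bool both_read_zero() {
    flag1.store(0);
    flag2.store(0);
    int r1 = -1, r2 = -1;
    std::thread p1([&] {
        flag1.store(1, std::memory_order_seq_cst);   // Write, Flag1, 1
        r1 = flag2.load(std::memory_order_seq_cst);  // Read,  Flag2
    });
    std::thread p2([&] {
        flag2.store(1, std::memory_order_seq_cst);   // Write, Flag2, 1
        r2 = flag1.load(std::memory_order_seq_cst);  // Read,  Flag1
    });
    p1.join();
    p2.join();
    return r1 == 0 && r2 == 0;
}

// Repeats the experiment and counts rounds in which SC was violated.
int count_violations(int rounds) {
    int violations = 0;
    for (int i = 0; i < rounds; ++i) {
        if (both_read_zero()) ++violations;
    }
    return violations;
}
```

If the flags were plain int variables instead, the compiler or hardware would be free to apply the optimizations listed above, and the program would also contain a data race.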

Architectures without Caches: example 1
[Figure: write buffer]

Architectures without Caches: example 2
Initially Head = Data = 0

P1                    P2
Data = 2000;          while (Head != 1) {;}
Head = 1;             ... = Data;

Execution:
P1                    P2
Write, Data, 2000     Read, Head, 0
Write, Head, 1        Read, Head, 1
                      Read, Data, 0?
Can happen if writes are overlapped or reordered, or reads are non-blocking, in hardware or the compiler
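
A minimal C++ sketch of example 2, offered as an illustration rather than as part of the original slides. The function name run_example2 is hypothetical, and the choice of release/acquire ordering on Head is an assumption about how one might rule out the stale read. The release store on Head keeps the write to Data from being reordered after it, and the acquire load keeps the read of Data from being performed early.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int Data = 0;               // ordinary (non-atomic) payload, as on the slide
std::atomic<int> Head{0};   // flag that publishes Data

// Runs example 2 once and returns the value P2 reads from Data.
int run_example2() {
    Data = 0;
    Head.store(0);
    int observed = -1;
    std::thread p2([&] {
        // Spin until P1 publishes; acquire orders the later read of Data.
        while (Head.load(std::memory_order_acquire) != 1) {
            std::this_thread::yield();
        }
        observed = Data;                            // Read, Data
    });
    std::thread p1([&] {
        Data = 2000;                                // Write, Data, 2000
        Head.store(1, std::memory_order_release);   // Write, Head, 1
    });
    p1.join();
    p2.join();
    return observed;
}
```

Under these assumptions P2 can only leave the loop after seeing Head == 1, so it must read 2000 from Data.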

Architectures without Caches: example 2
[Figure: overlapped writes]

Architectures without Caches: example 3
[Figure: non-blocking reads]

Architectures With Caches: Cache Coherence and SC
Cache coherence
-- A write is visible to all processors
-- Writes to the same location are serialized
SC
-- Writes to all locations are serialized
-- Operations appear to execute in program order
SC implies cache coherence: the memory consistency model can be viewed as the policy that places an early and late bound on when a new value can be propagated by invalidating or updating cached copies
Atomicity for writes
-- Propagating changes to cached copies is a non-atomic operation
-- Serializing writes can avoid violating SC:
   -- The network preserves the ordering of updates/invalidates between source and destination
   -- Or an update/invalidate is delayed from being sent out until any updates or invalidates from the previous write are acknowledged
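
Write atomicity can be illustrated with a small litmus test, sketched here in C++ as an assumption-laden example that is not part of the original slides. The variables x and y and the functions readers_disagree and count_disagreements are hypothetical. Two writers update different locations while two readers read them in opposite orders. If writes appear atomic, the readers can never disagree about which write happened first, and sequentially consistent atomics give that guarantee.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> x{0}, y{0};   // hypothetical shared locations

// One round: returns true if the two readers saw the writes to x and y
// in opposite orders, which would mean writes did not appear atomic.
bool readers_disagree() {
    x.store(0);
    y.store(0);
    int r1 = 0, r2 = 0, r3 = 0, r4 = 0;
    std::thread w1([&] { x.store(1); });               // seq_cst by default
    std::thread w2([&] { y.store(1); });
    std::thread rd1([&] { r1 = x.load(); r2 = y.load(); });
    std::thread rd2([&] { r3 = y.load(); r4 = x.load(); });
    w1.join();
    w2.join();
    rd1.join();
    rd2.join();
    // rd1 saw x before y, while rd2 saw y before x.
    return r1 == 1 && r2 == 0 && r3 == 1 && r4 == 0;
}

// Repeats the litmus test and counts disagreements.
int count_disagreements(int rounds) {
    int n = 0;
    for (int i = 0; i < rounds; ++i) {
        if (readers_disagree()) ++n;
    }
    return n;
}
```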

SC Summary
SC constrains all memory operations:
-- Write -> Read
-- Write -> Write
-- Read -> Read, Write
Simple model for reasoning about parallel programs
But an intuitively reasonable reordering of memory operations on a uniprocessor may violate the sequential consistency model on a multiprocessor
Modern microprocessors reorder operations all the time to obtain performance (write buffers, overlapped writes, non-blocking reads…)
How do we reconcile the sequential consistency model with the demands of performance?

Relaxed Memory Model Optimizations
Program order relaxation:
-- Write -> Read
-- Write -> Write
-- Read -> Read, Write
Read others’ write early
Read own write early

Relaxed Memory Model (system-centric)
-- Models provide a safety net
-- Models maintain uniprocessor data and control dependences, and write serialization
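
One way to picture a safety net is an explicit fence. The sketch below is an illustration only, using C++ relaxed atomics as a stand-in for a relaxed hardware model. The names a, b, both_read_zero_with_fences and count_fence_violations are hypothetical. The relaxed store and load alone would allow the Write -> Read reordering from example 1, and the sequentially consistent fence placed between them restores the ordering.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> a{0}, b{0};   // hypothetical flags, as in example 1

// One round of example 1 using relaxed accesses plus a fence as safety net.
// Returns true if both threads read 0.
bool both_read_zero_with_fences() {
    a.store(0);
    b.store(0);
    int r1 = -1, r2 = -1;
    std::thread t1([&] {
        a.store(1, std::memory_order_relaxed);
        // Safety net: orders the store above before the load below.
        std::atomic_thread_fence(std::memory_order_seq_cst);
        r1 = b.load(std::memory_order_relaxed);
    });
    std::thread t2([&] {
        b.store(1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);
        r2 = a.load(std::memory_order_relaxed);
    });
    t1.join();
    t2.join();
    return r1 == 0 && r2 == 0;
}

// Repeats the experiment and counts rounds where both threads read 0.
int count_fence_violations(int rounds) {
    int n = 0;
    for (int i = 0; i < rounds; ++i) {
        if (both_read_zero_with_fences()) ++n;
    }
    return n;
}
```

Removing either fence would make the both-read-0 outcome permitted again, which is the burden that system-centric models place on the programmer.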

System-Centric Model Assessment
System-centric models provide higher performance than SC, BUT how about the 3P criteria?
-- Programmability? The programmer needs to reason about correctness under the optimizations that the specific model allows
-- Portability? Many different models
-- Performance? Can we do better?
Programmer-Centric Model

Programmer-Centric Models
-- Data operations can be executed more aggressively
-- The programmer provides information about memory operations
-- A system based on the model exploits the optimizations without violating consistency
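
In C++ terms, one possible reading of "providing information about memory operations" is marking which operations are synchronization and leaving the rest as ordinary data operations. The sketch below is an assumption for illustration and is not taken from the slides. The names shared_counter, counter_lock and run_counter are hypothetical. Lock and unlock are the labeled synchronization operations, the increment is plain data, and because the program has no data races the result is what sequential consistency would predict.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

long shared_counter = 0;    // ordinary data, free to be optimized
std::mutex counter_lock;    // synchronization, visible to the system as such

// Starts `threads` workers that each add `increments` to the counter,
// then returns the final value.
long run_counter(int threads, int increments) {
    shared_counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([increments] {
            for (int i = 0; i < increments; ++i) {
                // Lock acquire and release are the labeled sync operations.
                std::lock_guard<std::mutex> guard(counter_lock);
                ++shared_counter;   // data operation inside the lock
            }
        });
    }
    for (auto& w : workers) w.join();
    return shared_counter;
}
```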

Programmer-Centric Model

Programmer-Centric Model Assessment
3P criteria:
-- Programmability: the system ensures correctness, instead of the programmer relying on safety nets
-- Performance: optimizations enabled by weak ordering (WO) can be applied, and the information enables more aggressive optimization
-- Portability: not based on a specific model