CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Snoopy Caches I Steve Ko Computer Sciences and Engineering University at Buffalo.

Slides:



Advertisements
Similar presentations
Cache Coherence. Memory Consistency in SMPs Suppose CPU-1 updates A to 200. write-back: memory and cache-2 have stale values write-through: cache-2 has.
Advertisements

Symmetric Multiprocessors: Synchronization and Sequential Consistency.
1 Lecture 20: Synchronization & Consistency Topics: synchronization, consistency models (Sections )
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 5 CS252 Graduate Computer Architecture Spring 2014 Lecture 5: Out-of-Order Processing Krste Asanovic.
Synchronization. How to synchronize processes? – Need to protect access to shared data to avoid problems like race conditions – Typical example: Updating.
CS492B Analysis of Concurrent Programs Lock Basics Jaehyuk Huh Computer Science, KAIST.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Cache III Steve Ko Computer Sciences and Engineering University at Buffalo.
4/16/2013 CS152, Spring 2013 CS 152 Computer Architecture and Engineering Lecture 19: Directory-Based Cache Protocols Krste Asanovic Electrical Engineering.
4/4/2013 CS152, Spring 2013 CS 152 Computer Architecture and Engineering Lecture 17: Synchronization and Sequential Consistency Krste Asanovic Electrical.
CS 162 Memory Consistency Models. Memory operations are reordered to improve performance Hardware (e.g., store buffer, reorder buffer) Compiler (e.g.,
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 12 CS252 Graduate Computer Architecture Spring 2014 Lecture 12: Synchronization and Memory Models Krste.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Complex Pipelining II Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 486/586, Spring 2014 CSE 486/586 Distributed Systems Consistency Steve Ko Computer Sciences and Engineering University at Buffalo.
Chapter 6: Process Synchronization
Silberschatz, Galvin and Gagne ©2013 Operating System Concepts – 9 th Edition Chapter 5: Process Synchronization.
CH7 discussion-review Mahmoud Alhabbash. Q1 What is a Race Condition? How could we prevent that? – Race condition is the situation where several processes.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture ILP II Steve Ko Computer Sciences and Engineering University at Buffalo.
Parallel Processing (CS526) Spring 2012(Week 6).  A parallel algorithm is a group of partitioned tasks that work with each other to solve a large problem.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Snoopy Caches II Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture ILP III Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Directory-Based Caches II Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Cache II Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Directory-Based Caches I Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Synchronization and Consistency II Steve Ko Computer Sciences and Engineering University at.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Address Translation and Protection Steve Ko Computer Sciences and Engineering University at.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Virtual Memory I Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Cache IV Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590 Computer Architecture Virtual Memory II
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Putting it all together: Intel Nehalem Steve Ko Computer Sciences and Engineering University.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture ILP I Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Synchronization and Consistency I Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Multithreading II Steve Ko Computer Sciences and Engineering University at Buffalo.
CS 152 Computer Architecture and Engineering Lecture 15 - Advanced Superscalars Krste Asanovic Electrical Engineering and Computer Sciences University.
Computer Architecture 2011 – coherency & consistency (lec 7) 1 Computer Architecture Memory Coherency & Consistency By Dan Tsafrir, 11/4/2011 Presentation.
Chapter 6: Process Synchronization. Outline Background Critical-Section Problem Peterson’s Solution Synchronization Hardware Semaphores Classic Problems.
CS 152 Computer Architecture and Engineering Lecture 19: Synchronization and Sequential Consistency Krste Asanovic Electrical Engineering and Computer.
CS 152 Computer Architecture and Engineering Lecture 21: Directory-Based Cache Protocols Scott Beamer (substituting for Krste Asanovic) Electrical Engineering.
CS 152 Computer Architecture and Engineering Lecture 19: Synchronization and Sequential Consistency Krste Asanovic Electrical Engineering and Computer.
CS 252 Graduate Computer Architecture Lecture 11: Multiprocessors-II Krste Asanovic Electrical Engineering and Computer Sciences University of California,
April 13, 2011CS152, Spring 2011 CS 152 Computer Architecture and Engineering Lecture 18: Snoopy Caches Krste Asanovic Electrical Engineering and Computer.
CPE 731 Advanced Computer Architecture Snooping Cache Multiprocessors Dr. Gheith Abandah Adapted from the slides of Prof. David Patterson, University of.
April 8, 2010CS152, Spring 2010 CS 152 Computer Architecture and Engineering Lecture 19: Synchronization and Sequential Consistency Krste Asanovic Electrical.
CS 152 Computer Architecture and Engineering Lecture 20: Snoopy Caches Krste Asanovic Electrical Engineering and Computer Sciences University of California,
April 4, 2011CS152, Spring 2011 CS 152 Computer Architecture and Engineering Lecture 17: Synchronization and Sequential Consistency Krste Asanovic Electrical.
1 Lecture 22: Synchronization & Consistency Topics: synchronization, consistency models (Sections )
April 18, 2011CS152, Spring 2011 CS 152 Computer Architecture and Engineering Lecture 19: Directory-Based Cache Protocols Krste Asanovic Electrical Engineering.
April 15, 2010CS152, Spring 2010 CS 152 Computer Architecture and Engineering Lecture 20: Snoopy Caches Krste Asanovic Electrical Engineering and Computer.
Operating Systems CSE 411 CPU Management Oct Lecture 13 Instructor: Bhuvan Urgaonkar.
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 15 CS252 Graduate Computer Architecture Spring 2014 Lecture 15: Virtual Memory and Caches Krste Asanovic.
Memory Consistency Models Some material borrowed from Sarita Adve’s (UIUC) tutorial on memory consistency models.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Distributed Shared Memory Steve Ko Computer Sciences and Engineering University at Buffalo.
Memory Consistency Models Alistair Rendell See “Shared Memory Consistency Models: A Tutorial”, S.V. Adve and K. Gharachorloo Chapter 8 pp of Wilkinson.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Mutual Exclusion & Leader Election Steve Ko Computer Sciences and Engineering University.
CS267 Lecture 61 Shared Memory Hardware and Memory Consistency Modified from J. Demmel and K. Yelick
4/13/2016 CS152, Spring 2016 CS 152 Computer Architecture and Engineering Lecture 18: Snoopy Caches Dr. George Michelogiannakis EECS, University of California.
Symmetric Multiprocessors: Synchronization and Sequential Consistency
CS 152 Computer Architecture and Engineering Lecture 18: Snoopy Caches
Krste Asanovic Electrical Engineering and Computer Sciences
Symmetric Multiprocessors: Synchronization and Sequential Consistency
Krste Asanovic Electrical Engineering and Computer Sciences
Krste Asanovic Electrical Engineering and Computer Sciences
Symmetric Multiprocessors: Synchronization and Sequential Consistency
Krste Asanovic Electrical Engineering and Computer Sciences
Dr. George Michelogiannakis EECS, University of California at Berkeley
Lecture 2 Part 2 Process Synchronization
CS 152 Computer Architecture and Engineering Lecture 20: Snoopy Caches
CS 152 Computer Architecture and Engineering CS252 Graduate Computer Architecture Lecture 18 Cache Coherence Krste Asanovic Electrical Engineering and.
CS 152 Computer Architecture and Engineering CS252 Graduate Computer Architecture Lecture 22 Synchronization Krste Asanovic Electrical Engineering and.
CSE 486/586 Distributed Systems Cache Coherence
CS 152 Computer Architecture and Engineering CS252 Graduate Computer Architecture Lecture 19 Memory Consistency Models Krste Asanovic Electrical Engineering.
Presentation transcript:

CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Snoopy Caches I Steve Ko Computer Sciences and Engineering University at Buffalo

CSE 490/590, Spring Last time… Implementations for semaphores –Test&set –Compare&swap –Load-reserve & store-conditional Sequential consistency vs. weaker consistencies –Agreement between hardware and software –For weaker consistency models, hardware provides extra instructions for software to implement stronger guarantees, e.g., memory fences

CSE 490/590, Spring Mutual Exclusion Using Load/Store A protocol based on two shared variables c1 and c2. Initially, both c1 and c2 are 0 (not busy) What is wrong? Process 1... c1=1; L: if c2=1 then go to L c1=0; Process 2... c2=1; L: if c1=1 then go to L c2=0;

CSE 490/590, Spring Mutual Exclusion: second attempt To avoid deadlock, let a process give up the reservation (i.e. Process 1 sets c1 to 0) while waiting. Deadlock is not possible but with a low probability a livelock may occur. An unlucky process may never get to enter the critical section starvation Process 1... L: c1=1; if c2=1 then { c1=0; go to L} c1=0 Process 2... L: c2=1; if c1=1 then { c2=0; go to L} c2=0

CSE 490/590, Spring A Protocol for Mutual Exclusion T. Dekker, 1966 Process 1... c1=1; turn = 1; L: if c2=1 & turn=1 then go to L c1=0; A protocol based on 3 shared variables c1, c2 and turn. Initially, both c1 and c2 are 0 (not busy) turn = i ensures that only process i can wait variables c1 and c2 ensure mutual exclusion Solution for n processes was given by Dijkstra and is quite tricky! Process 2... c2=1; turn = 2; L: if c1=1 & turn=2 then go to L c2=0;

CSE 490/590, Spring Analysis of Dekker’s Algorithm... Process 1 c1=1; turn = 1; L: if c2=1 & turn=1 then go to L c1=0;... Process 2 c2=1; turn = 2; L: if c1=1 & turn=2 then go to L c2=0; Scenario 1... Process 1 c1=1; turn = 1; L: if c2=1 & turn=1 then go to L c1=0;... Process 2 c2=1; turn = 2; L: if c1=1 & turn=2 then go to L c2=0; Scenario 2

CSE 490/590, Spring N-process Mutual Exclusion Lamport’s Bakery Algorithm Process i choosing[i] = 1; num[i] = max(num[0], …, num[N-1]) + 1; choosing[i] = 0; for(j = 0; j < N; j++) { while( choosing[j] ); while( num[j] && ( ( num[j] < num[i] ) || ( num[j] == num[i] && j < i ) ) ); } num[i] = 0; Initially num[j] = 0, for all j Entry Code Exit Code

CSE 490/590, Spring CSE 490/590 Administrivia CSE Graduate Conference on Friday, 145 Student Union –No class Keyboards available for pickup at my office Updated project 2 with more clarifications & grading criteria

CSE 490/590, Spring Memory Coherence in SMPs Suppose CPU-1 updates A to 200. write-back: memory and cache-2 have stale values write-through: cache-2 has a stale value Do these stale values matter? What is the view of shared memory for programming? cache-1 A100 CPU-Memory bus CPU-1 CPU-2 cache-2 A100 memory A100

CSE 490/590, Spring Write-back Caches & SC T1 is executed prog T2 LD Y, R1 ST Y’, R1 LD X, R2 ST X’,R2 prog T1 ST X, 1 ST Y,11 cache-2 cache-1memory X = 0 Y =10 X’= Y’= X= 1 Y=11 Y = Y’= X = X’= cache-1 writes back Y X = 0 Y =11 X’= Y’= X= 1 Y=11 Y = Y’= X = X’= X = 1 Y =11 X’= Y’= X= 1 Y=11 Y’= 11 X = 0 X’= 0 cache-1 writes back X X = 0 Y =11 X’= Y’= X= 1 Y=11 Y’= 11 X = 0 X’= 0 T2 executed X = 1 Y =11 X’= 0 Y’=11 X= 1 Y=11 Y’=11 X = 0 X’= 0 cache-2 writes back X’ & Y’ inconsistent

CSE 490/590, Spring Write-through Caches & SC cache-2 Y = Y’= X = 0 X’= memory X = 0 Y =10 X’= Y’= cache-1 X= 0 Y=10 prog T2 LD Y, R1 ST Y’, R1 LD X, R2 ST X’,R2 prog T1 ST X, 1 ST Y,11 Write-through caches don’t preserve sequential consistency either T1 executed Y = Y’= X = 0 X’= X = 1 Y =11 X’= Y’= X= 1 Y=11 T2 executed Y = 11 Y’= 11 X = 0 X’= 0 X = 1 Y =11 X’= 0 Y’=11 X= 1 Y=11

CSE 490/590, Spring 2011 Cache Coherence vs. Memory Consistency A cache coherence protocol ensures that all writes by one processor are eventually visible to other processors –i.e., updates are not lost A memory consistency model gives the rules on when a write by one processor can be observed by a read on another –Equivalently, what values can be seen by a load A cache coherence protocol is not enough to ensure sequential consistency –But if sequentially consistent, then caches must be coherent Combination of cache coherence protocol plus processor memory reorder buffer implements a given machine’s memory consistency model 12

CSE 490/590, Spring Maintaining Cache Coherence Hardware support is required such that only one processor at a time has write permission for a location no processor can load a stale copy of the location after a write  cache coherence protocols

CSE 490/590, Spring Warmup: Parallel I/O (DMA stands for Direct Memory Access, means the I/O device can read/write memory autonomous from the CPU) Either Cache or DMA can be the Bus Master and effect transfers DISK DMA Physical Memory Proc. R/W Data (D) Cache Address (A) A D R/W Page transfers occur while the Processor is running Memory Bus

CSE 490/590, Spring Problems with Parallel I/O Memory Disk: Physical memory may be stale if cache copy is dirty Disk Memory: Cache may hold stale data and not see memory writes DISK DMA Physical Memory Proc. Cache Memory Bus Cached portions of page DMA transfers

CSE 490/590, Spring Acknowledgements These slides heavily contain material developed and copyright by –Krste Asanovic (MIT/UCB) –David Patterson (UCB) And also by: –Arvind (MIT) –Joel Emer (Intel/MIT) –James Hoe (CMU) –John Kubiatowicz (UCB) MIT material derived from course UCB material derived from course CS252