Cache Coherence Protocols:

What is Cache Coherence? When one core writes to its own cache, the other cores get to see that write when they later read the same location from their own caches. Coherence provides the underlying guarantees the programmer relies on for data validity. Even a single large shared L1 cache could not keep up with every core's requests (lower throughput), so each core gets a private cache, and those private copies must be kept coherent.

Cache Coherence: Do we need it?
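The problem is easiest to see with a small example. The sketch below is a hypothetical simulation, not part of the original slides: two private per-core caches sit over one shared memory with no coherence mechanism at all, so a write by Core 0 stays invisible to Core 1, which keeps reading its stale copy. The class name, addresses, and values are purely illustrative.

```python
# Minimal sketch, assuming private per-core caches and NO coherence mechanism.
memory = {0x100: 5}               # shared main memory: address -> value

class PrivateCache:
    """A hypothetical per-core cache with no bus snooping."""
    def __init__(self, mem):
        self.mem = mem
        self.lines = {}           # address -> locally cached value

    def read(self, addr):
        if addr not in self.lines:        # miss: fetch from memory
            self.lines[addr] = self.mem[addr]
        return self.lines[addr]           # hit: return the (possibly stale) copy

    def write(self, addr, value):
        self.lines[addr] = value          # only the local copy changes
        # neither memory nor any other cache is told about this write

core0, core1 = PrivateCache(memory), PrivateCache(memory)
print(core1.read(0x100))    # 5  -> Core 1 now caches address 0x100
core0.write(0x100, 42)      # Core 0 updates only its private copy
print(core1.read(0x100))    # still 5: a stale value; coherence property II is violated
```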

Coherence Property I: A read R of address X on core C0 returns the value written by the most recent write W to X on C0, provided no other core has written to X between W and R.

Coherence Property II: If C0 writes to X and C1 reads X after a sufficient time, and there are no other writes to X in between, then C1's read returns the value from C0's write.

Coherence Property III: Writes to the same location are serialized: any two or more writes to X must be seen to occur in the same order by all cores.

How to get Cache Coherence?
- No caches at all (bad performance).
- All cores share the same L1 cache (bad performance).
- Force a read in one cache to see a write made in another, e.g. by broadcasting every write so other caches can update their copies (write-update coherence).

Write Update Snooping Coherence (the initial issue):

Write Update Snooping (Issue Resolved) – II. Snooping: Cache 0 monitors Core 1's write to block A by watching the shared bus. Update: when the write is seen, the value is updated in every core's cache that holds memory block A.
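A minimal sketch of this idea, assuming a simple single-bus model: every write goes on the bus, memory is written through, and each other cache snoops the bus and updates its copy if it holds the block. The Bus and SnoopyCache names and the addresses are hypothetical, chosen only for illustration.

```python
# Minimal sketch of write-update snooping with write-through caches.
class Bus:
    def __init__(self):
        self.caches = []

    def broadcast_write(self, source, addr, value):
        for cache in self.caches:
            if cache is not source:
                cache.snoop_write(addr, value)

class SnoopyCache:
    def __init__(self, bus, memory):
        self.bus, self.memory = bus, memory
        self.lines = {}                    # address -> cached value
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory[addr] = value          # write-through: memory is always current
        self.bus.broadcast_write(self, addr, value)

    def snoop_write(self, addr, value):
        if addr in self.lines:             # update only if we actually hold the block
            self.lines[addr] = value

memory = {0xA: 0}
bus = Bus()
cache0, cache1 = SnoopyCache(bus, memory), SnoopyCache(bus, memory)
cache1.read(0xA)           # Core 1 caches block A
cache0.write(0xA, 7)       # broadcast on the bus; Cache 1 snoops and updates
print(cache1.read(0xA))    # 7: the write is visible to both cores
```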

Multiple writes keep the caches synchronized:

Write Update Enhanced Version (Avoid Memory Writes): In the previous write-update protocol every write had to be broadcast on the bus and also written to memory (write-through caches). Adding a dirty bit to each cache block lets us delay the memory write until the block is replaced (evicted) from the cache.
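A hedged sketch of the dirty-bit variant, reusing the hypothetical Bus from the previous sketch: a write still updates other caches over the bus, but main memory is touched only when the dirty block is evicted. Real protocols also track which cache owns the write-back responsibility; that detail is omitted here.

```python
# Minimal sketch: write-update with a dirty bit (write-back instead of write-through).
# Assumes the hypothetical Bus class from the previous sketch.
class WriteBackCache:
    def __init__(self, bus, memory):
        self.bus, self.memory = bus, memory
        self.lines = {}                    # address -> cached value
        self.dirty = {}                    # address -> dirty bit
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]
            self.dirty[addr] = False
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.dirty[addr] = True            # mark dirty; memory is NOT written yet
        self.bus.broadcast_write(self, addr, value)

    def snoop_write(self, addr, value):
        if addr in self.lines:
            self.lines[addr] = value       # keep the local copy current

    def evict(self, addr):
        if self.dirty.get(addr):           # write back only if the block is dirty
            self.memory[addr] = self.lines[addr]
        self.lines.pop(addr, None)
        self.dirty.pop(addr, None)
```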

Core 0's block is refreshed and read with A's latest value. Dirty bit set: memory still needs to be updated (written back to RAM); only the cache holding the dirty block has the up-to-date value.

Multiple Writes and Dirty Block Replacement: memory is not updated until the dirty block is replaced.

Writing from a different Cache:

Dirty Bit Benefits:
- Write to memory only when a dirty block is replaced.
- Read from memory only if no cache holds the block in a dirty state; otherwise reads are served by the cache holding the dirty block.
- Significantly reduces read and write transactions to memory.
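The read-side saving can be sketched the same way (again a simplification, assuming caches shaped like the hypothetical WriteBackCache above): on a read miss, the bus first checks whether any other cache holds the block dirty and lets that cache supply the data; main memory is consulted only when every copy is clean.

```python
# Minimal sketch: service a read miss from a dirty cache instead of memory.
# Assumes caches shaped like the hypothetical WriteBackCache above.
def service_read_miss(requester, addr, caches, memory):
    for cache in caches:
        if cache is not requester and cache.dirty.get(addr):
            return cache.lines[addr]       # dirty cache supplies the data, no RAM access
    return memory[addr]                    # no dirty copy anywhere: read from memory
```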

Write Update Bus Enhancement:

Write to the same memory location when the block is shared (S = 1):

Broadcast writes only when the block is shared among cores (S = 1):
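A minimal sketch of the shared-bit idea, with hypothetical Bus and SharedBitCache classes: each block carries an S bit, set when a read miss finds the block in another cache; a write is broadcast only when S = 1, so writes to private blocks generate no bus traffic.

```python
# Minimal sketch: a per-block "shared" (S) bit suppresses broadcasts for private blocks.
class Bus:
    def __init__(self):
        self.caches = []

    def snoop_read(self, source, addr):
        shared = False
        for cache in self.caches:
            if cache is not source and addr in cache.lines:
                cache.shared[addr] = True  # the other holder also learns the block is shared
                shared = True
        return shared

    def broadcast_write(self, source, addr, value):
        for cache in self.caches:
            if cache is not source and addr in cache.lines:
                cache.lines[addr] = value

class SharedBitCache:
    def __init__(self, bus, memory):
        self.bus, self.memory = bus, memory
        self.lines = {}                    # address -> cached value
        self.shared = {}                   # address -> S bit
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:
            self.shared[addr] = self.bus.snoop_read(self, addr)   # S = 1 if someone else holds it
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        if self.shared.get(addr):          # S = 1: other copies exist, keep them updated
            self.bus.broadcast_write(self, addr, value)
        # S = 0: block is private, no bus traffic needed
```

With this design, repeated writes to a block that only one core uses cost nothing on the bus, which is exactly the traffic reduction the enhancement is after.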