Interconnect with Cache Coherency Manager


Initial State

Processor 0                     Processor 1
Block  State  Tag  Data         Block  State  Tag  Data
B0     I      100  00 10        B0     S      108  00 08
B1     S      108  00 08        B1     M      128  20 10
B2     M      110  00 10        B2     I      130  10 12
B3     I      118  00 18        B3     I      120  10 28

Memory
Tag  Block Data
100  00 10
108  00 08
110  00 10
118  00 18
120  10 28
128  20 10
130  10 12
132  40 12
134  10 22

After reference 1: P0: read 128

Processor 0                     Processor 1
Block  State  Tag  Data         Block  State  Tag  Data
B0     S      128  20 10        B0     S      108  00 08
B1     S      108  00 08        B1     S      128  20 10
B2     M      110  00 10        B2     I      130  10 12
B3     I      118  00 18        B3     I      120  10 28

Memory
Tag  Block Data
100  00 10
108  00 08
110  00 10
118  00 18
120  10 28
128  20 10
130  10 12
132  40 12
134  10 22

After reference 2: P1: read 132

Processor 0                     Processor 1
Block  State  Tag  Data         Block  State  Tag  Data
B0     S      128  20 10        B0     S      108  00 08
B1     S      108  00 08        B1     S      128  20 10
B2     M      110  00 10        B2     I      130  10 12
B3     S      132  40 12        B3     S      132  40 12

Memory
Tag  Block Data
100  00 10
108  00 08
110  00 10
118  00 18
120  10 28
128  20 10
130  10 12
132  40 12
134  10 22

After reference 3: P0: write 128 ← 20 15

Processor 0                     Processor 1
Block  State  Tag  Data         Block  State  Tag  Data
B0     M      128  20 15        B0     S      108  00 08
B1     S      108  00 08        B1     I      128  20 10
B2     M      110  00 10        B2     I      130  10 12
B3     S      132  40 12        B3     S      132  40 12

Memory
Tag  Block Data
100  00 10
108  00 08
110  00 10
118  00 18
120  10 28
128  20 10
130  10 12
132  40 12
134  10 22

After reference 4: P1: read 128

Processor 0                     Processor 1
Block  State  Tag  Data         Block  State  Tag  Data
B0     S      128  20 15        B0     S      108  00 08
B1     S      108  00 08        B1     S      128  20 15
B2     M      110  00 10        B2     I      130  10 12
B3     S      132  40 12        B3     S      132  40 12

Memory
Tag  Block Data
100  00 10
108  00 08
110  00 10
118  00 18
120  10 28
128  20 10
130  10 12
132  40 12
134  10 22

After reference 5: P0: reads 110

Processor 0                     Processor 1
Block  State  Tag  Data         Block  State  Tag  Data
B0     S      128  20 15        B0     S      108  00 08
B1     S      108  00 08        B1     S      128  20 15
B2     M      110  00 10        B2     I      130  10 12
B3     S      132  40 12        B3     S      132  40 12

Memory
Tag  Block Data
100  00 10
108  00 08
110  00 10
118  00 18
120  10 28
128  20 10
130  10 12
132  40 12
134  10 22

After reference 6: P1: reads 110

Processor 0                     Processor 1
Block  State  Tag  Data         Block  State  Tag  Data
B0     S      128  20 15        B0     S      108  00 08
B1     S      108  00 08        B1     S      128  20 15
B2     S      110  00 10        B2     S      110  00 10
B3     S      132  40 12        B3     S      132  40 12

Memory
Tag  Block Data
100  00 10
108  00 08
110  00 10
118  00 18
120  10 28
128  20 10
130  10 12
132  40 12
134  10 22
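
The state changes across the six references can be replayed with a small sketch. It assumes a plain MSI invalidate protocol (a read miss downgrades a remote Modified copy to Shared; a write invalidates every remote copy), which is an inference from the states shown on the slides, not something the slides state outright. It tracks only the state per tag, not block placement, replacement, or data, and the names `make_caches`, `read`, and `write` are hypothetical.

```python
from collections import defaultdict

def make_caches(initial):
    """One dict per processor mapping tag -> state; absent tags read as 'I'."""
    caches = []
    for states in initial:
        cache = defaultdict(lambda: "I")
        cache.update(states)
        caches.append(cache)
    return caches

def read(caches, proc, tag):
    """Processor `proc` reads block `tag`."""
    if caches[proc][tag] != "I":
        return  # read hit in S or M: no state change
    # Read miss: a remote Modified copy is downgraded to Shared.
    for other, cache in enumerate(caches):
        if other != proc and cache[tag] == "M":
            cache[tag] = "S"
    caches[proc][tag] = "S"

def write(caches, proc, tag):
    """Processor `proc` writes block `tag`: invalidate all remote copies."""
    for other, cache in enumerate(caches):
        if other != proc and cache[tag] != "I":
            cache[tag] = "I"
    caches[proc][tag] = "M"

# Initial states copied from the "Initial State" slide (tags kept as strings).
caches = make_caches([
    {"100": "I", "108": "S", "110": "M", "118": "I"},  # Processor 0
    {"108": "S", "128": "M", "130": "I", "120": "I"},  # Processor 1
])

read(caches, 0, "128")   # reference 1
read(caches, 1, "132")   # reference 2
write(caches, 0, "128")  # reference 3
read(caches, 1, "128")   # reference 4
read(caches, 0, "110")   # reference 5 (hit in M, no change)
read(caches, 1, "110")   # reference 6

final_p0 = {tag: caches[0][tag] for tag in ("128", "108", "110")}
final_p1 = {tag: caches[1][tag] for tag in ("108", "128", "110", "132")}
```

Under these assumptions, blocks 128 and 110 end up Shared in both caches, in line with the tables after reference 6.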

Memory after cache write-back:

Processor 0                     Processor 1
Block  State  Tag  Data         Block  State  Tag  Data
B0     S      128  20 15        B0     S      108  00 08
B1     S      108  00 08        B1     S      128  20 15
B2     S      110  00 10        B2     S      110  00 10
B3     S      132  40 12        B3     S      132  40 12

Memory
Tag  Block Data
100  00 10
108  00 08
110  00 10
118  00 18
120  10 28
128  20 15
130  10 12
132  40 12
134  10 22
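
The data path for block 128 can be sketched the same way. This is a minimal model, assuming the coherency manager forwards the dirty block cache-to-cache on P1's read and updates memory only in a separate write-back step, since that is the order the slides show. Many MSI implementations instead update memory at the moment the Modified copy is downgraded. The `pending_write_back` buffer and all function names are hypothetical.

```python
# Block 128 as it stands after reference 2: Shared in both caches.
memory = {"128": "20 10"}
caches = [
    {"128": ["S", "20 10"]},  # Processor 0
    {"128": ["S", "20 10"]},  # Processor 1
]
pending_write_back = {}  # hypothetical: dirty data not yet written to memory

def write(proc, tag, data):
    """Invalidate remote copies, then hold the new data in state M."""
    for other, cache in enumerate(caches):
        if other != proc and tag in cache:
            cache[tag][0] = "I"
    caches[proc][tag] = ["M", data]

def read(proc, tag):
    """Return the block data, fetching from a remote M copy if needed."""
    line = caches[proc].get(tag)
    if line and line[0] != "I":
        return line[1]  # read hit
    for other, cache in enumerate(caches):
        owner = cache.get(tag)
        if other != proc and owner and owner[0] == "M":
            owner[0] = "S"                       # downgrade the owner
            pending_write_back[tag] = owner[1]   # memory is updated later
            caches[proc][tag] = ["S", owner[1]]  # cache-to-cache transfer
            return owner[1]
    caches[proc][tag] = ["S", memory[tag]]       # no dirty copy: use memory
    return memory[tag]

def write_back():
    """Flush deferred dirty blocks to memory."""
    memory.update(pending_write_back)
    pending_write_back.clear()

write(0, "128", "20 15")              # reference 3
value_seen_by_p1 = read(1, "128")     # reference 4
memory_before = memory["128"]         # memory is still stale here
write_back()
memory_after = memory["128"]
```

In this model P1 reads the new value 20 15 from P0's cache while memory still holds 20 10, and memory catches up only after the write-back, which mirrors the difference between the slide after reference 4 and the final slide.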