Presenter: Yun-Chung Yang
TRB: Tag Replication Buffer for Enhancing the Reliability of the Cache Tag Array
Shuai Wang, Jie Hu, Sotirios G. Ziavras
Dept. of Electrical & Computer Engineering, New Jersey Institute of Technology, Newark, NJ, USA
2010 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2010)

Relation to a previously presented paper: Scratchpad Memory: A Design Alternative for Cache On-chip Memory in Embedded Systems — from the architecture and application of SPM and the usage of the tag array, to this paper.

Abstract: Protecting the on-chip cache memories against soft errors has become an increasing challenge in designing new-generation reliable microprocessors. Previous efforts have mainly focused on improving the reliability of the cache data arrays. Due to its crucial importance to the correctness of cache accesses, the tag array demands high reliability against soft errors while the data array is fully protected. Exploiting the address locality of memory accesses, we propose to duplicate the most recently accessed tag entries in a small Tag Replication Buffer (TRB) to protect the information integrity of the tag array in the data cache with low performance, energy, and area overheads. A Selective-TRB scheme is further proposed to protect only the tag entries of dirty cache lines. The experimental results show that the Selective-TRB scheme achieves a higher access-with-replica (AWR) rate of 97.4% for the dirty-cache-line tags. To provide a comprehensive evaluation of the tag-array reliability, we also conduct an architectural vulnerability factor (AVF) analysis for the tag array and propose a refined metric, detected-without-replica AVF (DOR-AVF), which combines the AVF and AWR analysis. Based on our DOR-AVF analysis, a TRB scheme with early write-back (EWB) is proposed, which achieves a zero DOR-AVF at a negligible performance overhead.

Related work:
- On the characterization and optimization of on-chip cache reliability against soft errors
- Fault-tolerant CAM architectures: a design framework
- Fault-tolerant content addressable memory
- CAT – caching address tags: a technique for reducing area cost of on-chip caches
- Computing architectural vulnerability factors for address-based structures
- Enhancing data cache reliability by the addition of a small fully-associative replication cache
This paper builds on this line of work.

Ionizing radiation causes single-event upsets (SEUs), also known as soft errors. An SEU can cause an incorrect value to be read out of the cache, which may crash the computation or communication.
Prior solutions to this problem:
- Parity code – protects the on-chip L1 caches (e.g., Intel Itanium 2, IBM POWER6 processors).
- Hamming code – widely adopted in L2/L3 caches:
  - ECC (error-correcting code)
  - SEC-DED (single-error correction, double-error detection)
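To make the parity point concrete, here is a minimal C sketch (my own illustration, not from the paper): even parity detects a single flipped bit in a tag but gives no way to correct it.

```c
#include <stdint.h>

/* Even parity over a 32-bit tag: detects any single-bit flip, but cannot
 * tell which bit flipped, so it cannot correct the error on its own.    */
static inline uint32_t parity_bit(uint32_t tag)
{
    tag ^= tag >> 16;
    tag ^= tag >> 8;
    tag ^= tag >> 4;
    tag ^= tag >> 2;
    tag ^= tag >> 1;
    return tag & 1u;   /* 1 if the tag holds an odd number of 1 bits */
}

/* On a read, recompute the parity and compare against the stored bit. */
static inline int parity_error(uint32_t tag, uint32_t stored_parity)
{
    return parity_bit(tag) != stored_parity;
}
```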

Contributions:
- Tag Replication Buffer (TRB) – a small buffer holding replicas of frequently accessed tag entries of the L1 cache.
- Selective-TRB – duplicates tag entries only for dirty cache lines.
- Architectural vulnerability factor (AVF) analysis:
  - Detected-without-replica AVF (DOR-AVF)
  - Early write-back (Selective-TRB-EWB)

Each TRB entry keeps two pointers:
- The set pointer indicates the set of the original tag entry.
- The way pointer indicates the way of the original tag entry in a set-associative cache; it is not needed in a direct-mapped cache.
The copy bit indicates whether a tag has a replica in the TRB (1 = yes, 0 = no). The valid bit indicates whether a TRB entry holds a valid or invalid tag replica.
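A minimal C sketch of the per-entry state described above; the field names and widths are my assumptions, not taken from the paper.

```c
#include <stdbool.h>
#include <stdint.h>

/* One TRB entry: a tag replica plus back-pointers to its home location. */
typedef struct {
    bool     valid;    /* entry currently holds a valid tag replica       */
    uint32_t tag;      /* the replicated tag value                        */
    uint16_t set_ptr;  /* set index of the original tag entry             */
    uint8_t  way_ptr;  /* way of the original entry (set-associative      */
                       /* caches only; unused in a direct-mapped cache)   */
} trb_entry_t;

/* One L1 tag entry: the tag itself plus the copy bit described above. */
typedef struct {
    uint32_t tag;
    bool     copy;     /* 1 = a replica of this tag exists in the TRB     */
} cache_tag_t;
```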

Dealing with soft errors:
- A single-bit error can be detected by parity coding, but parity alone cannot recover it. With the TRB, the error can be recovered whenever the copy bit is 1.
- Access-with-replica (AWR): the ratio of tag accesses that find a replica in the TRB.
Duplication (when a tag is copied into the TRB):
- When a cache line is brought into the L1 cache (from L2).
- On a TRB miss.
Replacement:
- LRU+ or FIFO+: look for an invalidated entry first, instead of just doing LRU or FIFO.
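Putting these bullets together, here is a rough sketch of the tag-read path as I understand it; it reuses the types and parity helpers from the earlier sketches, and all names (trb_lookup, read_tag, TRB_ENTRIES) are illustrative, not the paper's.

```c
#include <stddef.h>

#define TRB_ENTRIES 16               /* illustrative size */

extern trb_entry_t trb[TRB_ENTRIES];

/* Find the replica of the tag stored at (set, way), if any. */
static trb_entry_t *trb_lookup(uint16_t set, uint8_t way)
{
    for (int i = 0; i < TRB_ENTRIES; i++)
        if (trb[i].valid && trb[i].set_ptr == set && trb[i].way_ptr == way)
            return &trb[i];
    return NULL;
}

/* Read one tag: 0 = clean or recovered, -1 = detected but unrecoverable. */
int read_tag(cache_tag_t *entry, uint16_t set, uint8_t way,
             uint32_t stored_parity, uint32_t *tag_out)
{
    if (!parity_error(entry->tag, stored_parity)) {
        *tag_out = entry->tag;                 /* common case: no soft error */
        return 0;
    }
    if (entry->copy) {                         /* copy bit 1: a replica exists */
        trb_entry_t *r = trb_lookup(set, way);
        if (r) {
            entry->tag = r->tag;               /* restore the corrupted tag */
            *tag_out = r->tag;
            return 0;
        }
    }
    return -1;                                 /* "detected without replica" */
}
```

In this reading, a parity error on a tag without a replica is exactly the detected-without-replica case that the DOR-AVF metric later quantifies.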

Selective-TRB improves the efficiency of protection by protecting only the dirty cache lines:
- Clean cache lines have copies in the L2 cache and are recoverable there using ECC.
- This gives a better AWR rate than the plain TRB and also reduces the number of duplications.
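A small sketch of the Selective-TRB duplication decision, continuing the illustrative types above; trb_insert is a hypothetical helper standing in for the TRB's insert-and-replace logic.

```c
/* Hypothetical insert routine (FIFO+/LRU+ victim selection lives inside). */
extern void trb_insert(uint32_t tag, uint16_t set, uint8_t way);

/* Selective-TRB: only tags of dirty lines get a replica; clean lines can
 * be refetched from the ECC-protected L2, so they are left unreplicated. */
void maybe_replicate(cache_tag_t *entry, bool line_is_dirty,
                     uint16_t set, uint8_t way)
{
    if (!line_is_dirty)
        return;                        /* clean line: skip replication   */
    trb_insert(entry->tag, set, way);  /* place a copy in the TRB        */
    entry->copy = true;                /* mark the tag as replicated     */
}
```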

DOR-AVF (detected-without-replica AVF) combines the AVF and AWR analysis.
Solution: Selective-TRB with early write-back (EWB).
- Duplicate only dirty cache lines.
- When a replica is replaced out of the TRB, the corresponding cache line is forced to write back to L2.
- This gives 100% AWR for dirty-line tags, reducing the DOR-AVF to zero.
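A sketch of how the early write-back could hook into TRB replacement, again with the illustrative types from above; writeback_to_l2 is a hypothetical helper, and this is only my reading of the slide, not the paper's exact mechanism.

```c
/* Selective-TRB-EWB: before a dirty line's replica is evicted from the
 * TRB, write that line back to L2. The line becomes clean, so losing
 * its replica no longer leaves a detectable-but-unrecoverable window.  */
extern void writeback_to_l2(uint16_t set, uint8_t way);  /* illustrative */

void trb_evict_with_ewb(trb_entry_t *victim, cache_tag_t *owner)
{
    if (victim->valid && owner->copy) {
        writeback_to_l2(victim->set_ptr, victim->way_ptr); /* early write-back */
        owner->copy = false;   /* line is now clean; no replica required  */
    }
    victim->valid = false;     /* free the entry for the incoming replica */
}
```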

Experimental setup: SimpleScalar v3.0 is used to model an Alpha-like microprocessor, with SPEC CPU2000 as the benchmark suite.

AWR rate by TRB size:
- 8-entry TRB – 69.9%
- 16-entry TRB – 82.7%
- 32-entry TRB – 91.5%

Replacement policies:
- LRU – 91.5% AWR, but higher implementation complexity.
- FIFO – 90.0% AWR.
- LRU+ – 92.1% AWR.
- FIFO+ – 91.0% AWR.
The FIFO+ policy is chosen, due to the overhead of LRU+.

On average, about 33% of cache lines are dirty and 66% are clean. Protecting only dirty-line tags raises the AWR rate from about 91% to 97.4%.

TRB: 31.7% → 22.6%
Selective-TRB: 31.7% → 16.7%

Conclusions:
- Proposed the Tag Replication Buffer (TRB) to protect the cache tag array.
- Selective-TRB to further improve reliability.
- Selective-TRB-EWB to reduce the DOR-AVF to zero.

This paper is well organized, and the experiments are presented clearly. It also shows that protecting cache information is an important topic.