Yun-Chung Yang TRB: Tag Replication Buffer for Enhancing the Reliability of the Cache Tag Array Shuai Wang; Jie Hu; Ziavras S.G; Dept. of Electr. & Comput.

1 Yun-Chung Yang TRB: Tag Replication Buffer for Enhancing the Reliability of the Cache Tag Array Shuai Wang; Jie Hu; Ziavras S.G; Dept. of Electr. & Comput. Eng., New Jersey Inst. of Technol., Newark, NJ, USA VLSI (ISVLSI), 2010 IEEE Computer Society Annual Symposium on

2 Scratchpad Memory: A Design Alternative for Cache On-chip Memory in Embedded Systems. The architecture and application of SPM. The usage of tag. This paper.

3 Protecting the on-chip cache memories against soft errors has become an increasing challenge in designing new generation reliable microprocessors. Previous efforts have mainly focused on improving the reliability of the cache data arrays. Due to its crucial importance to the correctness of cache accesses, the tag array demands high reliability against soft errors while the data array is fully protected. Exploiting the address locality of memory accesses, we propose to duplicate most recently accessed tag entries in a small Tag Replication Buffer (TRB) thus to protect the information integrity of the tag array in the data cache with low performance, energy and area overheads. A Selective-TRB scheme is further proposed to protect only tag entries of dirty cache lines. The experimental results show that the Selective-TRB scheme achieves a higher access-with-replica (AWR) rate of 97.4% for the dirty cache line tags. To provide a comprehensive evaluation of the tag-array reliability, we also conduct an architectural vulnerability factor (AVF) analysis for the tag array and propose a refined metric, detected-without-replica-AVF (DOR-AVF), which combines the AVF and AWR analysis. Based on our DOR-AVF analysis, a TRB scheme with early write-back (EWB) is proposed, which achieves a zero DOR-AVF at a negligible performance overhead.

4 On the characterization and optimization of on-chip cache reliability against soft errors; Fault-tolerant CAM architectures: A design framework; This paper; Fault-tolerant content addressable memory; CAT (caching address tags): a technique for reducing area cost of on-chip caches; Computing architectural vulnerability factors for address-based structures; Enhancing data cache reliability by the addition of a small fully-associative replication cache

5 Ionizing radiation => single-event upsets (SEUs), also known as soft errors. A soft error can cause an incorrect data value to be read out of the cache, which may crash the computation or communication. Previous approaches to this problem  Parity code: protects the on-chip L1 caches (Intel Itanium 2, IBM Power6 processors)  Hamming code: widely adopted in L2/L3 caches as ECC (error-correcting code), e.g. SEC-DED (single error correction and double error detection)
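
The difference between detecting and correcting an error can be sketched in a few lines. This is only an illustration of single-bit even parity, not code from the paper; the 8-bit tag value and the function names are made up for the example.

```python
def parity_bit(word: int) -> int:
    """Even-parity bit: 1 if the word contains an odd number of 1 bits."""
    return bin(word).count("1") % 2

def parity_ok(word: int, stored_parity: int) -> bool:
    """True if the word still matches the parity bit stored with it."""
    return parity_bit(word) == stored_parity

tag = 0b10110010                      # hypothetical 8-bit tag
stored = parity_bit(tag)              # parity bit kept next to the tag

one_flip = tag ^ (1 << 3)             # single-bit upset
two_flips = one_flip ^ (1 << 5)       # a second upset in the same word

print(parity_ok(tag, stored))         # intact tag passes the check
print(parity_ok(one_flip, stored))    # single-bit error is detected
print(parity_ok(two_flips, stored))   # double-bit error slips through
```

Parity reports that something is wrong but not which bit flipped, so the tag cannot be repaired from the parity bit alone. That is the gap a replica of the tag is meant to fill.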

6 Tag Replication Buffer (TRB): a small buffer holding replicas of frequently accessed tag entries of the L1 cache. Selective-TRB: duplicates tag entries only for dirty cache lines. Architectural Vulnerability Factor (AVF) analysis  Detected-without-replica AVF (DOR-AVF)  Early write-back (Selective-TRB-EWB)

7 Two pointers in each TRB entry  The set pointer => indicates the set of the original tag entry  The way pointer => indicates the way of the original tag entry in a set-associative cache; it is not needed in a direct-mapped cache. The copy bit indicates whether the tag has a replica: 1 if yes, 0 otherwise. The valid bit indicates whether the tag replica is valid or invalid.
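
A TRB entry as described on this slide could be modelled roughly as follows. This is a sketch under assumptions: the class and field names are hypothetical, bit widths are not given in the slides, and the copy bit is left out because the slide does not say where it is stored.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class TRBEntry:
    tag: int             # replicated tag value
    set_ptr: int         # set index of the original tag entry
    way_ptr: int = 0     # way within the set; unused for a direct-mapped cache
    valid: bool = False  # True if this replica may be used

def find_replica(trb: List[TRBEntry], set_idx: int, way: int) -> Optional[int]:
    """Return the replicated tag for (set_idx, way), or None if no valid replica exists."""
    for entry in trb:
        if entry.valid and entry.set_ptr == set_idx and entry.way_ptr == way:
            return entry.tag
    return None

# Hypothetical contents: one valid replica and one invalidated one.
trb = [
    TRBEntry(tag=0x1A2B, set_ptr=5, way_ptr=1, valid=True),
    TRBEntry(tag=0x0F0F, set_ptr=9, way_ptr=0, valid=False),
]

print(find_replica(trb, 5, 1))  # replica found
print(find_replica(trb, 9, 0))  # entry exists but is invalid
```

A linear search keeps the sketch short; a hardware buffer of this size would compare all entries in parallel.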

8 Handling soft errors  A single-bit error can be detected by parity coding, but parity alone cannot correct it. With the TRB, the error can be recovered from the replica when the copy bit is 1. Access-with-replica (AWR): the ratio of tag accesses that have a replica in the TRB. Duplication  When a cache line is brought into the L1 cache from L2.  On a TRB miss. Replacement  LRU+ or FIFO+: first look for an invalidated entry to reuse, and fall back to plain LRU or FIFO only when none exists.
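
The FIFO+ idea and the AWR ratio can be sketched with a small software model. This is an illustrative sketch, not the paper's simulator: the class name, the choice to key replicas by (set, way), and the choice to count every modelled access in the AWR denominator are assumptions.

```python
class FifoPlusTRB:
    """Toy TRB with FIFO+ replacement: reuse an invalid slot first, else evict the oldest."""

    def __init__(self, size: int):
        self.slots = [None] * size   # None marks an invalid (free) entry
        self.age = [0] * size        # insertion sequence number per slot
        self.clock = 0
        self.accesses = 0
        self.with_replica = 0

    def access(self, set_idx: int, way: int) -> bool:
        """Model one tag access; duplicate the tag on a TRB miss."""
        key = (set_idx, way)
        self.accesses += 1
        if key in self.slots:
            self.with_replica += 1
            return True
        self._insert(key)
        return False

    def invalidate(self, set_idx: int, way: int) -> None:
        """Drop the replica, e.g. when the cache line is evicted."""
        key = (set_idx, way)
        if key in self.slots:
            self.slots[self.slots.index(key)] = None

    def _insert(self, key) -> None:
        if None in self.slots:                   # FIFO+: prefer an invalid entry
            idx = self.slots.index(None)
        else:                                    # plain FIFO: oldest insertion
            idx = self.age.index(min(self.age))
        self.slots[idx] = key
        self.age[idx] = self.clock
        self.clock += 1

    def awr(self) -> float:
        return self.with_replica / self.accesses if self.accesses else 0.0

trb = FifoPlusTRB(2)
for s in (0, 0, 1, 2):       # miss, hit, miss, miss (evicts the replica for set 0)
    trb.access(s, 0)
trb.invalidate(2, 0)         # frees a slot
trb.access(3, 0)             # FIFO+ reuses the free slot, keeping set 1's replica
hit = trb.access(1, 0)       # still has a replica
print(hit, trb.awr())
```

With plain FIFO the access to set 3 would have evicted the replica for set 1, which is the oldest valid entry, and the final access would have missed.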

9 Selective-TRB: improve the efficiency of protection by protecting only the tags of dirty cache lines.  Clean cache lines have copies in the L2 cache, which is protected by ECC, so they can be recovered from there. Better AWR rate than the plain TRB, with fewer duplications.

10 DOR-AVF (detected-without-replica AVF). Solution: Selective-TRB with early write-back.  Duplicate only the tags of dirty cache lines.  When a replica is replaced in the TRB, its corresponding cache line is forced to write back to L2.  This gives 100% AWR for dirty lines, which reduces the DOR-AVF to zero.
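
A rough model of Selective-TRB with early write-back is given below. It is a sketch under assumptions, not the authors' design: it assumes the forced write-back is triggered when a replica is evicted from a full TRB, uses FIFO order for eviction, and all names are hypothetical.

```python
from collections import OrderedDict

class SelectiveTRBWithEWB:
    """Toy model: only dirty lines get replicas; evicting a replica cleans its line."""

    def __init__(self, size: int):
        self.size = size
        self.replicas = OrderedDict()   # (set, way) -> tag, kept in insertion order
        self.dirty = set()              # (set, way) of dirty cache lines
        self.early_writebacks = 0

    def write(self, set_idx: int, way: int, tag: int) -> None:
        """A store makes the line dirty, so its tag must have a replica."""
        key = (set_idx, way)
        if key not in self.replicas:
            if len(self.replicas) == self.size:
                victim, _ = self.replicas.popitem(last=False)  # oldest replica
                self.dirty.discard(victim)   # assumed: forced write-back to L2 cleans the line
                self.early_writebacks += 1
            self.replicas[key] = tag
        self.dirty.add(key)

    def write_back(self, set_idx: int, way: int) -> None:
        """Normal write-back: the line is clean again and needs no replica."""
        key = (set_idx, way)
        self.dirty.discard(key)
        self.replicas.pop(key, None)

    def all_dirty_protected(self) -> bool:
        return all(key in self.replicas for key in self.dirty)

trb = SelectiveTRBWithEWB(2)
trb.write(0, 0, 0xA)
trb.write(1, 0, 0xB)
trb.write(2, 0, 0xC)   # TRB full: the line at set 0 is written back early
print(trb.early_writebacks, sorted(trb.dirty), trb.all_dirty_protected())
```

In this model every dirty line always has a replica, which corresponds to the 100% AWR for dirty lines stated on the slide; the cost is the extra write-back traffic counted in `early_writebacks`.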

11 SimpleScalar V3.0 is used to model a microprocessor similar to the Alpha 21364. SPEC CPU2000 is used as the benchmark suite.

12 AWR rate by TRB size  8-entry TRB: 69.9%  16-entry TRB: 82.7%  32-entry TRB: 91.5%

13 LRU  AWR 91.5%.  Higher implementation complexity. FIFO  AWR 90.0%. LRU+  AWR 92.1%. FIFO+  AWR 91.0%. The FIFO+ policy is chosen because of the overhead of LRU+.

14 33% dirty, 66% clean cache lines. With Selective-TRB, AWR improves from 91% to 97.4%.

15 TRB: 31.7% -> 22.6% Selective TRB: 31.7% -> 16.7%

17 Propose Tag Replication Buffer. Selective TRB to improve reliability. Selective TRB-EWB

18 This paper is very well organized, and the experiments are clearly presented. It also shows that protecting cache information is an important topic.

