Published by Brendan Poindexter

1
Using Implications for Online Error Detection
Nuno Alves, Jennifer Dworak, and R. Iris Bahar
Division of Engineering, Brown University, Providence, RI 02912
Kundan Nepal
Electrical Engineering Dept., Bucknell University, Lewisburg, PA 17837
International Test Conference, October 28-30, 2008

2
Motivation
- Circuits are becoming more susceptible to transient errors.
  - Soft errors, test escapes, noise, etc.
- Some applications need a reduction in error rates.
[Figure: cost ($$$$) versus error detection]
- Can we efficiently trade off error detection and cost?
  - Using logic implications

3
Outline
- Common error detection techniques
- Our approach: logic implications
- Finding an implication set
- Error coverage
- Balancing error coverage and overhead
- Conclusions

5
(Some) Previous Techniques in Online Error Detection
- Redundancy in time, e.g. re-executing in a redundant thread
- Logic duplication or TMR
- Codes, e.g. parity, Berger, Bose-Lin
- Pre-computed test vectors and their expected responses (stored in hardware)
- High-level functional assertions

7
Our Approach: Logic Implications
- Error detection compares expected behavior to actual behavior.
- Implications within a logic block describe expected relationships between values at circuit sites.
- Violation of an expected implication indicates the presence of an error.

8
Implications Naturally Occur in Circuits
[Figure: example circuit with nodes n1 to n8]
n5 = 1 → n8 = 0

10
Implication Violations Can Be Used to Detect Errors
[Figure: example circuit with nodes n1 to n8 and a stuck-at-1 (sa1) fault; labels: ERROR, n5=1, n8=0]
Appropriate checker logic can detect multiple errors with a single implication.

11
Identified Implications Determine Checker Hardware

13
Finding Implications
Gate-level implications can be identified automatically, without requiring functional knowledge of the circuit, in three steps:
1. Quickly identify potential implications
   - Choose potential sites of adequate distance
   - Fast good-circuit simulation
   - Look for missing logic value pairs
2. Validate implications
   - SAT solver
3. Reduce implication set
   - Structural and error detection analysis
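The first two steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the circuit in `simulate` is hypothetical (the circuit from the slides is not preserved here), the site-distance filter is omitted, and an exhaustive check over all input vectors stands in for the SAT solver, which is feasible only because the example has four inputs. An implication is encoded as a tuple `(p, vp, q, vq)` meaning "p = vp implies q = vq".

```python
import itertools
import random

SITES = ["a", "b", "c", "d", "n5", "n6", "n7", "n8"]

def simulate(a, b, c, d):
    """Fault-free simulation of a small hypothetical circuit (not the one on the slides)."""
    n5 = a & b        # AND gate
    n6 = c | d        # OR gate
    n7 = 1 - n5       # inverter
    n8 = n7 & n6      # AND gate
    return dict(zip(SITES, (a, b, c, d, n5, n6, n7, n8)))

def candidate_implications(num_vectors=200, seed=0):
    """Step 1: random good-circuit simulation, then list value pairs never observed."""
    rng = random.Random(seed)
    seen = set()
    for _ in range(num_vectors):
        values = simulate(*(rng.randint(0, 1) for _ in range(4)))
        for p, q in itertools.combinations(SITES, 2):
            seen.add((p, values[p], q, values[q]))
    candidates = []
    for p, q in itertools.combinations(SITES, 2):
        for vp, vq in itertools.product((0, 1), repeat=2):
            if (p, vp, q, vq) not in seen:
                # The pair (p=vp, q=vq) never occurred, so p=vp may imply q=NOT vq.
                candidates.append((p, vp, q, 1 - vq))
    return candidates

def validate(candidates):
    """Step 2: keep candidates that hold for every input vector (stand-in for SAT)."""
    all_values = [simulate(*bits) for bits in itertools.product((0, 1), repeat=4)]
    return [
        (p, vp, q, vq)
        for (p, vp, q, vq) in candidates
        if all(v[q] == vq for v in all_values if v[p] == vp)
    ]

implications = validate(candidate_implications())
```

Each site pair is visited once, so only one direction of an implication is reported; the contrapositive (here, n8 = 1 → n5 = 0) is logically equivalent and is not listed separately.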

14
So…how many “natural” implications are there?

19
Identifying “Subsumed” Implications
All the errors covered by a short-distance implication may sometimes also be covered by a long-distance implication.
[Figure: example circuit with nodes n1 to n13]
n10 = 0 → n13 = 0
n10 = 0 → n8 = 0
n4 = 1 → n8 = 0
n4 = 1 → n11 = 0
n4 = 1 → n13 = 0
n4 = 11 → n8 = 0

20
Reducing the Implication List
Subsumed implications are detected through structural analysis:
- Implications fall on the same path with appropriate “implied values”
- No fanout branches along the path
- The implication with the longest “distance” between implication sites is retained.

21
So, how much does this reduce the size of our implication lists?

23
Compressing the Implication List While Maintaining Quality
- Once subsumed implications are removed, the implication list may still be too long.
- Evaluate the remaining implications for “implication quality”.
- Implication quality is calculated for every implication/fault pair. [Formula not preserved in this transcript]
- Each fault’s “highest quality” implication is added to the list.
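The selection rule in the last bullet can be sketched as follows. The quality formula itself is not available here, so the scores in `example_quality` are made-up placeholders, and the names `compress`, `I1`, `f1`, and so on are hypothetical. Only the "keep each fault's best implication" step is illustrated.

```python
def compress(quality):
    """Return the set of implications that are the best-scoring one for some fault.

    `quality` maps (implication, fault) pairs to a numeric score. How the score
    is computed is not specified here; higher is assumed to mean better.
    """
    best = {}  # fault -> (implication, score)
    for (impl, fault), score in quality.items():
        if fault not in best or score > best[fault][1]:
            best[fault] = (impl, score)
    return {impl for impl, _ in best.values()}

# Placeholder scores, for illustration only.
example_quality = {
    ("I1", "f1"): 0.9,
    ("I2", "f1"): 0.4,
    ("I2", "f2"): 0.7,
    ("I3", "f2"): 0.2,
}
kept = compress(example_quality)
```

In this example `I3` is dropped because it is never the best implication for any fault. On a tie, the implication encountered first is kept.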

27
Covering Faults with Implications
For each random input vector, and for each fault, the implications-based circuit operation falls into one of the following four categories:

                              Case 1          Case 2          Case 3     Case 4
Error propagates to output    yes             no              yes        no
An implication is violated    yes             yes             no         no
Outcome                       True detection  False positive  True miss  Benign miss
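The four categories follow from two yes/no observations per (input vector, fault) trial. A minimal sketch, assuming the category names carry their usual meaning (for example, a "false positive" is a violated implication with no error at the outputs); the function names are illustrative.

```python
from collections import Counter

def classify(error_at_output, implication_violated):
    """Map the two observations for one (vector, fault) trial to a category."""
    if implication_violated:
        return "true detection" if error_at_output else "false positive"
    return "true miss" if error_at_output else "benign miss"

def tally(trials):
    """Count categories over an iterable of (error_at_output, implication_violated)."""
    return Counter(classify(e, v) for e, v in trials)

counts = tally([(True, True), (True, False), (False, False), (False, False)])
```

Only the "true miss" count represents errors that reach an output without being flagged.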

31
What is the hardware overhead?
- Include all implications remaining after compression
- A simple implementation is used for each implication (an AND gate and up to 2 inverters)
- Outputs of the AND gates are OR'ed together
- A 180nm TSMC library and the Mentor Graphics toolset were used to generate the layout and calculate area overhead.
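A behavioral model of this checker structure might look like the sketch below. It assumes an implication is encoded as `(p, vp, q, vq)` for "p = vp implies q = vq", and that the AND gate fires on the violating pattern (p = vp and q = NOT vq). Under that reading, an inverter is needed on p when vp = 0 and on q when vq = 1, which gives at most two inverters per implication. The encoding and function names are assumptions, not taken from the slides.

```python
def checker_gates(implications):
    """Return (AND gate count, inverter count) for one AND gate per implication."""
    inverters = sum((vp == 0) + (vq == 1) for _, vp, _, vq in implications)
    return len(implications), inverters

def error_flag(implications, values):
    """OR of the per-implication AND terms: 1 if any implication is violated."""
    return int(any(values[p] == vp and values[q] != vq
                   for p, vp, q, vq in implications))

impls = [("n5", 1, "n8", 0), ("n10", 0, "n13", 0)]
flag_bad = error_flag(impls, {"n5": 1, "n8": 1, "n10": 1, "n13": 0})
flag_ok = error_flag(impls, {"n5": 1, "n8": 0, "n10": 0, "n13": 0})
gate_counts = checker_gates(impls)
```

For the two example implications, n5 = 1 → n8 = 0 needs no inverter (the violating pattern is n5 AND n8), while n10 = 0 → n13 = 0 needs one (NOT n10 AND n13).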

33
Trading off Area Overhead and Coverage
- Coverage/area tradeoffs are intuitively easy with implications
- Threshold set for area overhead
- Gate count used to estimate the number of implications that can be included
- Implications chosen by:
  - Coverage of all faults
  - Coverage of “most important” faults (more likely to be missed by test, more likely to cause important errors, etc.)
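One possible way to turn an area threshold into an implication list is sketched below. The slides do not say how implications are ranked within the budget, so the greedy "most newly covered faults first" rule is an assumption, as are the default of 2 gates per implication and all names in the example.

```python
def select_implications(covers, circuit_gates, overhead_percent, gates_per_implication=2):
    """Greedily pick implications until the gate budget is used up.

    `covers` maps each implication to the set of faults it detects.
    The budget is an estimate: allowed extra gates divided by an assumed
    per-implication gate cost.
    """
    budget = (circuit_gates * overhead_percent // 100) // gates_per_implication
    remaining = dict(covers)
    chosen, covered = [], set()
    while remaining and len(chosen) < budget:
        # Pick the implication that covers the most not-yet-covered faults.
        best = max(remaining, key=lambda i: len(remaining[i] - covered))
        if not remaining[best] - covered:
            break  # nothing left adds coverage
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered

example_covers = {"I1": {"f1", "f2", "f3"}, "I2": {"f3", "f4"}, "I3": {"f1"}}
chosen, covered = select_implications(example_covers, circuit_gates=40, overhead_percent=10)
```

To favor the “most important” faults instead, the same loop could weight each fault before counting; the weights would have to come from test-escape or criticality data that is not given here.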

35
Conclusions
- Implications serve as gate-level “assertions” that can be automatically discovered without detailed functional knowledge of the circuit design.
- Many implications naturally exist within circuits.
- Good coverage of many faults (often almost 90%).
- Ideally suited to cost/coverage tradeoffs, especially for applications that require a significant reduction in error rates instead of “zero” errors.
- With only a 10% area overhead, the probability of an error being both observable and undetected is reduced to ~12% on average (and the actual error rate will be much less).
