A Survey of Fault Tolerant Methodologies for FPGA’s Gökhan Kabukcu 2006703357.



Outline
- Introduction to FPGAs
- Device-Level Fault Tolerance Methods
- Configuration-Level Fault Tolerance Methods
- Comparison of Methodologies
- Conclusion

Introduction (FPGA)
- A field programmable gate array (FPGA) is a semiconductor device containing programmable logic components and programmable interconnects
- Consists of regular arrays of programmable logic blocks (PLBs)
- Programmable routing matrix
- Configuration of the FPGA includes:
  - The functionality of the FPGA
  - Which PLBs will be used
  - The functionality of the PLBs
  - Which wire segments will be used for connecting PLBs

Introduction (FPGA)
- PLBs are multi-input, multi-output circuits and allow:
  - Sequential designs
  - Combinational designs
- PLBs include:
  - Look-up tables (LUTs, or small ROMs)
  - Multiplexers
  - Flip-flops

Introduction (FPGA)
- Look-up tables (LUTs):
  - 4-input, 1-output units
  - Can be used as:
    - RAM
    - ROM
    - Shift register
    - Functional unit
  - Configured by a 16-bit "INIT" value

Introduction (FPGA)
- An example: y = (x1 + x2) * x3 + x4
  - Create the truth table
  - Assign "INIT" to the LUT
  - Since there are 4 inputs and 1 output, 1 LUT is enough to represent the equation
  - The LUT can be put into any PLB in the FPGA
- [Truth table with columns x1, x2, x3, x4, y]
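The truth-table-to-INIT step can be sketched in a few lines of Python. This is only an illustration under two assumptions that the slide does not state: `+` is read as logical OR and `*` as logical AND, and x1 drives the least significant address bit of the LUT (so bit i of INIT holds the output for address i). The names `lut_init` and `y_func` are hypothetical.

```python
def lut_init(func, n_inputs=4):
    """Pack the truth table of func into an INIT integer.

    Bit i of the result is the output for the input combination whose
    address is i, with x1 as the least significant address bit
    (an assumed ordering).
    """
    init = 0
    for addr in range(1 << n_inputs):
        # Unpack the address bits into inputs x1..xn (x1 = bit 0).
        bits = [(addr >> k) & 1 for k in range(n_inputs)]
        if func(*bits):
            init |= 1 << addr
    return init


def y_func(x1, x2, x3, x4):
    # '+' is read as OR and '*' as AND (assumption).
    return ((x1 | x2) & x3) | x4


INIT_Y = lut_init(y_func)
print(f"INIT = 0x{INIT_Y:04X}")  # prints "INIT = 0xFFE0"
```

With a different pin ordering the same function yields a different 16-bit value, so the ordering has to match the target device's documentation.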

Introduction (FPGA)
- Another example: y = (x1 + x2) * x3 + x4, z = y * x5
  - Create the truth tables
  - Assign "INIT"s to the LUTs
  - Since there are 5 inputs and 1 output, 2 LUTs are needed to represent the equations
  - The LUTs can be put into any PLBs in the FPGA
  - A1 and A0 are "don't care"s
- [Truth tables with columns x1, x2, x3, x4, y and y, x5, z]
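A small simulation can check that two cascaded 4-input LUTs reproduce z for all 32 input combinations. The sketch assumes `+` is OR and `*` is AND, that the first LUT has x1 on A0 through x4 on A3, and that the second LUT has y on A2 and x5 on A3 with A1 and A0 as the don't-care inputs. The pin assignment and the helper names are assumptions, not taken from the slides.

```python
from itertools import product

INIT_Y = 0xFFE0  # y = (x1 + x2) * x3 + x4, with x1 on A0 ... x4 on A3
INIT_Z = 0xF000  # z = y * x5, with y on A2 and x5 on A3; A1, A0 unused


def lut_eval(init, a3, a2, a1, a0):
    """Return the LUT output bit stored at address A3..A0."""
    addr = (a3 << 3) | (a2 << 2) | (a1 << 1) | a0
    return (init >> addr) & 1


def z_two_luts(x1, x2, x3, x4, x5):
    y = lut_eval(INIT_Y, x4, x3, x2, x1)
    # A1 and A0 are don't-cares: tying them to 0 or 1 gives the same z.
    return lut_eval(INIT_Z, x5, y, 0, 0)


def z_reference(x1, x2, x3, x4, x5):
    return (((x1 | x2) & x3) | x4) & x5


# Compare the LUT cascade with the Boolean expression on every input.
ALL_MATCH = all(
    z_two_luts(*v) == z_reference(*v) for v in product((0, 1), repeat=5)
)
```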

Introduction (FPGA)
- An example of a full design on an FPGA

Fault Tolerance
- Device-Level Fault Tolerance
  - Attempts to deal with faults at the level of the FPGA hardware
  - Select redundant HW to replace the faulty one
  - Solution costs extra HW resources
- Configuration-Level Fault Tolerance
  - Tolerates faults at the level of the FPGA configuration
  - When a circuit is placed, fault-free resources are selected
  - Status of the resources is considered each time a circuit is placed and routed
  - Solution costs extra reconfiguration time

Device-Level FT Methods (1)
- Extra Rows
  - One extra spare row is added
  - Selection logic is added to bypass the defective row
  - Vertical wire segments are extended by one row
  - Faults in one row can be tolerated
  - More than one spare row is needed to tolerate faults in multiple rows

Device-Level FT Methods (2)
- Reconfiguration Network
  - Four architectural changes:
    - Additional routing resources (bypass lines)
    - Reconfiguration memory to store the locations of faulty resources
    - On-chip circuitry for reconfiguration routing
    - Additional column of PLBs

Device-Level FT Methods (2)
- Reconfiguration Network
  - Test and identify faulty resources
  - Create a fault map
  - Load the map into the reconfiguration memory
  - On-board router avoids faulty resources
  - The network is constructed by shifting all PLBs in the fault-containing row towards the right
  - Method can tolerate 1 fault in each row if there is one extra spare column
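The row-shifting idea can be illustrated with a short sketch. It assumes each row holds `n_cols` logical PLBs plus one physical spare at the right end, and that the fault map is a dictionary from row index to the set of faulty physical columns. Both the data layout and the function names are hypothetical; the on-chip router itself is not modeled.

```python
def shift_row(n_cols, faulty_cols):
    """Physical column of each logical PLB in one row.

    The row has n_cols logical PLBs and n_cols + 1 physical ones
    (one spare at the right end). PLBs at or to the right of the
    fault move one position to the right.
    """
    if len(faulty_cols) > 1:
        raise ValueError("one spare column tolerates only one fault per row")
    fault = next(iter(faulty_cols), None)
    return [c if fault is None or c < fault else c + 1 for c in range(n_cols)]


def build_network(n_rows, n_cols, fault_map):
    """Apply the shift to every row; fault_map is {row: set of columns}."""
    return [shift_row(n_cols, fault_map.get(r, set())) for r in range(n_rows)]


# Row 1 has a fault in physical column 2, so its PLBs 2 and 3 move right.
layout = build_network(3, 4, {1: {2}})
```

Only the row containing the fault is disturbed, which is why one spare column covers one fault per row.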

Device-Level FT Methods (3)
- Self-Repairing Architecture
  - Sub-arrays of PLBs
  - Routers between sub-arrays
  - Extra columns of PLBs
  - PLBs constantly test themselves
  - If a fault is detected:
    - The column of the affected PLB is shifted one position to the right
    - The inter-array routers are adjusted
  - Area overhead of this method is significant
  - With 1 spare column and N sub-arrays stacked vertically, the method can tolerate N faults at a time

Device-Level FT Methods (4)
- Block-Structured Architecture
  - Goal: tolerate larger and denser patterns of defects efficiently
  - Blocks of PLBs
  - The FPGA is configured by a loading arm
  - The block at the end of the loading arm is configured

Device-Level FT Methods (4)
- Block-Structured Architecture
  - A block is selected by the loading arm and tested
  - If the test is passed, the block is configured; otherwise it is designated as faulty
  - The loading arm configures blocks one by one
  - If the arm cannot extend any further along a path, it is retracted by one block
  - Fault tolerance is provided by redundant rows and/or columns
  - Area overhead is significant
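The extend, test and retract behaviour of the loading arm resembles a depth-first walk with backtracking. The sketch below is one possible reading of the slide, not the architecture's actual control logic: it assumes a rectangular grid of blocks, a precomputed test result per block, and a fixed order in which neighbouring blocks are tried. All names are hypothetical.

```python
def configure_blocks(grid, start):
    """Walk a loading arm over a grid of blocks.

    grid[r][c] is True when block (r, c) passes its test.
    Returns (configured, faulty) as sets of (r, c) positions.
    """
    rows, cols = len(grid), len(grid[0])
    configured, faulty = set(), set()
    if not grid[start[0]][start[1]]:
        return configured, {start}
    arm = [start]  # blocks currently spanned by the arm
    configured.add(start)
    while arm:
        r, c = arm[-1]
        for nr, nc in ((r, c + 1), (r + 1, c), (r, c - 1), (r - 1, c)):
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if (nr, nc) in configured or (nr, nc) in faulty:
                continue
            if not grid[nr][nc]:
                faulty.add((nr, nc))  # test failed: designate as faulty
                continue
            configured.add((nr, nc))  # test passed: configure the block
            arm.append((nr, nc))      # extend the arm by one block
            break
        else:
            arm.pop()                 # no way forward: retract one block
    return configured, faulty


blocks = [
    [True, True, False],
    [True, False, True],
    [True, True, True],
]
done, bad = configure_blocks(blocks, (0, 0))
```

In this example the arm reaches all seven fault-free blocks and marks the two failing ones as faulty.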

Device-Level FT Methods (5)
- Fault Tolerant Segments/Grids
  - Fault Tolerant Segments:
    - Adds one track of spare segments to each wiring channel
    - If a faulty segment is found, the segment is shifted to the spare
    - A single fault can be tolerated
  - Fault Tolerant Grids:
    - An entire spare routing grid is added
    - No additional elements in the routing channel, so no extra time delay

Device-Level FT Methods (6)
- SRAM Shifting
  - Based on shifting the entire circuit on the FPGA
  - PLBs should be placed in one of 2 ways:
    - King Allocation: 8 PLBs use one spare, circuit can move in 8 directions
    - Horse Allocation: 4 PLBs use one spare, circuit can move in 4 directions
  - Testing determines the faulty cells and feeds this information to the shifter circuitry on the FPGA

Device-Level FT Methods (6)
- SRAM Shifting
  - Additional spare PLBs surrounding the FPGA
  - Horse Allocation is used in the figure
  - The circuit is shifted up and right
  - Advantages of the method:
    - No external reconfiguration algorithm is required
    - The timing of the circuit is almost fixed
  - Any single fault can be tolerated
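Choosing a shift direction can be sketched as a search over the allowed moves. The sketch assumes cells are (row, column) pairs, that the eight King Allocation moves are the one-cell steps to the eight neighbouring positions, and that array bounds can be ignored because spare PLBs surround the array. The Horse Allocation move set is not spelled out on the slides, so it is left as a parameter. All names are hypothetical.

```python
# Assumed King Allocation moves: one cell in any of the 8 directions.
KING_MOVES = [
    (dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)
]


def find_shift(occupied, faulty, moves=KING_MOVES):
    """Return a (dr, dc) that moves every occupied cell off the faulty cells.

    Returns (0, 0) if no occupied cell is faulty, or None if no allowed
    move clears the fault.
    """
    if not occupied & faulty:
        return (0, 0)
    for dr, dc in moves:
        shifted = {(r + dr, c + dc) for r, c in occupied}
        if not shifted & faulty:
            return (dr, dc)
    return None


# Three used PLBs, with (2, 2) left as the spare; (1, 1) is faulty.
circuit = {(1, 1), (1, 2), (2, 1)}
move = find_shift(circuit, {(1, 1)})
```

The shift succeeds only because the placement leaves spare positions; a fully packed block with a fault in its interior has no valid one-cell move.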

Configuration-Level FT Methods (1)
- Pebble Shifting
  - Find an initial circuit configuration, then move pieces away from faulty units
  - Occupied PLBs are called pebbles
  - Pair pebbles on faulty cells with unique, unused cells such that the sum of weighted Manhattan distances is minimized
  - Start shifting pebbles
  - If a pebble finds an empty cell other than the intended cell, this empty cell becomes the destination
  - No limit to the number of faults that can be tolerated

Configuration-Level FT Methods (1)
- Pebble Shifting
  - Example: 1 and 6 are on faulty cells
  - Using a minimum-cost, maximum matching algorithm, the pairings are: 1 -> v_11 and 6 -> v_32
  - Element 1 is shifted to its new position
  - To move 6, we shift 3, 8 and 7
  - Now all elements are on non-faulty cells and allocation is done
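The pairing step can be illustrated with a brute-force search over all assignments, which is enough for small examples. This is not the minimum-cost, maximum matching algorithm the slides refer to, only a stand-in that returns the same kind of result. Cells are assumed to be (row, column) pairs, and the per-pebble `weight` dictionary is a hypothetical way to express the weighting.

```python
from itertools import permutations


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def min_cost_pairing(faulty_pebbles, free_cells, weight=None):
    """Pair each pebble on a faulty cell with a distinct free cell.

    Minimizes the sum of weighted Manhattan distances by trying every
    assignment (exponential, so suitable for small inputs only).
    Returns (pairing, cost), or (None, None) if there are too few cells.
    """
    weight = weight or {}
    best, best_cost = None, None
    for cells in permutations(free_cells, len(faulty_pebbles)):
        cost = sum(
            weight.get(p, 1) * manhattan(p, c)
            for p, c in zip(faulty_pebbles, cells)
        )
        if best_cost is None or cost < best_cost:
            best, best_cost = dict(zip(faulty_pebbles, cells)), cost
    return best, best_cost


pairing, cost = min_cost_pairing([(0, 0), (2, 2)], [(0, 1), (2, 3), (5, 5)])
```

For realistic array sizes a polynomial-time assignment algorithm would replace the permutation loop.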

Configuration-Level FT Methods (2)
- Mini-Max Grid Matching
  - Uses a grid matching algorithm to match faulty logic to empty, non-faulty locations
  - Like Pebble Shifting, uses a minimum-cost, maximum matching algorithm
  - Minimizes the maximum distance between the pairings, since the circuit's performance is set by the critical (longest) path
  - Can tolerate faults until there are no unused cells
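The difference from a minimum-sum pairing shows up in a small example. The sketch below brute-forces the pairing whose longest move is shortest; it stands in for the grid matching algorithm, which the slides do not detail. Cells are assumed to be (row, column) pairs and the function names are hypothetical.

```python
from itertools import permutations


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def minimax_pairing(faulty_cells, free_cells):
    """Pair faulty cells with distinct free cells, minimizing the longest move.

    Tries every assignment, so it is suitable for small inputs only.
    Returns (pairing, longest_move), or (None, None) if there are too
    few free cells.
    """
    best, best_max = None, None
    for cells in permutations(free_cells, len(faulty_cells)):
        worst = max(
            (manhattan(f, c) for f, c in zip(faulty_cells, cells)), default=0
        )
        if best_max is None or worst < best_max:
            best, best_max = dict(zip(faulty_cells, cells)), worst
    return best, best_max


# A minimum-sum pairing would pick (0,0)->(0,1) and (0,5)->(2,2):
# total 6, but the longest move is 5. The mini-max pairing accepts a
# total of 8 to bring the longest move down to 4.
pairing, longest = minimax_pairing([(0, 0), (0, 5)], [(0, 1), (2, 2)])
```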

Configuration-Level FT Methods (3)
- Node-Covering and Cover Segments
  - When a fault is discovered, nodes are shifted along the chain (row) towards the right
  - The last PLB of a chain is reserved as a spare
  - One fault in a row can be tolerated
  - Needs no reconfiguration if local routing configurations are present

Configuration-Level FT Methods (4)
- Tiling
  - Partition the FPGA into tiles
  - Precompiled configurations of tiles are stored in memory
  - Each tile contains system function, some spare logic and interconnect resources
  - When a logic fault occurs in a tile, the configuration of the tile is replaced by a configuration that does not use the faulty resources
  - Many logic faults can be tolerated
  - Local interconnect faults can be tolerated, but global ones cannot
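Selecting a replacement configuration for a tile amounts to a lookup over the stored alternatives. The sketch assumes each precompiled configuration is stored with the set of resources it uses; the resource labels and configuration names are made up for illustration.

```python
def pick_tile_config(configs, faulty_resources):
    """Return the name of the first stored configuration that avoids
    every faulty resource, or None if no stored configuration fits.

    configs is a list of (name, set_of_used_resources) pairs.
    """
    for name, used in configs:
        if not used & faulty_resources:
            return name
    return None


# Hypothetical tile with four PLBs; each configuration leaves one spare.
TILE_CONFIGS = [
    ("cfg_a", {"plb0", "plb1", "plb2"}),
    ("cfg_b", {"plb0", "plb1", "plb3"}),
    ("cfg_c", {"plb0", "plb2", "plb3"}),
    ("cfg_d", {"plb1", "plb2", "plb3"}),
]

chosen = pick_tile_config(TILE_CONFIGS, {"plb2"})
```

No placement or routing runs at repair time, since every alternative was compiled in advance.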

Configuration-Level FT Methods (5)
- Cluster-Based
  - Intracluster tolerance in a PLB
  - Basic Logic Elements (BLEs or LUTs)
  - For simple LUT faults, the preferred solution is to use another LUT in the PLB
  - Instead of changing the PLB, try to find a solution in the same PLB
  - In the example, T is faulty and the 4th PLB is used instead of the 2nd PLB

Configuration-Level FT Methods (6)
- Column-Based
  - Treats the design as a set of functional units, where each unit is a column
  - Like Tiling, uses precompiled configurations, but at lower cost
  - At least one column should be spare
  - If there is a faulty cell in a column, the column is shifted toward the spare column
  - Method can tolerate m faulty columns, where m is the number of columns not occupied by system functions
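The tolerance limit of m faulty columns follows from simple counting, as the sketch below illustrates. It assumes functional units keep their left-to-right order and each occupies exactly one column. The function name is hypothetical, and the precompiled per-column configurations are not modeled.

```python
def assign_columns(n_units, n_columns, faulty_columns):
    """Place n_units functional units, in order, on fault-free columns.

    Returns the list of physical columns used, or None when more than
    m = n_columns - n_units columns are faulty.
    """
    usable = [c for c in range(n_columns) if c not in faulty_columns]
    if len(usable) < n_units:
        return None
    return usable[:n_units]


# 3 units on 5 columns: m = 2 faulty columns can be tolerated.
one_fault = assign_columns(3, 5, {1})
two_faults = assign_columns(3, 5, {1, 4})
three_faults = assign_columns(3, 5, {0, 1, 2})
```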

Comparison of Methodologies (1)
- Device-Level (DL) methods need extra HW and have a higher area cost
- DL methods use one initial reconfiguration and have no extra reconfiguration cost
- Configuration-Level (CL) methods need more than one reconfiguration, which sometimes results in a high time cost
- CL methods do not need extra HW and have no additional area cost

Comparison of Methodologies (2)
- DL methods are less flexible, and therefore less able to improve reliability
- CL methods usually tolerate more faults than DL methods
- The performance impact of fault tolerance is smaller for DL methods than for CL methods

Conclusion
- No single fault tolerance methodology is better than the others in all cases
- DL techniques have less impact on performance, but are not flexible
- CL methods tolerate more faults, but have more impact on performance