
1 SEU Mitigation of a Soft Embedded Processor in the Virtex-II FPGAs
Sana Rezgui (1), Jeffrey George (2), Gary Swift (3), Kevin Somervill (4), Carl Carmichael (1) and Gregory Allen (3)
(1) Xilinx, Inc., San Jose, CA
(2) The Aerospace Corporation, El Segundo, CA
(3) Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA
(4) NASA Langley, Hampton, VA
For the North American Xilinx Test Consortium

2 Objective (MAPLD 2005/E238, Rezgui)
Use of embedded system applications built on S-FPGAs in a radiation environment
=> Mitigation against SEUs and design implementation
Mitigated design performance
- Simplicity, flexibility and automation
- Area and timing performance
Upset sensitivity in a radiation environment
- Characterization of the FPGA sensitivity in beam
- Evaluation of the proposed mitigation solution for the embedded design
Measure the in-beam performance of an upset mitigation technique applied to a complex design (a processor) implemented on an FPGA and running a computationally intensive benchmark program

3 Studied Case
Mitigation against SEUs of the Xilinx soft IP processor MicroBlaze by means of the Triple Modular Redundancy (TMR) technique
[Figure labels: Configurable Logic Block (CLB), Block RAM, 18-bit multipliers, programmable I/Os, Digital Clock Manager, MicroBlaze]

4 Internal Architecture
MicroBlaze is a 32-bit RISC processor with a Harvard bus architecture

5 MicroBlaze Mitigation
1. Use the TMR technique to mitigate the design against SEUs
- MicroBlaze designs consist of I/Os, Look-Up Tables (LUTs), Flip-Flops (FFs) and user memory elements.
- For the TMR Tool (developed by Xilinx), MicroBlaze is no different from any other design.
2. Run active readback and continuous scrubbing of all the static resources in use, for error detection and correction
- This is transparent to, and independent of, the running design.
- User memory elements cannot be scrubbed from the configuration port.
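
The voting step that TMR relies on can be illustrated with a minimal Python sketch of a bitwise 2-of-3 majority voter. This is not the TMR Tool's implementation, which operates on the design netlist; the function names and the 8-bit example word are assumptions made only for this example.

```python
from typing import List


def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three redundant words."""
    return (a & b) | (a & c) | (b & c)


def mismatching_domains(a: int, b: int, c: int) -> List[int]:
    """Indices (0, 1, 2) of the domains that disagree with the voted word."""
    voted = majority_vote(a, b, c)
    return [i for i, word in enumerate((a, b, c)) if word != voted]


# Hypothetical example: one domain suffers a single bit flip.
golden = 0b1011_0010
upset = golden ^ (1 << 3)  # flip bit 3 in the second domain only

voted = majority_vote(golden, upset, golden)
faulty = mismatching_domains(golden, upset, golden)
```

With a single upset domain the voted word still equals the golden word, and the disagreeing domain can be identified so that it can be repaired.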

6 Internal Architecture
User memory elements: SRL16s, distributed memory (LUT-RAM), BRAMs
Active readback causes problems with user memory elements (dynamic content)
Static partial reconfiguration of a BRAM is not possible if it stores program data in addition to the code
[Figure labels: LUT-RAMs, SRL16s, BRAM]

7 User Memory Mitigation
Error Detection and Correction (EDAC)
- Additional decoding logic would be required
- Depends on the speed of detection and correction of upsets
Replacement of the user memory elements by FFs and LUTs
- SRL16s are automatically replaced by FFs and LUTs by the TMR Tool
- Distributed RAMs (LUT-RAMs) are not automatically replaced: a custom macro is required to replace them with FFs and LUTs
Triple Modular Redundancy and self-correction of the BRAMs
- Done automatically by the TMR Tool, which replaces each BRAM with a custom macro that scrubs the BRAM itself
EDAC and TMR can be defeated by error accumulation

8 BRAM Mitigation Methodology
1. Apply TMR to the BRAMs in use
2. Insert an internal scrub controller that scrubs the 3 BRAMs using their voted output value
Mitigation requirement: only one BRAM port can be used by the MicroBlaze design
Each Block RAM is replaced with the triplicated (TMRed) BRAMs and the internal BRAM scrub controller
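
The two steps above can be sketched behaviourally in Python: three memory copies read through a voter, plus a scrub pass that writes the voted value back. The class name `TmrBram`, its methods and the memory depth are hypothetical; the actual macro is hardware inserted by the TMR Tool and is not shown in these slides. The second half of the sketch shows why scrubbing matters: without it, upsets accumulate in two domains and the voter is outvoted.

```python
from typing import List


class TmrBram:
    """Behavioural model of a triplicated memory with a voted read path."""

    def __init__(self, depth: int, width: int = 32) -> None:
        self.mask = (1 << width) - 1
        self.copies: List[List[int]] = [[0] * depth for _ in range(3)]

    def write(self, addr: int, value: int) -> None:
        for copy in self.copies:
            copy[addr] = value & self.mask

    def read(self, addr: int) -> int:
        a, b, c = (copy[addr] for copy in self.copies)
        return (a & b) | (a & c) | (b & c)  # bitwise 2-of-3 majority

    def upset(self, domain: int, addr: int, bit: int) -> None:
        """Model an SEU: flip one bit in one of the three copies."""
        self.copies[domain][addr] ^= 1 << bit

    def scrub(self) -> int:
        """Rewrite every word with its voted value; return words repaired."""
        repaired = 0
        for addr in range(len(self.copies[0])):
            voted = self.read(addr)
            for copy in self.copies:
                if copy[addr] != voted:
                    copy[addr] = voted
                    repaired += 1
        return repaired


# With scrubbing between upsets, the voted value stays correct.
scrubbed = TmrBram(depth=4)
scrubbed.write(2, 0xCAFE)
scrubbed.upset(domain=0, addr=2, bit=5)
scrubbed.scrub()
scrubbed.upset(domain=1, addr=2, bit=5)

# Without scrubbing, the same bit is wrong in two domains and wins the vote.
unscrubbed = TmrBram(depth=4)
unscrubbed.write(2, 0xCAFE)
unscrubbed.upset(domain=0, addr=2, bit=5)
unscrubbed.upset(domain=1, addr=2, bit=5)
```

Here `scrubbed.read(2)` still returns the stored word, while `unscrubbed.read(2)` returns a corrupted one.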

9 EDK / TMR Tool Design Flow
[Flow diagram. Stages: System Design, Implementation, TMR Tool, NGDBuild, MAP, PAR, BitGen / BitInit. Phases: Design Entry (EDK/ISE), XTMR Conversion (TMR Tool), Implementation (ISE). File labels, in slide order: .ngc, .bmm, .elf, .edf, (Manual edit), .ucf, .ngo. Additional step: LUTRAM & BRAM Macro Replacement.]

10 Implementation and Performance (1)
[Chart: internal resources used on the Virtex-II 6000. Series: single-string MicroBlaze; mitigated MicroBlaze design with LUT-RAMs; mitigated MicroBlaze design without LUT-RAMs; fully mitigated design.]

11 Implementation and Performance (2)
Timing performance and core-voltage current consumption

Tested design                                                        Maximum frequency (MHz)  Current consumption (A)
Single-string MicroBlaze (Phase 1)                                   77                       0.37
Mitigated MicroBlaze design before replacement of LUT-RAM (Phase 2)  66                       0.78
Mitigated MicroBlaze design after replacement of LUT-RAM (Phase 3)   66                       0.83
Fully mitigated design (Phase 4)                                     66                       0.99

12 Experimental Test Designs
Service FPGA: XC2V3000
1. Configuration Monitor
- DUT configuration
- Continuous alternating scrubbing and readback at a rate of 4 per second
- SEFI detection
2. Functional Monitor
- Sends input vectors to the DUT
- Detects errors based on the DUT outputs
- Records errors and exception occurrences
- Runs continuous handshaking with the DUT to ensure its full synchronization with external peripherals
DUT FPGA: XQR2V6000
- MicroBlaze design running integer-based FFT software
- 33 MHz MicroBlaze clock speed
- 0.25 MHz GPIO bus
Two mitigated design versions:
1. Without BRAM scrubber
2. With BRAM scrubber

13 DUT/Service FPGAs Communication

14 Experimental Setup
Tested at Crocker Nuclear Laboratory at UC Davis using a 63.3 MeV proton beam
[Photo labels: DUT, Service FPGA]

15 Proton Beam Results (1)
Error classification
- Type 1: FFT program calculates an incorrect result
- Type 2: MicroBlaze communication sequence is wrong or stops (timeout)
- Type 3: An exception or interrupt is invoked
Error recovery types
- The MicroBlaze recovers on the next iteration of the program
- The MicroBlaze recovers when the processor is reset
- The MicroBlaze recovers after the FPGA logic is scrubbed
Non-recovery types (Type -R)
- Runaway resets: upsets in the MicroBlaze code (stored in the BRAM) in at least two domains
- Runaway exceptions: illegal operation on the MicroBlaze detected by the exception handler (DUT/Service)
- Runaway errors: illegal code in the FFT computation code

16 Proton Beam Results (2)

Proton-induced cross sections of Design 1 at various fluxes (error cross sections in cm^2)

No.  Flux [p/cm^2/s]  CLB upsets / scrub cycle  Fluence [p/cm^2]  Type 1        Type 1R       Type 2        Type 2R       Type 3
(1)  1.70x10^7        2 to 7                    1.00x10^11        7.00x10^-11   <1.00x10^-11  5.00x10^-11   <1.00x10^-11  (none listed)
(2)  1.70x10^8        15 to 30                  1.03x10^11        2.92x10^-10   9.74x10^-12   2.05x10^-10   6.82x10^-11   <9.70x10^-12
(3)  1.70x10^9        150 to 190                4.86x10^10        1.07x10^-9    <2.05x10^-11  7.82x10^-10   1.65x10^-10   3.60x10^-11

Row (1) lists only four cross-section values in the source; they are shown in their original order.

Proton-induced cross sections of Design 2 at various fluxes (error cross sections in cm^2)

No.  Flux [p/cm^2/s]  CLB upsets / scrub cycle  Fluence [p/cm^2]  Type 1        Type 1R       Type 2        Type 2R       Type 3
(1)  1.94x10^7        2 to 7                    9.79x10^10        7.56x10^-10   2.04x10^-11   6.34x10^-10   1.43x10^-10   8.17x10^-11
(2)  3.87x10^7        4 to 15                   2.49x10^10        8.44x10^-10   <4.02x10^-11  6.03x10^-10   2.01x10^-10   1.61x10^-10
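
The slides do not state how the cross sections were computed. The tabulated values are, however, approximately consistent with the usual definition, observed events divided by fluence, with an upper limit of one event divided by fluence when no event was seen. The Python sketch below works under that assumption; the function names are hypothetical and the event counts are back-calculated from the table, not reported in the source.

```python
def cross_section(events: int, fluence: float) -> float:
    """Error cross section in cm^2: observed events per unit fluence."""
    if fluence <= 0:
        raise ValueError("fluence must be positive")
    return events / fluence


def upper_limit(fluence: float) -> float:
    """Limit quoted as '<' when zero events were observed (one event assumed)."""
    return cross_section(1, fluence)


def implied_events(sigma: float, fluence: float) -> int:
    """Back-calculate the nearest whole number of events from a tabulated value."""
    return round(sigma * fluence)


# Design 1, lowest flux: fluence 1.00x10^11 p/cm^2.
type1_low_flux = cross_section(7, 1.00e11)   # compare with 7.00x10^-11 in the table
limit_low_flux = upper_limit(1.00e11)        # compare with <1.00x10^-11 in the table

# Design 1, middle flux: Type 1 value 2.92x10^-10 at fluence 1.03x10^11 p/cm^2.
events_mid_flux = implied_events(2.92e-10, 1.03e11)
```

Under the same assumption, the Type 1 and Type 2 values of the lowest-flux Design 1 run add up to 1.2x10^-10 cm^2.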

17 Conclusion
A complete solution to mitigate an embedded processor implemented on a Xilinx Virtex-II FPGA, based on:
- Continuous external configuration scrubbing
- Functional-block design triplication
- Independent internal BRAM scrubbing (also triplicated)
High area and power dissipation penalties after replacement of the distributed RAMs
At low flux: very low error cross section (1.2x10^-10 cm^2)
The error cross section increases rapidly with increasing flux
For the space environment, the error rate of a MicroBlaze design is predicted to be lower than the SEFI rate, which demonstrates the high efficacy of this solution

18 Lessons Learned
Check whether your design includes SRL16s or distributed RAMs, so that active scrubbing is possible
Do the smoke test: break one domain and ensure that the design is still running
Reduce the flux to respect the first rule of the TMR mitigation technique (1 upset / scrub cycle)
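
As a rough illustration of the last point, the Python sketch below estimates the flux needed to reach a target number of upsets per scrub cycle. It assumes that the upset count per scrub cycle scales linearly with flux, which the Design 1 runs roughly follow (2 to 7, 15 to 30 and 150 to 190 upsets for tenfold flux steps). The function name is hypothetical and the result is an estimate, not a figure from the slides.

```python
def flux_for_target(
    measured_flux: float,
    measured_upsets_per_cycle: float,
    target_upsets_per_cycle: float = 1.0,
) -> float:
    """Scale a measured flux linearly to hit a target upset count per scrub cycle."""
    if measured_upsets_per_cycle <= 0:
        raise ValueError("measured upsets per cycle must be positive")
    return measured_flux * target_upsets_per_cycle / measured_upsets_per_cycle


# Worst case of the lowest-flux Design 1 run: 7 upsets per scrub cycle
# at 1.70x10^7 p/cm^2/s.
estimated_flux = flux_for_target(1.70e7, 7)
print(f"{estimated_flux:.2e} p/cm^2/s")
```

Scrubbing faster would serve the same purpose as lowering the flux, since both reduce the number of upsets that accumulate within one scrub cycle.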


