LaRC MAPLD 2005 / A208 Ng 1 Radiation Tolerant Intelligent Memory Stack (RTIMS) Tak-kwong Ng, Jeffrey Herath Electronics Systems Branch Systems Engineering.


1 LaRC MAPLD 2005 / A208 Ng 1 Radiation Tolerant Intelligent Memory Stack (RTIMS) Tak-kwong Ng, Jeffrey Herath Electronics Systems Branch Systems Engineering Directorate NASA Langley Research Center t.ng@nasa.gov jeffrey.a.herath@nasa.gov 757-864-1097 (Tak) 757-864-1098 (Jeff)

2 Agenda
What is it?
Goals
Components selection
FPGA SEU mitigation
XTMR tools
Status
Future work
Points to ponder

3 What is it?
Radiation tolerant
–Use commercial-off-the-shelf (COTS) components: reprogrammable FPGA, high performance, lower cost
–Pick parts with applicable mitigation techniques: shielding, over-current protection, triple module redundancy, FPGA configuration scrubbing
Intelligent
–Reprogrammable FPGA: SDRAM controller, capacity to add custom logic
Memory
–Large capacity SDRAM
Stack
–3D vs 2D, board space saving

4 Goals
Large memory capacity
–256 MB EDAC
Single +3.3 V power supply
Simple interface, LVTTL compatible
Throughput
–32 MWord write
–16 MWord read
Reprogram via the JTAG interface
Spare FPGA gate capacity for user application
Radiation characteristics
–Total ionizing dose of 100 krad (Si) at 25 °C
–SEU: best practice
–SEL requirement of 60 MeV-cm²/mg
Operating temperature: -40 °C / +85 °C

5 Components Selection (1/3)
FPGA
–Reprogrammable
–Xilinx Virtex, Virtex-II XQR2V1000
–Total ionizing dose of 200 krad (Si) (data sheet)
–SEL of 160 MeV-cm²/mg (data sheet)
–Current limiters
  Limited SEFI
  –POR, SelectMAP, JTAG
  –1.5E-6 upsets/device/day (data sheet)
  SOFT
  –Mitigation techniques: TMR, configuration scrubbing
–XQ2V1000-4BG575
  Military version for lower cost
  –SEL may not be as good as XQR2V1000
  –SEL of 124 MeV-cm²/mg
  Capacity of 1 M gates
  328 signal I/Os

6 Components Selection (2/3)
EEPROM
–Xilinx XQR18V04
  Total ionizing dose of 10 krad (Si) (data sheet)
  –30 krad (Si) for read only (data sheet)
  SEL of 120 MeV-cm²/mg (data sheet)
  SEU of 120 MeV-cm²/mg (data sheet)
SDRAM
–Elpida EDS5108ABTA (512 Mb)
  Total ionizing dose of 50 krad (Si)
  SEL of 80 MeV-cm²/mg at 85 °C, 100 °C, 125 °C
  SEU
  –Bit error rate of 6.96E-12 errors/bit-day
  –SEFI error rate of 1.3E-4 errors/device-day
Linear regulator
–Texas Instruments TPS75715 (1.5 V LDO regulator)
  Total ionizing dose of 10 krad (Si)
  SEL of 60 MeV-cm²/mg
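To put the SDRAM bit error rate in per-device terms, the arithmetic can be sketched as below. This is only an illustration: it assumes "512 Mb" means 512 × 2^20 bits, it treats the quoted rate as a raw figure before any EDAC or TMR correction, and the function name and `devices` parameter are hypothetical.

```python
# Raw SEU arithmetic for one SDRAM die, using the figures quoted on the slide.
# Assumption: 512 Mb = 512 * 2**20 bits; the rate is before EDAC/TMR correction.
BITS_PER_DEVICE = 512 * 2**20          # 512 Mb SDRAM
BIT_ERROR_RATE = 6.96e-12              # errors per bit-day (slide value)


def expected_bit_errors_per_day(devices: int = 1) -> float:
    """Expected raw bit upsets per day across `devices` SDRAM dies."""
    return devices * BITS_PER_DEVICE * BIT_ERROR_RATE


per_day = expected_bit_errors_per_day()
days_between_errors = 1.0 / per_day
print(f"{per_day:.3e} errors/device-day, about one every {days_between_errors:.0f} days")
```

Under these assumptions a single die sees roughly 3.7E-3 raw bit errors per day, which is the rate that memory scrubbing has to stay ahead of.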

7 Components Selection (3/3)
Current limiters
–Maxim-IC MAX893L (1.2 A), MAX891L (0.5 A)
  Total ionizing dose SEL of 30 krad (Si)
Power-on-reset circuit
–Maxim-IC MAX803
  Total ionizing dose of 20 krad (Si)
Stacking technology
–Provided by 3D Plus

8 Radiation Mitigation
Total ionizing dose
–Local shielding
–Package shielding, thickness depends on requirement
SEL
–Current limiting device
SEU
–Memory contents
  TMR, EDAC
–FPGA SEU
  Configuration scrubbing, TMR
SEFI
–Best effort to minimize the SEFI rate
–Mitigate at higher level

9 Block Diagram

10 FPGA SEU Mitigation (1/5)
Input
–Xilinx recommendation
  Use 3 pins per signal, connected on the board
  Bus signals: use one pin per signal, add EDAC, save pins
  –The sending side must generate EDAC check bits
  Pins can be used up quickly
–Implementation
  Module interface
  –Use 3 pins per signal for address/controls
  –Use 1 pin per signal for Din
    EDAC is optional
    Single point failure rate increases without EDAC

11 FPGA SEU Mitigation (2/5)
Output
–Xilinx recommendation
  Use 3 pins per signal, connected on the board
  –Not glitch-free
  –Signal integrity
  Bus signals: use one pin per signal, add EDAC, save pins
  –The receiving side must also implement EDAC
  Pins can be used up quickly
–Implementation
  Module interface
  –Use 3 pins per signal for controls
  –Use 1 pin per signal for Dout
    EDAC is optional
    Single point failure rate increases without EDAC

12 FPGA SEU Mitigation (3/5)
Output
–Implementation …
  SDRAM interface
  –Clock, Address
    3 sets, equivalent signals are not connected together on the board
    Each set drives two SDRAMs
  –Controls
    4 sets, equivalent signals are not connected together on the board
    Two of the sets each drive two SDRAMs
    The other two sets each drive one SDRAM
    Switch EDAC/TMR configured SDRAM

13 FPGA SEU Mitigation (4/5)
Bi-directional
–Xilinx recommendation
  Use 1 pin per signal
  Path from voter to the pin becomes a possible single point of failure
–Implementation
  SDRAM interface
  –TMR configured SDRAMs
    3 sets of data bus
  –EDAC configured SDRAMs
    Use 1 pin per signal

14 FPGA SEU Mitigation (5/5)
Implication on data integrity of the SDRAM contents
–EDAC configured SDRAMs
  256 MB
  Output drivers and input receivers are possible single points of failure
–TMR configured SDRAMs
  128 MB
  No single point of failure
  Background SDRAM content scrubbing
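The TMR memory configuration with background scrubbing can be illustrated with a small software model. This is a sketch under stated assumptions, not the RTIMS controller logic: the three banks are plain Python lists, and `majority` and `scrub` are hypothetical names chosen for the example.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote over three copies of a word."""
    return (a & b) | (a & c) | (b & c)


def scrub(banks) -> int:
    """Walk every address of three redundant banks and rewrite any copy
    that disagrees with the voted value. Returns the number of repairs."""
    bank_a, bank_b, bank_c = banks
    repaired = 0
    for addr in range(len(bank_a)):
        voted = majority(bank_a[addr], bank_b[addr], bank_c[addr])
        for bank in banks:
            if bank[addr] != voted:
                bank[addr] = voted
                repaired += 1
    return repaired


# Three identical copies, then a single-bit upset injected into one copy.
banks = [[0xA5, 0x3C], [0xA5, 0x3C], [0xA5, 0x3C]]
banks[1][0] ^= 0x10
first_pass = scrub(banks)    # repairs the upset copy
second_pass = scrub(banks)   # nothing left to repair
```

The point of scrubbing in the background is that a word is repaired while only one copy is wrong; a second upset in the same bit position of another copy before the scrub would defeat the vote.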

15 XTMR Tool (1/4)
Fairly fast
Gates utilized
–Average utilization cost of TMR is ~3.2x
–RTIMS actual: 4.3x
  Gates multiplier = 3 + 3 * (fraction of flops + fraction of I/Os)
  –It is closer to 3x for a design that is mostly gates
  –It is closer to 6x for a design that is mostly flops
  –RTIMS actual: 36% flops
  Additional multiplier for designs with SRL16
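The gates multiplier formula on this slide can be evaluated directly. The sketch below uses the slide's formula as written; the function name is hypothetical, and the I/O fraction for RTIMS is not given in the slides, so the RTIMS line sets it to zero purely for illustration.

```python
def tmr_gate_multiplier(flop_fraction: float, io_fraction: float) -> float:
    """Gates multiplier = 3 + 3 * (fraction of flops + fraction of I/Os)."""
    return 3 + 3 * (flop_fraction + io_fraction)


mostly_gates = tmr_gate_multiplier(0.0, 0.0)   # lower bound of the formula
mostly_flops = tmr_gate_multiplier(1.0, 0.0)   # design dominated by flops
rtims_flops_only = tmr_gate_multiplier(0.36, 0.0)  # 36% flops, I/O term assumed 0

print(mostly_gates, mostly_flops, round(rtims_flops_only, 2))
```

With 36% flops alone the formula gives about 4.08x; the remaining gap to the reported 4.3x would have to come from the I/O term and the additional SRL16 multiplier, neither of which is quantified in the slides.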

16 XTMR Tool (2/4)
Internal performance degradation
–Average performance impact of TMR is ~10%
–RTIMS actual: ~20%
  6 logic levels original
  –Add a voter, 7 levels
  –~15% performance impact
  Longer routing
  –3.8x gates
  –~5% performance impact
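The ~15% figure for the added voter level is consistent with a simple estimate, sketched below. The model is an assumption, not something the slides state: it supposes that critical-path delay scales linearly with the number of logic levels and that every level, including the voter, costs the same delay. The function name is hypothetical.

```python
def frequency_loss(levels_before: int, levels_added: int = 1) -> float:
    """Fractional loss in maximum clock frequency when the critical path
    grows from `levels_before` to `levels_before + levels_added` equal-delay
    logic levels (assumed model: delay proportional to level count)."""
    levels_after = levels_before + levels_added
    return 1 - levels_before / levels_after


voter_penalty = frequency_loss(6)   # 6 levels -> 7 levels
print(f"{voter_penalty:.1%}")
```

Going from 6 to 7 levels gives a loss of 1/7, a little over 14%, which is close to the ~15% quoted before the routing penalty is added.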

17 XTMR Tool (3/4)
I/O performance degradation
Input pin
–TMR
  Voters after the FF
  Lock the FF in the IOB
–No TMR on input pin
  3 FFs after the input receiver
  Can’t lock the FF in the IOB
  Performance penalty
  RTIMS actual: increased from 1.8 ns to 3.6 ns

18 XTMR Tool (4/4)
Output pin
–Triplicate pin, tied together on board
  Add voter before the output driver
  Glitch
  Can’t lock the FF in the IOB
  Performance penalty
  Signal integrity
–Not triplicating pin
  Add voter before the output driver
  Glitch
  Can’t lock the FF in the IOB
  Performance penalty
–RTIMS actual: increased from 4.5 ns to 6.4 ns

19 Storage state
Correct an SEU on storage state before the next SEU makes it uncorrectable
Memory content
–Scrubbing
Flop state
–Basic Xilinx flop: FDCPE(PRE, D, CE, C, CLR, Q)
–Inputs of the flop are corrected
–Unless CE is active, the flop state is not corrected
–3 minority voters and 3 OR gates can be added to force a CE when an error is detected
–Expensive to apply this universally
–For an “almost” static flop, the following flop is used
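The minority-voter idea can be modeled in a few lines. This is a behavioral sketch only, with loud assumptions: each redundant flop is a single bit, the D inputs already carry the correct (voted) value even while CE is inactive, and the names `minority` and `clock_edge` are hypothetical. The slides do not show the actual circuit.

```python
def majority(a: int, b: int, c: int) -> int:
    """2-of-3 majority of three 1-bit values."""
    return (a & b) | (a & c) | (b & c)


def minority(own: int, other1: int, other2: int) -> int:
    """1 when `own` disagrees with the 2-of-3 majority, else 0."""
    return own ^ majority(own, other1, other2)


def clock_edge(q, d, ce):
    """One clock edge for three redundant 1-bit flops.

    q  : current states [qa, qb, qc]
    d  : D inputs per domain (assumed to hold the correct value)
    ce : original clock enable, shared by all three domains
    Each domain's effective CE is the OR of `ce` and its own minority flag,
    so a flop that disagrees with the other two reloads even when ce is 0.
    """
    qa, qb, qc = q
    flags = (minority(qa, qb, qc), minority(qb, qa, qc), minority(qc, qa, qb))
    return [d[i] if (ce | flags[i]) else q[i] for i in range(3)]


upset_repaired = clock_edge([1, 0, 1], [1, 1, 1], ce=0)   # domain B was flipped
held = clock_edge([1, 1, 1], [0, 0, 0], ce=0)             # no error, state holds
loaded = clock_edge([1, 1, 1], [0, 0, 0], ce=1)           # normal enabled load
```

Without the forced enable, the flipped domain in the first case would stay wrong until the design next asserted CE, which is the exposure the slide describes.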

20 A few other things (1/4)
Digital Clock Manager
–Use 3 DCMs for each DCM that is in the original design
–DCM is a unit
  SEU on a flop in the DCM
  –Corrected by configuration scrubbing
  –Reset only
–3 counters, each counter is clocked by a DCM
–When one of the counter values differs from the other two, we know which DCM is operating differently from the others
–Each counter is TMR so that an SEU on the counter other than the clock path will not produce an error
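The three-counter comparison for DCM failure detection amounts to finding the odd one out. A minimal sketch follows, assuming the three counter values are sampled at the same instant; the function name and the `tolerance` parameter are hypothetical additions for the example.

```python
def find_faulty_dcm(counts, tolerance: int = 0):
    """Return the index (0, 1 or 2) of the counter that disagrees with the
    other two, or None if all three agree within `tolerance` counts.
    Raises ValueError when no single odd counter can be identified."""
    a, b, c = counts
    ab = abs(a - b) <= tolerance
    ac = abs(a - c) <= tolerance
    bc = abs(b - c) <= tolerance
    if ab and ac and bc:
        return None
    if bc and not ab and not ac:
        return 0
    if ac and not ab and not bc:
        return 1
    if ab and not ac and not bc:
        return 2
    raise ValueError("cannot isolate a single disagreeing counter")


all_good = find_faulty_dcm((1000, 1000, 1000))
dcm1_bad = find_faulty_dcm((1000, 742, 1000))
dcm0_bad = find_faulty_dcm((5, 1000, 1000))
```

The returned index identifies which DCM to reset; the slides note that the counters themselves are triplicated so that an upset in a counter is not mistaken for a clock fault.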

21 A few other things (2/4)
Configuration scrubbing
–Similar to Virtex
–Virtex-II
  Whole configuration is loaded with 1 type 2 command
  The order of configuration loading is
  –GCLK, CLB and IOB, Memory Content, and Memory Control
–Script to split the loading into three type 2 commands
  GCLK, CLB, IOB
  Memory control
  Memory content
–On power up the whole configuration is loaded
–On scrubbing, only GCLK, CLB, IOB, and memory control are loaded

22 A few other things (3/4)
Configuration scrubbing
–Scrubber logic is TMR and it is part of the FPGA code
–Master SelectMAP for configuration, with the configuration clock continuing to run after initial load
–Scrubber logic is clocked by the configuration clock
  The generation of the configuration clock becomes a possible single point of failure
  Can switch to Slave SelectMAP and add an external oscillator

23 A few other things (4/4)
SelectMAP interface SEFI detection
–Implement a 16x1 distributed memory as SRL16 with initial value of all zeros
–Instruct XTMR not to convert it to registers
–Write a signature into this memory prior to configuration scrubbing
–This memory should be cleared by the reloading of the CLB during configuration scrubbing
–Read the memory content after configuration scrubbing
–A non-zero content indicates scrubbing failure
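The signature check can be walked through with a toy model. Everything here is hypothetical scaffolding for illustration: `FakeFpga` stands in for the device, `SIGNATURE` is an arbitrary non-zero 16-bit pattern, and a working scrub is modeled simply as resetting the SRL16 to its all-zeros initial value.

```python
SIGNATURE = 0xA5C3  # arbitrary non-zero 16-bit pattern (assumption)


class FakeFpga:
    """Toy stand-in for the FPGA: one 16x1 SRL16 and a scrub operation."""

    def __init__(self, selectmap_ok: bool = True):
        self.srl16 = 0                  # initial value of all zeros
        self.selectmap_ok = selectmap_ok

    def write_srl16(self, value: int) -> None:
        self.srl16 = value & 0xFFFF

    def read_srl16(self) -> int:
        return self.srl16

    def scrub(self) -> None:
        # A working scrub reloads the CLB frames, which restores the SRL16
        # to its initial contents. A SelectMAP SEFI leaves it untouched.
        if self.selectmap_ok:
            self.srl16 = 0


def scrub_and_check(fpga: FakeFpga) -> bool:
    """Write the signature, scrub, and report True if the scrub took effect."""
    fpga.write_srl16(SIGNATURE)
    fpga.scrub()
    return fpga.read_srl16() == 0


healthy = scrub_and_check(FakeFpga(selectmap_ok=True))
sefi = scrub_and_check(FakeFpga(selectmap_ok=False))
```

The SRL16 is excluded from XTMR's conversion to registers precisely so that its contents live in configuration memory and are overwritten by the scrub.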

24 Stack [figure: stack layers labeled SDRAM and MISC]

25 Status
20 modules
–Related paper: "Radiation Tolerant and Intelligent Memory for Space" (P1025)
–144-lead QFP package
–Dimensions: 42.5 mm x 42.5 mm x 13.0 mm
–Mass: 70 g with radiation shielding
–Power: ~4.0 W peak
–To be verified / analyzed
  Total ionizing dose > 100 krad (Si)
  SEU in GEO less than 1.5E-6 per day
  Latch-up immune to 60 MeV-cm²/mg

26 Future Work
VHDL and place & route
–Work in progress
  Minimize SEFI
  Error detection and recording
  Error recovery
  What is the SEFI rate of RTIMS?
Environment testing
–Life test (accelerated component life testing)
–100 krad (Si) TID radiation tests
–SEL and SEU radiation tests
–Vacuum and temperature tests
–Mechanical stress tests
–Electrostatic discharge tests

27 Points to ponder
XTMR
–Not a turnkey process
  Scrub memory content
  Almost static flop
  DCM failure detection and reset
  Glitch-free output is no longer glitch-free
  Signal integrity with dotted output
–IO
  3 pins for one signal, EDAC
  Tie the triplicate IO together vs carry three signals on the board with the voter implemented on the receiving side
–One size does not fit all

