Presentation is loading. Please wait.

Presentation is loading. Please wait.

FE-I4 Chip Design Marlon Barbero, Bonn University Vertex 2009, Putten, The Netherlands Feb. 17 th 2009.

Similar presentations


Presentation on theme: "FE-I4 Chip Design Marlon Barbero, Bonn University Vertex 2009, Putten, The Netherlands Feb. 17 th 2009."— Presentation transcript:

1 FE-I4 Chip Design Marlon Barbero, Bonn University Vertex 2009, Putten, The Netherlands Feb. 17 th 2009

2 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 2 Contents IBL (sensor damage) / sLHC (increased luminosity).  From FE-I3, current ATLAS pixel FE, to FE-I4, new ATLAS pixel FE for IBL & sLHC. Analog pixel. Digital pixel and digital Double-Column. Periphery. Yield. Conclusion and milestones.

3 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 3 FE-I4 for IBL & sLHC IBL (~2014): inserted layer @ 3.7cm in current pixel detector. Present beam pipe & B-Layer Existing B-layer New beam pipe IBL mounted on beam pipe r~37 mm FE-I4 Tobias Flick’s tentative ID layout for sLHC? ATLAS present Inner Detector (ID) - All Silicon. - Long Strips/ Short Strips / Pixels. - Pixels: - 2 or 3 fixed layers at ‘large’ radii (large area at 16 / 20 / 25 cms?) -2 removable layers at ‘small’ radii sLHC tentative layout (>2017): 4 to 5 pixel layers, small radii / large(r) radii (note: Discussion on boundary pixel / short strips, …).

4 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 4 Motivation for Redesign of FE Need for a new FE? Smaller b-layer radius + potential luminosity increase  higher hit rate.  FE-I3 column-drain architecture saturated.  FE-I4 new digital architecture: local regional memories, stop moving hits around (unless RO).  FE-I4 has smaller pixel (reduced cross-section). New technology:  Higher integration density for digital circuits, rad-hard, availibility. 0.25 μm  130 nm FE-I3  FE-I4 Hit prob. / DC Inefficiency [%] LHC IBL sLHC FE-I3 at r=3.7 cm! The “inefficiency wall” 100 80 60 40 20 0

5 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 5 Future FE-I4-Based Module (& Consequences for FE-I4) FE-Chip Sensor Flex 1 1 2 2 3 3 4 1) Big chip (periphery on one side of module). 2) Reduce size of periphery (2.8 mm  2 mm). 3) Thin down FE chips (190 μm  90 μm). 4) Thin down the sensor (250 μm  200 μm)? 5) Less cables (powering scheme)? 5 Increased active area: from less than 75 % to ~90 %:  Reduced periphery; bigger IC; cost down for sLHC (main driver is flip-chip costs per chip). No MCC:  More digital functionality in the IC. Power:  Analog design for reduced currents; decrease of digital activity (digital logic sharing for neighbor pixels); new powering concepts. 8 metal layers [2 thick Alu.]  power routing. challenging: power (routing, start-up), clk. distrib., simulation / management, yield 4

6 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 6 Some Target Specs for FE-I4 Rad.-hardness: >200 MRad ionizing dose (FE-I3: >50 Mrad). Minimal guidelines: no ELT, NMOS guard rings for analog & sensitive digital circuitry. ToT coded 4 bits. DC leakage current tolerant to > 100 nA. FE-I3FE-I4 Pixel Size [μm 2 ]50×40050×250 Pixel Array18×16080×336 Chip Size [mm 2 ]7.6×10.820.2×19.0 Active Fraction74 %89 % Analog Current [μA/pix]2610 Digital Current [μA/pix]1710 Analog Voltage [V]1.61.5 Digital Voltage [V]21.2 pseudo-LVDS out [Mb/s]40160 biggest in HEP to date analog / digital power tuned for IBL occupancy

7 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 7 Analog Pixel In FE-I4_proto1 (FE-I4 prototype submitted in 2008): 2-stage architecture optimized for low power, low noise, fast rise time.  regul. casc. preamp. nmos input.  folded casc. 2 nd stage pmos input.  Additional gain, Cc/Cf2~6.  2 nd stage decoupled from leakage related DC potential shift.  Cf1~17fF (~4 MIPs dyn. range). 12b configuration:  FDAC: tuning feedback current.  TDAC: tuning of discriminator threshold.  Local charge injection circuitry. 50  m 150  m Preamp Amp2 FDAC TDAC Config Logic discri

8 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 8 Analog Pixel: Noise & Irradiation Dose received: ~200 Mrad  Low I (10 μA): noise increases ~20 %. Low current (which is target value) is 10 μA/pixel for preamp+amp2+comparator Current for minimum noise: 17 μA/pixel (12μA) m~65 e - Excellent un-tuned threshold dispersion. m~100 e - (17μA) m~65 e - m~90 e - (loaded ~400fF) ~200 Mrad

9 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 9 Digital Pixel: Regional Architecture local storage Store hits locally in region until L1T. Only 0.25% of pixel hits are shipped to EoC  DC bus traffic “low”. Each pixel is tied to its neighbors -time info- (clustered nature of real hits). Small hits are close to large hits! To record small hits, use space instead of time. Handle on TW. low traffic on DC bus Consequences: physics simulation shows efficient architecture. lowers digital power consumption. spatial association of digital hit to recover lower analog performance. disc. top left disc. bot. left disc. top right disc. bot. right 5 ToT memory /pixel 5 latency counter / region hit proc.: TS/sm/big/ToT Read & Trigger Neighbor Token L1TRead Digital Region 4-Pixel Unit

10 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 10 Performance / Efficiency IBL: charge sharing in Z comparable with r/phi. Memories SimulationAnalytical IBL10xLHCIBL10xLHC 5 0.047%2.19%0.029%2.25% 6 0.011%0.65%0.003%0.57% 7 <0.01%0.16%<0.01%0.13% η=0 Mean ToT = 4 0.6% Regional Buffer Overflow @ IBL rate, pile-up inefficiency is the dominant source of inefficiency Inefficiency: Pile-up inefficiency (related to pixel x-section and return to baseline behavior of analog pixel)  ~ 0.5%. Regional Buffer overflow  ~0.05% Inefficiency under control for IBL occupancy

11 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 11 Digital Column Architecture 168 regions + CLK + buffering scheme  1 digital DC

12 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 12 Digital Power 4-pixel region for 21 regions <7mV Digital power: at IBL occupancy, digital power < 10μW/pixel. Drop on Vdd

13 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 13 Pixel Layout Note: Digital ground tied to substrate, mixed signal environment BUT digital region placed in “T3” deep n-well. Preamp Amp2 FDAC TDAC Config Logic discri 50  m 250  m synthezised digital region (1/4 th )

14 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 14 FE-I4 Periphery Pix Array: data formatting / compression 80×336 pixel array Asynch. FIFO (hamming code) ‘LVDS’-out 160Mb/s 2 monitoring config. Periphery: PLL, 40MHz in, 160MHz out 40MHz digital ctrl block interface L1T global config global register bank pixel config trigger FIFO EoC Powering clk select 160MHz aux L1T, token, read, … EoC token EoC token 28 b × 40 DC The pixel array + End of Column logic (triple redundant token) pixel config Data formatting block readout FIFO digital ctrl block Command decoder: configuration, reset & L1T 8b10b encoder unit PLL (40MHz  160MHz), high speed serializer… LVDS output, 160 Mb/s, suited for IBL occupancy Powering: DC-DC converters, Shunt-LDO

15 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 15 Yield Estimated from: – Small analog test chips. – 8 fully tested wafers of Medipix 3 ICs, assuming same defect density for synthesized logic. Expect of order ~39% digitally perfect chips. Yield enhancement: – Triple redundant read tokens. – Hamming coded pixel data and address (w. minimal # of gates). – Redundant configuration shift register.  Fully functional chips yield might be as high as 76%. (with isolated dead pixels at level <0.1%).

16 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 16 Test Chip Submission FE-I4-P1 LDO Regulator Charge Pump Current Reference DACs Control Block Capacitance Measurement 3mm 4mm 61x14 array SEU test IC 4-LVDS Rx/Tx ShuLDO +trist LVDS/LDO/10b-DAC turboPLL : PLL core + PRBS + 8b10b coder + LVDS driv low power discri.

17 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 17 FE-I4 Collaboration Meeting 1 time a week. Collaborate remotely using Cliosoft platform. Participating institutes: Bonn: D. Arutinov, M. Barbero, T. Hemperek, A. Kruth, M. Karagounis. CPPM: D. Fougeron, M. Menouni. Genova: R. Beccherle, G. Darbo. LBNL: S. Dube, D. Elledge, M. Garcia- Sciveres, D. Gnani, A. Mekkaoui. Nikhef: V. Gromov, R. Kluit, J.D. Schipper Schedule: Submission planned for November 2009 (with submission readiness review 3-4 Nov. 2009) 160 18 FE-I3 FE-I4

18 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 18 backup BACKUP SLIDES

19 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 19 Existing B-layer Newbeam pipe IBL mounted on beam pipe Present beam pipe through present B-Layer

20 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 20 Preamp. & Leakage Compensation I_leakage compensation Cst I feedback Regulated cascode preamp.  high gain.  less crosstalk path through biasing voltage. Triple well NMOS input.  shield from substrate noise. Feedback capacitor discharged by NMOS feedback transistor. Leakage current compensation scheme based on differential amplifier. 2ke<Qin<22ke 100nA leakage  10mV DC shift Vout[V] t[s] Vout[V] t[s] 200m 250m 225m 0.1  0.5  0.1  0.5  200m 300m 380m

21 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 21 2 nd stage & Comparator PMOS input folded regulated cascode (straight cascode in futur?). Negative going output. Classic 2-stage comparator. 20ns timewalk for 2ke - < Qin < 52ke - & threshold @1.5ke - ENC=160e- @ Cd=0.4p & Il=100nA ENC[e - ] Cd[F] Qin[C] t LE [s] 100f 200f300f 60 100 150 10k20k30k40k Il=0nA 20n 10n 0 2 nd stage amplif. Discriminator Il=100nA

22 Clock Distribution  Physical implementation  Differential Logic.  Single Ended.  Possible structures  simple  H-tree  with skew balancing 22

23 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 23 Clock Multiplier For IBL, need to transmit data out at BW of 160Mb/s 2 options: – send a 80MHz CLK to the FE and use both edges to transmit Needs modification of BOC / ROD to produce higher speed TTC Needs synchronization protocol on the FE between 80MHz clock & beam crossing. A new DORIC needs to decode CLK at twice frequency – send a 40MHz CLK to the FE and multiply clock on FE Needs a clock multiplier on chip Note: synergy with what the strip MCC need In FE-I4, we will provide both options: – Clock multiplier from the 40MHz input clock – AUX: possibility to send the 80MHz to the FE I/O choices for ATLAS IBL, ATLAS Pixel System Design Task Force

24 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 24 8b10b encoder For IBL, need to transmit data out at BW of 160Mb/s At BOC/ROD: – Data rate 4 times the clock rate – Phase adjustment Use Clock Data Recovery mechanism CDR requires an output data stream with good engineering properties 8b10b: – adequate for this purpose, enough transitions for reliable CDR – widely used  easy to implement – provides some level of error detection – provides comma for frame identification & synchronization I/O choices for ATLAS IBL, ATLAS Pixel System Design Task Force

25 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 25 SEU-hardened latch CPPM has studied the influence of various layout of a DICE latch on the SEU x-section. Latch5.1 and latch5.2 ; Area :12µm × 4µm = 48 μm 2 nMos separation : 7µm ; pMos separation : 3 µm Physical separation of sensitive node pairs. Calin et al, IEEE Trans. Nucl. Sci. vol43, n.6, 1996 1.a 1.b 2.a 2.b 3.a 3.b 1.a 1.b 2.a 2.b 3.a 3.b Triple Redundant Logic with Interleaved Layout. X-section [cm 2.bit -1 ]: - Standard Latch: ~ 5.10 -14 - DICE w. improved layout: ~ 3.10 -16 X-section : < 1.10 -17

26 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 26 PLL Overview Charge Pump Voltage Controlled Oscillator Phase Frequency Detector Frequency Divider Loop Filter Conversion and Buffering 40 MHz 640 MHz

27 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 27 Upset detection II 3 pC upset charge in fast divider Feedback too fast signal Feedback too slow signal

28 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 28 Settling Behaviour Control Voltage Frequency of single-ended 40 MHz clock Frequency of single-ended 640 MHz clock 3 pC upset charge in fast divider

29 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 29 3×LHC / b-layer replacement 40 80 120 160 200 100 0 0200300400500 600 r [mm] z [mm] 37 50.5 88.5 122.5 6.306.466.035.855.916.466.11 2.552.562.542.552.642.652.64 1.411.241.26 1.371.341.33 12.1011.5312.0111.8511.7212.118.02 rates given in [pixel hits.bx -1 cm -2 ] FE-I3, 50μm×400μm. FE-I4 simul., 50μm×250μm. η=0.1η=0.2η=0.3η=0.4η=0.5η=0.6η=0.7η=0.8 η=0.9 η=1.0 η=1.2 η=1.5 η=2.0 η=2.5 η=3.0 η=3.5

30 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 30 Pixel occupancy  Data bandwidth Pixel hit rate  FE output bandwidth: – # bits / pixel transmitted? » address 7+9 bits, analog info 4+2 bits  22b? » data output protocol? Reduce data output by taking into account clustered nature of real physics hits. NUMBER OF PIXELS FE-I4, central module, 21cm layer FE-I4, central module, 3.7cm layer 10xLHC FE-I4, central module, 3.7cm layer 3xLHC

31 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 31 Pixel occupancy  Data bandwidth Example 3: clustered data out with fixed format. compression factor (all at 3×LHC) 3.7cm (vs. 21cm), η=0 indiv pixels: 4.09 (0.25)×(7+9+4+2)= 1.00 (1.00)A.U. static 1×2: 3.45 (0.18)×(7+8+2×4+2)=0.96 (0.83) A.U. dynamic 1×2: 3.02 (0.15)×(7+9+2×4+2)= 0.87 (0.74) A.U. static 1×4: 2.86 (0.17)×(6+8+4×4+4)=1.08 (1.08) A.U. dyn. in-DC 1×4: 2.43 (0.15)×(6+9+4×4+4)= 0.95 (0.95) A.U. dynamic 1×4: 2.13 (0.14)×(7+9+4×4+4)= 0.85 (0.94) A.U. DC(×40) row (×336) column rowToT NL assumption: 100kHz L1T, 336×80 pixels FE-I4 Disclaimer: no header, trailer, DC-balancing, error correction… 10 6.count.FE -1.s -1 preliminary

32 19.05.2009 7th International Meeting on Front-End Electronics – Shunt-LDO Regulator Slide 32 Shunt Regulator (FE-I3 approach) Shunt regulator generates a constant output voltage out of the current supply current that is not drawn by the load is shunted by transistor M1 Very steep voltage to current characteristic Mismatch & process variation will lead to different Vref and Vout potentials Most of the shunt current will flow to the regulator with lowest Vout potential Potential risk of device break down at turn on Using an input series resistor reduces the slope of the voltage to current characteristic I=f(V) R SLOPE helps distributing the shunt current between the parallel placed regulators R SLOPE does not contribute to the regulation and consumes additional power without resistor Iin[mA] with resistor Vout[V] 0 0.5 1 1.5 2 750 500 250 0

33 19.05.2009 7th International Meeting on Front-End Electronics – Shunt-LDO Regulator LDO Regulator with Shunt Transistor (ShuLDO) Simplified Schematic Combination of LDO and shunt transistor M4 shunts the current not drawn by the load M4 shunts the current not drawn by the load Fraction of M1 current is mirrored & drained into M5 Fraction of M1 current is mirrored & drained into M5 Amplifier A2 & M3 improve mirroring accuracy Amplifier A2 & M3 improve mirroring accuracy Ref. current defined by resistor R3 & drained into M6 Ref. current defined by resistor R3 & drained into M6 Comparison of M5 and ref. current leads to constant current flow in M1 Comparison of M5 and ref. current leads to constant current flow in M1 Ref. current depends on voltage drop V Iin which again depends on supply current Iin Ref. current depends on voltage drop V Iin which again depends on supply current Iin Shunt-LDO“ regulators having completely different output voltages can be placed in parallel without any problem regarding mismatch & shunt current distribution „Shunt-LDO“ regulators having completely different output voltages can be placed in parallel without any problem regarding mismatch & shunt current distribution Resistor R3 mismatch will lead to some variation of shunt current (10-20%) Resistor R3 mismatch will lead to some variation of shunt current (10-20%) „Shunt-LDO“ can cope with an increased supply current if one FE-I4 does not contribute to shunt current e.g. disconnected wirebond  ref current goes up „Shunt-LDO“ can cope with an increased supply current if one FE-I4 does not contribute to shunt current e.g. disconnected wirebond  ref current goes up Can be used as an ordinary LDO when shunt is disabled Can be used as an ordinary LDO when shunt is disabled Slide 33

34 19.05.2009 7th International Meeting on Front-End Electronics – Shunt-LDO Regulator Slide 34 Parallel Regulator Operation 2 regulators placed in parallel with Vout1=1.2 and Vout2=1.5 Output voltages settle at different potentials Current flowing through the regulator stays the same  Vout=1.2  Vout=1.5 Simulation

35 01.07.2009 FE-I4 Design Collaboration Meeting - LVDS Slide 35 LVDS Driver Standard LVDS architecture for 320MHz clock rate and adapted to low supply voltage 1.5- 1.2V Standard LVDS architecture for 320MHz clock rate and adapted to low supply voltage 1.5- 1.2V Current is routed to/from the output by four switches M3-M6 Current is routed to/from the output by four switches M3-M6 Common-mode voltage is measured by a resistive divider and controlled by a common- mode feedback circuit Common-mode voltage is measured by a resistive divider and controlled by a common- mode feedback circuit Output current is switchable between 0.6 - 3mA Output current is switchable between 0.6 - 3mA Boni et al., IEEE JSSC VOL. 36 NO. 4, 2001

36 01.07.2009 FE-I4 Design Collaboration Meeting - LVDS Slide 36 LVDS Receiver Tyhach et al., Tyhach et al., A 90-nm FPGA I/O Buffer Design With 1.6-Gb/s Data Rate for Source-Synchronous System and 300- MHz Clock Rate for External Memory Interface, IEEE JSSC, Sept. 2005 parallel PMOS/NMOS comparator input stages allow operation in a wide range of common- mode voltages parallel PMOS/NMOS comparator input stages allow operation in a wide range of common- mode voltages cross coupled positive feedback structure allows introduction of hysteresis cross coupled positive feedback structure allows introduction of hysteresis second stage sums singals from both input stage and converts them to a CMOS output signal second stage sums singals from both input stage and converts them to a CMOS output signal

37 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 37 LVDS transciever For IBL and outer layers sLHC, need for a 320Mb.s -1 BW/ LVDS i/0. LVDS transciever IC irradiated up to ~180Mrad. No degradation observed. IBM 130nm 0.8mm 1.8mm 40MHz 160MHz 320MHz Clock-Rate Common Mode Voltage 1050mV 600mV 150mV TX output Chained RxTx output @ 320 MHz Clock tests with differential probe and 100 Ω on board term. @ 1.2V supply

38 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 38 Output data protocol 24-bit Record Word Acro -nym Field 1Field 2Field 3Field 4Field 5Comments Data Header DH11101001xxxx [3:0] trigID [7:0] bcID xxxx reserved for later use Data Record DR [6:0] Column [8:0]Row [3:0] ToTtop [3:0] ToTbot Column numbering: 0000001 to 1010000 Address Record AR11101010Type [14:0] Address Type 0: Global Register / Type 1: Shift Register Position Value Record VR11101100 [15:0] Value Value Record without previous Address Record allowed Service Record SR11101111 [15:0] Message Service Message (e.g. error codes) Empty Record ER00000000 Idle = K.28.1 commas (8b10b coding case) 8b10b frame: SOF (K.28.7), followed by 24 bits record word(s) and an EOF (K.28.5) + idle (K.28.1) Table: possible data words

39 Marlon Barbero, FE-I4 Chip Design, Vertex09, Putten, The Netherlands, Sep. 17 th 2009 39 Future Directions Still higher rate capability Need smaller, faster pixel Yet need more memory per pixel to buffer higher rate Two directions to explore: 3D and 65nm  FE-I4 region placed on 2 130nm tiers would have 60% pixel size and 50% more logic/memory Goal FE-I3 pixel -- 3.2 bits FE-I4 -- 25 bits ~35 bits


Download ppt "FE-I4 Chip Design Marlon Barbero, Bonn University Vertex 2009, Putten, The Netherlands Feb. 17 th 2009."

Similar presentations


Ads by Google