SiGe HBT BiCMOS Field Programmable Gate Arrays for Fast Reconfigurable Computing Bryan S. Goda Rensselaer Polytechnic Institute Troy, New York.


1 SiGe HBT BiCMOS Field Programmable Gate Arrays for Fast Reconfigurable Computing Bryan S. Goda Rensselaer Polytechnic Institute Troy, New York

2 Agenda Introduction BiCMOS FPGA History SiGe HBT BiCMOS Process Current Mode Logic Xilinx 6200 FPGA Design Configuration Memory Performance Results Conclusions and Future Work

3 Current Role of SiGe: “More Zip per Chip”
- Wireless Phones -> Watch-Sized Phone
- Direct Broadcast Satellite
- Fiber-Optic Lines, Switches, and Routers

4 Programmable Bipolar Logic
1983: Fairchild ECL Field Programmable Logic Array
- Fuse Based
- 4 ns Cycle Rate
- High Power
- Scaling Problems
1990: Algotronix 1.2 um 256 Cell Configurable Logic Array
- f_T 6 GHz, 200 ps Gate Delay
- 4 Transistor Static RAM Memory Cells
- ASIC Emulation and Signal Processing
- Forerunner of XC6200

5 CMOS Switchable 2 Input Multiplexer (US Patent)
[Figure: schematic with labels V+, V-, V_ref, input a, enables EN1 and EN2, outputs Y1 and Y2]

6 SiGe Heterojunction Bipolar Transistor
- Selectively introduce Ge into the base of a Si BJT
- Smaller base bandgap increases e- injection, giving higher Beta (100)
- Higher Beta allows a more heavily doped base, lowering R_B (125 Ohm)
- Graded bandgap decreases base transit time, raising f_T

8 SiGe HBT
- 50 GHz process, 100 GHz process within a year (30 uA at 50 GHz)
- 5 layers of metal
- Used in RPI VLSI class
- Co-integrated with CMOS process
  - can have HBT logic with CMOS memory
  - low power and high speed

9 f_T Curves for Various Emitter Lengths

10 SiGe HBT Layout
[Figure labels: Base, Emitter, Collector, Sub-Collector]

11 Band Diagram
[Figure: E_C and E_V across the n+ Si emitter, p-SiGe base (with p-Si and Ge labels), and n- Si collector; e- and h+ flow; drift field in the base]
E_g,Ge(grade) = E_g,Ge(x = W_b) - E_g,Ge(x = 0)
= 0.031 eV
Dielectric Constant: Si = 11.7, Ge = 16.2, SiGe (7.5% Ge) = 12.03
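The SiGe dielectric constant quoted on the band diagram slide is close to what a straight-line interpolation between the Si and Ge values gives. A minimal sketch, assuming linear interpolation in Ge fraction; the slide does not say how 12.03 was obtained, and `sige_permittivity` is a hypothetical helper:

```python
EPS_SI = 11.7  # relative permittivity of Si (from the slide)
EPS_GE = 16.2  # relative permittivity of Ge (from the slide)


def sige_permittivity(ge_fraction: float) -> float:
    """Linearly interpolate the relative permittivity of Si(1-x)Ge(x).

    Assumed model for illustration; not stated in the slides.
    """
    return EPS_SI + ge_fraction * (EPS_GE - EPS_SI)


# 7.5% Ge, as on the slide
print(round(sige_permittivity(0.075), 3))
```

Under this assumption the result lands within 0.01 of the slide's 12.03.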

12 CML Branch Current vs. Differential DC Voltage

13 IBM SiGe and CMOS Load Gate Delays on M1, M2, LM

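14 Current Steering Logic
[Figure: logic levels between V_cc (0 V) and V_ee (4.5 V); Level 1 spans 0 V to -250 mV, one level boundary sits at -1.2 V, remaining level voltages not recoverable]
- Fastest Logic Level: Limited Drive Capability
- Inter-block Signal Level: Good Fan-Out (10)
- Clock Signal: Slowest Level
- Level 4 Possible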

15 Current Steering Logic in SiGe
- 13 ps Transistor Switching Time (75 GHz)
  - 6 ps Process Next Year
- Small Voltage Swings (250 mV) vs 3.3 or 5 V
  - Less Power
  - Smaller Swing = Faster
- “Steer” Currents, Use Differential Logic
  - Less Switch Noise
- Fewer Transistors Needed, Complement Signal Present
- Flip-Flops and Multiplexers Easy to Implement

16 CML XOR Logic Schematic
[Figure: two-level current tree between V_cc (0 V) and V_ee (-4.5 V) with V_ref; input A on level 1, input B on level 2; differential output A XOR B]
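The XOR current tree on slide 16 can be followed with a small behavioral model. This is a sketch under assumptions, not the actual circuit: it treats the level-2 pair (input B) as steering the tail current to one of two level-1 pairs (input A), one of which is cross-wired, and it uses the 250 mV swing from the earlier slide. The function names and the voltage convention (relative to V_cc) are hypothetical.

```python
SWING_V = 0.250  # differential swing quoted in the slides (250 mV)


def cml_xor(a: bool, b: bool):
    """Return (out, out_bar) in volts relative to V_cc for a two-level current tree."""
    # Level 2 (lower differential pair, input B) picks which level-1 pair gets the tail current.
    if not b:
        # This level-1 pair passes A straight through.
        current_in_out_bar_load = a
    else:
        # This level-1 pair is cross-wired, so it inverts A.
        current_in_out_bar_load = not a
    # The load resistor carrying the current drops by one swing; the other stays at V_cc.
    out = 0.0 if current_in_out_bar_load else -SWING_V
    out_bar = -SWING_V if current_in_out_bar_load else 0.0
    return out, out_bar


def as_bit(out: float, out_bar: float) -> bool:
    """Read a differential pair as a logic value."""
    return out > out_bar


for a in (False, True):
    for b in (False, True):
        print(a, b, as_bit(*cml_xor(a, b)))
```

Because the current always flows in exactly one load, both the output and its complement are available at every gate, which is the property the slides rely on to avoid separate inverters.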

17 General FPGA Structure
[Figure labels: Logic Cell, I/O Cell, Routing Network, Configuration Memory]

18 High Speed FPGA Applications Real Time Image Processing - Radar - Pattern Recognition Digital Networks - Mobile Subscriber Equipment - Command Information Systems - High Speed Switching Nodes Control Systems - Guidance Systems - Reprogrammable Survivability Satellite Systems

19 Image Correlation
[Figure: Desired Image and Search Image]
1. Desired image is programmed into chip (1 pixel = 1 CLB)
2. Load a section of search image
3. If enough pixels match, then turn found bit on
4. Load another section, or reprogram with new desired image
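The four image correlation steps can be sketched in software. This only illustrates the matching rule, not the chip's implementation: `correlate`, the list-of-lists image format, and the `threshold` parameter are assumptions, and each pixel comparison stands in for one CLB holding one pixel of the desired image.

```python
def correlate(desired, search, threshold):
    """Slide `desired` over `search` and return the top-left (row, col) of every
    section where at least `threshold` pixels match (the "found bit" turns on)."""
    dh, dw = len(desired), len(desired[0])
    sh, sw = len(search), len(search[0])
    hits = []
    for r in range(sh - dh + 1):          # step 2 and 4: load each section in turn
        for c in range(sw - dw + 1):
            matches = sum(
                desired[i][j] == search[r + i][c + j]
                for i in range(dh)
                for j in range(dw)
            )
            if matches >= threshold:      # step 3: enough pixels match
                hits.append((r, c))
    return hits


desired = [[1, 0],
           [0, 1]]
search = [[0, 0, 0],
          [0, 1, 0],
          [0, 0, 1]]
print(correlate(desired, search, threshold=4))
```

Lowering `threshold` below the pixel count tolerates noisy pixels, which is what "if enough pixels match" suggests.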

20 Samples From XC6200 CAD Tools
[Figure labels: CLBs, IO Blocks, Pins]

21 FPGA Drawbacks
- Slowdown
  - 200 MHz Internal Speed down to MHz External
  - Pass Transistor = Low Pass Filter
- Limited Bandwidth
- Relatively Long Configuration Times (Seconds)
- Vendor Guarded Information
- More Expensive than Comparable ASIC

22 Pass Transistor Interconnect Modeling
[Figure: equivalent circuit from node 3 to node 2; labels: On Interconnect, Pass Transistor (Memory)]

23 Field Programmable Gate Arrays (FPGA)
Hierarchy Level Organization (Sea of Gates)
- Simple Cells (Configurable Logic Blocks)
- 4x4, 16x16, 64x64 groupings
- Hierarchy of routing resources at each level
- I/O Blocks (external interface)

24 Design Parameters
Logic Swing Levels
- Based on Differential Pair Switching
- Current Levels
Redesign of the Configurable Logic Block
- Take Advantage of Differential Wiring
- What Parts Can Be Turned Off if Not Used?
Supply Levels
- How Many Levels of Logic?
Routing Resources
CMOS Voltage Levels
- Integrate CMOS into Bipolar Current Tree

25 Current Tree with CMOS Routing

26 Bipolar vs Bipolar/CMOS Current Trees
[Figure: waveforms at pulse widths of 50 ps, 60 ps, 70 ps, and 100 ps; traces labeled CMOS and Bipolar]

27 4:1 Multiplexer
[Figure: schematic with Level 1 inputs and outputs, Level 2 inputs, and Level 3 inputs; CMOS version W/L 5:1]

28 Sample Logic Using Multiplexers
A AND B: X1 := a, X2 := b, X3 := a
- If a=1 then select Y2, output = b
- If a=0 then select Y3, output = 0
A OR B: X1 := a, X2 := a, X3 := b
- If a=1 then select Y2, output = 1
- If a=0 then select Y3, output = b
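The multiplexer rules on slide 28 can be checked against the AND and OR truth tables. A minimal sketch, assuming X1 is the select input, Y2 carries X2, and Y3 carries X3; the function names are illustrative only.

```python
def mux2(x1, x2, x3):
    """2:1 multiplexer: X1 = 1 selects Y2 (X2), X1 = 0 selects Y3 (X3)."""
    return x2 if x1 else x3


def and_gate(a, b):
    # a = 1 -> output b; a = 0 -> output a, which is 0
    return mux2(a, b, a)


def or_gate(a, b):
    # a = 1 -> output a, which is 1; a = 0 -> output b
    return mux2(a, a, b)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_gate(a, b), or_gate(a, b))
```

Both functions use the same multiplexer and differ only in which signals are wired to X2 and X3, which is why a small multiplexer-based cell is enough to serve as a configurable logic block.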

29 Redesign of XC6200 Logic
[Figure: two multiplexer configurations with X1 := a, X2 := b, X3 := a and outputs Y2, Y3; one labeled Non-Inverted Output, one labeled Inverted Output]
Original XC6200 Design
- Have to Track Inversions
Revised Design
- Use Differential Pair Logic
- Eliminate XC6200 Fast Logic
- No Inversion Tracking

30 Switchable Bipolar with CMOS Routing
[Figure: Original XC6200 Architecture and Redesigned Architecture; each shows labels X1, X2, X3, Y2, Y, D, Q, Clk, Clr, F, C, S, RP Multiplexer, CS Multiplexer]

31 10 GHz Three CLB Simulation

32 CLB Layout
[Figure labels: 4:1 Mux, High Speed Logic, 2:1 Mux, CMOS Control, Buffer, 4:1 Mux (off switchable), CMOS Control, Master/Slave Latch (off switchable), (off switchable)]

33 Sample CLB Test Circuit
[Figure labels: CLB, 8:1 Mux, Pad Drivers, 8/1 Divide, Buffer, Vref]

34 Actual Fabricated Test Circuit Pads (110u x 110u)

35 Incoming CLB Routing and Outgoing CLB Routing
[Figure: CLB with inputs X1, X2, X3 and output F; routing signals N, S, E, W, N4, S4, E4, W4]

36 4x4 Block Boundary Routing
[Figure: S, E, N, and W switches at the block boundary]
- Local Routing
- Magic Routing
- Length 4 FastLane (4x4)
- Length 16 FastLane (16x16)
- Chip Length FastLane (64x64)

37 Local CLB Routing
[Figure: CLB with inputs X1, X2, X3 and output F; routing signals N, S, E, W, N4, S4, E4, W4]
Nearest Neighbor Routing: Output (F) or Local Through
- Wout selects from N, S, W, F
- Sout selects from S, E, W, F
- Eout selects from N, S, E, F
- Nout selects from N, E, W, F
Example: Route East Signal Through to Next CLB
Note: Can't Route Signal Back to Origin at this Level
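The nearest-neighbor selections on slide 37 can be written down as a lookup table, which makes the "can't route back to origin" note easy to verify. A sketch under one assumption: each signal name is read as a direction of travel, so an E signal arrives from the west neighbor and leaving on Wout would send it back where it came from. The dictionary and helper names are hypothetical.

```python
# Sources each nearest-neighbor output multiplexer can select, as listed on the slide.
OUTPUT_SOURCES = {
    "Nout": {"N", "E", "W", "F"},
    "Sout": {"S", "E", "W", "F"},
    "Eout": {"N", "S", "E", "F"},
    "Wout": {"N", "S", "W", "F"},
}

OPPOSITE = {"N": "S", "S": "N", "E": "W", "W": "E"}


def can_route(source: str, output: str) -> bool:
    """True if `output` (for example "Eout") can select `source` (N, S, E, W, or F)."""
    return source in OUTPUT_SOURCES[output]


# The slide's example: pass an East signal through to the next CLB.
print(can_route("E", "Eout"))
# Sending a West-travelling signal out on Eout would return it to its origin.
print(can_route("W", "Eout"))
```

Every output can still pick the CLB's own function output F, so a cell can either compute or act as a pass-through in each direction.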

38 Normal CMOS Memory-CML Interface
[Figure labels: New Configuration Data, SRAM Bits In Memory Planes, decode, CMOS to CML Buffer, CLB Multiplexer Inputs, V_EE, V_SS, V_REF]

39 Memory Design
[Figure labels: D Latch M/S (40 Transistors), D Latch M/S (18 Transistors), RAM Cell (6 Transistors), D, Q, Clock, CLK, Data Word Out, Parallel Load]

40 3-D Chip Stacking
[Figure labels: Memory Planes, CLBs]
- Shorter Wires
- More CLBs/Area
- Optimize Memory

41 CLB with Routing and RAM (2)
[Figure labels: MUX, CLB, MUX Selects, CLB Select, RAM1, RAM2]

42 Layout of Configurable Logic Block with 2 Sets of RAM
[Figure labels: RAM, 2:1 Mux, 8:1 Mux (routing), CMOS Selects, CLB (logic), Master/Slave Latch (memory)]
Circuit Elements:
- 240 nfets
- 122 pfets
- 36 resistors
- 98 npn1 HBTs
- 16 npnhb1 HBTs

43 SiGe Performance
Circuit Type        Propagation Delay
Buffer              17 ps
CML XOR, AND, OR    22-25 ps
MUX XOR, AND, OR    23-26 ps
CLB                 100 ps

Power Decreasing Ideas
Date      Idea                                                     Power Consumption/CLB
Dec 98    Original CLB                                             73 mW
June 99   CLB Redesign I                                           34 mW
Aug 99    CLB Redesign II                                          24 mW
Dec 99    Widlar Current Mirror with CMOS Control, CMOS Routing    10.8 mW
Mar 00    Supply Voltage 4.5 -> 3.3 V                              7 mW
Dec 00*   7HP Process                                              0.3 mW
* Projected Power Levels for 7HP Process: At 50 GHz, 30 uA, 20x+ reduction in power
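The power figures on slide 43 can be turned into reduction factors to see where the "20x+" claim comes from. A small worked calculation using only the slide's numbers; the list layout and the helper name `reduction` are illustrative.

```python
# (date, idea, power per CLB in mW), copied from the slide
POWER_STEPS = [
    ("Dec 98", "Original CLB", 73.0),
    ("June 99", "CLB Redesign I", 34.0),
    ("Aug 99", "CLB Redesign II", 24.0),
    ("Dec 99", "Widlar Current Mirror with CMOS Control, CMOS Routing", 10.8),
    ("Mar 00", "Supply Voltage 4.5 -> 3.3 V", 7.0),
    ("Dec 00", "7HP Process (projected)", 0.3),
]


def reduction(before_mw: float, after_mw: float) -> float:
    """How many times smaller `after_mw` is than `before_mw`."""
    return before_mw / after_mw


baseline = POWER_STEPS[0][2]
for date, idea, power in POWER_STEPS:
    print(f"{date:8s} {power:5.1f} mW  {reduction(baseline, power):6.1f}x vs original")

# Step from the 3.3 V design to the projected 7HP process
print(round(reduction(7.0, 0.3), 1))
```

The last step alone, 7 mW to 0.3 mW, is a bit over 23x, consistent with the projected "20x+" figure, while the design changes before it account for roughly a 10x reduction.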

44 Multiplexer Performance vs Temperature
[Figure: curves for normal 250 mV swing and 200 mV minimum swing]

45 Widlar Current Mirror with CMOS Control
[Figure labels: Vcc, Vref, Vee, Input]

46 XC6200 Design Improvements
- Developed at the University of Scotland
- Inversion of Signal at Every CLB
  - Taken care of due to differential pair wiring
- No Pass Transistors, Use Multiplexers for Routing
- Able to turn off unused parts with CMOS controlled current mirror
- No CMOS-CML conversion circuits needed, CMOS in current trees
- Handcrafted, dense layouts
- Context Switching

47 Power Delay Product
[Figure: PDP in uW/gate/MHz (log scale) vs year; series: PDP BiCMOS, PDP CMOS High, PDP CMOS Low; points labeled 5HP, 7HP, 8HP]

48 Bit Line Twisting
[Figure: twisted line pairs labeled A, B, C, showing a slow transition and a fast transition]
- Data Dependent Switching Could Vary Signals Up to 30%
- Setup Time Violations
- Differential Logic has Complement Switching in Opposite Direction

49 Future Work Testing Overall FPGA Architecture Scaling Integrate with Other Systems Projected Graduation May 2001, work to continue at USMA Power Reduction - 7HP Process

50 CLB Context Switch Example
[Figure: two input patterns (pulse widths in ps not recoverable, ~7.1 GHz), a Select signal, and a CLB output labeled AND, OR, AND, OR, AND, OR]
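The context switch idea, two stored configurations per CLB with a select line choosing between them, can be modeled in a few lines. This is a behavioral sketch only: the class name, the string function codes, and the two-entry tuple are hypothetical stand-ins for the 2 sets of configuration RAM described in the slides.

```python
# Logic functions a CLB could be configured to perform (illustrative subset).
FUNCTIONS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
}


class ContextSwitchedCLB:
    """A CLB with two configuration RAMs and a select bit that picks the active one."""

    def __init__(self, ram1: str, ram2: str):
        self.rams = (ram1, ram2)

    def evaluate(self, select: int, a: int, b: int) -> int:
        # No reprogramming happens here: the select bit only chooses which
        # already-loaded configuration drives the logic.
        return FUNCTIONS[self.rams[select]](a, b)


clb = ContextSwitchedCLB("AND", "OR")
# Toggle the select line while holding the inputs at a=1, b=0.
print([clb.evaluate(select, 1, 0) for select in (0, 1, 0, 1)])
```

The point of keeping both configurations resident is that switching function costs one multiplexer delay instead of a full configuration load.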

51 Redesigned CLB Cell with Routing and Memory (2x)
[Figure labels: 2x24 Bit RAM, Three 8-1 Input Mux, CLB, Four 4-1 Output Mux, M1, M2, M3, M4]

52 CLB Row 4x1 Switch
[Figure labels: Memory Bus Lines, N/S Input, Output]
Circuit Elements:
- 1520 Nfets
- 792 Pfets
- 260 Resistors
- 140 NPN1 HB
- 576 NPN1

53 XC6200 Device Family
Device        XC6209   XC6216   XC6236   XC6264
Gate Count    9-13K    16-24K   36-55K   K
Number Cells
I/O Blocks
Row x Col     48x48    64x64    96x96    128x128

54 Typical Routing Delays
Symbol    Parameter                    XC6200   SiGe Redesign
T_NN      Route Nearest Neighbor       1 ns     23 ps
T_magic   Route X2/X3 to Magic Out     1.5 ns   47 ps
T_L4      Length 4 FastLane            1.5 ns   47 ps
T_L16     Length 16 FastLane           2 ns     70 ps
T_CL64    Chip-Length (64) Delay       3 ns     94 ps
~31x improvement
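The "~31x improvement" on slide 54 can be reproduced from the listed delays. A short worked calculation; the dictionary keys follow the slide's symbols and `speedup` is an illustrative helper.

```python
# symbol: (XC6200 delay in ps, SiGe redesign delay in ps), from the slide
DELAYS_PS = {
    "T_NN": (1000, 23),
    "T_magic": (1500, 47),
    "T_L4": (1500, 47),
    "T_L16": (2000, 70),
    "T_CL64": (3000, 94),
}


def speedup(symbol: str) -> float:
    """Ratio of the XC6200 routing delay to the SiGe redesign delay."""
    xc6200_ps, sige_ps = DELAYS_PS[symbol]
    return xc6200_ps / sige_ps


for symbol in DELAYS_PS:
    print(f"{symbol:8s} {speedup(symbol):5.1f}x")
```

The ratios range from about 29x to about 43x; the magic, length-4, and chip-length routes all come out near 32x, which matches the slide's rounded figure.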

55 4x4 CLB Layout Cell
- Largest Basic Block
- Over 13,000 Transistors
- Commercial Product Size is a 4x4 Array of this Cell

58 Example High Speed Switch of 2 Incoming Signals
[Figure labels: Pattern, Pattern, Switch Point]

60 5 Stage Ring Oscillator
Condition     Frequency   Speed Relative to Schematic   Current
Schematic     6.36 GHz    --                            8.4 mA
Parasitics    5.71 GHz    89%                           8.6 mA
50 °C         5.26 GHz    82%                           8.85 mA
75 °C         4.87 GHz    76%                           9.1 mA
100 °C        4.16 GHz    65%                           9.34 mA
125 °C        3.12 GHz    49%                           9.5 mA
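The ring oscillator numbers on slide 60 can be related to a per-stage delay. A sketch assuming the usual ring-oscillator relation f = 1 / (2 N t_d), which the slide does not state, and assuming the "relative to schematic" column is truncated rather than rounded; the helper names are hypothetical.

```python
STAGES = 5
F_SCHEMATIC_GHZ = 6.36

# Frequencies from the slide, in GHz
FREQ_GHZ = {
    "Parasitics": 5.71,
    "50 C": 5.26,
    "75 C": 4.87,
    "100 C": 4.16,
    "125 C": 3.12,
}


def stage_delay_ps(freq_ghz: float, stages: int = STAGES) -> float:
    """Per-stage delay in ps, from f = 1 / (2 * N * t_d)."""
    return 1e3 / (2 * stages * freq_ghz)


def relative_speed_percent(freq_ghz: float) -> int:
    """Speed relative to the schematic simulation, truncated to a whole percent."""
    return int(100 * freq_ghz / F_SCHEMATIC_GHZ)


print(round(stage_delay_ps(F_SCHEMATIC_GHZ), 2), "ps per stage (schematic)")
for condition, f in FREQ_GHZ.items():
    print(condition, relative_speed_percent(f), "%", round(stage_delay_ps(f), 1), "ps")
```

Under these assumptions the schematic frequency corresponds to a little under 16 ps per stage, in the same range as the 17 ps buffer delay quoted on the performance slide.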

63 BiCMOS and CMOS Characteristics
Technology    Size, V threshold          Effective Size, Vdd      PDP Level (uW/gate/MHz)
1998 CMOS     Ldrawn=0.5u, Vth=0.87V     Leff=0.36u, Vdd=3.3V     Hi=0.36, Low=
CMOS          Ldrawn=0.25u, Vth=0.5V     Leff=0.18u, Vdd=2.5V     Hi=0.18, Low=
CMOS          Ldrawn=0.22u, Vth=0.4V     Leff=0.12u, Vdd=1.8V     Hi=0.1, Low=
BiCMOS 5HP    Vbe=0.85V                  Vdd=4.5V
BiCMOS 7HP    TBD                                                 0.01

