DeHon 2008 1 Devil's Advocate View: CMOL, FPNI, nanoPLA… André DeHon email@example.com Benjamin Gojman, Nikil Mehta During the canonization process of the Roman Catholic Church, the Promoter of the Faith (Latin Promotor Fidei), popularly known as the Devil's Advocate (Latin advocatus diaboli), was a canon lawyer appointed by the Church to argue against the canonization of the candidate. It was his job to take a skeptical view of the candidate's character, to look for holes in the evidence, to argue that any miracles attributed to the candidate were fraudulent, etc. -- Wikipedia
DeHon 2008 2 Case Molecules are not miraculous. Miracle of high density is exaggerated. Miracle of low energy is a sleight of hand. Curse of variation falls on all who would dare reach the atomic-scale.
DeHon 2008 3 Two Ideas Benefits follow from two hypotheses: 1. Can fabricate parallel wires denser than arbitrary topology 2. Can place resistance-varying switch with quasi-non-volatile state in space of dense wire crossing –Hysteretic switching –No extra area to program Valid Prospects? Let's build regular architectures around resistive switches!
DeHon 2008 4 Inquisition What problem does CMOL/FPNI solve? Is this the bottleneck to scaling?
DeHon 2008 5 Problem Solved? What problem do these technology hypotheses address? –Density –(Economical) density ASIC (Mgates/cm²): 140 180 220 280 360 450 711 2800 (ITRS 2007 Execsum Table 1i; assume 4 TR/gate)
DeHon 2008 6 Unpack Assumptions Previous table appears to assume –100,000 F² per gate in FPGA case: 250,000 F² per 4-LUT ÷ 2.5 gates per 4-LUT. Plausible, conservative. –64 Fcmos² per gate in CMOL case, assuming each buffer is a gate and a buffer is 64 F² –This assumption is stated in FPGA2006 paper. –Optimistically small …plausibly within factor of 2. Ignores that most of these buffers will act as route-through (provide no gates).
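The per-gate area assumptions above translate directly into gate densities. The following is a minimal sketch, assuming density is simply one gate per stated area with F taken as the feature size in nanometres; the function name and the choice of 45 nm as the evaluation point are illustrative and not from the slides.

```python
def gates_per_cm2(area_per_gate_f2: float, feature_nm: float) -> float:
    """Gate density when each gate occupies area_per_gate_f2 * F^2."""
    feature_cm = feature_nm * 1e-7  # 1 nm = 1e-7 cm
    return 1.0 / (area_per_gate_f2 * feature_cm ** 2)


# Per-gate areas quoted on the slide.
FPGA_F2_PER_GATE = 250_000 / 2.5  # 250,000 F^2 per 4-LUT, 2.5 gates per 4-LUT
CMOL_F2_PER_GATE = 64             # one 64 F^2 buffer counted as one gate

if __name__ == "__main__":
    f_nm = 45  # illustrative technology point (assumption)
    for name, area in (("FPGA", FPGA_F2_PER_GATE), ("CMOL", CMOL_F2_PER_GATE)):
        mgates = gates_per_cm2(area, f_nm) / 1e6
        print(f"{name}: {mgates:.1f} Mgates/cm^2 at F = {f_nm} nm")
```

At F = 45 nm this gives roughly 0.5 Mgates/cm² for the FPGA assumption and roughly 770 Mgates/cm² for the CMOL assumption, before any discount for buffers used as route-through.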
DeHon 2008 7 Right Problem? Is logic density of gates the bottleneck in scaling? –Economical logic density? –Density of programmable gates?
DeHon 2008 8 What is the Scaling Bottleneck? Density? Delay? Power Density? Reliability? Test and handling economics?
DeHon 2008 9 Methodology: Benchmark-Level Quantification For following, map Toronto 20 benchmarks –20 Largest MCNC benchmarks –Order of 10K gates each (so think small cores) Composite density/performance/energy –Includes overheads, route-through, fanout…
DeHon 2008 10 Density: Mapped Logic Strukov and Likharev FPGA2006 Only about 1 in 4 gates used as logic –775/4 ≈ 190, comparable to ASIC gate density
DeHon 2008 12 How much density from nanowires? Look at Fcmos = Fnano = 22 nm (Fcmos/Fnano largest) –42 Mgates/cm² –20× better than CMOS FPGA –5–20× worse than Fnano = 3 nm
Density (Mgates/cm²) by technology point:
Fcmos (nm): 50 45 40 36 32 30 28 26 24 22
Fnano (nm): 20 18 16 14 12 10 6 4 3.5 3
Conservative: 7 9 11 14 19 27 65 130 160 210
Extreme: 51 63 80 100 140 200 570 1300 1700 2300
CMOS FPGA: 0.4 0.5 0.6 0.8 … 2.1
CMOL revised: 160 190 250 300 375 425 500 575 675 800
CMOS ASIC: 140 180 220 280 360 450 710
DeHon 2008 13 Delay Challenge has been to turn capacity (area) into performance –Linear scaling considered excellent Something which is 10× denser –Better be less than 10× slower E.g. we expect 10 cores running at 100MHz to run slower than 1 core running at 1GHz If give up too much delay, no benefit.
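The break-even argument on this slide can be written as a one-line ratio. This is a sketch assuming perfectly parallel work, so that aggregate throughput scales with the number of units times their clock rate; the function name is illustrative.

```python
def throughput_ratio(density_gain: float, slowdown: float) -> float:
    """Aggregate throughput relative to the baseline, assuming ideal parallel scaling."""
    return density_gain / slowdown


if __name__ == "__main__":
    # 10 cores at 100 MHz versus 1 core at 1 GHz: break-even at best.
    print(throughput_ratio(density_gain=10, slowdown=10))
    # Giving up more delay than the density gained is a net loss.
    print(throughput_ratio(density_gain=10, slowdown=20))
```

Any real workload with serial sections or communication overhead falls below this ideal ratio, which is why the slide expects the 10-core case to come out slower in practice.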
DeHon 2008 14 Obtaining Performance Highly pipelined nanoPLA designs –Conservative (demonstrated tech.): R_on,xpoint = 100 KΩ; ρ_Si = 10⁻³ Ω-cm, ρ_NiSi = 10⁻⁵ Ω-cm; only NiSi non-active areas. Likharev only claims about 1 GHz (unpipelined) (Nanoarch2007).
Fcmos (nm): 50 45 40 36 32 30 28 26 24 22
Fnano (nm): 20 18 16 14 12 10 6 4 3.5 3
Delay (ns): 1.83 1.70 1.57 1.45 1.32 1.20 0.99 0.90 0.86 0.82
Density (Mgates/cm²):
Conservative: 7 9 11 14 19 27 65 130 160 210
Extreme: 51 63 80 100 140 200 570 1300 1700 2300
CMOS FPGA: 0.4 0.5 0.6 0.8 … 2.1
CMOL revised: 160 190 250 300 375 425 500 575 675 800
CMOS ASIC: 140 180 220 280 360 450 710
Pipe delay stages = 452
DeHon 2008 16 Power Density Clock rates stopped scaling due to power density. We can already fabricate more transistors than we can afford to activate. –Looking at gate capacitance alone (45 nm; highly optimistic, no wire): 6×10⁻¹⁷ J/Tr/op (Vdd = 1 V) × 700 MTr/cm² × 10 GHz = 420 W/cm² (3000 W/cm² at 22 nm, Vdd = 0.7 V)
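The 45 nm figure above is a straight product of energy per operation, device density, and clock rate. A minimal sketch of that arithmetic follows; the function names and the 100 W/cm² budget used in the second call are illustrative assumptions.

```python
def power_density_w_cm2(energy_j_per_op: float, devices_per_cm2: float,
                        freq_hz: float, activity: float = 1.0) -> float:
    """Power density if every device switches `activity` times per cycle."""
    return energy_j_per_op * devices_per_cm2 * freq_hz * activity


def max_freq_hz(budget_w_cm2: float, energy_j_per_op: float,
                devices_per_cm2: float, activity: float = 1.0) -> float:
    """Highest clock rate that stays within a power-density budget."""
    return budget_w_cm2 / (energy_j_per_op * devices_per_cm2 * activity)


if __name__ == "__main__":
    # Slide values: 6e-17 J/Tr/op, 700 MTr/cm^2, 10 GHz.
    print(power_density_w_cm2(6e-17, 700e6, 10e9))   # about 420 W/cm^2
    # Assumed budget of 100 W/cm^2 with every transistor active.
    print(max_freq_hz(100.0, 6e-17, 700e6) / 1e9)    # about 2.4 GHz
```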
DeHon 2008 17 Power Density: Quantitative What if we run them at full speed? CMOL dodge here is assuming Vdd=0.3V.
Fcmos (nm): 50 45 40 36 32 30 28 26 24 22
Fnano (nm): 20 18 16 14 12 10 6 4 3.5 3
Conservative delay (ns): 1.83 1.70 1.57 1.45 1.32 1.20 0.99 0.90 0.86 0.82
Conservative density (Mgates/cm²): 7 9 11 14 19 27 65 127 162 215
Conservative power, Vdd=0.7V (W/cm²): 12 14 17 20 24 29 46 60 68 78
Extreme delay (ns): 0.2… … 0.16 0.13 0.07 0.04 0.03
Extreme density (Mgates/cm²): 51 63 80 104 142 205 569 1280 1672 2276
Extreme power, Vdd=0.7V (W/cm²): 237 287 354 453 599 856 2428 5490 6816 8717
Extreme power, Vdd=0.3V (W/cm²): 44 53 65 83 110 157 446 1008 1252 1601
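The Vdd=0.3V row is consistent with rescaling the Vdd=0.7V row by the square of the voltage ratio. The sketch below checks that, assuming dynamic power scales as Vdd² at fixed capacitance and frequency; the list holds the extreme-case 0.7 V values as read from the slide's table, and the function name is illustrative.

```python
# Extreme-case power density at Vdd = 0.7 V (W/cm^2), as read from the slide.
POWER_AT_0V7 = [237, 287, 354, 453, 599, 856, 2428, 5490, 6816, 8717]


def rescale_power(p_w_cm2: float, v_from: float, v_to: float) -> float:
    """Rescale dynamic power assuming P is proportional to Vdd^2."""
    return p_w_cm2 * (v_to / v_from) ** 2


power_at_0v3 = [round(rescale_power(p, 0.7, 0.3)) for p in POWER_AT_0V7]

if __name__ == "__main__":
    print(power_at_0v3)
```

Dropping from 0.7 V to 0.3 V cuts power by a factor of about 5.4, which still leaves the densest design points far above 100 W/cm².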
DeHon 2008 18 Power Density: Quantitative What can we use at 100 W/cm²?
Fcmos (nm): 50 45 40 36 32 30 28 26 24 22
Fnano (nm): 20 18 16 14 12 10 6 4 3.5 3
Extreme delay (ns): 0.2… … 0.16 0.13 0.07 0.04 0.03
Extreme density (Mgates/cm²): 51 63 80 104 142 205 569 1280 1672 2276
Extreme power, 0.7V (W/cm²): 237 287 354 453 599 856 2428 5490 6816 8717
Capped at 0.7V (W/cm²): 100
Factor: 2.4 2.9 3.5 4.5 6.0 8.6 24.3 54.9 68.2 87.2
Capped density (Mgates/cm²): 22 22 23 23 24 24 23 23 25 26
Capped delay (ns): 0.7 0.8 0.9 1.0 …
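One way to read this slide is that the over-budget factor is the full-speed power divided by the 100 W/cm² cap, and that the cap can be met either by using that many fewer gates or by running that much slower. The sketch below applies that reading to the extreme-case values from the slide's table; the interpretation and all names are assumptions.

```python
BUDGET_W_CM2 = 100.0
# Extreme-case rows as read from the slide.
DENSITY_MGATES = [51, 63, 80, 104, 142, 205, 569, 1280, 1672, 2276]
POWER_0V7 = [237, 287, 354, 453, 599, 856, 2428, 5490, 6816, 8717]


def over_budget_factor(power_w_cm2: float, budget: float = BUDGET_W_CM2) -> float:
    """How far full-speed power exceeds the budget."""
    return power_w_cm2 / budget


factors = [over_budget_factor(p) for p in POWER_0V7]
# Option A: keep full speed, activate only 1/factor of the gates.
usable_density = [round(d / f) for d, f in zip(DENSITY_MGATES, factors)]
# Option B: keep all gates, stretch the cycle by the factor (0.16 ns example).
stretched_delay_ns = 0.16 * factors[4]

if __name__ == "__main__":
    print([round(f, 1) for f in factors])
    print(usable_density)
    print(round(stretched_delay_ns, 2))
```

Under this reading the usable density stays between 22 and 26 Mgates/cm² across the whole roadmap: the power cap, not the fabric, sets the limit.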
DeHon 2008 19 Energy per Gate Evaluation (CMOL)
Fcmos (nm): 50 45 40 36 32 30 28 26 24 22
Fnano (nm): 20 18 16 14 12 10 6 4 3.5 3
Cwire (fF): 0.32 0.2… … 0.20 0.26 0.32 0.31 0.30
Egate at 0.3 V (fJ): 0.17 0.15 0.13 0.1… … 0.18 0.17 0.16
Egate(0.3 V)/kT ln(2), in thousands: 59.5 52.3 45.3 40.9 36.6 37.3 49.8 61.0 58.5 56.4
40,000–60,000 kT ln(2) per gate at T=300K
Cg,total (FO4) = 0.18 fF for 22 nm CMOS, W = 2 Fcmos: Vdd=0.65 V gives 13,000 kT ln(2) at T=300K; Vdd=0.3 V gives 2,800 kT ln(2)
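The CMOS comparison figures can be reproduced by expressing a switching energy in units of kT ln(2). The sketch below assumes the switching energy is ½·C·Vdd², which is the form that matches the 13,000 and 2,800 figures quoted on the slide; the function names are illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def kt_ln2_joules(temp_k: float = 300.0) -> float:
    """kT ln(2) at the given temperature."""
    return K_B * temp_k * math.log(2)


def switching_energy_in_kt_ln2(cap_f: float, vdd: float, temp_k: float = 300.0) -> float:
    """Assumed switching energy 0.5*C*Vdd^2, in multiples of kT ln(2)."""
    return 0.5 * cap_f * vdd ** 2 / kt_ln2_joules(temp_k)


def kt_ln2_to_fj(multiples: float, temp_k: float = 300.0) -> float:
    """Convert a multiple of kT ln(2) to femtojoules."""
    return multiples * kt_ln2_joules(temp_k) * 1e15


if __name__ == "__main__":
    print(round(switching_energy_in_kt_ln2(0.18e-15, 0.65)))  # about 13,000
    print(round(switching_energy_in_kt_ln2(0.18e-15, 0.30)))  # about 2,800
    print(round(kt_ln2_to_fj(59.5e3), 2))                     # about 0.17 fJ
```

The last call converts the first entry of the CMOL row (59.5 thousand kT ln(2)) back to femtojoules, which agrees with the 0.17 fJ listed for Egate at 0.3 V.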
DeHon 2008 20 Reliability: Can we lower the voltage? Lower voltage + Lower energy/op –Less headroom for Vt variation: more leakage, lower performance; more bad parts, compensate with sparing? Subthreshold operation: trade energy for performance –Fewer electrons defining state: higher susceptibility to transient upset –Thermal, shot, ionizing particles.
DeHon 2008 21 Upset Rates Lower voltage to achieve 100 W/cm² –Assume 10% activity –V = 176 mV (1 GHz, 22 nm, 3 Ggates/cm²) 1 cm² FIT rates –Thermal 10⁻⁶²³³ [calc. based on Kish PhysLetA 2002] –Shot 10⁻⁷⁰⁰ [calc. based on Kish FNL2004] Increase in upset rate, V = 700 mV to 176 mV –Ionizing particle upsets increase 20–100× [calc. based on Cohen IEDM1999, Degalahal ISQED2004] –Lack information for absolute grounding. Suggestions for better sources for reliability calculations appreciated.
DeHon 2008 22 Variation and Yield Are voltages plausible given variation? ASIC: optimistic bound –Require devices have 0 < Vth < Vdd –Vth(1−kσ) > 0 and Vth(1+kσ) < Vdd –Say Vth = Vdd/2 –For 3 Ggates, k ≈ 6–7, so σ < 14% With ability to avoid gates –Let valid range be ±1σ: 68% of devices –Good buffer 46% of time, so density impact ~2× –Tolerates much larger variation
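The numbers on this slide follow from a normal model of threshold variation. The sketch below assumes Vth is Gaussian with relative standard deviation σ, picks k so that about one device in the whole population is expected to fall outside ±kσ, and treats a buffer as good only when two independent devices are both within ±1σ. These modelling choices and all names are assumptions used to reproduce the slide's figures.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def k_for_population(n_devices: float) -> float:
    """k such that the expected count of devices outside +/- k sigma is one."""
    return -STD_NORMAL.inv_cdf(1.0 / (2.0 * n_devices))


def fraction_within(k: float) -> float:
    """Fraction of devices inside +/- k sigma."""
    return STD_NORMAL.cdf(k) - STD_NORMAL.cdf(-k)


k_all_good = k_for_population(3e9)   # ASIC case: every one of 3 Ggates must work
max_sigma = 1.0 / k_all_good         # from Vth(1 + k*sigma) < Vdd with Vth = Vdd/2
device_ok = fraction_within(1.0)     # with repair: accept only +/- 1 sigma devices
buffer_ok = device_ok ** 2           # assumed: two good devices per buffer

if __name__ == "__main__":
    print(f"k = {k_all_good:.2f}, sigma below {max_sigma:.1%}")
    print(f"device yield {device_ok:.0%}, buffer yield {buffer_ok:.1%}, "
          f"density impact {1 / buffer_ok:.1f}x")
```

With Vth = Vdd/2 both inequalities reduce to kσ < 1, so the slide's 14% corresponds to the k = 7 end of the range (1/7 ≈ 14%).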
DeHon 2008 23 Testing and Handling Highly defective nanoPLA/CMOL/FPNI exploit component-specific mapping to tolerate defects. Demands painful paradigm shift. Assume can run mapping in 4 hrs on 250 W workstation –1 kWh/chip × $0.15/kWh = $0.15 –(2000 wafers/day × 675 dies/wafer) / 6 = 225,000 workstations »But those live at customer site… not to mention handling… Penn IC Group have ideas to address.
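The cost estimate above is simple enough to restate as code. This is a sketch using only the slide's figures; the divisor 6 is taken to be the number of 4-hour mapping runs one workstation completes per day, and the function names are illustrative.

```python
def mapping_cost_usd(hours: float, watts: float, usd_per_kwh: float) -> float:
    """Electricity cost of one per-chip mapping run."""
    return hours * watts / 1000.0 * usd_per_kwh


def workstations_needed(wafers_per_day: float, dies_per_wafer: float,
                        hours_per_die: float) -> float:
    """Workstations required to keep pace with fab output."""
    dies_per_day = wafers_per_day * dies_per_wafer
    runs_per_workstation_per_day = 24.0 / hours_per_die
    return dies_per_day / runs_per_workstation_per_day


if __name__ == "__main__":
    print(mapping_cost_usd(hours=4, watts=250, usd_per_kwh=0.15))
    print(workstations_needed(wafers_per_day=2000, dies_per_wafer=675, hours_per_die=4))
```

The energy cost per chip is small; the scale problem is the number of machines needed to map 1.35 million dies per day.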
DeHon 2008 24 Bottleneck Conclusion Work in an E-D-A-Reliability trade space Density is not the clear limiter Big hope is to trade this density to address other problems –Power density –Energy –Variation –Reliability
DeHon 2008 25 Additional Assumptions By Style CMOL –Pins above metallization FPNI –Nanoscale alignment of lithographic contacts Not just parallel lines Kuekes says litho rotated (7/12) nanoPLA –Relatively reliable assembly of large number of NWs –Reasonably controlled production of doped (coded) NWs
DeHon 2008 26 Inquisition Report If believed could achieve roadmap –CMOS ASICs provide density @ higher performance If need fine-grained programmability –Variation –Economics force few unique platforms …benefit from inexpensive programmability –100–400× density benefit –Plausible performance (as far as energy allows) Maybe 1 GHz instead of 10 GHz (1/10th the speed) –Reduce energy through sparing/repair to contain variation Will cost post-fabrication handling
DeHon 2008 27 Summing Up Molecules are not miraculous. Miracle of high density is exaggerated. –Non-existent compared to ASIC –Closer to 2 orders of magnitude than 3 for FPGA Miracle of low energy is a sleight of hand. –Comes with a curse on reliability. Curse of variation falls on all who would dare reach the atomic-scale. –…grace of repair may be all that saves us –Not unique to CMOL Small switches may help.
DeHon 2008 28 References nanoPLA articles: http://www.seas.upenn.edu/~andre/sublithographic.html; Likharev, Hybrid CMOS/Nanoelectronic Circuits (CMOL, FPNI, etc.), White Paper for ITRS ERD Working Group, 2008; Strukov and Likharev, FPGA 2006; Likharev and Strukov, Nanoarch 2007
DeHon 2008 34 Langmuir-Blodgett (LB) transfer Can transfer tight-packed, aligned SiNWs onto surface –Maybe grow sacrificial outer radius, close pack, and etch away to control spacing + Transfer aligned NWs to patterned substrate Transfer second layer at right angle Whang, Nano Letters 2003 v7n3p951