# Chapter 12: Reliability and Failure Analysis S-69.4123 Postgraduate Course in Electron Physics I P Alexander Perros 02.11.2012.


## Outline

1. Introduction
2. Failure Times and Acceleration Factors
3. Distribution Functions
   1. Cumulative Distribution Function
   2. Probability Density Function
4. Reliability Concerns
   1. Electromigration (in depth)
   2. Hot carriers (in depth)
   3. Briefly
      1. Gate Oxide Integrity
      2. Negative Bias Temperature Instability
      3. Stress-induced leakage current
      4. Electrostatic discharge
5. Failure Analysis and Characterization Techniques
   1. Quiescent Drain Current ($I_{DDQ}$)
   2. Emission Microscopy
6. Overview of other Techniques

## Introduction

Reliability: the probability of operating a product for a given time under specified conditions without failure. [1]

Reliability is a probability, i.e. a measure of the confidence that an event will occur, based on empirical evidence.

- In principle, if all conditions were known, the outcome would be fully determined and there would be no need for a probability.

## Introduction

The time in the definition of reliability [1] is arbitrary.

- It starts at 0.
- As time → ∞, the probability that the event (here, failure) has occurred approaches 100%.

Reliability can also be specified in units other than time and can be independent of time.

## Introduction

The conditions in the definition of reliability [1] are the fixed, chosen environment the product is subjected to, and they play a critical role in its failure.

- Reliability of Formula 1 tires in different weather conditions
- A sensor in deep space vs. normal terrestrial applications (see cosmic rays)

## Introduction

Failure, in the definition of reliability [1], means that the product does not meet an intended function at a certain performance level, which is defined based on the application and other considerations (cost etc.).

- Failure can range from complete inoperability to a fully functioning product with an unacceptable performance deviation.

## Failure Times and Acceleration Factors

Failure times are summarized by two averages taken over $n$ products:

- mean time to failure (MTTF)
- mean time between failures (MTBF)

Here $n$ is the number of products and $t_n$ is the operating time of product $n$ up to its failure.
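As a minimal sketch, assuming MTTF is taken as the arithmetic mean of the observed failure times, $(t_1 + t_2 + \dots + t_n)/n$, the calculation looks as follows. The function name and the five failure times are made up for illustration.

```python
def mean_time_to_failure(failure_times_h):
    """Arithmetic mean of the observed times to failure, in hours."""
    if not failure_times_h:
        raise ValueError("need at least one failure time")
    return sum(failure_times_h) / len(failure_times_h)


# Hypothetical failure times of five sensors, in hours (illustrative data only).
times_h = [1200.0, 950.0, 1430.0, 1100.0, 1320.0]
mttf_h = mean_time_to_failure(times_h)
print(f"MTTF = {mttf_h:.0f} h")
```

For a repairable product the same average would be taken over the operating intervals between successive failures, which gives an MTBF instead.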

## Failure Times and Acceleration Factors

Failure times:

- MTBF is more useful for repairable products.
- MTTF is more commonly used for products that are replaced after failure (e.g. a sensor).

Fig. 1: Illustration of MTTF and MTBF over alternating operating and failure periods.

## Failure Times and Acceleration Factors

Failure times fall into three regimes:

- Macro manufacturing defects (infant mortality), which can be reduced or eliminated with rigorous testing and burn-in*
- Constant (random) failures
- Wear-out failures

Fig. 2: Illustration of the reliability bathtub curve with its components broken out. [2]

*The burn-in process eliminates, or reduces, infant mortality by filtering out defective products or components before they are placed in service. It effectively moves $t_{start}$ further down the bathtub timeline. [3]

## Failure Times and Acceleration Factors

It is not practical to test products over their estimated operational lifetimes, which can span years. This gives rise to the need to "accelerate time". The acceleration is achieved by increasing a stress condition (temperature, voltage, current, etc.), and the resulting speed-up is quantified by an acceleration factor (AF).

It is assumed that the results can be extrapolated to normal operating conditions. This is not always the case, as failure mechanisms may differ under accelerated conditions. The uncertainty grows in proportion to the degree of acceleration:

- A modest AF gives lower uncertainty: longer reliability stress and product life tests
- A higher AF gives faster measurement times
- Knowledge of the degradation mechanism is needed

## Failure Times and Acceleration Factors

Acceleration factor (AF):

- Many failure modes are limited by an activation energy, and the Arrhenius equation characterizes them. Its prefactor $A$ (a constant) and the activation energy $E_A$ are assumed to be independent of temperature.
- The temperature acceleration factor $AF_T$ is then simply the ratio of the time to failure at the base (normal operating) temperature $T_0$ to the time to failure at the elevated temperature $T_1$.
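A minimal sketch of the temperature acceleration factor, assuming the Arrhenius form $t(T) = A \exp(E_A / kT)$ for the time to failure, so that $AF_T = t(T_0)/t(T_1) = \exp[(E_A/k)(1/T_0 - 1/T_1)]$ and the constant $A$ cancels. The activation energy and the two temperatures below are hypothetical values chosen only for illustration.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_af(e_a_ev, t0_k, t1_k):
    """AF_T = t(T0)/t(T1) for t(T) = A*exp(E_A/(k*T)); A cancels in the ratio."""
    return math.exp((e_a_ev / BOLTZMANN_EV_PER_K) * (1.0 / t0_k - 1.0 / t1_k))


# Hypothetical example: E_A = 0.7 eV, use at 55 degC, stress at 125 degC.
af = arrhenius_af(0.7, 55.0 + 273.15, 125.0 + 273.15)
print(f"AF_T = {af:.1f}")
```

With these assumed numbers, one hour of stress at the elevated temperature stands in for several tens of hours at the use temperature. The result is very sensitive to $E_A$, which is why knowledge of the degradation mechanism matters.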

## Failure Times and Acceleration Factors

Acceleration factor (AF):

- When the voltage is increased above the operating value, the time to failure is expressed as a function of the applied voltage.
- The voltage acceleration factor is again simply the ratio $t(V_0)/t(V_1)$ of the time to failure at the operating voltage $V_0$ to that at the elevated voltage $V_1$.
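As an illustration only, one commonly used voltage model is exponential in the applied voltage. The symbols $B$ (a constant prefactor) and $\gamma$ (a voltage acceleration coefficient) are assumptions introduced for this sketch and may differ from the notation of the lecture.

```latex
t(V) = B \exp(-\gamma V)
\qquad\Longrightarrow\qquad
AF_V = \frac{t(V_0)}{t(V_1)} = \exp\!\left[\gamma\,(V_1 - V_0)\right]
```

Under this assumption $B$ cancels in the ratio, and $AF_V > 1$ whenever $V_1 > V_0$ and $\gamma > 0$.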

## Distribution Functions

Product failures are random in nature and occur with different frequencies along the measured unit, such as time.

- A frequency distribution over some unit
- A failure or hazard rate λ: the probability of failure per unit time at time t, given that the member of the original distribution has survived until time t. [4]

## Distribution Functions

Probability density function $f(t)$:

- Describes the failures occurring at time t: $f(t)\,dt$ is the fraction of devices that fail between $t$ and $t + dt$.

Cumulative distribution function $F(t)$:

- Known as the failure probability: the probability that a device will fail at or before time t.

## Distribution Functions

Probability density function:

- Models the failure distribution
- The "right" model has to be chosen

Cumulative distribution function:

- Is the area under the probability density function up to t.
- When t = ∞, F(t) = 1; in other words, 100% of the products have failed.
- When t = 0, F(t) = 0.

The mean time to failure is the mean of the probability density function.
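The relations between these quantities can be written compactly. This is a sketch in standard reliability notation; the reliability (survival) function $R(t)$ is a symbol introduced here for convenience.

```latex
F(t) = \int_0^{t} f(t')\,dt', \qquad
R(t) = 1 - F(t), \qquad
\lambda(t) = \frac{f(t)}{R(t)}, \qquad
\mathrm{MTTF} = \int_0^{\infty} t\, f(t)\,dt
```

The limits $F(0) = 0$ and $F(\infty) = 1$ follow directly from the first integral, because $f(t)$ is normalized to unit area.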

## Distribution Functions

Exponential distribution:

- The simplest distribution function, characterized by a constant failure rate over the lifetime of the device.
- Useful when early failures and wear-out mechanisms have been eliminated
- Frequently used in semiconductor analysis

Fig. 3: Exponential probability and cumulative distribution curves [5]
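A minimal sketch of the exponential model, assuming the standard forms $f(t) = \lambda e^{-\lambda t}$ and $F(t) = 1 - e^{-\lambda t}$ with constant failure rate $\lambda$, for which the mean time to failure is $1/\lambda$. The failure rate used below is a hypothetical value.

```python
import math


def exp_pdf(t, lam):
    """Probability density f(t) = lam * exp(-lam * t)."""
    return lam * math.exp(-lam * t)


def exp_cdf(t, lam):
    """Failure probability F(t) = 1 - exp(-lam * t)."""
    return 1.0 - math.exp(-lam * t)


lam_per_h = 1e-5              # hypothetical constant failure rate, per hour
mttf_h = 1.0 / lam_per_h      # mean of the exponential distribution
failed_at_mttf = exp_cdf(mttf_h, lam_per_h)
hazard = exp_pdf(500.0, lam_per_h) / (1.0 - exp_cdf(500.0, lam_per_h))
print(f"MTTF = {mttf_h:.0f} h, fraction failed at t = MTTF: {failed_at_mttf:.3f}")
```

Note that under this model roughly 63% of the devices, not 50%, have failed by the time t = MTTF, and the hazard rate `hazard` equals `lam_per_h` at any time.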

## Distribution Functions

Weibull distribution:

- In the Weibull distribution function [6], the failure rate varies as a power of the device age.
- Also known as an extreme value distribution
- τ and β are the scale and shape parameters, respectively.
- For β > 1 the failure rate increases with time, for β < 1 it decreases with time, and for β = 1 the Weibull distribution becomes the exponential distribution.

Fig. 3: Weibull probability and cumulative distribution curves [7]
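The role of the shape parameter can be checked numerically. The sketch below assumes the common two-parameter form $F(t) = 1 - \exp[-(t/\tau)^\beta]$ with hazard rate $\lambda(t) = (\beta/\tau)(t/\tau)^{\beta-1}$; the scale parameter and the two sample times are hypothetical.

```python
import math


def weibull_cdf(t, tau, beta):
    """Failure probability F(t) = 1 - exp(-(t/tau)**beta)."""
    return 1.0 - math.exp(-((t / tau) ** beta))


def weibull_hazard(t, tau, beta):
    """Failure rate lambda(t) = (beta/tau) * (t/tau)**(beta - 1)."""
    return (beta / tau) * (t / tau) ** (beta - 1.0)


tau_h = 1000.0  # hypothetical scale parameter, in hours
for beta in (0.5, 1.0, 2.0):
    early = weibull_hazard(100.0, tau_h, beta)
    late = weibull_hazard(900.0, tau_h, beta)
    if late < early:
        trend = "decreasing"
    elif late > early:
        trend = "increasing"
    else:
        trend = "constant"
    print(f"beta = {beta}: failure rate is {trend}")
```

In bathtub-curve terms, β < 1 resembles infant mortality, β = 1 the constant-failure region, and β > 1 wear-out.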

## Distribution Functions

Normal distribution:

- σ is the scale parameter and $t_{15.87}$ is the time at which 15.87% of the devices have failed. [8]

Fig. 4: Normal probability and cumulative distribution curves [9]
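The 15.87% point is special because a normal distribution has accumulated 15.87% of its area one standard deviation below the median, so σ can be read off as $t_{50} - t_{15.87}$. A small check using Python's standard library, with a hypothetical median and standard deviation:

```python
from statistics import NormalDist

t50_h = 1000.0    # hypothetical median time to failure, hours
sigma_h = 100.0   # hypothetical standard deviation, hours
dist = NormalDist(mu=t50_h, sigma=sigma_h)

# Fraction of devices failed one standard deviation before the median.
failed_fraction = dist.cdf(t50_h - sigma_h)

# Recover sigma from the two percentile times, as one would from measured data.
t_1587_h = dist.inv_cdf(0.1587)
sigma_est_h = t50_h - t_1587_h
print(f"F(t50 - sigma) = {failed_fraction:.4f}, estimated sigma = {sigma_est_h:.1f} h")
```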

## Distribution Functions

Log-normal distribution:

- Frequently used to describe the failure statistics of semiconductor devices over long times

Fig. 4: Log-normal probability and cumulative distribution curves [10]

## Reliability Concerns: Electromigration

Electromigration:

- Gradual segregation of the line material caused by momentum transfer from the moving electrons to the metal ions, under an applied electric field, and by the diffusing metal.
- Momentum transfer dominates in Al lines and Cu lines.

Fig. 8: Illustration of electromigration [11]

Fig. 9: Illustration of electromigration on a polycrystalline line [1]

## Reliability Concerns: Electromigration

Electromigration can lead to failures due to:

- An increase in line resistance
- A line becoming an open circuit (voids)
- Adjacent lines becoming short-circuited (whiskers)

Factors:

- Intrinsic: material, grain size, grain boundary orientation, triple point density, surface conditions, passivation etc.
- Extrinsic: current and power density (related to size and dimensions, i.e. implications for scaling), AC vs. DC

## Reliability Concerns: Electromigration

Electromigration mechanism:

- Diffusion through grain boundaries and material interfaces dominates:
  - Triple points (void formation)
  - Mass depletion at large-to-small grain transitions (void formation)
  - Surface
- Single-crystalline lines do not experience electromigration degradation.

Fig. 10: Illustration of electromigration on a polycrystalline line [1]

## Reliability Concerns: Electromigration

What is affected:

- All metallic lines, interconnects and contacts face failure, which can result in complete IC failure.
- The risk increases as the current density and/or power density increases:
  - High-power devices such as low-voltage MOSFETs
  - Scaled-down VLSI and ULSI
  - Solder joints, as used in flip-chip bonding

## Reliability Concerns: Electromigration

Electromigration mitigation:

- Triple point removal
  - A natural consequence as lines narrow (see bamboo structure); activation energy: intergrain vs. intragrain
- Diffusion retardation using impurities
  - Addition of 4% Cu to Al lines (70x lifetime increase in Al lines)
  - Addition of 1% Pd or 1% Ag to Cu [11]
- Material selection
  - Cu is less susceptible to EM degradation than Al.
  - Refractory metals (TiN, W, etc.) experience almost no EM.

## Reliability Concerns: Electromigration

Electromigration mitigation, continued:

- Creating a shunt with a layered structure
  - TiN can act as a shunt for an Al line should a weak spot develop.
- Blech length
  - A critical length for metal lines exists, below which EM is inhibited.

## Reliability Concerns: Hot Carriers

Hot carriers:

- Energetic carriers (electrons, e, or holes, h). A carrier such as an electron gains energy in an electric field, causing ionization collisions and avalanche multiplication of e/h pairs.
- The energetic species can be injected into the gate oxide as trapped charges or appear as gate drift current, form interface traps, generate photons, or appear as substrate current.
- The dominant failure mechanism for nanoscale NMOS transistors.

Fig. 10: Illustration of hot carrier damage on NMOS [1]

## Reliability Concerns: Hot Carriers

Hot carriers manifest in NMOS as:

- Threshold voltage change (i.e. a change in the minimum voltage needed to create an inversion layer below the gate)
- Transconductance degradation (transconductance: the change in drain current divided by the change in gate-source voltage at constant drain-source voltage)
- Forward bias of the source-substrate junction due to substrate current
  - Snapback breakdown (the output enters a negative resistance region, which leads to more impact ionization and can result in device destruction)

## Reliability Concerns: Hot Carriers

- The chief degradation mechanism for n-channel MOSFETs is interface trap generation; substrate current is a good monitor of such damage.
- The chief degradation mechanism for p-channel devices is trapped electrons near the gate-drain interface; gate current is a good monitor of such damage.
- The weakness of monitoring by means of substrate or gate currents is that it is an indirect measurement.

## Reliability Concerns: Hot Carriers

The interface trap density can also be measured:

- directly, by charge pumping [13][14]
- indirectly, by threshold voltage, transconductance, or drain current changes

## Failure Analysis and Characterization Techniques: Microprobes

Mechanical probes are used to contact parts of an IC during FA.

- AFM probes for sub-micron dimensions
- Scanning capacitance and spreading resistance microscopy for ion implant and dielectric characterization

## Failure Analysis and Characterization Techniques: Microprobes

A number of metal lines are tested at various temperatures for a given current density. The resulting data are plotted as cumulative failures as a function of test time. [1]

## Failure Analysis and Characterization Techniques: Microprobes

The median times to failure are then plotted as $\log(t_{50})$ versus $1/T$, and the activation energy is extracted from the slope. [1]

## Failure Analysis and Characterization Techniques: Microprobes

Measurements are then made for various current densities at a given temperature. The exponent $n$ can be found from the fitted slope.

With $n$ and the activation energy known, the MTTF can be found for other temperatures or current densities (Eq. 12.20).
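The two-step extraction can be sketched numerically. The example assumes that the lifetime model is Black's equation, $t_{50} = A\, j^{-n} \exp(E_A / kT)$, and it fits synthetic data generated from made-up parameters rather than real measurements. All names and values in the code are illustrative.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant, eV/K


def black_t50(j, temp_k, a, n, e_a_ev):
    """Median time to failure from Black's equation: A * j**-n * exp(E_A/(k*T))."""
    return a * j ** (-n) * math.exp(e_a_ev / (K_B_EV * temp_k))


def fit_slope(xs, ys):
    """Least-squares slope of y versus x."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Synthetic "measurements" generated from assumed parameters (not real data).
TRUE = dict(a=1e10, n=2.0, e_a_ev=0.6)
temps_k = [473.0, 498.0, 523.0]   # three temperatures at fixed j = 1e6 A/cm^2
t50_vs_temp = [black_t50(1e6, temp, **TRUE) for temp in temps_k]
js = [5e5, 1e6, 2e6]              # three current densities at fixed T = 498 K
t50_vs_j = [black_t50(j, 498.0, **TRUE) for j in js]

# ln(t50) versus 1/T has slope E_A/k; ln(t50) versus ln(j) has slope -n.
e_a_fit = K_B_EV * fit_slope([1.0 / temp for temp in temps_k],
                             [math.log(t50) for t50 in t50_vs_temp])
n_fit = -fit_slope([math.log(j) for j in js],
                   [math.log(t50) for t50 in t50_vs_j])

# Extrapolate from stress (498 K, 1e6 A/cm^2) to assumed use conditions.
t50_use = black_t50(2e5, 358.0, TRUE["a"], n_fit, e_a_fit)
t50_stress = black_t50(1e6, 498.0, TRUE["a"], n_fit, e_a_fit)
ratio = t50_use / t50_stress
print(f"E_A = {e_a_fit:.3f} eV, n = {n_fit:.2f}, lifetime ratio use/stress = {ratio:.3g}")
```

Natural logarithms are used here, so the slope of the temperature fit is $E_A/k$ directly; with a base-10 plot the slope would need to be multiplied by $\ln 10$. Because the data are noise-free, the fit recovers the assumed parameters; real data would scatter around the fitted lines.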

## Failure Analysis and Characterization Techniques: Emission Microscopy

Emission microscopy uses the emission of light in response to an electrical stimulus to locate potential defect sites. [1]

1. A failed chip is placed in an emission microscope and illuminated.
2. An image is taken, to be used later for locating the defect sites.
3. The illumination is turned off and a voltage is applied, which stimulates the emission of light.
4. The two images are then superimposed.

Imaging can be done from the top or from the bottom of the chip. [1]

## Failure Analysis and Characterization Techniques: Emission Microscopy

What can be detected?

- Latch-up conditions, when there is a high density of excess carriers in the device and a "short" is formed between components
- Hot carriers, e.g. in the drain space-charge region of a MOSFET
- Carriers that flow through an oxide and lose energy (reverse-biased np junction)
- Weak spots in the gate oxide

## Failure Analysis and Characterization Techniques: Emission Microscopy

What can be detected? (continued)

- The spectral content of the emitted light can be used to gain insight into the failure mode. The light emission correlates well with substrate current (see hot carriers and $I_{sub}$) and with device degradation.
- Picosecond imaging circuit analysis (PICA) can be used to show the time-resolved switching of devices. [1]

## Overview

Oxide integrity: Oxides are commonly characterized by their charge-to-breakdown or time-to-breakdown behavior, measured by the constant or ramped gate voltage or gate current techniques.

- Pro: The constant gate current technique has the advantage of yielding the charge-to-breakdown as a simple product, $Q_{BD} = J_G t_{BD}$.
- Con: The current may not be uniform, and most devices are operated not under constant current but under constant gate voltage. Thin oxides frequently do not exhibit well-defined breakdowns, partly because the gate leakage current prior to breakdown is quite high.

Negative bias temperature instability: Occurs in p-channel MOS devices stressed with negative gate voltages at elevated temperatures. It is most commonly characterized by threshold voltage, transconductance, interface trap density, and drain current measurements.

## Overview

$I_{DDQ}$:

- Pro: Simple to implement, as only the input current to an IC needs to be monitored; very good at detecting shorts.
- Con: Does not localize the fault; opens are difficult to detect.

Emission microscopy:

- Pro: The entire die may be viewed at one time; no deprocessing is needed except for lid removal; a functional failure does not need to propagate to the output. In the form of PICA it can be used to follow the switching function of an IC and allows circuit FA.
- Con: The IC must be biased and toggled; ohmic defects do not emit light; no light can be detected through opaque layers; the emission site may not be the defect site.
- Con, for imaging from the back of the chip: sample preparation is required; substrate thinning may impact device characteristics; Si is an infrared filter and limits the detection bandwidth of emission sites; doping atoms scatter IR photons, leading to reduced sensitivity; CCD-based systems have a low quantum efficiency in the required IR spectrum.

## Overview

Voltage contrast:

- Pro: A contactless method to determine the spatial and temporal voltage within an IC. The electron beam is small and can contact most lines in an IC.
- Con: Difficult when the line of interest is buried below other metallization levels.

Liquid crystal:

- Pro: Low cost, easy to use, very good thermal and spatial resolution, useful for thermal and voltage contrast analysis, real-time imaging.
- Con: Tends to "wick up" around probes and bonding wires, making identification of hot spots difficult; poor thermal resolution for measurements from the back of the wafer; the number of layers between the source of the failure and the surface where the liquid crystal resides limits spatial resolution and sensitivity; the liquid crystal has a set transition temperature. Multiple hot spots can be difficult to resolve if the warmer spot creates a significant temperature gradient.

Fluorescent microthermography:

- Pro: Offers high thermal and spatial resolution.
- Con: The film must be calibrated for quantitative temperature measurements.

Infrared thermography:

- Pro: A passive technique that does not require thermal excitation, with good temperature resolution, allowing imaging from the front and rear surfaces.
- Con: Calibration is necessary for quantitative information but is not easy, since the emissivity is generally not known.

OBIRCH:

- Pro: A sensitive technique for a variety of FA investigations, with high resolution. When OBIRCH does not work well, EMMI frequently does; the two are complementary.
- Con: Cannot be used for chips with multiple metal layers; when used from the rear surface, the wafer must be thinned to 150–200 μm.

## References

1. M. Ohring, Reliability and Failure of Electronic Materials and Devices, Academic Press, San Diego, 1998.
2. (public domain) http://en.wikipedia.org/w/index.php?title=File:Bathtub_curve.svg&page=1
3. http://en.wikipedia.org/wiki/Burn_in
4. F.R. Nash, Estimating Device Reliability: Assessment of Credibility, Kluwer, Boston, 1993.
5. http://en.wikipedia.org/wiki/Exponential_decay
6. W. Weibull, "A Statistical Distribution Function of Wide Applicability," J. Appl. Mech. 18, 293–297, Sept. 1951.
7. http://en.wikipedia.org/wiki/Weibull_distribution
8. W.J. Bertram, "Yield and Reliability," in VLSI Technology, 2nd ed. (S.M. Sze, ed.), McGraw-Hill, New York, 1988.
9. http://en.wikipedia.org/wiki/Normal_distribution
10. http://en.wikipedia.org/wiki/Log-normal_distribution
11. http://en.wikipedia.org/wiki/electromigration
12. A. Scorzoni, B. Neri, C. Caprile, and F. Fantini, "Electromigration in Thin-Film Interconnection Lines: Models, Methods and Results," Mat. Sci. Rep. 7, 143–220, Dec. 1991.
13. J. S. Brugler and P. G. A. Jespers, "Charge Pumping in MOS Devices," IEEE Trans. Electron Devices, vol. 16, no. 3, pp. 297–302, 1969.
14. G. Groeseneken, H. Maes, N. Beltran, and R. F. De Keersmaecker, "Reliable Approach to Charge-Pumping Measurements in MOS Transistors," IEEE Trans. Electron Devices, vol. 31, no. 1, pp. 42–53, 1984.
15. T. Schweinböck, S. Schömann, D. Alvarez, M. Buzzo, W. Frammelsberger, P. Breitschopf, and G. Benstetter, "New Trends in the Application of Scanning Probe Techniques in Failure Analysis," Microelectron. Reliab. 44, 1541–1546, Sept./Nov. 2004; G. Benstetter, P. Breitschopf, W. Frammelsberger, H. Ranzinger, P. Reislhuber, and T. Schweinböck, "AFM-based Scanning Capacitance Techniques for Deep-submicron Semiconductor Failure Analysis," Microelectron. Reliab. 44, 1615–1619, Sept./Nov. 2004.

## Exercises

1. Choose two reliability concerns not covered in the lecture [a] and describe them (~1 paragraph each). Explain how these two reliability concerns manifest in an example device of your choosing. Use figures and references.
2. Choose four failure analysis characterization techniques not covered in the lecture [b] and describe them (~1 paragraph each). Briefly list the pros and cons of the chosen techniques and be sure to mention if any techniques complement each other (for example, quiescent drain current testing is sometimes combined with emission microscopy). Use figures and references where appropriate.

[a] Gate Oxide Integrity, Negative Bias Temperature Instability, Stress Induced Leakage Current, Electrostatic Discharge

[b] Quiescent Drain Current, Fluorescent Microthermography, Infrared Thermography, Voltage Contrast, Laser Voltage Probe, Liquid Crystals, Optical Beam Induced Resistance Change, Noise
