# Advanced Digital Design: Metastability. A. Steininger, Vienna University of Technology


Lecture "Advanced Digital Design" © A. Steininger / TU Vienna

Outline
- What is metastability
- Effects and threats
- The unavoidability
- MTBU estimation
- Synchronizers & countermeasures
- Trends
- Measurement of model parameters

Metastability: An Example
- The ball may remain on top („metastable“) for unbounded time.
- A small disturbance causes the ball to fall in either direction.
- [Figure: ball balanced on top, between a stable left position and a stable right position]

What is Metastability?
- continuous-valued input space (initial position of the ball)
- mapped to binary output space (left or right position)
- mapping may be undecided for unbounded time

Metastability in Logic?
- „In the synchronous digital world we do not have a continuous space“ (after all, that's the key benefit!)
- „Inputs and outputs of gates are all digital“
- So why bother about metastability?

The real world
- signal levels representing the digital state are continuous
- pulse lengths are continuous in time
- relative signal arrival times are continuous
- transistors and the circuits built from them operate in continuous time with continuous voltage amplitudes

Specifying Problems Away
- Is the input high or low?
  - spec: forbidden range
- Is the pulse long enough to be recognized by a gate?
  - spec: minimum pulse width
- Did A occur before or after B?
  - spec: setup/hold time

Limits of the Abstraction
- In a closed world these issues can be „specified away“, but
  - what happens at interfaces?
  - what happens with faults?
- The synchronous digital abstraction cannot cover these issues.
- When facing metastability, CMOS circuits are operated out of spec and hence have undefined behavior.

Level: Inverter Example
- analog transfer characteristics
- „forbidden“ input level may lead to „forbidden“ output level
- propagation of „forbidden“ level
- [Figure: inverter characteristics, u_out over u_in]

Pulsewidth: RC Example
- short digital input pulse
- creates analog output in forbidden range
- parasitic RCs are omnipresent in ASICs

A before B: AND Example
- contradicting digital transitions on inputs
- depending on timing, a glitch is produced
- RC will convert it into an ambiguous voltage
- [Figure: waveforms of a, b and a AND b]

Setup/Hold Time of Latch
- The feedback path must be stable when switching from „transparent“ to „hold“.
- Otherwise we feed the storage loop with a marginal condition (pulse width, level), thus creating undefined behavior.

Metastability in the Latch
- normal operation: strong momentum will roll the ball to the other side
- metastability: marginal momentum will roll the ball just to the top
- [Figure: ball between stable left position and stable right position]

Response Time of a FF
- Observation: An input transition during the decision window leads to an (unbounded) increase of clock-to-output delay.
- [Figure: t_clk2out over t_clk2data for signals CLK and D, with nominal delay t_clk2out,nom and the off-spec region between t_setup and t_hold around 0]

Observation
- combinational elements
  - transform off-spec inputs into off-spec outputs immediately
- sequential (stateful) elements
  - are expected to decide for one state;
  - off-spec inputs will delay this decision
  - only they can become metastable

Faces of Metastability
- (properly shaped) late transition
  - may cause timing problems
  - problem specific to synchronous design
- creeping through forbidden voltage range
  - generates long undefined level
- oscillation
  - generates erroneous transitions

Metastability: Creeping
- [Figure: transfer characteristics of the cross-coupled inverters Inv 1 and Inv 2 with u_e,2 = u_a,1 and u_e,1 = u_a,2; intersection points stable (HI), stable (LO) and metastable; points labelled 1 to 5 and A]

Metastability: Oscillation
- A pulse shorter than the round-trip delay through the inverter loop can circulate.
- Thus it appears periodically at the output: „oscillation“.
- [Figure: inverter loop with two delays; condition: pulse width PW smaller than the sum of the two inverter delays]

Ways of Triggering MS
- Time domain: glitch in feedback loop
  - S/H violation, or
  - glitch on D
- Value domain
  - marginal input voltage stored even without S/H violation
- [Figure: two latch schematics with signals D, Clk, L and FB]

Why violate Setup/Hold?
- in a closed synchronous system no violations will occur
- BUT: no system is really closed
  - non-synchronous interfaces
  - clock domain boundaries
  - fault effects (single-event upsets)
  - off-spec operation (temp, VCC, frequency)

Asynchronous Inputs
- [Figure: asynchronous event relative to the clock period T_clk and the setup/hold decision window T_0; probability of setup/hold violation]
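
If an asynchronous event is equally likely to arrive anywhere within a clock period, the chance of hitting the decision window is T_0 / T_clk. The Python sketch below checks this with a small Monte Carlo run. The function name, the numeric values and the uniform-arrival assumption are illustrative and not taken from the lecture.

```python
import random

def violation_probability(t0, t_clk, trials=200_000, seed=1):
    """Estimate the fraction of asynchronous events that hit the decision window.

    Assumption: the event phase is uniformly distributed over one clock period.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        phase = rng.uniform(0.0, t_clk)   # arrival time within the period
        if phase < t0:                    # window of width t0 placed at the period start
            hits += 1
    return hits / trials

# Hypothetical values: 100 ps decision window, 10 ns clock period.
estimate = violation_probability(t0=0.1e-9, t_clk=10e-9)
analytic = 0.1e-9 / 10e-9
print(f"simulated {estimate:.4f}, analytic {analytic:.4f}")
```

Where the window sits inside the period does not matter for a uniform phase; only its width relative to T_clk does.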

Multiple Clock Domains
- arbitrary „phase“ relation
- setup/hold violation inevitable (fundamentally!)
- [Figure: CLK 1 (Ref), CLK 2 and signal A]

Metastability: Threats
- propagation
  - undefined logic level/timing at input may produce undefined output
- „Byzantine“ interpretation
  - thresholds/timing of different inputs are different (type variations)
  - marginal input level/timing may be interpreted differently

Metastability Propagation
- Combinational gates as well as the inverters inside the FF map metastable inputs to metastable outputs.
- [Figure: FF with data and clk inputs and metastable output X; inverter characteristics, u_out over u_in, with point A]

Inconsistent Perception
- The metastable state may be regarded as „1“ by one FF and as „0“ by another.
- [Figure: metastable level of X between 0 and 1, compared against threshold A and threshold B of the receiving FFs A and B]

Metastability Proofs
- Formal proofs exist that
  - metastability can in principle not be avoided („Buridan's Principle“)
  - no upper bound on the duration of the metastable state can be given
  - but after infinite time the state will be resolved with probability 1
- Fundamental issue
  - Mapping from a continuous space to a discrete space involves a decision that may take unbounded time (namely in borderline cases)

Approaching the Border
- The mapping from continuous to binary space needs a borderline.
- In the proximity of the borderline the force pulling towards one of the binary states becomes smaller (compare momentum of the ball).
- In the continuous input space one can go arbitrarily close to the borderline, thus moving this force towards zero.
- Often the stable binary states represent energy minima, while the metastable state represents a (local) maximum. (Remember: energy must change continuously.)

Metastability Avoidance?
- Can't we avoid metastability in practice, if we
  - avoid borderline cases? (only those are problematic!) => synchronous design, noise margins…
  - allow arbitrary time for resolving?
  - change input threshold of successor stage?
  - use a different storage element?

Why use the D-Flipflop?
- Metastability is not restricted to D-FFs; it is also encountered with SR-latch, JK-Flipflop, Muller C-Gate,…
- Basically all bistable elements can become metastable:
  - state is always associated with energy
  - state change always involves energy transfer
  - laws of physics dictate continuous transfer
  - but: binary state
- [Figure labels: min, max]

Mitigating Metastability
- Metastability cannot be eliminated in general
  - all circuits claimed to eliminate it have been shown to fail…
- in practice systems still work because metastability is very improbable
- it can be made more or less probable by design techniques
- it can be transformed between its different modes
  - marginal voltage level
  - late transition
  - oscillation

Conversions
- Low-Pass: oscillation => creeping
- Discriminator: creeping + noise => oscillation
- High / Low threshold input: creeping => glitch
- Schmitt Trigger: creeping => late transition
- Flip-Flop: late transition => creeping or oscillation

Masking Metastability
- assume m-of-n voting
- If the metastable input just makes the difference, MS can propagate
- in all other cases MS will be masked
- [Figure: voter inputs, labelled m-1 and n]
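
The masking condition can be stated as a small check. This is a sketch for an idealized threshold voter with exactly one metastable input; the function name and the example values are assumptions made for illustration.

```python
def metastable_input_can_propagate(definite_inputs, m):
    """Return True if one extra metastable input can decide an m-of-n vote.

    definite_inputs: the n-1 inputs that carry valid logic levels (0 or 1).
    m: number of ones required for the voter output to be 1.
    """
    ones = sum(definite_inputs)
    # With exactly m-1 ones among the valid inputs, the metastable input
    # alone tips the result; in every other case the output is already fixed.
    return ones == m - 1

# 2-of-3 voter (TMR) with one metastable input:
print(metastable_input_can_propagate([1, 1], m=2))  # output already decided
print(metastable_input_can_propagate([1, 0], m=2))  # metastable input decides
print(metastable_input_can_propagate([0, 0], m=2))  # output already decided
```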

Detecting Metastability
- … often possible by comparing Q and its complement Q̄
- creeping
  - both Q and Q̄ deliver VDD/2; this is often perceived as the „same“ logic level
- late transition
  - with proper separation of Schmitt-Trigger / High threshold inverter and output inverter => no visible effect
- oscillation
  - literature reports „in phase“ oscillation of Q and Q̄

Quantifying the Risk of MS
- „Upset“
  - metastable output is captured by subsequent FF after t_r
- Mean Time Between Upset (MTBU)
  - expected value (statistics!) for interval between two subsequent upsets

Resolution Time
- normal operation: t_res > 0
- upset: t_res < 0
- [Figure: input asyn sampled by a FF with clk, followed by comb. logic and a second FF producing syn; timing diagram with t_clk2out, t_comb, t_SU and t_res]
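
A minimal sketch of how the resolution time could be budgeted, assuming the timing diagram means T_clk = t_clk2out + t_res + t_comb + t_SU. That relation is inferred from the figure labels and is not stated explicitly in the transcript; the numbers are hypothetical.

```python
def resolution_time(t_clk, t_clk2out, t_comb, t_su):
    """Time left for a metastable output to settle (same unit as the inputs).

    Assumed budget: t_clk = t_clk2out + t_res + t_comb + t_su.
    A negative result means there is no slack for resolution at all.
    """
    return t_clk - t_clk2out - t_comb - t_su

# Hypothetical values in nanoseconds.
t_res = resolution_time(t_clk=10.0, t_clk2out=1.0, t_comb=6.0, t_su=0.5)
print(t_res)
```

Under this reading, shortening the combinational path or lowering the clock rate directly buys resolution time.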

Parameters
- Resolution time t_res
  - interval available for output to settle after active clock edge
- Flip-Flop parameters τ_C, T_0
  - experimentally determined
  - time constant τ_C depends on transit frequency
  - T_0 from effective width of decision window
- Clock period of FF: T_clk = 1/f_clk
- Average rate of change λ_dat
  - average rate of transitions at FF data input

Modeling Metastability
- How can we derive this equation?
- Which model to apply?

Simple Metastability Model
- model bistable element by inverter pair
- use linear model for inverter, around midpoint of transfer function („balance point“)
- consider „homogeneous“ case, i.e. closed loop w/o inputs
- [Figure: inverter characteristics, u_out over u_in, linearized as u_out = -A * u_in; inverter loop with node voltages u1 and u2]

Introducing Dynamics
- 1st order approximation of dynamic behavior: RC element
- assume symmetry (same A, RC for both inverters) for simplicity
- WLOG assume symmetric supply (+VCC/-VCC) against GND
- [Figure: loop of two stages, each with gain -A and time constant RC = τ, node voltages u1 and u2]

Differential Equations
- [Equations not preserved: basics, forward path, backward path, Laplace transform, time-domain solution]

The Solution
- u_2^0 - u_1^0 … difference of initial voltages (charges on Cs); zero at balance point
- τ … RC constant, bandwidth = 1/RC
- A … inverter gain at balance point
- A/τ … gain bandwidth product of inverter
- Starting from the initial difference, u_2 rises exponentially with time towards the positive or negative supply voltage.
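
The slide equations are not preserved, so the following is a reconstruction under the stated model (two identical stages with gain -A and time constant τ = RC, closed loop without inputs). The sign conventions and the intermediate steps are assumptions; only the quantities A, τ and the initial difference come from the slides.

```latex
\begin{align}
\tau\,\dot u_2 + u_2 &= -A\,u_1 \\
\tau\,\dot u_1 + u_1 &= -A\,u_2 \\
\tau\,\frac{d}{dt}\,(u_2 - u_1) &= (A-1)\,(u_2 - u_1) \\
\tau\,\frac{d}{dt}\,(u_2 + u_1) &= -(A+1)\,(u_2 + u_1) \\
u_2(t) &= \frac{u_2^0 - u_1^0}{2}\,e^{(A-1)\,t/\tau}
        + \frac{u_2^0 + u_1^0}{2}\,e^{-(A+1)\,t/\tau}
        \;\approx\; \frac{u_2^0 - u_1^0}{2}\,e^{A\,t/\tau}
\end{align}
```

The approximation drops the decaying term and uses A ≫ 1, so the growth rate is set by the gain bandwidth product A/τ.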

Plot of u_2 over Time
- For a given t we can project the „forbidden“ input range back to a „forbidden“ range of the initial voltage difference.

Forbidden Initial Range
- The forbidden output voltage range relates to a forbidden range of the initial difference voltage u_0 (i.e. just after sampling).
- This range becomes exponentially smaller for high resolution time t_res and high gain-bandwidth product A/τ.

Aperture Window T_AW
- How long does it take for the input voltage difference to cross the forbidden range?
- Depends on the slopes of both input voltage AND feedback voltage
- [Figure: u_diff(t) with slope S crossing the range between -u_0,border and +u_0,border within T_AW]

Calibrating T_AW
- T_AW depends on u_0,border, which in turn depends on t_res
- for immediate use of the output: [equation not preserved]
- thus: [equation not preserved; one term is marked as technology parameter]

Hitting the Aperture
- With exponentially distributed inter-arrival time of input events (rate λ_dat) and sampling with period T_clk (i.e. window T_AW is repeated), the upset rate can be calculated as: [equation not preserved]
- Hence the MTBU becomes: [equation not preserved]

Putting it all together
- [Equation not preserved; its terms are labelled T_0 and τ_C]
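
One way to chain the preceding steps into the final expression is sketched below. It is a reconstruction, not the slide content: the definitions T_0 = T_AW(0) and τ_C = τ/A, and the symbol λ_upset for the upset rate, are assumptions consistent with the parameter descriptions in the lecture.

```latex
\begin{align}
u_{0,\mathrm{border}}(t_{res}) &= u_{0,\mathrm{border}}(0)\; e^{-A\,t_{res}/\tau} \\
T_{AW}(t_{res}) &= \frac{2\,u_{0,\mathrm{border}}(t_{res})}{S}
                 = T_0\; e^{-t_{res}/\tau_C},
                 \qquad T_0 = T_{AW}(0),\quad \tau_C = \frac{\tau}{A} \\
\lambda_{upset} &= \lambda_{dat}\,\frac{T_{AW}}{T_{clk}}
                 = \lambda_{dat}\, f_{clk}\, T_0\; e^{-t_{res}/\tau_C} \\
\mathrm{MTBU} &= \frac{1}{\lambda_{upset}}
               = \frac{e^{t_{res}/\tau_C}}{T_0\,\lambda_{dat}\, f_{clk}}
\end{align}
```

The upset rate line assumes that the aperture window is much shorter than both the clock period and the mean time between input events.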

The widely used equation
- [Equation not preserved; its annotations: expected time between upsets (statistical!), available resolution time, technology parameters, rate of input events, sampling frequency]
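
The annotations match the commonly quoted form MTBU = exp(t_res / τ_C) / (T_0 · λ_dat · f_clk). The Python sketch below evaluates it. The function name and the values of τ_C, T_0 and t_res are hypothetical; the rates of 2 MHz and 10 MHz are the ones quoted on the parameter-determination slide.

```python
import math

def mtbu(t_res, tau_c, t0, lambda_dat, f_clk):
    """Mean time between upsets in seconds.

    t_res:      available resolution time [s]
    tau_c, t0:  technology parameters of the flip-flop [s]
    lambda_dat: average rate of input events [1/s]
    f_clk:      sampling (clock) frequency [1/s]
    """
    return math.exp(t_res / tau_c) / (t0 * lambda_dat * f_clk)

# Hypothetical technology parameters: tau_c = 100 ps, t0 = 100 ps.
for t_res in (1e-9, 3e-9, 5e-9):
    seconds = mtbu(t_res, tau_c=100e-12, t0=100e-12, lambda_dat=2e6, f_clk=10e6)
    print(f"t_res = {t_res * 1e9:.0f} ns -> MTBU = {seconds:.3e} s")
```

Each additional τ_C of resolution time multiplies the MTBU by e, which is why small timing margins matter so much.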

Late Transition
- calculate output delay over data to clk distance
- output delay depends on input phase with ln(1/x)
- [Figure: detector threshold, input slope S]
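
The ln(1/x) dependence can be illustrated with a small sketch. It assumes that the initial voltage offset is the input slope S times the data-to-clock distance from the metastable point, and that this offset grows exponentially with time constant τ_C until it reaches the detector threshold. These modeling choices, the function name and all numbers are assumptions for illustration.

```python
import math

def output_delay(delta_t, tau_c, slope, u_threshold):
    """Extra output delay for a data edge that misses the metastable point by delta_t.

    Assumed model: u0 = slope * |delta_t| grows as u0 * exp(t / tau_c)
    until it reaches u_threshold.
    """
    u0 = slope * abs(delta_t)
    if u0 >= u_threshold:
        return 0.0                      # already beyond the threshold, no extra delay
    return tau_c * math.log(u_threshold / u0)

# Hypothetical values: tau_c = 100 ps, slope = 1 V/ns, threshold = 0.5 V.
for dt in (1e-12, 1e-13, 1e-14):
    delay = output_delay(dt, tau_c=100e-12, slope=1e9, u_threshold=0.5)
    print(f"distance {dt:.0e} s -> extra delay {delay:.3e} s")
```

Every tenfold reduction of the distance adds the same amount, τ_C · ln(10), to the delay.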

Graphical View

Provoking Metastability
- asynchronous inputs
- multiple clock domains
- clock divider (uncontrolled delay)
- low timing margins
- slow technology (gain/BW prod)
- supply drop (excessive delay)
- operation under high temperature

Determination of T_0, τ_C
- experimental:
  - vary t_res
  - observe MTBU
  - log graph => straight line
  - slope -> τ_C
  - offset -> T_0
- typical values
- [Figure: semilog plot of MTBU over t_res (ns) for λ_dat = 2 MHz, f_clk = 10 MHz; slope 1/τ_C, offset 1/(λ_dat * f_clk * T_0)]
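
The extraction of τ_C and T_0 from such a plot amounts to a straight-line fit of ln(MTBU) over t_res. The sketch below assumes the model ln(MTBU) = t_res/τ_C - ln(T_0 · λ_dat · f_clk) and uses synthetic data generated from made-up parameters, only to show that the fit recovers them; it contains no real measurements.

```python
import math

def fit_tau_c_and_t0(t_res_values, mtbu_values, lambda_dat, f_clk):
    """Least-squares line through ln(MTBU) over t_res.

    Assumed model: ln(MTBU) = t_res / tau_c - ln(t0 * lambda_dat * f_clk).
    """
    ys = [math.log(m) for m in mtbu_values]
    n = len(ys)
    mean_x = sum(t_res_values) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in t_res_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(t_res_values, ys))
    slope = sxy / sxx                      # equals 1 / tau_c
    intercept = mean_y - slope * mean_x    # equals -ln(t0 * lambda_dat * f_clk)
    tau_c = 1.0 / slope
    t0 = math.exp(-intercept) / (lambda_dat * f_clk)
    return tau_c, t0

# Synthetic data from hypothetical parameters (not measured values).
LAMBDA_DAT, F_CLK = 2e6, 10e6              # rates from the plot annotation
TRUE_TAU_C, TRUE_T0 = 0.1e-9, 0.2e-9       # made-up technology parameters
t_res = [1e-9, 2e-9, 3e-9, 4e-9, 5e-9]
mtbu = [math.exp(t / TRUE_TAU_C) / (TRUE_T0 * LAMBDA_DAT * F_CLK) for t in t_res]

tau_c_fit, t0_fit = fit_tau_c_and_t0(t_res, mtbu, LAMBDA_DAT, F_CLK)
print(tau_c_fit, t0_fit)
```

With real measurements the points scatter around the line, so the fit should be accompanied by an uncertainty estimate.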

Metastability: Trends
- Claim: „Metastability is a non-issue in modern technologies“
- BUT: clock rates have increased by a factor of 16 during that period, and timing margins have shrunk in the same way!
- [Figure: log MTBU [s] over t_res for 1996 (XC4005) and 2002 (XC2VP4)]

Mitigating Metastability
- avoid/minimize non-synchronous interfaces
- leave sufficient timing margins
- use fast technology (gain/BW prod)
- ensure proper operating conditions (stable power supply, cooling,…)
- basic principle of synchronizers:
  - trade performance for increased timing margins (t_res)

Synchronizer Example: Cascade of n Input-FFs
- MTBU calculation: same equation as before, but now individual resolution times sum up: [equation not preserved]
- [Figure: chain of FFs clocked by clk between asyn and syn]
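
A sketch of the cascade calculation, assuming the MTBU form exp(t_res / τ_C) / (T_0 · λ_dat · f_clk) with t_res replaced by the sum of the per-stage resolution times, and assuming identical τ_C for all stages. Function name and numbers are hypothetical.

```python
import math

def synchronizer_mtbu(stage_t_res, tau_c, t0, lambda_dat, f_clk):
    """MTBU of an n-stage synchronizer whose stage resolution times add up.

    stage_t_res: list with the resolution time of each stage [s].
    Assumes the same tau_c for every stage.
    """
    total_t_res = sum(stage_t_res)
    return math.exp(total_t_res / tau_c) / (t0 * lambda_dat * f_clk)

# Hypothetical parameters; each stage offers 2 ns of resolution time.
params = dict(tau_c=100e-12, t0=100e-12, lambda_dat=2e6, f_clk=10e6)
one_stage = synchronizer_mtbu([2e-9], **params)
two_stage = synchronizer_mtbu([2e-9, 2e-9], **params)
print(f"1 stage: {one_stage:.3e} s, 2 stages: {two_stage:.3e} s")
```

Each added stage multiplies the MTBU by exp(t_res,stage / τ_C) at the cost of one more clock cycle of latency.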

MTBF of n-Stage Synchronizer
- Recall the projection of allowed output range to an input range considering the exponential increase during the resolution time: [equation not preserved]
- u_0 for FF k is provided by the output of the preceding stage FF k-1 => we make the same projection again: [equation not preserved]

Synchronizer-Rules
- never synchronize more than one signal (rail)
  - danger of data inconsistency
  - degradation of MTBU by number of signals
  - for a wider bus, use one signal for handshaking
- never introduce a fork before the end of the synchronizer
- estimate the MTBU of your solution
  - too low MTBU leads to failures
  - too many stages introduce unnecessary delay
- there is definitely no magic solution to eliminate the potential for metastability, but it can be made arbitrarily improbable

Synchronizer: Trends
- need for more synchronizers
  - more function units being integrated on a chip
  - more standardized frequencies
  - higher communication demands
- need for more synchronizer stages
  - increasing PVT variations => larger safety margins
  - synchronizer parameters become worse:
    - τ_C used to scale proportional to (FO4) propagation delay for decades,
    - below 45nm technologies the scaling is worse
- synchronizers tend to create a considerable performance loss in the future

Even/Odd Synchronizer
- works for two periodic clocks only
- avoids performance penalty of synchronizers
- largely eliminates the potential for metastability
- for details see [Dally & Tell, The Even/Odd Synchronizer, ASYNC 2010]

Mutex
- For deciding the „A before B“ problem a special circuit exists, namely the Mutex (mutual exclusion element).
- Unlike the Synchronizer it assumes there is unbounded time to resolve.
- It will be treated in a later Section.

Assumptions made so far
- linear inverter slope (1st order model)
- load independent gain
- dominating RC const. (1st order model)
- full symmetry (RCs, inverter properties, rising/falling slopes,…)
- decreasing exp term neglected
- homogeneous case (MUX switching and input signal shape neglected)
- equally distributed voltage levels
- exponentially distributed input events

What about Oscillation?
- Can our model be used for oscillatory behavior?
- How / Why not?

A More General MS Model
- ideal amplifier: gain -A
- pure delay
- slope limiter: time constant RC, slope S
- GBWP = A/RC determines dynamics (decay of metastable state)
- oscillation if the pure delay exceeds RC/A
- creeping otherwise
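
The criterion on this slide can be written as a one-line classification. The function name and the example values are hypothetical; the comparison itself is the condition stated above.

```python
def metastable_mode(delay, rc, gain):
    """Classify the loop behavior: oscillation if the pure delay exceeds RC/A, else creeping."""
    return "oscillation" if delay > rc / gain else "creeping"

# Hypothetical values: RC = 100 ps, A = 10, so the limit is RC/A = 10 ps.
print(metastable_mode(delay=50e-12, rc=100e-12, gain=10))
print(metastable_mode(delay=5e-12, rc=100e-12, gain=10))
```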

Characterizing Metastability
- know (=assume) exponential MTBU relation
- measure MTBU over t_res
- draw semilog plot => straight line
- find params:
  - slope -> τ_C
  - offset -> T_0
- need very good setup for measurements! (assumptions made…)
- [Figure: semilog plot of MTBU over t_res (ns) for λ_dat = 2 MHz, f_clk = 10 MHz; slope 1/τ_C, offset 1/(λ_dat * f_clk * T_0)]

Measuring Metastability
- [Figure: measurement setup with MS producer, DUT (FF with D, Q and clk), MS detector and counter; source: Altera]

MS Producer
- Single clock source, controllable relative delay between clock and data path
  - variable delay element, optional: feedback control
  - create as many MS events as possible in short time
  - well-controlled and reproducible phase
  - steer into deep metastability
  - problems: noise, cannot derive MTBU
- Two independent clock sources:
  - uniform distribution of phase relations
  - problems: MS rare, phase distribution truly uniform?

MS Detector
- Aims:
  - detect metastable output of DUT
- Problem:
  - How to define MS?
    - late transition detection
    - intermediate voltage detection
    - output proximity detection
- Implementation options (late transition detection):
  - sample DUT output with FF1 after t_res
  - compare with reference FF2 having „infinite“ t_res
  - mismatch indicates metastability
  - many sources of error!

Late Transition Detection
- maximum of the variable delay determines the maximum detectable t_CO
- infinite delay not feasible => false positive for large t_CO
- [Figure: DUT, DET and REF flip-flops driven by osc 1 and osc 2 through a variable delay and an „infinite“ delay; a comparator (≠) feeds the counter CNT]

Detecting Metastability (1)
- Fundamental problems
  - MS behavior is highly sensitive, esp. to loading
    - cannot measure w/o influencing
    - can only make indirect observation
  - What is an „upset“ at all? No sharp definition
    - MS interpretation becomes ambiguous
    - often „by chance“ (threshold of next stage) or „deliberate“ (scope)

Detecting Metastability (2)
- Practical problems
  - FFs in „relevant“ circuits are not accessible
    - cannot propagate subtle effects over pins
    - cannot reliably capture them on-chip either
  - detection circuits usually involve forks
    - different path delays, different thresholds
    - usually ignored: symmetry assumed
  - how do PVT variations impact the results?
    - in DUT and measurement circuit
  - which manifestation of MS to observe?
    - intermediate voltage, output proximity, late transition
  - where to get the reference from? infinite time…

Relating the Results
- We plot log(MTBU) or t_CO over t_DtoC
- How to determine t_DtoC?
  - measure with oscilloscope/counter
  - know from timing control: dly_2 - dly_1
- This relates to the external view (pins)!
  - The actual FF cell will perceive a different timing due to non-matching path delays for C/D
  - At best this may shift the MS point, but what about variable path delays (VT)?

Time Accuracy
- Clock
  - how accurate/stable is it?
  - where is it used?
- Delay
  - how accurate is it?
  - in which granularity can I vary it?
- Output delay measurement
  - how accurate is my scope?

Uncertainty Characterization
- …is a „must“ in many types of measurement.
  - Result is given as value ± u%
  - For probabilistic results: confidence interval
- These types of characterization allow
  - estimation of the credibility of the value
  - determination of the worst case for the value
  - calculation of compound uncertainty
- Why not care for this in metastability measurement / MTBU prediction?

Why we SHOULD care: There is no other evidence for the (even approximate) correctness of an MTBU prediction: wait for 1000 years? The highly super-linear dependence of the predicted MTBU on the measured parameters may amplify errors! Given the ample PVT variations, how can a specific measurement result be translated into a generally valid prediction?
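The error amplification can be illustrated numerically. The sketch below assumes the commonly used exponential model MTBU = exp(t_res / tau) / (T0 · f_clk · f_data); the parameter names and all values are hypothetical and serve only to show the sensitivity to the measured time constant.

```python
import math

def mtbu(t_res, tau, t0, f_clk, f_data):
    """Mean time between upsets in seconds under the assumed exponential model."""
    return math.exp(t_res / tau) / (t0 * f_clk * f_data)

# Hypothetical operating point.
params = dict(t_res=2e-9, t0=100e-12, f_clk=100e6, f_data=1e6)

nominal = mtbu(tau=50e-12, **params)
pessimistic = mtbu(tau=55e-12, **params)   # tau measured 10 % too small

ratio = nominal / pessimistic
print(f"MTBU with tau = 50 ps: {nominal:.3e} s")
print(f"MTBU with tau = 55 ps: {pessimistic:.3e} s")
print(f"a 10 % error in tau changes MTBU by a factor of {ratio:.1f}")
```

Because tau sits in the exponent, a 10 % error in tau changes the prediction by well over an order of magnitude at this operating point, whereas the same relative error in T0 would change it by only about 10 %.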

What about simulation? Simulation can provide access to all nodes of interest in a non-intrusive way. Metastability is, however, a very subtle effect, depending on many details: a very detailed model for transistors (parasitics) and circuit (layout!) is needed; analog simulation is needed, so the simulation time may become considerable; finding the right phase of CLK to data is difficult; the simulator tends to run into numeric problems; noise is not necessarily considered. So are the results finally representative?

Summary (1): Metastability is unavoidable when mapping from a continuous space to a binary one. It can result in late transition, creeping or oscillation. It can be specified away, but only in a closed system. Metastable inputs make gates operate out of spec, hence their behavior is undefined. Metastability can propagate, even over masking provisions (TMR, etc.).

Summary (2): In practice, the risk of facing a metastable upset can be made arbitrarily small. On a statistical basis, the upset probability of a flip-flop can be predicted. The corresponding equation can be derived by investigating the homogeneous solution of a dynamic model built from first-order models of the inverters. The generally used equation is based on many simplifying assumptions.

Summary (3): The required model parameters are often hard to find. Their determination by measurements involves a lot of uncertainties. Synchronizers trade performance for a reduced probability of a metastable upset. Metastability is also an issue for modern technologies. It can be best mitigated by conservative design and large timing margins.