Simulation Meets Formal Verification
David L. Dill, Stanford University
Serdar Tasiran, U.C. Berkeley
Why do we care?
- Verification is increasingly a bottleneck: large verification teams, huge costs, increased time-to-market.
- Bugs are being shipped.
- Simulation and emulation are not keeping up.
- Formal verification is hard.
- We need alternatives to fill the gap.
Outline
- General observations
- Conventional answers
- Semi-formal methods
- Conclusion
Orientation
Focus of this talk: late-stage bugs in register-transfer-level descriptions (and above).
- Late-stage bugs are hard to find (few bugs per simulation cycle or person-hour) and they delay time-to-market.
- Functional errors in RTL are not eliminated by synthesis and not discovered by equivalence checking.
Where do bugs come from?
- Incorrect specifications
- Misinterpretation of specifications
- Misunderstandings between designers
- Missed cases
- Protocol non-conformance
- Resource conflicts
- Cycle-level timing errors
- ...
Design scales
Now:
- Single FSM: ~12 bits of state, ~30 states
- Individual designer subsystem: ~50K gates, 10 FSMs
- Major subsystem: ~250K gates, 50 FSMs
- ASIC: ~2M gates
In a few years:
- 10-billion-transistor chips
- Lots of reusable IP
Properties
- Verification requires something to check.
- Properties can be represented in many ways: temporal logic, or checkers written in an HDL or another language. (A checker sketch follows below.)
- Properties can be specified at various points:
  - end-to-end (black-box) properties;
  - internal (white-box) properties [0-In].
- White-box properties are easier to check, because results don't have to be propagated to a system output.
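As an illustration of the "checkers in another language" option, here is a minimal sketch (not from the talk) of a white-box property written as executable code: a predicate over a simulated trace. The signal names req and ack and the helper check_req_ack are hypothetical.

    # Hypothetical sketch (not from the talk): a white-box checker expressed
    # as a predicate over simulation states, in the spirit of HDL monitors.

    def check_req_ack(trace, max_latency=4):
        """Check a simple temporal property on a list of per-cycle state
        dicts: every cycle where 'req' is high must be followed by 'ack'
        within max_latency cycles. Returns the first violating cycle or None."""
        for t, state in enumerate(trace):
            if state["req"]:
                window = trace[t + 1 : t + 1 + max_latency]
                if not any(s["ack"] for s in window):
                    return t  # property violated at cycle t
        return None

    # Usage: a trace produced by any simulator front end.
    trace = [{"req": 1, "ack": 0}, {"req": 0, "ack": 0}, {"req": 0, "ack": 1}]
    assert check_req_ack(trace) is None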
"Coverage" is the key concept
Maximize the probability of stimulating and detecting bugs, at minimum cost (in time, labor, and computation).
Outline
- General observations
- Conventional answers
- Semi-formal methods
- Conclusion
Simulation
- Simulation is the predominant verification method.
- Performed at gate level or register-transfer level (RTL).
- Test cases are manually defined or randomly generated.
Typical verification experience
[Chart: bugs found per week vs. weeks of functional testing, with tapeout and the "purgatory" period marked.]
Near-term improvements
- Faster simulators: compiled code, cycle simulation, emulation.
- Testbench authoring tools (Verisity, Vera (Synopsys)) make pseudo-random testing better and easier.
- Incremental improvements won't be enough.
Formal verification
- Ensures consistency with the specification for all possible inputs (equivalent to 100% coverage of... something).
- Methods: equivalence checking, model checking, theorem proving.
- Valuable, but not a general solution.
Equivalence checking
- Compares the high-level (RTL) description with the gate level.
- Gaining acceptance in practice.
  - Products: Abstract, Avant!, Cadence, Synopsys, Verplex, ...
  - Internal: Verity (IBM)
- But the hard bugs are usually in both descriptions: equivalence checking targets implementation errors, not design errors.
Model checking
- Enumerates all states of a state machine.
- Gaining acceptance, but not yet widely used.
  - Products: Abstract, Avant!, IBM, Cadence, ...
  - Internally supported at Intel, Motorola, ...
- Barrier: low capacity (~200 register bits).
- Requires extraction (of FSM controllers) or abstraction (of the design); both tend to cause costly false errors.
Theorem proving
- A theorem prover checks a formal proof: mostly detailed manual proofs, sometimes with some automatic help.
- Useful for:
  - verifying algorithms [Russinoff, AMD K7 floating point];
  - integrating verification results [Aagaard et al., DAC '98]: many parts of a big problem can be solved automatically, and the theorem prover ensures that the parts fit together with no gaps.
- Not a general solution (too hard!).
Outline
- General observations
- Conventional answers
- Semi-formal methods
  - Coverage measurement
  - Test generation
  - Symbolic simulation
  - Directed model checking
- Conclusion
Semi-formal methods
- Coverage measurement
- Test generation
- Symbolic simulation
- Model checking for bugs
How to make simulation smarter
IDEAL: comprehensive validation without redundant effort.
[Diagram, after Keutzer & Devadas: the conventional simulation loop (simulation driver, simulation engine, monitors) extended with novel components: symbolic simulation, coverage analysis, diagnosis of unverified portions, and vector generation feeding back into the driver.]
Coverage analysis: why?
IDEAL: comprehensive validation without redundant effort.
- What aspects of the design haven't been exercised? Guides vector generation.
- How comprehensive is the verification so far? A heuristic stopping criterion.
- Coordinate and compare separate sets of simulation runs, model checking, symbolic simulation, ...
- Helps allocate verification resources.
Coverage metrics
A metric identifies important structures in a design representation:
- HDL lines, FSM states, paths in a netlist;
- classes of behavior: transactions, event sequences.
Classification, by level of representation:
- Code-based metrics (HDL code)
- Circuit structure-based metrics (netlist)
- State-space-based metrics (state transition graph)
- Functionality-based metrics (user-defined tasks)
- Spec-based metrics (formal or executable spec)
Desirable scenario
IDEAL: direct correspondence with design errors, so that 100% coverage = all bugs of a certain type detected.
[Diagram: a spectrum of metrics, Metric 1 through Metric n, each scored from 0% to 100%, ranging from simple and cheap to elaborate and expensive.]
Desirable qualities of coverage metrics
IDEAL: direct correspondence with bugs.
PROBLEM: there is no good model for design errors.
- No analog of "stuck-at faults" for design errors; bugs are much harder to characterize formally.
- Difficult to prove that a metric is a good proxy for bugs.
Then why use metrics?
- We need to gauge the status of verification; metrics are heuristic measures of verification adequacy.
- Coverage-guided validation uncovers more bugs.
- We must look for empirical correlation with bug detection:
  - higher coverage -> higher chance of finding bugs;
  - ~100% coverage -> few bugs remain.
Desirable qualities of coverage metrics
- Direct correspondence with bugs.
- Ease of use:
  - tolerable overhead to measure coverage;
  - reasonable computational and human effort to interpret coverage data, achieve high coverage, and generate stimuli to exercise uncovered aspects;
  - minimal modification to the validation framework.
Every metric is a trade-off between these requirements.
Coverage metrics
- Code-based metrics
- Circuit structure-based metrics
- State-space-based metrics
- Functionality-based metrics
- Spec-based metrics
Code-based coverage metrics
On the HDL description:
- Line/code-block coverage
- Branch/conditional coverage
- Expression coverage
- Path coverage
- Tag coverage (more detail later)
A useful guide for writing test cases, with little overhead. A good start, but not sufficient: less than maximal code coverage means more testing is needed, and code coverage does not address concurrency. (A bookkeeping sketch follows below.)
Example (a mux with branches that are easy to leave unexercised):

    always @(a or b or s) begin  // mux
      if (~s && p)
        d = a;
      else if (s)
        d = b;
      else
        d = 'bx;
      if (sel == 1)
        q = d;
      else if (sel == 0)
        q = z;
    end
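To make the metrics concrete, here is a minimal sketch (not from the talk) of the bookkeeping behind branch coverage; it assumes a simulator hook that calls record() at each branch, and all names are hypothetical.

    # Minimal branch-coverage bookkeeping sketch (hypothetical): each
    # (location, taken/not-taken) outcome is recorded, and coverage is the
    # fraction of outcomes seen at least once.
    from collections import defaultdict

    class BranchCoverage:
        def __init__(self, branch_ids):
            self.hits = defaultdict(int)
            self.all_outcomes = {(b, t) for b in branch_ids for t in (True, False)}

        def record(self, branch_id, taken):
            self.hits[(branch_id, taken)] += 1

        def score(self):
            covered = {o for o in self.all_outcomes if self.hits[o] > 0}
            return len(covered) / len(self.all_outcomes)

    cov = BranchCoverage(["mux.if1", "mux.if2"])
    cov.record("mux.if1", True)   # a test exercised only the taken direction
    print(f"branch coverage: {cov.score():.0%}")   # 25%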
Code-based coverage metrics
Many commercial tools can handle large-scale designs:
- VeriCover (Veritools)
- SureCov (SureFire, now Verisity)
- Coverscan (DAI, now Cadence)
- HDLScore, VeriCov (Summit Design)
- HDLCover, VeriSure (TransEDA)
- Polaris (formerly CoverIt) (interHDL, now Avant!)
- Covermeter (ATC, now Synopsys)
- ...
Circuit structure-based metrics
- Toggle coverage: is each node in the circuit toggled? (A sketch follows below.)
- Register activity: is each register initialized? Loaded? Read?
- Counters: are they reset? Do they reach the max/min value?
- Register-to-register interactions: are all feasible paths exercised?
- Datapath-control interface: are all possible combinations of control and status signals exercised?
[Diagram: a controller FSM (s_init, s2-s6) driving a datapath.]
(0-In checkers have these kinds of measures.)
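A minimal sketch of toggle coverage (not from the talk), assuming per-cycle waveforms are available as Python lists; the node names are hypothetical.

    # Hypothetical sketch of toggle coverage: a node counts as covered once
    # it has been seen both rising (0->1) and falling (1->0) in simulation.
    def toggle_coverage(waveforms):
        """waveforms: dict mapping node name -> list of 0/1 values per cycle.
        Returns the fraction of nodes that toggled in both directions."""
        toggled = 0
        for node, values in waveforms.items():
            pairs = set(zip(values, values[1:]))
            if (0, 1) in pairs and (1, 0) in pairs:
                toggled += 1
        return toggled / len(waveforms)

    waves = {"clk_en": [0, 1, 0, 1], "stuck_lo": [0, 0, 0, 0]}
    print(toggle_coverage(waves))  # 0.5: 'stuck_lo' never toggled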
Circuit structure-based metrics
- A useful guide for test writers: intuitive and easy to interpret.
- Not sufficient by themselves; more of a sanity check.
- Difficult to determine whether a path is false, i.e., whether a given combination of assignments to variables is possible.
- A problem with all metrics: "Is ... coverable?" Ask the user, or use heuristics.
Design fault coverage
- During test, the faulty and original designs behave differently -> the fault is detected by a test.
- Use faults as a proxy for actual design errors. Faults are local mutations in:
  - HDL code,
  - the gate-level structural description (netlist), or
  - the state transition diagram of a finite state machine, ...
- COVERAGE: the fraction of faults detected by the test suite. (A sketch follows below.)
- Measurement methods are similar to fault simulation for manufacturing test.
[Abadir, Ferguson, Kirkland, TCAD '88] [Kang & Szygenda, ICCD '92] [Fallah, Devadas, Keutzer, DAC '98] ...
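A toy mutation-style sketch (not from the talk) of design-fault coverage: inject local faults into a model and count how many the test suite distinguishes from the original. The ALU model and the fault list are hypothetical.

    # Hypothetical mutation-style sketch: inject local faults into a
    # reference function and count how many the tests detect (output differs).
    def good_alu(op, a, b):
        return a + b if op == "add" else a - b

    # Local mutations standing in for design faults.
    faults = [
        lambda op, a, b: a - b if op == "add" else a + b,   # operators swapped
        lambda op, a, b: a + b,                             # 'sub' case dropped
    ]

    tests = [("add", 3, 4), ("sub", 10, 2)]

    detected = sum(any(f(*t) != good_alu(*t) for t in tests) for f in faults)
    print(f"design-fault coverage: {detected}/{len(faults)}")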
Design fault coverage: critique
- Various fault models have been considered:
  - gate (or input) omission/insertion/substitution;
  - wrong output or wrong next state for a given input;
  - error in an assignment on an HDL line.
- Fault models are motivated more by ease of use and definition; they are not really "common denominators" for design errors.
- Additional restrictions apply, e.g., a "single fault assumption".
- But they provide a fine-grained measure of how adequately the design is exercised and observed.
Observability
- Simulation detects a bug only if a monitor flags an error, or the design and a reference model differ on a variable.
- A portion of the design is covered only when:
  - it is exercised (controllability), and
  - a discrepancy originating there causes a discrepancy in a monitored variable (observability).
- Low observability gives a false sense of security: most of the design is exercised, which looks like high coverage, but most bugs are not detected by the monitors or reference model.
- Observability is missing from most metrics.
Tag coverage [Devadas, Keutzer, Ghosh '96]
- HDL code coverage metrics plus an observability requirement.
- Bugs are modeled as errors in HDL assignments.
- A buggy assignment may be stimulated, but its effect still missed. Examples:
  - a wrong value is generated speculatively, but never used;
  - a wrong value is computed and stored in memory, to be read 1M cycles later, but the simulation doesn't run that long.
Tag coverage [Devadas, Keutzer, Ghosh '96]
IDEA: tag each assignment with +Δ or -Δ, a deviation from the intended value; Δ is a symbolic stand-in for all positive magnitudes, so only the sign of the error is tracked.
- Run the simulation vectors, tagging one variable assignment at a time, and propagate tags using a tag calculus.
- EXAMPLE: A is assigned 1 and tagged +Δ; C = 4 - k*A with k > 0, so C carries -Δ; D = C + A then sees both +Δ and -Δ. (A sketch follows below.)
- Tag coverage: the subset of tags that propagate to observed variables. This confirms that a tag is both activated and its effect propagated.
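A minimal sketch of a sign-only tag calculus in the spirit of the slide's example; the encoding ('+', '-', 'unknown') and the helper names are assumptions, not the paper's notation.

    # Minimal sketch of a sign-only tag calculus (hypothetical encoding):
    # tags are '+', '-', or None; arithmetic propagates the sign of a
    # possible deviation, and opposing tags combine to 'unknown'.
    def add_tags(t1, t2):
        if t1 is None:
            return t2
        if t2 is None or t1 == t2:
            return t1
        return "unknown"          # +Δ meets -Δ: cancellation is possible

    def negate_tag(t):
        return {"+": "-", "-": "+", None: None, "unknown": "unknown"}[t]

    # The slide's example: A carries +Δ; C = 4 - k*A (k > 0) gives C a -Δ;
    # D = C + A sees both signs.
    tag_A = "+"
    tag_C = negate_tag(tag_A)     # multiplying by k > 0 preserves the sign
    tag_D = add_tags(tag_C, tag_A)
    print(tag_C, tag_D)           # '-' 'unknown'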
Tag coverage: critique
- Easily incorporated: can use commercial simulators, and the simulation overhead is reasonable.
- Easy to interpret: can identify what blocks propagation of a tag, and can use ATPG techniques to cover a tag.
- The error model doesn't directly address design errors, BUT it is a better measure of how well the design is tested than standard code coverage.
State-space-based metrics (FSM coverage)
- State, transition, or path coverage of a "core" FSM: a projection of the design onto selected variables.
- Control event coverage [Ho et al. '96, FLASH processor]: transition coverage for the variables controlling the datapath.
- Pair-arcs (introduced by 0-In): for each pair of controller FSMs, exercise all feasible pairs of transitions. Catches synchronization errors, resource conflicts, ... (a sketch follows below).
- Benjamin, Geist, et al. [DAC '99]: a hand-written abstract model of the processor.
- Shen, Abraham, et al.: extract the FSM for the "most important" control variable and cover all paths of a given length on this FSM.
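A minimal sketch of the pair-arcs idea (not 0-In's implementation): count the pairs of same-cycle transitions actually exercised across two FSM traces. The denominator here is only an upper bound, since it ignores infeasible pairs.

    # Hypothetical sketch of pair-arc coverage: coverage over pairs of
    # transitions, one from each of two controller FSMs, in the same cycle.
    def pair_arc_coverage(trace_a, trace_b):
        """trace_a, trace_b: per-cycle state sequences of two FSMs.
        Returns (pairs exercised, upper bound on possible pairs)."""
        arcs_a = set(zip(trace_a, trace_a[1:]))
        arcs_b = set(zip(trace_b, trace_b[1:]))
        seen_pairs = set(zip(zip(trace_a, trace_a[1:]),
                             zip(trace_b, trace_b[1:])))
        possible = len(arcs_a) * len(arcs_b)  # ignores infeasible pairs
        return len(seen_pairs), possible

    seen, possible = pair_arc_coverage(["idle", "req", "idle"],
                                       ["free", "busy", "free"])
    print(f"{seen} of up to {possible} pair-arcs exercised")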
State-space-based metrics
- Probably the most appropriate metrics for "bug coverage". Experience: rare FSM interactions cause difficult bugs, which are addressed best by multiple-FSM coverage.
- Trade-off: a sophisticated metric on a small FSM vs. a simple metric on a large FSM or multiple FSMs; the relative benefits are design dependent.
- Difficult to check whether something is coverable; this may require knowledge of the entire design.
- Most code-coverage companies also provide FSM coverage (automatic extraction, user-defined FSMs) with reasonable simulation overhead.
Functional coverage
- Define monitors, tasks, assertions, ...; check for specific conditions, activity, ...
- User-defined coverage [Grinwald et al., DAC '98] (IBM): the user defines "coverage tasks" in a simple language (first-order temporal logic plus arithmetic operators).
  - Snapshot tasks: a condition on events in one cycle.
  - Temporal tasks: refer to events over different cycles.
- User expressions (Covermeter), Vera, Verisity.
- Assertion synthesis (checkers) (0-In).
- Event Sequence Coverage Metrics (ESCMs) [Moundanos & Abraham, VLSI Test Symp. '98].
Functional coverage
- Good because it makes the designer think about the design in a different and redundant way.
- BUT:
  - it may require a lot of user effort (unless synthesized): the user needs to write monitors;
  - it may not test corner cases: designers will write monitors for the expected case;
  - it is design specific: monitors and assertions need to be re-defined for each new design.
Spec-based metrics
- Model-based metrics are weak at detecting missing functionality; the spec encapsulates the required functionality.
- Apply (generalize) design coverage metrics to a formal spec.
- PROBLEMS:
  - spec-based metrics alone may not exercise the design thoroughly;
  - the spec is often incomplete: two cases that look equivalent according to the spec may be implemented differently;
  - a formal spec may not exist for the unit being tested.
- Model-based and spec-based metrics complement each other.
Semi-formal methods
- Coverage measurement
- Test generation
- Symbolic simulation
- Model checking for bugs
Verification test generation
Approach: automatically generate tests that maximize coverage per simulation cycle. Automatic test generation is crucial for high productivity.
Tests can be generated:
- off-line: vectors saved in files, or
- on-line: vectors generated as you simulate them.
Specific topics:
- ATPG methods (design fault coverage)
- FSM-based methods (FSM coverage)
- Test amplification
ATPG methods
- Use a gate-level design fault model, perhaps just the standard stuck-at model.
- Generate tests automatically using ATPG (automatic test pattern generation) techniques, which take the "observability" of an error into account. (A brute-force sketch follows below.)
- Oriented towards combinational designs; a general solution would need sequential ATPG [hard].
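A brute-force sketch of stuck-at test generation on a toy two-gate netlist (not from the talk; real ATPG uses structural algorithms such as the D-algorithm or PODEM rather than enumeration).

    # Brute-force sketch of stuck-at test generation on a tiny combinational
    # netlist (hypothetical example). A test detects the fault only when the
    # good and faulty outputs differ, i.e., the fault is exercised AND observed.
    from itertools import product

    def circuit(a, b, c, stuck_n1=None):
        n1 = a & b if stuck_n1 is None else stuck_n1   # fault site
        return n1 | c

    def generate_test(stuck_value):
        for a, b, c in product([0, 1], repeat=3):
            if circuit(a, b, c) != circuit(a, b, c, stuck_n1=stuck_value):
                return (a, b, c)
        return None

    # For n1 stuck-at-0: need a=b=1 to exercise it and c=0 to observe it.
    print(generate_test(0))  # (1, 1, 0)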
FSM-based test generation
- Generate FSM tests using model-checking techniques (e.g., BDD-based or explicit search). (A sketch follows below.)
- Map each FSM test to a design test vector [hard!].
[Diagram: an abstract FSM above the concrete design; an FSM test must be mapped down to a design test.]
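A minimal explicit-state sketch (not from the talk) of generating an FSM test: breadth-first search for an input sequence that drives the abstract FSM through a target transition. The FSM and its input alphabet are hypothetical; mapping the result to design vectors remains the hard part noted above.

    # Hypothetical sketch: explicit-state BFS over an abstract FSM to find
    # an input sequence that exercises an uncovered transition (arc).
    from collections import deque

    def find_path(fsm, init, target_arc):
        """fsm: dict (state, input) -> next_state; target_arc: (src, dst).
        Returns an input sequence driving the FSM through target_arc."""
        queue = deque([(init, [])])
        visited = {init}
        while queue:
            state, inputs = queue.popleft()
            for (s, i), nxt in fsm.items():
                if s != state:
                    continue
                if (s, nxt) == target_arc:
                    return inputs + [i]
                if nxt not in visited:
                    visited.add(nxt)
                    queue.append((nxt, inputs + [i]))
        return None

    fsm = {("idle", "go"): "busy", ("busy", "done"): "idle"}
    print(find_path(fsm, "idle", ("busy", "idle")))  # ['go', 'done']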
Test vector mapping
- The user defines mapping rules from FSM events to input vectors [Ho PhD, Stanford 1996; Geist et al., FMCAD '96]. The mapping must be relatively simple.
- Alternatively, map to test vectors automatically using sequential ATPG techniques [Moundanos et al., IEEE TOC, Jan. 1998].
- Published examples are small.
Coverage-driven search [Ganai, Aziz, Kuehlmann, DAC '99]
- Identify signals that were not toggled in user tests.
- Attempt to solve for inputs in the current cycle that will make a signal toggle, using BDDs and ATPG methods.
- A similar approach could be taken for other coverage metrics.
- The general problem is controllability (as in FSM coverage).
Test amplification
Approach: leverage interesting behavior generated by the user; explore behavior "near" user tests to catch near misses.
Many methods could be used:
- Satisfiability
- BDDs
- Symbolic simulation
(Compare 0-In Search: "formal += simulation".)
Semi-formal methods
- Coverage measurement
- Test generation
- Symbolic simulation
- Model checking for bugs
Symbolic simulation
Approach: get a lot of coverage from a few simulations.
- Inputs are variables or expressions, and an operation may compute an expression instead of a value.
- Advantage: more coverage per simulation; one expression can cover a huge set of values.
- Example: an adder fed "a" and "b - c" computes the expression "a + b - c". (A sketch follows below.)
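A minimal sketch of the idea (not from the talk): represent values as expressions so that one simulation run stands for many concrete tests. The string-based Sym class is a deliberate toy; real tools use BDDs or other canonical forms, as the next slide notes.

    # Minimal sketch of symbolic simulation (hypothetical encoding): values
    # are expression strings rather than numbers, so one run covers many tests.
    class Sym:
        def __init__(self, expr):
            self.expr = expr
        def __add__(self, other):
            return Sym(f"({self.expr} + {other.expr})")
        def __sub__(self, other):
            return Sym(f"({self.expr} - {other.expr})")
        def __repr__(self):
            return self.expr

    # The slide's adder: inputs "a" and "b - c" yield "a + (b - c)".
    a, b, c = Sym("a"), Sym("b"), Sym("c")
    result = a + (b - c)
    print(result)   # (a + (b - c)) -- stands for every concrete (a, b, c)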
BDD-based symbolic simulation
- Symbolic expressions are represented as BDDs.
- Symbolic trajectory evaluation (STE): a special logic for specifying input/output tests; used at the MOS transistor or gate level.
  - COSMOS [Bryant, DAC '90] (freeware), Voss [Seger]; used at Intel and Motorola.
- Transistor and RTL simulation: Innologic (commercial).
Higher-level symbolic simulation
- Symbolic simulation doesn't have to be bit-level.
- RTL symbolic simulation can have built-in datatypes for bitvectors, integers (linear inequalities), and arrays.
- Especially useful when combined with an automatic decision procedure for these constructs [Barrett et al., FMCAD '96, DAC '98]. (A sketch follows below.)
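A hypothetical modern sketch of this combination, using the Z3 SMT solver as the decision procedure (the talk's references predate Z3, but the role is the same): symbolically compare two RTL-style bitvector fragments.

    # Hypothetical sketch: a decision procedure (here, the Z3 SMT solver)
    # discharges a symbolic-simulation query over 8-bit bitvectors.
    from z3 import BitVec, Solver, sat

    a = BitVec("a", 8)

    out1 = a + a        # addition-based doubling
    out2 = a << 1       # shift-based doubling

    s = Solver()
    s.add(out1 != out2)             # search for a distinguishing input
    if s.check() == sat:
        print("counterexample:", s.model())
    else:
        print("fragments equivalent for all 8-bit inputs")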
Semi-formal verification using symbolic simulation
- Symbolic simulation is a tool that can be used for full or partial formal verification. Many papers are about full formal verification, but the tools naturally encourage partial verification.
- Partial verification:
  - use constants for some inputs, converting variables to constants "on the fly" [Innologic];
  - start from a constant state and simulate a few cycles with symbolic inputs;
  - may miss states with errors.
- Example: Robert Jones's PhD thesis (Stanford/Intel): symbolic simulation of the retirement logic of the Pentium Pro.
Semi-formal methods
- Coverage measurement
- Test generation
- Symbolic simulation
- Model checking for bugs
Partial model checking
When the BDD starts to blow up, delete part of the state space:
- High-density BDDs [Ravi, Somenzi, ICCAD '95]: keep the state-space subset that maximizes state count per BDD size.
- Prune BDDs using multiple-FSM coverage ("saturated simulation") [Aziz, Kukula, Shiple, DAC '98].
Prioritized model checking: use best-first search for assertion-violation states; useful with BDD-based or explicit model checking. Metrics (see the sketch below):
- Hamming distance [Yang, Dill, HLDVT '96; Yuan et al., CAV '97]
- "Tracks" [Yang & Dill, DAC '98]
- Estimated probability of reaching the target state in a random walk [Kuehlmann, McMillan, Brayton, ICCAD '99]
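A minimal sketch of prioritized search with a Hamming-distance metric (not the cited implementations): states closest to the assertion-violating target are expanded first. The toy transition relation flips one bit per step.

    # Hypothetical sketch of prioritized (best-first) model checking: expand
    # the state with the smallest Hamming distance to the target first.
    import heapq

    def hamming(s, t):
        return sum(a != b for a, b in zip(s, t))

    def best_first(init, target, successors):
        """States are bit tuples; successors(state) yields next states."""
        frontier = [(hamming(init, target), init, [init])]
        visited = {init}
        while frontier:
            _, state, path = heapq.heappop(frontier)
            if state == target:
                return path
            for nxt in successors(state):
                if nxt not in visited:
                    visited.add(nxt)
                    heapq.heappush(
                        frontier, (hamming(nxt, target), nxt, path + [nxt]))
        return None

    # Toy transition relation: flip any single bit each step.
    def successors(state):
        for i in range(len(state)):
            yield state[:i] + (1 - state[i],) + state[i + 1:]

    print(best_first((0, 0, 0), (1, 1, 1), successors))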
Comments on model checking for bugs
- The topic is not mature.
- Published examples are small.
- Big increases in capacity are needed.
Outline
- General observations
- Conventional answers
- Semi-formal methods
- Research issues
- Conclusion
Research methodology
Research in this area is empirical, so the "scientific method" is important!
- How do we measure success (can it find bugs?)?
- What do we use for controls? What is the "null hypothesis"?
- Apparent effectiveness depends on:
  - design methodology (language, processes);
  - type of design;
  - designer style, training, and psychology;
  - size of design!
- Design examples need to be large, realistic, and varied.
State of the art
- Research and product development are immature.
- There are many ideas; experiments are encouraging, but not conclusive.
- No clear winner has emerged.
- Commercial products are on the way, but there are no clear winners (yet).
Coverage vs. scale
[Chart, based on published papers: coverage achieved vs. design scale (from a single FSM through 50K, 250K, and 2M gates) for model checking, symbolic simulation, FSM-based test generation, manual tests with coverage measurement, and random simulation.]
The future
How can we verify huge systems with many reusable components? System-level simulation won't find bugs efficiently enough.
Maybe: vendors help with semi-formal verification by supplying designs with:
- checkers, both inside the design and at its interfaces, plus environmental constraints;
- information about the component: coverage info (e.g., conditions to trigger) and hints for efficient vector generation.
Predictions
- This is going to be an important area: many papers, verification products.
- Simulation and emulation will continue to be heavily used.
- Formal verification will be crucial when applicable: special application domains such as protocols, FSMs, and floating point. Design for verification would increase its scope.
Web page
http://verify.stanford.edu