Presentation on theme: "A. Gotmanov, S. Chatterjee, M. Kishinevsky 1 Proving Absence of Deadlocks in Hardware Alexander Gotmanov, Satrajit Chatterjee and Mike Kishinevsky Intel."— Presentation transcript:
A. Gotmanov, S. Chatterjee, M. Kishinevsky 1 Proving Absence of Deadlocks in Hardware Alexander Gotmanov, Satrajit Chatterjee and Mike Kishinevsky Intel Corp., Strategic CAD Labs (SCL)
A. Gotmanov, S. Chatterjee, M. Kishinevsky 2 Outline Basics of hardware verification Communication fabrics – Modeling – Verification
A. Gotmanov, S. Chatterjee, M. Kishinevsky 3 Basics of hardware modeling and verification
A. Gotmanov, S. Chatterjee, M. Kishinevsky 4 Hardware vs. Software Software Thread parallelism “Infinite” state Termination Asynchronous Easy to fix Hardware Device parallelism Finite state Unbound execution Synchronous Impossible to fix X F in out
Describing Hardware RTL = Register Transfer Level –Registers –Wires –Finite data types –Expressions –Modularity –Initial state(s) –No final state X u := (v+1) * a v u a b := (2 <= v) & (v <= 5) b Circuit X – Boolean state I(X) – initial predicate T(X,X’) – transition relation
Correctness Equivalence verification –M1, M2 are models –Is it true that f M1 = f M2 ? When hardware model is correct? Property verification –Make statements about behaviors of the model M that are expected to hold –Is it true that they hold in every execution of M?
Linear Temporal Logic (LTL) s1s2s3s4s5s6s7 LTL is a formal language to reason about executions of a state machine in time … Execution = infinite sequence of consecutive states Predicate = statement about a state s42 p ¬q p is true in s42, while q is false
Temporal operators / 1 s1s2s3s4s5s6s7 pp G(p)G(p) ppppp … s1s2s3s4s5s6s7 ¬p p--- … F(p)F(p) “always p” “eventually p” LTL is a formal language to reason about executions of a state machine in time
Temporal operators / 2 s1s2s3s4s5s6s7 ¬p F(G(p)) p¬pppp … s1s2s3s4s5s6s7 ¬pp p p … G(F(p)) “eventually G(p)” “always F(p)” G(p)G(p) F(p)F(p)F(p)F(p)F(p)F(p)F(p)F(p)F(p)F(p)F(p)F(p)F(p)F(p) p is satisfied infinitely often in the execution Plus all propositional connectives: “·”, “+”, “→”, “¬”, … LTL is a formal language to reason about executions of a state machine in time
Safety and Liveness Safety property (invariant) –G(P(X)) –P(X) is always true Liveness property –GF(P(X)) –P(X) is satisfied infinitely often
Guarantee and Persistence Guarantee property –F(P(X)) –P(X) becomes true Persistence property –FG(P(X)) –P(X) becomes “stuck” at true
Property verification / 1 Simulation (~debugging) –Applied everywhere –Check that the model is correct for a (small) subset of possible behaviors –Random / semi-random / trace simulation –Need a testbench and checkers
Property verification / 2 Compliance monitors –How to choose the “right” properties? –Assuming that all properties are true, prove “correctness theorem” –Example: proving producer-consumer ordering for PCI Express
Property verification / 3 Small-scale formal verification –Applied at module level, late in design cycle –Use RTL description –Prove properties –Bit-level verification –Bounded proofs –Complete proofs Perfecting model checking algorithms
Property verification / 4 Large-scale formal verification –Applied at system level, early in design cycle –Build an abstract model (HLM) –TLA+ –Murphi –SMV –xMAS –Prove properties –SMT & bit-level –Theorem proving –Strengthening –Automatic proof
Property verification / 2 HLM-to-RTL correlation –Demonstrate that RTL is compliant with its High Level Model –Simulation –Generate compliance checkers –Formal verification –(No practical examples)
Reachability analysis / 1 Construct step-wise reachability sets –R 0, R 1, R 2, … –R 0 (X) = I(X) –R i+1 (X’) = R i (X) + X. R i (X) & T(X,X’) Need to prove: –G(P(X)) For some k, R k = R k+1 = R Check that R(X) → P(X). Q.E.D. X – Boolean state I(X) – initial predicate T(X,X’) – transition relation
Reachability analysis / 2 Implementation options –Explicit state enumeration –Reduced Ordered BDD –And-Inverter Graphs + SAT –Formulas + SMT Bounded model checking –Partial reachability analysis –Unable to reach k, where R k = R k+1 –Still have “bounded” proof that P(X) is true for first n cycles of execution (n < k)
Induction / 1 P(X) is called inductive, iff –I(X) → P(X) –P(X) & T(X,X’) → P(X ’ ) Can we prove G(P(X)) without checking all reachable states? Can we prove 2 n ≤ n! + 100, n ≥ 0, without checking all non-negative n? Theorem: if P(X) is inductive then G(P(X)) is true –Two SAT (SMT) checks to establish inductivity
Induction / 2 Theorem: if G(P(X)) is true, there exists inductive Q(X), such that Q(X) → P(X) What if P(X) is not inductive? “Local” properties are usually not inductive G(y = 1) y = 1 → y’ = 1 G(y = 1 & x = 1) is inductive
Other bit-level approaches Interpolation [K. McMillan, CAV 2003] –Property-oriented –Based on Craig’s interpolation theorem IC3 [A. Bradley, VMCAI 2011] –Approximate reachability –Incremental inductive strengthening ABC [A. Mischenko, R. Brayton, UC-Berkeley] –Synthesis for verification –Parallel model checking
Equivalence verification Correctness of synthesis algorithms −Prove that optimized circuit is equivalent to the original Equivalence can be expressed as a property X1 F1 in X2 F2 z = Assert(z == 1);
A. Gotmanov, S. Chatterjee, M. Kishinevsky 23 Communication fabrics: modeling Communication Fabric = logic connecting different agents on a chip
A. Gotmanov, S. Chatterjee, M. Kishinevsky 24 Verification is not optional Tricky, distributed interaction –deadlock –starvation Modular verification doesn’t work Problems seen late –during integration –late RTL, in Silicon
A. Gotmanov, S. Chatterjee, M. Kishinevsky 25 Our Approach: High-Level Modeling Build early, abstract models of the microarchitecture Goal: (Automatically) prove correctness by exploiting high-level information provided by the model
A. Gotmanov, S. Chatterjee, M. Kishinevsky 26 xMAS Modeling Framework Model is composed from a small set of primitives primitives are formally specified A simple example friendly to microarchitects Networks where packets flow under backpressure naturally concurrent packets are conserved
A. Gotmanov, S. Chatterjee, M. Kishinevsky 27 xMAS Semantics is short-hand for Each primitive is a synchronous (RTL) module Different modules are connected by channels channel
A. Gotmanov, S. Chatterjee, M. Kishinevsky 28 xMAS Semantics is short-hand for Primitives are parameterized Example k k (x # x+42)
A. Gotmanov, S. Chatterjee, M. Kishinevsky 29 A Simple Example xx x xx cycle 1cycle 2 cycle 3cycle 6… cycle 7cycle 8
A. Gotmanov, S. Chatterjee, M. Kishinevsky 30 Agent P Agent Q Fabric A more complex example kk kk kk kk ~~ ~~ k k kk agent Q P x # x=req x # rsp x # fabric
A. Gotmanov, S. Chatterjee, M. Kishinevsky 31 Modeling and Verification Flow C++ xMAS API (Library) exec- utable by hand compile model.v + SVA random test- bench.v run VCD Model Checker simulate (RTL Simulator) abc Proof (from architects) Other backends to SystemC, TLA, Murphi, etc. possible Additional Formal Analysis To make this easier
A. Gotmanov, S. Chatterjee, M. Kishinevsky 32 Safety proofs
A. Gotmanov, S. Chatterjee, M. Kishinevsky 33 Channel Properties 44 Although this property is obviously true: ―Interpolation (in ABC) takes 10 mins to prove this! ―(Usually explicit or BDD-based reachability not an option ) (queue 1)(queue 2) Example of channel property: G (z.irdy (z.data = 0)) A channel property checks that all packets on a channel satisfy some condition ―e.g. all packets received by an agent have correct dest_id
A. Gotmanov, S. Chatterjee, M. Kishinevsky 34 Inductive Strengthening / 1 Target channel property: G (z.irdy (z.data = 0)) 44 (queue 1)(queue 2) Use high-level structure to add invariants Invariant 1: Invariant 2: G (y.irdy (y.data = 0)) Invariant 3: G (used j (mem j = 0)) (For queue 1) G (used j (mem j = 0)) If location j in queue 2 is in use, it must contain 0 This set of invariants is inductive!
A. Gotmanov, S. Chatterjee, M. Kishinevsky 35 Inductive Strengthening / 2 Similar propagation for other primitives Property: G (z.irdy (z.data = 7)) Invariant: G (y.irdy ((y.data+7)=7)) The invariant is obtained syntactically (no need to invert function) Example: Propagation across function primitive k xyz v # v+7 0 (6-bit)
A. Gotmanov, S. Chatterjee, M. Kishinevsky 36 Non-blocking property non-blocking property on channel r: G (r.irdy r.trdy) Intuition: If a packet is in r there is room in the ingress queue (useful to reason about liveness) Hard property to prove: Not inductive!
A. Gotmanov, S. Chatterjee, M. Kishinevsky 37 Strengthening non-blocking properties (Inductive) Invariant: G (num i + num c = num o ) num i num c num o non-blocking property on channel r: G (r.irdy r.trdy)
A. Gotmanov, S. Chatterjee, M. Kishinevsky 38 How to find num i + num c = num o automatically? num i num c num o Eliminate λs using modified Gaussian Elimination Introduce λ variables to count transfers on each channel
A. Gotmanov, S. Chatterjee, M. Kishinevsky 39 Practical Experience More robust than model checking –model checking fails on trivial examples More automation than theorem proving –hundreds of invariants generated automatically to make it inductive –invariants for non-blocking can have 20+ num terms each
A. Gotmanov, S. Chatterjee, M. Kishinevsky 40 Liveness proofs
A. Gotmanov, S. Chatterjee, M. Kishinevsky 41 Channel protocol Xfer irdy = 1 trdy = 1 Inactive irdy = 0 trdy = 0 Initiator is ready, waiting for target Target is ready, waiting for initiator FwdRetry irdy = 1 trdy = 0 BwdRetry irdy = 0 trdy = 1 A retry state either persists or ends in transfer. Idle (irdy=0) Blocked (trdy=0)
Deadlock in xMAS Dead(u) = F(u.irdy · G¬u.trdy) Deadlock in xMAS is defined for a channel Dead(u) = “eventually a packet arrives at input of u, but output of u never accepts it” irdy data trdy Live(u) = ¬Dead(u) = G(u.irdy → Fu.trdy)
Example GF(w.trdy) Live(u) Dead(u) FG(¬w.trdy) Sink is fair Sink is not fair
Fairness Constraints Fairness constraint: GF(x.trdy) = ¬FG(¬w.trdy) Fair = conjunction of all fairness constraints Execution of xMAS model is fair if it satisfies all fairness constraints
State of the art ABC (liveness-to-safety translation) –Reduces liveness problem to equivalent safety problem by circuit transformation Doubles number of flops –BMC can reveal “shallow” deadlocks –Induction, interpolation, etc. –Does not scale well to xMAS models with 10s of queues Example Simple Messaging Fabric with Ordering 75 primitives (24 queues) ABC Proof – infeasible. BMC – 29 frames in 1hr. Our method Proof – 4s.
Idle and Block Dead(u) = ¬Idle(u) · Block(u) “u is in deadlock iff u stuck at blocked, but not stuck at idle” Idle(u) = FG(¬u.irdy) “u (eventually) stuck at idle” Block(u) = FG(¬u.trdy)“u (eventually) stuck at blocked” Fair = ¬Block(w) “sink is fair iff w not stuck at blocked”
Equations for xMAS queue Idle(u) Block(u) Idle(v) Block(v) Full(q) Empty(q) Full(q) = FG(q.num=k) “q (eventually) stuck at full” Empty(q) = FG(q.num=0) “q (eventually) stuck at empty” Block(u) = Full(q) “u stuck at blocked iff q stuck at full” Full(q) → Block(v) “if q stuck at full then v stuck at blocked” Other equations: Idle(v) = Empty(q), Empty(q) → Idle(u), ¬Idle(u) · Block(v) → Full(q), etc. Similar equations for all xMAS primitives. u.trdy = (q.num < k) v.irdy = (q.num > 0) … Equations propagate knowledge about Idle and Block conditions through a queue.
Liveness proof Full(q 1 ) Full(q 2 ) Block(v) Block(w) Block(u) = Full(q 1 ) Full(q 1 ) → Block(v) Block(v) = Full(q 2 ) Full(q 2 ) → Block(w) Fair = ¬Block(w) (true for all fair executions) Dead(u) · DeadEq · Fair = 0 in propositional logic DeadEq (true for every execution) There is no fair execution of the model, where channel u is dead. Dead(u) = ¬Idle(u) · Block(u) Q.E.D. ¬Block(w)
Method overview As ¬Idle · Block LTL theorems, describing how Idle and Block propagate through primitives As ¬Block Dead(u) DeadEq (per primitive) Fair UNSAT = Liveness proof SAT = Counterexample + + May return unreachable executions as counterexamples Rule out with over approximate reachability using automatically generated invariants. Can’t comprehend message-dependent deadlocks with Idle and Block only Capture properties of data with additional Prop conditions. (details in the backup)
Experimental results: proof ExperimentNPrims (NQueues) Time for ABC (proof) Num. of BMC frames in 1hr Time for our method Time % (DE+SAT) Time for safety Two queues4 (2)<1s>100<1s- SMF = simple messaging fabric 57 (20)N/A>1002s-<1s SMF with ordering75 (24)N/A294s-2s IOF1 = IO fabric91 (19)N/A152s-3s IOF2293 (58)N/A1414s53%+47%1min IOF3480 (92)N/A1112min94%+6%9min IOF4 (minimal)493 (97)N/A182.5min85%+15%28s IOF4 (non-minimal)493 (97)N/A203min65%+35%3.5hr IOF5 (minimal)680 (131)N/A1540min97%+3%2min IOF5 (non-minimal)680 (131)N/A2044min94%+6%10hr+ No proofs with ABC, when number of queues is in 10s.
Conclusion Verification technology to reduce liveness of xMAS model to satisfiability in propositional logic Can get proofs for models with 100s of queues Open problem: method requires extension for models with non-restricted joins
A. Gotmanov, S. Chatterjee, M. Kishinevsky 52 References “Quick Formal Modeling of Communication Fabrics to Enable Verification”, HLDVT 2010 “Automatic Generation of Inductive Invariants from High-Level Microarchitectural Models of Communication Fabrics”, CAV 2010 “Verifying Deadlock-Freedom of Communication Fabrics”, VMCAI 2011
A. Gotmanov, S. Chatterjee, M. Kishinevsky 53 Thank you
A. Gotmanov, S. Chatterjee, M. Kishinevsky 54 Backup
A. Gotmanov, S. Chatterjee, M. Kishinevsky 55 Careful with Joins G (z.irdy (z.data = 42)) Hard case Easy case Most joins like this in practice (used for synchronization) ( x, y ) # x + y x y z ( x, y ) # y x y z
Adding invariants / 1 Dead(a) Fair = ¬Block(g) Dead(a) · DeadEq · Fair has a solution Solution is unreachable execution with q 1 stuck at empty and q 2, q 3 stuck at full The execution contradicts with model flow invariant (automatically generated): q 1.num = q 2.num + q 3.num Full(q 2 ) Full(q 3 ) Empty(q 1 ) (q1.num = 0) → ((q2.num = 0)·(q3.num = 0)) Deadlock equations – about executions Flow invariants – about instantaneous state ?
Adding invariants / 2 Dead(a) Fair = ¬Block(g) Add equations connecting Idle, Block, Empty, Full, etc. with instantaneous state of the model (q.num, irdy, trdy, etc.): Empty(q) → (q.num = 0) Full(q) → (q.num = k) Idle(u) → (u.irdy = 0) … Full(q 2 ) Full(q 3 ) Empty(q 1 ) InfEq (eventually true for every execution) Add any instantaneous invariants (Inv), solve Dead(u) · DeadEq · Fair · Inv · InfEq for propositional satisfiability UNSAT Live
Method revisited As ¬Idle · Block LTL theorems, describing how Idle and Block propagate through primitives As ¬Block Dead(u) DeadEq (per primitive) Fair UNSAT = Liveness proof SAT = Counterexample + + InfEq + Connect Idle, Block, etc. with instantaneous state Inv + Automatically generated instantaneous invariants
Data dependencies / 1 Idle(u) Block(u) Idle(v) Block(v) Idle(w) Block(w) a(x) b(x) Idle(v) depends on what kind of data is coming from input channel u Idle(v) = Idle(u) + “all packets are going to the bottom branch” Prop p(x) (u) = FG(u.irdy → p(u.data))“u (eventually) stuck at p(x)” Idle(v) = Idle(u) + Prop ¬a(x) (u) “v stuck at idle, iff u stuck at idle or u stuck at ¬a(x)” Prop conditions in addition to Idle and Block, etc. Add equations to propagate Prop conditions through the model u.trdy = u.irdy · (a(u.data) · v.trdy + + b(u.data) · w.trdy) v.irdy = u.irdy · a(u.data) w.irdy = u.irdy · b(u.data) This method is a recent improvement on the one in the paper.
Data dependencies / 2 Prop p(x) (v) Prop p(x) (v) = Prop p(f(x)) (u) “v stuck at p(x) iff u stuck at p(f(x))” Prop p(f(x)) (v) Number of all possible Prop conditions is (number of channels) x (average number of predicates per channel). Linear from size of the model. May be exponential from width of data on channels. Explosion is avoidable in practice Compute a limited subset of Prop conditions. Start with Prop conditions used in equations for all switches. Add new Prop conditions with propagation. Stop propagation, if Prop predicate is repeated or becomes a tautology.
Method revisited (again) As ¬Idle · Block LTL theorems, describing how Idle and Block propagate through primitives As ¬Block Dead(u) DeadEq (per primitive) Fair UNSAT = Liveness proof SAT = Counterexample + + InfEq + Connect Idle, Block, etc. with instantaneous state Inv + Automatically generated instantaneous invariants and Prop Find a set of Prop conditions to use
Experimental results: finding a bug ExperimentNPrims (NQueues)Time for ABC (BMC) Time for our method 2 Master/Slave agents (no credit logic) 16 (2)6s<1s SMF1 (unfair sink) 57 (20)4s2s SMF1 (broken credits logic) 57 (20)8s2s SMF2 (unfair sink) 75 (24)1hr3s SMF2 (broken ordering logic) 75 (24)4hr3s Deadlocks found with our method can be unreachable (due to over- approximation). In experiments, all found deadlocks were reachable.