4Hardware vs. Software X F in out Software Thread parallelism “Infinite” stateTerminationAsynchronousEasy to fixHardwareDevice parallelismFinite stateUnbound executionSynchronousImpossible to fixXFinout
5b := (2 <= v) & (v <= 5) Describing HardwareRTL = Register Transfer LevelRegistersWiresFinite data typesExpressionsModularityInitial state(s)No final stateXu := (v+1) * avuab := (2 <= v) & (v <= 5)bCircuitX – Boolean stateI(X) – initial predicateT(X,X’) – transition relation
6Correctness When hardware model is correct? Property verification Make statements about behaviors of the model M that are expected to holdIs it true that they hold in every execution of M?Equivalence verificationM1, M2 are modelsIs it true that fM1 = fM2?
7Linear Temporal Logic (LTL) LTL is a formal language to reason about executions of a state machine in timeExecution = infinite sequence of consecutive statess1s2s3s4s5s6s7…Predicate = statement about a statep¬qp is true in s42, while q is falses42
8Temporal operators / 1LTL is a formal language to reason about executions of a state machine in timeppppppps1s2s3s4s5s6s7G(p)…“always p”¬p¬p¬pp---F(p)s1s2s3s4s5s6s7…“eventually p”
9Temporal operators / 2LTL is a formal language to reason about executions of a state machine in timePlus all propositional connectives:“·”, “+”, “→”, “¬”, …¬p¬pp¬ppppF(G(p))s1s2s3s4s5s6s7…“eventually G(p)”G(p)¬pp¬p¬pp¬ppG(F(p))s1s2s3s4s5s6s7…“always F(p)”p is satisfied infinitely often in the executionF(p)F(p)F(p)F(p)F(p)F(p)F(p)
10Safety and Liveness Safety property (invariant) Liveness property G(P(X))P(X) is always trueLiveness propertyGF(P(X))P(X) is satisfied infinitely often
11Guarantee and Persistence Guarantee propertyF(P(X))P(X) becomes truePersistence propertyFG(P(X))P(X) becomes “stuck” at true
12Property verification / 1 Simulation (~debugging)Applied everywhereCheck that the model is correct for a (small) subset of possible behaviorsRandom / semi-random / trace simulationNeed a testbench and checkers
13Property verification / 2 Compliance monitorsHow to choose the “right” properties?Assuming that all properties are true, prove “correctness theorem”Example: proving producer-consumer ordering for PCI Express
14Property verification / 3 Small-scale formal verificationApplied at module level, late in design cycleUse RTL descriptionProve propertiesBit-level verificationBounded proofsComplete proofsPerfecting model checking algorithms
15Property verification / 4 Large-scale formal verificationApplied at system level, early in design cycleBuild an abstract model (HLM)TLA+MurphiSMVxMASProve propertiesSMT & bit-levelTheorem provingStrengtheningAutomatic proof
16Property verification / 2 HLM-to-RTL correlationDemonstrate that RTL is compliant with its High Level ModelSimulationGenerate compliance checkersFormal verification(No practical examples)
18Reachability analysis / 2 Implementation optionsExplicit state enumerationReduced Ordered BDDAnd-Inverter Graphs + SATFormulas + SMTBounded model checkingPartial reachability analysisUnable to reach k, where Rk = Rk+1Still have “bounded” proof that P(X) is true for first n cycles of execution (n < k)
19Induction / 1Can we prove G(P(X)) without checking all reachable states?Can we prove 2n ≤ n! + 100, n ≥ 0, without checking all non-negative n?P(X) is called inductive, iffI(X) → P(X)P(X) & T(X,X’) → P(X’)Theorem: if P(X) is inductive then G(P(X)) is trueTwo SAT (SMT) checks to establish inductivity
20Induction / 2 What if P(X) is not inductive? Theorem: if G(P(X)) is true, there exists inductive Q(X), such that Q(X) → P(X)“Local” properties are usually not inductiveG(y = 1 & x = 1)is inductiveG(y = 1)y = 1 → y’ = 1
21Other bit-level approaches Interpolation [K. McMillan, CAV 2003]Property-orientedBased on Craig’s interpolation theoremIC3 [A. Bradley, VMCAI 2011]Approximate reachabilityIncremental inductive strengtheningABC [A. Mischenko, R. Brayton, UC-Berkeley]Synthesis for verificationParallel model checking
22Equivalence verification Correctness of synthesis algorithmsProve that optimized circuit is equivalent to the originalEquivalence can be expressed as a propertyzAssert(z == 1);X1F1F2X2=in
23Communication fabrics: modeling Communication Fabric = logic connecting different agents on a chip
24Verification is not optional Tricky, distributed interactiondeadlockstarvationModular verification doesn’t workProblems seen lateduring integrationlate RTL, in Silicon
25Our Approach: High-Level Modeling Build early, abstract models of the microarchitectureGoal: (Automatically) prove correctness by exploiting high-level information provided by the model
26xMAS Modeling Framework Model is composed from a small set of primitivesprimitives are formally specifiedfriendly to microarchitectsNetworks where packets flow under backpressurenaturally concurrentpackets are conservedA simple example
27xMAS Semantics is short-hand for Each primitive is a synchronous (RTL) moduleDifferent modules are connected by channelsis short-hand forchannel
28Primitives are parameterized xMAS SemanticsPrimitives are parameterizedis short-hand fork(x#+42)Example
29A Simple Example x x cycle 1 cycle 2 x x x cycle 3 … cycle 6 cycle 7
30Agent P Fabric Agent Q ~ # k agent Q P x = req rsp fabric 2 A more complex example
31Modeling and Verification Flow (from architects)To make this easierAdditional Formal Analysisby handModel CheckerProofC++exec- utablerandom test- bench.vmodel.v+SVAabccompilerunsimulate(RTL Simulator)xMAS API(Library)VCDOther backends to SystemC, TLA, Murphi, etc. possible
33Channel PropertiesA channel property checks that all packets on a channel satisfy some conditione.g. all packets received by an agent have correct dest_id44Example of channel property:G (z.irdy (z.data = 0))(queue 1)(queue 2)Although this property is obviously true:Interpolation (in ABC) takes 10 mins to prove this!(Usually explicit or BDD-based reachability not an option )
34Inductive Strengthening / 1 Use high-level structure to add invariants44Target channel property:G (z.irdy (z.data = 0))(queue 1)(queue 2)Invariant 1:G (usedj (memj = 0))If location j in queue 2 is in use, it must contain 0Invariant 2:G (y.irdy (y.data = 0))Invariant 3:G (usedj (memj = 0)) (For queue 1)This set of invariants is inductive!
35Inductive Strengthening / 2 Similar propagation for other primitivesExample: Propagation across function primitivekxyzv#+7(6-bit)Property: G (z.irdy (z.data = 7))Invariant: G (y.irdy ((y.data+7)=7))The invariant is obtained syntactically (no need to invert function)
36Non-blocking property non-blocking property on channel r: G (r.irdy r.trdy)Intuition: If a packet is in r there is room in the ingress queue(useful to reason about liveness)Hard property to prove: Not inductive!
37Strengthening non-blocking properties numonumcnuminon-blocking property on channel r: G (r.irdy r.trdy)(Inductive) Invariant: G (numi + numc= numo)
38How to find numi + numc= numo automatically? Introduce λ variables to count transfers on each channelnumonumcnumiEliminate λs using modified Gaussian Elimination
39Practical Experience More robust than model checking model checking fails on trivial examplesMore automation than theorem provinghundreds of invariants generated automatically to make it inductiveinvariants for non-blocking can have 20+ num terms each6x6 Mesh: validated global routing in minuteseach router has a static local routing functionchannel properties to check that a router never receives packets it doesn’t know how to route100+ simultaneous transactions in flightHappens less often than you think due to switchesIntroduces implications which are tautologiesCan remove tautological properties with an SMT solver during propagation
41A retry state either persists or ends in transfer. Channel protocolBlocked (trdy=0)FwdRetryirdy = 1trdy = 0Initiator is ready, waiting for targetInactiveirdy = 0trdy = 0Xferirdy = 1trdy = 1BwdRetryirdy = 0trdy = 1Target is ready, waiting for initiatorIdle (irdy=0)Make it clear that all 4 combinations are possible.“For the rest of the presentation, … Idle, Block”A retry state either persists or ends in transfer.
42Deadlock in xMAS Deadlock in xMAS is defined for a channel datairdytrdyDeadlock in xMAS is defined for a channelDead(u) = “eventually a packet arrives at input of u, but output of u never accepts it”Dead(u) = F(u.irdy · G¬u.trdy)Live(u) = ¬Dead(u) = G(u.irdy → Fu.trdy)
43Example FG(¬w.trdy) Dead(u) Sink is not fair GF(w.trdy) Live(u) Sink is fair
44GF(x.trdy) = ¬FG(¬w.trdy) Fairness ConstraintsFairness constraint:GF(x.trdy) = ¬FG(¬w.trdy)Execution of xMAS model is fair if it satisfies all fairness constraintsFair = conjunction of all fairness constraints
45Simple Messaging Fabric with Ordering State of the artABC (liveness-to-safety translation)Reduces liveness problem to equivalent safety problem by circuit transformationDoubles number of flopsBMC can reveal “shallow” deadlocksInduction, interpolation, etc.Does not scale well to xMAS models with 10s of queuesExampleSimple Messaging Fabric with Ordering75 primitives(24 queues)ABCProof – infeasible.BMC – 29 frames in 1hr.Our methodProof – 4s.
46Idle and Block Idle(u) = FG(¬u.irdy) “u (eventually) stuck at idle” Block(u) = FG(¬u.trdy) “u (eventually) stuck at blocked”Dead(u) = ¬Idle(u) · Block(u)“u is in deadlock iff u stuck at blocked, but not stuck at idle”Fair = ¬Block(w)“sink is fair iff w not stuck at blocked”
47Equations for xMAS queue Idle(v)Idle(u)Block(v)u.trdy = (q.num < k)v.irdy = (q.num > 0)…Block(u)Empty(q)Full(q)Full(q) = FG(q.num=k) “q (eventually) stuck at full”Empty(q) = FG(q.num=0) “q (eventually) stuck at empty”Equations propagate knowledge about Idle and Block conditions through a queue.Block(u) = Full(q)“u stuck at blocked iff q stuck at full”Full(q) → Block(v)“if q stuck at full then v stuck at blocked”Similar equations for all xMAS primitives.Other equations: Idle(v) = Empty(q), Empty(q) → Idle(u),¬Idle(u) · Block(v) → Full(q), etc.
48Liveness proof Block(u) = Full(q1) Full(q1) → Block(v) DeadEq Dead(u) =¬Idle(u) · Block(u)Block(v)Block(w)Full(q1)Full(q2)¬Block(w)Block(u) = Full(q1)Full(q1) → Block(v)DeadEq(true for every execution)Fair = ¬Block(w)(true for all fair executions)Block(v) = Full(q2)Full(q2) → Block(w)Dead(u) · DeadEq · Fair = 0 in propositional logicThere is no fair execution of the model, where channel u is dead.Q.E.D.
49Method overview Dead(u) As ¬Idle · Block + DeadEq (per primitive)LTL theorems, describing how Idle and Block propagate through primitives+May return unreachable executions as counterexamplesRule out with over approximate reachability using automatically generated invariants.Can’t comprehend message-dependent deadlocks with Idle and Block onlyCapture properties of data with additional Prop conditions.(details in the backup)FairAs ¬BlockUNSAT = Liveness proofSAT = Counterexample
50Experimental results: proof No proofs with ABC, when number of queues is in 10s.ExperimentNPrims (NQueues)Time for ABC (proof)Num. of BMC frames in 1hrTime for our methodTime % (DE+SAT)Time for safetyTwo queues4 (2)<1s>100-SMF = simple messaging fabric57 (20)N/A2sSMF with ordering75 (24)294sIOF1 = IO fabric91 (19)153sIOF2293 (58)1414s53%+47%1minIOF3480 (92)1112min94%+6%9minIOF4 (minimal)493 (97)182.5min85%+15%28sIOF4 (non-minimal)203min65%+35%3.5hrIOF5 (minimal)680 (131)40min97%+3%2minIOF5 (non-minimal)44min10hr+
51ConclusionVerification technology to reduce liveness of xMAS model to satisfiability in propositional logicCan get proofs for models with 100s of queuesOpen problem: method requires extension for models with non-restricted joins
52References“Quick Formal Modeling of Communication Fabrics to Enable Verification”, HLDVT 2010“Automatic Generation of Inductive Invariants from High-Level Microarchitectural Models of Communication Fabrics”, CAV 2010“Verifying Deadlock-Freedom of Communication Fabrics”, VMCAI 2011
55# Careful with Joins # ( x , y ) + z ( x , y ) z Hard case Easy case G (z.irdy (z.data = 42))Most joins like this in practice(used for synchronization)
56? Adding invariants / 1 Dead(a) Fair = ¬Block(g) Empty(q1)Dead(a)Fair = ¬Block(g)Full(q2)Full(q3)Dead(a) · DeadEq · Fair has a solutionSolution is unreachable execution with q1 stuck at empty and q2, q3 stuck at fullDeadlock equations – about executionsFlow invariants – about instantaneous stateThe execution contradicts with model flow invariant (automatically generated):q1.num = q2.num + q3.num?(q1.num = 0) → ((q2.num = 0)·(q3.num = 0))
57Dead(u) · DeadEq · Fair · Inv · InfEq for propositional satisfiability Adding invariants / 2Empty(q1)Dead(a)Fair = ¬Block(g)LiveFull(q2)Full(q3)Add equations connecting Idle, Block, Empty, Full, etc. with instantaneous state of the model (q.num, irdy, trdy, etc.):Empty(q) → (q.num = 0)Full(q) → (q.num = k)Idle(u) → (u.irdy = 0)…InfEq (eventually true for every execution)UNSATAdd any instantaneous invariants (Inv), solveDead(u) · DeadEq · Fair · Inv · InfEq for propositional satisfiability
58Method revisited Dead(u) As ¬Idle · Block + DeadEq (per primitive)LTL theorems, describing how Idle and Block propagate through primitives+FairUNSAT = Liveness proofSAT = CounterexampleAs ¬Block+InfEqConnect Idle, Block, etc. with instantaneous state+InvAutomatically generated instantaneous invariants
59This method is a recent improvement on the one in the paper. Data dependencies / 1a(x)Idle(v)Block(v)Idle(u)u.trdy = u.irdy · (a(u.data) · v.trdy ++ b(u.data) · w.trdy)v.irdy = u.irdy · a(u.data)w.irdy = u.irdy · b(u.data)Block(u)Idle(w)b(x)Block(w)Idle(v) depends on what kind of data is coming from input channel uIdle(v) = Idle(u) + “all packets are going to the bottom branch”Propp(x)(u) = FG(u.irdy → p(u.data)) “u (eventually) stuck at p(x)”Idle(v) = Idle(u) + Prop¬a(x)(u)“v stuck at idle, iff u stuck at idle or u stuck at ¬a(x)”Prop conditions in addition to Idle and Block, etc.Add equations to propagate Prop conditions through the modelThis method is a recent improvement on the one in the paper.
60This method is a recent improvement on the one in the paper. Data dependencies / 2Propp(f(x))(v)Propp(x)(v)Propp(x)(v) = Propp(f(x))(u)“v stuck at p(x) iff u stuck at p(f(x))”Number of all possible Prop conditions is(number of channels) x (average number of predicates per channel).Linear from size of the model.May be exponential from width of data on channels.Explosion is avoidable in practiceCompute a limited subset of Prop conditions.Start with Prop conditions used in equations for all switches.Add new Prop conditions with propagation.Stop propagation, if Prop predicate is repeated or becomes a tautology.This method is a recent improvement on the one in the paper.
61Method revisited (again) Dead(u)As ¬Idle · BlockFind a set of Prop conditions to use+and PropDeadEq(per primitive)LTL theorems, describing how Idle and Block propagate through primitives+FairUNSAT = Liveness proofSAT = CounterexampleAs ¬Block+InfEqConnect Idle, Block, etc. with instantaneous state+InvAutomatically generated instantaneous invariants
62Experimental results: finding a bug NPrims (NQueues)Time for ABC(BMC)Time for our method2 Master/Slave agents(no credit logic)16 (2)6s<1sSMF1(unfair sink)57 (20)4s2s(broken credits logic)8sSMF275 (24)1hr3s(broken ordering logic)4hrDeadlocks found with our method can be unreachable (due to over-approximation).In experiments, all found deadlocks were reachable.