Connecting High Level Models and RTL: an Ongoing Battle
Jesse Bingham, Intel, Feb 25 2009


Big Picture
[Diagram: design flow through Architecture, RTL, Netlists, Layout/Backend; verification labels FPV, FEV, CDC, TOV. Diagram unapologetically stolen from Erik.]
This red arrow is the problem du jour

Motivation for Formal Verification
Simulation: spot coverage of design space
Formal Verification (ideal case): full coverage of design space
Formal Verification (real life): full coverage in some areas
(Also stolen from Erik)

Another Dimension …
[Chart: axes are State/behavior coverage and Property coverage. Plotted techniques: FEV, Arithmetic FV @Intel, Traditional Simulation-based Testing, Bounded Model Checking, Protocol Model checking, Type-checking, Formal specification, Theorem proving, Today’s Topic (formal), Today’s Topic (checker)]

Overview
Protocols naturally & succinctly specified by high level models (HLM)
–In a sense, all RTL safety properties are captured by the HLM
Actual HW design (RTL) is hand-written by engineers
How do we establish that RTL adheres to its HLM?
–What does adherence even mean mathematically?
Two approaches
–Checker: HDL code that “watches” the design during simulation, raises alarms if it detects non-adherence
  Most of this talk is about checkers
–Formal Proof: prove that checker can never ever ring alarm
  Having the checker is obviously a prerequisite for formal proof
Notoriously hard problem in FV
–but getting more and more important in HW design

HW Protocols
Distributed components exchanging messages
Control oriented
Cannot be specified by input/output relations
State is king
Typically message latency insensitive (though message ordering often matters)
Naturally specified at high level using guarded command languages (Murphi, TLA, Unity, etc.)
–we’ll call this the high level model (HLM)
–we use Murphi, but this work is independent of the particular modeling language

HLM: Guarded Commands [Dijkstra 1975]
Guard: predicate on states
Command: function mapping states to states
Guarded Command (GC): a guard & a command
–Command is only allowed to fire if guard is true
Called rules or rulesets in Murphi…

Rule “go to park”
  NOT raining
==>
  location := nearest_park();
end

Ruleset food : FOOD “have picnic”
  hungry AND NOT raining
==>
  location := nearest_park();
  eat(food);
end
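The guard/command pairing can be sketched in Python. This is only an illustration of the idea, not Murphi semantics: the names `GuardedCommand`, `ruleset`, `enabled`, and `fire`, the dictionary state, and the food values are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, object]

@dataclass(frozen=True)
class GuardedCommand:
    name: str
    guard: Callable[[State], bool]      # predicate on states
    command: Callable[[State], State]   # function from states to states

def ruleset(name, values, guard, command) -> List[GuardedCommand]:
    """Expand a parameterized rule into one GC per parameter value."""
    return [
        GuardedCommand(f"{name}[{v}]",
                       lambda s, v=v: guard(s, v),
                       lambda s, v=v: command(s, v))
        for v in values
    ]

def enabled(gcs, s):
    """All GCs whose guard holds in state s."""
    return [gc for gc in gcs if gc.guard(s)]

def fire(gc, s):
    """Fire gc from s; a GC may only fire when its guard is true."""
    if not gc.guard(s):
        raise ValueError(f"{gc.name} is not enabled")
    return gc.command(s)

# Hypothetical encoding of the two rules on the slide.
go_to_park = GuardedCommand(
    "go to park",
    lambda s: not s["raining"],
    lambda s: {**s, "location": "park"},
)
have_picnic = ruleset(
    "have picnic", ["apple", "sandwich"],
    guard=lambda s, food: s["hungry"] and not s["raining"],
    command=lambda s, food: {**s, "location": "park", "hungry": False, "eaten": food},
)
ALL_GCS = [go_to_park] + have_picnic
```

When several GCs are enabled at once, the model may fire any one of them, which is where the nondeterminism of the HLM comes from.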

HLM Behaviors & Properties
[Diagram: a behavior is a sequence of states starting from the initial state; each step is an enabled GC firing …]
State invariants: all reachable states are “okay”
–Cache always has at most one entry for each address
More general safety properties
–Cache returns most recently written data to a read request
Liveness (typically assuming fairness)
–If you send a read request, cache will eventually return data
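A state invariant of this kind can be checked on a small model by enumerating reachable states. The sketch below assumes a made-up two-entry cache with hypothetical `Fill` and `Evict` commands (it is not the cache controller HLM from the talk) and checks "at most one valid entry per address" by breadth-first search.

```python
from collections import deque

N_ENTRIES, ADDRS = 2, (0, 1)
INIT = (("Invalid", 0),) * N_ENTRIES   # each entry is (state, addr)

def successors(s, check_dup=True):
    """States reachable from s by firing one enabled GC."""
    for i, (st, _) in enumerate(s):
        if st != "Invalid":
            # Evict[i]: enabled when entry i is valid
            yield s[:i] + (("Invalid", 0),) + s[i + 1:]
        else:
            # Fill[i, a]: enabled when entry i is free and (in the correct
            # model) no valid entry already holds address a
            for a in ADDRS:
                dup = any(st2 != "Invalid" and a2 == a for st2, a2 in s)
                if not (check_dup and dup):
                    yield s[:i] + (("Clean", a),) + s[i + 1:]

def invariant(s):
    """At most one valid entry for each address."""
    valid = [a for st, a in s if st != "Invalid"]
    return len(valid) == len(set(valid))

def find_violation(check_dup=True):
    """Breadth-first search; returns a reachable bad state or None."""
    seen, todo = {INIT}, deque([INIT])
    while todo:
        s = todo.popleft()
        if not invariant(s):
            return s
        for t in successors(s, check_dup):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return None
```

Calling `find_violation(check_dup=False)` models a buggy `Fill` guard and returns a state in which two valid entries share an address.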

Register Transfer Level (RTL)
Clock/state accurate (or at least close)
Pipelines
Schedulers
Special logic
–Design-for-test
–Clock gating
–Reset
Written in hardware description language like System Verilog or VHDL (we use SV)
Can be formalized as finite state automata or Kripke structures; we won’t do that today
FV methods and CAD tools below RTL have advanced to the point where one can (if one chooses to) safely think of RTL as the real silicon

Refinement Map
A function RM taking RTL states to HLM states is called a refinement map
–Intuitively, RM(r) is the HLM state that summarizes RTL state r
–Many-to-one in general
–Human writes this in our methodology
Generalization: RM can depend on RTL signals at fixed offsets from the current cycle
–Useful for dealing with RTL pipelines

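Behavioral Refinement
[Diagram: an HLM behavior (starting at the initial state, each step a guarded command firing) drawn above an RTL behavior (starting at the reset state, each step one RTL clock cycle); the refinement map links RTL states to HLM states]
Each RTL clock cycle corresponds to zero or more guarded commands firing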

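Checking Refinement
[Diagram: the RTL steps from state r to state r' in one RTL clock cycle. GC_prediction(r) = (gc_1, gc_2, …, gc_k) names the guarded commands to fire, and Next applies them in order to RM(r). Check: RM(r') =? Next(RM(r))]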
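The per-cycle check can be sketched as a simulation-time checker in Python. Everything here is a toy assumption chosen to keep the example small: the HLM is a mod-4 counter with a single GC `inc`, the RTL is a one-stage pipeline `(cnt, v1)`, and the function names are hypothetical.

```python
def rtl_step(r, v_in):
    """Toy RTL: the request latched in v1 last cycle bumps cnt this cycle."""
    cnt, v1 = r
    return ((cnt + 1) % 4 if v1 else cnt, v_in)

def buggy_step(r, v_in):
    """Same design with an injected bug: the increment is dropped at cnt == 2."""
    cnt, v1 = r
    if v1 and cnt != 2:
        cnt = (cnt + 1) % 4
    return (cnt, v_in)

def rm(r):
    """Refinement map: the HLM counter is the committed RTL count."""
    return r[0]

GCS = {"inc": lambda h: (h + 1) % 4}    # HLM commands by name

def gc_prediction(r):
    """Which GCs the RTL is about to perform in this cycle (zero or one here)."""
    return ["inc"] if r[1] else []

def next_hlm(h, gcs):
    """Apply the predicted GCs to an HLM state, in order."""
    for name in gcs:
        h = GCS[name](h)
    return h

def check_trace(inputs, reset=(0, False), step=rtl_step):
    """Return the first cycle where RM(r') != Next(RM(r)), or None."""
    r = reset
    for cycle, v_in in enumerate(inputs):
        r_next = step(r, v_in)
        if rm(r_next) != next_hlm(rm(r), gc_prediction(r)):
            return cycle
        r = r_next
    return None
```

Like any simulation checker, `check_trace` only covers the input sequences it is given; `check_trace([True] * 5, step=buggy_step)` flags the cycle in which the buggy design drops an increment.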

Running Example: Toy Cache Controller
[Diagram: CPU, Cache Controller, Main Memory]

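Cache Controller HLM
[Diagram: CacheArray, a table of entries … with fields Addr, Data, and State ∈ {Invalid, Dirty, Clean}; message channels Cpu2Cache, Cache2Mem, Cache2Cpu, Mem2Cache. Annotation on part of the diagram: Let’s pretend these don’t exist]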

Cache Controller HLM GC Recv_Store
Ruleset i : CacheIndex “Recv Store”
  Cpu2Cache.opcode = Store
  & CacheArray[i].State != Invalid
  & CacheArray[i].Addr = Cpu2Cache.Addr
==>
  CacheArray[i].Data := Cpu2Cache.Data;
  CacheArray[i].State := Dirty;
  Absorb(Cpu2Cache);
end

Cache Controller HLM GC Evict
Ruleset i : CacheIndex “Evict”
  CacheArray[i].State != Invalid
==>
  if (CacheArray[i].State = Dirty) begin
    Cache2Mem.opcode := WriteBack;
    Cache2Mem.Addr := CacheArray[i].Addr;
    Cache2Mem.Data := CacheArray[i].Data;
  end;
  CacheArray[i].State := Invalid;
end

Cache Controller RTL
[Diagram labels: Cpu2Cache, Cache2Mem, Cache State & Addr Array, Eviction Logic, Hit?, Cache Data Array, Pipe stage 1, Pipe stage 2]

Store with Eviction
[Animation over the RTL diagram: Store(A0,D0) arrives on Cpu2Cache and passes through pipe stages 1 and 2. The state & addr array entry changes from (Dirty,A1) to (Dirty,A0), the data array entry changes from D1 to D0, and WriteBack(A1,D1) goes out on Cache2Mem]

Store with Eviction Revisited
[Animation over the RTL diagram: the same Store(A0,D0) sequence, (Dirty,A1)/D1 replaced by (Dirty,A0)/D0 with WriteBack(A1,D1) sent out; the HLM GCs Store and Evict are marked]
When do the HLM GCs “happen” in the RTL?

Key Point #1
Pipelining causes GCs that are atomic in the HLM to be non-atomic in the RTL. This non-atomicity must be handled by the refinement map.

Key Point #2
In the HLM, GCs are interleaved, while the RTL can exhibit true GC concurrency. This must be resolved by the GC prediction.

Cache Controller Refinement Map (conceptual)
function HLM_STATE RM();  // refinement map
  HLM_STATE HLM;          // HLM state being built
  HLM.CacheArray[].State = RTL.AddrArray[].State;
  HLM.CacheArray[].Addr  = RTL.AddrArray[].Addr;
  HLM.CacheArray[].Data  = RTL.DataArray[]@+1;
  HLM.Cpu2Cache = RTL.Cpu2Cache@-1;
  HLM.Cache2Cpu = RTL.Cache2Cpu@+1;
  return(HLM);
end;
@k denotes the value the signal will have k clock cycles in the future (k can be negative too, to refer to the past)

Cache Controller Refinement Map (with only non-positive temporal offsets)
function HLM_STATE RM();  // refinement map
  HLM_STATE HLM;          // HLM state being built
  HLM.CacheArray[].State = RTL.AddrArray[].State@-1;
  HLM.CacheArray[].Addr  = RTL.AddrArray[].Addr@-1;
  HLM.CacheArray[].Data  = RTL.DataArray[];
  HLM.Cpu2Cache = RTL.Cpu2Cache@-2;
  HLM.Cache2Cpu = RTL.Cache2Cpu;
  return(HLM);
end;
Every offset is the conceptual one shifted back by one cycle, so no future values are needed
@-k can be constructed using System Verilog’s $past operator
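What a non-positive offset @-k needs from the checker can be sketched as a small history buffer in Python. The `History` class and its methods are hypothetical names for this example, and returning `None` before enough cycles have elapsed is a choice made here, not a statement about how any simulator behaves.

```python
from collections import deque

class History:
    """Remembers the last depth+1 sampled values of one signal."""

    def __init__(self, depth):
        self.depth = depth
        self._buf = deque(maxlen=depth + 1)   # old samples fall off the left

    def clock(self, value):
        """Sample the signal once per clock cycle."""
        self._buf.append(value)

    def past(self, k=0):
        """Value k cycles ago (k = 0 is the current sample), i.e. signal@-k."""
        if k < 0 or k > self.depth:
            raise ValueError("offset outside the recorded window")
        if k >= len(self._buf):
            return None                        # not enough history yet
        return self._buf[-1 - k]
```

A refinement map with offsets down to @-2 would keep one such buffer of depth 2 per RTL signal it reads and call `past(1)` or `past(2)` where the slide writes @-1 or @-2.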

Store with Eviction Re-Revisited
[Animation: the same Store(A0,D0) sequence through pipe stages 1 and 2, (Dirty,A1)/D1 replaced by (Dirty,A0)/D0 with WriteBack(A1,D1) sent out, now drawn as an RTL timeline aligned with an HLM timeline on which Evict and RecvStore fire]

Cache Controller GC Prediction
function HLM_STATE Next_HLM_STATE(HLM_STATE hs);
  if (RTL.Cpu2Cache.Valid@-2) begin
    i = get_target_cache_index()@-2;
    if (will_need_eviction()@-2)
      hs = Evict(hs,i);
    if (RTL.Cpu2Cache.Op@-2 == STORE)
      hs = Recv_Store(hs,i);
    else if (RTL.Cpu2Cache.Op@-2 == LOAD)
      hs = Recv_Load(hs,i);
  end;
  ... // figure out when to fire Send_Memory_Request
      // and Recv_Memory_Response
end;
Can result in 0, 1, or 2 GCs fired
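The way the prediction serializes concurrent RTL activity into an ordered list of GCs can be sketched in Python. The function name `predict_gcs` and its arguments are assumptions for this example; the arguments stand in for the delayed RTL signals, and the memory request and response rules are left out just as they are elided on the slide.

```python
from typing import List, Tuple

def predict_gcs(valid: bool, op: str, index: int,
                needs_eviction: bool) -> List[Tuple[str, int]]:
    """Ordered list of (GC name, cache index) to fire for one clock cycle."""
    fired = []
    if valid:
        if needs_eviction:
            # The eviction is ordered first, so the HLM entry is freed
            # before the request that replaces it is applied.
            fired.append(("Evict", index))
        if op == "STORE":
            fired.append(("Recv_Store", index))
        elif op == "LOAD":
            fired.append(("Recv_Load", index))
    return fired
```

The list has length 0, 1, or 2, matching the slide, and its order is what turns one cycle of concurrent RTL work into an interleaving the HLM can follow.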

Back-to-back Stores with Eviction
[Animation: Store(A0,D0) followed by Store(A2,D2) through pipe stages 1 and 2. State & addr array: (Dirty,A1), then (Dirty,A0), then (Dirty,A2). Data array: D1, then D0, then D2. WriteBack(A1,D1) is sent out. HLM timeline labels: Evict, RecvStore(A0), RecvStore(A1)]

FYI, we do everything in System Verilog
Actual design under verification
–written by HW designers
Test bench
–written by HW validators
HLM
–written in Murphi by FV team in consultation with Architects
–compiled into SV by a tool we wrote
Refinement Map
–hand-written in SV by FV team
GC Prediction
–hand-written in SV by FV team

Formal Proof of Refinement

Formal Proof of Refinement version 1.0: looks like FEV
[Diagram: HLM above RTL. The RTL steps from r to r' in one RTL clock cycle; the HLM side shows RM(r), RM(r'), and Next(RM(r))]
Check: RM(r') = Next(RM(r)) ?
r is a totally symbolic RTL state (represents all possible RTL states)
Can be decided by SAT- or BDD-based solver engine
Also might blow up
This will most certainly fail for some unreachable RTL states! Rats!

Formal Proof of Refinement version 2.0: write an invariant
[Diagram: HLM above RTL. The RTL steps from r to r' in one RTL clock cycle; the HLM side shows RM(r), RM(r'), and Next(RM(r))]
Check: Inv(r) ⇒ RM(r') = Next(RM(r))
r is a totally symbolic RTL state (represents all possible RTL states)
Can be decided by SAT- or BDD-based solver engine
But concocting Inv is difficult, not to mention you need to also prove Inv is invariant
Also might blow up
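The difference between checking every RTL state and checking only states that satisfy an invariant can be seen on a toy design. In this sketch brute-force enumeration stands in for the SAT- or BDD-based engine, and the design itself is an assumption: a mod-4 counter whose RTL keeps a redundant `odd` flag and uses it to compute the increment.

```python
from itertools import product

def rtl_step(r, inc):
    """Toy RTL: the increment logic trusts the cached parity flag `odd`."""
    cnt, odd = r
    if not inc:
        return r
    high = cnt >> 1
    new_low = 0 if odd else 1
    new_high = high ^ (1 if odd else 0)
    return (new_high * 2 + new_low, bool(new_low))

def rm(r):
    """Refinement map: the HLM state is just the count."""
    return r[0]

def next_hlm(h, inc):
    """HLM step with the predicted GC (inc fires iff the input is set)."""
    return (h + 1) % 4 if inc else h

def inv(r):
    """Candidate invariant: the flag agrees with the count's parity."""
    cnt, odd = r
    return odd == (cnt % 2 == 1)

ALL_STATES = list(product(range(4), [False, True]))   # includes unreachable ones
INPUTS = [False, True]
RESET = (0, False)

def refinement_failures(assume):
    """(state, input) pairs satisfying `assume` where RM(r') != Next(RM(r))."""
    return [(r, i) for r in ALL_STATES for i in INPUTS
            if assume(r) and rm(rtl_step(r, i)) != next_hlm(rm(r), i)]

v1_failures = refinement_failures(lambda r: True)   # version 1.0: no assumption
v2_failures = refinement_failures(inv)              # version 2.0: assume Inv(r)
inv_is_inductive = inv(RESET) and all(
    inv(rtl_step(r, i)) for r in ALL_STATES for i in INPUTS if inv(r))
```

In this toy, every entry of `v1_failures` is a state where the flag disagrees with the count, which the reset state can never reach; assuming `inv` removes them, at the cost of the separate proof that `inv` holds at reset and is preserved by every step.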

Formal Proof of Refinement version 3.0: Model Checking
[Diagram: RTL & checker composed with an HLM of the environment]
Start from initial state of env-HLM & RTL
Compute forward reachability via symbolic model checking
Verify that checker never fires
Will likely blow up; probably need to restrict behaviors, e.g. use 4 addresses rather than 2^32

Open Problems
Refinement map is part of spec… or is it?
Formal proof: best approach?
–I spent 1.5 years banging my head on the formal side; the fact that I’ve retreated to checkers says something
Tool issues: pain in the butt
–Generated System Verilog has hit 4 bugs so far in expensive third-party simulator
HLM/RTL discrepancies: can we weaken our notion of refinement to allow for reasonable mismatches?
–E.g. HLM transmits message instantaneously, while RTL scheduling causes arbitrary delay before transmission

Partial Bibliography
Using formal HLM as a checker:
–Linking simulation with Formal Verification at a Higher Level, Tasiran, Batson, & Yu, 2004
–Runtime Refinement Checking of Concurrent Data Structures, Tasiran & Qadeer, 2004
Original Murphi paper:
–Protocol Verification as a Hardware Design Aid, Dill, Drexler, Hu, & Yang, 1992
Formal verification of refinement maps for hardware:
–Automatic Verification of Pipelined Microprocessor Control, Burch & Dill, 1994
–Protocol Verification by Aggregation of Distributed Transactions, Park & Dill, 1996
–A Methodology for Hardware Verification using Compositional Model Checking, McMillan, 2000
–The Formal Design of 1M-gate ASICs, Eiriksson, 2000
Theory involving refinement in the face of fairness:
–On the Existence of Refinement Maps, Abadi & Lamport, 1991
Commercial Tools:
–BlueSpec (BlueSpec Inc.)
–Pico (Synfora)
–SLEC (Calypto)

Backups

Cache Controller HLM (typedefs & var decls in Murphi)
type  ---- Type declarations ----
  CACHE_ENTRY : record
    State : enum {Invalid, Dirty, Clean};
    Addr : ADDR;
    Data : DATA;
  end;
var  ---- State variables ----
  CacheArray : array [0..CACHE_SIZE-1] of CACHE_ENTRY;
  Cpu2Cache : CPU2CACHE_MSG;
  Cache2Cpu : CACHE2CPU_MSG;
  Mem2Cache : MEM2CACHE_MSG;
  Cache2Mem : CACHE2MEM_MSG;

Guarded Commands Formalized
State space S = type consistent assignments to variables
Init : subset of state space specifying initial states
A guarded command (GC) is a pair (g,c), where
–g : S → {True,False} is called the guard; GC is enabled in state s if g(s) = True
–c : S → S is called the command; GC fires from s to c(s)
Semantics: HLM can transition from s to s' iff there exists a GC that
–is enabled in s
–fires from s to s'
Nondeterminism arises when multiple GCs are enabled
In practice GCs are often parameterized
We assume that the stuttering GC (λs.True, λs.s) is implicit

Refinement Formalized
Let H and R be the respective state spaces of HLM and RTL
A function RM: R → H is called a refinement map
–Intuitively, RM(r) is the HLM state that summarizes RTL state r
–Many-to-one in general
–Human writes this in our methodology
We generalize this so that RM: R^w → H, for some fixed w
–Hence RM maps a fixed length sequence of RTL states to H
–Useful for dealing with RTL pipelines
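One way to write down the condition that "each RTL clock cycle corresponds to zero or more guarded commands firing" is sketched below. This is a reading of the slides rather than a definition taken from them: the reset state r_0, the intermediate HLM states h_i, and the requirement on the initial state are assumptions, and `aligned` needs the amsmath package.

```latex
\[
\begin{aligned}
&\text{(initial)} && RM(r_0) \in \mathit{Init},\\
&\text{(step)} && \text{for every RTL clock cycle } r \to r' \text{ there exist } k \ge 0
   \text{ and GCs } (g_1,c_1),\dots,(g_k,c_k) \text{ with}\\
& && h_0 = RM(r),\qquad g_i(h_{i-1}) = \mathrm{True},\qquad h_i = c_i(h_{i-1})
   \quad (1 \le i \le k),\qquad h_k = RM(r').
\end{aligned}
\]
```

With k = 0 the step condition reduces to RM(r') = RM(r), which is the implicit stuttering GC. For the generalized map RM: R^w → H, r and r' would be replaced by consecutive windows of w RTL states.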

Cache Controller HLM GCs (1/2)
Ruleset i : CacheIndex “Recv Store”
  Cpu2Cache.opcode = Store
  & CacheArray[i].State != Invalid
  & CacheArray[i].Addr = Cpu2Cache.Addr
==>
  CacheArray[i].Data := Cpu2Cache.Data;
  CacheArray[i].State := Dirty;
  Absorb(Cpu2Cache);
end

Ruleset i : CacheIndex “Recv Load”
  Cpu2Cache.opcode = Load
  & CacheArray[i].State != Invalid
  & CacheArray[i].Addr = Cpu2Cache.Addr
==>
  Cache2Cpu.Data := CacheArray[i].Data;
  Absorb(Cpu2Cache);
end

Ruleset i : CacheIndex “Evict”
  CacheArray[i].State != Invalid
==>
  if (CacheArray[i].State = Dirty) begin
    Cache2Mem.opcode := WriteBack;
    Cache2Mem.Addr := CacheArray[i].Addr;
    Cache2Mem.Data := CacheArray[i].Data;
  end;
  CacheArray[i].State := Invalid;
end

Cache Controller HLM GCs (2/2)
Ruleset i : CacheIndex ; a : Addr “Send Memory Request”
  CacheArray[i].State = Invalid
==>
  Cache2Mem.opcode := Get;
  Cache2Mem.Index := i;
  Cache2Mem.Addr := a;
end

Ruleset i : CacheIndex “Recv Memory Response”
  Mem2Cache.opcode = Response
==>
  CacheArray[Mem2Cache.Index].Data := Mem2Cache.Data;
  CacheArray[Mem2Cache.Index].Addr := Mem2Cache.Addr;
  CacheArray[Mem2Cache.Index].State := Clean;
  Absorb(Mem2Cache);
end

Load Miss (moot)
[Animation over the RTL diagram, with channels Cpu2Cache, Cache2Cpu, Cache2Mem, Mem2Cache: Load(A0) arrives, Get(A0) is sent to memory, Response(A0,D0) returns, the cache entry becomes (Clean,A0) with data D0, and Response(D0) is returned to the CPU]

Cache Controller Refinement Map (conceptual)
function HLM_STATE RM();  // refinement map
  HLM_STATE HLM;          // HLM state being built
  for (int i = 0; i < CACHE_SIZE; i++) begin
    HLM.CacheArray[i].State = RTL.AddrArray[i].State;
    HLM.CacheArray[i].Addr  = RTL.AddrArray[i].Addr;
    HLM.CacheArray[i].Data  = RTL.DataArray[i]@+1;
  end;
  HLM.Cpu2Cache = RTL.Cpu2Cache@-1;
  HLM.Cache2Cpu = RTL.Cache2Cpu;
  return(HLM);
end;
@k denotes the value the signal will have k clock cycles in the future (k can be negative too)
