IP & SoC Verification.

Presentation on theme: "IP & SoC Verification."— Presentation transcript:

1 IP & SoC Verification

2 Contents
IP Verification: cycle-level; transaction-level; testbench build-up; hardware debugging; an example (iPROVE)
SoC Verification: design flow; multi-level, multi-lingual verification; multiple-FPGA set-up; debugging; an example (iSAVE)

3 IP verification
Important issues: testbench issues; IP reuse; testbench reuse; debuggability
Testbench issues
Various testbench support
HLL: C/C++
HDL: Verilog and VHDL
De facto standards: SCE-MI, SystemC, OpenVera and so on
Various levels of testbench
Transaction-level: control at command level, e.g., read/write
Cycle-level: pin-by-pin control
Abstract-bus-level: standard on-chip networks

4 Cycle-level verification
DUT (HDL) Testbench (C/HDL) Device Driver PCI Controller DUT Buffer/ Pin Signal Generator Testbench PCI Channel S/W simulation part FPGA part

5 Cycle-level verification
SW: Testbench
Modeled in HDL or C
Generates stimulus at every clock cycle
Checks the result of the DUT at every clock cycle
HW: DUT
Mapped on FPGA
Stimuli are transferred through a system bus, e.g., PCI.
All signals are applied to the DUT concurrently after they have been transferred from the SW testbench.
Operating speed
Faster than SW simulation, because the HDL or C model of the DUT is accelerated in the FPGA.
Determined by the interface requirement (number and bandwidth of signals to be transferred) and by the bandwidth of the interface (PCI).
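The dependence of cycle-level speed on the interface can be illustrated with a rough estimate. The sketch below is not from the slides: it assumes that every emulated clock cycle moves all DUT input pins to the FPGA and all output pins back, with no per-transfer overhead, and the function name is hypothetical. A real PCI link sustains less than its nominal peak, so the result is only an upper bound.

```c
#include <assert.h>
#include <stdint.h>

/* Rough upper bound on emulated clock cycles per second in cycle-level mode.
 * Assumption (not from the slides): each DUT clock cycle transfers all input
 * pins to the FPGA and all output pins back, with no protocol overhead. */
static uint64_t cycle_rate_upper_bound(uint64_t bus_bytes_per_sec,
                                       unsigned input_bits,
                                       unsigned output_bits)
{
    /* Pin values are packed into whole bytes for the transfer. */
    uint64_t bytes_per_cycle = ((uint64_t)input_bits + output_bits + 7u) / 8u;
    assert(bytes_per_cycle > 0); /* a DUT with no pins makes no sense here */
    return bus_bytes_per_sec / bytes_per_cycle;
}
```

For example, with the nominal 132 MB/s peak of 32-bit, 33 MHz PCI and a DUT with 64 input and 64 output bits, `cycle_rate_upper_bound(132000000ULL, 64, 64)` gives 8,250,000 cycles per second at best. Halving the pin count doubles the bound, which is why the slide ties the speed to the number of signals.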

6 Transaction-level verification
DUT Testbench Device Driver Main Memory PCI Controller Transactor DUT Testbench DMA Channel S/W simulation part FPGA part

7 Transaction-level verification
SW: Testbench
Modeled in C
Generates stimulus and checks the result of the DUT
Only the information needed to form a transaction is transferred to the DUT, i.e., command, address and data.
HW: DUT and transactor
Mapped on FPGA
The transactor knows how to interpret the transaction and generates all signals necessary for the DUT.
Operating speed
The HW and SW parts operate independently.
Faster than cycle-level verification as well as SW simulation.
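The role of the transactor can be sketched in C. This is an illustration only: the slides do not define the transaction layout or the pin protocol, so the types, the function name and the two-cycle protocol (one access cycle, one idle cycle) are all assumptions. In the real flow the transactor is hardware mapped on the FPGA; modeling it in software just makes the expansion from transaction to signals visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A transaction carries only command, address and data. */
typedef enum { CMD_READ = 0, CMD_WRITE = 1 } cmd_t;

typedef struct {
    cmd_t    cmd;
    uint32_t addr;
    uint32_t data;   /* write data; ignored for reads */
} transaction_t;

/* One clock cycle of pin values for a HYPOTHETICAL memory-like DUT port. */
typedef struct {
    uint8_t  cs_n;   /* chip select, active low  */
    uint8_t  we_n;   /* write enable, active low */
    uint32_t addr;
    uint32_t wdata;
} pin_cycle_t;

/* Expand one transaction into per-cycle pin values.
 * Assumed protocol (illustrative only): one access cycle, then one idle cycle.
 * Returns the number of cycles written, or 0 if out is too small. */
static size_t transactor_expand(const transaction_t *t,
                                pin_cycle_t *out, size_t cap)
{
    assert(t != NULL && out != NULL);
    if (cap < 2) {
        return 0;
    }
    /* Cycle 0: select the device, drive the address (and data for writes). */
    out[0].cs_n  = 0;
    out[0].we_n  = (t->cmd == CMD_WRITE) ? 0 : 1;
    out[0].addr  = t->addr;
    out[0].wdata = (t->cmd == CMD_WRITE) ? t->data : 0;
    /* Cycle 1: deselect so that back-to-back transactions stay separated. */
    out[1].cs_n  = 1;
    out[1].we_n  = 1;
    out[1].addr  = 0;
    out[1].wdata = 0;
    return 2;
}
```

The host sends one small record per access instead of a full pin vector per clock, which is the reason the transaction-level mode needs less interface bandwidth than the cycle-level mode.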

8 Testbench automation Overview SCE-MI VERA Test Builder

9 SCE-API (MI)
Standard Co-Emulation API (Modeling Interface)
SW part implemented in C or C++, with recommendations on the HW implementation
Based on IKOS' multi-channel co-modeling technology: TIP (Transaction Interface Portal)
SCE-API Consortium
Founded June 2000
Aptix, CoWare, IKOS, Mentor, STMicroelectronics, Synopsys, TransEDA
SCE-API version 1.0 modeling interface
SCE-MI v1.0 released through the Open SystemC Initiative, April 2001
Accellera's Interface Technical Committee (ITC)
Merged into ITC, Oct. 2001
SCE-DI (Debug Interface) & SCE-CI (Control Interface) in progress

10 Vera
Vera: functional verification language for testbench description
The language specification can be obtained from the OpenVera site.
Vera language
Object-oriented language
Includes HDL features
Waiting for a clock event
Bit data type, bit operations (extraction, concatenation)
Data expectation ('do something when the expectation is hit')
0,100 bus.ack == 1; // ack must become 1 within 100 cycles
Vera verification environment
Commercial product from Synopsys
Vera source code is compiled and runs with the HDL simulator in which the DUT is simulated.
Additional features: automatic stimulus generation, coverage analysis
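The idea behind a data expectation with a cycle window can be mimicked in plain C. The sketch below is not Vera and does not claim to reproduce Vera's exact semantics; it assumes the signal has already been sampled once per clock into an array, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Check a windowed expectation on a signal sampled once per clock cycle.
 * Returns the index of the first cycle in [0, window] at which the sample
 * equals `expected`, or -1 if the expectation is not met in that window. */
static long expect_within(const uint8_t *samples, size_t n_samples,
                          size_t window, uint8_t expected)
{
    assert(samples != NULL || n_samples == 0);
    for (size_t i = 0; i < n_samples && i <= window; ++i) {
        if (samples[i] == expected) {
            return (long)i;   /* expectation hit at cycle i */
        }
    }
    return -1;                /* window elapsed (or trace ended) without a hit */
}
```

A testbench would treat a return value of -1 as a failed check and any other value as the cycle at which the acknowledge arrived.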

11 Vera
[Figure: Vera compilation flow. Labeled elements: .vr (Vera source), .vrh (Vera header), compiler, .vro (Vera object), .vrl (Vera list), Vera shell, Vera PLI, HDL simulator, DUT. Legend: files supplied by the user; files automatically generated by the Vera compiler.]

12 TestBuilder
Transaction-based verification
Functional verification at a higher level of abstraction
The engineer develops tests from a system-level perspective
Advantages
Enhances reusability of each component in the testbenches
Improves debugging and coverage analysis
[Figure: tests at the transaction level reach the design at the signal level through a TVM (transactor).]
TVM: Transaction Verification Model

13 TestBuilder
How TestBuilder operates
[Figure labels: transaction level, signal level; Tests (C/C++/TestBuilder); TVM (implementable using TestBuilder/HDL); DUV (HDL); C library (PLI/FLI); HDL simulation.]
Tests:
    While(){
        Tx.send_packet(..);
        Mem.expect_write(..);
        ..
    }
TVM:
    Tx.send_packet(..){
        header = "hd";
        address = 0xff0011;
        data = 0xff0011;
    }

14 Specman
Functional testbench automation tool by Verisity
Its concept is similar to Vera's, but it started earlier and is more widely used.
The user specification is described in the e language.
[Figure labels: interface spec and test plan in e; legacy code in C/VHDL/Verilog; Specman Elite (automatic testbench generation, data and temporal checking, coverage analysis); DUT.]

15 Debugging feature Built-In Logic Analyzer (BILA) DUT boundary – ports
DUT internal – internal nodes PCI iPROVE PC Board

16 Hardware debugging schemes
Low speed scheme Operating speed: < 10MHz There is no dedicated storage element in the device. All debugging information is transferred to main memory or large storage device at every cycle. Readback scheme of Xilinx device is a typical example. Usually, the scheme needs only a small number of IO pins. JTAG interface: 4 pins (TCK, TDI, TMS, TDO) 8-bit parallel interface (CLK, INIT, CS, RW, D[7:0])

17 Hardware debugging scheme
High speed scheme
Operating speed: < 100MHz
There are several dedicated storage elements, which can be internal or external memories.
All debugging information is stored in the dedicated elements.
Typical examples: Xilinx ChipScope, Altera SignalTap-II

18 What is iPROVE
iPROVE is a small-scale design verification tool that provides APIs for interfacing with C/C++, HDL and de facto standards.
API: proprietary C/C++ API, proprietary Verilog API, SCE-API/MI
[Figure: testbench and/or remaining blocks in C, HDL and/or SystemC on the host, connected over the PCI bus to the IP in HDL/EDIF.]

19 iPROVE tool positioning
[Chart: running speed versus investment. Tick labels as extracted: 100MHz Real Silicon; 10MHz Rapid Prototype; 1MHz HW Emulator; 100KHz iPROVE; 10KHz; 1KHz HW Accelerator; 100Hz SW Simulator; 10Hz.]
Speaker notes: There are many verification solutions on the way to working silicon. S/W simulation is the most popular and basic one. Although it is a cheap solution, it runs at only around 10 to 100 Hz. To overcome this speed problem, a hardware accelerator can be the next solution, but it still provides only around 1 KHz. H/W emulation usually uses a number of FPGAs and emulates logic at up to 1 MHz. A rapid prototyping system uses pre-verified IP blocks instead of FPGAs. It seems suitable for IP-based ASIC design, but it is costly, and 10 MHz is not sufficient considering that most logic today runs at over 30 or 60 MHz, or even over 100 MHz. What we want is an ideal verification solution that costs little but runs fast. iSAVE costs one-tenth as much as other H/W emulators and runs at around 50 to 60 MHz, which is ten times faster than the others.

20 iPROVE typical usage: IP verification
IP verification without prototyping
Cycle-level verification: Test, signals, DUT
Transaction-based verification: Test, transactions, transactors, signals, DUT
Abstract-bus-based verification: Test, BFM, DUT bus, DUT
[Figure labels: PC with testbench; PCI; DPP; iPROVE with IP (DUT); automatically generated module; interactive IO; signal information.]

21 iPROVE typical usage: DPP
PC iPROVE Multi-media board Large size data PCI DPP

22 iPROVE structure
User design: Verilog, VHDL
User testbench: C/C++, Verilog, VHDL
OS: Windows 2000 or XP, Linux
De facto standards: SCE-MI/API, SystemC, OpenVera, TestBuilder
API: C/C++ (Visual C, Borland C, GNU GCC under Cygwin), Verilog

23 iPROVE design flow

24 iPROVE design flow
[Figure: flow stages as extracted from the slide: synthesis, P&R, compilation, mapping by running testbench, execution, debugging with BILA.]

25 Cycle-level with Verilog (1/3)
A simple ALU example

26 Cycle-level with Verilog (2/3)
Step 1: Start with the EDIF of the ALU (needs a synthesizer)
Step 2: Make the FPGA mapping data
Step 3: Modify the testbench by inserting PLIs for iPROVE
Step 4: Run the ALU with iPROVE and the HDL simulator
The testbench runs on the host computer
The DUT goes to iPROVE

27 Testbench example (Cycle-level)
* alu_proxy is the image of the ALU mapped on the FPGA

    `define CARD_ID 0
    module alu_top();
      // inputs and outputs
      …
      always #5 clk = ~clk;

    `ifdef iPROVE
      alu_proxy(…)
    `else
      alu(…)
    `endif
      alu_sim(.resetb(resetb), .clk(clk), .cmd(cmd),
              .src1(op1), .src2(op2), .cin(carry),
              .result(result), .cf(cf), .vf(vf), .nf(nf), .zf(zf));

      // other testbench codes

      initial begin
        $dumpfile("alu.vcd");
        $dumpvars();
    `ifdef iPROVE
        $iProveOpenCard(`CARD_ID);
        $iProveInitCard(`CARD_ID, "ALU.tcf");
        $iProveLoadModuleInfoFile(`CARD_ID, "ALU.mit");
        $iProveCycLoadSignalInfoFile("alu", "ALU.pin");
    `endif
        clk = 1'b0;
        resetb = 1'b1;
        repeat (posedge clk);
        resetb = 1'b0;
        // other testbench codes
        $iProveCloseCard(`CARD_ID);
        $stop;
      end
    endmodule

Automatically generated by iPROVE software ($iProve… are system tasks for iPROVE defined as PLI routines):

    $iProveCycSignalWrite(modhl_alu, sighdl_reset, reset);
    $iProveCycSignalWrite(modhl_alu, sighdl_cmd, cmd);
    $iProveCycClockAdvanceByModule(modhl_alu, sighdl_clk);
    $iProveCycSignalRead(modhl_alu, sighdl_cf, cf);
    $iProveCycSignalRead(modhl_alu, sighdl_vf, vf);
    $iProve…;

28 Transaction-level with C (1/3)
A simple SSRAM example

29 Transaction-level with C (2/3)
Step 1: Start with the EDIF of the SSRAM (needs a synthesizer)
Step 2: Make the FPGA mapping data
Step 3: Modify the testbench by inserting PLIs for iPROVE
Step 4: Run the SSRAM with iPROVE and the HDL simulator
The DUT and transactor go to iPROVE
The testbench runs on the host computer

30 Testbench example (Transaction-level)
    #include "iprove.h"

    int main(int argc, char** argv)
    {
        // other codes
        iProveOpenCard(card_id);
        iProveInitCard(card_id, tcf);
        iProveLoadModuleInfoFile(card_id, mit);
        iProveGetModuleHandle(instance_name, &module_handle);
        iProveAllocReadBuffer(module_handle, sbm_size);
        iProveAllocWriteBuffer(module_handle, sbm_size);
    #ifdef BILA
        iProveBILAConfig(card_id, trg);   // set the trigger condition
        iProveBILATrigOn(card_id);
    #endif
        iProveStart(card_id);
        TestBench();
        iProveBILAUpload(card_id, dmp);   // read back the dump data
        bila_info.cid = card_id;
        bila_info.dump_filename = dmp;
        bila_info.signallist_filename = lst;
        WithCheck(iProveDump2Vcd(&bila_info, 1, vcd));
        iProveStop(card_id);
        iProveCloseCard(card_id);
        return 0;
    }

    void TestBench(void)
    {
        // other codes
        iProveCmdWrite(module_handle, &cmd, 1);
        iProveDataWrite(module_handle, pbuf, num, &tmp);
        iProveDataRead(module_handle, pbuf, num, &tmp);
    }

31 Performance comparisons
IDCT: 59K gates FPACC0: 56K gates FPACC1: 104K gates FPACC2: 208K gates

32 iPROVE performance
iPROVE provides an outstanding speed-up of over x2000.
Example: FPACC2 (floating-point number calculation IP)
Gate count: 208,479
Logic usage: 99% of XCV1000E
[Chart: speed-up bars for ModelSim, iPROVE with ModelSim, iPROVE with cycle-level C-API, and iPROVE with transaction-level C-API. Speed-up labels as extracted: x2053, x1, x47, x69.]

33 iPROVE-Summary
Easy to use and fast setup time to emulation
No or minor source modification
The same testbench for simulation and emulation
Various verification modes
Cycle, transaction and abstract bus modes
Powerful debugging
BILA (Built-in Logic Analyzer) as a real hardware logic analyzer
High-performance interface to S/W side
High-speed DMA feature
High to low level languages such as C/C++, Verilog and VHDL
Open interface system
API layer provides an easy-to-interface mechanism to de facto standards
Scalability
Multiple iPROVE cards as well as various gate count options

34 SoC Verification Key technologies in SoC Verification
Early/Consistent Verification Environment Progressive Refinement Multi-level, Multi-lingual Verification

35 ASIC Verification Methods
[Chart: running speed versus investment, with the ideal verification solution marked "Make it faster" and "Make it cheaper". Tick labels as extracted: 100MHz Real Silicon; 10MHz Rapid Prototype; 1MHz HW Emulator; 100KHz; 10KHz; 1KHz HW Accelerator; 100Hz SW Simulator; 10Hz.]

36 What’s the point in SoC Verification?
Mixture of SW and HW Make it easier to cooperate with Processor Model such as ISS or BFM Mixture of pre-verified, not-verified components Make it easier to utilize legacy IPs already verified Mixture of different language, different abstraction level Provide common interface structure between SoC components

37 Canonical SoC design flow
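[Figure: canonical SoC design flow. Recoverable labels: system spec.; system design with HW IP and SW IP (HW-SW co-design); HW/SW partitioning; HW development and SW development; HW refinement (UT->T->RTL) down to gate level; SW refinement (RTOS mapping) down to final code; functional, gate-level, software and HW-SW co-verification; emulator; in-system emulator; HW-SW co-debugging.]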

38 Tools for HW-SW Co-Verification
[Figure: the design flow (system spec.; system design with HW IP and SW IP; HW/SW partitioning; HW development and SW development; HW refinement (UT->T->RTL); SW refinement (RTOS mapping); functional verification; software verification; HW-SW co-verification) annotated with tools.]
Tools: high-level synthesis; testbench automation; IP accelerator; HW-SW co-simulation; ISS; RTOS simulator

39 Tools for System-level
[Figure: top of the design flow: system spec.; system design with HW IP and SW IP (HW-SW co-design); HW/SW partitioning.]
System-level design (performance analysis tools)
Hot-spot analyzer
High-level cycle count estimation
High-level power analysis
High-level chip area estimation
On-chip-bus traffic estimation

40 Verification Environment
Early test-bench setup: Accurate and fast test-bench setup in an early design stage greatly reduces verification time and effort.
Consistent test-bench utilization: Once the test bench is built, it must be reused consistently in the following design steps.
In-system test bench: The test bench must be switchable between SW simulation and in-system verification to cover all corner cases.

41 In-System Verification
In-System Gate Level Verification design synthesis manufacture Integration test silicon spec. RTL gate board functional verification formal verification test pattern In-System Behavioral Level Verification

42 Flexible Verification Environment
C Test Bench HDL Test Bench In-System Test Bench Conventional Verification Environment C Model HDL Design Gate Level Design C Test Bench In-System Test Bench HDL Test Bench HDL Test Bench In-System Test Bench Flexible Verification Environment C Model C Model HDL Design Gate Level Design Gate Level Design

43 Progressive refinement
With the advent of design reuse methodology for System-on-a-Chip designs, a mixture of C, HDL, EDIF netlist and IP core blocks must be verified together as one system.
For a large design, it is necessary to verify the design blocks/modules one after another until the whole design is verified.
IP has to be prepared at various abstraction levels in order to support the progressive refinement process.
[Figure: a typical SoC chip (uP core, SRAM, FLASH, D-cache, USB, MPEG, FIFO, logic) under incremental/progressive refinement. Abstraction labels as extracted: EDIF, RTL, BCA, TF, UTF.]

44 Multi-Level & Multi-Lingual
level of abstraction Multiple Programmable Cores (20%) Algorithm Functional UT (20~50%) Memory other IPs (>20%) Custom contents (15~20%) Behavioral BCA RTL CA gate TA EDIF (gate-level netlist) HDL (Verilog, VHDL) SystemC (HW) C/C++ (HW) C/C++ (SW) UT: untimed, BCA: bus cycle accurate, CA: cycle accurate, TA: timing accurate, RTL: register transfer level

45 Supporting Multi-Language
Simulation Vehicle HDL Simulator User C process ( C/C++/SystemC model for HW or SW ) ISS for embedded processor core Test description language (Vera, TestBuilder) Emulation Vehicle FPGA containing one or more IP’s (enables gate-level IP verification) FPGA interfacing with target system (enables in-system verification) Communication channel between vehicles IPC (inter-process communication) for designs simulated in multiple processes Dedicated device driver for designs mapped in FPGAs

46 Supporting Multi-Level
Bridging abstraction gap Using transactor Using cycle-level transactor Read Channel Write Channel Transaction -Level C/HDL Model Transactor Cycle Accurate HDL/EDIF Model Read Channel Cycle Accurate C/HDL Model Cycle Accurate API Write Channel Cycle-Level Transactor Cycle Accurate Model

47 Multi-Level & Multi-Lingual
C sessions HDL sessions Design in Verilog Design in VHDL Design in C Design in SystemC Transactor Transactor Inter-Lingual Communication TIE EDIF sessions I/F protocol I/F protocol Transactor Transactor Design in EDIF Design in EDIF Target board

48 iSAVE-MP & MPEG2/4 iSAVE-MP main iSAVE-MP TIM GUI windows
Decoded image MPEG Board

49 ILC(Inter-lingual Communication)
SoC model with ARM CCM Debugger ARM CCM Memory model IP models IP in HDL AMBA model Address Decoder Bus wrapper ILC(Inter-lingual Communication)

50 Using Multiple FPGA’s Using multiple FPGAs
Partitioning into multiple FPGAs Bus split Host Processor ARM ISS Memory Model FPGA1 FPGA2 Transactor Transactor Bus Split Logic IP0 IP1 IP2 IP3

51 Debugging in Multi-Level
Traditional debugging tools Design in emulation vehicle Logic analyzer Design in simulation vehicle Source-level debugger Waveform viewer Challenges in SoC How to manage waveforms from different abstraction level How to manage trigger conditions How to probe out internal signals of designs in emulation vehicles

52 Debugging in Multi-Level
Built-in logic analyzer Built-in logic analyzer enables the designer to watch what is actually going on. Built-in logic analyzer samples the states of the DUT and stores them in the external dump memory. (non-intrusive) FPGA Configure FPGA Download Trigger Design Under Test Built-In Logic Analyzer Run Upload Post Processing External Dump Memory VCD
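The sampling behavior of a built-in logic analyzer can be sketched as a small software model. The slides do not describe BILA's internals, so everything here is an assumption for illustration: the circular dump memory, its depth, the mask/value trigger and the post-trigger sample count are hypothetical, as are all names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DUMP_DEPTH 8u  /* hypothetical dump-memory depth, in samples */

typedef struct {
    uint32_t mem[DUMP_DEPTH]; /* stands in for the external dump memory */
    size_t   head;            /* next write position (circular)         */
    size_t   count;           /* valid samples stored (<= DUMP_DEPTH)   */
    int      triggered;       /* set once the trigger condition matched */
    size_t   post_left;       /* samples still to store after trigger   */
    uint32_t trig_mask;
    uint32_t trig_value;
} bila_t;

/* Configure the trigger: fire when (state & mask) == value, then keep
 * `post` more samples before stopping. */
static void bila_config(bila_t *b, uint32_t mask, uint32_t value, size_t post)
{
    assert(b != NULL && post <= DUMP_DEPTH);
    b->head = 0;
    b->count = 0;
    b->triggered = 0;
    b->post_left = post;
    b->trig_mask = mask;
    b->trig_value = value;
}

/* Call once per DUT clock with the probed state.
 * Returns 1 while capture is still active, 0 once it is complete. */
static int bila_sample(bila_t *b, uint32_t state)
{
    if (b->triggered && b->post_left == 0) {
        return 0;                          /* capture finished, ignore */
    }
    b->mem[b->head] = state;               /* overwrite oldest sample  */
    b->head = (b->head + 1u) % DUMP_DEPTH;
    if (b->count < DUMP_DEPTH) {
        b->count++;
    }
    if (b->triggered) {
        b->post_left--;
    } else if ((state & b->trig_mask) == b->trig_value) {
        b->triggered = 1;
    }
    return !(b->triggered && b->post_left == 0);
}
```

The model only observes the state passed to it and never drives the DUT, which mirrors the non-intrusive property claimed on the slide.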

53 Debugging in Multi-Level
Built-in logic analyzer
The triggering condition is dynamically configured.
After the emulation is over, the dump data in the external memory is read and processed to generate a VCD file, which is then opened in a waveform viewer.
VCD file:
    $date Fri Dec 6 22:50: $end
    $version 4.10
    $timescale 100ps
    $scope module BILA $end
    $var reg 32 ! user_data $end
    $var reg 1 " write_en $end
    $var reg 4 $ mode $end
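The post-processing step that turns dump samples into VCD text can be illustrated with a formatter for a single value change. This is a sketch, not the dump2vcd tool: the function name is hypothetical, and it only covers the standard VCD value-change syntax (a scalar is written as the value directly followed by the identifier code, a vector as `b`, the binary digits, a space and the identifier code).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Format one VCD value change into buf.
 * width == 1 produces e.g. 1" and wider signals produce e.g. b1010 $
 * Returns the number of characters written (excluding the terminator),
 * or 0 if the width is unsupported or buf is too small. */
static size_t vcd_format_change(char *buf, size_t cap,
                                unsigned width, uint32_t value, char id)
{
    assert(buf != NULL);
    if (width == 0 || width > 32) {
        return 0;
    }
    if (width == 1) {
        if (cap < 3) {
            return 0;
        }
        buf[0] = (value & 1u) ? '1' : '0';
        buf[1] = id;
        buf[2] = '\0';
        return 2;
    }
    /* 'b' + width digits + ' ' + id + terminator */
    size_t need = (size_t)width + 4u;
    if (cap < need) {
        return 0;
    }
    size_t pos = 0;
    buf[pos++] = 'b';
    for (unsigned bit = width; bit-- > 0; ) {   /* most significant bit first */
        buf[pos++] = ((value >> bit) & 1u) ? '1' : '0';
    }
    buf[pos++] = ' ';
    buf[pos++] = id;
    buf[pos] = '\0';
    return pos;
}
```

With the identifier codes from the header above, a 4-bit `mode` value of 10 is written as `b1010 $` and a `write_en` value of 1 as `1"`. A complete writer would also emit a `#<time>` line before each group of changes.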

54 Debugging in Multi-Level
Probing internal nodes Sometimes the designer wants to watch internal nodes in his design. Internal node probing enables this by wiring-out the internal nodes to the boundary of the DUT top block. Top block DUT Built-In Logic Analyzer Sub-block External Dump Memory Internal node

55 Debugging in Multi-Level
Monitoring software variables Software dump data is merged with hardware dump data (Built-In Logic Analyzer) to generate unified waveform. The waveform contains both hardware and software debugging information. Built-In Logic Analyzer Dump data with timing information VCD with both Hardware and software Debugging information dump2vcd Software variable Dump data with timing information

56 Using Multiple FPGAs Synchronous Built-In Logic Analyzer
When the design is partitioned into multiple FPGAs, the Built-In Logic Analyzer (BILA) in each FPGA samples the internal state of that FPGA. All the dump data are merged to provide the user with a unified and synchronized waveform.
[Figure: the BILA in each of FPGA #1 to FPGA #n produces dump data #1 to #n, which dump2vcd merges into a unified VCD.]
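Merging per-FPGA dump data into one synchronized stream amounts to a k-way merge by sample time. The sketch below is an assumption about how such a merge could be done, not a description of dump2vcd: it presumes that each dump is already sorted by time and that all dumps share a common time base, and the record layout and names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t time;    /* sample time in a common time base (assumed) */
    unsigned source;  /* which FPGA (or process) produced the sample */
    uint32_t value;
} dump_rec_t;

#define MAX_STREAMS 16u  /* illustrative limit */

/* Merge k time-sorted dump streams into one time-sorted stream.
 * Samples with equal time keep stream order (lower index first).
 * Returns the number of records written, or 0 if out is too small. */
static size_t merge_dumps(const dump_rec_t *const streams[],
                          const size_t lengths[], size_t k,
                          dump_rec_t *out, size_t cap)
{
    size_t pos[MAX_STREAMS] = {0};   /* read position in each stream */
    size_t total = 0, written = 0;

    assert(k <= MAX_STREAMS);
    for (size_t s = 0; s < k; ++s) {
        total += lengths[s];
    }
    if (total > cap) {
        return 0;
    }
    while (written < total) {
        size_t best = k;             /* k means "none chosen yet" */
        for (size_t s = 0; s < k; ++s) {
            if (pos[s] == lengths[s]) {
                continue;            /* stream exhausted */
            }
            if (best == k ||
                streams[s][pos[s]].time < streams[best][pos[best]].time) {
                best = s;
            }
        }
        out[written++] = streams[best][pos[best]++];
    }
    return written;
}
```

The same routine would apply unchanged to software-variable dumps, as long as their records carry timestamps in the same time base as the hardware dumps.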

57 Using Multiple FPGAs Multiple FPGAs with multiple Processes
BILA in each FPGA samples FPGA states. SVA in each process samples program states. All of the dump data are merged

58 Using Multiple FPGAs
Multiple FPGAs with multiple processes
[Figure: the BILA in each of FPGA #1 to FPGA #n and the SVA in each of process #1 to process #m produce dump data; dump2vcd merges all dump data into a unified VCD.]

