1
Prototyping and Emulation
이광엽 (Seokyeong University, Department of Computer Engineering) Copyrightⓒ2003
2
Course Overview (Learning Map)
3
Contents
- Verification Methodology
- Emulation
- FPGA Architecture for Emulators
- Prototyping
- Virtual Chip System
- Case Study
- Conclusion
4
Trend of Verification Effort in the Design
- Verification now accounts for anywhere from 50 to 80% of the total development effort for a design.
- 1996, 300K gates: Code, Verify (30~40%), Synthesis, P&R
- 2000, 1M-gate SoC: Code, Verify (50~80%), Synthesis, P&R
- Source: Verification Methodology Manual, TransEDA, 2000
5
Verification Performance Gap: more serious than the design productivity gap
- There is a growing gap between the demand for verification and the simulation technology offered by the various options.
- [Figure: simulation performance and verification complexity plotted against design complexity (small ASIC, medium ASIC, complex ASIC, SoC); the distance between the two curves is the verification performance gap.]
- Event-based simulation: below 10 CPS
- Cycle-based simulation: below 1K CPS
- HW simulation acceleration: below 10K CPS
- Source: P. Rashinkar, System-on-a-Chip Verification, 2001
6
Verification Methodology
Verification alternatives
- Simulation
- Hardware-accelerated simulation
- Emulation
- Prototyping
- Virtual Chip System
7
Overview of Verification Methodologies
- Prototyping: faster speed, closer to the final product
- Emulation
- Hardware-accelerated simulation
- Simulation: basic verification tool
- Semi-formal verification
- Formal verification: bigger coverage
8
Design Verification Methodologies
- Simulation: allows observation of a fraction of real-world interactions (sequential).
- Emulation: enables the designer to explore alternatives (parallel).
  - Verification takes place in a hardware environment.
  - The design is retargeted to a programmable hardware environment.
- Rapid prototyping: allows operating speeds such that all interfaces to target applications can operate in real time.
  - If the system runs in real time, the quality of the algorithm can be evaluated on the fly and DSP design time can be greatly reduced.
- [Figure: stimuli, software model and software algorithm running on a workstation or accelerator.]
9
Software Simulation
Pros
- The design size is limited only by the computing resources.
- Simulation can be started as soon as the RTL description is finished.
- Set-up cost is minimal.
Cons
- Slow (~100 cycles/sec); the gap between the speed of software simulation and real silicon keeps widening. (Simulation time = size of the circuit simulated / speed of the simulation engine)
- The designer does not know exactly what percentage of the design has been tested.
10
Hardware-Accelerated Simulation
- Simulation performance is improved by moving the time-consuming part of the design to hardware.
- Usually, the software simulator communicates with an FPGA-based hardware accelerator.
- [Figure: simulation environment with testbench, Module 0 and Module 1; Module 2 is synthesized and compiled into the FPGAs of the hardware accelerator.]
11
Hardware-Accelerated Simulation
Pros
- Fast (100K cycles/sec)
- Cheaper than hardware emulation
- Debugging is easier because the circuit structure is unchanged.
- Not an overhead: deployed as a stepping stone in gradual refinement
Cons (obstacles to overcome)
- Set-up time overhead to map the RTL design into the hardware can be substantial.
- SW-HW communication speed can degrade performance.
- Debugging of signals within the hardware can be difficult.
12
Emulation
- Imitating the function of another system to achieve the same results as the imitated system.
- Usually, the emulation hardware comprises an array of FPGAs (or special-purpose processors) and an interconnection scheme among them.
- About 1000 times faster than simulation.
- [Figure: verification spectrum: simulation, hardware-accelerated simulation, emulation, prototyping.]
13
Emulation
Pros
- Fast (500K cycles/sec)
- Verification on the real target system
Cons
- Set-up time overhead to map the RTL design into hardware is very high.
- Many FPGAs plus debugging resources mean high cost.
- The circuit partitioning algorithm and the interconnection architecture limit the usable gate count.
14
Emulation Challenges
- Efficient interconnection architecture and hardware mapping efficiency, for speed and cost
- RTL debugging facility with a reasonable amount of resources
- Efficient partitioning algorithm for any given interconnection architecture
- Reducing development time (to take advantage of more recent FPGAs)
15
H/W Emulation
Conventional design flow of a chip
- Specification → architecture design → behavioral model
- Behavioral model → RTL model (manual job or HLS)
- RTL model → gate-level model (logic synthesis tool); H/W emulation is used to verify the gate-level model
- Gate-level model → placement & route (P&R tool)
- Fabrication → chip
16
H/W Emulation
Verification system
- Map the gate-level model into an FPGA array.
- Verify the gate-level model on the H/W board before fabricating the chip.
- Very high simulation speed
Rapid prototyping system
- A H/W prototype of the chip can be obtained.
- The H/W prototype is available only for the gate-level model.
- Co-verification with the target H/W board is postponed until the gate-level design stage.
17
Prototyping
- Special (more dedicated and customized) hardware architecture made to fit a specific application.
- [Figure: verification spectrum: simulation, hardware-accelerated simulation, emulation, prototyping.]
18
Prototyping
Pros
- 10X higher clock rate than emulation
- Components as well as the wiring can be customized.
- Can be carried and shipped for demos or customer evaluation
Cons
- Not flexible for design changes (even a small change requires a new PCB).
19
Application-Based Verification
- About 90% of ASIC designs work right the first time, although only about 50% work right the first time in the system, because most ASIC design teams do not do system-level simulation.
- Running significant amounts of real application code is the only way to reach this level of confidence in an SoC design.
- The available options for rapid prototyping include:
  - FPGA or LPGA prototyping
  - Emulation-based testing
  - Real silicon prototyping
20
FPGA and LPGA Prototyping
- For small designs
  - FPGA: reprogrammable, allowing rapid turnaround of bug fixes
  - LPGA: higher gate counts, faster clock speeds
- For a prototype of a single large chip
  - Use multiple FPGAs to build the prototype.
  - However, quick modification is impossible when a bug fix requires repartitioning the design between devices.
21
Silicon Prototyping
- If an SoC design is too large for the options above, building a real silicon prototype may be the best option.
- Reasonable set of criteria:
  - The bug rate from simulation testing should have peaked and be on its way down.
  - The time to determine that a bug exists should be much greater than the time to fix it.
  - The cost of fabricating and testing the chip is on the same order
- The scenario we want to avoid is building a prototype only to find a critical bug that prevents any useful debug of the prototype.
- Features that help facilitate debug of this initial prototype:
  - Good debug structures for controlling and observing the system
  - The ability to selectively reset individual blocks in the design
  - The ability to selectively disable various blocks, to prevent bugs in these blocks from affecting operation of the rest of the system
22
Emulation
- Introduction
- Anatomy
- FPGA Architecture for Emulators
- Design for Emulation
- Emulation System
- Example description: APTIX System
23
Introduction
A logic emulator is a system of:
- Programmable hardware with capacity much greater than one FPGA
- Software that automatically programs the hardware according to a gate-level design representation
- Software and hardware to support operation and analysis of the emulated design as a component in real hardware
24
Why Emulate?
- Concurrent design verification: faster time to market
- Higher predictability of schedule: reduced project risk
- Fewer design changes in the final phases: improved quality
- Fewer silicon iterations: lower cost
- For mission-critical designs: high quality
- Simulation-only methodologies are limited by processing power when verifying complex designs: better verification
- Emulation is the only verification methodology that is keeping up with system complexity.
25
Advantages of Emulation
- Emulation is the only verification methodology that is keeping up with system complexity.

Time for the same workload    Speed-up factor
1 second                      10^7  (actual hardware)
10 seconds                    10^6  (logic emulation)
2 minutes                     10^5
16 minutes                    10^4
3 hours                       10^3
1 day                         10^2
12 days                       10^1
months                        1     (software simulation)
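The rows of the table follow from one scaling rule: a workload that takes one second on the real chip (speed-up 10^7) takes 10^7 / speed-up seconds on a slower platform. A minimal sketch of that arithmetic; the function names are illustrative and not part of the original slides.

```c
#include <stdio.h>

/* Speed-up of real hardware relative to software simulation,
   taken from the top row of the table (10^7). */
#define HW_SPEEDUP 1.0e7

/* Wall-clock seconds needed on a platform with the given speed-up factor
   (1 = software simulation, HW_SPEEDUP = actual hardware) to run a
   workload that takes `hw_seconds` on the real chip. */
double platform_seconds(double hw_seconds, double speedup)
{
    return hw_seconds * HW_SPEEDUP / speedup;
}

/* Print one row per decade of speed-up (hypothetical helper). */
void print_scaling_table(double hw_seconds)
{
    double speedup;
    for (speedup = HW_SPEEDUP; speedup >= 1.0; speedup /= 10.0) {
        printf("speed-up %10.0f : %12.0f s\n",
               speedup, platform_seconds(hw_seconds, speedup));
    }
}
```

Calling `print_scaling_table(1.0)` from a `main` function lists eight rows from 1 s up to 10^7 s, which is roughly four months and matches the "months" entry for pure software simulation.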
26
Advantages of Emulation
- Emulation performance is not a function of design size.
27
Motivation: Verification Realism
Often the chip meets its spec but does not work in the system:
- Spec errors and misunderstandings: “I thought your chip was handling that…”
- The real system puts designs into unanticipated situations:
  - Interaction between components across time and function: combinatorial explosion
  - e.g. the Ethernet driver interrupts a page fault which is servicing a floating-point exception.
- Other parts of the system don't adhere to their specs, or their specs aren't known:
  - Undocumented behavior in other devices, such as CPUs and peripherals from other projects or other vendors.
28
Motivation: Verification Realism (cont.)
- Some applications need real-time operation for verification:
  - Displays are far easier to verify by actual observation
  - Closed-loop operation with analog hardware
  - Electro-mechanical controllers
  - Human perception: audio, video compression, processing
- Simulation generally requires test vector development:
  - Costly and difficult, on the critical path of the schedule
  - Verification depends on test vector correctness
  - Test vectors may have to be based on assumptions
  - Test vectors are intrinsically open-loop
29
Motivation: Verification Realism (cont.)
Correctness is assured only when the real design is running its real application in its real environment:
- An emulated design connected to actual hardware can run:
  - actual diagnostic code,
  - compatibility tests,
  - actual operating systems,
  - actual applications,
  - receiving real data from storage, sensors and devices,
  - sending real data to storage, devices and displays.
30
Motivation: Visibility
- Once a chip is fabricated, placed in a system, and fails:
  - Internal probing is impossible.
  - It may be difficult or impossible to put the simulation into the failing state for analysis.
- An emulated design can have internal probes programmed in, for direct connection to instrumentation.
- An emulated design may be used to generate test vectors for fabrication.
31
What is Emulation?
Turnkey rapid prototyping systems
- Read the user's design and automatically partition and map it to an array of FPGAs
- Enable the user to run at system level and verify with application software
- Full internal visibility for debugging: thousands of probes
- Modify the design in minutes
32
Comparison with FPGA
FPGA
- High chip capacity
- Slow compilation
- Low I/O-to-gate ratio
Emulation
- Fast compile speed
- Productive debugging
- High I/O-to-gate ratio
- On-board logic analyzer
Generic FPGAs used for emulation
- Unpredictable capacity and highly variable routing delays, with poor debuggability
33
Bugs Found with Emulation:
- Functional ASIC bugs
- Board/system-level bugs
- Software and firmware bugs
- Synthesis bugs
- Bugs that require rich, real-world stimulus or high throughput to find
- Bugs caused by spec misinterpretation
34
Comparison with Co-simulation
- The performance potential of a simulation accelerator is not achievable with current testbench strategies. Limiting factors:
  - Speed of the testbench (workstation)
  - Channel latency and bandwidth
  - Frequency of communication
  - Execution speed of the design under test
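The four factors above can be combined into a rough per-cycle cost model. The sketch below is only an illustration under simple assumptions (testbench, channel and accelerator work strictly in sequence for every design cycle); the function and parameter names are hypothetical and the numbers in the note are not measurements.

```c
/* Effective co-simulation throughput in design cycles per second.
   tb_sec              workstation time spent in the testbench per cycle
   dut_sec             accelerator time to evaluate one design cycle
   exchanges_per_cycle testbench/accelerator exchanges per cycle
   latency_sec         channel latency paid on every exchange
   bytes_per_exchange  payload moved per exchange
   bandwidth_bytes_s   channel bandwidth in bytes per second (must be > 0) */
double cosim_cycles_per_sec(double tb_sec, double dut_sec,
                            double exchanges_per_cycle, double latency_sec,
                            double bytes_per_exchange,
                            double bandwidth_bytes_s)
{
    /* Communication cost per cycle: latency plus transfer time, per exchange. */
    double comm_sec = exchanges_per_cycle *
                      (latency_sec + bytes_per_exchange / bandwidth_bytes_s);

    /* The three contributions are assumed to add up without overlap. */
    return 1.0 / (tb_sec + dut_sec + comm_sec);
}
```

With an accelerator that needs 10 microseconds per cycle (100K cycles/sec on its own) and a testbench that needs 1 millisecond per cycle, the model yields about 990 cycles/sec; making the accelerator ten times faster only raises this to about 999 cycles/sec, which is the point of the slide.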
35
Anatomy of an Emulator
36
Emulator Architecture
- A hierarchical multiplexed architecture simplifies the design mapping process.
37
System Overview: SW Components
- Design compiler
  - Netlist reader and parser: reads and parses gate-level design netlists.
  - Technology mapper: maps design components into optional emulator equivalents.
  - System-level partitioner and placer: partitions the mapped design into boxes, boards, and ultimately into FPGA netlists.
  - System-level interconnect router: determines the programming of the interconnect hardware to complete nets cut by the partitioner.
  - FPGA compiler: reads each FPGA netlist, then maps, partitions, places and routes the FPGA.
  - Timing analysis (optional): analyzes the compiled design on the emulation hardware for speed and hold violations.
- Runtime download and analysis controller
- Graphical user interface
- Hardware diagnostics
38
System Overview: HW Components
- Logic emulation boards: FPGAs and interconnect chips
- Memory emulation boards: RAMs, FPGAs and interconnect
- System interconnect board: chips that interconnect the emulation boards
- I/O connectors and pods: connect to in-circuit interfaces and external components
- Instrumentation: stimulus generator, logic analyzer, vector interface
- Controller: downloads configurations, operates instruments
- Interface: to the host computer
39
FPGA Architecture for Emulators
- Partitioning, placement and routing optimized for emulator performance
- Efficient FPGA interconnection technology (currently ~25% of FPGA logic gates are used)
- Interconnection of logic blocks, of multiple FPGAs, and of multiple emulator modules
- Incremental design change
- Observability and controllability of the debug process
- Memory resources (separate memory or FPGA RAM)
- Clock lines (low skew, no setup and hold time violations)
- Lower cost (than silicon)
40
Interconnect Problem
- It is critical to maximize gate capacity and speed by packing as much logic into each FPGA as possible.
- The interconnect hardware architecture must:
  - Provide successful connectivity in all cases,
  - Permit maximum logic utilization of the FPGAs,
  - With minimum added delay and skew,
  - At minimum hardware cost.
- Rent's Rule applies:
  - Observation by Rent at IBM in the 1960s: the pin count of an arbitrary subpart of a digital system is proportional to a fractional power of its gate count.
  - P = K * G^r
  - Example in high-performance systems: pins = 2.5 * gates^0.56
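A short worked example of Rent's Rule with the high-performance constants quoted above (K = 2.5, r = 0.56). The gate counts and the 300-pin budget in the last line are illustrative assumptions, chosen only to show the orders of magnitude involved.

```latex
\begin{align*}
P &= K\,G^{r}, \qquad K = 2.5,\ r = 0.56 \\
G = 10^{3}: \quad P &\approx 2.5 \cdot 10^{1.68} \approx 120 \\
G = 10^{4}: \quad P &\approx 2.5 \cdot 10^{2.24} \approx 434 \\
G = 10^{5}: \quad P &\approx 2.5 \cdot 10^{2.80} \approx 1577 \\
G &= \left(\frac{P}{K}\right)^{1/r}, \qquad
P = 300: \quad G \approx 120^{1/0.56} \approx 5.2 \times 10^{3}
\end{align*}
```

Ten times more gates need only about 3.6 times more pins, but read in the other direction, a fixed budget of a few hundred pins caps an arbitrarily partitioned block at a few thousand gates under these constants.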
41
Interconnect Problem
- Commercial FPGAs are sized for engineered applications:
  - The designer is designing for FPGA structures.
  - The designer architects the system so that subparts fit into FPGA pin counts.
  - Vendors design FPGAs accordingly.
- Emulated designs are completely arbitrary:
  - Structures are not optimal for FPGAs.
  - FPGA size and pin count are arbitrary.
  - FPGA subparts are automatically extracted by the partitioner.
- Result: FPGAs are pin-limited, not gate-limited.
  - A logic emulator gets 20~30% as much gate utilization as ordinary FPGA applications.
  - The challenge for the interconnect architecture and software is to maximize gate utilization within the FPGA pin count.
42
Field Programmable Interconnects
Aptix FPIC
- A place-and-route architecture (not a crossbar)
- Routing delay not controllable
- 940 user-programmable I/Os
IQ 160
- 176 × 176 crossbar
- Every port can be configured to connect to any other port
- Routing delay is predictable
FPGAs
- A place-and-route architecture
- Routing delay not entirely controllable
- Typically fewer than 300 programmable I/Os
43
Virtual Wires (IBM)
- Increases bandwidth by multiplexing
- More than 80% gate utilization, but decreased emulation speed
44
Partial Crossbar Interconnect
- Relies on the ability of FPGAs to assign pins freely.
- Each small full-crossbar chip is connected to the same subset of pins on each logic chip.
- To configure the system, logic partitioning, placement and routing are performed.
- Placement is insignificant, since the interconnection is symmetrical.
- Interconnect routing is a simple, repetitive, table-driven task.
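The table-driven character of partial crossbar routing can be sketched in a few lines. The example below is a toy model, not the algorithm of any particular emulator: the sizes, the type `PartialCrossbar` and the functions `pxb_init` and `pxb_route` are all hypothetical. It keeps one table of free pins per (crossbar, FPGA) pair and routes each net through the first crossbar chip that still has a free pin on every FPGA the net touches.

```c
#define NUM_FPGAS     4   /* logic chips in this toy system */
#define NUM_XBARS     3   /* small full-crossbar chips */
#define PINS_PER_LINK 2   /* pins from each FPGA to each crossbar chip */

/* Routing table: free pins between every crossbar chip and every FPGA. */
typedef struct {
    int free_pins[NUM_XBARS][NUM_FPGAS];
} PartialCrossbar;

/* Every crossbar chip starts with the same pin subset on every FPGA. */
void pxb_init(PartialCrossbar *pxb)
{
    int x, f;
    for (x = 0; x < NUM_XBARS; x++)
        for (f = 0; f < NUM_FPGAS; f++)
            pxb->free_pins[x][f] = PINS_PER_LINK;
}

/* Route one net that touches the distinct FPGAs fpgas[0..count-1].
   Returns the index of the crossbar chip used, or -1 if no chip has a
   free pin on every FPGA of the net. */
int pxb_route(PartialCrossbar *pxb, const int *fpgas, int count)
{
    int x, i;
    for (x = 0; x < NUM_XBARS; x++) {
        int usable = 1;
        for (i = 0; i < count; i++) {
            if (pxb->free_pins[x][fpgas[i]] <= 0) {
                usable = 0;
                break;
            }
        }
        if (usable) {
            /* Claim one pin on each FPGA of the net. */
            for (i = 0; i < count; i++)
                pxb->free_pins[x][fpgas[i]]--;
            return x;
        }
    }
    return -1;
}
```

Because every crossbar chip sees every FPGA in the same way, the candidate set for a net does not depend on which physical FPGA holds which partition, which is why placement matters so little in this architecture.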
45
Design for Emulation
- Synchronous design
- Pipeline design techniques
- Short arithmetic functions (minimize logic levels)
- Minimize bit width
- Use I/O FFs where possible (latches)
- Careful mapping between functions and FPGAs
- Minimize high-fanout nets or use available buses
- Use small blocks (20-40K gates)
- Take care when gating the clock tree (e.g. for low power)
- Limit module I/O count
46
Design Flow for Emulation System
- Design netlist → Design Import → Compile → System Setup → Download → Emulation
- Logic analyzer, debugger
47
Synthesis to Support Emulation
- Synthesis is used to compile behavioral models for emulation.
48
Need for System Level Emulation
- Emulation combines the flexibility of simulation and the realism of a prototype.
  - Simulation: limited by the availability of software models
  - Custom prototyping: increased time to build and debug
- Leverages synthesis technology to optimize design differentiation and uniqueness
- Enables fast, incremental design changes that shorten design iteration cycles and improve quality
- Avoids costly respins of silicon and saves months of redesign
- Increased confidence in your entire project schedule and your ability to meet requirements
49
Example description: APTIX System Explorer
Features of System Explorer
- Functions as emulation equipment
- Supports a co-simulation mode with other emulation equipment (MVP: Module Verification Platform)
- The workstation synthesizes the DUT, controls System Explorer, and runs software simulation
- Waveform observation through a logic analyzer
- Connection to system components over Ethernet
System Explorer configuration
- Prototyping can be performed by mounting FPGA (Xilinx or Altera) modules and hard IP core modules inside System Explorer
- [Figure: workstation, System Explorer and HP logic analyzer connected via Ethernet.]
50
Example description: APTIX System Explorer
Procedure of emulation using System Explorer
- Uses Design Pilot and System Explorer
Design Pilot
- Design sequence: Setup → Import → Cell → Synthesis → Grouping → Optimize → Explorer
- Setup: select the System Explorer version, FPGA module type and synthesis tool
- Import: import RTL code or gate-level netlist files
- Cell: decide the components of the DUT (RTL code, hardware component, pre-synthesized component)
- Synthesis: synthesize the DUT
- Grouping: group components and distribute them to FPGA modules
- Optimize: optimize the design by adjusting the trade-off between speed and FPGA count
- Explorer: after deciding the main signal ports, generate the files to be used in in-circuit mode or co-simulation mode
51
Example description: APTIX System Explorer
- Takes the netlist and pin-map information from Design Pilot as input
- Performs placement & routing of the DUT
- Sets and controls the configuration of System Explorer
- Probes internal signals of the DUT
- Outputs waveforms in conjunction with the logic analyzer
- The accompanying figure shows the configuration-setting step in which module positions are decided.
- Set up System Explorer and FPGA P&R
- After P&R, download the DUT over Ethernet
- Observe waveforms through the simulator or the logic analyzer
52
Prototyping
- Need of Prototyping
- Motivation: Rapid Prototyping
- Reconfigurable Processors
- Virtual Chip System
53
Need of Prototyping: Disadvantage of Logic Emulation
- Once the emulated design is debugged, it is available for immediate use by software developers.
  - This can directly reduce the project's critical path and time-to-market.
- The emulated design is available for demonstration to customers, users and management.
  - Proof of concept, proof of progress.
  - Find out early whether the result will be satisfactory.
- Architectural workbench:
  - Drive emulation with RTL-level synthesis.
  - Experiment with architectural features on real code and data: structures, sizes and algorithms of caches, busses and buffers; quantity and design of functional units; novel architectures, representations, algorithms, etc.
54
Prototyping
Rapid prototyping
- Complements simulation and compensates for the less-than-100% coverage at the macro level
FPGA and laser prototyping technologies
- Provide the ability to create prototypes very rapidly: a very useful debugging mechanism
Building a prototype ASIC
- Required for macros that must be tested at speeds or gate counts exceeding those of FPGA and laser technologies
56
Motivation: Rapid Prototyping
- Once the emulated design is debugged, it is available for immediate use by software developers.
  - This can directly reduce the project's critical path and time-to-market.
- The emulated design is available for demonstration to customers, users and management.
  - Proof of concept, proof of progress.
  - Find out early whether the result will be satisfactory.
- Architectural workbench:
  - Drive emulation with RTL-level synthesis.
  - Experiment with architectural features on real code and data: structures, sizes and algorithms of caches, busses and buffers; quantity and design of functional units; novel architectures, representations, algorithms, etc.
57
Reconfigurable Processors
- Xilinx Virtex-II
- Altera Excalibur
58
Xilinx Virtex-II (1/2)
- This product line targets high-end communications.
- The high-speed serial channels can deal with demanding protocols like InfiniBand, 10-Gbit Ethernet, Serial ATA, and 3GIO.
- The PowerPC processors are major powerhouses.
  - They include 16-kbyte caches for code and data and incorporate an MMU that supports variable page sizes.
- The programmable matrix includes features such as dedicated 18-by-18-bit multiplier blocks and fast carry chains.
  - These are ideally suited for high-speed packet processing.
59
Xilinx Virtex-II (2/2)
- Xilinx targets high-performance communications with the Virtex-II Pro.
- The multigigabit transceivers can handle InfiniBand and other high-speed links.
- The configurable logic blocks, memory, and PowerPC processors must be designed to handle high-level protocols.
60
Altera Excalibur (1/2)
- Excalibur is a large programmable logic device (PLD), up to 1 million gates, with some fixed components tacked on one side of the array.
- The processor is a 200-MIPS ARM9.
  - It includes ARM's trace support plus a JTAG interface for debugging.
  - The ARM9 works with the ARMv4T instruction set, including the 16-bit Thumb extensions.
  - The processor has 8-kbyte data and code caches and an MMU.
- Other fixed components are a UART and some timers, an interrupt controller, and an external memory interface that supports flash, SRAM, and DDR SDRAM.
- One interesting aspect of the design is a mixture of dual-port and single-port RAM.
  - They are linked to the processor through the AMBA bus.
  - The second port of the dual-port RAM is dedicated to the PLD.
61
Altera Excalibur (2/2)
- Altera wraps a large PLD and single- and dual-port RAM around an ARM9 processor in its Excalibur reconfigurable processor.
62
Summary
63
Virtual Chip System
Rapid prototyping system (DAC'98)
- With only a behavioral C model, you can get a H/W prototype.
- The system can be verified at an earlier design stage.
- Design time in system design is reduced.
- [Figure: chip specification → modeling in C → Virtual Chip compile & debug on the host computer → Virtual Chip download to the target board in place of the target chip, with monitoring and profiling.]
64
Front-end Design Flow
- [Figure: front-end design flow. Labels: target chip specification (interface, function); Virtual Chip environment (Waveform Synthesizer, Ptolemy, FSM, I/O vars, C model, Interface Synthesizer, API inserter, C2V, H/W synthesis, IP library); C model with API; API functions; Verilog RTL of chip, of PSG and of block; conventional design flow; processor, PSG, processor IP; emulation (Virtual Chip) on the real board; real chip (Partita chip); steps performed by the designer.]
65
Benefits in Design Time
- [Figure: timelines of the conventional design flow and the Virtual-Chip-based design flow, with the stages architectural model, RTL, gate-level, H/W emulation, H/W prototype (H/W emulation), board design, verification with H/W, application S/W design (Virtual Chip), and idle time.]
- Design time is drastically reduced.
66
H/W of Virtual Chip
- Engine Processor (CPU): executes the behavioral model
- Memory (SDRAM): contains the behavioral model
- PSG (FPGA): generates the bus cycles of the target chip socket
- [Figure: the Virtual Chip board (Engine Processor, DRAM, chipset, PSG (FPGA)) takes the place of the target chip on the target board.]
67
System Model in C
- The user models both the chip and the board in C.
  - The two models are compiled together and run on a computer.
- H/W emulation
  - The board model is replaced with the real board.
  - The Virtual Chip provides the interface between the chip model and the board.
- [Figure: chip model and board model linked by variables and functions; in H/W emulation the Virtual Chip links the chip model to the real board functions.]
68
H/W Template - I/O Variable
- I/O variable: a variable that is related to bus activity.
  - Whenever an I/O variable is accessed, the corresponding bus cycle should occur.
- Two types of I/O variables:
  - I/O variables that represent components on the board
  - I/O variables that represent pins of the TC (target chip)
69
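API Function
- API (Application Programming Interface) function
- Generates a bus cycle of the EP (Engine Processor) to invoke the corresponding bus cycle of the target chip

    /* get_port(), get_buscycle(), outportb() and inportb() are assumed
       to be provided by the Virtual Chip board support code. */
    int VC_read(int *addr)
    {
        int port = get_port(addr);          /* I/O port for this address */
        int buscycle = get_buscycle(addr);  /* bus-cycle type to generate */

        outportb(port, addr);               /* pass the address */
        outportb(port, buscycle);           /* request the bus cycle */
        return inportb(port);               /* read back the data */
    }

- [Figure: Virtual Chip board with Engine Processor, PSG (FPGA), DRAM and chipset.]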
70
S/W of Virtual Chip
Automatic API insertion
- With the I/O variable list, replace each I/O variable with the corresponding API function.
- Execute code segments in parallel to realize hardware features.
Interface synthesis
- Virtual Chip (II)
C2V
- Completes the design path of the chip
- Generates RTL code from the behavioral model
Monitoring and debugging tool
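The effect of automatic API insertion can be pictured with a small host-side mock. This is only a sketch under stated assumptions: the real API drives the PSG through port I/O, whereas here `VC_read` takes a plain address and a hypothetical `VC_write` is added, both backed by an array and a bus-cycle log. The register addresses and every `vc_` name are invented for illustration.

```c
#include <stddef.h>

#define VC_LOG_MAX   16
#define VC_BOARD_MEM 256

/* One recorded bus cycle: 'R' = read, 'W' = write. */
typedef struct {
    char     kind;
    unsigned addr;
    int      data;
} VcBusCycle;

static VcBusCycle vc_log[VC_LOG_MAX];
static size_t     vc_log_len = 0;
static int        vc_board_mem[VC_BOARD_MEM]; /* stand-in for the target board */

/* Append one bus cycle to the log (silently dropped when the log is full). */
static void vc_record(char kind, unsigned addr, int data)
{
    if (vc_log_len < VC_LOG_MAX) {
        vc_log[vc_log_len].kind = kind;
        vc_log[vc_log_len].addr = addr;
        vc_log[vc_log_len].data = data;
        vc_log_len++;
    }
}

/* Mock read API: every call stands for one read bus cycle. */
int VC_read(unsigned addr)
{
    int data = vc_board_mem[addr % VC_BOARD_MEM];
    vc_record('R', addr, data);
    return data;
}

/* Mock write API (hypothetical counterpart of VC_read). */
void VC_write(unsigned addr, int data)
{
    vc_board_mem[addr % VC_BOARD_MEM] = data;
    vc_record('W', addr, data);
}

#define STATUS_REG 0x10u   /* illustrative addresses of two I/O variables */
#define CTRL_REG   0x14u

/* Before API insertion the behavioral model would read:
       int status = status_reg;
       ctrl_reg = status | 1;
   After insertion, every access to an I/O variable is an API call. */
void chip_model_step(void)
{
    int status = VC_read(STATUS_REG);
    VC_write(CTRL_REG, status | 1);
}
```

Each call to `chip_model_step` leaves one read and one write in `vc_log`, the same sequence of bus cycles the PSG would have to generate on the real board.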
71
Case study: How does HES™ work?
- [Figure: RTL simulator with the software test bench; HES™ Design Verification Manager; interconnect fabric; FPGA space (HES™ hardware); design elements: new design, design reuse, IP core.]
- Initial condition: simulation using the designer's software selection
- Test bench connected through the RTL simulator
Notes: Conceptually, the design, comprising different elements, is “assembled” on a fabric that interconnects the elements into a working system. The elements may consist of any or all of the following: purchased IP cores; new design code adding functionality or features; legacy or design re-use components; etc. The test bench is typically un-synthesizable code and therefore resides in the simulator memory. The HES™ Design Verification Manager “orchestrates” this set of components, interconnects, and test bench definition and “pushes” them incrementally into the FPGA space (hardware). Once the design is in hardware, the test bench (a vector set) can easily and quickly be applied to the component set and the results tracked and recorded.
72
DVM incrementally pushes design into the FPGA
- [Figure: RTL simulator with the software test bench; HES™ Design Verification Manager; interconnect fabric; FPGA space (HES™ hardware); design elements: glue logic, IP core, design reuse.]
- Step II: partial design segment “pushed” into the FPGA
- The Design Verification Manager maintains the interconnect fabric
- Test bench connected through the RTL simulator
73
Last step: DVM with design installed in FPGA
- [Figure: RTL simulator with the software test bench; HES™ Design Verification Manager; interconnect fabric; FPGA space (HES™ hardware) holding glue logic, IP core and design reuse.]
- Last step: all three elements of the design “pushed” into the FPGA
- The Design Verification Manager maintains the interconnect fabric
- Test bench connected through the RTL simulator
74
How does HES™ compare with conventional flow
Simplified process: generate the spec, then in parallel:
- Enter design
- Develop test bench
Then:
- Enter and simulate each step
  - Each simulation verifies the current increment plus previous entries
  - Simulation time grows exponentially as blocks are added
- Perform synthesis
- Perform place & route/implementation
[Figure: Design Spec → Test Bench and Design entry → RTL Simulator → Synthesis → Place & Route → ASIC]
Notes: Conventional design flows like the one shown are typical of FPGA and ASIC design methodologies. Additional features and variations serve to improve various aspects but do not invalidate the concept. The flow is useful for understanding the concept, but it does not convey any limitations. For example, the RTL simulator will perform simulations for design sizes from tens of gates to billions of gates. Unfortunately, somewhere near the billion-gate limit, simulation in software becomes untenable. In other words, it is just not practical above a certain gate count.
75
Where does HES™ fit into the design flow?
Typical design project time costs
- Design entry: 5%
- Behavioral simulation & debugging: 55%
- Synthesis: 15%
- Place & route: 15%
[Figure: Design Spec → Test Bench and Design entry → RTL Simulator (HES improvement area) → Synthesis → Place & Route → ASIC]
Notes: Aldec has carefully targeted HES™ to let designers significantly reduce behavioral simulation and debugging of RTL designs, because this portion of the design process typically consumes 50+% of the design team's time. It is far more complex than just asserting that simulations with HES™ typically take milliseconds rather than hours: the design and verification process is highly iterative and very complex, and as a result it is very dependent on methodology and process. However, at a HES™ beta site, a sophisticated user organization estimates that savings of 40% are being achieved with an unoptimized process.
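How much a faster simulation stage helps the whole project can be estimated with an Amdahl-style calculation. The sketch below assumes that only one stage is accelerated and that the quoted 40% saving refers to total project time, which the slide does not state explicitly; the function names are illustrative.

```c
/* Fraction of the original project time that remains when one stage,
   which accounts for `stage_share` of the schedule (0..1), is sped up
   by `stage_speedup` (>= 1). All other stages are assumed unchanged. */
double remaining_project_fraction(double stage_share, double stage_speedup)
{
    return (1.0 - stage_share) + stage_share / stage_speedup;
}

/* Speed-up of that stage needed to save `target_saving` of the whole
   project. Returns 0.0 when the target is unreachable, i.e. when it is
   not smaller than the share of the stage itself. */
double required_stage_speedup(double stage_share, double target_saving)
{
    if (target_saving >= stage_share)
        return 0.0;
    return stage_share / (stage_share - target_saving);
}
```

With the 55% share listed for behavioral simulation and debugging, saving 40% of the project would require that stage to run roughly 3.7 times faster, and no speed-up of that stage alone could save more than 55%.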
76
Eliminates Verification Bottlenecks
- Larger gate counts are accelerated more effectively.
- HES offers unsurpassed verification acceleration.
- [Figure: acceleration results for designs of 92K, 82K, 24K, 89K and 65K gates.]
Notes: These preliminary acceleration results, again, do not reflect any process optimization. They do, however, indicate that verification performance is very fast and is not significantly related to the number of gates in the design.
77
Conclusion
RTL acceleration of large designs using prototyping
- Reduced verification time
  - Verification times reduced by 40% or more
  - Verification is not gate-count driven
- Accelerated time to market
- HES™ introduces no risks due to tool changes in its implementation for verification acceleration.
- HES™ provides support for Windows/Unix/Linux.
- Verification times are driven by test bench size rather than gate count.
78
(Module 12) References
- P. Rashinkar et al., “System-on-a-chip Verification”, KAP, 2001.
- H. Chang et al., “Surviving the SOC Revolution”, KAP, 1999.
- M. Keating et al., “Reuse Methodology Manual”, Third edition, KAP, 2002.