Temporal Parallel Simulation: A Fast Gate-level HDL Simulation Using Higher Level Models
Dusung Kim; Ciesielski, M. (Dept. of Electr. & Comput. Eng., Univ. of Massachusetts, Amherst, MA, USA)
Kyuho Shim; Seiyang Yang (Dept. of Comput. Eng., Pusan National Univ., Busan, Korea)
Design, Automation & Test in Europe Conference & Exhibition (DATE), 2011
Cited count: 3
Presenter: Ching-Hua Huang, 2013/11/4
National Sun Yat-sen University, Embedded System Laboratory
Simulation speedup offered by distributed parallel event-driven simulation is known to be seriously limited by the synchronization and communication overhead. These limiting factors are particularly severe in gate-level timing simulation. This paper describes a radically different approach to gate-level simulation based on a concept of temporal rather than conventional spatial parallelism. The proposed method partitions the entire simulation run into simulation slices in the temporal domain, and each slice is simulated separately. With each slice being independent of the others, an almost linear speedup is achievable with a large number of simulation nodes.
This concept naturally enables a correct-by-simulation methodology that explicitly maintains the consistency between the reference and the target specifications. Experimental results clearly show a significant simulation speed-up.
What's the problem?
- The performance of hardware simulation for complex designs becomes prohibitively low.
- The speedup of parallel simulation is limited by the synchronization and communication overhead.
Proposed method to solve the above problem
- A radically different approach to gate-level simulation based on a concept of temporal parallelism.
[Diagram of prior work; recoverable labels:]
- SimCluster
- Parallel Discrete Event Simulation (PDES)
- Principles of conservative parallel simulation
- TPSim – GL timing simulation: the basic idea of this approach and preliminary results for special cases were introduced.
- [This paper] Temporal Parallel Simulation: A Fast Gate-level HDL Simulation Using Higher Level Models
Other annotations in the diagram: lock-step based synchronization; partitions the design into separate modules and performs concurrent simulation; rollback-based synchronization; performance improvement; speed up; developed the first Verilog distributed simulator; a large gate-level decoder design improvement.
Proposed method – TPSim
TPSim (Temporal Parallel Simulation)
(1) Partitions the entire simulation into slices in the temporal domain.
(2) Simulates each slice separately.
It consists of two major steps:
- Fast reference simulation
  - Performed on a high-level abstraction of the design.
  - Stores essential state information.
- Detailed, fine-grained target simulation
  - Performed on a lower-level (gate-level) model.
  - Applied in parallel to each simulation slice.
(1) State checking (2) State matching
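The two steps can be illustrated with a small Python sketch. It is only a toy under stated assumptions, not the paper's implementation: the "design" is a hypothetical 16-bit accumulator, `reference_step` stands in for the fast high-level model, `target_step` stands in for the detailed gate-level model (here a bit-by-bit ripple-carry version of the same function), and threads stand in for simulation nodes. All function names are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH = 16
MASK = (1 << WIDTH) - 1

def reference_step(state, stimulus):
    """Fast high-level model (assumed toy design): one clock cycle."""
    return (state * 3 + stimulus) & MASK

def ripple_add(a, b):
    """Bit-by-bit ripple-carry adder, truncated to WIDTH bits."""
    result, carry = 0, 0
    for i in range(WIDTH):
        x, y = (a >> i) & 1, (b >> i) & 1
        result |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return result

def target_step(state, stimulus):
    """Stand-in for the detailed model: same function, computed per bit."""
    tripled = ripple_add(state, (state << 1) & MASK)
    return ripple_add(tripled, stimulus & MASK)

def reference_run(stimuli, slice_len, init=0):
    """Step 1: run the fast model once, saving the state at each slice start."""
    checkpoints, state = [], init
    for cycle, stim in enumerate(stimuli):
        if cycle % slice_len == 0:
            checkpoints.append(state)
        state = reference_step(state, stim)
    return checkpoints

def simulate_slice(start_state, slice_stimuli):
    """Step 2: detailed simulation of one slice from a restored state."""
    trace, state = [], start_state
    for stim in slice_stimuli:
        state = target_step(state, stim)
        trace.append(state)
    return trace

def tpsim(stimuli, slice_len, workers=4):
    """Run every slice independently and stitch the traces together."""
    checkpoints = reference_run(stimuli, slice_len)
    slices = [stimuli[i:i + slice_len]
              for i in range(0, len(stimuli), slice_len)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        traces = list(pool.map(simulate_slice, checkpoints, slices))
    return [state for trace in traces for state in trace]
```

Because each slice needs only its own checkpoint and stimuli, no communication between slices is required while they run, which is the property the slide attributes to temporal partitioning. In this toy the reference state can be loaded into the target model directly; a real gate-level target would need the state matching step first.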
Difficulties in Generalization of Temporal Parallelism (1)
Multiple asynchronous clocks
- A multiple-clock design may not be 100% cycle-by-cycle consistent with the RTL simulation.
- Proposed solution: abstract delay annotation method.
- Slices are allowed to overlap by a value equal to the longest delay in the design.
[Figure: signals DataA[N-1:0], ReqB, ClkB]
Difficulties in Generalization of Temporal Parallelism (2)
State checkpointing in event-driven simulation
- Finding the correct placement for checkpoints is more difficult because of the arbitrary delay between event edges.
- Proposed solution: checkpoint window.
  - The size of the checkpoint window is one clock-cycle equivalent.
  - The correct value for Q can be reliably obtained at the end of each window.
  - The overlap period must be increased accordingly so that it contains the entire target checkpoint window.
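The overlap arithmetic can be sketched as follows. This is an interpretation, not something the slides spell out: it assumes the overlap must cover both the longest delay in the design and one clock-cycle checkpoint window, so it takes the larger of the two, and it assumes each slice is extended at its end. The function name and parameters are hypothetical, and times are in arbitrary units.

```python
def slice_windows(total_time, slice_len, longest_delay, clock_period):
    """Return (start, end) simulation windows for overlapping slices.

    Assumption: the overlap is the larger of the longest design delay
    and the one-clock-cycle checkpoint window, added to each slice's end.
    """
    overlap = max(longest_delay, clock_period)
    windows = []
    start = 0
    while start < total_time:
        # Extend past the nominal end, but never past the end of the run.
        end = min(start + slice_len + overlap, total_time)
        windows.append((start, end))
        start += slice_len
    return windows
```

For example, a run of 100 time units with slices of 40, a longest delay of 3 and a clock period of 10 gives three windows whose neighbours share 10 time units.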
Difficulties in Generalization of Temporal Parallelism (3)
State matching
- The functional correctness of the restored target state must be maintained.
- During synthesis the design undergoes a number of logic transformations: combinational and sequential logic optimization, retiming, and algebraic transformations.
- Proposed solution: a promising preliminary work in state matching has recently been published.
Handling the testbench
- The testbench is a sequential process.
- It has no hardware states, so it cannot be restarted at an arbitrary point in time.
- Proposed solution: testbench forwarding.
  - Saved continuously during the reference simulation.
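The testbench problem can be shown with a small Python sketch. It assumes that what gets saved during the reference simulation is the testbench's per-cycle stimulus, which the slide does not state explicitly; `testbench`, `record_stimuli`, and `replay` are hypothetical names.

```python
def testbench():
    """Sequential stimulus process: each value depends on the previous one,
    so it cannot be started in the middle without replaying from cycle 0."""
    value = 1
    while True:
        yield value
        value = (value * 5 + 3) % 64

def record_stimuli(tb, cycles):
    """During the reference simulation, save every stimulus the testbench emits."""
    return [next(tb) for _ in range(cycles)]

def replay(recording, start_cycle, length):
    """Feed a target slice from the recording instead of the live testbench."""
    return recording[start_cycle:start_cycle + length]
```

With the recording in hand, a slice that begins at cycle 8 can obtain its stimuli directly, without running the testbench through cycles 0 to 7 first.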
How much can TPSim improve performance?
- Slices
- Multiple clock issue?
Tool selection
- Synthesis: Design Compiler
- Cell library: 65 nm technology library
- Simulator: NC-Sim 8.2
This design is from OpenCores. The total gate count of the GL design is 0.9M.
This design is from OpenCores. The total gate count of the GL design is 25K.
Conclusions
- The simulation speed-up is accomplished by performing temporal partitioning of the simulation period.
- This paper provides not only a significant performance improvement but also a smarter method for simulation-based verification.
My comments
- I am facing a problem concerning the performance gap between RTL and GL timing simulation, and this paper gives me another reference in this area.