Chungki Oh, Jianfeng Liu, Seokhoon Kim, Kyung-Tae Do,


1 Critical Signal Flow for Power Estimation: The Road to Billion Gate SoC Power Verification
Chungki Oh, Jianfeng Liu, Seokhoon Kim, Kyung-Tae Do, JungYun Choi, Hyo-Sig Won, Kee Sup Kim (Design Technology Team, System LSI Division, Samsung Electronics)
Jeongwon Kang, Kamlesh Madheshiya, Arti Dwivedi (Ansys Apache)

2 Table of Contents
Mobile SoC Design Trend
Challenges in SoC Power Analysis
Power Critical Signal Flow in RTL/Gate power analysis
Summary

3 Mobile SoC Design Trend
The design size of mobile SoCs has been increasing rapidly.
Fierce competition in the mobile market has driven SoC designs to provide high performance and numerous functions that were previously available only in PCs and laptops.
To meet the power wall of mobile design and leverage the additional capacity from silicon process scaling, multiple cores and parallelism are popular in current SoC designs.
[Figure: "Billion Gate SoC". SoC consumer portable design complexity trends (ITRS, 2011 edition).]

Let's take a look at mobile SoC design trends. There is fierce competition to provide, in smartphones, the high performance and numerous functions that were previously available only in PCs or laptops. These designs have a large number of operating modes, so both functional and power verification across the different modes have become critical. To meet the power budget and leverage the additional capacity from silicon process scaling, multi-core designs and parallel processing are popular in SoC designs.

4 Challenges in SoC Power Analysis
The era of billion-gate SoC design poses significant challenges for power analysis.
Simulation is needed to analyze dynamic power accurately. However, for a billion-gate SoC, the simulation runtime is becoming too long for a reasonable design cycle.
The waveform generated by simulation can occupy hundreds of gigabytes, which puts a significant burden on power analysis tools.
[Figure labels: 10's of modes; Millions of clocks; Video streaming; GPS + Voice Call; Web +]

For accurate power analysis, a simulation vector is an essential part of the power analysis flow. As design size increases, simulating billion-gate SoC designs is becoming more and more challenging. Simulation runtimes are too long for reasonable design cycles, and simulation dump sizes run to hundreds of gigabytes. Such large dumps significantly degrade the performance and memory usage of power analysis tools, and it is impractical to perform power analysis using such large FSDB files.

5 RTL Power Estimation Flow
Basic concept of RTL power estimation
Inputs: RTL-coded design, power library, capacitance model, activity file
Elaborate: the RTL design is compiled and elaborated into an interconnection of primitive gates
Calculate Power: the design is mapped to the target technology, and average or time-based power analysis is performed based on switching activity
To obtain reasonable accuracy, simulation is needed for vector-based power estimation
[Flow diagram: RTL (Verilog/VHDL), Power Library (.lib), Capacitance model, and an Activity File (.vcd/.fsdb/.saif) from Verilog simulation feed PowerArtist. PowerArtist runs Elaborate (producing a micro-architectural inferred netlist) and Calculate Power, and outputs the RTL power report.]

We use PowerArtist for power estimation at RTL. The key inputs required by PowerArtist are the RTL or gate-level design, power-characterized libraries (Liberty files), a capacitance model, and a simulation dump, which can be a VCD, FSDB, or SAIF file. PowerArtist first compiles and elaborates the RTL design into an inferred netlist, which is an interconnection of primitive gates. In the next step, the activity file is read and power estimation is done. During power estimation, PowerArtist performs cell selection, infers a clock tree, and performs average or time-based power calculation. A text report and an OADB-based database are generated.
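As a rough illustration of how switching activity turns into a power number, the sketch below sums the textbook switching-power term 0.5 * C * Vdd^2 * (toggle rate) over a few nets. It is not PowerArtist's calculation: a real flow also uses the internal and leakage power data in the .lib files. The `Net` class, the net names, the capacitance values, and the toggle counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Net:
    name: str
    cap_farads: float  # lumped load capacitance (hypothetical value)
    toggles: int       # transitions counted in the activity window


def average_switching_power(nets, vdd_volts, window_seconds):
    """Sum 0.5 * C * Vdd^2 * (toggles / window) over all nets, in watts."""
    total = 0.0
    for net in nets:
        toggle_rate = net.toggles / window_seconds  # transitions per second
        total += 0.5 * net.cap_farads * vdd_volts ** 2 * toggle_rate
    return total


# Two made-up nets observed over a 1 microsecond window at Vdd = 1.0 V.
nets = [Net("clk", 10e-15, 2000), Net("en", 5e-15, 40)]
power_w = average_switching_power(nets, vdd_volts=1.0, window_seconds=1e-6)
print(f"average switching power: {power_w:.3e} W")
```

The toggle counts are the only part that comes from the activity file, which is why the accuracy of the estimate depends on the simulation vector.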

6 Critical Signal Extraction with PowerArtist
Generate a significantly smaller, power-critical-signals-only FSDB from the emulator/simulator.
[Flow diagram, full dump: RTL + Test Bench -> Verilog Simulation -> Full FSDB]
[Flow diagram, critical-signal dump: RTL -> PowerArtist (Power-Critical Signal Extraction) -> Critical Signal List; Critical Signal List + Test Bench -> Verilog Simulation -> Partial FSDB]

Critical signal list:
    testbench.top_inst.temp_out
    testbench.top_inst.temp
    testbench.top_inst.en
    testbench.top_inst.out
    testbench.top_inst.clk
    testbench.top_inst.inC
    testbench.top_inst.inB
    testbench.top_inst.inA

Test bench:
    initial begin
      $fsdbDumpfile("pa_extracted.fsdb");
      $fsdbDumpvarsByFile("sig_file_name");
    end

PowerArtist enables us to reduce the simulation dump size through critical signal extraction. The goal of this flow is to generate a simulation dump for only the power-critical signals in the design, such as I/O ports and sequential elements. PowerArtist reads in the RTL design and generates a list of power-critical signals. We provide this list to our simulator or emulator so that an FSDB is generated only for the critical signals. This significantly reduces simulation runtime and FSDB file size, at the cost of an acceptable loss of accuracy. It also reduces the runtime of power analysis in PowerArtist.
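The selection step can be pictured with a small sketch. This is not how PowerArtist identifies critical signals; it only filters a toy table of signals by kind and writes the survivors to a list file. The kind assigned to each signal, the `sum_comb` signal, and the one-name-per-line file layout are assumptions made for illustration, so check the simulator documentation for the format that `$fsdbDumpvarsByFile` actually expects.

```python
import io

# Kinds treated as power-critical here (an assumption for this sketch).
CRITICAL_KINDS = {"port", "flop", "latch", "clock_gate"}

# Toy design: hierarchical signal name -> kind (kinds are hypothetical).
signals = {
    "testbench.top_inst.clk": "port",
    "testbench.top_inst.en": "port",
    "testbench.top_inst.inA": "port",
    "testbench.top_inst.out": "port",
    "testbench.top_inst.temp": "flop",
    "testbench.top_inst.sum_comb": "comb",  # made-up combinational net
}


def select_critical(sigs):
    """Return the names whose kind is in CRITICAL_KINDS, sorted for stable output."""
    return sorted(name for name, kind in sigs.items() if kind in CRITICAL_KINDS)


def write_signal_list(names, stream):
    """Write one hierarchical name per line (assumed list-file layout)."""
    for name in names:
        stream.write(name + "\n")


critical = select_critical(signals)
buffer = io.StringIO()  # stands in for the signal list file on disk
write_signal_list(critical, buffer)
print(buffer.getvalue(), end="")
```

The combinational net is left out of the list, so the simulator would not dump it and its activity would have to be derived during power analysis.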

7 Power-Critical ≠ Functional-Debug Signals
L1: Identify Power-Critical Signals -> Simulator/Emulator -> Reduced FSDB -> Apache PowerArtist -> Power Analysis + Debug
Optimized for power analysis over the entire simulation duration
L2: Identify Function-Critical Signals -> Simulator/Emulator -> Reduced FSDB -> Functional Debug Tools -> Functional Debug
Optimized for functional debug over limited clock cycles

Some functional debug tools also generate a critical signal list. This is different from the power-critical signal list generated by PowerArtist. The power-critical signal list consists of all signals important for power analysis accuracy, such as signals connected to sequential elements and control logic. Function-critical signals are for functional debug and primarily consist of primary I/Os and sequential elements.

8 The Principle of Power-Critical Signal Flow
Power-critical signals
Activity for only a subset of signals is necessary for accurate power estimation
Critical signals consist of signals such as sequential-element signals and module input/output ports
Non-critical signals
Activity propagation can be performed for the remaining signals, based on the activity propagation formulae of the various cell types
[Figure labels: IO cells, Flip-Flops, ICGCs, Latches, PI & PO, MUX]

Power-critical signals are a subset of the signals in the design that are essential for power analysis accuracy, such as signals connected to sequential elements and primary I/Os. PowerArtist annotates the activity of the critical signals from the FSDB and propagates activity through the remaining logic in the design.
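The slides do not give the propagation formulae, so the following is only a sketch of one common textbook approach: probabilistic propagation of signal probability and transition density through a gate, assuming the gate inputs are statistically independent. It should not be read as PowerArtist's method. The `Activity` class, the gate functions, and the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    prob: float     # probability that the signal is 1
    density: float  # average transitions per clock cycle


def and2(a, b):
    """y = a & b. A change on a reaches y only while b is 1, and vice versa."""
    return Activity(a.prob * b.prob,
                    b.prob * a.density + a.prob * b.density)


def or2(a, b):
    """y = a | b. A change on a reaches y only while b is 0, and vice versa."""
    return Activity(1 - (1 - a.prob) * (1 - b.prob),
                    (1 - b.prob) * a.density + (1 - a.prob) * b.density)


def mux2(sel, a, b):
    """y = b if sel else a. A change on sel reaches y only when a != b."""
    p_differ = a.prob * (1 - b.prob) + b.prob * (1 - a.prob)
    return Activity((1 - sel.prob) * a.prob + sel.prob * b.prob,
                    (1 - sel.prob) * a.density + sel.prob * b.density
                    + p_differ * sel.density)


# Activity annotated on two critical signals (made-up numbers) ...
en = Activity(prob=0.5, density=0.1)
d = Activity(prob=0.5, density=0.4)
# ... propagated to a non-critical internal net that was never dumped.
gated = and2(en, d)
print(gated)
```

With this kind of propagation, only the annotated inputs need to be present in the dump; the activity of `gated` is derived from them.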

9 Power-Critical Signal Flow with PowerArtist
Application
Power-critical signals can be extracted for both RTL and gate-level designs
Critical signals can be used in simulation as well as emulation flows
Impact
Dumping the activity file for only the power-critical signals saves simulator/emulator and power analysis runtime and memory, with a small error in power analysis
The power-critical signal flow enables power analysis of huge designs for which power estimation used to be impractical
[Flow diagram: RT/gate-level design -> PowerArtist critical signal extraction -> critical signal list; critical signal list + Test Bench -> Simulation/Emulation -> partially dumped Activity File; RT/gate-level design, Power Library, Wire Load Model, and the partially dumped Activity File -> PowerArtist (Elaborate, producing a micro-architecturally inferred netlist, then Calculate Power) -> RTL Power Report. Annotation: Time & Memory Saving.]

The power-critical signal flow is useful for power analysis at RTL as well as at gate level, and it can be used with both simulators and emulators. When simulation data is dumped using the critical signal list, the runtime and memory usage of the simulator or emulator are reduced, and the simulation dump becomes significantly smaller. The flow also helps to reduce runtime and memory usage in PowerArtist.

10 Critical Signal Flow for RTL Power Estimation
Experimental result with Design-A in RTL
The first experiment was done with a multimedia codec IP design
Design size is about 8 million gates, with a 32nm library
Impact on CPU time: 69% time reduction
Impact on memory resource & power result: 46% memory saving, 5% power mismatch, 58% disk saving

I would like to share some results based on this flow now. The first design is a multimedia codec IP design. This RTL design is about 8M gates in size at 32nm technology. Looking at the total runtime of simulation and power analysis, we achieved a runtime reduction of 69%. Power analysis runtime by itself was reduced by about 89%. Simulation dump size was 50% smaller, and PowerArtist memory usage improved by 46%. Our power analysis results were within 5% of the power numbers obtained with the full simulation dump.

11 Critical Signal Flow for RTL Power Estimation (2)
Experimental result with Design-B in RTL
The second experiment was done with a quad-core CPU block
Design size is tens of millions of gates, with a 32nm library
Impact on CPU time: 78% time reduction
[Chart: CPU time [hr]; values shown: 117, 24, 12, 14]
Impact on memory resource & power result: 42% memory saving, 2% power mismatch, 73% disk saving

The second design is a quad-core CPU block. This RTL design is 10M+ gates at 32nm technology. For this block, we achieved a 78% reduction in runtime. FSDB size was reduced by a significant 73%, and the power numbers are within 2% of those obtained with the full simulation dump. PowerArtist memory usage improved by 42%.

12 Critical Signal Flow for Gate-level Power Estimation
Experimental result with Design-A at gate level
The third experiment was done with the same design as the first one, but at gate level
Design size is about 8 million gates, with a 32nm library
Impact on CPU time: 69% time reduction
Impact on memory resource & power result: 87% memory saving, 9% power mismatch, 97% disk saving

The third design is a gate-level netlist of the first test case, the multimedia codec IP design. For this design, we achieved a total runtime improvement of 69%. FSDB size was reduced by a large 97%. The power numbers are within 9% of those obtained with the full simulation dump.

13 Summary
In the era of billion-gate SoC design, runtime and the size of the generated waveform database are challenging issues for accurate power estimation.
To address this challenge, we have proposed dumping the waveform for only a subset of the full signal list in the design.
We have introduced a methodology for choosing this signal subset so that power correlation stays good while the subset remains small.
The PowerArtist power-critical signal flow has been verified by extensive experiments covering both RTL and gate-level power estimation flows.
Our experimental results show that the critical signal flow cuts runtime by 70-80% and simulation waveform size by 60-97%, while keeping the power mismatch below 10%.

