Hierarchical Physical Design Methodology for Multi-Million Gate Chips Session 11 Wei-Jin Dai.


1 Hierarchical Physical Design Methodology for Multi-Million Gate Chips Session 11 Wei-Jin Dai

2 Overview
Introduction
Challenges of hierarchical design
Hierarchical methodology
– Full chip physical prototyping
Performance data
Summary

3 Introduction
As chip size and complexity grow, a hierarchical design approach becomes necessary
During the last 12 months, there has been a big increase in the number of chips designed with a hierarchical approach
The advantage of the hierarchical approach is divide-and-conquer

4 The Challenges
How to get a full-chip (10 million+ gates) physical view early on to identify potential problems?
How to have a convergent process that reaches design closure from beginning to end?
How to achieve die utilization similar to the "flat" approach?
How to achieve clock speed and skews similar to the "flat" approach?
How to automatically generate optimal pin assignments for each module?
How to automatically come up with realistic timing budgets for each module?
How to achieve top-level timing/signal integrity closure?

5 Creating the Physical Prototype
Full-chip flat prototype delivers the complete physical, timing, clock and power data
– Eliminates the guessing of the traditional block-based approaches
Drives the partitioning into manageable blocks
Flat Full-Chip Delivers an Accurate Physical Prototype

6 Prototyping Starts Early in the Flow
Most accurate view possible at all design stages
Physical timing budgeting drives synthesis
[Figure labels: Estimation, Refinement, Optimization, Design Completion; RTL/Black box, 75% netlist/Black box, Complete netlist; Prototyping; Initial timing budgets, Refined timing budgets]

7 Hierarchical Design Flow
[Flow diagram labels:
Inputs: LEF/GDSII, RTL/Black Box, Process Data, Chip Level Timing Constraints
Flat Full Chip Physical Prototype: Quick synthesis, Floor planning, Placement, CTS, Trial route
Physically Feasible? (NO branch): Die size, Timing, Clock skew, Power, SI
Physical Partitioning
Partition Data: Pin assignment, Timing budget, Clock spec, Power grid, DEF Placement
Block Implementation: Place, CTS, Optimize
Top Level Implementation: CTS, Optimization, Power
Optimized Top Level Netlist]

8 Hierarchical Partitioning
Pin assignment
Timing budgeting
Clock tree generation
Power grid planning
Partitioning
Independent block-level implementation
SoC assembly

9 Accurate Pin Assignment
Full-chip prototype results in optimal pin placement
– Results in narrower channels and reduced die size
– Reduces the routing congestion
– Improves the chip timing
[Figure labels: Accurate Physical Prototype (Flat Full-Chip); Top Level Partition View]

10 Timing Budgeting
Each block requires:
– Clock definition
– set_input_delay
– set_output_delay
– set_drive
– set_load
– Path exceptions (false, multicycle paths)
Accurate timing budgets result in predictable timing convergence
[Figure: Block 1, Block 2 and Block 3]
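
As a minimal sketch of how a budget might be turned into block-level constraints, the Python below splits one clock period across the segments of a top-level path in proportion to their estimated delays, then prints SDC-style `set_output_delay` and `set_input_delay` lines for the two blocks at either end. The proportional-split rule, the function name `split_budget`, the port and clock names, and all numbers are assumptions made for illustration; the slides do not describe the actual budgeting algorithm.

```python
def split_budget(clock_period_ns, est_delays_ns):
    """Scale each segment's estimated delay so the segments sum to one clock period."""
    total = sum(est_delays_ns.values())
    return {name: clock_period_ns * d / total for name, d in est_delays_ns.items()}


# Hypothetical path: launched inside block1, routed at the top level, captured in block2.
PERIOD_NS = 10.0  # hypothetical 100 MHz clock
estimates = {"block1": 3.0, "top_route": 1.0, "block2": 4.0}  # made-up prototype estimates
budget = split_budget(PERIOD_NS, estimates)

# block1 launches the path: everything downstream of its output pin is its output delay.
out_delay_block1 = budget["top_route"] + budget["block2"]
# block2 captures the path: everything upstream of its input pin is its input delay.
in_delay_block2 = budget["block1"] + budget["top_route"]

# Port and clock names below are placeholders, not names from the presentation.
print(f"set_output_delay {out_delay_block1:.2f} -clock clk [get_ports block1_out]")
print(f"set_input_delay {in_delay_block2:.2f} -clock clk [get_ports block2_in]")
```

With these made-up estimates, block1 is left 3.75 ns of the 10 ns period, the top-level route 1.25 ns and block2 5.00 ns, so each block can be implemented independently against its own share.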

11 Hierarchical Clock Tree Synthesis
Accurate physical timing data enables the creation of an optimal clock tree
– Block-level followed by top-level clock tree
Final clock tree routing generates near zero skew
– Balanced tree at the top level
Worst block skew + zero top-level skew = 150ps total clock skew
[Figure: balanced clock tree; block skews of 150ps, 120ps, 50ps, 50ps, 100ps and 130ps]
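
The skew arithmetic on this slide can be sketched in a few lines of Python. The additive model (worst block skew plus top-level skew) is the one stated on the slide; the function name `total_clock_skew_ps` is an assumption for illustration.

```python
def total_clock_skew_ps(block_skews_ps, top_level_skew_ps=0):
    """Total skew under the slide's model: worst block skew plus top-level skew."""
    return max(block_skews_ps) + top_level_skew_ps


# Per-block skews shown in the slide's figure, in picoseconds.
block_skews_ps = [150, 120, 50, 50, 100, 130]

# A balanced top-level tree contributes zero skew in this model.
print(total_clock_skew_ps(block_skews_ps))  # 150
```

Under this model the total is set entirely by the worst block, so reducing skew in any other block does not improve the chip-level figure.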

12 Full Chip Power Analysis

13 Hierarchical Power Grid Design
Power/ground (P/G) is planned at the full-chip level
The P/G network is automatically pushed down into the blocks during partitioning
[Figure labels: Full chip, Block]

14 Performance Data
Design Description        Netlist to SDF Time
1.8M cells; 200 macros    6 hours
900K cells                3 hours
2.3M cells; 700 macros    14 hours
2M cells; 100+ macros     5 hours
2.8M cells                10 hours
1.7M cells; 70 macros     5 hours

15 High Performance Environment
Step                 First Encounter    Traditional     Speedup
Design Import        4 min              4 hr            60x
Detail Place         3 hr 20 min        2 hr 50 min     1x
Detail Route*        8 min              7 hr 30 min     56x
RC Extract           6 min              5 hr 45 min     57x
Delay Calculation    7 min              3 hr 50 min     33x
Timing Analysis      20 min             2 hr 15 min     7x
IPO                  1 hr 50 min        9 hr            5x
Design Iteration     5 hr 25 min        35 hr 40 min    6x
Design: 580K cells, 0.25um process, 5LM, 100MHz
Data collected on a 500MHz processor workstation
(*) SPC Trial Route
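
The "Nx" figures on this slide appear to be the ratio of the Traditional runtime to the First Encounter runtime. The Python sketch below recomputes a few of them, assuming that within each group of values the shorter-labelled first duration is the First Encounter time and the second is the Traditional time; the helper names `to_minutes` and `speedup` are hypothetical.

```python
import re


def to_minutes(text):
    """Convert a duration such as '3 hr 20 min' or '4 hr' to minutes."""
    hours = re.search(r"(\d+)\s*hr", text)
    minutes = re.search(r"(\d+)\s*min", text)
    total = int(hours.group(1)) * 60 if hours else 0
    total += int(minutes.group(1)) if minutes else 0
    return total


def speedup(first_encounter, traditional):
    """Ratio of Traditional runtime to First Encounter runtime."""
    return to_minutes(traditional) / to_minutes(first_encounter)


# Duration pairs taken from the slide, in the assumed (First Encounter, Traditional) order.
for fe, trad in [("4 min", "4 hr"), ("8 min", "7 hr 30 min"), ("5 hr 25 min", "35 hr 40 min")]:
    print(f"{fe} vs {trad}: {speedup(fe, trad):.1f}x")
```

The recomputed ratios agree with the slide's 60x and 56x, and give roughly 6.6 for the last pair, which the slide reports as 6x.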

16 High Accuracy of the Prototype
The prototype closely correlates with post-route layout
– Comparison to 'tape-out' back-end flow
– More than 90% of the interconnect and IO path delays within 2%
Design: 5LM, 0.25um, 580K cells, 620K nets, 572 I/Os, 4 blocks
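
One way to read the "within 2%" claim is as the fraction of paths whose prototype delay differs from the post-route delay by at most 2% of the post-route value. The Python sketch below computes that fraction under this assumed definition; the function name `fraction_within` and the delay values are hypothetical and are not data from the presentation.

```python
def fraction_within(prototype_ns, post_route_ns, tolerance=0.02):
    """Fraction of paths whose prototype delay is within `tolerance` of post-route delay."""
    within = sum(
        abs(p - r) <= tolerance * r
        for p, r in zip(prototype_ns, post_route_ns)
    )
    return within / len(post_route_ns)


# Hypothetical path delays in ns, for illustration only.
prototype = [1.00, 2.02, 3.10, 0.50, 4.00]
post_route = [1.01, 2.00, 3.00, 0.50, 4.05]

print(fraction_within(prototype, post_route))  # 0.8
```

In this made-up sample only the third path (3.10 ns against 3.00 ns) falls outside the 2% band, so four of five paths correlate.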

17 Summary
SoC Hierarchical Methodology
Build a full-chip physical prototype early on
– Start at RTL
– Identify problems early
Achieve design closure before partitioning
– Close full-chip timing
– Optimize die size
– Meet power requirements
– Resolve signal integrity issues
Maintain the design closure throughout the design process

