TBS: Fast Analysis of Structured Power Grid by Triangularization Based Structure Preserving Model Order Reduction Hao Yu, Yiyu Shi and Lei He Electrical.


1 TBS: Fast Analysis of Structured Power Grid by Triangularization Based Structure Preserving Model Order Reduction
Hao Yu, Yiyu Shi and Lei He
Electrical Engineering Dept., UCLA
Partially supported by NSF and UC-MICRO fund sponsored by Analog Devices, Intel and Mindspeed
http://eda.ee.ucla.edu

2 New Challenges in Integrity Verification
- Integrity verification checks transient V/T violations in linear power/signal/thermal networks
  - Large-scale: millions of nodes and ports
  - Often structured: e.g., locally regular and globally irregular P/G network [Singh-Sapatnekar:TCAD’05]
- A fast yet accurate linear simulator is necessary to perform large-scale transient verification
  - Linear-network macromodeling is one effective approach
How to use structure information to build accurate and efficient macromodels?

3 Existing Structured Macromodeling
- Hierarchical node-elimination (HNE) by [Zhao-Panda-Sapatnekar-Blaauw:DAC’00]
  - Builds the macromodel by internal node elimination with source mapping
  - Analyzes the macromodel in a hierarchical (two-level) fashion
  - Requires sparsification by linear programming (LP) due to the dense fill-in
- SPRIM [Freund:ICCAD’04] and BSMOR [Yu-He-Tan:BMAS’05]
  - Leverage the block structure in the state matrix
  - Build the macromodel by structure-preserved moment matching
- HiPRIME [Cao-Lee-Chen:DAC’02], a hierarchical extension of PRIMA [Odabasioglu-Celik-Pileggi:TCAD’98]
  - Builds the macromodel by hierarchical orthonormalization
  - Loses the hierarchy due to the final flat projection
We propose a new structure-preserved moment matching, with 20x less waveform error and 50x speedup

4 Outline
- Review macromodeling by moment matching
- Our Approach: TBS method
- Experimental Results
- Conclusions

5 Macromodeling by Moment Matching (I)
- Electric systems can be described in MNA (modified nodal analysis)
- The solution (x) of MNA is contained in a block Krylov subspace
- Grimme’s Projection Theorem
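The equations on this slide did not survive in the transcript. As a hedged reminder, the block below writes out the standard MNA form and Krylov subspace used in the PRIMA literature. The symbols u, y, s_0, A and R are assumptions introduced here for illustration; G, C, B and L follow the naming used on the next slide.

```latex
% MNA description of a linear network (standard form, assumed here)
\[
  (G + sC)\,x(s) = B\,u(s), \qquad y(s) = L^{T} x(s)
\]

% Block Krylov subspace about an expansion point s_0
\[
  A = (G + s_0 C)^{-1} C, \qquad R = (G + s_0 C)^{-1} B, \qquad
  \mathcal{K}_q(A, R) = \operatorname{span}\{R,\ AR,\ \dots,\ A^{q-1}R\}
\]

% Projection theorem, one-sided form: if V spans the Krylov subspace,
% the projected model (V^T G V, V^T C V, V^T B, V^T L) matches the
% first q block moments of the original model about s_0
\[
  \mathcal{K}_q(A, R) \subseteq \operatorname{colspan}(V)
\]
```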

6 Macromodeling by Moment Matching (II)
a) To remove linear dependency in the low-dimensional projection matrix V, block-Arnoldi orthonormalization is applied
b) To preserve passivity, a congruence transformation is used to project the state matrices (G, C, B, L) respectively
c) To handle a large number of inputs, as in a P/G network, SIMO (single-input-multi-output) reduction can be assumed [Feldmann-Liu: ICCAD’04]
   - Replace the input port matrix B by a common input vector J
   - All poles are matched w.r.t. one superposed input
   - Matched moments/poles (q) are independent of the input number (p)
V is flat and destroys the structure of state matrices
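The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the RC ladder, the port placement, the expansion point s0 = 0 and all function names are assumptions. Because the SIMO input is a single vector J, the block-Arnoldi step degenerates to ordinary Arnoldi-style Gram-Schmidt here.

```python
import numpy as np

def rc_ladder(n, g=1.0, c=1.0):
    """Conductance and capacitance matrices of an n-node RC ladder (toy example)."""
    G = 2.0 * g * np.eye(n) - g * np.eye(n, k=1) - g * np.eye(n, k=-1)
    return G, c * np.eye(n)

def krylov_basis(G, C, J, q):
    """Orthonormal basis of span{R, AR, ..., A^(q-1)R}, A = G^-1 C, R = G^-1 J (s0 = 0)."""
    v = np.linalg.solve(G, J)
    basis = [v / np.linalg.norm(v)]
    for _ in range(q - 1):
        w = np.linalg.solve(G, C @ basis[-1])
        for _pass in range(2):              # Gram-Schmidt, repeated once for stability
            for u in basis:
                w = w - (u @ w) * u
        basis.append(w / np.linalg.norm(w))
    return np.column_stack(basis)

def moments(G, C, J, L, k):
    """First k moments of L^T (G + sC)^-1 J about s = 0."""
    x = np.linalg.solve(G, J)
    out = []
    for _ in range(k):
        out.append(L.T @ x)
        x = -np.linalg.solve(G, C @ x)
    return np.array(out)

n, p, q = 20, 5, 6
G, C = rc_ladder(n)
B = np.zeros((n, p))
B[np.arange(p) * 4, np.arange(p)] = 1.0     # p input ports (assumed placement)
J = B @ np.ones(p)                          # c) one superposed input vector replaces B
L = np.eye(n)[:, [n - 1]]                   # observe the last node

V = krylov_basis(G, C, J, q)                # a) orthonormalized, flat projection matrix
Gr, Cr = V.T @ G @ V, V.T @ C @ V           # b) congruence transformation
Jr, Lr = V.T @ J, V.T @ L

full = moments(G, C, J, L, q)
reduced = moments(Gr, Cr, Jr, Lr, q)
print(np.allclose(full, reduced, rtol=1e-6))
```

The reduced order q does not depend on the port count p, which is the point of the SIMO assumption. The projection matrix V is dense, so the reduced matrices carry no block structure.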

7 Structure-preserved Moment Matching
- SPRIM [Freund:ICCAD’04] leverages the 2 x 2 block structure in MNA
  - Splits V into a 2 x 2 block diagonal form
  - Preserves the structure of reciprocity (symmetry between input and output), and hence achieves a higher accuracy than PRIMA
- BSMOR [Yu-He-Tan:BMAS’05] partitions state matrices into more blocks
  - Splits V into an m x m block diagonal form
  - Preserves the block structure and sparsity, and hence achieves better efficiency than SPRIM
- Limitations of SPRIM and BSMOR
  - Moment/pole matching is not localized
  - Reduction does not preserve the structure of latency
  - Model does not leverage redundancy
  - Inefficient and inaccurate for P/G grid macromodeling

8 Outline
- Review macromodeling by moment matching
- Our Approach: TBS method
  - Triangular Block Structured moment matching
- Experimental Results
- Conclusions

9 From Layout to Structured Model
- Build a structured state matrix by partitioning the layout
  - Stamp basic blocks diagonally
  - Stamp interconnection blocks off-diagonally
- A number of interconnected basic blocks can be used to represent both homogeneous and heterogeneous circuits
[Figure: an 8-node layout with wire widths w1 and w2 and conductances g1, g2, gx, shown with the corresponding 8 x 8 conductance matrices; g3 = 2g1 + gx, g4 = 2g2 + gx]

10 Properties of Interconnected Basic Blocks
- Structure of latency: the spatial distribution of time constants
  - Each basic block has a time constant
- Redundancy: different basic blocks can share the same or a similar time constant
Due to redundancy, the basic block representation is not compact

11 TBS Flow
(Basic Blocks) -> Dominant-pole Clustering -> (Compact Blocks) -> Triangularization -> (Triangular Blocks) -> Block Diagonal Projection -> (Reduced Blocks) -> Two-level Relaxation Analysis -> (Block Integrity)
Dominant-pole based clustering removes redundancy

12 Clustering Procedure
- Compress basic blocks into compact blocks
  1. Calculate the q-dominant pole-set (mode) for each basic block, and the mode-distance between blocks
  2. Cluster basic blocks if the mode-distance is small enough
- Cluster number is determined by the nature of the network structure
  - There is no need to cluster a homogeneous circuit, but TBS still applies
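The two clustering steps could look roughly like the sketch below. The slide's formulas for the mode and the mode-distance are not in the transcript, so both definitions here are assumptions: the mode is taken as the q poles closest to the origin, and the distance as the largest relative gap between corresponding poles. The greedy grouping and the toy RC blocks are likewise illustrative.

```python
import numpy as np

def rc_block(n, g, c):
    """Toy basic block: n-node RC ladder with segment conductance g and node capacitance c."""
    G = 2.0 * g * np.eye(n) - g * np.eye(n, k=1) - g * np.eye(n, k=-1)
    return G, c * np.eye(n)

def dominant_poles(G, C, q):
    """The q poles of (G + sC) closest to the origin, i.e. the slowest time constants."""
    poles = -np.linalg.eigvals(np.linalg.solve(C, G)).real
    return np.sort(poles)[::-1][:q]      # poles are negative, so largest = closest to zero

def mode_distance(pa, pb):
    """Assumed metric: largest relative gap between corresponding dominant poles."""
    return float(np.max(np.abs(pa - pb) / np.maximum(np.abs(pa), np.abs(pb))))

def cluster_blocks(modes, tol):
    """Greedy grouping: a block joins the first cluster whose representative mode is within tol."""
    clusters = []                        # list of (representative mode, member indices)
    for i, mode in enumerate(modes):
        for rep, members in clusters:
            if mode_distance(rep, mode) < tol:
                members.append(i)
                break
        else:
            clusters.append((mode, [i]))
    return [members for _, members in clusters]

# Four basic blocks: two with g close to 1 and two with g close to 5
blocks = [rc_block(10, 1.0, 1.0), rc_block(10, 1.02, 1.0),
          rc_block(10, 5.0, 1.0), rc_block(10, 5.1, 1.0)]
modes = [dominant_poles(G, C, q=3) for G, C in blocks]
groups = cluster_blocks(modes, tol=0.1)
print(groups)
```

With these toy values the four basic blocks collapse into two compact blocks, one per time-constant family.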

13 Advantages of Clustering
- Redundant poles are removed
  - Hence redundant columns in the projection matrix are also removed, i.e., the effective rank of the projection matrix is improved
- Structure of latency is leveraged
  - Each compact block can be solved with a different time-step
- A complete modal decomposition is achieved
  - Each compact block has a unique pole-set or mode, and the resulting system is block-wise stiff
System poles are determined by both diagonal and off-diagonal blocks, which is not efficient

14 TBS Flow
(Basic Blocks) -> Dominant-pole Clustering -> (Compact Blocks) -> Triangularization -> (Triangular Blocks) -> Block Diagonal Projection -> (Reduced Blocks) -> Two-level Relaxation Analysis -> (Block Integrity)
Triangularization can localize system poles to diagonal blocks, which is the key contribution of this work

15 Triangularization Procedure
1. Stack a replica-block diagonally
2. Move the original lower-triangular parts to the new upper-triangular parts
- This procedure is implemented by a block matrix data structure without increasing memory usage
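The slide's matrices are missing from the transcript, so the exact layout used by the authors is not known here. The sketch below shows one replica-based embedding that is consistent with the two steps and whose solution provably equals the original one. With G = D + U + Lw (block diagonal, strictly block-upper and strictly block-lower parts), it stacks a replica of G diagonally and moves Lw from the first copy into the upper-right block. The function names and the dense storage are assumptions; a real implementation would share blocks rather than copy them.

```python
import numpy as np

def block_parts(G, sizes):
    """Split G into its block-diagonal-plus-upper part and its strictly block-lower part."""
    offsets = np.concatenate(([0], np.cumsum(sizes)))
    lower = np.zeros_like(G)
    for i in range(len(sizes)):
        rows = slice(offsets[i], offsets[i + 1])
        lower[rows, :offsets[i]] = G[rows, :offsets[i]]
    return G - lower, lower

def triangularize(G, sizes):
    """Embed G in the 2n x 2n block upper-triangular matrix [[D + U, Lw], [0, G]]."""
    upper, lower = block_parts(G, sizes)
    n = G.shape[0]
    return np.block([[upper, lower], [np.zeros((n, n)), G]])

rng = np.random.default_rng(1)
sizes = [2, 2, 2]
n = sum(sizes)
M = rng.random((n, n))
G = M + M.T + 2.0 * n * np.eye(n)        # symmetric, diagonally dominant toy matrix
b = rng.random(n)

T = triangularize(G, sizes)
x = np.linalg.solve(G, b)                            # original solution
xt = np.linalg.solve(T, np.concatenate([b, b]))      # embedded system, duplicated source
print(np.allclose(xt[:n], x), np.allclose(xt[n:], x))
```

Both halves of the embedded solution reproduce x: the second block row is the original system, and substituting its solution into the first row gives (D + U) x1 = b - Lw x, which only x satisfies.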

16 Advantages of Triangularization
- System poles are determined only by the compact blocks on the diagonal
  - Compact blocks are almost decoupled from each other
- A triangular system has a factorization cost coming only from the diagonal blocks
  - There is no need to factorize the entire matrix
- Block duplication results in an equivalent solution
  - Simpler than the existing permutation-based triangularization procedure [Kim Davis: KLU]
Due to the replica block, the overall cost of factorization is the same as the original

17 TBS Flow
(Basic Blocks) -> Dominant-pole Clustering -> (Compact Blocks) -> Triangularization -> (Triangular Blocks) -> Block Diagonal Projection -> (Reduced Blocks) -> Two-level Relaxation Analysis -> (Block Integrity)
Block diagonal projection can reduce the system size and the cost of the factorization

18 Block Diagonal Projection Procedure
1. Split the flat projection matrix V into a block-diagonal structured one, increasing its rank by a factor of the cluster number
2. Reduce the state matrices block by block respectively
- The reduced system preserves the upper-triangular structure
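A minimal NumPy sketch of the two steps follows, under stated assumptions: the flat V is a random stand-in for a Krylov basis, the block sizes are equal, each block is re-orthonormalized with a QR factorization, and `split_projection` is a hypothetical helper name. It illustrates that the split multiplies the rank by the number of blocks m and that a block upper-triangular matrix stays block upper-triangular after projection.

```python
import numpy as np

def split_projection(V, sizes):
    """Split a flat n x q matrix V into blockdiag(V_1, ..., V_m), one orthonormal V_i per block."""
    offsets = np.concatenate(([0], np.cumsum(sizes)))
    q = V.shape[1]
    Vs = np.zeros((V.shape[0], q * len(sizes)))
    for i in range(len(sizes)):
        rows = slice(offsets[i], offsets[i + 1])
        Qi, _ = np.linalg.qr(V[rows, :])      # orthonormalize the rows owned by block i
        Vs[rows, i * q:(i + 1) * q] = Qi
    return Vs

rng = np.random.default_rng(0)
sizes, q = [6, 6, 6], 2
n, m = sum(sizes), len(sizes)

# Block upper-triangular toy matrix: zero blocks strictly below the block diagonal
pattern = np.kron(np.triu(np.ones((m, m))), np.ones((6, 6)))
G = pattern * rng.random((n, n)) + 6.0 * np.eye(n)

V = rng.standard_normal((n, q))               # stand-in for a flat Krylov basis
Vs = split_projection(V, sizes)               # step 1: n x (m*q), block diagonal
Gr = Vs.T @ G @ Vs                            # step 2: block (i, j) equals V_i^T G_ij V_j

print(np.linalg.matrix_rank(V), np.linalg.matrix_rank(Vs))
```

Because the column span of V is contained in that of the split matrix, the structured projection keeps every moment the flat one would match.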

19 Advantages of Block Diagonal Projection
- System moments and poles are matched locally
  - Each compact block is reduced locally to match q poles
  - In total, mq poles are matched for m unique compact blocks (poles from the replica are duplicate poles)
- Reduced model preserves the block triangular structure and the structure of latency
  - Each reduced block can be factorized independently
  - Each reduced block could have a different time-constant
- More matched poles improve accuracy
  - Using a low-order reduction for each compact block locally can achieve a high-order accuracy for the overall system
It can be efficiently solved by a block backward-substitution or a two-level analysis with relaxation

20 TBS Flow
(Basic Blocks) -> Dominant-pole Clustering -> (Compact Blocks) -> Triangularization -> (Triangular Blocks) -> Block Diagonal Projection -> (Reduced Blocks) -> Two-level Relaxation Analysis -> (Block Integrity)
Two-level relaxation can further reduce simulation cost

21 Two-level Relaxation Solver
- The time-domain iteration of a triangular system always converges [White: Book’87]
- Two-level representation and analysis
- Each reduced diagonal block can be factorized independently, and solved with a different time step during backward-Euler (BE) integration
  - In contrast, the previous pole-residue solution
    - eigen-decomposes the entire reduced matrix (dense, with no structure)
    - cannot exploit the structure of latency
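As a simplified illustration of why a block-triangular model is cheap to integrate, the sketch below runs backward Euler with a block backward-substitution that only ever solves with the diagonal blocks. It is an assumption-laden toy: a single global time step is used (the per-block time steps mentioned on the slide are not modeled), the input is a unit step, and the function names and test matrices are invented for the example.

```python
import numpy as np

def block_back_substitution(A, b, sizes):
    """Solve A x = b for a block upper-triangular A using only its diagonal blocks."""
    offsets = np.concatenate(([0], np.cumsum(sizes)))
    x = np.zeros_like(b)
    for i in reversed(range(len(sizes))):
        rows = slice(offsets[i], offsets[i + 1])
        # Subtract the contribution of the already-solved blocks to the right
        rhs = b[rows] - A[rows, offsets[i + 1]:] @ x[offsets[i + 1]:]
        # A production solver would factorize each diagonal block once and reuse it
        x[rows] = np.linalg.solve(A[rows, rows], rhs)
    return x

def backward_euler(G, C, J, sizes, h, steps):
    """Integrate C dx/dt + G x = J (unit-step input) from x = 0 with a fixed step h."""
    A = G + C / h
    x = np.zeros(G.shape[0])
    for _ in range(steps):
        x = block_back_substitution(A, C @ x / h + J, sizes)
    return x

sizes = [3, 3, 3]
n = sum(sizes)
ladder = 2.0 * np.eye(3) - np.eye(3, k=1) - np.eye(3, k=-1)
# Diagonal blocks are small RC ladders; weak coupling sits only above the block diagonal
G = np.kron(np.eye(3), ladder) + np.kron(np.eye(3, k=1), -0.1 * np.eye(3))
C = np.eye(n)
J = np.ones(n)

x_end = backward_euler(G, C, J, sizes, h=0.1, steps=2000)
x_dc = np.linalg.solve(G, J)                  # steady state the transient should settle to
print(np.allclose(x_end, x_dc))
```

Each time step costs m small solves instead of one large one, and the transient settles to the DC solution of the full system.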

22 Outline
- Review macromodeling by moment matching
- Our Approach: TBS method
- Experimental Results
- Conclusions

23 Experiment Settings
- Large-scale homogeneous and heterogeneous P/G grids (RC-mesh) with millions of nodes
- For the heterogeneous case, each block has a different wire-pitch/width and block-size, and hence a different time-constant
- The reduction algorithm assumes SIMO reduction for a large number of inputs, but also supports the general MIMO reduction
- Compare TBS to BSMOR [Yu-He-Tan:BMAS’05], HiPRIME [Cao-Lee-Chen:DAC’02], and HNE [Zhao-Panda-Sapatnekar-Blaauw:DAC’00]

24 Triangular Block Structure Preservation
- Nonzero (nz) pattern of conductance matrices
  - (a) original system
  - (b) triangular system
  - (c) reduced system by TBS

25 m x q Pole Matching
- Pole matching (m0=32, m=4, q=8): TBS has exactly 32 poles matched, BSMOR has exactly 8 poles matched and 24 poles approximately matched, and HiPRIME (a partitioned PRIMA) has only 8 poles matched
- Waveforms in time domain: improved accuracy with more matched poles

26 Study Waveform-error Scalability

ckt   Node (N)  Port (p)  Order (q)  HNE      HiPRIME  BSMOR    TBS
ckt1  1K        48        8          5.54e-6  9.09e-6  4.87e-6  5.03e-7
ckt2  10K       320       40         1.21e-5  2.31e-5  7.93e-6  1.84e-6
ckt3  100K      480       60         1.31e-2  6.82e-4  1.91e-4  3.02e-5
ckt4  1M        800       100        6.01e-2  9.67e-3  4.23e-3  1.27e-4
ckt5  7.68M     4800      200        0.11     9.93e-2  5.10e-2  3.01e-3
ckt6  7.68M     6.14M     300        NA       NA       NA       5.04e-3

- HiPRIME, BSMOR and TBS use the same order (moments) to generate the macromodel
- The macromodel obtained by HNE has a similar size and sparsity as TBS
1. TBS reduces waveform-error by 38X compared to HNE, as the truncation used in HNE leads to large error
2. TBS reduces waveform-error by 33X compared to HiPRIME, as more poles are matched
3. TBS reduces waveform-error by 17X compared to BSMOR, as more poles are exactly matched

27 Study Runtime Scalability

ckt   HNE build      HNE simulation  HiPRIME build  HiPRIME simulation  BSMOR build  BSMOR simulation  TBS build  TBS simulation
ckt1  0.44s          0.08s           0.15s          1.02s               0.12s        0.08s             0.09s      0.08s
ckt2  2.19s          1.24s           0.54s          1min:42s            0.63s        1.18s             0.11s      1.02s
ckt3  1min:17s       1min:51s        5.76s          2hr:48min:20s       1min:2s      1min:38s          1.62s      1min:32s
ckt4  34min:58s      21min:32s       47.3s          ~1day               4min:54s     11min:42s         20.7s      11min:23s
ckt5  4hr:43min:18s  1day:5hr:11min  2min:42s       ~5day               1hr:45min    1day:1hr:36min    2min:8s    1day:18min
ckt6  NA             NA              NA             NA                  NA           NA                6min:16s   1day:1hr:29min

- Runtime includes macromodel-building/simulation time
- All methods generate macromodels with similar accuracy
1. TBS (and HiPRIME) is 133X faster to build than HNE, as no LP-truncation is needed to preserve sparsity
2. TBS (and HiPRIME) is 54X faster to build than BSMOR, as the orthonormalization is performed locally
3. TBS (and BSMOR/HNE) is 109X faster to simulate than HiPRIME, as their macromodels have hierarchy
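Two of the quoted speedups can be checked against the runtime entries with simple arithmetic. The slide does not say which circuit each ratio comes from, so the pairing below is an assumption: the ckt5 build times for HNE (4hr:43min:18s) versus TBS (2min:8s), and the ckt3 simulation times for HiPRIME (2hr:48min:20s) versus TBS (1min:32s). The `to_seconds` helper is hypothetical.

```python
import re

UNIT_SECONDS = {"day": 86400, "hr": 3600, "min": 60, "s": 1}

def to_seconds(text):
    """Convert a duration such as '4hr:43min:18s' or '20.7s' to seconds."""
    total = 0.0
    for part in text.split(":"):
        value, unit = re.fullmatch(r"([\d.]+)(day|hr|min|s)", part).groups()
        total += float(value) * UNIT_SECONDS[unit]
    return total

# Build time, ckt5: HNE versus TBS
build_speedup = to_seconds("4hr:43min:18s") / to_seconds("2min:8s")

# Simulation time, ckt3: HiPRIME versus TBS
sim_speedup = to_seconds("2hr:48min:20s") / to_seconds("1min:32s")

print(f"build speedup over HNE: {build_speedup:.1f}x")
print(f"simulation speedup over HiPRIME: {sim_speedup:.1f}x")
```

Under this pairing the ratios come out at roughly 133 and 110, in line with the 133X and 109X figures quoted on the slide.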

28 Conclusions
- TBS enables localized moment matching, and matches more poles than PRIMA
- TBS is stable, and is passive for MIMO reduction
- TBS is applicable to both homogeneous and heterogeneous designs
- TBS achieves over 20x less waveform error and 50x speedup compared to HNE, HiPRIME, and BSMOR (an improved version of SPRIM)
- The TBS approach has been extended to
  - Handle inductance and its inverse element [Yu-Shi-He:ICCAD’06]
  - Optimize simultaneous power and thermal integrity in 3D integration [Yu-Ho-He:ICCAD’06]
More details can be found in the DAC Ph.D. Forum 2006

