CS294-6 Reconfigurable Computing Day 13 October 6, 1998 Interconnect Partitioning.


1 CS294-6 Reconfigurable Computing Day 13 October 6, 1998 Interconnect Partitioning

2 Previously Established –cannot afford fully general interconnect –must exploit locality in network Quantified –locality (geometric growth / Rent’s Rule)

3 Today Automation to extract/exploit locality –Partitioning Heuristics FM Spectral

4 Partitioning Hierarchical placement Often used to initialize placement NP-hard in general Fast heuristics exist Don’t address critical path delay

5 Partitioning Problem Given: netlist of interconnected cells Partition into two (roughly) equal halves (A,B) minimize the number of nets shared by halves “Roughly Equal” –balance condition: (0.5-δ)N ≤ |A| ≤ (0.5+δ)N
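A minimal sketch of the two quantities in the problem statement: the cut size (nets shared by both halves) and the balance condition. The netlist representation (a list of tuples of cell names), the function names, and the tolerance name `delta` are assumptions for illustration, not part of the slides.

```python
def cut_size(nets, side):
    """Number of nets that have cells on both sides of the partition."""
    return sum(1 for cells in nets if len({side[c] for c in cells}) == 2)

def is_balanced(side, delta):
    """Check (0.5 - delta) * N <= |A| <= (0.5 + delta) * N, with A = side 0."""
    n = len(side)
    size_a = sum(1 for s in side.values() if s == 0)
    return (0.5 - delta) * n <= size_a <= (0.5 + delta) * n

# Hypothetical 4-cell netlist; each tuple is one net.
nets = [("a", "b"), ("b", "c", "d"), ("a", "d"), ("c", "d")]
side = {"a": 0, "b": 0, "c": 1, "d": 1}
print(cut_size(nets, side), is_balanced(side, 0.1))
```

Here nets `("b", "c", "d")` and `("a", "d")` span both halves, so the cut size is 2, and a 2/2 split satisfies the balance condition for any non-negative tolerance.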

6 Fiduccia-Mattheyses (Kernighan-Lin refinement) Randomly partition into two halves Repeat until no updates –Start with all cells free –Repeat until no cells free Move cell with largest gain Update costs of neighbors Lock cell in place (record current cost) –Pick least cost point in previous sequence and use as next starting position Repeat for different random starting points
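The pass structure above can be sketched as follows. This is a simplified illustration, not the slide author's implementation: gains are recomputed from scratch on every step (so it does not have the fast running time of real FM), the balance rule is reduced to an assumed `min_size` per side, ties are broken by lowest cell id, and the random restarts are left to the caller. All function names are hypothetical.

```python
def cut_size(nets, side):
    return sum(1 for cells in nets if len({side[c] for c in cells}) == 2)

def gain(nets, side, cell):
    """Reduction in cut size if `cell` switches sides (recomputed from scratch)."""
    trial = dict(side)
    trial[cell] = 1 - side[cell]
    return cut_size(nets, side) - cut_size(nets, trial)

def fm_pass(nets, side, min_size):
    """Move each cell at most once; return the best partition seen in the pass."""
    side = dict(side)
    sizes = [sum(1 for s in side.values() if s == k) for k in (0, 1)]
    free = set(side)
    cost = cut_size(nets, side)
    best_cost, best_side = cost, dict(side)
    while free:
        # Only consider moves that keep the source side at or above min_size.
        legal = [c for c in sorted(free) if sizes[side[c]] - 1 >= min_size]
        if not legal:
            break
        cell = max(legal, key=lambda c: gain(nets, side, c))
        cost -= gain(nets, side, cell)
        sizes[side[cell]] -= 1
        side[cell] = 1 - side[cell]
        sizes[side[cell]] += 1
        free.remove(cell)        # lock the cell for the rest of the pass
        if cost < best_cost:     # remember the least-cost point in the sequence
            best_cost, best_side = cost, dict(side)
    return best_side, best_cost

def fm(nets, side, min_size):
    """Repeat passes until a pass yields no improvement."""
    cost = cut_size(nets, side)
    while True:
        new_side, new_cost = fm_pass(nets, side, min_size)
        if new_cost >= cost:
            return side, cost
        side, cost = new_side, new_cost

# Two triangles (0,1,2) and (3,4,5) joined by the net (2,3), badly partitioned.
nets = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
start = {0: 0, 1: 0, 2: 1, 3: 0, 4: 1, 5: 1}
final_side, final_cost = fm(nets, start, min_size=2)
print(final_side, final_cost)
```

Note that the pass keeps making moves even when every gain is negative; that is what lets FM climb out of some local minima, and the "pick least cost point" step discards the bad tail afterwards.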

7 FM Cell Gains Gain = Delta in number of nets crossing between partitions

8 FM Recompute Cell Gain For each net, keep track of number of cells in each partition (F = side the cell moves from, T = side it moves to) Move update: (for each net on moved cell) –if T(net)==0, increment gain on F side of net (think -1 => 0) –if T(net)==1, decrement gain on T side of net (think 1=>0) –decrement F(net), increment T(net) –if F(net)==1, increment gain on F cell –if F(net)==0, decrement gain on all cells (T)
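A sketch of the incremental update rules, checked against a from-scratch gain computation. The data layout (a dict of nets, a per-net `[count on side 0, count on side 1]` list) and the function names are assumptions for illustration. Locking is not modelled here; a real FM pass would skip locked cells, whose gains are no longer needed.

```python
def cell_gain(cell, side, nets):
    """Gain from scratch: +1 per net the move uncuts, -1 per net it newly cuts."""
    g = 0
    for cells in nets.values():
        if cell not in cells:
            continue
        same = sum(1 for c in cells if side[c] == side[cell])
        if same == 1:            # cell is alone on its side of this net
            g += 1
        if same == len(cells):   # net lies entirely on the cell's side
            g -= 1
    return g

def move_and_update(cell, side, gain, count, nets):
    """Move `cell` to the other side, updating neighbor gains incrementally."""
    f, t = side[cell], 1 - side[cell]
    for name, cells in nets.items():
        if cell not in cells:
            continue
        others = [c for c in cells if c != cell]
        if count[name][t] == 0:       # net was uncut: others lose their -1
            for c in others:
                gain[c] += 1
        elif count[name][t] == 1:     # the lone T cell loses its +1
            for c in others:
                if side[c] == t:
                    gain[c] -= 1
        count[name][f] -= 1
        count[name][t] += 1
        if count[name][f] == 0:       # net now entirely on T: others gain a -1
            for c in others:
                gain[c] -= 1
        elif count[name][f] == 1:     # the lone remaining F cell gains a +1
            for c in others:
                if side[c] == f:
                    gain[c] += 1
    gain[cell] = -gain[cell]          # moving back would undo the change
    side[cell] = t

nets = {"n1": ["a", "b"], "n2": ["b", "c", "d"], "n3": ["a", "d"], "n4": ["c", "d"]}
side = {"a": 0, "b": 0, "c": 1, "d": 1}
count = {n: [sum(1 for c in cs if side[c] == k) for k in (0, 1)] for n, cs in nets.items()}
gain = {c: cell_gain(c, side, nets) for c in side}
move_and_update("b", side, gain, count, nets)
print(gain)
```

The point of the incremental form is that only cells sharing a net with the moved cell are touched, which is what makes the per-move work proportional to fanout rather than to the whole netlist.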

9 FM Recompute (example)

10 FM Recompute (example)

11 FM Data Structures Partition Counts A,B Two gain arrays –Key: constant time cell update Cells –successors (consumers) –inputs –locked status
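One way the gain array might look is a bucket array indexed by gain, with a pointer to the highest non-empty bucket. The class below is a sketch under assumptions: the name `GainBuckets`, the use of Python sets for buckets, and the `max_gain` bound (the largest number of nets on any cell) are illustrative choices, and only one of the two per-partition arrays is shown.

```python
class GainBuckets:
    """Cells bucketed by gain in [-max_gain, +max_gain]."""

    def __init__(self, max_gain):
        self.max_gain = max_gain
        self.buckets = [set() for _ in range(2 * max_gain + 1)]
        self.gain = {}
        self.top = -1  # index of the highest bucket that may be non-empty

    def insert(self, cell, gain):
        index = gain + self.max_gain
        self.buckets[index].add(cell)
        self.gain[cell] = gain
        self.top = max(self.top, index)

    def remove(self, cell):
        self.buckets[self.gain.pop(cell) + self.max_gain].discard(cell)

    def adjust(self, cell, delta):
        """Constant-time gain update: move the cell to a neighboring bucket."""
        new_gain = self.gain[cell] + delta
        self.remove(cell)
        self.insert(cell, new_gain)

    def pop_max(self):
        """Return (cell, gain) for a cell of largest gain, or None if empty."""
        while self.top >= 0 and not self.buckets[self.top]:
            self.top -= 1
        if self.top < 0:
            return None
        cell = self.buckets[self.top].pop()
        return cell, self.gain.pop(cell)

buckets = GainBuckets(max_gain=2)
buckets.insert("a", 1)
buckets.insert("b", -2)
buckets.insert("c", 0)
buckets.adjust("c", +2)   # c now has gain 2
first = buckets.pop_max()
```

Because each update changes a gain by a small step, the cell only moves between nearby buckets, so no sorting is needed to keep the "largest gain" cell available.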

12 FM Optimization Sequence (ex)

13 FM Running Time? Randomly partition into two halves Repeat until no updates –Start with all cells free –Repeat until no cells free Move cell with largest gain Update costs of neighbors Lock cell in place (record current cost) –Pick least cost point in previous sequence and use as next starting position Repeat for different random starting points

14 FM Running Time Claim: small number of passes (constant?) to converge Small (constant?) number of random starts N cell updates Updates K + fanout work (avg. fanout K) –assume K-LUTs Maintain ordered list O(1) per move –every io move up/down by 1 Running time: O(KN)

15 FM Starts? 21K random starts, 3K network -- Alpert/Kahng

16 Tool Admin “breather” ppw autoretime limitdepth gbufplace ar (sliding) my_adder Makefile ex

17 Spectral Partitioning Minimize Squared Wire length -- 1D layout Start with connection array C ($c_{i,j}$) “Placement” Vector X for $x_i$ placement cost $= 0.5 \sum_{i,j} (x_i - x_j)^2 c_{i,j}$ cost sum is X’BX –B = D-C –D = diagonal matrix, $d_{i,i} = \sum_j c_{i,j}$
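A small numeric check that the squared wire length cost equals the quadratic form X'BX with B = D - C. The sketch assumes C is symmetric with a zero diagonal; the function names and the 3-node example are illustrative only.

```python
def laplacian(C):
    """B = D - C, where D is diagonal with d[i][i] = sum over j of C[i][j]."""
    n = len(C)
    return [[(sum(C[i]) if i == j else 0) - C[i][j] for j in range(n)]
            for i in range(n)]

def wire_cost(C, x):
    """0.5 * sum over all (i, j) of (x_i - x_j)^2 * c_ij."""
    n = len(C)
    return 0.5 * sum((x[i] - x[j]) ** 2 * C[i][j]
                     for i in range(n) for j in range(n))

def quadratic_form(B, x):
    """X'BX."""
    n = len(B)
    return sum(x[i] * B[i][j] * x[j] for i in range(n) for j in range(n))

# Path 0 - 1 - 2 with connection weights 1 and 2 (hypothetical example).
C = [[0, 1, 0],
     [1, 0, 2],
     [0, 2, 0]]
x = [0.0, 1.0, 3.0]
B = laplacian(C)
print(wire_cost(C, x), quadratic_form(B, x))
```

Both expressions evaluate to 9 here: the weight-1 connection spans distance 1 and the weight-2 connection spans distance 2, giving 1·1 + 2·4.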

18 Spectral Partition Constraint: X’X=1 –prevent trivial solution all $x_i$’s same Minimize cost=X’BX w/ constraint –minimize $L = X'BX - \lambda (X'X - 1)$ –$\partial L / \partial X = 2BX - 2\lambda X = 0$ –$(B - \lambda I)X = 0$ –X => Eigenvector of B –cost is Eigenvalue
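A sketch of computing a useful eigenvector without a linear algebra library. Since the all-equal vector is always an eigenvector of B with eigenvalue 0, the sketch looks for the eigenvector of the second-smallest eigenvalue instead. The method shown, power iteration on a shifted matrix with the all-equal component projected out, is a stand-in chosen for self-containment, not the method from the slides; a production tool would use a sparse eigensolver. It also assumes a connected graph and a starting vector that is not orthogonal to the answer.

```python
import math

def fiedler_vector(C, iterations=500):
    """Approximate the eigenvector of B = D - C with second-smallest eigenvalue."""
    n = len(C)
    degree = [sum(row) for row in C]
    shift = 2 * max(degree)   # Gershgorin bound: eigenvalues of B lie in [0, shift]
    x = [i - (n - 1) / 2 for i in range(n)]   # ramp, orthogonal to all-ones
    for _ in range(iterations):
        # y = (shift*I - B) x, using B x = D x - C x
        y = [shift * x[i] - degree[i] * x[i]
             + sum(C[i][j] * x[j] for j in range(n)) for i in range(n)]
        mean = sum(y) / n
        y = [v - mean for v in y]             # project out the all-equal vector
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]             # enforce X'X = 1
    return x

# Two triangles (0,1,2) and (3,4,5) joined by the connection 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
C = [[0] * 6 for _ in range(6)]
for u, v in edges:
    C[u][v] = C[v][u] = 1
x = fiedler_vector(C)
order = sorted(range(6), key=lambda i: x[i])
print(order)
```

Sorting nodes by their entry in `x` gives the 1D ordering; in this example the two triangles land on opposite ends, so cutting the ordering in the middle separates them.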

19 Spectral Partitioning X ($x_i$’s) continuous use to order nodes cut partition from order

20 Spectral Ordering Midpoint bisect isn’t necessarily best place to cut, consider: K (n/4) K (n/2)

21 Spectral Partitioning Options Can bisect by choosing midpoint Can relax cut criteria –min cut w/in some δ of balance Ratio Cut –minimize cut/(|A||B|) idea: trade off imbalance for smaller cut –more imbalance => smaller |A||B| –so cut must be much smaller to accept
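A sketch of choosing the cut point along an ordering by ratio cut. The ordering is assumed to come from sorting nodes by their spectral placement value; the function names and the example graph (a 2-node clique attached to a 4-node clique) are illustrative assumptions.

```python
def sweep_cuts(order, edges):
    """For each split position, return (split, cut, cut / (|A| * |B|))."""
    position = {node: k for k, node in enumerate(order)}
    n = len(order)
    results = []
    for split in range(1, n):
        # An edge is cut when its endpoints fall on opposite sides of the split.
        cut = sum(1 for u, v in edges
                  if (position[u] < split) != (position[v] < split))
        results.append((split, cut, cut / (split * (n - split))))
    return results

def best_ratio_cut(order, edges):
    return min(sweep_cuts(order, edges), key=lambda r: r[2])

# Nodes 0-1 form a small clique, nodes 2-5 a larger one, joined by edge 1-2.
edges = [(0, 1), (1, 2), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]
order = [0, 1, 2, 3, 4, 5]
sweep = sweep_cuts(order, edges)
best = best_ratio_cut(order, edges)
print(best)
```

In this example the midpoint split (3 | 3) cuts three edges, while splitting after the first two nodes cuts only one; the ratio-cut objective prefers the latter despite the 2 | 4 imbalance.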

22 Spectral vs. FM From Hauck/Borriello ‘96

23 Improving Spectral More Eigenvalues –look at clusters in n-d space –5--70% improvement over EIG1

24 Spectral Theory There are conditions under which spectral is optimal –Boppana Provides lower/upper bound on cut size

25 Improving FM Clustering technology mapping initial partitions runs partition size freedom replication Following comparisons from Hauck and Borriello ‘96

26 Clustering Group together several leaf cells into cluster Run partition on clusters Uncluster (keep partitions) –iteratively Run partition again –using prior result as starting point
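The cluster / partition / uncluster flow can be sketched with two small helpers. The names `coarsen` and `uncluster`, the `cluster_of` mapping, and the hand-written cluster partition are assumptions for illustration; in practice the clustered netlist would be handed to a partitioner such as FM, and the unclustered result would seed another FM run.

```python
def coarsen(nets, cluster_of):
    """Rewrite each net in terms of clusters; drop nets internal to one cluster."""
    coarse = []
    for cells in nets:
        clusters = tuple(sorted({cluster_of[c] for c in cells}))
        if len(clusters) > 1:
            coarse.append(clusters)
    return coarse

def uncluster(cluster_side, cluster_of):
    """Give every leaf cell the side assigned to its cluster."""
    return {cell: cluster_side[cluster] for cell, cluster in cluster_of.items()}

# Two triangles joined by one net; each triangle is one cluster.
nets = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
cluster_of = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
coarse_nets = coarsen(nets, cluster_of)
# Stand-in for "run partition on clusters": a hand-picked assignment.
cluster_side = {"A": 0, "B": 1}
side = uncluster(cluster_side, cluster_of)
print(coarse_nets, side)
```

Only one net survives coarsening here, which is the source of the speedup: the partitioner sees a much smaller problem, and the projected result is a starting point rather than a final answer.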

27 Clustering Benefits Catch local connectivity which FM might miss –moving one element at a time, hard to see benefit of moving whole connected groups across partition Faster (smaller N) –METIS (fastest research partitioner) exploits this heavily –…works for spectral, too FM works better w/ larger nodes (???)

28 How Cluster? Random –cheap, some benefits for speed Greedy “connectivity” –examine in random order –cluster to most highly connected –30% better cut, 16% faster than random Spectral –look for clusters in placement –(ratio-cut like) Brute-force connectivity (can be O(N 2 ))
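A sketch of the greedy "connectivity" option: visit cells in random order and pair each unclustered cell with the unclustered neighbor it shares the most nets with. Restricting clusters to pairs, the tie-break by sorted cell id, and the function name are simplifying assumptions, not details from the slides.

```python
import random

def greedy_cluster(nets, cells, seed=0):
    """Return a mapping cell -> cluster id (the id is the cell that started it)."""
    rng = random.Random(seed)
    order = list(cells)
    rng.shuffle(order)                 # examine cells in random order
    cluster_of = {}
    for cell in order:
        if cell in cluster_of:
            continue
        # Count nets shared with each neighbor that is still unclustered.
        shared = {}
        for net in nets:
            if cell in net:
                for other in net:
                    if other != cell and other not in cluster_of:
                        shared[other] = shared.get(other, 0) + 1
        cluster_of[cell] = cell
        if shared:
            mate = max(sorted(shared), key=shared.get)
            cluster_of[mate] = cell    # join the most highly connected neighbor
    return cluster_of

# Cells 0-1 and 2-3 share two nets each; 1-2 share only one.
nets = [(0, 1), (0, 1), (2, 3), (2, 3), (1, 2)]
cluster_of = greedy_cluster(nets, cells=[0, 1, 2, 3])
print(cluster_of)
```

For this netlist every visiting order produces the same grouping, {0, 1} and {2, 3}, because each cell's strongest connection is to its partner.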

29 LUT Mapped? Better to partition before LUT mapping.

30 Initial Partitions? Random Pick Random node for one side –start imbalanced –run FM from there Pick random node and Breadth-first search to fill one half Pick random node and Depth-first search to fill half Start with Spectral partition
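A sketch of the breadth-first option for building an initial partition. The adjacency-list input, the function name, and passing the seed node explicitly (instead of picking it at random) are assumptions made to keep the example deterministic; a caller could choose the seed with `random.choice`. If the graph is disconnected the search may stop before filling half, which this sketch does not handle.

```python
from collections import deque

def bfs_initial_partition(neighbors, seed):
    """Fill side 0 with the first half of the nodes reached by BFS from `seed`."""
    half = len(neighbors) // 2
    side = {node: 1 for node in neighbors}   # everything starts on side 1
    queue, seen = deque([seed]), {seed}
    filled = 0
    while queue and filled < half:
        node = queue.popleft()
        side[node] = 0
        filled += 1
        for nxt in sorted(neighbors[node]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return side

# Two triangles (0,1,2) and (3,4,5) joined by the connection 2-3.
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
             3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
side = bfs_initial_partition(neighbors, seed=0)
print(side)
```

Starting from node 0, the search claims the whole first triangle before reaching the second, so the initial partition already follows the graph's local structure.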

31 Initial Partitions If run several times –pure random tends to win out –more freedom / variety of starts –more variation from run to run –others trapped in local minima

32 Number of Runs

33 2 runs: 10%; 10 runs: 18%; 20 runs: <20% (2% better than 10 runs); 50 runs: (4% better than 10 runs) …but?

34 FM Starts? 21K random starts, 3K network -- Alpert/Kahng

35 Replication Trade some additional logic area for smaller cut size Replication data from: Enos, Hauck, Sarrafzadeh ‘97

36 Replication 5% => 38% cut size reduction 50% => 50+% cut size reduction

37 Partitioning Summary Two effective heuristics –Spectral –FM many ways to tweak –Hauck/Borriello half size of vanilla even better with replication only address cut size, not critical path delay

38 Next Lecture LIVE (Taping): Wednesday 4pm (here) Playback: Thursday, classtime/place Programmable Computing Elements –Computing w/ Memories

