Michele Santoro: michele.santoro@dresd.org
Further Improvements in Interconnect-Driven High-Level Synthesis of DFGs Using 2-Level Graph Isomorphism

Advisor: Donatella Sciuto
Co-advisor: Marco D. Santambrogio

Motivation
Problem statement: interconnects have a great impact on circuit design:
  on area: interconnect size
  on circuit latency: signal propagation
  on power consumption: parasitic capacitance and intrinsic resistance
Solution: decrease the number of interconnects.
Focus on the single steps (tasks) of the HLS process.
[Figure: HLS flow. Labels: Behavioral Description, HLS, Operation Scheduling, Resource Allocation and Binding, Controller Synthesis, Datapath, Placement, Floorplanning]

Innovation
The innovative contribution of this thesis focuses on the scheduling and allocation phases.
This thesis combines different techniques in an innovative way to reduce the synthesis cost:
  Coloring
  Resource Sharing
  Interconnection Sharing
The analysis of the scheduling and allocation problem is divided into two phases:
  Static
  Dynamic

Outline
Introduction
  Scheduling and Allocation problem definition
State of the Art
Implementations
  MR-LCS
  Coloring Aware
  PushDown algorithm
  Best Resource
Results
  Benchmark
  Random DFGs
Conclusion and Future Work

Introduction
As briefly mentioned, the VLSI design flow turns high-level specifications into an actual device. High-Level Synthesis is part of the VLSI design flow and consists of several steps, such as:
  Scheduling
  Allocation
  Placement
  Floorplanning
Scheduling
  Selects a control step for each operation.
  Determines the number and type of resources to allocate.
  Speaker note: mention the various types of scheduling (exact and heuristic) and the resource and latency constraints.
Allocation
  Maps operations to Functional Units.
  Determines the total number of all kinds of resources, including multiplexers and registers.
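To make "selects a control step for each operation" concrete, here is a minimal sketch of ASAP and ALAP scheduling on a small DFG. It is not taken from the thesis: the function name `asap_alap`, the unit-delay assumption (every operation takes one control step), and the example graph are all illustrative assumptions.

```python
from collections import defaultdict


def asap_alap(nodes, edges, latency=None):
    """Return (asap, alap) control steps for a DFG with unit-delay operations.

    nodes:   iterable of operation names
    edges:   iterable of (producer, consumer) pairs
    latency: number of control steps allowed (defaults to the critical path)
    """
    nodes = list(nodes)
    preds, succs = defaultdict(list), defaultdict(list)
    for u, v in edges:
        preds[v].append(u)
        succs[u].append(v)

    # Topological order (Kahn's algorithm); the list grows while we walk it.
    indeg = {n: len(preds[n]) for n in nodes}
    order = [n for n in nodes if indeg[n] == 0]
    for n in order:
        for s in succs[n]:
            indeg[s] -= 1
            if indeg[s] == 0:
                order.append(s)

    # ASAP: one step after the latest predecessor.
    asap = {}
    for n in order:
        asap[n] = 1 + max((asap[p] for p in preds[n]), default=0)

    critical_path = max(asap.values())
    if latency is None:
        latency = critical_path
    if latency < critical_path:
        raise ValueError("latency is shorter than the critical path")

    # ALAP: one step before the earliest successor, sinks at the last step.
    alap = {}
    for n in reversed(order):
        alap[n] = min((alap[s] for s in succs[n]), default=latency + 1) - 1
    return asap, alap


# Hypothetical DFG: a and b feed c; c and d feed e.
example_edges = [("a", "c"), ("b", "c"), ("c", "e"), ("d", "e")]
asap, alap = asap_alap("abcde", example_edges)
mobility = {n: alap[n] - asap[n] for n in asap}
print(asap, alap, mobility)
```

In this example only `d` has non-zero mobility (it can run in step 1 or step 2), which is the kind of freedom a scheduler can exploit when deciding which operations share resources.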

Coloring Aware MR-LCS
Starting point: MR-LCS (ALAP generalization)
Improving in 2 phases:
  Coloring: pre-processing phase
  Scheduling: together with allocation and binding
Identifying isomorphic 2-level sub-graphs:
  Join patterns
  Split patterns
  Linear patterns
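The coloring pre-processing step can be pictured with a small sketch that enumerates 2-level sub-graphs and groups those with the same shape and operation types. The slides do not define the patterns precisely, so the definitions below are assumptions: a linear pattern is a single producer-consumer edge, a join pattern is two producers feeding one consumer, a split pattern is one producer feeding two consumers, and two sub-graphs count as isomorphic when shape and operation types match. The function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def two_level_patterns(op_type, edges):
    """Group the 2-level sub-graphs of a DFG by shape and operation types.

    op_type: dict mapping node -> operation type (e.g. "add", "mul")
    edges:   list of (producer, consumer) pairs
    Returns a dict mapping a signature to the node tuples that match it.
    """
    preds, succs = defaultdict(list), defaultdict(list)
    for u, v in edges:
        preds[v].append(u)
        succs[u].append(v)

    groups = defaultdict(list)
    # Linear pattern (assumed): one producer feeding one consumer.
    for u, v in edges:
        groups[("linear", op_type[u], op_type[v])].append((u, v))
    # Join pattern (assumed): two producers feeding the same consumer.
    for v, producers in preds.items():
        for a, b in combinations(producers, 2):
            top = tuple(sorted((op_type[a], op_type[b])))
            groups[("join", top, op_type[v])].append((a, b, v))
    # Split pattern (assumed): one producer feeding two consumers.
    for u, consumers in succs.items():
        for a, b in combinations(consumers, 2):
            bottom = tuple(sorted((op_type[a], op_type[b])))
            groups[("split", op_type[u], bottom)].append((u, a, b))
    return dict(groups)


def repeated_patterns(groups):
    """Keep only signatures that occur more than once (candidates for coloring)."""
    return {sig: members for sig, members in groups.items() if len(members) > 1}


# Hypothetical DFG: two multiply-multiply-add joins feeding a subtraction.
ops = {"n1": "mul", "n2": "mul", "n3": "add",
       "n4": "mul", "n5": "mul", "n6": "add", "n7": "sub"}
dfg_edges = [("n1", "n3"), ("n2", "n3"), ("n4", "n6"),
             ("n5", "n6"), ("n3", "n7"), ("n6", "n7")]
for signature, members in repeated_patterns(two_level_patterns(ops, dfg_edges)).items():
    print(signature, members)
```

Sub-graphs that land in the same group could then receive the same color, so that the scheduler tries to bind them to the same resources and wires.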

Estimated Available Time
Scheduling priority is given to colored sub-graphs, so the availability of nodes needs to be estimated.
[Figure: node n with predecessors P1 and P2 on resources R1, R2, R3. Labels: EAT = 6, ST = 5; P2 generates an overlap. In one case EAT(n) = 8, in the other EAT(n) = ALAP(n).]

Pushdown algorithm
The Pushdown algorithm exploits the Safe Range to find the best solution in case of overlapping. It also manages resource utilization better.
Example (figure with nodes n1 to n5):
  Initial situation
  Start backward from R_L
  Schedule the operations
  Final situation

Best Resource algorithm
The current state of the schedule can be exploited by keeping a record of all existing interconnections.
[Figure labels: R1, R2, R3, R4, R5, R6, P1, P2, n, C3, C4]
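One way to read "keeping record of all the existing interconnections" is sketched below: when binding an operation, prefer the candidate resource that needs the fewest new wires from the resources of its predecessors. This is an illustrative assumption, not the thesis's Best Resource algorithm; the function names, the wire representation as (source, destination) pairs, and the example bindings are hypothetical.

```python
def best_resource(candidates, pred_resources, wires):
    """Pick the candidate that needs the fewest new interconnections.

    candidates:     resources able to execute the operation
    pred_resources: resources the operation's predecessors are bound to
    wires:          set of existing (source, destination) interconnections
    Ties are broken in favor of the first candidate listed.
    """
    def new_wires(resource):
        return sum((p, resource) not in wires for p in set(pred_resources))

    return min(candidates, key=new_wires)


def bind(candidates, pred_resources, wires):
    """Bind an operation and record any interconnection it adds."""
    chosen = best_resource(candidates, pred_resources, wires)
    wires.update((p, chosen) for p in pred_resources)
    return chosen


# Hypothetical state: predecessors P1 and P2 are bound to R1 and R2.
existing = {("R1", "R3"), ("R1", "R4"), ("R2", "R4")}
print(bind(["R3", "R4"], ["R1", "R2"], existing))  # R4 reuses both wires
print(len(existing))                               # still 3 wires
```

Choosing R4 adds no wire, whereas R3 would require a new R2-to-R3 connection; this is the kind of saving that shows up in the wire counts of the results.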

Results
To validate the results, the algorithms have been applied to the Media Benchmark and to randomly generated DFGs. The captured costs are divided into:
  Direct costs: number of interconnections, number of resources
  Indirect costs: number of registers, number of multiplexers, max fan-out
  Derived costs: wire length, total area

Benchmark Results
Benchmarks:
  fft: fast discrete Fourier transform
  convolve: convolution of two functions
  jdmerge: used in reconstructing JPEG images
  getblk: a kernel service that manages buffers
Columns: Benchmark, Nodes, Edges, Wires (MR-LCS, CA, CA_PD_BR, BR), Resources (MR-LCS, CA, CA_PD_BR, BR)
  fft2 11 9 6 7 5 4
  fft1 17 12 13 8
  convolve2 18 10
  convolve1 23 14 16
  getblk 33 29 22 20 21
  convolve0 49 41 30 31 15
  jdmerge 79 65 60 54 44 32 19
  Avg 32.86 26.29 20.86 21.14 18.43 19.43 12.71 11.00 10.00 10.43
  Improv. Wires: CA -1.37%, CA_PD_BR 11.64%, BR 6.85%; Resources: CA 13.48%, CA_PD_BR 21.35%, BR 17.98%

Random DFGs: Direct Costs
Columns: #Nodes, MR-LCS, CA, CA_PD_BR, BR; sub-columns α=0.0, α=0.5, α=1.0

Wires
  50 32 35
  300 263 240 217 216 225
  550 512 460 432 431 433 444
  800 759 670 639 635 662
  1050 1010 881 847 845 842 884
  1300 1260 1091 1066 1054 1103
  1550 1505 1297 1246 1234 1323
  1800 1761 1519 1500 1478 1476 1548
  Total 7102 6194 5993 5937 5923 6221
  Improv. 12.8% 15.6% 16.4% 16.6% 12.4%

Resources
  50 15 10 9
  300 101 54 53 55 62
  550 179 99 98 100 117
  800 263 139 134 169
  1050 333 168 172 217
  1300 412 207 206 212 257
  1550 488 233 228 226 296
  1800 579 299 273 277 353
  Total 2368 1209 1170 1190 1480
  Improv. 22.9% 60.7% 61.9% 61.3% 51.8%

The tables show an improvement of about 17% for wire sharing and of about 60% for resource sharing.
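The "Improv." figures for wires can be reproduced from the totals row as a relative reduction with respect to the first total. The sketch below assumes that the first total (7102) is the MR-LCS baseline and that the remaining totals follow in column order; the helper name `improvement` is hypothetical.

```python
def improvement(baseline, value):
    """Percentage reduction of `value` with respect to `baseline`."""
    return 100.0 * (baseline - value) / baseline


# Totals row of the Wires table; the first entry is assumed to be MR-LCS.
wire_totals = [7102, 6194, 5993, 5937, 5923, 6221]
baseline, *others = wire_totals
percentages = [round(improvement(baseline, v), 1) for v in others]
print(percentages)  # [12.8, 15.6, 16.4, 16.6, 12.4]
```

The result matches the percentages reported for wires, and the best value (16.6%) is where the "about 17%" figure for wire sharing comes from.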

Random DFGs: Indirect Costs
Columns: #Nodes, MR-LCS, CA, CA_PD_BR, BR; sub-columns α=0.0, α=0.5, α=1.0

Registers
  50 29 30 27
  300 172 163 133 132
  550 303 288 232 234 228 236
  800 438 419 323 325 320 332
  1050 580 544 407 411 398 423
  1300 721 679 512 510 497 533
  1550 841 798 588 605 581 628
  1800 986 920 684 695 673 726
  Total 4069 3841 2907 2941 2856 3039
  Improv. 5.6% 28.6% 27.7% 29.8% 25.3%

Multiplexers
  50 6 7
  300 38 40 26 27
  550 68 67 45 48
  800 101 93 61 60 59 65
  1050 134 123 74 72 73 82
  1300 170 155 92 88 104
  1550 194 180 106 102 118
  1800 233 209 122 119 142
  Total 943 873 531 530 519 593
  Improv. 7.5% 43.6% 43.8% 44.9% 37%

Max Fan-out
  50 9 8 7
  300 43 29 25 24 27
  550 48 32 31 33
  800 56 37 36 40
  1050 62 42 45
  1300 65 44
  1550 73 47 49 46 53
  1800 71 51 55
  Total 426 286 276 281 307
  Improv. 33% 32.9% 35.3% 34.2% 28%

Random DFGs: Derived Costs
Columns: #Nodes, MR-LCS, CA, CA_PD_BR, BR; sub-columns α=0.0, α=0.5, α=1.0

Wire Length
  50 327 476 258 256 264 328
  300 7225 4919 4859 5061 4425 5530
  550 19195 16434 11289 11785 13346 13547
  800 33625 23352 18485 18398 18506 24278
  1050 53617 44911 30835 30872 29861 39128
  1300 94362 47793 35643 34661 35451 41891
  1550 101569 75465 53907 52420 53373 69657
  1800 146467 90954 66617 64530 63814 83573
  Total 456387 304304 221893 217983 219040 277932
  Improv. 33.3% 51.4% 52.2% 52.0% 39.1%

Total Area
  50 1344 1360 816 864 832 928
  300 11680 8624 5376 5760 6912
  550 22672 15232 10752 10240 10816 12672
  800 33792 23296 13376 14976 13728 17920
  1050 47728 31520 20672 19712 18816 24960
  1300 57792 33600 20800
  1550 67936 46144 26496 28288 25920 33408
  1800 81312 55040 32000 28800 34800
  Total 324256 214816 128304 127968 124384 152400
  Improv. 33.8% 60.4% 60.5% 61.6% 53.0%

The tables show a reduction of about 52% for total wire length and of about 60% for total area.

Conclusions and Future Works
In this Master's thesis project, simple considerations have been used. It has been shown that the proposed algorithms perform better than standard MR-LCS, achieving:
  up to 17% improvement in interconnection sharing
  around 68% improvement in resource sharing
  a reduction of around 64% in overall cost
Future Works
  Recognize and exploit different topological patterns.
  Multi-coloring pre-processing.
  Reiterate the solution through the algorithm: this allows further improvements, because the algorithm will be aware of the solution's upper bound.

Questions?
