
1 Memory Efficient Software Synthesis from Dataflow Graph
Wonyong Sung, Junedong Kim, Soonhoi Ha
Codesign and Parallel Processing Lab., Seoul National University

2 Contents
- Introduction
- Code Generation from Block Diagram Specification
- Synchronous Data Flow and Single Appearance Schedule
- Proposed Strategies
- Optimization 1: code sharing optimization
- Optimization 2: minimize buffer requirement
- Experiments
- Conclusions

3 Introduction
- Motivations
  - Embedded systems have a limited amount of memory: a large program means memory cost, performance penalties, and power consumption.
  - New trend of software development: high-level design methodologies, driven by growing complexity, fast design turn-around times, limited budgets, etc.
- Goal of Research
  - Reduce the code and data size of automatically generated software
  - In an automatic software synthesis environment where the specification is a dataflow graph with SDF (Synchronous DataFlow) semantics

4 Software Synthesis from an SDF Graph
[Figure: SDF graph with actors A, B, C, D and mismatched production/consumption rates on the arcs]
Possible schedules:
- AABCABACDABABCD
- (6A)(4B)(3C)(2D)
- (2(3A2B))(3C)(2D)
Code for (6A)(4B)(3C)(2D):
main(){
  for(i=0;i<6;i++){A}
  for(i=0;i<4;i++){B}
  for(i=0;i<3;i++){C}
  for(i=0;i<2;i++){D}
}
Code for (2(3A2B))(3C)(2D):
main(){
  for(i=0;i<2;i++){
    for(j=0;j<3;j++){A}
    for(j=0;j<2;j++){B}
  }
  for(i=0;i<3;i++){C}
  for(i=0;i<2;i++){D}
}
The two looped forms are Single Appearance Schedules (SAS): each actor appears exactly once in the schedule text.
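For contrast, a flat (non-single-appearance) schedule duplicates an actor's code once per firing. A minimal C sketch, with each actor body reduced to a hypothetical one-line placeholder:

#include <stdio.h>

/* Placeholder actor bodies; in generated inline code each call site
   below would instead be a separate copy of the actor's block code. */
static void A(void) { printf("A "); }
static void B(void) { printf("B "); }
static void C(void) { printf("C "); }
static void D(void) { printf("D "); }

int main(void) {
    /* Flat schedule AABCABACDABABCD: code size grows with the number
       of firings, unlike the looped single appearance schedules. */
    A(); A(); B(); C(); A(); B(); A(); C(); D(); A(); B(); A(); B(); C(); D();
    printf("\n");
    return 0;
}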

5 Previous Efforts
- Single Appearance Schedule (SAS): APGAN, RPMC
  - by Bhattacharyya et al., Ptolemy group
  - SAS guarantees the minimum code size (without code sharing)
  - APGAN, RPMC: heuristics to find an SAS with minimized data memory
- ILP formulation for data memory minimization
  - by Ritz et al., Meyr group
  - flat single appearance schedule + sharing of data buffers
- Rate-optimal compile-time scheduling
  - by Govindarajan et al., Gao group
  - tried to minimize the buffer requirement using linear programming
- An algorithm to compute the smallest data buffer size
  - by Ade et al., GRAPE group

6 Proposed Strategies
- Coding style
  - not tied to a single coding style; hybrid approach
  - generated code is a mixture of inlined code and functions
- Optimization 1: Code Sharing
  - multiple instances of the same kernel are treated as different nodes in an SAS
  - code sharing has a gain (code block size) and a cost (context size)
- Optimization 2: Schedule Adjustment
  - give up the single appearance schedule to reduce the data size
  - (1) represent the schedule with the BTLC data structure, (2) find possible locations for adjustment, (3) adjust the schedule

7 Flowchart of the Optimization Procedure
Get SAS schedule [RPMC, APGAN] -> Code sharing optimization (uses code-block size and context size) -> BTLC construction -> Schedule adjustment -> C code generation

8 Example of Code Sharing (CD2DAT)
[Figure: CD2DAT dataflow graph with actors ramp, ramp', sine, sine', fir1, fir2, fir3, fir4, and xgraph]
Code before sharing:
for(int i=0;i<2;i++) {
  { /* code for fir1 */
    ...
    out = tap * input[i];
    ...
  }
}
/* code for fir2 */
...
Code after sharing:
for(int i=0;i<2;i++) fir(1);
for(int i=0;i<3;i++) fir(2);
...
void fir(int context){
  ...
  context_FIR[context].out ...
  ...
}
Context definition:
typedef struct{
  double *out;
  int output_ofs;
  int output_bs;
  int output_nx;
  ...
  double decimation;
  double tap;
} context_FIR;
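Below is a self-contained sketch of the same sharing pattern. The context structure is simplified to a single tap and the buffer bookkeeping is omitted, so the struct layout, field values, and two-instance setup are assumptions made for illustration only:

#include <stdio.h>

/* Simplified per-instance context; the generated context_FIR above also
   carries buffer bookkeeping (output_ofs, output_bs, output_nx, ...). */
typedef struct {
    double tap;        /* filter coefficient, reduced to one tap here */
    double decimation; /* per-instance parameter                      */
    double out;        /* stands in for the instance's output buffer  */
} fir_context;

/* Hypothetical contexts for two fir instances (fir1, fir2). */
static fir_context ctx[2] = {
    { 0.50, 2.0, 0.0 },
    { 0.25, 4.0, 0.0 },
};

/* Shared version of the kernel: every reference goes through the context. */
static void fir(int context, double input) {
    ctx[context].out = ctx[context].tap * input;
}

int main(void) {
    for (int i = 0; i < 2; i++) fir(0, (double)i);  /* fir1 firings */
    for (int i = 0; i < 3; i++) fir(1, (double)i);  /* fir2 firings */
    printf("fir1.out = %g, fir2.out = %g\n", ctx[0].out, ctx[1].out);
    return 0;
}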

9 Code Size Overhead (on SPARC/Solaris)
Data access code size: without context = 4 bytes, with context = 40 bytes. Reference overhead = 36 bytes!
Without context:
..... = value;
ldd  [%fp + -336], %o0
With context:
..... = *(context_CGCRamp[context].value);
sethi %hi(0x20800), %o1
ld   [%o1+0x3c8], %o0
mov  %o0, %o2
sll  %o2, 2, %o1
add  %o1, %o0, %o1
sll  %o1, 3, %o0
add  %fp, -424, %o1
add  %o1, %o0, %o2
ld   [%o2 + 0x1c], %o0
ldd  [%o0], %o2
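At the C level the two access patterns differ roughly as sketched below: a direct load of static state versus indexing the context array and following a pointer field. Names and values are illustrative, not taken from the generated code:

#include <stdio.h>

/* Direct reference (no sharing): the actor reads its own static state,
   which compiles to a single load. */
static double value = 1.0;
static double use_direct(void) { return value; }

/* Context-routed reference (shared code): index into the context array,
   then follow a pointer field, costing several extra instructions. */
typedef struct { double *value; } ramp_context;
static double ramp_state = 2.0;
static ramp_context contexts[1] = { { &ramp_state } };
static double use_context(int context) { return *(contexts[context].value); }

int main(void) {
    printf("%g %g\n", use_direct(), use_context(0));
    return 0;
}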

10 Optimization 1: Code Sharing
- Multiple instances of the same kernel have their own contexts
- The kernel code must be transformed into a shared-version function
- Shared version
  - references are made only through the context variable
- Gain and cost of sharing
  - Gain = (# of instances - 1) × (code block size)
  - Cost = (# of instances) × (context variable size) + (code block overhead)
- Code sharing is performed only when the gain is larger than the cost

11 Seoul National University Decision Formula 1 >  (  -1)  (  -1)  >    >  +    >  context +  reference +    >  context +  shared (1)  = code sharing overhead =  context +  reference (2)  context =  p i (p i ), p i  ports where, (x) = 3*sizeof(int) + sizeof(pointer) (3)  reference =  t  {S,C,AS,AP} (  (t)  (t))  (t) = reference count  (t) = unit overhead t = type of reference (4)  = code block size (5)  = number of instances

12 Optimization 2: Adjusting the SAS
- Adjusting a single appearance schedule
  - 2(7A3B)5C ==> buffer requirement 51
  - 2(7A3B2C)C ==> buffer requirement 39
  - i.e., give up the single appearance schedule to reduce the data size
- BTLC (Binary Tree with Leaf Chain)
[Figure: BTLC for the schedule 2(7A3B)5C; every node is annotated with [input, inside, output] buffer requirements, e.g. [6,0,0], [0,0,3], [7,0,5], [0,0,21], [21,0,15]]
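A sketch of what a BTLC node might look like in C. Field names are illustrative; the slides only specify that leaves form a chain and that every node carries the [input, inside, output] buffer figures:

/* Binary Tree with Leaf Chain: internal nodes are looped clusters of the
   SAS, leaves are actors, and leaves are additionally linked left-to-right
   so neighbouring actors can be reached without walking the tree. */
typedef struct btlc_node {
    int loop_count;                   /* repetition count of this cluster  */
    struct btlc_node *left, *right;   /* children; NULL for a leaf (actor) */
    struct btlc_node *next_leaf;      /* leaf-chain link (leaves only)     */
    int buf_in, buf_inside, buf_out;  /* [input, inside, output] buffers   */
    const char *actor_name;           /* set for leaves                    */
} btlc_node;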

13 Computation of Buffer Requirements
Each BTLC node carries [I, W, O] = [input, inside, output]. When a left cluster L and a right cluster R are merged:
W = |O_L ∩ I_R|
I = |I_L ∪ I_R - W|
O = |O_L ∪ O_R - W|
[Figure: BTLC for 2(7A3B)5C with the [I, W, O] annotations propagated upward; the A-B edge buffers 21 tokens and the B-C edge buffers 30 tokens]
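Reading the merge rules above as set operations over edges (with |S| summing the tokens buffered on the edges in S), they can be coded roughly as below; the bitmask edge representation and the toy numbers are purely this sketch's choices:

#include <stdio.h>

/* Each cluster's inputs and outputs are sets of edge ids (bitmasks);
   tokens[] holds the buffered-token count per edge. */
#define NEDGES 8
static int tokens[NEDGES];

typedef struct { unsigned in, out; } edge_sets;

static int token_sum(unsigned set) {
    int sum = 0;
    for (int e = 0; e < NEDGES; e++)
        if (set & (1u << e)) sum += tokens[e];
    return sum;
}

/* Merge a left and a right cluster:
   W = |O_L ∩ I_R|, I = |I_L ∪ I_R - W|, O = |O_L ∪ O_R - W|. */
static void merge_buffers(edge_sets L, edge_sets R, int *I, int *W, int *O) {
    unsigned internal = L.out & R.in;             /* edges from L into R   */
    *W = token_sum(internal);
    *I = token_sum((L.in | R.in) & ~internal);    /* external inputs only  */
    *O = token_sum((L.out | R.out) & ~internal);  /* external outputs only */
}

int main(void) {
    /* Toy chain: edge 0 feeds L, edge 1 connects L to R, edge 2 leaves R. */
    tokens[0] = 6; tokens[1] = 21; tokens[2] = 30;
    edge_sets L = { 1u << 0, 1u << 1 }, R = { 1u << 1, 1u << 2 };
    int I, W, O;
    merge_buffers(L, R, &I, &W, &O);
    printf("[I, W, O] = [%d, %d, %d]\n", I, W, O);   /* [6, 21, 30] */
    return 0;
}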

14 Flowchart of Schedule Adjustment
From the SAS schedule: construct the BTLC, compute the buffer requirements, and find a candidate for adjustment. If one is found, adjust the schedule (split a chain) and recompute; otherwise the adjustment is done and code generation follows.
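The loop in the flowchart can be outlined in C as below; the helper functions are stubs standing in for the real passes, so this only demonstrates the control flow:

#include <stdbool.h>
#include <stdio.h>

/* Stubs standing in for the real passes over the BTLC. */
static void compute_buffer_requirement(void) { puts("compute [I,W,O]"); }
static bool find_split_candidate(void) { static int left = 2; return left-- > 0; }
static void split_chain(void)          { puts("split the selected chain"); }
static void generate_code(void)        { puts("generate C code"); }

int main(void) {
    /* construct the BTLC from the SAS schedule (omitted) */
    for (;;) {
        compute_buffer_requirement();
        if (!find_split_candidate())
            break;                     /* no candidate: adjustment is done */
        split_chain();
    }
    generate_code();
    return 0;
}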

15 Splitting a Chain
[Figure: BTLC for the schedule 2(7A3B)5C with [I, W, O] annotations; the split point is the B-C chain with buffer 30]
- Finding a split candidate
  - pick the chain with the largest buffer requirement
  - in this example the BC chain is selected
- Schedule after splitting
  - 2(7A3B2C)C
- In general, for a schedule with two adjacent clusters aCa bCb (a and b are loop counts), the new schedule is
  - a(Ca (b/a)Cb) ((b%a)Cb), if a < b
  - ((a%b)Ca) b((a/b)Ca Cb), otherwise
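The splitting rule can be checked with a few lines of C; Ca and Cb are passed as already-formatted schedule strings, and a remainder count of 1 is printed explicitly (so 2(7A3B2C)C appears as 2((7A3B)2C)1C):

#include <stdio.h>

/* Print the adjusted schedule for two adjacent clusters a*Ca b*Cb. */
static void split(int a, const char *Ca, int b, const char *Cb) {
    if (a < b)
        printf("%d(%s%d%s)%d%s\n", a, Ca, b / a, Cb, b % a, Cb);
    else
        printf("%d%s%d(%d%s%s)\n", a % b, Ca, b, a / b, Ca, Cb);
}

int main(void) {
    split(2, "(7A3B)", 5, "C");   /* prints 2((7A3B)2C)1C */
    return 0;
}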

16 Decision Formula
[Figure: BTLC before and after the split; the former B-C buffer of 30 is replaced by buffers of 12 and 6]
New schedule: 2(7A3B2C)C
Gain = 12
|Cluster| = the |W| (inside) value of the cluster

17 Experiment: CD2DAT
[Figure: successive BTLCs of the CD2DAT graph with [I, W, O] annotations at every node; repeated schedule adjustment reduces the inside buffer requirement of the root cluster from 280 to 35]

18 Experimental Result
Program size after each optimization:
                        CD2DAT   Filter Bank
SAS                      13672        28512
Code Sharing             12768        22024
Schedule Adjustment      12296        22024
Memory behavior of CD2DAT on ARM7:
                         Fetches     Misses
SAS                     17098177      57189
Code Sharing            17573923      52867
Schedule Adjustment     17499386      54331

19 Conclusion
- Our environment
  - PeaCE: Ptolemy extension as a Codesign Environment
- Optimization techniques in software synthesis
  - for automatic code generation from a dataflow graph
  - joint minimization of code and data size
  - selective application of code sharing and schedule adjustment to an SAS
- Future work
  - clustering: merge multiple fine-grain nodes into a larger one to increase the chance of code sharing
  - buffer sharing: further reduce the buffer size and improve cache behavior

20 Thank You!

