Chapter 5 Unfolding.


1 Chapter 5 Unfolding

2 Definitions
Unfolding transforms a loop so that several consecutive iterations of the original loop are executed in a single iteration of the new loop.
Also known as (a.k.a.):
  Loop unrolling (in compilers for parallel programs)
  Block processing
Applications:
  Reducing the sampling period to achieve the iteration bound T (the desired throughput rate).
  Parallel (block) processing, to execute several iterations concurrently.
  Digit-serial or bit-serial processing.
(C) by Yu Hen Hu

3 An example
Block processing formulation
J = 3, 9/J = 3 (an integer):
  X(k) = [x(3k) x(3k+1) x(3k+2)]^T
  Y(k) = [y(3k) y(3k+1) y(3k+2)]^T
  Y(k) = a*Y(k-3) + X(k)
J = 2, 9/J = 4.5 (not an integer):
  X(k) = [x(2k) x(2k+1)]^T
  Y(k) = [y(2k) y(2k+1)]^T
  No single block delay covers both outputs: y(2k) needs y(2k-9), an element of Y(k-5), while y(2k+1) needs y(2k-8), an element of Y(k-4).
Before unfolding:
  For n = 0 to N-1,
    y(n) = a*y(n-9) + x(n)
  end
Unfolding once (J = 2):
  For k = 0 to N/2-1,
    y(2k) = a*y(2k-9) + x(2k)
    y(2k+1) = a*y(2k-8) + x(2k+1)
  end
Unfolding twice (J = 3):
  For k = 0 to N/3-1,
    y(3k) = a*y(3k-9) + x(3k)
    y(3k+1) = a*y(3k-8) + x(3k+1)
    y(3k+2) = a*y(3k-7) + x(3k+2)
  end
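The loops on this slide can be checked numerically. The sketch below is not from the slides: it assumes zero initial conditions (y(n) = 0 for n < 0) and that N is a multiple of J, and the function names `original` and `unfolded` are made up for illustration. It shows that the J = 2 and J = 3 unfolded loops produce the same output as the original loop.

```python
def original(x, a):
    """Reference loop: y(n) = a*y(n-9) + x(n), assuming y(n) = 0 for n < 0."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        prev = y[n - 9] if n >= 9 else 0.0
        y[n] = a * prev + x[n]
    return y


def unfolded(x, a, J):
    """J-unfolded loop: each pass over k computes J consecutive outputs."""
    assert len(x) % J == 0, "this sketch assumes N is a multiple of J"
    y = [0.0] * len(x)
    for k in range(len(x) // J):
        for i in range(J):  # stands in for the J statements written out per iteration
            n = J * k + i
            prev = y[n - 9] if n >= 9 else 0.0
            y[n] = a * prev + x[n]
    return y


x = [float(n % 5) for n in range(36)]  # arbitrary test input, N = 36
a = 0.5
assert unfolded(x, a, 2) == original(x, a)
assert unfolded(x, a, 3) == original(x, a)
```

The inner loop over i is only a compact way of writing the J statements; in hardware each value of i corresponds to its own copy of the multiply-add unit.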

4 Implementation with J=3
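[Figure: block diagram of the J = 3 implementation. Inputs x(0), x(1), x(2), x(3), x(4), x(5), ... arrive at sample period Ts and pass through a serial-to-parallel converter; three parallel branches, each with an adder (+), a multiplier (X) and a delay (D), run at clock period 3Ts; a parallel-to-serial converter emits y(0), y(1), y(2), y(3), y(4), y(5), ... at sample period Ts.]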

5 Unfolding the DFG
Rewrite the algorithm formulation
  y(2k) = a*y(2k-9) + x(2k)
  y(2k+1) = a*y(2k-8) + x(2k+1)
as
  y(2k) = a*y(2(k-5)+1) + x(2k)
  y(2k+1) = a*y(2(k-4)) + x(2k+1)
After J-fold unfolding, the clock period is T = J Ts, where Ts is the data sampling period.
[Figure: the original DFG, labelled T = Ts, beside the unfolded DFG, labelled T = J Ts.]
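The index rewriting on this slide is an integer division with remainder. As a minimal sketch (the helper name `split_index` is hypothetical, not from the slides), the term y(Jk + i - w) can be written as y(J(k - d) + r) with 0 <= r < J:

```python
def split_index(i, w, J):
    """Rewrite y(J*k + i - w) as y(J*(k - d) + r) and return (d, r)."""
    q, r = divmod(i - w, J)  # Python's divmod floors, so 0 <= r < J
    return -q, r


for i in range(2):
    d, r = split_index(i, w=9, J=2)
    print(f"y(2k+{i}) reads y(2(k-{d})+{r})")
```

For w = 9 and J = 2 this gives (d, r) = (5, 1) for i = 0 and (4, 0) for i = 1, matching the rewritten equations. For J = 3 every i gives d = 3, which is why the block recurrence Y(k) = a*Y(k-3) + X(k) holds in that case.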

6 Timing Diagram
[Figure: timing diagram. At T = Ts, one row of outputs y(0), y(1), ..., y(13). At T = 2Ts, two rows: even-indexed outputs y(0), y(2), ..., y(12) and odd-indexed outputs y(1), y(3), ..., y(13). Dependence spans are labelled 9T, 4T and 5T.]
The timing diagram above assumes that the sampling period Ts remains unchanged, so the clock period T increases J-fold. Since 9/2 is not an integer, the outputs y(0) and y(1) of one iteration are needed by two different future iterations, 4T and 5T later (to compute y(9) and y(10), respectively).

7 General DFG Unfolding Method
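Define: a % b is the remainder of a divided by b, and floor(x) is the largest integer not exceeding x.
Step 1. For each node U in the original DFG, draw J nodes {U_i; 0 ≤ i ≤ J-1} in the unfolded DFG.
Step 2. For each edge from U to V with w delays, draw J edges, from U_i to V_((i+w)%J) with floor((i+w)/J) delays, for 0 ≤ i ≤ J-1.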
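The two steps translate directly into a few lines of code. The sketch below is an illustration under assumptions, not the author's implementation: the DFG is represented as a list of (U, V, w) edge tuples, unfolded nodes are named by (name, index) pairs, and the function name `unfold` is made up.

```python
def unfold(edges, J):
    """Return the edge list of the J-unfolded DFG.

    edges: list of (U, V, w) tuples, one per edge U -> V carrying w delays.
    Node U_i of the unfolded DFG is represented as the pair (U, i).
    """
    unfolded = []
    for u, v, w in edges:
        for i in range(J):
            # Step 2: U_i -> V_((i+w) % J) with floor((i+w)/J) delays
            unfolded.append(((u, i), (v, (i + w) % J), (i + w) // J))
    return unfolded


# Hypothetical one-node loop for y(n) = a*y(n-9) + x(n): a single edge A -> A with 9 delays.
for src, dst, delays in unfold([("A", "A", 9)], J=2):
    print(src, "->", dst, "delays:", delays)
```

For this loop the two unfolded edges carry 4 and 5 delays, which agrees with the 4T and 5T spans on the timing-diagram slide.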

8 Another DFG Unfolding Example
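J = 2
[Figure: the original DFG with nodes Q, R, S, T, delay edges labelled 3D and 2D, and T = 3, beside the unfolded DFG with nodes Q0, Q1, R0, R1, S0, S1, T0, T1. A table with columns i, w and (i+w)%J accompanies the figure.]
Step 1. Draw J copies of each node.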

9 Another DFG Unfolding Example
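J = 2
[Figure: the same original DFG (nodes Q, R, S, T; delay edges 3D and 2D; T = 3) and unfolded nodes Q0, Q1, R0, R1, S0, S1, T0, T1, now with the zero-delay edges drawn. Table columns: i, w, (i+w)%J.]
Step 2. Add all edges with 0 delays on them.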

10 Another DFG Unfolding Example
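J = 2
[Figure: the same original DFG (nodes Q, R, S, T; delay edges 3D and 2D; T = 3) beside the completed unfolded DFG (nodes Q0, Q1, R0, R1, S0, S1, T0, T1), whose delayed edges are labelled D or 2D; T = 6 for the unfolded DFG. Table columns: i, w, (i+w)%J.]
Step 3. Use the table on the left to work out the edges with delays.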

11 Properties of Unfolding
Unfolding preserves the number of registers (delays) in a DFG.
A loop with w delays in a DFG that is unfolded J times leads to g.c.d.(w, J) loops in the unfolded DFG. Each of these loops contains w/g.c.d.(w, J) delays and J/g.c.d.(w, J) copies of each node that appears in the original loop.
Unfolding a DFG with iteration bound T results in a J-unfolded DFG with iteration bound JT.
A path with w (< J) delays in a DFG leads to J-w paths with no delays and w paths with 1 delay each in the J-unfolded DFG.
Any path in the original DFG containing J or more delays leads to J paths with 1 or more delays in each path. Therefore, it cannot create a critical path in the J-unfolded DFG.
Any clock period that can be achieved by retiming a J-unfolded DFG can also be achieved by retiming the original DFG and then J-unfolding it.
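The first two properties can be spot-checked on the simplest case, a one-node loop with w delays. The sketch below is a numerical check under that assumption only, not a proof and not from the slides; the function name `loops` is made up. It applies the unfolding rule (copy i feeds copy (i+w)%J through floor((i+w)/J) delays) and traces the resulting loops.

```python
from math import gcd


def loops(w, J):
    """Trace the loops produced by J-unfolding a one-node loop with w delays.

    Copy i feeds copy (i + w) % J through (i + w) // J delays.
    Returns one (node_count, delay_count) pair per loop found.
    """
    seen, found = set(), []
    for start in range(J):
        if start in seen:
            continue
        node, nodes, delays = start, 0, 0
        while node not in seen:
            seen.add(node)
            nodes += 1
            delays += (node + w) // J
            node = (node + w) % J
        found.append((nodes, delays))
    return found


for w, J in [(9, 2), (9, 3), (6, 4), (5, 4)]:
    g = gcd(w, J)
    result = loops(w, J)
    assert result == [(J // g, w // g)] * g  # gcd(w, J) loops of the stated size
    assert sum(d for _, d in result) == w    # total delay count is preserved
```

With w = 9, J = 3 gives three separate loops of 3 delays each, while J = 2 gives one loop that passes through both copies and still holds all 9 delays.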

