Download presentation

Presentation is loading. Please wait.

Published byRosemary Peel Modified over 3 years ago

1
Folding and Pipelining complex combinational circuits Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February 22, 2012L05-1http://csg.csail.mit.edu/6.S078

2
Combinational IFFT: Suppose we want to reduce the area of the circuit in0 … in1 in2 in63 in3 in4 Bfly4 x16 Bfly4 … … out0 … out1 out2 out63 out3 out4 Permute Reuse the same circuit three times to reduce area Folding February 22, 2012 L05-2http://csg.csail.mit.edu/6.S078

3
fg Reusing a combinational block we expect: Throughput to Area to ff g decrease – less parallelism The clock needs to run faster for the same throughput hyper-linear increase in energy decrease – reusing a block February 21, 2012 L04-3http://csg.csail.mit.edu/6.S078 Introduce state elements to hold intermediate values

4
Circular or folded pipeline: Reusing the Pipeline Stage in0 … in1 in2 in63 in3 in4 out0 … out1 out2 out63 out3 out4 … Bfly4 Permute Stage Counter February 22, 2012 L05-4http://csg.csail.mit.edu/6.S078

5
Folded pipeline rule folded-pipeline (True); if (stage==0) begin sxIn= inQ.first(); inQ.deq(); end else sxIn= sReg; sxOut = f(stage,sxIn); if (stage==n-1) outQ.enq(sxOut); else sReg <= sxOut; stage <= (stage==n-1)? 0 : stage+1; endrule x sReg inQ f outQ stage no for- loop Need type declarations for sxIn and sxOut February 22, 2012 L05-5http://csg.csail.mit.edu/6.S078

6
BSV Code for stage_f function Vector#(64, Complex) stage_f (Bit#(2) stage, Vector#(64, Complex) stage_in); begin for (Integer i = 0; i < 16; i = i + 1) begin Integer idx = i * 4; let twid = getTwiddle(stage, fromInteger(i)); let y = bfly4(twid, stage_in[idx:idx+3]); stage_temp[idx] = y[0]; stage_temp[idx+1] = y[1]; stage_temp[idx+2] = y[2]; stage_temp[idx+3] = y[3]; end //Permutation for (Integer i = 0; i < 64; i = i + 1) stage_out[i] = stage_temp[permute[i]]; end return(stage_out); February 22, 2012 L05-6http://csg.csail.mit.edu/6.S078

7
Folded pipeline-multiple rules another way of expressing the same computation rule foldedEntry if (stage==0); sReg <= f(stage, inQ.first()); stage <= stage+1; inQ.deq(); endrule rule foldedCirculate if (stage!=0)&(stage<(n-1)); sReg <= f(stage, sReg); stage <= stage+1; endrule rule foldedExit if (stage==n-1); outQ.enq(f(stage, sReg)); stage <= 0; endrule x sReg inQ f outQ stage Disjoint firing conditions February 22, 2012 L05-7http://csg.csail.mit.edu/6.S078

8
Superfolded circular pipeline: use just one Bfly-4 node! in0 … in1 in2 in63 in3 in4 out0 … out1 out2 out63 out3 out4 Bfly4 Permute Index == 15? Index: 0 to 15 64, 2-way Muxes 4, 16-way Muxes 4, 16-way DeMuxes Stage 0 to 2 February 21, 2012 L04-8http://csg.csail.mit.edu/6.S078 Lab 3

9
Combinational IFFT Lot of area and long combinational delay Folded or multi-cycle version can save area and reduce the combinational delay but throughput per clock cycle gets worse Pipelining: a method to increase the circuit throughput to evaluate multiple IFFTs in0 … in1 in2 in63 in3 in4 Bfly4 x16 Bfly4 … … out0 … out1 out2 out63 out3 out4 Permute February 22, 2012 L05-9http://csg.csail.mit.edu/6.S078

10
Pipelining a block inQoutQ f2f1 f3 Combinational C inQoutQ f2f1 f3 Pipeline P inQoutQ f Folded Pipeline FP Clock? Area?Throughput? February 22, 2012 L05-10http://csg.csail.mit.edu/6.S078

11
Inelastic pipeline x sReg1inQ f0f1f2 sReg2outQ rule sync-pipeline (True); inQ.deq(); sReg1 <= f0(inQ.first()); sReg2 <= f1(sReg1); outQ.enq(f2(sReg2)); endrule This rule can fire only if February 22, 2012 L05-11http://csg.csail.mit.edu/6.S078

12
Stage functions f1, f2 and f3 function f0(x); return (stage_f(0,x)); endfunction function f1(x); return (stage_f(1,x)); endfunction function f2(x); return (stage_f(2,x)); endfunction The stage_f function was given earlier February 22, 2012 L05-12http://csg.csail.mit.edu/6.S078

13
Problem: What about pipeline bubbles? x sReg1inQ f0f1f2 sReg2outQ rule sync-pipeline (True); inQ.deq(); sReg1 <= f0(inQ.first()); sReg2 <= f1(sReg1); outQ.enq(f2(sReg2)); endrule February 22, 2012 L05-13http://csg.csail.mit.edu/6.S078

14
Explicit encoding of Valid/Invalid data rule sync-pipeline (True); if (inQ.notEmpty()) begin sReg1 <= f0(inQ.first()); inQ.deq(); sReg1f <= Valid end else sReg1f <= Invalid; sReg2 <= f1(sReg1); sReg2f <= sReg1f; if (sReg2f == Valid) outQ.enq(f2(sReg2)); endrule sReg1 sReg2 x inQ f0f1f2 outQ Valid/Invalid February 22, 2012 L05-14http://csg.csail.mit.edu/6.S078

15
When is this rule enabled? inQ sReg1f sReg2foutQ NEVVNF NEVVF NEVINF NEVIF NEIVNF NEIVF NEIINF NEIIF EVVNF EVVF EVINF EVIF EIVNF EIVF EIINF EIIF Yes Yes1 = yes but no change inQ sReg1f sReg2foutQ February 22, 2012 L05-15http://csg.csail.mit.edu/6.S078 rule sync-pipeline (True); if (inQ.notEmpty()) begin sReg1 <= f0(inQ.first()); inQ.deq(); sReg1f <= Valid end else sReg1f <= Invalid; sReg2 <= f1(sReg1); sReg2f <= sReg1f; if (sReg2f == Valid) outQ.enq(f2(sReg2)); endrule

16
Area estimates Tool: Synopsys Design Compiler Comb. FFT Combinational area:16536 Noncombinational area: 9279 Linear FFT Combinational area: 20610 Noncombinational area: 18558 Circular FFT Combinational area: 29330 Noncombinational area: 11603 February 22, 2012 L05-16http://csg.csail.mit.edu/6.S078 Surprising? Explanation?

17
The Maybe type data in the pipeline typedef union tagged { void Invalid; data_T Valid; } Maybe#(type data_T); data valid/invalid Registers contain Maybe type values rule sync-pipeline (True); if (inQ.notEmpty()) begin sReg1 <= tagged Valid f0(inQ.first()); inQ.deq(); end else sReg1 <= tagged Invalid; case (sReg1) matches tagged Valid.sx1: sReg2 <= tagged Valid f1(sx1); tagged Invalid: sReg2 <= tagged Invalid; endcase case (sReg2) matches tagged Valid.sx2: outQ.enq(f2(sx2)); endcase endrule February 22, 2012 L05-17http://csg.csail.mit.edu/6.S078

Similar presentations

OK

March 6, 2006http://csg.csail.mit.edu/6.375/L10-1 Bluespec-4: Rule Scheduling and Synthesis Arvind Computer Science & Artificial Intelligence Lab Massachusetts.

March 6, 2006http://csg.csail.mit.edu/6.375/L10-1 Bluespec-4: Rule Scheduling and Synthesis Arvind Computer Science & Artificial Intelligence Lab Massachusetts.

© 2018 SlidePlayer.com Inc.

All rights reserved.

To ensure the functioning of the site, we use **cookies**. We share information about your activities on the site with our partners and Google partners: social networks and companies engaged in advertising and web analytics. For more information, see the Privacy Policy and Google Privacy & Terms.
Your consent to our cookies if you continue to use this website.

Ads by Google

Ppt on area of a parallelogram Ppt on modern science and technology Ppt on five generation of computers Ppt on decimals for class 7 Ppt on indian politics latest Ppt on solid dielectrics industries Ppt on industrial public address system By appt only movie page Ppt on landforms for 4th grade Ppt on low level language and high level