Sabyasachi Das Synplicity Inc Sunil P. Khatri Texas A&M University


1 A Timing-Driven Hybrid-Compression Algorithm for Faster Sum-of-Products
Sabyasachi Das (Synplicity Inc), Sunil P. Khatri (Texas A&M University)

2 What is a Sum-of-Product?
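An IC block that performs addition of multiple product and sum terms.
Computationally intensive.
Widely used in DSP, graphics, and microprocessors.
Example: p = a * b, q = c * d, z = p + q + e + f.
[Figure: block diagram with inputs a, b, c, d, e, f, intermediate products p and q, and output z]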

3 Examples of Sum-of-Product Blocks
Multiplication {assign z = a * b}
MAC {assign z = (a * b) + c}
2-operand Addition {assign z = a + b}
Squarer {assign z = a * a}
Adder-Tree {assign z = a + b + c + d}
Generalized SOP {assign z = (a * b) + (c * d) + (e * f) + g + h + k}

4 Structure of Sum-of-Products
A Sum-of-Products block consists of 3 parts (listed in data-flow order):
Partial Product Generator (PPGen)
Partial Product Reduction Tree (PPRT)
Final Carry-Propagation Adder (CPA)
[Figure: Inputs -> PPGen -> PPRT -> CPA -> Output]
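To make the first stage concrete, here is a minimal Python sketch of a partial product generator for an expression such as z = (a * b) + (c * d) + e + f. It is an illustration only, not the authors' implementation: the function names `ppgen` and `column_value`, the unsigned operands, and the fixed `width` are all assumptions. Each column holds the bits of one binary weight, and the PPRT and CPA together must preserve the weighted sum of those columns.

```python
def ppgen(products, addends, width):
    """Hypothetical PPGen: return bit columns, where columns[i] has weight 2**i.

    products is a list of (x, y) operand pairs to multiply,
    addends is a list of plain summands. All operands are unsigned
    and assumed to fit in `width` bits.
    """
    columns = [[] for _ in range(2 * width)]
    for x, y in products:
        for i in range(width):
            for j in range(width):
                # AND of bit i of x and bit j of y lands in column i + j.
                columns[i + j].append((x >> i) & (y >> j) & 1)
    for v in addends:
        for i in range(width):
            columns[i].append((v >> i) & 1)
    return columns


def column_value(columns):
    """Weighted sum of all bits; the PPRT and CPA must preserve this value."""
    return sum(sum(col) << i for i, col in enumerate(columns))


# z = (a * b) + (c * d) + e + f with 4-bit operands
cols = ppgen([(3, 5), (7, 2)], [9, 4], width=4)
assert column_value(cols) == 3 * 5 + 7 * 2 + 9 + 4
tallest = max(len(col) for col in cols)  # height the PPRT must bring down to 2
```

The tallest column is what the reduction tree has to compress to at most two elements before the final adder can run.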

5 Partial Product Reduction Tree
In the Partial Product Reduction Tree (PPRT), the number of elements in each bit column is reduced to at most two.
The PPRT accounts for more than 50% of the delay of the SOP block.
Hence the performance of the PPRT is crucial to the performance of the SOP block.

6 Two Reduction Counters in PPRT
(2:2) Counter: reduces 2 inputs (a_i and b_i) to 2 outputs (S_i and C_{i+1}).
(3:2) Counter: reduces 3 inputs (a_i, b_i and c_i) to 2 outputs (S_i and C_{i+1}).

7 4:3 Reduction Counters
(4:3) Counter: reduces 4 inputs (a_i, b_i, c_i and d_i) to 3 outputs (S_i, C_{i+1} and C_{i+2}).
The C_{i+2} output is simply a 4-input AND gate.
Gives faster reduction at the i-th column.
Produces an element for the (i+2)-th column at an earlier time.
Has a larger area than the other two counters.
Key idea: use the (4:3) counter as much as possible, in conjunction with the (3:2) and (2:2) counters.
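The three counters can be modelled at the bit level as follows. This is a behavioural sketch in Python, not the gate-level design from the talk. The function names are hypothetical, and the expression for C_{i+1} in the (4:3) counter is derived here from the arithmetic (it is set when exactly two or three inputs are 1); only the 4-input AND for C_{i+2} is stated on the slide.

```python
from itertools import product


def counter_2_2(a, b):
    """(2:2) counter (half adder): returns (S_i, C_{i+1})."""
    return a ^ b, a & b


def counter_3_2(a, b, c):
    """(3:2) counter (full adder): returns (S_i, C_{i+1})."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def counter_4_3(a, b, c, d):
    """(4:3) counter: returns (S_i, C_{i+1}, C_{i+2})."""
    s = a ^ b ^ c ^ d
    c2 = a & b & c & d                 # 4-input AND, as stated on the slide
    ones = a + b + c + d
    c1 = 1 if ones in (2, 3) else 0    # derived here, not taken from the slide
    return s, c1, c2


# Exhaustive check: the outputs always encode the number of 1s on the inputs.
for bits in product((0, 1), repeat=4):
    s, c1, c2 = counter_4_3(*bits)
    assert s + 2 * c1 + 4 * c2 == sum(bits)
```

Because four inputs can sum to at most 4 (binary 100), the weight-4 output is 1 only when every input is 1, which is why C_{i+2} needs nothing more than an AND gate.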

8 Explanation of our approach
Perform column-wise reduction (LSB to MSB).
For each column (or BitSlice/BitCluster):
  Sort inputs based on arrival time.
  Is (2:2) reduction fast? If yes, instantiate it.
  Else, is (3:2) reduction fast? If yes, instantiate it.
  Else, instantiate (4:3) reduction.
  After each reduction, re-sort the signals and continue.
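The loop above can be sketched in Python as shown below. Several things in it are assumptions rather than details from the talk: the `DELAY` values are made up, all outputs of a counter are given the same arrival time, and the test for "fast" is a guess (a counter is taken if the next signal in the column would not arrive before that counter has already produced its outputs). Treat it as one plausible reading of the selection order, not the authors' algorithm.

```python
# Hypothetical delays per counter size; real values would come from a cell library.
DELAY = {2: 1.0, 3: 2.0, 4: 3.0}


def count(bits):
    """Behavioural counter: binary count of the 1s in bits, LSB first."""
    total = sum(bits)
    return [(total >> j) & 1 for j in range(2 if len(bits) < 4 else 3)]


def choose_size(sigs):
    """sigs is sorted by arrival time and has at least 3 entries.
    Try (2:2), then (3:2), else fall back to (4:3)."""
    for k in (2, 3):
        out_time = sigs[k - 1][0] + DELAY[k]
        # Assumed meaning of "fast": no further signal is ready in time.
        if len(sigs) == k or sigs[k][0] > out_time:
            return k
    return 4


def reduce_columns(columns):
    """columns[i] is a list of (arrival_time, bit) signals of weight 2**i.
    Returns columns with at most two signals each."""
    cols = [list(c) for c in columns]
    i = 0
    while i < len(cols):                      # LSB to MSB
        while len(cols[i]) > 2:
            cols[i].sort()                    # re-sort by arrival time
            k = choose_size(cols[i])
            used, cols[i] = cols[i][:k], cols[i][k:]
            ready = used[-1][0] + DELAY[k]    # latest input plus counter delay
            for j, bit in enumerate(count([b for _, b in used])):
                while len(cols) <= i + j:
                    cols.append([])
                cols[i + j].append((ready, bit))  # S_i, C_{i+1}, (C_{i+2})
        i += 1
    return cols


# Usage: partial products of 13 * 11, with later arrival for higher bits of a.
pp = [[] for _ in range(8)]
for i in range(4):
    for j in range(4):
        pp[i + j].append((float(i), (13 >> i) & (11 >> j) & 1))
reduced = reduce_columns(pp)
assert all(len(c) <= 2 for c in reduced)
assert sum(b << i for i, c in enumerate(reduced) for _, b in c) == 13 * 11
```

The two remaining rows are what the final carry-propagation adder would sum. Carrying the bit values alongside the arrival times makes it easy to check that the reduction never changes the arithmetic result.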

9 An example of our approach
[Figure: dot diagram of partial-product rows (labelled P00 through P50) being reduced column by column, with generated sum signals (S00, S10, S01, S02) and carry signals (such as C02 and C13)]

10 Results
On average, our approach produces about a 3.5% speed improvement with a 4.3% area penalty.

11 Summary
A 4:3 reduction counter is designed.
  Reduces elements in the given column at a faster pace.
  Produces an element for the (i+2)-th column at an earlier time.
The 4:3 reduction counter is used extensively (in conjunction with the existing 3:2 and 2:2 counters).
A timing-driven algorithm selects the type of counter to instantiate.
On average, 3.5% improvement in speed with a 4.3% area penalty.

12 Thank you

