Optimizing Stochastic Circuits for Accuracy-Energy Tradeoffs. Armin Alaghi (3), Wei-Ting J. Chan (1), John P. Hayes (3), Andrew B. Kahng (1,2) and Jiajia Li (1).


1 0 Optimizing Stochastic Circuits for Accuracy-Energy Tradeoffs. Armin Alaghi (3), Wei-Ting J. Chan (1), John P. Hayes (3), Andrew B. Kahng (1,2) and Jiajia Li (1). Affiliations: (1) ECE and (2) CSE Depts., UC San Diego; (3) EECS Dept., University of Michigan.

2 1 Outline: Background and Previous Work; Problem Statement in SC Physical Design; Modeling Approach; Optimization Approach; Conclusions

3 2 Motivation: Low Power Challenge. Low-power design is a grand challenge: mobile devices must operate at extremely low power even as the performance requirements of applications grow, and voltage scaling has slowed in recent years. Possible solution: employ new design paradigms to overcome these challenges and achieve performance improvements. Figure annotations: 4 W mobile platform power requirement; 1 W SoC power requirement; slow performance improvement due to the power limit and slow voltage scaling. [Source: ITRS]

4 3 New Paradigm: Stochastic Computing (SC). Stochastic computing (SC) is a design paradigm that has recently gained attention due to its low power and error tolerance. Random bit streams represent operands, and complex arithmetic operations are implemented by simple logic circuits. Example: with X1 = 4/8 and X2 = 6/8, Z = X1 × X2 = 4/8 × 6/8 = 3/8.
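
As a minimal sketch of the multiplication on this slide: in the usual unipolar SC encoding, a stream's value is the fraction of 1s, and a bitwise AND of two independent streams multiplies their values. The concrete 8-bit patterns and the helper names (`stream_value`, `sc_multiply`, `random_stream`) are illustrative assumptions, not taken from the slide figure.

```python
import random

def stream_value(bits):
    """Value of a unipolar stochastic stream: the fraction of 1s."""
    return sum(bits) / len(bits)

def sc_multiply(x1, x2):
    """Multiply two stochastic numbers with one AND gate per bit position."""
    return [a & b for a, b in zip(x1, x2)]

def random_stream(p, n, rng):
    """Hypothetical stream generator: each bit is 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

# Hand-picked 8-bit streams (assumed patterns) with values 4/8 and 6/8.
x1 = [0, 1, 1, 0, 1, 0, 1, 0]
x2 = [1, 1, 1, 0, 1, 1, 0, 1]
z = sc_multiply(x1, x2)
print(stream_value(x1), stream_value(x2), stream_value(z))

# With long independent random streams the product is only approximate.
rng = random.Random(0)
long_z = sc_multiply(random_stream(4 / 8, 4096, rng),
                     random_stream(6 / 8, 4096, rng))
print(stream_value(long_z))  # close to 3/8, not exactly 3/8
```

The hand-picked streams reproduce 3/8 exactly only because their 1s happen to overlap in three positions; in general the result fluctuates around the product, which is the accuracy issue the rest of the deck addresses.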

5 4 Error Tolerance, Precision, and Accuracy. Number to represent: 5/16. Stochastic: 0010 0001 0101 0010. Binary: 0.0101. Bit-stream length grows exponentially with precision. Redundant representation provides error tolerance. Inaccurate computation may occur. Figure annotation: Correct = 3/8.
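
The two encodings of 5/16 on this slide can be checked directly, and the same sketch illustrates why the redundant stochastic form tolerates errors: a single flipped bit moves the stochastic value by 1/16, while a flipped most-significant bit moves the binary value by 1/2. The helper names are assumptions made for this illustration.

```python
def stochastic_value(bits):
    """Fraction of 1s in a stochastic bit stream given as a string."""
    return bits.count("1") / len(bits)

def binary_fraction_value(bits):
    """Value of the fractional bits b1 b2 ... interpreted as 0.b1b2..."""
    return sum(int(b) / 2 ** (i + 1) for i, b in enumerate(bits))

def flip(bits, i):
    """Return a copy of bits with position i inverted (a single-bit error)."""
    return bits[:i] + ("0" if bits[i] == "1" else "1") + bits[i + 1:]

def worst_single_flip_error(bits, value_fn):
    """Largest change in value caused by any one flipped bit."""
    exact = value_fn(bits)
    return max(abs(value_fn(flip(bits, i)) - exact) for i in range(len(bits)))

stochastic = "0010000101010010"  # the 16-bit stream from the slide
binary = "0101"                  # fractional bits of 0.0101 from the slide

print(stochastic_value(stochastic), binary_fraction_value(binary))
print(worst_single_flip_error(stochastic, stochastic_value))
print(worst_single_flip_error(binary, binary_fraction_value))

# n-bit precision needs a stream of 2**n bits: 4 bits -> 16, 8 bits -> 256.
print([2 ** n for n in (4, 8)])
```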

6 5 Area, Computation Efficiency, and Delay. Figure: stochastic multiplier vs. conventional binary multiplier, with the critical path marked. SC: smaller area, longer computation latency, and shorter critical path.

7 6 Application Context of SC. Stochastic representation is similar to analog “pulse-mode” signals, as well as neural signals. A stochastic computing circuit performs cheap preprocessing and saves resources: low-cost preprocessing between two domains.

8 7 Summary of Advantages/Disadvantages. Advantages: low-complexity circuits (allowing massive parallelism); error tolerance; robustness to voltage scaling (explored and improved in this work). Disadvantages: long computation time; limited precision; expensive conversion circuits and storage elements.

9 8 Outline: Background and Previous Work; Problem Statement in SC Physical Design; Modeling Approach; Optimization Approach; Conclusions

10 9 Challenges, Problems, and Our Contributions. Challenges of stochastic computing (SC) design: the current digital design flow does not comprehend the tradeoff between accuracy and power in SC; physical implementation of SC circuits has not been well explored. Problems: What is an efficient way to estimate error when exhaustive simulation is not feasible? Given a synthesized SC circuit, what is the physical implementation recipe? Our contributions: we introduce the delay matching problem in SC; we reduce the computation error by balancing delay paths; we propose a Markov chain model for error estimation.

11 10 Stochastic Computing: Scope of Study. Design Metrics: Energy; Accuracy (new model is proposed in this work); Circuit area. Design Parameters: Computation latency (N); Frequency scaling (f); Voltage scaling (V); Netlist Implementation (new optimization is proposed in this work). Legend: metrics covered in this work.

12 11 Outline: Background and Previous Work; Problem Statement in SC Physical Design; Modeling Approach; Optimization Approach; Conclusions

13 12 Balance of Path Delay Matters. Three scenarios of signal transitions (inputs x1 and x0, output z, captured at the sample clock): (A) Ideal: stable states of logic values are captured (correct). (B) Balanced delay: all the transitions arrive at the same time (correct). (C) Unbalanced delay: glitches or delayed transitions cause extra errors (error).
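
The three scenarios can be imitated with a deliberately simplified cycle-level model: an input that arrives late still shows its previous-cycle bit when the sample clock fires. The gate choice is an assumption made to expose the effect: an XOR fed by two maximally correlated streams (both derived from one shared random source) computes |p1 - p0|, so mixing bits from different cycles visibly corrupts the result. None of the names or parameters below come from the slides, and this is not the authors' timing model.

```python
import random

def correlated_streams(p1, p0, n, rng):
    """Two streams derived from one shared random number per cycle."""
    x1, x0 = [], []
    for _ in range(n):
        r = rng.random()
        x1.append(1 if r < p1 else 0)
        x0.append(1 if r < p0 else 0)
    return x1, x0

def capture_xor(x1, x0, late1, late0):
    """Bits captured at the sample clock.

    A late input contributes its previous-cycle bit (in cycle 0 there is
    no previous bit, so the current one is used).
    """
    z = []
    for t in range(len(x1)):
        a = x1[t - 1] if (late1 and t > 0) else x1[t]
        b = x0[t - 1] if (late0 and t > 0) else x0[t]
        z.append(a ^ b)
    return z

def value(bits):
    return sum(bits) / len(bits)

rng = random.Random(1)
x1, x0 = correlated_streams(0.75, 0.5, 4096, rng)

ideal = value(capture_xor(x1, x0, False, False))      # (A) both on time
balanced = value(capture_xor(x1, x0, True, True))     # (B) both equally late
unbalanced = value(capture_xor(x1, x0, False, True))  # (C) only x0 late

print(ideal, balanced, unbalanced)
```

In this model the balanced case merely shifts the output stream by one cycle, so its value stays near 0.25, whereas the unbalanced case pairs bits from different cycles and drifts toward 0.5.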

14 13 Markov Chain for Error Prediction. Markov chains (MC) have previously been used to model sequential SC circuits. The previous SC behavior model contains only correct states; we augment it with states for delay-induced transition errors, i.e., errors induced by glitches and delayed transitions. Transition probabilities are trained on a small set of simulation results. The stationary probability distribution is obtained by solving the Markov chain; states C1, D1, and G1 decide the expected output value, which is used for error estimation.
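
A minimal sketch of the idea, under loudly stated assumptions: the four-state chain below (C0 and C1 for correct outputs, D1 for a delayed transition captured as 1, G1 for a glitch captured as 1) and its probabilities `p`, `d`, and `g` are hypothetical placeholders, whereas the slides train the transition probabilities from simulation. The stationary distribution is found by simple power iteration, and the expected output is the stationary mass of the states that output 1.

```python
STATES = ["C0", "C1", "D1", "G1"]

def transition_matrix(p, d, g):
    """Hypothetical chain.

    p: probability that the ideal output is 1 in the next cycle
    d: probability that a falling transition arrives late (captured as 1)
    g: probability of a glitch while the ideal output stays 0
    """
    from_ideal0 = [(1 - p) * (1 - g), p, 0.0, (1 - p) * g]
    from_ideal1 = [(1 - p) * (1 - d), p, (1 - p) * d, 0.0]
    # D1 and G1 are error states whose ideal output is 0, so they move on
    # exactly like C0.
    return [from_ideal0, from_ideal1, list(from_ideal0), list(from_ideal0)]

def stationary(P, steps=200):
    """Stationary distribution by repeated multiplication pi <- pi * P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(steps):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def expected_output(pi):
    """Stationary probability of the states that output 1: C1, D1, G1."""
    return pi[1] + pi[2] + pi[3]

p, d, g = 0.375, 0.1, 0.05  # placeholder values, not measured data
pi = stationary(transition_matrix(p, d, g))
error = expected_output(pi) - p
print(dict(zip(STATES, pi)), error)
```

For this toy chain the error has the closed form p(1 - p)d + (1 - p)^2 g, which gives a quick sanity check on the numerical solution.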

15 14 Result: Markov Chain for Error Prediction. The model is accurate for larger errors (precise prediction at high error magnitude) and less accurate when the error is small. Ongoing work: improve the accuracy for small errors.

16 15 Outcome of Accuracy Model Study. Before our work: the SC behavior model was based on pre-layout simulation and did not consider the cell delay and wire delay contributed by physical implementation. Our work: augment the SC behavior model by considering delayed transitions and glitches contributed by physical implementation; optimize the physical implementation by balancing the timing paths. Figure labels: correct; error; balanced delays.

17 16 Outline: Background and Previous Work; Problem Statement in SC Physical Design; Modeling Approach; Optimization Approach; Conclusions

18 17 Challenges of SC Physical Implementation. The clock is fast to compensate for long computation latency. Launch and capture flip-flops may be far apart in a huge array of SC circuits. Paths are unbalanced due to circuit structures and variations, and the previous analysis shows that delay balance matters. Timing is more critical when DVFS lowers the supply voltage. Figure labels: inputs x1 and x0, output z; SC sub-circuits; faster clock to compensate for long latency; path 1 (long); path 2 (short); analog front-end circuit or random number generator; converter to binary number system; long physical distance in a huge array.

19 18 Post-P&R Optimization for SC Circuits. Problem statement: given an SC circuit and a range of supply voltages, we seek an implementation that minimizes error across the voltages. Observation: transition errors increase at lower voltages due to path delay mismatch. Approach: ILP-based retiming after P&R by a commercial tool. Objective: minimize path delay differences, which improves accuracy. Optimization constraints: number of buffers/wires inserted to compensate for shorter paths; bounded delay variation across voltages; buffer power penalty. Side note: this is similar to multi-corner multi-mode (MCMM) CTS skew optimization, where skew corresponds to path delay differences, MCMM corresponds to evaluating delays at multiple supply voltages, and the power penalty corresponds to the number of buffers inserted.
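
The flavor of this optimization can be shown on a toy instance. The sketch below is not the ILP from the slides: it swaps in exhaustive enumeration, which is only viable for tiny cases, and every number in it (path delays, buffer delays at 1.0 V and 0.6 V, the buffer budget) is a made-up assumption. It picks how many buffers of each type to add to each path so that the summed path-delay difference over both voltages is minimal, breaking ties by buffer count.

```python
from itertools import product

VOLTAGES = (1.0, 0.6)
# Hypothetical delays in ps at each voltage in VOLTAGES.
PATH_DELAY = {"path1": (120, 300), "path2": (60, 140)}
BUFFER_DELAY = {"bufA": (20, 50), "bufB": (30, 90)}
MAX_BUFFERS = 4  # budget standing in for the buffer power penalty

def total_skew(assignment):
    """Sum over voltages of (slowest path delay - fastest path delay)."""
    skew = 0
    for v in range(len(VOLTAGES)):
        delays = [
            base[v] + sum(n * BUFFER_DELAY[b][v]
                          for b, n in assignment[path].items())
            for path, base in PATH_DELAY.items()
        ]
        skew += max(delays) - min(delays)
    return skew

def best_assignment():
    """Enumerate all buffer counts within the budget; keep the best."""
    paths, bufs = list(PATH_DELAY), list(BUFFER_DELAY)
    best = None
    for counts in product(range(MAX_BUFFERS + 1),
                          repeat=len(paths) * len(bufs)):
        if sum(counts) > MAX_BUFFERS:
            continue
        assignment = {
            p: {b: counts[i * len(bufs) + j] for j, b in enumerate(bufs)}
            for i, p in enumerate(paths)
        }
        key = (total_skew(assignment), sum(counts))
        if best is None or key < best[0]:
            best = (key, assignment)
    return best

(skew, n_buffers), chosen = best_assignment()
print(skew, n_buffers, chosen)
```

In this instance no buffer mix cancels the mismatch at both voltages at once, because the two buffer types scale differently with voltage; that is the multi-voltage tension the MCMM analogy on the slide points to.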

20 19 ILP Formulation for Buffer Insertion

21 20 Heuristics for Buffer Choices. Heuristic 1: use various buffer/wire types to compensate for delay differences between voltages. We provide buffer candidates with different delay sensitivities to voltage scaling, and wire detour options to widen the range of voltage sensitivity. Heuristic 2: prune the buffer candidates to speed up the MILP. Solutions are pruned within sub-regions of the tradeoff space by choosing the cells with the lowest leakage in each region. Figure panels: without pruning; with pruning; wire detouring.
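
One possible reading of Heuristic 2, sketched under assumptions: the tradeoff space is taken to be (delay, voltage sensitivity), it is cut into rectangular sub-regions, and only the lowest-leakage candidate in each sub-region survives. The cell names, numbers, and bin widths are hypothetical; the slides do not specify how the regions are formed.

```python
def prune(candidates, delay_bin_ps, sens_bin_pct):
    """Keep the lowest-leakage candidate in each (delay, sensitivity) bin."""
    best = {}
    for c in candidates:
        region = (c["delay_ps"] // delay_bin_ps, c["sens_pct"] // sens_bin_pct)
        if region not in best or c["leakage"] < best[region]["leakage"]:
            best[region] = c
    return sorted(best.values(), key=lambda c: c["name"])

# Hypothetical candidates: delay at nominal voltage, delay increase (in %)
# when scaling from 1.0 V to 0.6 V, and relative leakage.
CANDIDATES = [
    {"name": "BUF_X1", "delay_ps": 20, "sens_pct": 150, "leakage": 1.0},
    {"name": "BUF_X2", "delay_ps": 22, "sens_pct": 160, "leakage": 2.1},
    {"name": "BUF_HVT", "delay_ps": 35, "sens_pct": 210, "leakage": 0.3},
    {"name": "BUF_HVT2", "delay_ps": 38, "sens_pct": 230, "leakage": 0.5},
    {"name": "WIRE_DETOUR", "delay_ps": 15, "sens_pct": 40, "leakage": 0.0},
]

kept = prune(CANDIDATES, delay_bin_ps=10, sens_bin_pct=50)
print([c["name"] for c in kept])
```

Fewer candidates mean fewer integer variables in the solver, at the cost of possibly discarding a cell that would have matched a particular path slightly better.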

22 21 Result: Improved Accuracy by Balancing Paths. Figure: path delays and average errors for three flows: STRAUSS (UMich) + conventional P&R (ICC); ReSC (UMN) + conventional P&R (ICC); ReSC (UMN) + proposed P&R optimization. Annotations: lower error; less inter-path delay skew.

23 22 Result: Improved Input Delay Window. Safe timing window: the timing margin between the clock edge and the input delay. Before optimization: a small input delay variation will cause errors. After optimization: safe timing window = half of the clock cycle. Figure: original vs. optimized delay distribution with the safe window marked; clock period = 150 ps.

24 23 Result: Improved Energy Cost by Balancing Paths. Improved accuracy = Less voltage scaling needed = Higher energy efficiency. The conventional P&R flow (ICC) fails to meet the accuracy constraint when VDD is low. Our proposed P&R optimization reduces delay mismatch at lower voltages and leads to lower energy cost for the same accuracy.
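
The link between accuracy and energy can be made concrete with a small sketch. It assumes the textbook approximation that dynamic energy scales with the square of the supply voltage and ignores leakage and any change in cycle count; the error-versus-voltage numbers are invented for illustration and are not the results reported in the slides.

```python
def min_voltage(error_by_vdd, max_error):
    """Lowest supply voltage whose error still meets the accuracy bound."""
    ok = [v for v, e in error_by_vdd.items() if e <= max_error]
    return min(ok) if ok else None

def relative_dynamic_energy(vdd, vref=1.0):
    """Dynamic energy relative to vref, assuming energy ~ C * V^2."""
    return (vdd / vref) ** 2

# Hypothetical average error at each supply voltage (V).
CONVENTIONAL = {1.0: 0.01, 0.9: 0.02, 0.8: 0.06, 0.7: 0.12, 0.6: 0.20}
OPTIMIZED = {1.0: 0.01, 0.9: 0.01, 0.8: 0.02, 0.7: 0.03, 0.6: 0.05}

v_conv = min_voltage(CONVENTIONAL, max_error=0.05)
v_opt = min_voltage(OPTIMIZED, max_error=0.05)
print(v_conv, relative_dynamic_energy(v_conv))
print(v_opt, relative_dynamic_energy(v_opt))
```

With these made-up curves the balanced implementation can run at a lower voltage for the same error bound, so its relative dynamic energy is lower.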

25 24 MC Model: Improved Simulation Runtime. The proposed Markov chain model is verified on four different SC application circuits. Figure legend: green = new MC model; blue = exhaustive simulation. Simulation cycles, exhaustive (Ex.) vs. Markov chain (MC): GammaCorr 1024 vs. 10; PolySmall 256 vs. 10; Neuron 100 vs. 10. Fewer simulation cycles.

26 25 Result: Gamma Correction. Testcase: gamma correction. Both the SC and conventional circuits are signed off at 1.0 V. SC still generates a recognizable image at 0.6 V. Energy saving of SC = 66%.

27 26 Outline: Background and Previous Work; Problem Statement in SC Physical Design; Modeling Approach; Optimization Approach; Conclusions

28 27 Conclusions. We identify the impact of delay-induced errors and propose a Markov chain-based model for error estimation. We propose a new physical implementation approach that improves the energy-accuracy tradeoff. The experimental results show significant energy and accuracy benefits over previous work. Future work: Markov chain model improvement; a comprehensive tradeoff recipe for performance, accuracy, and energy.

29 28 Thank you!

