Presentation is loading. Please wait.

Presentation is loading. Please wait.

High-Speed Stochastic Circuits Using Synchronous Analog Pulses M

Similar presentations


Presentation on theme: "High-Speed Stochastic Circuits Using Synchronous Analog Pulses M"— Presentation transcript:

1 High-Speed Stochastic Circuits Using Synchronous Analog Pulses M
High-Speed Stochastic Circuits Using Synchronous Analog Pulses M. Hassan Najafi and David J. Lilja ASP-DAC 2017, Tokyo, Japan

2 Stochastic Number Generation
Overview Introduction Stochastic Computing, advantages, main weakness Representation of stochastic numbers Stochastic Number Generation Conventional approach Proposed approach: PWM Correlation in stochastic circuits Operations with correlated inputs, advantages, disadvantages Low cost sorting unit, stochastic comparator Stochastic Operations with synchronous PWM signals Experimental results Hardware cost, operation time, performance comparison Sources of computational error Conclusions

3 Stochastic Computation
Introduction Stochastic Computation A re-emerging computing paradigm: introduced in 1969 Logical computation on random bit streams Value: probability of obtaining a one versus a zero Unipolar [0, 1] positive Each bit has probability X of being 1 Bipolar [-1, 1] positive, negative Each bit has probability (X+1)/2 of being one 000111, 1010, = 0.5 (unipolar), 0.0 (bipolar) Variable length bit streams Two well-known representation

4 Key Advantages Simple hardware for complex operations
Introduction Key Advantages Simple hardware for complex operations Multiplication: AND (unipolar), XNOR (Bipolar) Scaled Addition: MUX Gracefully tolerate noise Redundant representation provides error tolerance Stochastic: (3/16) -> 4/16=0.25 Binary: = > =0.68 Skew tolerance Polysynchronous stochastic circuits [Najafi et al, ASP-DAC, 2016] A 100 times reduction in area compared to conventional implementations is common.

5 Main Weakness High accuracy ~ Long stochastic streams Introduction
Long computation time -> High energy consumption Much slower More energy consumption than conventional binary design Definitely one of our main motivations for this work is to reduce the long latency, the high energy consumption, and the costs of stochastic circuits.

6 Representation of Stochastic Numbers Digital
Introduction Representation of Stochastic Numbers Digital Probability of obtaining a one versus a zero Analog Encoding the value as the fraction of time the signal is high

7 Stochastic Number Generation
Conventional approach Using random or pseudo-random constructs e.g. LFSR Proposed approach Pulse Width Modulation Analog periodic pulses signals as the stochastic number To move between the deterministic binary domain and the stochastic domain, a real-valued number must be converted into a corresponding SN. In each clock cycle, an unbiased random value r from the pseudo-random source was compared to a target value x. The output is a one if r x and a zero otherwise. Our proposed approach Pulse width modulation as a SNG

8 Stochastic Number Generation
PWM signals as the stochastic number Defined by a frequency and a duty cycle. Duty cycle describe the amount of high time 0.2 0.5 Reminding how we determined the value of a stochastic signal, a PWM generator can work as a SNG. Our approach is motivated by this observation: a stochastic representation is a uniform, fractional representation. All that matters in terms of the value that is computed is the fraction of the time the signal is high. Thus, PWM signals can be treated as time-encoded inputs with values defined by their duty cycle. 0.8 PWM signals with different duty cycles.

9 Correlation in Stochastic Circuits
Stochastic Operations based on their inputs Independent or uncorrelated , AND: Multiplication Correlated , XOR. Absolute value subtraction |X1 – X2| : AND. Minimum: OR. Maximum: Insensitive to correlation MUX. Scaled addition/subtraction Independent (i.e., uncorrelated) bit streams are usually desired in stochastic circuits MUX. At any time only one input is connected to the output and so its performance is insensitive to the correlation

10 Correlation in Stochastic Circuits
Absolute Value Subtraction Basic Sorting Unit (min, max) FSM-based operations ( |X|, tanh ) are expensive Much cheaper when working only on correlated inputs (a) Both correlated and uncorrelated Inputs [Peng et al. TVLSI’16] Correlation is advantageous… Absolute value subtraction: This function was implemented conventionally using a combination of a NOT gate, a MUX, and a relatively high-cost FSM-based stochastic absolute value circuit when correlation is exploited, a much cheaper implementation of this function using just a single XOR gate is achievable, The FSM-based stochastic tanh function along with the three MUXs makes circuit (a) much more expensive than circuit (b) Only correlated Inputs [Alaghi and Hayes, ICCD’13]

11 Correlation in Stochastic Circuits
Still no general method for synthesizing stochastic operations to work on correlated inputs We propose low cost stochastic comparator Comparison of SNs is another common operation in stochastic circuits. Li and Lilja [8] proposed a stochastic comparator using the FSM-based stochastic tanh circuit developed by Brown and Card [4], shown in Fig. 5(a). However, FSM-based circuits are often very expensive to implement. (a) High-cost stochastic comparator [Li and Lilja, ICCD’11] (b) Proposed low cost stochastic comparator

12 Stochastic operations with synchronous PWM signals
We define correlation for analog representation of SN High correlation in PWM signals 1) choosing the same frequency for the input signals 2) having maximum overlap between the high parts Advantage: Still have area saving benefit of correlated stochastic Accurate output after running for only one period Eliminating random fluctuation inaccuracy Disadvantage Difficult to provide synchronization (correlation) in the second (or higher) levels of the circuit.

13 Stochastic operations with synchronous PWM signals
0.3 0.7 0.4 0.3 0.7 Obviously, the expected output is ready right after one cycle of the PWM input signals. Continuing the operations for more cycles does not improve the accuracy of the results. Examples of performing stochastic absolute-valued subtraction, minimum, and maximum operations on two synchronized PWM signals.

14 Stochastic operations with synchronous PWM signals
When IN1 is smaller than IN2, the falling edge of the PWM signal representing N2 causes the flip-flop to sample a low level signal, and so logical-0 is produced at the output. When N1 is greater than N2, the PWM signal representing N1 is still at a high level when the falling edge of IN2 occurs. So, logical-1 will be produced at the output of the flip-flop. Examples of comparing SN, represented by correlated PWM signals, using the proposed stochastic comparator. IN1 < IN2 : Out=0, IN2 > IN1: Out:1

15 Experimental Results Three image processing case studies
Robert’s cross edge detection Circuit (a) is about 20 times more expensive than circuit (b) we use the stochastic implementations of three digital image processing algorithms as case studies to evaluate the idea of using synchronous analog pulses as SNs both functions can be decomposed into two absolute-valued subtraction and one scaled addition operations (a) both correlated and uncorrelated inputs [Peng et al. TVLSI’14]; (b) only correlated inputs [Alaghi et al. DAC’13].

16 Experimental Results Three image processing case studies
Frame difference-based image segmentation (a) both correlated and uncorrelated inputs [Peng et al. TVLSI’14]; (b) Proposed low-cost implementation, only correlated inputs.

17 3x3 Median filter circuit
Experimental Results Three image processing case studies 3x3 Median filter noise reduction based on a sorting network Basic sorting unit (a) Correlated and uncorrelated inputs (b) low-cost circuit, only correlated inputs (c) Simplified model 3x3 Median filter circuit [Peng et al. TVLSI’14]

18 Operation time comparison: prior approach (256-bit)
Experimental Results Hardware cost comparison Operation time comparison: prior approach (256-bit) Synthesized with Synopsys Design Compiler, 45-nm library Circuits that are synthesized to work only with correlated inputs use substantially less hardware resources. Clearly, the simple architecture of the circuits synthesized for the correlated inputs approach leads to a lower critical path and, consequently, a lower total latency required to obtain the final result.

19 Experimental Results Performance evaluation
Average error rate of processing sample images

20 Experimental Results Performance evaluation
Prior approach of stochastic number generation To evaluate performance of the circuits, we simulated their operation with both independent and correlated input streams on sample images. Bit streams of 8 to 1024 bits were generated for each input value. (maximal period LFSRs were used).

21 Experimental Results Performance evaluation PWM-based approach
SPICE-level simulation, 45-nm technology PWM signals with periods: 0.3ns, 0.5ns, 1ns, and 2ns The area cost of the PWM generator (when period=2ns) is roughly as expensive as the cost of the conventional SNG with 8-bit LFSR. Decreasing PWM period = Increases the error rate but Lowers implementation cost To evaluate the performance of the low-cost circuits when they are connected to synchronous analog pulses, we implemented the SPICE netlist of the three stochastic circuits. Simulations were carried out using a 45-nm gate library in HSPICE. Image pixel intensities were converted to PWM signals with periods of 0.3ns, 0.5ns, 1ns, and 2ns using the circuit shown in Figure . The implementation cost of the PWM generator, which consists of an analog comparator, a ramp generator, and a clock generator, is a function of its frequency

22 PWM approach Much faster than prior approach
Experimental Results Performance evaluation PWM-based approach Only one period of the PWM signal is sufficient for determining an accurate output PWM approach Much faster than prior approach Since only one period of the PWM signals is sufficient for determining an accurate output, the circuits‘ operation time when using the PWM signals is much lower than the latency when processing using the conventional bit-stream representation

23 Sources of Computational Errors
Three primary sources of errors in performing stochastic operations on synchronized PWM signals 𝑬 𝑮 = Error in generating PWM signals Average error rate of the PWM generator used 0.3ns -> 0.23% 0.5ns -> 0.12% 1ns -> 0.10% 2ns -> 0.09% 𝑬 𝑺 = Error due to skew between input signals Perfectly synchronized PWM signals are required On-chip variations, other noise sources affecting clock generators result in deviation from expected period, phase shift, slew rate Each PWM generator has some level of inaccuracy in converting the input values into the corresponding PWM signals. The difference between the expected and the generated duty cycle

24 Sources of Computational Errors
Three primary sources of errors in performing stochastic operations on synchronized PWM signals 𝑬 𝑮 = Error in measuring the output signals Simple analog integrator Measuring the fraction of time the output signal is high Longer rise and fall times Imperfect measurement of the high and low voltages Result in inaccuracies in measuring the correct output In our simulations Average error rate of measurements -> 0.10%

25 Reducing the hardware cost
Conclusions Reducing the hardware cost One of the main advantages of exploiting correlation Two new low-cost stochastic circuits Sorting unit: 1 AND + 1 OR, Comparator: a D-type flip-flop Low cost implementation for Median filter noise reduction Frame difference-based image segmentation Introduced synchronous analog pulses as a new representation for correlated SNs Still have area saving advantages of correlated circuits Highly accurate results after only one period A solution to long latency problem of SC

26 PWM signals in correlated stochastic design
Further reading We discussed PWM signals in correlated stochastic design “High-Speed Stochastic Circuits Using Synchronous Analog Pulses”, M. Hassan Najafi and David J. Lilja, ASP-DAC 2017 For PWM signals in independent stochastic design “Time-Encoded Values for Highly Efficient Stochastic Circuits”, M. Hassan Najafi, Shiva Jamali-Zavareh, David J. Lilja, Marc Riedel, Kia Bazargan, and Ramesh Harjani, IEEE Transactions on VLSI 2017

27 Questions? M. Hassan Najafi Najaf011@umn.edu
Thank you Questions? M. Hassan Najafi


Download ppt "High-Speed Stochastic Circuits Using Synchronous Analog Pulses M"

Similar presentations


Ads by Google