
1 Branch Prediction with Neural Networks: Hidden Layers and Recurrent Connections. Andrew Smith, CSE Dept., June 10, 2004

2 Outline
- What is a perceptron?
  - Learning?
- What is a feed-forward network?
  - Learning?
- What is a recurrent network?
  - Learning?
- How to do it in hardware?
- Results: adding hidden units
- Results: modeling the latency of slow networks
- Results: varying the hardware budget

3 The Perceptron
A linear (affine) combination of the inputs, thresholded into a DECISION.

4 Perceptron Learning
Inputs x_j, outputs y_i, and targets t_i are all in {-1, +1}. Cycle through the training set; whenever an example X_i = (x_1, x_2, …, x_d) is misclassified, update every weight:
w_j ← w_j + a · t_i · x_j
where a is the learning rate.
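A minimal sketch of this update rule in Python (the function name, learning rate, epoch limit, and toy data below are illustrative, not from the slides):

```python
def train_perceptron(samples, d, a=0.1, epochs=100):
    """Perceptron learning: cycle through the training set and apply
    w_j <- w_j + a * t * x_j to every misclassified example."""
    w = [0.0] * d
    for _ in range(epochs):
        mistakes = 0
        for x, t in samples:                  # x: d-vector, t in {-1, +1}
            y = 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1
            if y != t:                        # misclassified example
                w = [wj + a * t * xj for wj, xj in zip(w, x)]
                mistakes += 1
        if mistakes == 0:                     # converged (data is separable)
            break
    return w

# Toy usage: AND of two {-1,+1} inputs; the constant +1 acts as a bias input.
data = [((1, -1, -1), -1), ((1, -1, 1), -1), ((1, 1, -1), -1), ((1, 1, 1), 1)]
w = train_perceptron(data, d=3)
```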

5 Feed-Forward Network
A network of perceptrons…

6 Feed-Forward Network Learning
Use a gradient-descent algorithm. In the standard squared-error formulation:
- Network output: y = f(net), with net = Σ_j w_j x_j
- Error: E = ½ (t − y)²
- Derivatives of the error: ∂E/∂w_j = −(t − y) · f′(net) · x_j
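As a concrete instance of these formulas, one gradient step for a single tanh unit in Python (a sketch; the learning rate and names are illustrative):

```python
import math

def delta_rule_step(w, x, t, lr=0.05):
    """One gradient-descent step for a single tanh unit with squared
    error E = 1/2 (t - y)^2, so dE/dw_j = -(t - y) * f'(net) * x_j."""
    net = sum(wj * xj for wj, xj in zip(w, x))
    y = math.tanh(net)                       # network output y = f(net)
    delta = (t - y) * (1.0 - y * y)          # f'(net) = 1 - tanh(net)^2
    return [wj + lr * delta * xj for wj, xj in zip(w, x)]
```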

7 Feed-Forward Networks: BACKPROP
But no error is defined for the hidden units. Solution: assign each hidden unit a share of the responsibility for the output units' error, then descend that gradient. This is called "back-propagation".
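A minimal back-propagation sketch for a one-hidden-layer tanh network (the list-of-lists weight layout and the learning rate are illustrative assumptions):

```python
import math

def backprop_step(W1, W2, x, t, lr=0.05):
    """One training step for a network with one tanh hidden layer.
    Each hidden unit is assigned a share of the output error,
    weighted by its outgoing connection."""
    # Forward pass.
    h = [math.tanh(sum(wij * xj for wij, xj in zip(row, x))) for row in W1]
    y = math.tanh(sum(v * hj for v, hj in zip(W2, h)))
    # Error signal at the output for E = 1/2 (t - y)^2.
    delta_out = (t - y) * (1.0 - y * y)
    # Hidden responsibilities: the output error routed back through W2.
    delta_h = [delta_out * v * (1.0 - hj * hj) for v, hj in zip(W2, h)]
    # Gradient-descent updates for both layers.
    W2 = [v + lr * delta_out * hj for v, hj in zip(W2, h)]
    W1 = [[wij + lr * dh * xj for wij, xj in zip(row, x)]
          for row, dh in zip(W1, delta_h)]
    return W1, W2
```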

8 Recurrent Networks
Now it has state…

9 Learning Weights for an RNN
Unroll it and use back-propagation? No! Too slow, and wrong…

10 Use Real-Time Recurrent Learning (RTRL)
Keep a list: at each time T, for each unit u and each weight w, keep the partial derivative ∂u/∂w. Update it with the standard RTRL recurrence relation:
∂y_k(t+1)/∂w_ij = f′(net_k) · [ Σ_l w_kl · ∂y_l(t)/∂w_ij + δ_ik · z_j(t) ]
where z concatenates the previous outputs and the current inputs.
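A sketch of one RTRL step in Python (the squared-error target on unit 0, the variable names, and the learning rate are illustrative assumptions; p should start as all zeros):

```python
import math

def rtrl_step(W, y, x, p, target, lr=0.01):
    """One step of Real-Time Recurrent Learning for a fully recurrent
    tanh layer.  W[k][l]: weight into unit k from z = y ++ x.
    p[k][i][j]: the running partial derivative dy_k / dw_ij."""
    n = len(y)
    z = y + x                                   # previous outputs + inputs
    net = [sum(W[k][l] * zl for l, zl in enumerate(z)) for k in range(n)]
    y_new = [math.tanh(nk) for nk in net]
    fprime = [1.0 - yk * yk for yk in y_new]
    # Recurrence: p_k(t+1) = f'(net_k) * (sum_l w_kl * p_l(t) + delta_ik * z_j)
    p_new = [[[fprime[k] * (sum(W[k][l] * p[l][i][j] for l in range(n))
                            + (z[j] if k == i else 0.0))
               for j in range(len(z))]
              for i in range(n)]
             for k in range(n)]
    # Descend the squared error E = 1/2 (target - y_0)^2 at unit 0.
    err = target - y_new[0]
    for i in range(n):
        for j in range(len(z)):
            W[i][j] += lr * err * p_new[0][i][j]
    return y_new, p_new
```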

11 But on hardware?
Idea: represent real numbers in [-4, +4] with integers in [-4096, +4096] (a scale factor of 1024).
- Adding is OK: 1024·i + 1024·j = (i + j)·1024
- Multiplying requires a divide (a right shift by 10): (1024·i) · (1024·j) = (i·j)·1024²
- Compute the activation function by lookup in a discretized table.
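A sketch of this fixed-point scheme in Python (the helper names and the tanh table are illustrative; the slides do not specify the activation function):

```python
import math

SCALE = 1024                     # reals in [-4, +4] <-> ints in [-4096, 4096]

def to_fixed(r):
    return int(round(r * SCALE)) # encode a real as a scaled integer

def fx_add(a, b):
    return a + b                 # 1024*i + 1024*j = (i + j)*1024: no fixup

def fx_mul(a, b):
    return (a * b) >> 10         # product carries 1024^2: shift to rescale

# Activation by table lookup: tabulate the function once over the
# discretized domain, then index instead of computing it.
TANH_TABLE = [to_fixed(math.tanh(i / SCALE)) for i in range(-4096, 4097)]

def fx_tanh(a):
    a = max(-4096, min(4096, a)) # clamp to the representable range
    return TANH_TABLE[a + 4096]
```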

12 Results, different numbers of hidden units

13 Results, different latencies

14 Results, different HW budget (crafty)

15 Results, different HW budgets (bzip-program)

16 Conclusions
- DON'T use an RNN!
- Maybe use a neural network with a few hidden units, but don't overdo it.
- Future work: explore the trade-off between the number and size of hidden units and the number of inputs.

