Wackiness Algorithm A: Algorithm B:

Presentation on theme: "Wackiness Algorithm A: Algorithm B:"— Presentation transcript:

1 Wackiness
Algorithm A: Generate 200,000 random values 0-255. Add up all values >= 128.
Algorithm B: Sort the values
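
A minimal Python sketch of the two algorithms follows. It assumes that Algorithm B sorts the values and then performs the same sum as Algorithm A; the transcript only shows the "Sort the values" step. The function names and the seed are made up for illustration. In an interpreted language the timing gap may be small or hidden by interpreter overhead; the effect is usually easier to see in compiled code.

```python
import random
import time

def sum_large(values, threshold=128):
    """Add up every value that is >= threshold."""
    total = 0
    for v in values:
        if v >= threshold:  # data-dependent branch: hard to predict on unsorted data
            total += v
    return total

def run_experiment(n=200_000, seed=0):
    """Run both variants on the same data and return sums and timings."""
    rng = random.Random(seed)
    values = [rng.randint(0, 255) for _ in range(n)]

    # Algorithm A: sum over the values in random order
    start = time.perf_counter()
    total_a = sum_large(values)
    time_a = time.perf_counter() - start

    # Algorithm B (assumed): sort first, then do the same sum
    sorted_values = sorted(values)
    start = time.perf_counter()
    total_b = sum_large(sorted_values)
    time_b = time.perf_counter() - start

    return total_a, total_b, time_a, time_b

if __name__ == "__main__":
    total_a, total_b, time_a, time_b = run_experiment()
    print(f"A: sum={total_a} time={time_a:.4f}s")
    print(f"B: sum={total_b} time={time_b:.4f}s")
```

Both variants compute the same sum; only the order in which the branch outcomes arrive differs.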

2 Pipelining Pt2

3 Pipelining Limits
In theory: n times speedup for an n-stage pipeline.
But only if all stages are balanced, and only if the pipeline can be kept full.
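
The ideal-speedup claim can be checked with a small calculation. This sketch assumes perfectly balanced stages that each take one cycle; the function names and the `stall_cycles` parameter are illustrative, not from the slides.

```python
def unpipelined_cycles(num_instructions, num_stages):
    """Each instruction passes through every stage before the next one starts."""
    return num_instructions * num_stages

def pipelined_cycles(num_instructions, num_stages, stall_cycles=0):
    """First instruction needs num_stages cycles to fill the pipeline;
    each later one completes one cycle after the previous, plus any stalls."""
    return num_stages + (num_instructions - 1) + stall_cycles

def speedup(num_instructions, num_stages, stall_cycles=0):
    """Ratio of unpipelined to pipelined cycle counts."""
    return (unpipelined_cycles(num_instructions, num_stages)
            / pipelined_cycles(num_instructions, num_stages, stall_cycles))

if __name__ == "__main__":
    print(speedup(1000, 5))        # close to 5 when the pipeline stays full
    print(speedup(1000, 5, 200))   # bubbles pull the speedup below the ideal
```

The speedup approaches n only for long instruction streams with no stalls, which is the "kept full" condition.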

4 Hazards
Hazard: Situation preventing the next instruction from continuing in the pipeline
Structural: Resource (shared hardware) conflict
Data: Needed data not ready
Control: Correct action depends on an earlier instruction

5 Branch
Unconditional branch in a perfect world: skip instructions 3 and 4, no bubble

6 Branch Timing
Don’t know it is a branch until ID (instruction decode)

7 Branch Timing Branch address not available until after EX

8 Branch Real Timing
Branch destination calculated at T4. Can’t start the instruction until T5. Need to insert a NOP bubble.

9 Branch Real Timing
If we can forward the address from EX to IF, we can start x at T4

10 Branch Real Timing
Branch destination calculated at T4. Already started running instruction 3. Need the ability to ignore a started instruction. Still a bubble: an ignored instruction instead of a NOP.

11 Conditional Branch
Conditional branch has two possibilities: not taken or taken

12 Solving Conditional Branch
Option 1: Stall until we know (cases shown: not taken, taken)

13 Solving Conditional Branch
Option 2: Prediction
Predict Not Taken & Is Not Taken
Predict Not Taken & Is Taken

14 Predicting Taken
Calculating the branch destination in time to use it in the next cycle requires more hardware

15 Solving Conditional Branch
Option 2: Prediction
Predict Taken & Is Not Taken
Predict Taken & Is Taken

16 Branch Prediction Penalty
In our CPU:
Predict correct = 0 cycle penalty
Predict wrong = 1 cycle penalty
Longer pipeline:
No way to decode before next fetch
Bigger penalty for miss
Penalty for any taken branch
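
To see how the miss penalty affects overall performance, the usual back-of-the-envelope model adds the expected penalty cycles per instruction to the base CPI. The sketch below uses that model; the function name and all numbers in the example (20% branches, 10% mispredicted, a 15-cycle penalty for a longer pipeline) are hypothetical values chosen for illustration, not figures from the slides.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """Base CPI plus the average misprediction cost per instruction.

    branch_fraction: share of instructions that are branches (0..1)
    mispredict_rate: share of branches predicted wrong (0..1)
    penalty_cycles:  cycles lost on each misprediction
    """
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles

if __name__ == "__main__":
    # 1-cycle penalty, as in the short pipeline described above
    print(effective_cpi(1.0, 0.20, 0.10, 1))
    # hypothetical longer pipeline with a 15-cycle penalty
    print(effective_cpi(1.0, 0.20, 0.10, 15))
```

With the same branch mix and prediction accuracy, the longer pipeline pays far more per miss, which is why prediction accuracy matters more as pipelines deepen.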

17 Static Branch Prediction
Static prediction: Hardcoded assumptions
If a branch goes backwards, it is a loop; assume we take the branch
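
The backward-branch rule can be written as a one-line check. This is a sketch only: the function name is made up, and predicting forward branches as not taken is an added assumption (the slide only states the backward case).

```python
def predict_taken_static(branch_address, target_address):
    """Static rule: a branch whose target lies at a lower address jumps
    backwards, so it is probably a loop; predict taken.
    Forward branches are predicted not taken (an assumption made here)."""
    return target_address < branch_address

if __name__ == "__main__":
    print(predict_taken_static(0x100, 0x080))  # backward branch
    print(predict_taken_static(0x100, 0x140))  # forward branch
```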

18 Dynamic Branch Prediction
Dynamic prediction: Predict based on runtime behavior
More hardware: Branch prediction buffer (aka branch history table)
Indexed by recent branch instruction addresses
Stores outcome (taken / not taken)
To execute a branch:
Check table, expect the same outcome
Start fetching from fall-through or target
If wrong, flush pipeline and flip prediction
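
The table lookup and update steps above can be modelled in a few lines. This is a simplified software sketch, not a description of any real CPU: the class name, the table size, indexing by low-order address bits, and the 4-byte instruction size are all assumptions.

```python
class OneBitPredictor:
    """Branch history table with one bit per entry: last outcome seen."""

    def __init__(self, num_entries=16):
        self.num_entries = num_entries
        self.table = [False] * num_entries  # False = predict not taken

    def _index(self, branch_address):
        # Low-order bits of the word address (assumes 4-byte instructions)
        return (branch_address >> 2) % self.num_entries

    def predict(self, branch_address):
        """Expect the same outcome as last time."""
        return self.table[self._index(branch_address)]

    def update(self, branch_address, taken):
        """Record the real outcome; on a miss this flips the stored bit."""
        self.table[self._index(branch_address)] = taken


def count_mispredictions(predictor, branch_address, outcomes):
    """Feed a sequence of real outcomes for one branch; count wrong guesses."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(branch_address) != taken:
            misses += 1  # in hardware: flush the wrongly fetched instructions
        predictor.update(branch_address, taken)
    return misses


if __name__ == "__main__":
    # A loop branch taken 9 times, then not taken on exit
    outcomes = [True] * 9 + [False]
    print(count_mispredictions(OneBitPredictor(), 0x400, outcomes))
```

Starting from "not taken", the loop branch misses once on entry and once on exit.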

19 Prediction
1-bit history (taken / not taken) may not be optimal
Ex. nested loop: inner CBZ is mispredicted on the last iteration, then again on the next first iteration

20 Prediction
2-bit history avoids that issue
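
A short simulation makes the difference concrete. The sketch below assumes the common 2-bit saturating-counter scheme (states 0-1 predict not taken, 2-3 predict taken) and models the inner-loop branch as taken on every iteration except the last; function names and loop sizes are illustrative.

```python
def one_bit_misses(outcomes, initial=True):
    """1-bit history: always predict the previous outcome."""
    prediction = initial
    misses = 0
    for taken in outcomes:
        if prediction != taken:
            misses += 1
        prediction = taken
    return misses


def two_bit_misses(outcomes, initial=3):
    """2-bit saturating counter: 0-1 predict not taken, 2-3 predict taken."""
    counter = initial
    misses = 0
    for taken in outcomes:
        if (counter >= 2) != taken:
            misses += 1
        # Move one step toward the actual outcome, saturating at 0 and 3
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return misses


def nested_loop_outcomes(outer_iterations, inner_iterations):
    """Outcomes of the inner loop branch: taken until its final iteration,
    repeated once per outer iteration."""
    one_pass = [True] * (inner_iterations - 1) + [False]
    return one_pass * outer_iterations


if __name__ == "__main__":
    outcomes = nested_loop_outcomes(outer_iterations=10, inner_iterations=5)
    print(one_bit_misses(outcomes))  # 19: two misses per pass after the first
    print(two_bit_misses(outcomes))  # 10: one miss per pass
```

The 2-bit counter needs two wrong guesses in a row before it changes its prediction, so the single not-taken outcome at loop exit does not disturb the next pass.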

21 Real Stuff Is it worth it?

22 Real Stuff Is it worth it?

23 Pipelining worth it? Yes… to a point

24 ARM Pipelines Early ARM Pipeline: ARM v6 pipeline

25 Modern Pipeline Cortex A53 : ARMv8

26 Modern Pipeline Cortex A53 : Pipeline stalls basically double CPI

27 Fun Fact: Why Loads Have +8 in Address
LDR calculates location as: currentLocation (PC) + immediate
By the time it executes, PC will be 8 greater
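
The address arithmetic can be sketched as follows, assuming 32-bit ARM (A32) state, where reading PC yields the address of the current instruction plus 8. The function name and the example addresses are hypothetical.

```python
PC_READ_AHEAD = 8  # A32 state: PC reads as the instruction's own address + 8

def ldr_literal_address(instruction_address, immediate):
    """Address a PC-relative LDR loads from, given where the LDR itself
    sits in memory and the immediate offset encoded in it."""
    return instruction_address + PC_READ_AHEAD + immediate

if __name__ == "__main__":
    # Hypothetical LDR at 0x1000 with an immediate of 0x10
    print(hex(ldr_literal_address(0x1000, 0x10)))  # 0x1018
```

An assembler targeting a fixed data address therefore encodes an immediate that is 8 smaller than the plain distance from the instruction to the data.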

28 Intel Pipelines

29 Intel i7 Branch Performance
A few mispredictions can have large impact:

30 Intel vs AMD Part of Intel's IPC advantage: Branch prediction
AMD claiming major advances in new architecture:

