Chapter One Introduction to Pipelined Processors.

Principles of Designing Pipeline Processors (Design Problems of Pipeline Processors)

Instruction Prefetch and Branch Handling

The instructions in computer programs can be classified into 4 types:
– Arithmetic/Load Operations (60%)
– Store-Type Instructions (15%)
– Branch-Type Instructions (5%)
– Conditional Branch Type (Yes: 12%, No: 8%)

Instruction Prefetch and Branch Handling

Arithmetic/Load Operations (60%):
– These operations require one or two operand fetches.
– Different operations require different numbers of pipeline cycles to execute.

Instruction Prefetch and Branch Handling

Store-Type Instructions (15%):
– These require a memory access to store the data.
Branch-Type Instructions (5%):
– These correspond to an unconditional jump.

Instruction Prefetch and Branch Handling

Conditional Branch Type (Yes: 12%, No: 8%):
– The Yes path requires calculation of the new branch-target address.
– The No path simply proceeds to the next sequential instruction.

Instruction Prefetch and Branch Handling

Arithmetic/load and store instructions do not alter the execution order of the program. Branch instructions and interrupts, however, can seriously degrade the performance of a pipelined computer.

Example – Interrupt System of the Cray-1

Cray-1 System

The interrupt system is built around an exchange package. When an interrupt occurs, the Cray-1 saves 8 scalar registers, 8 address registers, the program counter and the monitor flags. These are packed into 16 words and exchanged with a memory block whose address is specified by a hardware exchange-address register.

Instruction Prefetch and Branch Handling

In general, the higher the percentage of branch-type instructions in a program, the slower the program will run on a pipelined processor.

Effect of Branching on Pipeline Performance

Consider a linear pipeline of 5 stages:
1. Fetch Instruction
2. Decode
3. Fetch Operands
4. Execute
5. Store Results

Overlapped Execution of Instructions without Branching

[Space–time diagram: instructions I1 through I8 enter the 5-stage pipeline one clock apart; once the pipeline is full, one instruction completes every clock period.]

I5 Is a Branch Instruction

[Space–time diagram: when I5 is a successful branch, the partially processed instructions I6–I8 are flushed and fetching restarts from the branch target, leaving idle slots in the pipeline.]
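The two diagrams can be summarized numerically. The helper below is an illustrative sketch (the function name and the simplified cost model are mine, not from the slides): m instructions on an n-stage linear pipeline finish in n + (m − 1) clock periods, and each successful branch adds an (n − 1)-period flush penalty.

```python
def pipeline_clocks(m, n, successful_branches=0):
    """Clock periods to complete m instructions on an n-stage linear pipeline.

    Without branching the pipeline fills in n periods and then retires one
    instruction per clock, for n + (m - 1) periods in total.  Each successful
    branch flushes the instructions behind it, costing an extra (n - 1)
    periods (the same per-branch delay used in the estimate that follows).
    """
    return n + (m - 1) + successful_branches * (n - 1)

# 8 instructions on the 5-stage pipeline, no branches: 5 + 7 = 12 periods.
# The same stream with I5 a successful branch: 12 + 4 = 16 periods.
```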

Estimation of the effect of branching on an n-segment instruction pipeline

Estimation of the effect of branching

Consider an instruction cycle of n pipeline clock periods. Let
– p = probability that an instruction is a conditional branch (here 20%)
– q = probability that a conditional branch succeeds (here 12% out of the 20%, so q = 0.12/0.20 = 0.6)

Estimation of the effect of branching

Suppose there are m instructions. Then the number of successful branches = m × p × q (= m × 0.2 × 0.6). Each successful branch requires a delay of (n−1)/n of an instruction cycle to flush the pipeline.

Estimation of the effect of branching

Thus, the total number of instruction cycles required for m instructions is

1 + (m−1)/n + ((n−1)/n) · m·p·q

Estimation of the effect of branching

As m becomes large, the average number of instructions per instruction cycle is given as

m / [1 + (m−1)/n + ((n−1)/n) · m·p·q]  →  n / [1 + (n−1)·p·q]

Estimation of the effect of branching

When p = 0, the above measure reduces to n, the ideal case. In reality, it is always less than n.
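A quick numeric check of the estimate, as an illustrative sketch (the helper names are mine; the formulas follow the derivation above): with the stated values n = 5, p = 0.2 and q = 0.6, the limiting throughput is 5/1.48 ≈ 3.38 instructions per instruction cycle.

```python
def avg_instructions_per_cycle(m, n, p, q):
    """Average instructions completed per instruction cycle for m instructions,
    using total cycles = 1 + (m - 1)/n + ((n - 1)/n) * m * p * q."""
    total_cycles = 1 + (m - 1) / n + (n - 1) / n * m * p * q
    return m / total_cycles

def limiting_throughput(n, p, q):
    """Limit as m grows large: n / (1 + (n - 1) * p * q)."""
    return n / (1 + (n - 1) * p * q)

# With n = 5, p = 0.2, q = 0.6 the limit is 5 / 1.48, well below the ideal 5.
# With p = 0 (no conditional branches) it reduces to n, as noted above.
```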

Solution?

How can this branching penalty be reduced? One answer, discussed next, is to use multiple prefetch buffers.

Multiple Prefetch Buffers

Three types of buffers can be used to match the instruction fetch rate to the pipeline consumption rate:
1. Sequential Buffers: for in-sequence pipelining
2. Target Buffers: hold instructions fetched from a branch target (for out-of-sequence pipelining)

Multiple Prefetch Buffers

A conditional branch causes both the sequential and the target buffers to fill; once the condition is resolved, one buffer is selected and the other is discarded.

Multiple Prefetch Buffers

3. Loop Buffers: hold the sequential instructions within a loop
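The sequential/target buffer scheme can be sketched in a few lines. This is a toy model, with hypothetical names, of the selection step described above: both paths are prefetched, and the branch outcome picks the survivor.

```python
# Illustrative sketch: on a conditional branch the fetch unit fills both a
# sequential buffer (fall-through path) and a target buffer (branch path);
# when the condition resolves, one feeds the pipeline and one is discarded.

def resolve_branch(sequential_buf, target_buf, branch_taken):
    """Return (selected, discarded) buffers once the branch outcome is known."""
    return (target_buf, sequential_buf) if branch_taken else (sequential_buf, target_buf)

seq = ["I6", "I7", "I8"]        # prefetched fall-through instructions
tgt = ["I20", "I21", "I22"]     # prefetched branch-target instructions
chosen, dropped = resolve_branch(seq, tgt, branch_taken=True)
```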

Data Buffering and Busing Structures

Speeding up of pipeline segments

The processing speeds of pipeline segments are usually unequal. Consider the example below, where segment Si has delay Ti:

S1 (T1) → S2 (T2) → S3 (T3)

Speeding up of pipeline segments

If T1 = T3 = T and T2 = 3T, then S2 becomes the bottleneck and we need to remove it. How? One method is to subdivide the bottleneck; two possible divisions are shown next.

Speeding up of pipeline segments

First method: subdivide S2 into two subsegments with delays T and 2T:

S1 (T) → S2a (T) → S2b (2T) → S3 (T)

Speeding up of pipeline segments

Second method: subdivide S2 into three subsegments of delay T each:

S1 (T) → S2a (T) → S2b (T) → S2c (T) → S3 (T)

Speeding up of pipeline segments

If the bottleneck is not subdivisible, we can duplicate S2 in parallel:

S1 (T) → [S2 (3T) ∥ S2 (3T) ∥ S2 (3T)] → S3 (T)

The three copies accept operands in turn, so a result can still emerge every T.
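The effect of the three remedies can be checked with a small model, a sketch under the usual assumption that the pipeline clock must accommodate the slowest segment and that a k-way replicated segment serves operands round-robin (the helper name is mine):

```python
def pipeline_period(stage_times, copies=None):
    """Effective time per result for a linear pipeline.

    The clock must accommodate the slowest segment; replicating a segment
    k-way (serving operands round-robin across the k copies) divides its
    effective service time by k.
    """
    copies = copies or [1] * len(stage_times)
    return max(t / k for t, k in zip(stage_times, copies))

T = 1.0
original    = pipeline_period([T, 3 * T, T])                    # S2 limits: 3T
subdivided2 = pipeline_period([T, T, 2 * T, T])                 # first method: 2T
subdivided3 = pipeline_period([T, T, T, T, T])                  # second method: T
replicated  = pipeline_period([T, 3 * T, T], copies=[1, 3, 1])  # 3 copies of S2: T
```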

Speeding up of pipeline segments

Control and synchronization are more complex with parallel segments.

Data Buffering

Instruction and data buffering provides a continuous flow of work to the pipeline units.
Example: 4X TI ASC

Example: 4X TI ASC

This system uses a memory buffer unit (MBU), which:
– supplies the arithmetic unit with a continuous stream of operands
– stores results back to memory
The MBU has three double buffers X, Y and Z (one octet per buffer): X and Y for input, Z for output.

Example: 4X TI ASC

This provides pipeline processing at a high rate and alleviates the bandwidth mismatch between memory and the arithmetic pipeline.
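The double-buffering idea behind the MBU can be sketched as follows. This is a toy model, not the actual ASC design (the class and method names are mine): while the arithmetic pipeline drains one half of a buffer, the memory system refills the other half, hiding memory latency.

```python
class DoubleBuffer:
    """Toy model of one MBU double buffer: the pipeline consumes the active
    half while the memory system refills the idle half."""

    def __init__(self):
        self.halves = [[], []]
        self.active = 0             # index of the half feeding the pipeline

    def fill_idle(self, octet):
        """Memory side: load the half that is not being consumed."""
        self.halves[1 - self.active] = list(octet)

    def swap(self):
        """Switch roles once the active half has been drained."""
        self.active = 1 - self.active

    def read(self):
        """Pipeline side: operands currently available."""
        return self.halves[self.active]

x = DoubleBuffer()
x.fill_idle(range(8))   # memory prefetches one octet of operands
x.swap()                # the prefetched octet now feeds the pipeline
```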

Busing Structures

Problem: ideally, the subfunctions in a pipeline should be independent; otherwise the pipeline must be halted until the dependency is removed.
Solution: an efficient internal busing structure.
Example: TI ASC

In the TI ASC, once an instruction dependency is recognized, an update capability is provided by transferring the contents of the Z buffer to the X or Y buffer.