Principles of Linear Pipelining

Principles of Linear Pipelining In pipelining, we divide a task into a set of subtasks. The precedence relation of a set of subtasks {T1, T2, …, Tk} for a given task T implies that a subtask Tj cannot start until some earlier subtask Ti finishes. The interdependencies of all subtasks form the precedence graph.

Principles of Linear Pipelining With a linear precedence relation, subtask Tj cannot start until all earlier subtasks Ti (for i < j) have finished. A linear pipeline can process subtasks with a linear precedence graph.

Principles of Linear Pipelining A pipeline can process successive subtasks if: (1) the subtasks have a linear precedence order, and (2) each subtask takes nearly the same time to complete.
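
As a small illustration (a hypothetical Python sketch, not part of the original slides), a linear precedence relation over k = 4 subtasks can be written out explicitly: each subtask Tj must wait for every earlier subtask Ti with i < j.

```python
# A linear precedence graph: subtask T_j depends on every earlier T_i (i < j).
# Represented as a dict mapping each subtask to the subtasks it must wait for.
k = 4
precedence = {f"T{j}": [f"T{i}" for i in range(1, j)] for j in range(1, k + 1)}

for task, deps in precedence.items():
    print(task, "waits for", ", ".join(deps) or "nothing")
```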

Basic Linear Pipeline L: latches, the interface between different stages of the pipeline; S1, S2, etc.: pipeline stages.

Basic Linear Pipeline A linear pipeline consists of a cascade of processing stages. Stages: pure combinational circuits performing arithmetic or logic operations on the data flowing through the pipe. Stages are separated by high-speed interface latches. Latches: fast registers holding intermediate results between stages. Information flow is under the control of a common clock applied to all latches.

Basic Linear Pipeline The flow of data in a linear pipeline having four stages for the evaluation of a function on five inputs is as shown below:

Basic Linear Pipeline The vertical axis represents the four stages; the horizontal axis represents time in units of the clock period of the pipeline.
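
Since the space-time diagram itself is not reproduced in the transcript, the sketch below (an illustrative Python snippet, not from the original slides) prints which task occupies each stage in each clock period for k = 4 stages and n = 5 inputs; the last task leaves the pipe after k + (n-1) = 8 periods.

```python
# Space-time diagram of a linear pipeline: task i enters stage j during
# clock period i + j (0-indexed), so occupancy is staggered diagonally.
K, N = 4, 5                       # stages, inputs
total = K + N - 1                 # clock periods needed: k + (n - 1) = 8

for stage in range(K):
    row = []
    for t in range(total):
        task = t - stage          # index of the task in this stage at time t
        row.append(f"T{task + 1}" if 0 <= task < N else "--")
    print(f"S{stage + 1}: " + " ".join(row))
```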

Clock Period (τ) for the pipeline Let τi be the time delay of the circuitry in stage Si and tl be the time delay of a latch. Then the clock period of a linear pipeline is defined by τ = max{τi} + tl. The reciprocal of the clock period is called the clock frequency (f = 1/τ) of a pipeline processor.
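
A minimal sketch of the clock-period formula, assuming made-up stage and latch delays (the numbers are not from the slides):

```python
# Clock period of a linear pipeline: tau = max(stage delays) + latch delay.
stage_delays_ns = [30, 25, 40, 20]     # assumed per-stage combinational delays
latch_delay_ns = 5                     # assumed latch delay

tau_ns = max(stage_delays_ns) + latch_delay_ns    # 45 ns
f_hz = 1.0 / (tau_ns * 1e-9)                      # clock frequency f = 1/tau
print(f"clock period = {tau_ns} ns, frequency = {f_hz / 1e6:.2f} MHz")
```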

Performance of a linear pipeline Consider a linear pipeline with k stages. Let T be the clock period and assume the pipeline is initially empty. Starting at any time, let us feed n inputs and wait until the results come out of the pipeline. The first input takes k periods, and the remaining (n-1) inputs come one after another in successive clock periods. Thus the computation time for the pipeline, Tp, is Tp = kT + (n-1)T = [k+(n-1)]T.

Performance of a linear pipeline For example, if the linear pipeline has four stages and five inputs, Tp = [k+(n-1)]T = [4+4]T = 8T.
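
A quick check of this formula (a hypothetical helper, not from the slides):

```python
def pipeline_time(k: int, n: int, T: float = 1.0) -> float:
    """Total time for n inputs through a k-stage pipeline: [k + (n - 1)] * T."""
    return (k + n - 1) * T

print(pipeline_time(k=4, n=5))    # 8.0 clock periods, matching the example
```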

Performance Parameters The various performance parameters of a pipeline are: speed-up, throughput, and efficiency.

Speedup Speedup is defined as: Speedup = (Time taken for a given computation by a non-pipelined functional unit) / (Time taken for the same computation by a pipelined version). Assume a function of k stages of equal complexity, each taking the same amount of time T. A non-pipelined functional unit takes kT time for one input, so n inputs take nkT. Then Speedup = nkT / [k+(n-1)]T = nk / (k+n-1).

Speed-up For example, if a pipeline has 4 stages and 5 inputs, its speedup factor is Speedup = nk/(k+n-1) = 20/8 = 2.5. The maximum value of speedup is lim (n → ∞) Speedup = ?

Speed-up The maximum value of speedup is lim (n → ∞) Speedup = k.
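
A small sketch of the speedup formula and its limit (illustrative only, not part of the original slides):

```python
def speedup(k: int, n: int) -> float:
    """Speedup of a k-stage pipeline over a non-pipelined unit for n inputs."""
    return n * k / (k + n - 1)

print(speedup(k=4, n=5))          # 2.5 for the 4-stage, 5-input example
print(speedup(k=4, n=10**6))      # approaches k = 4 as n grows large
```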

Efficiency It is an indicator of how efficiently the resources of the pipeline are used. If a stage is available during a clock period, then its availability becomes the unit of resource. Efficiency can be defined as the ratio of busy stage-time units to the total stage-time units available.

Efficiency No. of busy stage-time units = nk, since there are n inputs and each input uses k stages. Total no. of stage-time units available = k[k+(n-1)], which is the product of the number of stages in the pipeline (k) and the number of clock periods taken for the computation (k+(n-1)).

Efficiency Thus efficiency is expressed as follows: Efficiency = nk / (k[k+(n-1)]) = n / (k+n-1). The maximum value of efficiency is lim (n → ∞) Efficiency = 1.

Efficiency Efficiency is minimum when n = 1: Efficiency = 1/k. For k = 4 and n = 5, Efficiency = 5/8 = 0.625.
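
The same numbers, checked with a small illustrative helper (not from the slides):

```python
def efficiency(k: int, n: int) -> float:
    """Fraction of stage-time units kept busy: n / (k + n - 1)."""
    return n / (k + n - 1)

print(efficiency(k=4, n=5))       # 0.625 for the 4-stage, 5-input example
print(efficiency(k=4, n=1))       # 0.25 = 1/k, the minimum, when n = 1
```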

Throughput It is the average number of results computed per unit time. For n inputs, a k-stage pipeline takes [k+(n-1)]T time units. Then, Throughput = n / ([k+(n-1)]T) = nf / (k+n-1), where f is the clock frequency.

Throughput The maximum value of throughput is lim (n → ∞) Throughput = ?

Throughput The maximum value of throughput is lim (n → ∞) Throughput = f. Note that Throughput = Efficiency × Frequency.
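
A sketch relating throughput, efficiency, and frequency (the 25 MHz clock is an assumed value for illustration):

```python
def efficiency(k: int, n: int) -> float:
    return n / (k + n - 1)

def throughput(k: int, n: int, f: float) -> float:
    """Average results per unit time: n * f / (k + n - 1)."""
    return n * f / (k + n - 1)

f = 25e6                                # assumed clock frequency, 25 MHz
print(throughput(4, 5, f) / 1e6)        # 15.625 million results per second
print(efficiency(4, 5) * f / 1e6)       # same value: Throughput = Efficiency * f
```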

Problem Consider the execution of a program of 15,000 instructions by a linear pipeline processor with a clock rate of 25 MHz. Assume that the instruction pipeline has 5 stages and that one instruction is issued per clock cycle. The penalties due to branch instructions and out-of-sequence executions are ignored. Calculate the speedup factor as compared with a non-pipelined processor. What are the efficiency and throughput of this pipelined processor?
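
One possible way to work the problem with the formulas above (a sketch; the numeric results are computed here, not quoted from the slides):

```python
k, n, f = 5, 15_000, 25e6      # stages, instructions, clock rate in Hz

speedup = n * k / (k + n - 1)          # ~4.9987, close to the ideal k = 5
eff = n / (k + n - 1)                  # ~0.9997
thr = n * f / (k + n - 1)              # ~24.99 million instructions per second

print(f"speedup    = {speedup:.4f}")
print(f"efficiency = {eff:.4%}")
print(f"throughput = {thr / 1e6:.3f} MIPS")
```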

Individual Assignment