CS 704 Advanced Computer Architecture, Lecture 10: Computer Hardware Design (Pipeline Datapath and Control Design), Prof. Dr. M. Ashraf Chughtai

Presentation transcript:

CS 704 Advanced Computer Architecture
Lecture 10: Computer Hardware Design (Pipeline Datapath and Control Design)
Prof. Dr. M. Ashraf Chughtai

MAC/VU-Advanced Computer Architecture, Lecture 10 - Computer Hardware Design (4)

Recap: Lecture 9
- Single-cycle versus multi-cycle datapath
- Key components of the multi-cycle datapath
- Design and information flow in the multi-cycle datapath
- Multi-cycle control unit design
- Finite state machine-based control unit
- Microprogram-based controller

What is pipelining?
Pipelining is a fundamental concept. It utilizes capabilities of the datapath by

Pipelining is Natural! Laundry Example
- Four loads: A, B, C, D
- Four laundry operations: wash, dry, fold, and place into drawers
- Washer takes 30 minutes
- Dryer takes 30 minutes
- "Folder" takes 30 minutes
- "Stasher" takes 30 minutes to put clothes into drawers

Sequential Laundry
[Figure: loads A, B, C, D in task order against time, starting at 6 PM, in 30-minute steps, one load after another]

Pipelined Laundry: Start work ASAP
Pipelined laundry takes 3.5 hours for 4 loads!
[Figure: loads A, B, C, D in task order, overlapped in 30-minute steps on a time axis running from 6 PM to 2 AM]
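The 3.5-hour figure can be checked with a few lines of arithmetic. The sketch below assumes the numbers stated on the laundry slides (four loads, four 30-minute operations); the function names are illustrative only.

```python
STAGE_MINUTES = 30   # wash, dry, fold, stash each take 30 minutes (from the slide)
NUM_STAGES = 4
NUM_LOADS = 4        # loads A, B, C, D


def sequential_minutes(loads, stages, stage_minutes):
    """Each load runs through every stage before the next load starts."""
    return loads * stages * stage_minutes


def pipelined_minutes(loads, stages, stage_minutes):
    """The first load needs `stages` time slots; every later load
    finishes one slot after the load ahead of it."""
    return (stages + loads - 1) * stage_minutes


seq = sequential_minutes(NUM_LOADS, NUM_STAGES, STAGE_MINUTES)
pipe = pipelined_minutes(NUM_LOADS, NUM_STAGES, STAGE_MINUTES)
print(f"sequential: {seq / 60:.1f} h, pipelined: {pipe / 60:.1f} h")
```

With these numbers the sequential schedule needs 480 minutes (8 hours) and the pipelined one 210 minutes (3.5 hours).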

Features of Pipelined Processor
- All the functional units operate independently
- Multiple tasks operate simultaneously using different resources
- Pipelining doesn't help the latency of a single task; it helps the throughput of the entire workload
- Potential speedup = number of pipe stages

Pipelining Lessons
- Pipeline rate is limited by the slowest pipeline stage
- Time to "fill" the pipeline and time to "drain" it reduce speedup
- Unbalanced lengths of pipe stages reduce speedup: if the washer takes longer than the dryer, the dryer has to wait!
- Stall for dependences
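A small sketch of how the slowest stage and the fill/drain time limit speedup. It assumes a lock-step pipeline in which every stage advances at the pace of the slowest one; the 40-minute washer is a made-up value for illustration, not a figure from the lecture.

```python
def sequential_time(stage_times, tasks):
    """Unpipelined: every task passes through all stages back to back."""
    return tasks * sum(stage_times)


def pipelined_time(stage_times, tasks):
    """Lock-step pipeline: the slot length equals the slowest stage, and
    (stages - 1) extra slots are spent filling and draining."""
    slot = max(stage_times)
    return (len(stage_times) + tasks - 1) * slot


def speedup(stage_times, tasks):
    return sequential_time(stage_times, tasks) / pipelined_time(stage_times, tasks)


balanced = [30, 30, 30, 30]      # minutes, as in the laundry example
unbalanced = [40, 30, 30, 30]    # hypothetical slower washer

for name, stages in (("balanced", balanced), ("unbalanced", unbalanced)):
    print(name, "4 loads:", round(speedup(stages, 4), 2),
          "| 1000 loads:", round(speedup(stages, 1000), 2))
```

For the balanced case the speedup approaches the number of stages as the number of loads grows; the unbalanced case stays below that limit because three stages idle for part of every slot.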

Five Steps of Datapath
Ins. Fetch, Dec/Reg, Exec, Mem, Wr

Pipelined Processor Design
[Figure: datapath divided into five stages: Instruction Fetch, ID/Register Read, Execute/Address, Memory Rd/Wrt, Write Back (Reg. Wrt). Blocks shown: PC, Next PC, Inst. Mem, IR, Reg File, A, B, Exec, Equal, S, Mem Access, Data Mem, M; control blocks Dcd Ctrl, Ex Ctrl, Mem Ctrl, WB Ctrl; instruction registers IRex, IRmem, IRwb]

Pipeline Control
Register transfers by stage (Instruction Fetch, ID/Reg. Rd, Exe/Address, Memory Rd/Wrt, Reg. Wrt (WB)):
- Instruction Fetch (all instructions): IR <- Mem[PC]; PC <- PC + 4
- ID/Reg. Rd (all instructions): A <- R[rs]; B <- R[rt]
- R-type: S <- A + B; R[rd] <- S
- Load: S <- A + SX; M <- Mem[S]; R[rd] <- M
- OR immediate: S <- A or ZX; R[rt] <- S
- Store: S <- A + SX; Mem[S] <- B
- Branch: if Cond then PC <- PC + SX
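The register transfers on this slide can be traced with a minimal sketch. It is an illustration under several assumptions, not the lecture's design: instructions are passed in already decoded (so the IR <- Mem[PC] fetch is not modelled), they run one at a time rather than overlapped, the branch condition is taken to be A == B (the datapath figure shows an "Equal" block), and the helper names `instr`, `step`, `sign_extend`, and `zero_extend` are hypothetical. The load destination is named rd, as written on the slide.

```python
def sign_extend(imm16):
    """SX: interpret a 16-bit immediate as a signed value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def zero_extend(imm16):
    """ZX: keep the low 16 bits as an unsigned value."""
    return imm16 & 0xFFFF


def instr(op, rs=0, rt=0, rd=0, imm=0):
    """Hypothetical pre-decoded instruction record."""
    return {"op": op, "rs": rs, "rt": rt, "rd": rd, "imm": imm}


def step(state, ir):
    """Apply the slide's register transfers for one instruction."""
    regs, dmem = state["regs"], state["dmem"]
    state["pc"] += 4                          # IF:  PC <- PC + 4
    a, b = regs[ir["rs"]], regs[ir["rt"]]     # ID:  A <- R[rs]; B <- R[rt]
    op = ir["op"]
    if op == "add":                           # R-type
        s = a + b                             # EX:  S <- A + B
        regs[ir["rd"]] = s                    # WB:  R[rd] <- S
    elif op == "lw":                          # Load
        s = a + sign_extend(ir["imm"])        # EX:  S <- A + SX
        m = dmem[s]                           # MEM: M <- Mem[S]
        regs[ir["rd"]] = m                    # WB:  R[rd] <- M (as on the slide)
    elif op == "ori":                         # OR immediate
        s = a | zero_extend(ir["imm"])        # EX:  S <- A or ZX
        regs[ir["rt"]] = s                    # WB:  R[rt] <- S
    elif op == "sw":                          # Store
        s = a + sign_extend(ir["imm"])        # EX:  S <- A + SX
        dmem[s] = b                           # MEM: Mem[S] <- B
    elif op == "beq":                         # Branch, Cond assumed to be A == B
        if a == b:
            state["pc"] += sign_extend(ir["imm"])   # PC <- PC + SX
    else:
        raise ValueError(f"unsupported op: {op}")


state = {"pc": 0, "regs": [0] * 32, "dmem": {}}
state["regs"][1], state["regs"][2] = 5, 7

program = [
    instr("add", rs=1, rt=2, rd=3),      # R[3] <- 5 + 7
    instr("sw", rs=0, rt=3, imm=16),     # Mem[16] <- R[3]
    instr("lw", rs=0, rd=4, imm=16),     # R[4] <- Mem[16]
    instr("ori", rs=0, rt=5, imm=0xFF),  # R[5] <- 0 or 0xFF
    instr("beq", rs=1, rt=1, imm=8),     # taken: PC <- PC + 8
]
for ir in program:
    step(state, ir)

print(state["regs"][3:6], state["dmem"], state["pc"])
```

Each branch of `step` mirrors one row of the slide, so the comments can be read against the stage names directly.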

Pipelined Registers Included
[Figure: the same five-stage datapath (Instruction Fetch, ID/Register Read, Execute/Address, Memory Rd/Wrt, Write Back (Reg. Wrt)) with the pipeline registers shown: IR, A, B, S, M, and the instruction registers IRex, IRmem, IRwb alongside Dcd Ctrl, Ex Ctrl, Mem Ctrl, WB Ctrl]

Five Steps as Stages of Pipeline
Load: Cycle 1 Ifetch, Cycle 2 Reg/Dec, Cycle 3 Exec, Cycle 4 Mem, Cycle 5 Wr

Multiple Cycle versus Pipeline: Pipeline enhances performance
[Figure: clock-cycle timing of a Load, a Store, and an R-type instruction. Multiple cycle implementation: the instructions run one after another through their Ifetch, Reg, Exec, Mem, Wr steps. Pipeline implementation: Load, Store, and R-type each pass through Ifetch, Reg, Exec, Mem, Wr, overlapped by one cycle.]

3-Instruction Program Reconsidered
Load, Store, R-type (ADD)

Example
The cycle time of a single-cycle machine is 45 ns, and that of the multi-cycle and pipelined machines is 10 ns; the average CPI due to the instruction mix on the multi-cycle machine is 4.6. What is the execution time of a 100-instruction program on each type of machine?
Ans:
- Single-cycle machine: 45 ns/cycle x 1 CPI x 100 inst = 4500 ns
- Multi-cycle machine: 10 ns/cycle x 4.6 CPI x 100 inst = 4600 ns
- Pipelined machine: 10 ns/cycle x (1 CPI x 100 inst + 4 cycles drain) = 1040 ns
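The three answers follow the same formula, execution time = cycle time x (CPI x instruction count + extra cycles). A minimal sketch using the slide's numbers; the function name and the `extra_cycles` parameter are illustrative.

```python
def exec_time_ns(cycle_ns, cpi, n_instr, extra_cycles=0):
    """Execution time = cycle time x (CPI x instructions + extra cycles)."""
    return cycle_ns * (cpi * n_instr + extra_cycles)


N = 100  # instruction count used in the slide's answers

single_cycle = exec_time_ns(45, 1, N)
multi_cycle = exec_time_ns(10, 4.6, N)
pipelined = exec_time_ns(10, 1, N, extra_cycles=4)  # 4 cycles to drain

for name, t in (("single cycle", single_cycle),
                ("multi cycle", multi_cycle),
                ("pipelined", pipelined)):
    print(f"{name}: {t:.0f} ns")
```

The pipelined machine is about 4.4 times faster than the multi-cycle one here (4600 / 1040), slightly less than the multi-cycle CPI of 4.6 because of the four drain cycles.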

Another Example
Consider a multi-cycle, unpipelined processor that requires 4 cycles for ALU and branch operations and 5 cycles for memory operations. Assume the relative frequencies of these operations are 40%, 25%, and 35%, respectively, and the clock cycle is 1 ns. In the pipelined implementation, clock skew and setup add 0.2 ns of overhead to the clock. Ignoring any latency impact, how much speedup does the pipelined processor give?

Solution
Unpipelined processor:
Average execution time per instruction = clock cycle x average CPI
= 1 ns x [(0.40 + 0.25) x 4 + 0.35 x 5]
= 1 ns x (0.65 x 4 + 0.35 x 5)
= 1 ns x (2.6 + 1.75)
= 4.35 ns
Pipelined processor:
Average execution time per instruction = clock cycle + overhead
= 1 ns + 0.2 ns
= 1.2 ns
Speedup = 4.35 / 1.2 ≈ 3.62 times
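The same calculation as a short sketch, using the frequencies and cycle counts from the problem statement; the variable names are illustrative.

```python
# (relative frequency, cycles) per operation class, from the problem statement
mix = {
    "alu":    (0.40, 4),
    "branch": (0.25, 4),
    "memory": (0.35, 5),
}

CLOCK_NS = 1.0      # unpipelined clock cycle
OVERHEAD_NS = 0.2   # clock skew and setup added by pipelining

avg_cpi = sum(freq * cycles for freq, cycles in mix.values())
unpipelined_ns = CLOCK_NS * avg_cpi       # average time per instruction
pipelined_ns = CLOCK_NS + OVERHEAD_NS     # one instruction completes per cycle
speedup = unpipelined_ns / pipelined_ns

print(f"average CPI = {avg_cpi:.2f}")
print(f"speedup = {speedup:.3f}")
```

The exact quotient is 3.625, which the slide reports as 3.62.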

Pipelined Execution Representation
Conventional representation: helps show the program flow vis-à-vis time.

Program flow (down) against time (across):
1st Inst.  IFetch Dcd    Exec   Mem    WB
2nd Inst.         IFetch Dcd    Exec   Mem    WB
3rd Inst.                IFetch Dcd    Exec   Mem    WB
4th Inst.                       IFetch Dcd    Exec   Mem    WB
5th Inst.                              IFetch Dcd    Exec   Mem    WB
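The conventional representation is regular enough to generate. The sketch below assumes an ideal pipeline with no stalls, in which each instruction starts one cycle after the previous one; the function names are illustrative.

```python
STAGES = ["IFetch", "Dcd", "Exec", "Mem", "WB"]


def pipeline_diagram(n_instr, stages=STAGES):
    """Return one text row per instruction, each shifted right by one
    cycle relative to the row above (ideal pipeline, no stalls)."""
    width = max(len(s) for s in stages) + 1   # column width per clock cycle
    rows = []
    for i in range(n_instr):
        cells = [""] * i + list(stages)       # i empty cycles before fetch
        rows.append("".join(c.ljust(width) for c in cells).rstrip())
    return rows


def total_cycles(n_instr, n_stages=len(STAGES)):
    """Cycles until the last instruction leaves the last stage."""
    return n_stages + n_instr - 1


rows = pipeline_diagram(5)
for row in rows:
    print(row)
print("total cycles:", total_cycles(5))
```

Five instructions on a five-stage pipeline finish in nine cycles, compared with 25 cycles if each instruction ran all five steps before the next began.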

Graphical Representation
[Figure: instruction order (Instr 1 to Instr 5) against time in clock cycles (CC1 to CC9); each instruction uses I.Mem, Reg, ALU, D.Mem, Reg in successive cycles, starting one cycle after the previous instruction]

Why Pipeline? Because the resources are there!
[Figure: instruction order (Inst 0 to Inst 4) against time in clock cycles; each instruction uses Im, Reg, ALU, Dm, Reg in successive cycles, so every resource is busy with a different instruction in the same cycle]

Can pipelining get us into trouble?
- Structural hazards
- Data hazards
- Control hazards

How do stalls degrade performance?
Pipelined CPI with stalls = ideal CPI + stall clock cycles per instruction

How do stalls degrade performance? (continued)
Speedup w.r.t. unpipelined = CPI unpipelined / (1 + stall cycles per instruction)
Speedup w.r.t. pipeline depth = pipeline depth / (1 + stall cycles per instruction)
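A short sketch of these speedup formulas. As the "1 +" in the denominators implies, it assumes an ideal pipelined CPI of 1 and equal clock cycles for the pipelined and unpipelined machines; the stall rate of 0.25 cycles per instruction is a made-up value for illustration.

```python
def pipelined_cpi(ideal_cpi, stall_cycles_per_instr):
    """Pipelined CPI with stalls = ideal CPI + stall cycles per instruction."""
    return ideal_cpi + stall_cycles_per_instr


def speedup_vs_unpipelined(cpi_unpipelined, stall_cycles_per_instr):
    """Assumes ideal pipelined CPI = 1 and equal clock cycle times."""
    return cpi_unpipelined / pipelined_cpi(1, stall_cycles_per_instr)


def speedup_from_depth(pipeline_depth, stall_cycles_per_instr):
    """Special case where the unpipelined CPI equals the pipeline depth."""
    return pipeline_depth / pipelined_cpi(1, stall_cycles_per_instr)


# Hypothetical stall rates on a 5-stage pipeline.
for stalls in (0.0, 0.25, 1.0):
    print(f"stalls/instr = {stalls}: speedup = {speedup_from_depth(5, stalls):.2f}")
```

With no stalls the speedup equals the pipeline depth; one stall cycle per instruction on average halves it.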

Summary
- Multi-cycle datapath versus pipeline datapath
- Key components of the pipeline datapath
- Performance enhancement due to pipelining
- Hazards in the pipelined datapath

Assalam-u-Alaikum and Allah Hafiz