Pipelining

Pipelining. Pipelining is a technique for improving machine performance by overlapping the execution of instructions in order to reduce execution time. It is a key to the speed of today's CPUs.

Pipelining Is Natural! Laundry example: Aini, Boon, Chong and David each have one load of clothes to wash, dry, fold and put away. The washer takes 30 minutes, the dryer takes 30 minutes, the "folder" takes 30 minutes, and the "stasher" takes 30 minutes to put clothes into drawers.

Sequential Laundry. Sequential laundry takes 8 hours for 4 loads (4 loads × 4 stages × 30 minutes). If they learned pipelining, how long would laundry take? [Figure: loads A, B, C, D in task order against a time axis starting at 6 PM, in 30-minute slots.]

Pipelined Laundry: Start Work ASAP. Pipelined laundry takes 3.5 hours for 4 loads! [Figure: loads A, B, C, D overlapped in 30-minute slots on a time axis starting at 6 PM.]
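The two laundry totals can be checked with a few lines of arithmetic. This is a minimal sketch that assumes four 30-minute stages (washer, dryer, "folder", "stasher") and that each stage handles one load at a time; the function names are illustrative and do not come from the slides.

```python
# Laundry timing sketch: four loads, four 30-minute stages (assumed from the slides).
STAGE_MINUTES = 30
NUM_STAGES = 4   # washer, dryer, "folder", "stasher"
NUM_LOADS = 4    # Aini, Boon, Chong, David


def sequential_minutes(loads: int, stages: int, stage_minutes: int) -> int:
    """Each load finishes every stage before the next load starts."""
    return loads * stages * stage_minutes


def pipelined_minutes(loads: int, stages: int, stage_minutes: int) -> int:
    """The first load needs `stages` slots; every later load adds one slot."""
    return (stages + loads - 1) * stage_minutes


seq = sequential_minutes(NUM_LOADS, NUM_STAGES, STAGE_MINUTES)
pipe = pipelined_minutes(NUM_LOADS, NUM_STAGES, STAGE_MINUTES)
print(f"sequential: {seq / 60} hours")  # 8.0 hours
print(f"pipelined:  {pipe / 60} hours")  # 3.5 hours
```

The pipelined total is (stages + loads - 1) time slots because the pipeline needs three slots to fill before one load finishes in every slot.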

Pipelining Lessons. Pipelining does not help the latency of a single task; it helps the throughput of the entire workload. Multiple tasks operate simultaneously using different resources. Potential speedup = number of pipeline stages. The pipeline rate is limited by the slowest pipeline stage. Unbalanced lengths of pipeline stages reduce speedup. The time to fill the pipeline and the time to drain it reduce speedup. [Figure: pipelined loads A, B, C, D in task order on a time axis starting at 6 PM, in 30-minute slots.]
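The lessons on the slowest stage and on fill/drain time can be illustrated numerically. The sketch below assumes a clocked pipeline in which every stage is given the time of the slowest stage; `pipeline_speedup` is a hypothetical helper, not something defined in the slides.

```python
def pipeline_speedup(stage_times, num_tasks):
    """Speedup of a pipeline over running num_tasks tasks one after another.

    Assumption: all stages advance in lockstep, so the cycle time equals
    the slowest stage.
    """
    sequential = num_tasks * sum(stage_times)
    cycle = max(stage_times)  # the slowest stage sets the rate
    pipelined = (len(stage_times) + num_tasks - 1) * cycle  # includes fill/drain
    return sequential / pipelined


balanced = [30, 30, 30, 30]
unbalanced = [30, 60, 30, 30]

print(pipeline_speedup(balanced, 4))       # few tasks: fill/drain time dominates
print(pipeline_speedup(balanced, 1000))    # many tasks: approaches the stage count
print(pipeline_speedup(unbalanced, 1000))  # one slow stage lowers the speedup
```

With many tasks the balanced case gets close to 4, the number of stages, while the unbalanced case stays well below it.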

Introduction. In a typical system, speed is achieved through parallelism at all levels: multi-user, multi-tasking, multi-processing, multi-programming, multi-threading, compiler optimizations. Pipelining is a technique for overlapping operations during execution. Different types of pipeline: instruction pipeline, operation pipeline, multi-issue pipelines.

What Is a Pipeline? It is like an automobile assembly line. It has several steps, also called stages or segments. Each segment performs a different instruction step or operation. The segments are connected to form a pipe. An instruction or operation enters through one end, progresses through the stages and exits through the other end. Pipelining is an implementation technique that exploits parallelism among the instructions in a sequential instruction stream.

Pipeline Criteria. Throughput: the number of items (cars, instructions, operations) that leave the pipeline per unit time.
– Ex: 1 inst/clock cycle, 10 cars/hour, 10 fp operations/cycle.
Stage time: the pipeline designer's goal is to balance the time of every pipeline stage (balanced pipeline).
– Ideally, stage time = time per instruction on the non-pipelined machine / number of stages.
– In most cases, stage time = max(times of all stages).
CPI: compared with a multi-cycle non-pipelined machine, pipelining reduces the cycles per instruction. At steady state one instruction completes per cycle, so the average time per instruction is approximately the stage time.

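A Machine Without Pipelines. The diagram shows the execution of three instructions, I1, I2 and I3, in sequence: each instruction starts only after the previous one has finished. [Figure: I1, I2, I3 placed end to end along the time axis; the total span is the execution time.]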

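A Machine With Pipelines. The diagram shows the execution of three instructions, I1, I2 and I3, which are overlapped: each instruction starts before the previous one has finished, so the total execution time is shorter. [Figure: I1, I2, I3 staggered along the time axis; the total span is the execution time.]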

Instruction Composition. A single instruction executing within a machine normally consists of a few segments, each of which performs a specific task. An example of an instruction which has 6 segments:
1) Fetch the instruction from memory. (IFetch)
2) Decode the instruction. (Dcd)
3) Calculate the effective address. (Eadd)
4) Fetch the operand from memory. (Mem)
5) Execute the instruction. (Exec)
6) Store the result in the proper location. (WB)

Space Time Diagram. A space-time diagram is normally used to represent the execution of instructions in a pipeline. In the example, six instructions (I1 to I6) execute and each instruction has five segments: IFetch, Dcd, Exec, Mem, WB. Each row of the diagram is one instruction, in program-flow order, and each row starts one time step later than the row above it. [Figure: six staggered rows of IFetch, Dcd, Exec, Mem, WB; vertical axis: program flow; horizontal axis: time.]
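A space-time diagram of this kind can be generated as text. The sketch below assumes the five segment names from the slide and an ideal pipeline with no stalls; `space_time_rows` is an illustrative helper name.

```python
STAGES = ["IFetch", "Dcd", "Exec", "Mem", "WB"]


def space_time_rows(num_instructions, stages=STAGES):
    """Return one text row per instruction, shifted one time step per row."""
    width = max(len(s) for s in stages) + 1  # column width for one time step
    rows = []
    for i in range(num_instructions):
        cells = [""] * i + list(stages)  # instruction i starts i steps late
        line = "".join(cell.ljust(width) for cell in cells).rstrip()
        rows.append(f"I{i + 1}: {line}")
    return rows


for row in space_time_rows(6):
    print(row)
```

Reading down any column shows which segment each instruction occupies at that time step; with six instructions and five segments the diagram spans 6 + 5 - 1 = 10 time steps.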

Pipeline Performance - Example 2. Assume the times for the five functional units of a pipeline are 10ns, 8ns, 10ns, 10ns and 7ns, with an overhead of 1ns per stage. Compute the speedup of the pipelined data path. Pipelined: stage time = MAX(10, 8, 10, 10, 7) + overhead = 10 + 1 = 11ns. This is the average instruction execution time at steady state. Non-pipelined: 10 + 8 + 10 + 10 + 7 = 45ns. Speedup = 45/11 = 4.1 times.
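The same calculation as a short script, using the five stage times and the 1ns overhead given in the example; the variable names are illustrative.

```python
stage_times_ns = [10, 8, 10, 10, 7]  # five functional units from the example
overhead_ns = 1                      # per-stage overhead

# Pipelined: one instruction completes per stage time at steady state.
pipelined_ns = max(stage_times_ns) + overhead_ns

# Non-pipelined: an instruction passes through every unit in turn.
non_pipelined_ns = sum(stage_times_ns)

speedup = non_pipelined_ns / pipelined_ns
print(pipelined_ns, non_pipelined_ns, round(speedup, 1))  # 11 45 4.1
```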

Pipeline Problems. Data dependency: a problem that arises when two adjacent instructions in the pipeline want to access a piece of data at the same memory location. Instruction branching: a problem that occurs when a branch has to be taken and the other pre-fetched instructions in the pipeline must be discarded.

Data Dependency. In this example, assume that an instruction consists of 4 segments, and that instruction one (I1) and instruction two (I2) access the same memory location. Both instructions are trying to access the same variable at the same time. [Figure: instruction execution of I1 and I2, each passing through segments FI, DA, FO, EX.]

Solutions to the Data Dependency Problem. Hardware interlocks (hardware-based). Operand forwarding (hardware-based). Delayed load (software-based).

Hardware Interlocks. The hardware detects the data dependency and delays the second instruction. [Figure: instruction execution of I1 and I2 through segments FI, DA, FO, EX, with I2 delayed.]

Operand Forwarding. The hardware detects the data dependency and forwards the result of the first instruction directly into the instruction that follows it, so the output of one offending instruction is sent straight to the next. [Figure: instruction execution of I1 and I2 through segments FI, DA, FO, EX.]

Delayed Load. The data dependency is detected when the program is compiled. The compiler then inserts a no-operation (NOP) instruction between the instructions involved. [Figure: instruction execution of I1, NOP and I2, each passing through segments FI, DA, FO, EX.]
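The compiler's role in delayed load can be sketched as a small pass over an instruction list. This is only an illustration under simplifying assumptions: it looks at adjacent instructions only, and it assumes one NOP is enough to separate them, as in the figure. The `Instr` record, the `LOAD`/`ADD` example and the register names are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    op: str
    dest: str               # register written ("" if none)
    srcs: Tuple[str, ...]   # registers read


NOP = Instr("NOP", "", ())


def insert_nops(program: List[Instr], gap: int = 1) -> List[Instr]:
    """Insert `gap` NOPs after an instruction whose result the next one reads."""
    out: List[Instr] = []
    for i, instr in enumerate(program):
        out.append(instr)
        nxt = program[i + 1] if i + 1 < len(program) else None
        if nxt is not None and instr.dest and instr.dest in nxt.srcs:
            out.extend([NOP] * gap)  # separate the dependent pair
    return out


program = [
    Instr("LOAD", "R1", ("A",)),       # I1 writes R1
    Instr("ADD", "R3", ("R1", "R2")),  # I2 reads R1
]
scheduled = insert_nops(program)
print([i.op for i in scheduled])  # ['LOAD', 'NOP', 'ADD']
```

A real compiler would usually try to fill the slot with a useful independent instruction before falling back to a NOP.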

Instruction Branching. What if instruction I1 is a branch to I3? What happens to I2? [Figure: instruction execution of I1, I2 and I3, each passing through segments FI, DA, FO, EX.]

Solutions for Branch Instructions.
– Pre-fetch target instruction
– Branch target buffer, a form of cache implementation (but for instructions)
– Loop buffer
– Branch prediction
– Delayed branch
Find out more details for these solutions.

Pipeline Operation. What makes it easy?
– All instructions are the same length
– There are only a few instruction formats
– Memory operands appear only in loads and stores
What makes it hard?
– Suppose there is only one memory (structural hazard)
– Branch instructions (control hazard)
– An instruction that depends on a previous instruction (data hazard)

Pipeline Hazards. Hazards reduce the performance of the pipeline from its ideal speedup. Structural hazard: a resource conflict. The hardware cannot support all the combinations of instructions that can occur in simultaneous overlapped execution. Data hazard: an instruction needs the result of a previous instruction. Control hazard: caused by branches and other instructions that change the PC.

Structural Hazards. When more than one instruction in the pipeline wants to access the same resource, the data path has a structural hazard. Examples of such resources: register file, memory, ALU. Solution: stall the pipeline for one clock cycle when the conflict is detected. The result is a pipeline bubble. [Figure: a memory access conflict and how it is resolved by stalling an instruction. Problem: one memory port.]

Data Hazard. Consider the instruction sequence:
ADD R1,R2,R3 ; result is in R1
SUB R4,R5,R1
AND R6,R1,R7
OR R8,R1,R9
XOR R10,R1,R11
All instructions after the first one use R1.
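The dependencies in this sequence can be found mechanically by tracking which instruction last wrote each register. The sketch below encodes the five instructions as (opcode, destination, sources) tuples, which is an assumed representation, and `raw_dependencies` is an illustrative helper. It reports read-after-write dependencies only; whether each one actually causes a stall depends on the pipeline depth and on forwarding.

```python
program = [
    ("ADD", "R1", ("R2", "R3")),   # result is in R1
    ("SUB", "R4", ("R5", "R1")),
    ("AND", "R6", ("R1", "R7")),
    ("OR", "R8", ("R1", "R9")),
    ("XOR", "R10", ("R1", "R11")),
]


def raw_dependencies(program):
    """Return (producer_index, consumer_index, register) triples.

    A triple is emitted when an instruction reads a register that an
    earlier instruction in the sequence wrote (read after write).
    """
    last_writer = {}
    deps = []
    for idx, (_op, dest, srcs) in enumerate(program):
        for reg in srcs:
            if reg in last_writer:
                deps.append((last_writer[reg], idx, reg))
        last_writer[dest] = idx  # record the write after checking the reads
    return deps


for producer, consumer, reg in raw_dependencies(program):
    print(f"{program[consumer][0]} reads {reg} written by {program[producer][0]}")
```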

Solution - Data Hazard. Data hazards are usually solved by data forwarding or register forwarding (also called bypassing or short-circuiting). How? The data read is not really needed in the instruction decode (ID) stage but in the next stage, the ALU. Forwarding works as follows: the ALU result from the EX/MEM buffer is always fed back to the ALU input latches. If the forwarding hardware detects that a source operand has a newer value there, the control logic selects that newer result rather than the value read from the register file.
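The selection step can be pictured as a multiplexer in front of the ALU input. The function below is a simplified software model, assuming a single forwarding path from the EX/MEM buffer; the function and parameter names are hypothetical and real forwarding logic handles more paths than this.

```python
def select_alu_operand(src_reg, regfile_value, ex_mem_dest, ex_mem_result):
    """Choose the ALU input for one source operand.

    If the instruction just ahead wrote `src_reg`, its result is still in
    the EX/MEM buffer, so use that instead of the stale register-file value.
    """
    if ex_mem_dest is not None and ex_mem_dest == src_reg:
        return ex_mem_result  # forwarded (bypassed) value
    return regfile_value      # no dependency: use the register file


# ADD R1,R2,R3 has just produced 42 in EX/MEM; SUB R4,R5,R1 is at the ALU.
print(select_alu_operand("R1", regfile_value=0, ex_mem_dest="R1", ex_mem_result=42))  # 42
print(select_alu_operand("R5", regfile_value=7, ex_mem_dest="R1", ex_mem_result=42))  # 7
```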

Pipeline Stalls. A stall is a delay, measured in cycles, caused by any of the hazards described above. Speedup = number of stages / (1 + pipeline stall cycles per instruction). So what is the speedup of an ideal pipeline with no stalls? The number of cycles needed to fill the pipeline can also be included when calculating the average stalls per instruction.
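The stall formula is easy to evaluate for a few cases. This is a minimal sketch; the five-stage pipeline and the 0.25 stalls per instruction are made-up inputs for illustration.

```python
def speedup_with_stalls(num_stages, stalls_per_instruction):
    """Speedup = number of stages / (1 + stall cycles per instruction)."""
    return num_stages / (1 + stalls_per_instruction)


print(speedup_with_stalls(5, 0))     # 5.0: no stalls, speedup equals the stage count
print(speedup_with_stalls(5, 0.25))  # 4.0: stalls cut into the ideal speedup
```

With zero stalls the formula reduces to the number of stages, which matches the earlier lesson that the potential speedup equals the number of pipeline stages.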