C SINGH, JUNE 7-8, 2010, IWW 2010, ISTANBUL, TURKEY. Advanced Computers Architecture, UNIT 1. Advanced Computers Architecture, Lecture 4. By Rohit Khokher, Department.


1 C SINGH, JUNE 7-8, 2010, IWW 2010, ISTANBUL, TURKEY. Advanced Computers Architecture, UNIT 1
Advanced Computers Architecture, Lecture 4
By Rohit Khokher, Department of Computer Science, Sharda University, Greater Noida, India

2 High Performance Architectures
- Who needs high-performance systems?
- How do you achieve high performance?
- How do you analyze or evaluate performance?

3 Outline of my lecture
- Classification
- ILP architectures
- Data-parallel architectures
- Process-level parallel architectures
- Issues in parallel architectures
- Cache coherence problem
- Interconnection networks

4 Classification of Parallel Computing
- Flynn’s Classification
- Feng’s Classification
- Händler’s Classification
- Modern (Sima, Fountain & Kacsuk) Classification

5 Feng’s Classification
- Feng [1972] proposed a scheme that classifies computer architectures by their degree of parallelism.
- The maximum number of bits that the system can process per unit of time is called the "maximum degree of parallelism".
- Feng’s scheme distinguishes sequential and parallel operation at the bit level and at the word level.
- The four types in Feng’s classification are:
  - WSBS (Word Serial Bit Serial)
  - WPBS (Word Parallel Bit Serial), e.g. STARAN
  - WSBP (Word Serial Bit Parallel), e.g. conventional computers
  - WPBP (Word Parallel Bit Parallel), e.g. ILLIAC IV

6 [Figure: Feng’s classification chart. Axes: word length (ticks 1, 16, 32, 64) and bit slice length (ticks 1, 16, 64, 256, 16K). Machines plotted: MPP, STARAN, C.mmP, PDP-11, IBM 370, ILLIAC IV, CRAY-1.]
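Feng’s four classes can be sketched as a small function of two numbers: the word length n (bits processed in parallel within a word) and the bit slice length m (words processed in parallel). This is a minimal sketch, not part of the lecture. The function names are hypothetical, the product n * m is used as the maximum degree of parallelism, and the (n, m) pairs below are commonly quoted textbook values that are assumed here rather than read off the slide.

```python
# (word length n, bit slice length m) per machine.
# ASSUMPTION: commonly quoted textbook values, not taken from the slide.
MACHINES = {
    "STARAN": (1, 256),
    "PDP-11": (16, 1),
    "ILLIAC IV": (64, 64),
}


def feng_class(n, m):
    """Return Feng's class for word length n and bit slice length m."""
    word = "WP" if m > 1 else "WS"   # more than one word at a time -> word parallel
    bit = "BP" if n > 1 else "BS"    # more than one bit of a word at a time -> bit parallel
    return word + bit


def max_degree_of_parallelism(n, m):
    """Maximum number of bits processed per unit of time, taken as n * m."""
    return n * m


for name, (n, m) in MACHINES.items():
    print(name, feng_class(n, m), max_degree_of_parallelism(n, m))
```

Under these assumed values the three machines fall into the same classes the slide assigns them: STARAN is WPBS, a conventional machine such as the PDP-11 is WSBP, and ILLIAC IV is WPBP.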

7 Modern Classification
Parallel architectures
- Data-parallel architectures
- Function-parallel architectures

8 Data Parallel Architectures
Data-parallel architectures
- Vector architectures
- Associative and neural architectures
- SIMDs
- Systolic architectures

9 Function Parallel Architectures
Function-parallel architectures
- Instruction-level parallel architectures (ILPs)
  - Pipelined processors
  - VLIWs
  - Superscalar processors
- Thread-level parallel architectures
- Process-level parallel architectures (MIMDs)
  - Distributed memory MIMD
  - Shared memory MIMD

10 Motivation
- Non-pipelined design
  - Single-cycle implementation
    - The cycle time depends on the slowest instruction
    - Every instruction takes the same amount of time
  - Multi-cycle implementation
    - Divide the execution of an instruction into multiple steps
    - Each instruction may take a variable number of steps (clock cycles)
- Pipelined design
  - Divide the execution of an instruction into multiple steps (stages)
  - Overlap the execution of different instructions in different stages
  - In each cycle, a different instruction is executed in each stage
  - For example, in a 5-stage pipeline (Fetch-Decode-Read-Execute-Write):
    - 5 instructions are executed concurrently in 5 different pipeline stages
    - One instruction completes every cycle (instead of every 5 cycles) once the pipeline is full
    - This can increase the throughput of the machine by up to 5 times
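The "up to 5 times" figure is a limit, which a short calculation makes visible. The sketch below assumes an ideal k-stage pipeline with no stalls and equal-length stages. The function names are illustrative, not from the lecture.

```python
def non_pipelined_cycles(num_instrs, num_stages):
    """Each instruction passes through every stage before the next one starts."""
    return num_instrs * num_stages


def pipelined_cycles(num_instrs, num_stages):
    """Start-up latency (num_stages - 1) plus one completion per cycle afterwards."""
    return (num_stages - 1) + num_instrs


def speedup(num_instrs, num_stages):
    """Ratio of non-pipelined to pipelined cycle counts for the same program."""
    return non_pipelined_cycles(num_instrs, num_stages) / pipelined_cycles(
        num_instrs, num_stages
    )


for n in (5, 100, 10_000):
    print(n, round(speedup(n, 5), 3))
```

For short programs the start-up latency dominates, so the speedup is well below 5. It approaches the number of stages only as the instruction count grows.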

11 Example of Pipeline
5-stage pipeline: Fetch (F), Decode (D), Read (R), Execute (E), Write (W)

Cycle             1 2 3 4 5 6 7 8 9
LD  R1 <- A       F D R E W
ADD R5, R3, R4      F D R E W
LD  R2 <- B           F D R E W
SUB R8, R6, R7          F D R E W
ST  C <- R5               F D R E W

The first four cycles fill the pipeline and the last four drain it.
- Non-pipelined processor: 25 cycles = number of instructions (5) * number of stages (5)
- Pipelined processor: 9 cycles = start-up latency (4) + number of instructions (5)
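The staggered timing of this example can be generated mechanically. The following is a minimal sketch that assumes an ideal pipeline where instruction i enters Fetch in cycle i + 1 and never stalls. The helper name `ideal_pipeline_chart` is hypothetical.

```python
STAGES = "FDREW"  # Fetch, Decode, Read, Execute, Write

PROGRAM = [
    "LD  R1 <- A",
    "ADD R5, R3, R4",
    "LD  R2 <- B",
    "SUB R8, R6, R7",
    "ST  C <- R5",
]


def ideal_pipeline_chart(instrs, stages=STAGES):
    """Return (rows, total_cycles) for a stall-free pipeline."""
    total = (len(stages) - 1) + len(instrs)   # start-up latency + one per instruction
    width = max(len(text) for text in instrs)
    rows = []
    for i, text in enumerate(instrs):
        cells = ["."] * total                 # "." marks a cycle with no activity
        for s, stage in enumerate(stages):
            cells[i + s] = stage              # instruction i is in stage s at cycle i + s + 1
        rows.append(f"{text:<{width}}  {' '.join(cells)}")
    return rows, total


rows, total = ideal_pipeline_chart(PROGRAM)
print("\n".join(rows))
print("total cycles:", total)
```

Each row is shifted one cycle to the right of the previous one, so the chart is as wide as the start-up latency plus the instruction count.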

12 Data Dependence
- Read-After-Write (RAW) dependence
  - True dependence
  - The consumer must read the data after the producer has written it
- Write-After-Write (WAW) dependence
  - Output dependence
  - The result of a later instruction must not be overwritten by an earlier instruction
- Write-After-Read (WAR) dependence
  - Anti-dependence
  - A later instruction must not overwrite a value before its consumer has read it
- Notes
  - WAW and WAR are called false dependences; they arise from storage conflicts
  - All three types of dependences can occur for both registers and memory locations
  - Dependences are characteristics of programs (not machines)
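The three dependence types can be found with one pass over a program. The sketch below makes several assumptions: each instruction is encoded as a hypothetical (destination, sources) pair, memory locations such as A are treated like registers, and only the nearest conflicting instruction is reported. It is applied to the six-instruction Example 1 program from this lecture.

```python
# Example 1, encoded as (destination, sources); the encoding is an assumption.
EXAMPLE_1 = [
    ("R1", ["A"]),         # 1: LD   R1 <- A
    ("R2", ["B"]),         # 2: LD   R2 <- B
    ("R3", ["R1", "R2"]),  # 3: MULT R3, R1, R2
    ("R4", ["R3", "R2"]),  # 4: ADD  R4, R3, R2
    ("R3", ["R3", "R4"]),  # 5: SUB  R3, R3, R4
    ("A", ["R3"]),         # 6: ST   A <- R3
]


def find_dependences(program):
    """Return sorted (earlier, later) instruction pairs for RAW, WAW and WAR."""
    raw, waw, war = [], [], []
    last_writer = {}   # location -> number of the latest instruction writing it
    readers = {}       # location -> instructions that read it since its last write
    for j, (dest, srcs) in enumerate(program, start=1):
        srcs = list(dict.fromkeys(srcs))          # drop repeated source operands
        for src in srcs:
            if src in last_writer:
                raw.append((last_writer[src], j))  # j reads what an earlier instruction wrote
        if dest in last_writer:
            waw.append((last_writer[dest], j))     # j rewrites an earlier result
        for i in readers.get(dest, []):
            war.append((i, j))                     # j overwrites what i still had to read
        for src in srcs:
            readers.setdefault(src, []).append(j)
        last_writer[dest] = j
        readers[dest] = []                         # the old value is gone after this write
    return sorted(raw), sorted(waw), sorted(war)


raw, waw, war = find_dependences(EXAMPLE_1)
print("RAW:", raw)
print("WAW:", waw)
print("WAR:", war)
```

Because the analysis looks only at which locations each instruction reads and writes, it reflects the point that dependences belong to the program, not to the machine that runs it.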

13 Example
Example 1
1 LD R1 <- A
2 LD R2 <- B
3 MULT R3, R1, R2
4 ADD R4, R3, R2
5 SUB R3, R3, R4
6 ST A <- R3

- RAW dependences: 1 -> 3, 2 -> 3, 2 -> 4, 3 -> 4, 3 -> 5, 4 -> 5, 5 -> 6
- WAW dependence: 3 -> 5
- WAR dependences: 4 -> 5, 1 -> 6 (memory location A)

[Figure: pipeline timing diagram for the six instructions. Instructions 1 and 2 run F D R E W without stalls; from instruction 3 onward, stages are repeated while an instruction waits for its operands.]
- Pipeline bubbles are due to RAW dependences (data hazards)
- Execution time: 18 cycles = start-up latency (4) + number of instructions (6) + number of pipeline bubbles (8)
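The 18-cycle figure can be reproduced with a small in-order pipeline model. This is a sketch under stated assumptions, not the lecture's own method: there is no forwarding, an instruction may finish its Read stage only in a cycle after every producer of its operands has been in Write, each stage holds one instruction at a time, and memory location A is tracked like a register. The (destination, sources) encoding and the function name `schedule` are hypothetical.

```python
STAGES = "FDREW"
READ = STAGES.index("R")
LAST = len(STAGES) - 1

EXAMPLE_1 = [
    ("R1", ["A"]),         # 1: LD   R1 <- A
    ("R2", ["B"]),         # 2: LD   R2 <- B
    ("R3", ["R1", "R2"]),  # 3: MULT R3, R1, R2
    ("R4", ["R3", "R2"]),  # 4: ADD  R4, R3, R2
    ("R3", ["R3", "R4"]),  # 5: SUB  R3, R3, R4
    ("A", ["R3"]),         # 6: ST   A <- R3
]


def schedule(program):
    """Return, per instruction, the cycle in which it enters each stage."""
    write_cycle = {}   # location -> cycle in which its latest producer is in W
    prev = None        # stage-entry cycles of the previous instruction
    result = []
    for dest, srcs in program:
        enter = []
        ready = 1      # earliest cycle this instruction could enter the next stage
        for s in range(len(STAGES)):
            if prev is None:
                start = ready
            elif s < LAST:
                # Stage s frees up in the cycle the previous instruction moves on.
                start = max(ready, prev[s + 1])
            else:
                start = max(ready, prev[s] + 1)
            enter.append(start)
            done = start
            if s == READ:
                # Stall in R until every operand has been written back.
                for src in srcs:
                    if src in write_cycle:
                        done = max(done, write_cycle[src] + 1)
            ready = done + 1
        write_cycle[dest] = enter[LAST]
        prev = enter
        result.append(enter)
    return result


timing = schedule(EXAMPLE_1)
total = timing[-1][LAST]
bubbles = total - ((len(STAGES) - 1) + len(EXAMPLE_1))
print("total cycles:", total, "bubbles:", bubbles)
```

In this model each of instructions 3 to 6 waits two extra cycles for the instruction before it, which accounts for the 8 bubbles. A design with forwarding would stall less, so the count depends on the pipeline assumed.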

