AIE Processor Concept



Sequential Processor Stages: Fetch, Decode, Execute, Mem, WB

Pipelined Processor Stages: Fetch, Decode, Execute, Mem, WB, with a pipeline block between consecutive stages

Transaction Table Five Stages Pipeline
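A transaction table for a five-stage pipeline can be sketched in a few lines of Python. This is only an illustrative sketch: the stage abbreviations follow the IF, ID, EX, MEM, WB naming used later in these slides, the function names `transaction_table` and `render` are hypothetical, and an ideal pipeline with no stalls is assumed.

```python
# Stage names as abbreviated later in the slides (IF, ID, EX, MEM, WB).
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def transaction_table(n_instructions):
    """Return one row per instruction and one column per clock cycle.

    Assumes an ideal pipeline: instruction i enters IF in cycle i and
    advances one stage per cycle, with no stalls.
    """
    n_cycles = n_instructions + len(STAGES) - 1
    table = []
    for i in range(n_instructions):
        row = [""] * n_cycles
        for s, name in enumerate(STAGES):
            row[i + s] = name
        table.append(row)
    return table


def render(table):
    """Format the table as fixed-width text, one instruction per line."""
    lines = []
    for i, row in enumerate(table, start=1):
        cells = " ".join(f"{cell:<4}" for cell in row)
        lines.append(f"I{i}: {cells}".rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(transaction_table(4)))
```

Each row is shifted one cycle to the right of the previous one, so n instructions finish in n + 4 cycles instead of 5n.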

Pipelining Design As Queue
- Problems:
  - High circuit complexity.
  - If the queue in a stage is full, the previous stage must halt until the queue releases an item, so there is no great benefit.
- Implementation:
  - Shift register circuit & registers [wastes cycles]
  - Counter & registers [saves cycles]

Shift Register Circuit & Registers

Counter & Registers Pipeline

Pipeline Optimal Designs: Sync Pipeline
- All pipeline modules are attached to the same cycle controller.
- Cycle time = max stage clock.
- Problems:
  - Some clock time is wasted, but not too much.
  - No stage is aware of the status of the previous stage.

Pipeline Optimal Designs: Async Pipeline
- Every stage is aware of the status of the previous stage through internal handshaking signals (Ready, Acknowledge).
- Advantages:
  - No clock is wasted, thanks to the handshaking signals.
  - There is no max cycle clock; every instruction takes only the clocks it needs to perform its operation.
- Disadvantages:
  - The control unit must specify the timing of every instruction in every stage of the pipelined processor.

Pipeline Optimal Designs: Sync Pipeline & Async Pipeline
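The difference between the two designs can be illustrated with a small timing model. This is a sketch under stated assumptions, not the AIE implementation: the per-stage clock counts are made-up inputs, the sync model charges every stage the largest clock count, and the async model assumes a simple blocking handshake in which a stage hands its instruction over only when the next stage is free.

```python
def sync_total_clocks(durations):
    """Total clocks for a synchronous pipeline.

    durations[j][i] is the number of clocks instruction j needs in stage i
    (hypothetical input data). Every stage slot is as long as the slowest
    stage, i.e. cycle time = max stage clock.
    """
    n = len(durations)
    m = len(durations[0])
    cycle = max(max(row) for row in durations)
    return (m + n - 1) * cycle


def async_total_clocks(durations):
    """Total clocks for a handshaking (asynchronous) pipeline.

    Assumed model: an instruction leaves stage i once its work is done AND
    the previous instruction has left stage i + 1 (Ready/Acknowledge).
    """
    m = len(durations[0])
    prev_leave = [0] * m  # leave times of the previous instruction
    for row in durations:
        leave = [0] * m
        for i in range(m):
            # Enter stage 0 when the previous instruction has left it;
            # enter later stages on leaving the stage before.
            entered = prev_leave[0] if i == 0 else leave[i - 1]
            done = entered + row[i]
            if i + 1 < m:
                leave[i] = max(done, prev_leave[i + 1])
            else:
                leave[i] = done
        prev_leave = leave
    return prev_leave[-1]


if __name__ == "__main__":
    # Two instructions; the first needs 4 clocks in its third stage.
    work = [[1, 1, 4, 1, 1], [1, 1, 1, 1, 1]]
    print(sync_total_clocks(work), async_total_clocks(work))
```

With uniform stage times both models give the same total; the async model only wins when some instructions need fewer clocks than the worst case.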

Sync Pipeline Implementation

Key Features of the AIE Processor
- 32-bit pipelined processor
- Supports 48 instructions
- Interfaces with interleaved memory
- Interfaces with an LCD terminal using instructions
- Has its own assembly interpreter

Instructions

IR Register (32 bit): 8 bit Instructions ROM address, 24 bit modes select. Instructions ROM: address bus 8 bit, data bus 32 bit.

ROM CONTROLS
CMP reg1,reg2 -3e
IPNT immediate -4c
RPNT reg -2c
STOWE address,reg -4c
STOWO address,reg -4c
STODW address,reg -4c
LODWE reg,address -4c
LODWO reg,address -4c1d
LODDW reg,address -4c
JG address -8c
JE address -8c
JL address -8cc
JC address -8d
JNG address -8c
JNE address -8c
JNL address -8cc
JNC address -8d
JMP address -8d
NOP
MOV reg,immediate -8c
ADD d.reg,s1.reg,s2.reg
ADC d.reg,s1.reg,s2.reg
SUB d.reg,s1.reg,s2.reg
SUW d.reg,s1.reg,s2.reg
MUL d.reg,s1.reg,s2.reg
DIV d.reg,s1.reg,s2.reg -2a
TRSA d.reg,s1.reg -2c
TRSB d.reg,s2.reg -2e
AND d.reg,s1.reg,s2.reg
OR d.reg,s1.reg,s2.reg
NAND d.reg,s1.reg,s2.reg
NOR d.reg,s1.reg,s2.reg
XOR d.reg,s1.reg,s2.reg
XNOR d.reg,s1.reg,s2.reg -3a
NOT d.reg,s1.reg -3c007000

Main Modes: Immediate Mode, Register-Register Mode, Memory Mode

IMMEDIATE MODE
IR Register: ROM address (8 bit), REG address (5 bit + 3 bit), IMMEDIATE (16 bit)
Instructions:
*MOV reg,immediate -8c
*JG address -8c
*JE address -8c
*JL address -8cc00000
*JC address -8d
*JNG address -8c
*JNE address -8c
*JNL address -8cc00800
*JNC address -8d
*JMP address -8d400000

REGISTER-REGISTER MODE
IR Register: ROM address (8 bit), Source_REG2 (5 bit + 3 bit), Source_REG1 (5 bit + 3 bit), Destination_REG (5 bit + 3 bit)
Instructions:
*ADD d.reg,s1.reg,s2.reg
*ADC d.reg,s1.reg,s2.reg
*SUB d.reg,s1.reg,s2.reg
*SUW d.reg,s1.reg,s2.reg
*MUL d.reg,s1.reg,s2.reg
*DIV d.reg,s1.reg,s2.reg -2a
*TRSA d.reg,s1.reg -2c
*TRSB d.reg,s2.reg -2e
*AND d.reg,s1.reg,s2.reg
*OR d.reg,s1.reg,s2.reg
*NAND d.reg,s1.reg,s2.reg
*NOR d.reg,s1.reg,s2.reg
*XOR d.reg,s1.reg,s2.reg
*XNOR d.reg,s1.reg,s2.reg -3a
*NOT d.reg,s1.reg -3c
*CMP reg1,reg2 -3e002000

INDIRECT ADDRESSING MODE
IR Register: ROM address (8 bit), Source_REG2 (5 bit + 3 bit), Source_REG1 (5 bit + 3 bit), Destination_REG (5 bit + 3 bit)
Instructions:
*IDSTOWE address -2c
*IDSTOWO address -2c
*IDSTODW address -2c
*IDLODWE address -2c
*IDLODWO address -2c1c7000
*IDLODDW address -2c

MEMORY MODE
IR Register: ROM address (8 bit), IMMEDIATE (16 bit), REG address (3 bit + 5 bit)
Instructions:
*STOWE address,reg -4c
*STOWO address,reg -4c
*STODW address,reg -4c
*LODWE reg,address -4c
*LODWO reg,address -4c1d5000
*INC reg,immediate
*DEC reg,immediate
*LODDW reg,address -4c
*IPNT immediate -4c
*PUSHWE reg -4c
*PUSHWO reg -4c
*PUSHDW reg -4c
*POPWE reg -4c
*POPWO reg -4c1d5600
*POPDW reg -4c395600

Instruction Set
(1) B31, B30, B29: Select Mode {B31: Immediate Mode, B30: Memory Mode, B29: Register-Register Mode}
(2) B28, B27, B26, B25: Execution Control
(3) B24, B23, B22: Execution Conditional Control
(4) B21, B20, B19, B18: Memory Control {B21: BHE, B20: Select Memory, B19: Memory R/W, B18: Memory Even/Odd}
(5) B17: Select Write Back Block or TTY Block
(6) B16: Select the input of the Write Back Block from ALU result or memory output
(7) B15: No Operation
(8) B14, B13, B12: Register File Control {B14: Write Register, B13: OE Register, B12: Enable Write Select Register}
(9) B11: Invert Condition
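The bit assignments above can be unpacked with a few shifts and masks. The sketch below is illustrative only: the function name and dictionary keys are hypothetical, and it assumes that the eight-digit hex strings printed next to some instructions (such as 8cc00000 for JL and 3e002000 for CMP) are 32-bit words laid out as B31 down to B0. Bits B10 to B0 are not described in the slides and are ignored.

```python
def decode_control_word(word):
    """Split a 32-bit word into the fields B31..B11 listed in the slide.

    Field names are illustrative; the bit positions come from the slide.
    """
    def bit(n):
        return (word >> n) & 1

    def bits(hi, lo):
        return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

    return {
        # (1) Select Mode
        "immediate_mode": bit(31),
        "memory_mode": bit(30),
        "register_register_mode": bit(29),
        # (2) Execution Control, (3) Execution Conditional Control
        "execution_control": bits(28, 25),
        "execution_conditional_control": bits(24, 22),
        # (4) Memory Control
        "bhe": bit(21),
        "select_memory": bit(20),
        "memory_rw": bit(19),
        "memory_even_odd": bit(18),
        # (5) to (7)
        "select_wb_or_tty": bit(17),
        "wb_input_select": bit(16),
        "no_operation": bit(15),
        # (8) Register File Control
        "write_register": bit(14),
        "oe_register": bit(13),
        "enable_write_select_register": bit(12),
        # (9) Invert Condition
        "invert_condition": bit(11),
    }


if __name__ == "__main__":
    # Assumption: the hex string shown beside JL is a full control word.
    for name, value in decode_control_word(0x8CC00000).items():
        print(f"{name}: {value}")
```

Under that assumption, the codes starting with 8 select Immediate Mode, those starting with 4 select Memory Mode, and those starting with 2 or 3 select Register-Register Mode, which matches how the instructions are grouped in the mode slides.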

Tracing Some Instructions

MIPS Architecture based

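For example, executing these two instructions sequentially: I1: R1 = R2 + R3; I2: R4 = R2 AND R1

I1: Fetching. I2: Still in memory. In pipeline: I1.

I1: Decoding & RegFetch (R2, R3). I2: Fetching. In pipeline: I2, I1.

I1: Execute (R2 + R3). I2: Decoding & RegFetch (R2, R1). In pipeline: I2, I1.

I1: MEM [no operation] (R2 + R3). I2: Execute (R2 AND R1). In pipeline: I2, I1.

I1: Write Back, R1 = (R2 + R3), data stored in R1. I2: MEM [no operation] (R2 AND R1). In pipeline: I2, I1. I2 read R1 during its Decoding & RegFetch step, before I1 wrote the result back, so I2 computed with a stale R1.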

Solution: insert NOP between the two instructions: I1: R1 = R2 + R3; NOP; I2: R4 = R2 AND R1

I1: Fetching. I2: Still in memory. In pipeline: I1.

I1: Decoding & RegFetch (R2, R3). I2: Still in memory. In pipeline: I1, NOP.

I1: Execute (R2 + R3). I2: Still in memory. In pipeline: NOP, I1, NOP.

I1: MEM [no operation] (R2 + R3). I2: Still in memory. In pipeline: NOP, I1, NOP.

I1: Write Back, R1 = (R2 + R3), data stored in R1. I2: Fetching. In pipeline: I1, NOP, I2.

I1: Terminated. I2: Decoding & RegFetch (R2, R1). In pipeline: I2, NOP.

I1: Terminated. I2: Execute (R2 AND R1). In pipeline: I2, NOP.

I1: Terminated. I2: MEM [no operation] (R2 AND R1). In pipeline: I2, NOP.

I1: Terminated. I2: Terminated.
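The NOP-based fix traced above can be sketched as a small pass over the instruction list. This is a hedged illustration, not the AIE assembler: the tuple layout `(text, destination, sources)`, the function name `insert_nops`, and the `nops_needed` parameter are all assumptions. How many NOPs are really required depends on when the register file is written and read in the actual design, so the count is left as a parameter.

```python
# Each instruction is (text, destination register or None, source registers).
NOP = ("NOP", None, ())


def insert_nops(program, nops_needed=1):
    """Pad read-after-write dependences with NOPs.

    If an instruction reads a register written by one of the previous
    `nops_needed` issued slots, enough NOPs are inserted so that
    `nops_needed` slots separate the writer from the reader.
    """
    out = []
    for instr in program:
        sources = instr[2]
        pad = 0
        for k in range(1, nops_needed + 1):
            if k <= len(out):
                prev_dest = out[-k][1]
                if prev_dest is not None and prev_dest in sources:
                    pad = max(pad, nops_needed - k + 1)
        out.extend([NOP] * pad)
        out.append(instr)
    return out


if __name__ == "__main__":
    program = [
        ("R1=R2+R3", "R1", ("R2", "R3")),
        ("R4=R2 AND R1", "R4", ("R2", "R1")),
    ]
    for text, _, _ in insert_nops(program, nops_needed=1):
        print(text)
```

Instructions with no dependence on their immediate predecessors pass through unchanged, so the cost of this approach is only paid where a hazard exists.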

Statistics & Comparisons

CISC vs RISC
CISC:
- Richer instruction set but very complex circuit.
- Instructions generally take more than 1 clock to execute.
- Instructions of variable size.
RISC:
- Instructions execute in one clock cycle.
- Uniform-length instructions and a fixed instruction format.
- Simple instructions and circuit.

Speed
With pipelining: 5 stages (IF, ID, EX, MEM, WB), each taking 4 clock cycles. At a clock rate of 5 MHz, the time to perform an instruction per pipeline stage is 0.8 µsec.
Without pipelining: at a clock rate of 5 MHz, the time to perform an instruction is 4 µsec.
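The two timing figures follow directly from the clock rate. The short sketch below reproduces the arithmetic; the function names are hypothetical, and it assumes that an unpipelined instruction simply passes through all five stages back to back.

```python
def stage_time_us(clock_mhz, clocks_per_stage):
    """Time spent in one pipeline stage, in microseconds.

    One clock at f MHz lasts 1/f microseconds.
    """
    return clocks_per_stage / clock_mhz


def sequential_instruction_time_us(clock_mhz, clocks_per_stage, n_stages):
    """Time for one instruction when the stages run one after another."""
    return n_stages * stage_time_us(clock_mhz, clocks_per_stage)


if __name__ == "__main__":
    # Figures from the slide: 5 MHz clock, 4 clocks per stage, 5 stages.
    print(stage_time_us(5, 4))                      # per stage
    print(sequential_instruction_time_us(5, 4, 5))  # without pipelining
```

At 5 MHz one clock lasts 0.2 µsec, so a 4-clock stage takes 0.8 µsec and five such stages take 4 µsec.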

MOV r1,05h
MOV r2,04h
ADD r3,r1,r2
STODW r3,1234h
[Pipelining diagram: the four instructions passing through IF, ID, EX, MEM, WB, with NOP slots]

Average number of stall cycles per instruction is 0.75. Speedup is 2.85.
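The slides do not state how the speedup figure was obtained. One common textbook estimate, speedup = pipeline depth / (1 + stall cycles per instruction), gives a value consistent with it. The sketch below applies that formula as an assumption; it presumes an ideal CPI of 1 and equally long stages, and the function name is hypothetical.

```python
def pipeline_speedup(depth, stalls_per_instruction):
    """Textbook speedup estimate for a pipeline with stalls.

    Assumes an ideal CPI of 1 and balanced stages, so the only loss
    relative to the ideal speedup (= depth) comes from stall cycles.
    """
    return depth / (1.0 + stalls_per_instruction)


if __name__ == "__main__":
    # 5 stages and 0.75 stall cycles per instruction, as on the slide.
    print(round(pipeline_speedup(5, 0.75), 3))
```

With a depth of 5 and 0.75 stalls per instruction this evaluates to 5 / 1.75, about 2.857, which the slide reports as 2.85. For the four-instruction program, 0.75 stalls per instruction corresponds to 3 stall cycles in total.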

Thank you