PZ13A - Processor design. Programming Language Design and Implementation (4th Edition), Copyright © Prentice Hall.


PZ13A - Processor design
Programming Language Design and Implementation (4th Edition)
by T. Pratt and M. Zelkowitz
Prentice Hall, 2001, Section 11.3

Traditional processor design

Ways to speed up execution

To increase processor speed, increase the functionality of the processor - add more complex instructions (CISC - Complex Instruction Set Computers). But complex instructions need more cycles to execute. Chapter 2 assumes 2 cycles per instruction:
- fetch instruction
- execute instruction
How many cycles does a complex instruction really need?
- fetch instruction
- decode instruction
- decode operands
- fetch operand registers
- perform operation
- decode resulting address
- move data to store register
- store result
→ 8 cycles per instruction, not 2
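As a rough illustration (a toy model with one cycle per micro-step, not the textbook's actual machine), the cost of executing instructions sequentially can be sketched as:

```python
# Toy model of the micro-steps listed above, assuming each micro-step
# costs one cycle. The step names follow the slide; the machine model
# itself is an illustrative assumption.

CISC_STEPS = [
    "fetch instruction",
    "decode instruction",
    "decode operands",
    "fetch operand registers",
    "perform operation",
    "decode resulting address",
    "move data to store register",
    "store result",
]

def run_sequential(program, steps=CISC_STEPS):
    """Execute instructions one after another; each micro-step takes a cycle."""
    total_cycles = 0
    for _instr in program:
        total_cycles += len(steps)   # 8 cycles per instruction, not 2
    return total_cycles

print(run_sequential(["LOAD", "ADD", "STORE"]))  # 3 instructions -> 24 cycles
```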

Alternative - RISC

Have simple instructions, each of which executes in one cycle (RISC - Reduced Instruction Set Computer). Speed: one cycle for each operation, but more operations. For example, A = B + C:

CISC: 3 instructions
- Load Register 1, B
- Add Register 1, C
- Store Register 1, A

RISC: 10 instructions
- Address of B in read register
- Read B
- Move data to Register 1
- Address of C in read register
- Read C
- Move data to Register 2
- Add Register 1, Register 2, Register 3
- Move Register 3 to write register
- Address of A in write register
- Write A to memory

Aspects of RISC design

- Single-cycle instructions.
- Large register set - often more than 100 registers.
- Fast procedure invocation - activation record creation is part of the hardware; activation records are kept entirely within registers.

Implications

You cannot directly compare the raw processor speeds of a RISC and a CISC processor:
- CISC - perhaps 8-10 cycles per instruction
- RISC - 1 cycle per instruction
A CISC processor can also do some of these operations in parallel, and a RISC program needs more instructions for the same task.
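A back-of-the-envelope comparison using the A = B + C example and the slides' approximate cycle counts (illustrative figures only; a real CISC overlaps some micro-steps, so the true gap is smaller):

```python
# Rough cycle-count comparison for A = B + C, using the slides' figures.
cisc_instructions, cisc_cycles_per_instruction = 3, 8
risc_instructions, risc_cycles_per_instruction = 10, 1

print("CISC:", cisc_instructions * cisc_cycles_per_instruction, "cycles")  # 24
print("RISC:", risc_instructions * risc_cycles_per_instruction, "cycles")  # 10
```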

Pipeline architecture

CISC design:
1. Retrieve instruction from main memory.
2. Decode operation field, source data, and destination data.
3. Get source data for operation.
4. Perform operation.

Pipeline design: while Instruction 1 is performing its operation,
- Instruction 2 is retrieving its source data,
- Instruction 3 is being decoded, and
- Instruction 4 is being retrieved from memory.
Four instructions are in progress at once, with one instruction completing each cycle.
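A minimal sketch of the four-stage pipeline described above, assuming one cycle per stage and no hazards (the stage names follow the slide; everything else is a simplifying assumption):

```python
# Show which instruction occupies each pipeline stage in each cycle.
def pipeline_timeline(num_instructions,
                      stages=("fetch", "decode", "get data", "execute")):
    depth = len(stages)
    timeline = []
    for cycle in range(num_instructions + depth - 1):
        row = {}
        for s, stage in enumerate(stages):
            i = cycle - s          # instruction i enters at cycle i, moves one stage per cycle
            if 0 <= i < num_instructions:
                row[stage] = f"I{i + 1}"
        timeline.append(row)
    return timeline

for cycle, row in enumerate(pipeline_timeline(6), start=1):
    print(f"cycle {cycle}: {row}")
# Once the pipeline is full (cycle 4 onward), one instruction completes per cycle.
```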

Impact on language design

With a standard CISC design, the statement E = A + B + C + D has the postfix E A B + C + D + = and will execute as follows:
1. Add A to B, giving a sum.
2. Add C to the sum.
3. Add D to the sum.
4. Store the sum in E.
But Instruction 2 cannot retrieve the sum (the result of adding A to B) until the previous instruction stores that result. This causes the processor to wait a cycle on Instruction 2 until Instruction 1 completes its execution. A more intelligent translator would develop the postfix E A B + C D + + =, which allows A + B to be computed in parallel with C + D.
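A minimal sketch of why the second postfix form pipelines better, assuming a one-cycle result latency (an operation stalls only when it reads the result of the operation immediately before it); the temporaries t1..t3 and the helper are illustrative, not the book's notation:

```python
# Count one-cycle stalls: an op stalls if it uses the result of the op
# immediately preceding it in the schedule.
def count_stalls(ops):
    """ops: list of (result_name, set_of_input_names)."""
    stalls = 0
    for (prev_result, _), (_, cur_inputs) in zip(ops, ops[1:]):
        if prev_result in cur_inputs:
            stalls += 1
    return stalls

# E A B + C + D + =  : each step feeds directly into the next
left_to_right = [("t1", {"A", "B"}), ("t2", {"t1", "C"}),
                 ("t3", {"t2", "D"}), ("E",  {"t3"})]

# E A B + C D + + =  : A+B and C+D are independent
reassociated  = [("t1", {"A", "B"}), ("t2", {"C", "D"}),
                 ("t3", {"t1", "t2"}), ("E", {"t3"})]

print(count_stalls(left_to_right))  # 3 stalls
print(count_stalls(reassociated))   # 2 stalls
```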

Further optimization

E = A + B + C + D
J = F + G + H + I
has the combined postfix A B + F G + C D + H I + (1)(3)+ (2)(4)+ E(5)= J(6)= [where (n) indicates the result of the n-th operation]. In this case each statement executes with no interference from the other, and the processor executes at the full speed of the pipeline.
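Under the same simplified one-cycle-latency model as the previous sketch, the interleaved schedule contains no operation that reads the result of the operation immediately before it:

```python
# Interleaved schedule for E = A+B+C+D and J = F+G+H+I, checked with the same
# "stall if an op reads its immediate predecessor's result" rule as above.
def count_stalls(ops):
    return sum(1 for (res, _), (_, ins) in zip(ops, ops[1:]) if res in ins)

interleaved = [("t1", {"A", "B"}), ("t2", {"F", "G"}),
               ("t3", {"C", "D"}), ("t4", {"H", "I"}),
               ("t5", {"t1", "t3"}), ("t6", {"t2", "t4"}),
               ("E",  {"t5"}),       ("J",  {"t6"})]

print(count_stalls(interleaved))  # 0 stalls under this simplified model
```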

Conditionals

A = B + C;
if D then E = 1 else E = 2

Consider the above program. A pipelined architecture may even start to execute E = 1 before it has evaluated whether D is true. The options are:
- If the branch is taken, wait for the pipeline to empty. This slows the machine down considerably at each branch.
- Simply execute the pipeline as is. The compiler then has to make sure that if the branch is taken, there is nothing in the pipeline that can be affected. This puts a great burden on the compiler writer.
Paradoxically, the above program can be compiled and run more efficiently as if the following had been written:
if D (A = B + C) then E = 1 else E = 2
[Explain this strange behavior.]
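One way to see the "paradox": a sketch assuming a hypothetical 2-cycle branch-resolution delay during which only branch-independent instructions can do useful work (the model and numbers are illustrative, not from the book). Issuing the test of D first lets the independent A = B + C fill part of the delay instead of a bubble:

```python
BRANCH_DELAY = 2  # assumed cycles before the branch outcome is known

def cycles_until_E(before_branch, independent_after_branch):
    """before_branch: instructions issued before the test of D (1 cycle each).
    independent_after_branch: branch-independent instructions available to
    fill the delay slots. Returns cycles until the E = ... assignment issues."""
    issue_branch = before_branch + 1                 # cycle the test of D issues
    filled = min(independent_after_branch, BRANCH_DELAY)
    bubbles = BRANCH_DELAY - filled                  # unfilled delay slots
    return issue_branch + filled + bubbles + 1       # +1 for the E assignment

# "A = B + C; if D then ..." : A=B+C runs first, nothing left to fill the delay
print(cycles_until_E(before_branch=1, independent_after_branch=0))  # 5 cycles

# "if D (A = B + C) then ..." : branch first, A=B+C fills one delay slot
print(cycles_until_E(before_branch=0, independent_after_branch=1))  # 4 cycles
```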

Summary

New processor designs are putting more emphasis on good language processor design. There is a need for effective translation models that allow languages to take advantage of faster hardware. What's worse, the simple translation strategies of the past may even fail now.

Multiprocessor system architecture

Impact:
- Multiple processors executing independently
- Need for more synchronization
- Cache coherence: data in a local cache may not be up to date
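A small illustration (threads standing in for processors that share memory; not from the book) of why extra synchronization is needed when several processors update shared data:

```python
# Without the lock, the read-modify-write of `counter` can interleave across
# threads and lose updates; with the lock, the result is deterministic.
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        with lock:              # remove this lock and updates can be lost
            counter += 1

threads = [threading.Thread(target=add_many, args=(100_000,)) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)  # 400000 with the lock; possibly less without it
```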

Tightly coupled systems

The architecture on the previous slide is an example of a tightly coupled system:
- All processors have access to all memory
- Semaphores can be used to coordinate communication among processors
- Quick (nanosecond) access to all data

Later we look at networks of machines (loosely coupled machines):
- Processors have access only to their local memory
- Semaphores cannot be used
- Relatively slow (millisecond) access to some data
- This is the model necessary for networks like the Internet
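A sketch of semaphore-based coordination in a tightly coupled (shared-memory) setting, again using threads as stand-ins for processors (illustrative only). In a loosely coupled system there is no shared buffer to protect, so coordination must instead happen by exchanging messages over the network:

```python
# Producer/consumer over a shared buffer, coordinated with a counting semaphore.
import threading, collections

buffer = collections.deque()
items_available = threading.Semaphore(0)   # counts items in the shared buffer

def producer():
    buffer.append("data")
    items_available.release()              # signal: one more item is ready

def consumer():
    items_available.acquire()              # wait until an item exists
    print(buffer.popleft())

t1 = threading.Thread(target=consumer)
t2 = threading.Thread(target=producer)
t1.start(); t2.start(); t1.join(); t2.join()   # prints "data"
```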