Computer Design – Introduction 1 MAMAS – Computer Architecture 234267 Dr. Lihu Rappoport Some of the slides were taken from Avi Mendelson, Randi Katz,

Presentation transcript:

Computer Design – Introduction 1 MAMAS – Computer Architecture Dr. Lihu Rappoport Some of the slides were taken from Avi Mendelson, Randi Katz, Patterson, Gabriel Loh

Computer Design – Introduction 2 General Course Information
• Grade
  - 20% exercise (mandatory)
  - 80% final exam
• Textbooks
  - Computer Architecture: A Quantitative Approach, Hennessy & Patterson
• Other course information
  - Course web site:
  - Foils will be on the web several days before the class

Computer Design – Introduction 3 Class Focus
• CPU
  - Introduction: performance, instruction set (RISC vs. CISC)
  - Pipeline, hazards
  - Branch prediction
  - Out-of-order execution
• Memory Hierarchy
  - Cache
  - Main memory
  - Virtual memory
• Advanced Topics
• PC Architecture
  - Motherboard & chipset, DRAM, I/O, disk, peripherals

Computer Design – Introduction 4 Computer System Structure
[Figure: block diagram of a PC. Labeled components: CPU, cache, CPU bus, North Bridge, memory bus, DDRII channels 1 and 2, graphics adapter, PCI Express ×16, PCI Express ×1, PCI, South Bridge, USB controller, SATA controller, IDE controller, I/O controller, hard disk, DVD drive, floppy drive, parallel port, serial port, keyboard, mouse, LAN adapter, LAN, sound card, speakers.]

Computer Design – Introduction 5 Architecture & Microarchitecture
• Architecture: the processor features seen by the “user”
  - Instruction set, addressing modes, data width, …
• Micro-architecture (uArch): how a processor is implemented
  - Cache size and structure, number of execution units, …
  - Timing is considered uArch (though it is user visible)
• Processors with different uArch can support the same architecture

Computer Design – Introduction 6 Compatibility
• Backward compatibility
  - New hardware can run existing software
  - Core2 Duo can run SW written for Pentium® 4, Pentium® M, Pentium® III, Pentium® II, Pentium®, 486, 386, 286
• Forward compatibility
  - New software can run on existing hardware
  - Example: new software written with SSE2™ runs on an older processor which does not support SSE2™
  - Commonly supports one or two generations behind
• Architecture independent SW
  - JIT (just-in-time) compiler: Java and .NET
  - Binary translation

Computer Design – Introduction 7 Performance

8 Technology Trends and Performance
• Computing capacity: 4× per 3 years
  - If we could keep all the transistors busy all the time
  - Actual: 3.3× per 3 years
• Moore’s Law: performance is doubled every ~18 months
  - Trend is slowing: process scaling declines, power is up
• CPU speed and memory speed grow apart
  [Table: growth rates of 2×, 1.1×, 2× and 4× in 3 years; the row and column labels did not survive extraction.]
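The two growth figures on this slide can be related with a little arithmetic. The sketch below is only an illustration (the function names are made up for this example): it converts a doubling period into a 3-year growth factor and back.

```python
import math

def growth_over(months: float, doubling_months: float) -> float:
    """Growth factor after `months` if the quantity doubles every `doubling_months`."""
    return 2.0 ** (months / doubling_months)

def doubling_time(months: float, factor: float) -> float:
    """Months needed to double, given growth by `factor` over `months`."""
    return months * math.log(2.0) / math.log(factor)

# Doubling every 18 months is the same as 4x per 3 years (36 months).
print(growth_over(36, 18))

# The "actual" 3.3x per 3 years corresponds to a somewhat longer doubling period.
print(round(doubling_time(36, 3.3), 1))
```

Under these definitions the ideal 4× per 3 years matches the 18-month doubling period, while 3.3× per 3 years works out to roughly 21 months per doubling.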

Computer Design – Introduction 9 Moore’s Law Graph taken from:

Computer Design – Introduction 10 CPI – Cycles Per Instruction
• CPUs work according to a clock signal
  - Clock cycle is measured in nsec (10^-9 of a second)
  - Clock frequency (= 1/clock cycle) is measured in GHz (10^9 cycles/sec)
• Instruction Count (IC)
  - Total number of instructions executed in the program
• CPI – Cycles Per Instruction
  - Average #cycles per instruction (in a given program):
    $\text{CPI} = \frac{\#\text{cycles required to execute the program}}{\text{IC}}$
  - IPC (= 1/CPI): instructions per cycle

Computer Design – Introduction 11 CPU Time
• CPU Time: time required to execute a program
  $\text{CPU Time} = \text{IC} \times \text{CPI} \times \text{clock cycle}$
• Our goal: minimize CPU Time
  - Minimize clock cycle: more GHz (process, circuit, uArch)
  - Minimize CPI: uArch (e.g.: more execution units)
  - Minimize IC: architecture (e.g.: SSE™)
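As a rough illustration of the CPU Time formula, the sketch below evaluates it for a made-up program. The instruction count, CPI and clock frequencies are assumptions chosen for the example, not figures from the course.

```python
def cpu_time(ic: float, cpi: float, clock_hz: float) -> float:
    """CPU Time = IC x CPI x clock cycle, where clock cycle = 1 / clock frequency."""
    clock_cycle = 1.0 / clock_hz
    return ic * cpi * clock_cycle

# Hypothetical program: 1e9 instructions, CPI of 1.5, on a 2 GHz clock.
base = cpu_time(ic=1e9, cpi=1.5, clock_hz=2e9)

# Each of the three levers from the slide, applied separately (hypothetical values):
faster_clock = cpu_time(ic=1e9, cpi=1.5, clock_hz=3e9)    # shorter clock cycle
lower_cpi = cpu_time(ic=1e9, cpi=1.0, clock_hz=2e9)       # better uArch
fewer_instrs = cpu_time(ic=0.8e9, cpi=1.5, clock_hz=2e9)  # richer instruction set

print(base, faster_clock, lower_cpi, fewer_instrs)
```

In practice the three factors are not independent: a change that lowers IC or CPI may lengthen the clock cycle, so only the product is meaningful.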

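Computer Design – Introduction 12 Amdahl’s Law
Suppose enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected (F = Fraction_enhanced, S = Speedup_enhanced), then:
  $\text{ExTime}_{new} = \text{ExTime}_{old} \times \left[ (1 - \text{Fraction}_{enhanced}) + \frac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}} \right]$
  $\text{Speedup}_{overall} = \frac{\text{ExTime}_{old}}{\text{ExTime}_{new}} = \frac{1}{(1 - \text{Fraction}_{enhanced}) + \frac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}}$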

Computer Design – Introduction 13 Amdahl’s Law: Example
• Floating point instructions improved to run at 2×, but only 10% of executed instructions are FP
  $\text{ExTime}_{new} = \text{ExTime}_{old} \times (0.9 + 0.1 / 2) = 0.95 \times \text{ExTime}_{old}$
  $\text{Speedup}_{overall} = \frac{1}{0.95} = 1.053$
• Corollary: Make The Common Case Fast
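The example above can be checked with a short sketch of Amdahl's Law. The function name is an assumption for illustration; the inputs are the slide's own numbers (10% of the work sped up by 2×), plus one hypothetical extreme case.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when `fraction_enhanced` of the work runs `speedup_enhanced` times faster."""
    new_time = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return 1.0 / new_time

# The slide's example: FP is 10% of the work and becomes 2x faster.
print(round(amdahl_speedup(0.10, 2.0), 3))

# Hypothetical extreme: even a 1000x faster FP unit gains little,
# because the other 90% of the work is unaffected.
print(round(amdahl_speedup(0.10, 1000.0), 3))
```

The second call shows why the corollary holds: the untouched fraction bounds the overall speedup at 1 / 0.9, no matter how fast the enhanced part becomes.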

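Computer Design – Introduction 14 Calculating the CPI of a Program
• IC_i: #times an instruction of type i is executed in the program
• IC: #instructions executed in the program: $\text{IC} = \sum_i \text{IC}_i$
• F_i: relative frequency of instructions of type i: $F_i = \text{IC}_i / \text{IC}$
• CPI_i: #cycles to execute an instruction of type i
  - e.g.: CPI_add = 1, CPI_mul = 3
• #cycles required to execute the program: $\#\text{cycles} = \sum_i \text{CPI}_i \times \text{IC}_i$
• CPI: $\text{CPI} = \frac{\#\text{cycles}}{\text{IC}} = \sum_i \text{CPI}_i \times F_i$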
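A minimal sketch of this calculation follows. The per-type cycle counts (1 for add, 3 for mul) are the slide's examples; the instruction counts and the function name are assumptions made up for illustration.

```python
def program_cpi(mix: dict) -> float:
    """Average CPI, where `mix` maps instruction type -> (IC_i, CPI_i)."""
    ic = sum(count for count, _ in mix.values())              # IC = sum of IC_i
    cycles = sum(count * cpi for count, cpi in mix.values())  # #cycles = sum of CPI_i * IC_i
    return cycles / ic

# Hypothetical program: 80 adds (1 cycle each) and 20 multiplies (3 cycles each).
mix = {"add": (80, 1), "mul": (20, 3)}
print(program_cpi(mix))

# The same value computed from relative frequencies F_i = IC_i / IC.
total = sum(count for count, _ in mix.values())
weighted = sum((count / total) * cpi for count, cpi in mix.values())
print(abs(weighted - program_cpi(mix)) < 1e-9)
```

Both forms give the same average; the frequency-weighted form is convenient when only the instruction mix, not the absolute counts, is known.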

Computer Design – Introduction 15 Comparing Performance
• Peak Performance
  - MIPS, MFLOPS
  - Often not useful: unachievable / unsustainable in practice
• Benchmarks
  - Real applications, or representative parts of real apps
  - Targeted at the specific system usages
• SPEC INT – integer applications
  - Data compression, C compiler, Perl interpreter, database system, chess-playing, text-processing, …
• SPEC FP – floating point applications
  - Mostly important scientific applications
• TPC Benchmarks
  - Measure transaction-processing throughput

Computer Design – Introduction 16 Instruction Set Design
• The ISA is what the user / compiler sees
• The HW implements the ISA
[Figure: the instruction set drawn as the interface between software and hardware.]

Computer Design – Introduction 17 ISA Considerations
• Code size
  - Long instructions take more time to fetch
  - Longer instructions require a larger memory
    Important in small devices, e.g., cell phones
• Number of instructions (IC)
  - Reducing IC reduces execution time (at a given CPI and frequency)
• Code “simplicity”
  - Simple HW implementation: higher frequency and lower power
  - Code optimization can be better applied to “simple code”

Computer Design – Introduction 18 Architectural Consideration Example
• Displacement Address Size
  - 1% of addresses > 16 bits
  - bits of displacement needed
[Figure: histogram of displacement sizes; x-axis: address bits; y-axis: 0% to 30%; series: Int. Avg. and FP Avg.]

Computer Design – Introduction 19 CISC Processors
• CISC - Complex Instruction Set Computer
  - The idea: a high level machine language
  - Example: x86
• Characteristics
  - Many instruction types, with many addressing modes
  - Some of the instructions are complex: they execute complex tasks and require many cycles
  - ALU operations directly on memory
  - Only a few registers, in many cases not orthogonal
  - Variable length instructions: common instructions get short codes ⇒ save code length

Computer Design – Introduction 20 Top 10 x86 Instructions
Rank  Instruction              % of total executed
1     load                     22%
2     conditional branch       20%
3     compare                  16%
4     store                    12%
5     add                      8%
6     and                      6%
7     sub                      5%
8     move register-register   4%
9     call                     1%
10    return                   1%
      Total                    96%
Simple instructions dominate instruction frequency

Computer Design – Introduction 21 CISC Drawbacks
• Complex instructions and complex addressing modes
  - complicate the processor
  - slow down the simple, common instructions
  - contradict Make The Common Case Fast
• Compilers don’t use complex instructions / indexing methods
• Variable length instructions are a real pain in the neck
  - Difficult to decode several instructions in parallel
    As long as an instruction is not decoded, its length is unknown:
    it is unknown where the instruction ends and where the next instruction starts
  - An instruction may span more than a single cache line
  - An instruction may span more than a single page
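The decoding problem can be sketched with a toy encoding. The format below is an assumption invented for this example (the first byte of each instruction holds its total length in bytes); it is not the x86 encoding, but it shows the same serial dependency.

```python
def variable_length_starts(code: bytes) -> list:
    """Find instruction start offsets in the toy variable-length encoding.

    Each start depends on the previous instruction's length, so the scan is serial.
    """
    starts = []
    pc = 0
    while pc < len(code):
        starts.append(pc)
        length = code[pc]  # the instruction must be examined to learn its length
        if length == 0:
            raise ValueError("invalid zero-length instruction")
        pc += length
    return starts

def fixed_length_starts(code: bytes, width: int = 4) -> list:
    """With fixed-length instructions every start offset is known up front."""
    return list(range(0, len(code), width))

# Toy program: instructions of length 1, 3, 2 and 1 bytes.
program = bytes([1, 3, 0xAA, 0xBB, 2, 0xCC, 1])
print(variable_length_starts(program))
print(fixed_length_starts(bytes(12)))
```

In the fixed-length case all offsets are independent of the instruction contents, which is what lets hardware hand several instructions to parallel decoders at once.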

Computer Design – Introduction 22 RISC Processors
• RISC - Reduced Instruction Set Computer
  - The idea: simple instructions enable fast hardware
• Characteristics
  - A small instruction set, with only a few instruction formats
  - Simple instructions execute simple tasks
    Most of them require a single cycle (with pipeline)
  - A few indexing methods
  - ALU operations on registers only
    Memory is accessed using Load and Store instructions only
    Many orthogonal registers
    Three address machine: Add dst, src1, src2
  - Fixed length instructions
• Examples: MIPS™, Sparc™, Alpha™, Power™
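To make the load/store, three-address style concrete, here is a toy interpreter. The instruction names and tuple format are assumptions for illustration only and do not model any real ISA.

```python
def run(program: list, memory: list) -> list:
    """Execute a toy load/store program; only load and store touch memory."""
    regs = {}
    for op, *args in program:
        if op == "load":       # load dst, addr
            dst, addr = args
            regs[dst] = memory[addr]
        elif op == "store":    # store src, addr
            src, addr = args
            memory[addr] = regs[src]
        elif op == "add":      # add dst, src1, src2 (registers only)
            dst, src1, src2 = args
            regs[dst] = regs[src1] + regs[src2]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory

# memory[2] = memory[0] + memory[1], written in load/store style.
program = [
    ("load", "r1", 0),
    ("load", "r2", 1),
    ("add", "r3", "r1", "r2"),
    ("store", "r3", 2),
]
print(run(program, [5, 7, 0]))
```

A CISC machine could express the same computation with fewer instructions that operate on memory directly; the RISC form uses more instructions, each of them simple.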

Computer Design – Introduction 23 RISC Processors (Cont.)
• Simple architecture ⇒ simple micro-architecture
  - Simple, small and fast control logic
  - Simpler to design and validate
  - Room for large on-die caches
  - Shorter time-to-market
• Using a smart compiler
  - Better pipeline usage
  - Better register allocation
• Existing RISC processors are not “pure” RISC
  - e.g., they support division, which takes many cycles

Computer Design – Introduction 24 Compilers and ISA
• Ease of compilation
  - Orthogonality: no special registers, few special cases, all operand modes available with any data type or instruction type
  - Regularity: no overloading for the meanings of instruction fields
  - Streamlined: resource needs easily determined
• Register assignment is critical too
  - Easier if lots of registers

Computer Design – Introduction 25 CISC Is Dominant
• The x86 architecture, which is a CISC architecture, dominates the processor market
  - A vast amount of existing software
  - Intel, AMD, Microsoft and others benefit from this
  - Intel and AMD invest heavily in making high-performance x86 processors, despite the architectural disadvantage
  - Current x86 processors give the best cost/performance
• CISC processors use µarch ideas from the RISC world
  - Starting at Pentium® II and K6®, x86 processors translate CISC instructions into RISC-like operations internally
  - The inside core looks much like that of a RISC processor

Computer Design – Introduction 26 Software Specific Extensions
• Extend arch to accelerate exec of specific apps
• Example: SSE™ (Streaming SIMD Extensions)
  - 128-bit packed (vector) / scalar single precision FP (4×32)
  - Introduced on Pentium® III in 1999
  - 8 new 128-bit registers (XMM0 – XMM7)
  - Accelerates graphics, video, scientific calculations, …
• Packed (each 128-bit register holds four 32-bit values):
    x0      x1      x2      x3
    y0      y1      y2      y3
    x0+y0   x1+y1   x2+y2   x3+y3
• Scalar (only the first value is added):
    x0      x1      x2      x3
    y0      y1      y2      y3
    x0+y0   y1      y2      y3
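The difference between the packed and scalar forms can be modeled in a few lines. This is a plain-Python sketch of the slide's diagram, not real SSE code: lists of four floats stand in for 128-bit registers, and the function names are assumptions.

```python
def add_packed(x: list, y: list) -> list:
    """Packed add: all four lanes are added in one operation."""
    return [a + b for a, b in zip(x, y)]

def add_scalar(x: list, y: list) -> list:
    """Scalar add: only lane 0 is added; lanes 1..3 pass through from y, as in the slide."""
    return [x[0] + y[0]] + list(y[1:])

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
print(add_packed(x, y))  # [11.0, 22.0, 33.0, 44.0]
print(add_scalar(x, y))  # [11.0, 20.0, 30.0, 40.0]
```

On real hardware the packed form performs the four additions with a single instruction, which is how such extensions reduce the instruction count (IC) for data-parallel code.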