Published by Betty Weaver. Modified over 9 years ago.
1
Lec 9 Systems Architecture
Systems Architecture
Lecture 10: Alternative Instruction Sets
Jeremy R. Johnson, Anatole D. Ruslanov, William M. Mongan
Some or all figures from Computer Organization and Design: The Hardware/Software Interface, Third Edition, by David Patterson and John Hennessy, are copyrighted material (COPYRIGHT 2004 MORGAN KAUFMANN PUBLISHERS, INC. ALL RIGHTS RESERVED).
2
Introduction
Objective: To compare MIPS to several alternative instruction set architectures and to better understand the design decisions made in MIPS.
MIPS is an example of a RISC (Reduced Instruction Set Computer) architecture, as opposed to a CISC (Complex Instruction Set Computer) architecture.
MIPS gives up complex instructions, and therefore executes a greater number of instructions, in exchange for a simpler implementation and a shorter clock cycle or fewer clock cycles per instruction.
Alternative instruction sets, including recent versions of MIPS
  – Provide more powerful operations
  – Aim at reducing the number of instructions executed
  – The danger is a slower cycle time and/or a higher CPI
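The trade-off above can be made concrete with the standard performance equation: execution time = instruction count × CPI × clock cycle time. The following is a minimal sketch; the function name `cpu_time` and the numbers in the usage note are illustrative assumptions, not measurements from the lecture.

```c
#include <assert.h>

/* CPU time in seconds =
   instructions executed * average cycles per instruction * seconds per cycle. */
double cpu_time(double instruction_count, double cpi, double cycle_time_s)
{
    assert(instruction_count >= 0.0 && cpi > 0.0 && cycle_time_s > 0.0);
    return instruction_count * cpi * cycle_time_s;
}
```

With hypothetical figures, an ISA that executes 20% fewer instructions (800 instead of 1000) but raises the CPI from 2 to 3 at the same cycle time ends up slower: `cpu_time(800, 3, 0.5)` exceeds `cpu_time(1000, 2, 0.5)`.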
3
Characteristics of MIPS
Load/Store architecture
General purpose register machine (32 registers)
ALU operations have 3 register operands (2 source + 1 dest)
16-bit constants for immediate mode
Simple instruction set
  – Simple branch operations (beq, bne)
  – Use register to set condition (e.g. slt)
  – Operations such as move, li, blt built from existing operations
Uniform encoding
  – All instructions are 32 bits long
  – Opcode is always in the high-order 6 bits
  – 3 types of instruction formats
  – Register fields in the same place for all formats
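Because the encoding is uniform, pulling the fields out of any instruction word takes only a few shifts and masks. The sketch below assumes the standard MIPS32 field layout (op in bits 31..26, rs 25..21, rt 20..16, rd 15..11, shamt 10..6, funct 5..0, immediate 15..0); the type and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a 32-bit MIPS instruction word. R-format uses rd/shamt/funct;
   I-format reuses the low 16 bits as an immediate. */
typedef struct {
    unsigned op, rs, rt, rd, shamt, funct;
    int32_t imm;                 /* low 16 bits, sign-extended */
} mips_fields;

/* Pack an R-format instruction: op | rs | rt | rd | shamt | funct. */
uint32_t mips_encode_r(unsigned op, unsigned rs, unsigned rt,
                       unsigned rd, unsigned shamt, unsigned funct)
{
    assert(op < 64 && funct < 64);
    assert(rs < 32 && rt < 32 && rd < 32 && shamt < 32);
    return ((uint32_t)op << 26) | ((uint32_t)rs << 21) | ((uint32_t)rt << 16) |
           ((uint32_t)rd << 11) | ((uint32_t)shamt << 6) | (uint32_t)funct;
}

/* Extract every field; the caller picks the ones its format uses. */
mips_fields mips_decode(uint32_t word)
{
    mips_fields f;
    f.op    = word >> 26;            /* opcode: always the high-order 6 bits */
    f.rs    = (word >> 21) & 0x1F;
    f.rt    = (word >> 16) & 0x1F;
    f.rd    = (word >> 11) & 0x1F;
    f.shamt = (word >> 6) & 0x1F;
    f.funct = word & 0x3F;
    f.imm   = (int32_t)(word & 0xFFFF);
    if (f.imm & 0x8000)              /* sign-extend the 16-bit constant */
        f.imm -= 0x10000;
    return f;
}
```

For example, `add $t0, $s1, $s2` (rs = 17, rt = 18, rd = 8, funct = 0x20) packs to 0x02324020, and decoding that word returns the same register numbers without first needing to know the format.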
4
Design Principles
Simplicity favors regularity
  – uniform instruction length
  – all ALU operations have 3 register operands
  – register addresses in the same location for all instruction formats
Smaller is faster
  – register architecture
  – small number of registers
Good design demands good compromises
  – fixed-length instructions and only 16-bit constants
  – several instruction formats but consistent length
Make common cases fast
  – immediate addressing
  – 16-bit constants
  – only beq and bne
5
MIPS Addressing Modes
Immediate Addressing
  – 16-bit constant from low-order bits of instruction
  – addi $t0, $s0, 4
Register Addressing
  – add $t0, $s0, $s1
Base Addressing (displacement addressing)
  – 16-bit constant from low-order bits of instruction plus base register
  – lw $t0, 16($sp)
PC-Relative Addressing
  – (PC+4) + 16-bit word offset from instruction (sign-extended and shifted 2 bits to the left)
  – bne $s0, $s1, Target
Pseudodirect Addressing
  – high-order 4 bits of PC+4 concatenated with the 26-bit word address from the low-order 26 bits of the instruction, shifted 2 bits to the left
  – j Address
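The two control-flow modes are easy to get wrong by a factor of four, so here is a small sketch of how the target addresses are formed. It assumes 32-bit addresses and ignores delay slots; the function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* PC-relative: the 16-bit field is a signed word offset from PC + 4,
   so it is multiplied by 4 (shifted left 2) to become a byte offset. */
uint32_t branch_target(uint32_t pc, int16_t word_offset)
{
    return pc + 4u + (uint32_t)((int32_t)word_offset * 4);
}

/* Pseudodirect: the 26-bit field is a word address; shifted left 2 it
   fills bits 27..0, and the top 4 bits come from PC + 4. */
uint32_t jump_target(uint32_t pc, uint32_t word_address26)
{
    assert(word_address26 < (1u << 26));
    return ((pc + 4u) & 0xF0000000u) | (word_address26 << 2);
}
```

A branch at 0x00400000 with offset field 3 lands on 0x00400010, and a jump executed anywhere in the low 256 MB region with field 0x0100000 lands on 0x00400000.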
6
PowerPC
Similar to MIPS (RISC)
Two additional addressing modes
  – Indexed addressing: base register + index register
      PowerPC: lw $t1, $a0+$s3
      MIPS:    add $t0, $a0, $s3
               lw $t1, 0($t0)
  – Update addressing: displacement addressing + increment
      PowerPC: lwu $t0, 4($s3)
      MIPS:    lw $t0, 4($s3)
               addi $s3, $s3, 4
Additional instructions
  – separate counter register used for loops
  – PowerPC: bc Loop, ctr!=0
  – MIPS:    Loop: addi $t0, $t0, -1
                   bne $t0, $zero, Loop
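The semantics of the two extra addressing modes can be sketched in C by modeling memory as an array of words and registers as plain integers. This is a toy model for illustration, not PowerPC's actual instruction definitions; `load_indexed` and `load_update` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Indexed addressing: a single load whose byte address is base + index. */
int32_t load_indexed(const int32_t *mem, uint32_t base, uint32_t index)
{
    uint32_t ea = base + index;
    assert(ea % 4 == 0);            /* word loads must be aligned */
    return mem[ea / 4];
}

/* Update addressing: load from *base + disp, then write the effective
   address back into the base register, all in one instruction. */
int32_t load_update(const int32_t *mem, uint32_t *base, uint32_t disp)
{
    uint32_t ea = *base + disp;
    assert(ea % 4 == 0);
    *base = ea;
    return mem[ea / 4];
}
```

Calling `load_update` repeatedly with a displacement of 4 walks an array one word at a time, which is the pattern the slide's `lw` plus `addi` pair implements in MIPS.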
7
Characteristics of 80x86 / IA-32
Evolved from 8086 (and backward compatible!)
Register-Memory architecture
8 general purpose registers (evolved)
Complex instruction set
  – Instruction lengths vary from 1 to 17 bytes
  – A postbyte is used to indicate the addressing mode when it is not in the opcode
  – Instructions may have many variants
  – Special instructions (move, push, pop, string, decimal)
  – Use condition codes
  – 7 data addressing modes, some complex, with 8- or 32-bit displacement
  – Instructions can operate on 8, 16, or 32 bits (mode changed with a prefix)
  – One operand must act as both a source and destination
  – One operand can come from memory
Saving grace:
  – the most frequently used instructions are not too difficult to build
  – compilers avoid the portions of the architecture that are slow
8
3 October 2015 Chapter 2 - Instructions: Language of the Computer
The Intel x86 ISA
Evolution with backward compatibility
  – 8080 (1974): 8-bit microprocessor
      Accumulator, plus 3 index-register pairs
  – 8086 (1978): 16-bit extension to 8080
      Complex instruction set (CISC)
  – 8087 (1980): floating-point coprocessor
      Adds FP instructions and register stack
  – 80286 (1982): 24-bit addresses, MMU
      Segmented memory mapping and protection
  – 80386 (1985): 32-bit extension (now IA-32)
      Additional addressing modes and operations
      Paged memory mapping as well as segments
9
The Intel x86 ISA
Further evolution…
  – i486 (1989): pipelined, on-chip caches and FPU
      Compatible competitors: AMD, Cyrix, …
  – Pentium (1993): superscalar, 64-bit datapath
      Later versions added MMX (Multi-Media eXtension) instructions
      The infamous FDIV bug
  – Pentium Pro (1995), Pentium II (1997)
      New microarchitecture (see Colwell, The Pentium Chronicles)
  – Pentium III (1999)
      Added SSE (Streaming SIMD Extensions) and associated registers
  – Pentium 4 (2001)
      New microarchitecture
      Added SSE2 instructions
10
The Intel x86 ISA
And further…
  – AMD64 (2003): extended architecture to 64 bits
  – EM64T, Extended Memory 64 Technology (2004)
      AMD64 adopted by Intel (with refinements)
      Added SSE3 instructions
  – Intel Core (2006)
      Added SSE4 instructions, virtual machine support
  – AMD64 (announced 2007): SSE5 instructions
      Intel declined to follow, instead…
  – Advanced Vector Extension (announced 2008)
      Longer SSE registers, more instructions
If Intel didn’t extend with compatibility, its competitors would!
  – Technical elegance ≠ market success
11
IA-32 Registers and Data Addressing
[Figure: registers in the 32-bit subset that originated with the 80386]
12
IA-32 Addressing Modes
Register indirect
  Description: address is in a register
  MIPS equivalent: lw $s0, 0($s1)
Based mode with 8- or 32-bit displacement
  Description: address is contents of base register plus displacement
  MIPS equivalent: lw $s0, const($s1)   # const <= 16 bits
Base plus scaled index (not in MIPS)
  Description: Base + (2^scale × index)
  MIPS equivalent: mul $t0, $s2, 2^scale
                   add $t0, $t0, $s1
                   lw $s0, 0($t0)
Base plus scaled index with 8- or 32-bit displacement (not in MIPS)
  Description: Base + (2^scale × index) + displacement
  MIPS equivalent: mul $t0, $s2, 2^scale
                   add $t0, $t0, $s1
                   lw $s0, const($t0)   # const <= 16 bits
There are some restrictions on register use (the registers are not fully “general purpose”).
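The most general mode, base plus scaled index plus displacement, reduces to one line of arithmetic. The sketch below assumes a scale field of 0 to 3 (multipliers 1, 2, 4, 8) and 32-bit wraparound; the function name is illustrative and not taken from any Intel document.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address = base + (index * 2^scale) + displacement.
   Setting scale, index, or disp to zero yields the simpler modes. */
uint32_t ia32_effective_address(uint32_t base, uint32_t index,
                                unsigned scale, int32_t disp)
{
    assert(scale <= 3);             /* 2^scale is 1, 2, 4, or 8 */
    return base + (index << scale) + (uint32_t)disp;
}
```

With `scale = 2` the index register counts 4-byte words directly, which is why a single IA-32 load can replace the three-instruction MIPS sequence for indexing an `int` array.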
13
Typical IA-32 Instructions
Instruction          Function
JE name              if equal (condition code) EIP = name; EIP - 128 < name < EIP + 128
JMP name             EIP = name
CALL name            SP = SP - 4; M[SP] = EIP + 5; EIP = name
MOVW EBX,[EDI+45]    EBX = M[EDI+45]
PUSH ESI             SP = SP - 4; M[SP] = ESI
POP EDI              EDI = M[SP]; SP = SP + 4
ADD EAX,#6765        EAX = EAX + 6765
TEST EDX,#42         set condition code (flags) with EDX and 42
MOVSL                M[EDI] = M[ESI]; EDI = EDI + 4; ESI = ESI + 4
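The register-transfer descriptions of PUSH, POP, and MOVSL translate almost directly into C. The following is a toy model under stated assumptions: a tiny word-aligned memory, only three registers, and invented names (`machine`, `push32`, `pop32`, `movsl`); it is not an emulator of real IA-32 behavior.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 64

typedef struct {
    uint32_t esp, esi, edi;         /* byte addresses */
    uint32_t mem[MEM_WORDS];        /* backing store, indexed by address / 4 */
} machine;

/* Translate an aligned byte address into a pointer to its word. */
static uint32_t *word_at(machine *m, uint32_t addr)
{
    assert(addr % 4 == 0 && addr / 4 < MEM_WORDS);
    return &m->mem[addr / 4];
}

/* PUSH: SP = SP - 4; M[SP] = value */
void push32(machine *m, uint32_t value)
{
    m->esp -= 4;
    *word_at(m, m->esp) = value;
}

/* POP: value = M[SP]; SP = SP + 4 */
uint32_t pop32(machine *m)
{
    uint32_t value = *word_at(m, m->esp);
    m->esp += 4;
    return value;
}

/* MOVSL: M[EDI] = M[ESI]; EDI = EDI + 4; ESI = ESI + 4 */
void movsl(machine *m)
{
    *word_at(m, m->edi) = *word_at(m, m->esi);
    m->edi += 4;
    m->esi += 4;
}
```

Each of these single IA-32 instructions performs both a memory access and a register update, work that takes two MIPS instructions.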
14
IA-32 Instruction Formats
[Figure: typical IA-32 instruction formats; note the different instruction lengths]
15
Implementing IA-32
Complex instruction set makes implementation difficult
  – Hardware translates instructions to simpler microoperations
      Simple instructions: 1–1
      Complex instructions: 1–many
  – Microengine similar to RISC
  – Market share makes this economically viable
Comparable performance to RISC
  – Compilers avoid complex instructions
16
Architecture Evolution
Accumulator
  – EDSAC
Extended Accumulator (special purpose register)
  – Intel 8086
General Purpose Register
  – register-register (CDC 6600, MIPS, SPARC, PowerPC)
  – register-memory (Intel 80386, IBM 360)
  – memory-memory (VAX)
Alternative
  – stack
  – high-level language
17
Example: Clearing an Array

Array version (C):
    void clear1(int array[], int size) {
        int i;
        for (i = 0; i < size; i += 1)
            array[i] = 0;
    }

Pointer version (C):
    void clear2(int *array, int size) {
        int *p;
        for (p = &array[0]; p < &array[size]; p = p + 1)
            *p = 0;
    }

Array version (MIPS):
           move $t0,$zero        # i = 0
    loop1: sll  $t1,$t0,2        # $t1 = i * 4
           add  $t2,$a0,$t1      # $t2 = &array[i]
           sw   $zero,0($t2)     # array[i] = 0
           addi $t0,$t0,1        # i = i + 1
           slt  $t3,$t0,$a1      # $t3 = (i < size)
           bne  $t3,$zero,loop1  # if (i < size) goto loop1

Pointer version (MIPS):
           move $t0,$a0          # p = &array[0]
           sll  $t1,$a1,2        # $t1 = size * 4
           add  $t2,$a0,$t1      # $t2 = &array[size]
    loop2: sw   $zero,0($t0)     # Memory[p] = 0
           addi $t0,$t0,4        # p = p + 4
           slt  $t3,$t0,$t2      # $t3 = (p < &array[size])
           bne  $t3,$zero,loop2  # if (p < &array[size]) goto loop2
18
Comparison of Array vs. Ptr
Multiply “strength reduced” to shift
Array version requires shift to be inside loop
  – Part of index calculation for incremented i
  – cf. incrementing pointer
Compiler can achieve same effect as manual use of pointers
  – Induction variable elimination
  – Better to make program clearer and safer
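To see what induction variable elimination buys, the sketch below writes out by hand roughly what an optimizing compiler might do to the indexed loop: the per-iteration `i * 4` is replaced by a running byte offset that advances by the element size. The function names are hypothetical, and real compilers perform this transformation on their internal representation rather than on C source.

```c
#include <assert.h>
#include <stddef.h>

/* Clear, readable source: the index i is scaled on every iteration. */
void clear_indexed(int array[], size_t size)
{
    assert(array != NULL || size == 0);
    for (size_t i = 0; i < size; i++)
        array[i] = 0;
}

/* Hand-applied transformation: i and its scaling are gone; the byte
   offset itself is the loop variable and steps by sizeof(int). */
void clear_strength_reduced(int array[], size_t size)
{
    assert(array != NULL || size == 0);
    unsigned char *base = (unsigned char *)array;
    size_t end = size * sizeof(int);    /* computed once, outside the loop */
    for (size_t off = 0; off < end; off += sizeof(int))
        *(int *)(base + off) = 0;
}
```

Both functions leave the array in the same state, which is the slide's point: the indexed form can be kept in the source for clarity and safety while the compiler derives the cheaper loop.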
19
ARM & MIPS Similarities
ARM: the most popular embedded core
Similar basic set of instructions to MIPS
Feature: ARM; MIPS
Date announced: 1985; 1985
Instruction size: 32 bits; 32 bits
Address space: 32-bit flat; 32-bit flat
Data alignment: Aligned; Aligned
Data addressing modes: 9; 3
Registers: 15 × 32-bit; 31 × 32-bit
Input/output: Memory mapped; Memory mapped
20
Compare and Branch in ARM
Uses condition codes for result of an arithmetic/logical instruction
  – Negative, zero, carry, overflow
  – Compare instructions to set condition codes without keeping the result
Each instruction can be conditional
  – Top 4 bits of instruction word: condition value
  – Can avoid branches over single instructions
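How the four flags and the 4-bit condition field interact can be sketched as follows. The flag computation models a compare as the subtraction a - b, and the condition numbering follows the classic 32-bit ARM encoding (0x0 = EQ through 0xE = AL); the type and function names are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct { bool n, z, c, v; } arm_flags;

/* Compare: compute a - b, keep only the flags, discard the result. */
arm_flags arm_cmp(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;
    arm_flags f;
    f.n = (r >> 31) != 0;                       /* negative */
    f.z = (r == 0);                             /* zero */
    f.c = (a >= b);                             /* carry = no borrow */
    f.v = (((a ^ b) & (a ^ r)) >> 31) != 0;     /* signed overflow */
    return f;
}

/* Decide whether an instruction with this condition field executes. */
bool arm_cond_passes(unsigned cond, arm_flags f)
{
    assert(cond <= 0xE);
    switch (cond) {
    case 0x0: return f.z;                   /* EQ */
    case 0x1: return !f.z;                  /* NE */
    case 0x2: return f.c;                   /* CS */
    case 0x3: return !f.c;                  /* CC */
    case 0x4: return f.n;                   /* MI */
    case 0x5: return !f.n;                  /* PL */
    case 0x6: return f.v;                   /* VS */
    case 0x7: return !f.v;                  /* VC */
    case 0x8: return f.c && !f.z;           /* HI */
    case 0x9: return !f.c || f.z;           /* LS */
    case 0xA: return f.n == f.v;            /* GE */
    case 0xB: return f.n != f.v;            /* LT */
    case 0xC: return !f.z && f.n == f.v;    /* GT */
    case 0xD: return f.z || f.n != f.v;     /* LE */
    default:  return true;                  /* 0xE: AL (always) */
    }
}
```

Because every instruction carries such a condition, a short `if` body can be executed or skipped without a branch, whereas MIPS needs `slt` plus `beq`/`bne` for the same test.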
21
Instruction Encoding
22
Fallacies
Powerful instruction ⇒ higher performance
  – Fewer instructions required
  – But complex instructions are hard to implement
      May slow down all instructions, including simple ones
  – Compilers are good at making fast code from simple instructions
Use assembly code for high performance
  – But modern compilers are better at dealing with modern processors
  – More lines of code ⇒ more errors and less productivity
23
Fallacies
Backward compatibility ⇒ instruction set doesn’t change
  – But they do accrete more instructions
[Figure: growth of the x86 instruction set]
24
Pitfalls
Sequential words are not at sequential addresses
  – Increment by 4, not by 1!
Keeping a pointer to an automatic variable after procedure returns
  – e.g., passing pointer back via an argument
  – Pointer becomes invalid when stack popped
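Both pitfalls show up readily in C. The sketch below measures the byte distance between consecutive 32-bit words, and contrasts the dangling-pointer pattern (left in a comment so that nothing here has undefined behavior) with one possible safe alternative in which the caller owns the storage. All names are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte distance between two consecutive 32-bit words in memory. */
size_t word_stride(void)
{
    int32_t words[2] = {0, 0};
    return (size_t)((unsigned char *)&words[1] - (unsigned char *)&words[0]);
}

/* PITFALL, shown only as a comment: `counter` lives in this procedure's
   stack frame, so the returned pointer is invalid once the frame is popped.

       int *make_counter_bad(void) { int counter = 0; return &counter; }
*/

/* Safer pattern: the caller supplies storage that outlives the call. */
void init_counter(int *out, int start)
{
    assert(out != NULL);
    *out = start;
}
```

Pointer arithmetic in C hides the factor of 4 (`p + 1` advances by one element), but assembly code works in byte addresses and must add 4 explicitly, as the `addi $t0,$t0,4` in the pointer version of the array-clearing loop does.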
25
Concluding Remarks
Design principles
  1. Simplicity favors regularity
  2. Smaller is faster
  3. Make the common case fast
  4. Good design demands good compromises
Layers of software/hardware
  – Compiler, assembler, hardware
MIPS: typical of RISC ISAs
  – cf. x86
26
Concluding Remarks
Measure MIPS instruction executions in benchmark programs
  – Consider making the common case fast
  – Consider compromises
Instruction class: MIPS examples; SPEC2006 Int; SPEC2006 FP
Arithmetic: add, sub, addi; 16%; 48%
Data transfer: lw, sw, lb, lbu, lh, lhu, sb, lui; 35%; 36%
Logical: and, or, nor, andi, ori, sll, srl; 12%; 4%
Cond. Branch: beq, bne, slt, slti, sltiu; 34%; 8%
Jump: j, jr, jal; 2%; 0%
27
Summary
Instruction complexity is only one variable
  – lower instruction count vs. higher CPI / lower clock rate
Design Principles:
  – simplicity favors regularity
  – smaller is faster
  – good design demands compromise
  – make the common case fast
Instruction set architecture
  – a very important abstraction indeed!