Presentation is loading. Please wait.

Presentation is loading. Please wait.

EI209 Chapter 2.1Haojin Zhu, SJTU, 2015 EI 209 Computer Organization Fall 2015 Chapter 2: Instructions: Language of the Computer Haojin Zhu (http://tdt.sjtu.edu.cn/~hjzhu/

Similar presentations


Presentation on theme: "EI209 Chapter 2.1Haojin Zhu, SJTU, 2015 EI 209 Computer Organization Fall 2015 Chapter 2: Instructions: Language of the Computer Haojin Zhu (http://tdt.sjtu.edu.cn/~hjzhu/"— Presentation transcript:

1 EI209 Chapter 2.1Haojin Zhu, SJTU, 2015 EI 209 Computer Organization Fall 2015 Chapter 2: Instructions: Language of the Computer Haojin Zhu (http://tdt.sjtu.edu.cn/~hjzhu/ )http://tdt.sjtu.edu.cn/~hjzhu/ Course Website: http://nsec.sjtu.edu.cn/teaching/ei209.html [Adapted from Computer Organization and Design, 4 th Edition, Patterson & Hennessy, © 2012, MK]

2 EI209 Chapter 2.2Haojin Zhu, SJTU, 2015 Assignment I  Assignment I Chapter I: 1.16  Coding Assignment I: Design a program to check if your computer is big endian or little endian? Please submit the source code and the execution results. Submission Due: Oct.8

3 EI209 Chapter 2.3Haojin Zhu, SJTU, 2015 The Language a Computer Understands  Word a computer understands: instruction  Vocabulary of all words a computer understands: instruction set (aka instruction set architecture or ISA)  Different computers may have different vocabularies (i.e., different ISAs) l iPhone (ARM) not same as Macbook (x86)  Or the same vocabulary (i.e., same ISA) l iPhone and iPad computers have same instruction set (ARM)

4 EI209 Chapter 2.4Haojin Zhu, SJTU, 2015 The Language a Computer Understands  Why not all the same? Why not all different? What might be pros and cons? l Single ISA (to rule them all): -Leverage common compilers, operating systems, etc. -BUT fairly easy to retarget these for different ISAs (e.g., Linux, gcc) l Multiple ISAs: -Specialized instructions for specialized applications -Different tradeoffs in resources used (e.g., functionality, memory demands, complexity, power consumption, etc.) -Competition and innovation is good, especially in emerging environments (e.g., mobile devices)

5 EI209 Chapter 2.5Haojin Zhu, SJTU, 2015 MIPS: Instruction Set for EI209  MIPS is a real-world ISA (see www.mips.com ) l Standard instruction set for networking equipment l Was also used in original Nintendo-64!  Elegant example of a Reduced Instruction Set Computer (RISC) instruction set  Invented by John Hennessy @ Stanford l Why not Berkeley/Sun RISC invented by Dave Patterson? Ask him!

6 EI209 Chapter 2.6Haojin Zhu, SJTU, 2015 RISC Design Principles  Basic RISC principle: “A simpler CPU (the hardware that interprets machine language) is a faster CPU” (CPU  Core)  Focus of the RISC design is reduction of the number and complexity of instructions in the ISA  A number of the more common strategies include: l Fixed instruction length, generally a single word; Simplifies process of fetching instructions from memory l Simplified addressing modes; Simplifies process of fetching operands from memory l Fewer and simpler instructions in the instruction set; Simplifies process of executing instructions l Only load and store instructions access memory; E.g., no add memory to register, add memory to memory, etc. l Let the compiler do it. Use a good compiler to break complex high- level language statements into a number of simple assembly language statements

7 EI209 Chapter 2.7Haojin Zhu, SJTU, 2015 Mainstream ISAs  ARM (Advanced RISC Machine) is most popular RISC l In every smart phone-like device (e.g., iPhone, iPad, iPod, …)  Intel 80x86 is another popular ISA and is used in Macbook and PCs (Core i3, Core i5, Core i7, …) l x86 is a Complex Instruction Set Computer (CISC) l 20x ARM sold vs. 80x86 (i.e., 5 billion vs. 0.3 billion)

8 EI209 Chapter 2.8Haojin Zhu, SJTU, 2015 MIPS Green Card Fall 2012 -- Lecture #68

9 EI209 Chapter 2.9Haojin Zhu, SJTU, 2015 MIPS Green Card Fall 2012 -- Lecture #69

10 EI209 Chapter 2.10Haojin Zhu, SJTU, 2015 Two Key Principles of Machine Design 1. Instructions are represented as numbers and, as such, are indistinguishable from data 2. Programs are stored in alterable memory (that can be read or written to) just like data  Stored-program concept l Programs can be shipped as files of binary numbers – binary compatibility l Computers can inherit ready-made software provided they are compatible with an existing ISA – leads industry to align around a small number of ISAs Accounting prg (machine code) C compiler (machine code) Payroll data Source code in C for Acct prg Memory

11 EI209 Chapter 2.11Haojin Zhu, SJTU, 2015 MIPS-32 ISA  Instruction Categories l Computational l Load/Store l Jump and Branch l Floating Point -coprocessor l Memory Management l Special R0 - R31 PC HI LO Registers op rs rt rdsafunct rs rt immediate jump target 3 Instruction Formats: all 32 bits wide R format I format J format

12 EI209 Chapter 2.12Haojin Zhu, SJTU, 2015 MIPS (RISC) Design Principles  Simplicity favors regularity l fixed size instructions l small number of instruction formats l opcode always the first 6 bits  Smaller is faster l limited instruction set l limited number of registers in register file l limited number of addressing modes  Make the common case fast l arithmetic operands from the register file (load-store machine) l allow instructions to contain immediate operands  Good design demands good compromises l three instruction formats

13 EI209 Chapter 2.13Haojin Zhu, SJTU, 2015 MIPS Arithmetic Instructions  MIPS assembly language arithmetic statement add$t0, $s1, $s2 sub$t0, $s1, $s2  Each arithmetic instruction performs one operation  Each specifies exactly three operands that are all contained in the datapath’s register file ( $t0,$s1,$s2 ) destination  source1 op source2  Instruction Format (R format) 0 17 18 8 0 0x22

14 EI209 Chapter 2.14Haojin Zhu, SJTU, 2015 MIPS Arithmetic Instructions  MIPS assembly language arithmetic statement add$t0, $s1, $s2 sub$t0, $s1, $s2  Each arithmetic instruction performs one operation  Each specifies exactly three operands that are all contained in the datapath’s register file ( $t0,$s1,$s2 ) destination  source1 op source2  Instruction Format (R format) 0 17 18 8 0 0x22

15 EI209 Chapter 2.15Haojin Zhu, SJTU, 2015  MIPS fields are given names to make them easier to refer to MIPS Instruction Fields op rs rt rd shamt funct op6-bitsopcode that specifies the operation rs5-bitsregister file address of the first source operand rt5-bitsregister file address of the second source operand rd5-bitsregister file address of the result’s destination shamt5-bitsshift amount (for shift instructions) funct6-bitsfunction code augmenting the opcode

16 EI209 Chapter 2.16Haojin Zhu, SJTU, 2015 Computer Hardware Operands  High-Level Programming languages: could have millions of variables  Instruction sets have fixed, small number  Called registers l “Bricks” of computer hardware l Fastest way to store data in computer hardware l Visible to (the “assembly language”) programmer  MIPS Instruction Set has 32 integer registers

17 EI209 Chapter 2.17Haojin Zhu, SJTU, 2015 Why Just 32 Registers?  RISC Design Principle: Smaller is faster l But you can be too small …  Hardware would likely be slower with 64, 128, or 256 registers  32 is enough for compiler to translate typical C programs, and not run out of registers very often l ARM instruction set has only 16 registers l May be faster, but compiler may run out of registers too often (aka “spilling registers to memory”)

18 EI209 Chapter 2.18Haojin Zhu, SJTU, 2015 Size of Registers  Bit is the atom of Computer Hardware: contains either 0 or 1 l True “alphabet” of computer hardware is 0, 1 l Will eventually express MIPS instructions as combinations of 0s and 1s (in Machine Language)  MIPS registers are 32 bits wide  MIPS calls this quantity a word l Some computers use 16-bit or 64-bit wide words l E.g., Intel 8086 (16-bit), MIPS64 (64-bit)

19 EI209 Chapter 2.19Haojin Zhu, SJTU, 2015 Some Examples of 64-bit processors What is the first 64-bit Apple phone? What is the first 64-bit android phone? Iphone 5s HTC Desire 820

20 EI209 Chapter 2.20Haojin Zhu, SJTU, 2015 MIPS Register File Register File src1 addr src2 addr dst addr write data 32 bits src1 data src2 data 32 locations 32 5 5 5  Holds thirty-two 32-bit registers l Two read ports and l One write port  Registers are l Faster than main memory -But register files with more locations are slower (e.g., a 64 word file could be as much as 50% slower than a 32 word file) -Read/write port increase impacts speed quadratically l Easier for a compiler to use -e.g., (A*B) – (C*D) – (E*F) can do multiplies in any order vs. stack l Can hold variables so that -code density improves (since register are named with fewer bits than a memory location) write control

21 EI209 Chapter 2.21Haojin Zhu, SJTU, 2015 Aside: MIPS Register Convention NameRegister Number UsagePreserve on call? $zero0constant 0 (hardware)n.a. $at1reserved for assemblern.a. $v0 - $v12-3returned valuesno $a0 - $a34-7argumentsyes $t0 - $t78-15temporariesno $s0 - $s716-23saved valuesyes $t8 - $t924-25temporariesno $gp28global pointeryes $sp29stack pointeryes $fp30frame pointeryes $ra31return addr (hardware)yes

22 EI209 Chapter 2.22Haojin Zhu, SJTU, 2015 MIPS Memory Access Instructions  MIPS has two basic data transfer instructions for accessing memory lw$t0, 4($s3) #load word from memory sw$t0, 8($s3) #store word to memory  The data is loaded into (lw) or stored from (sw) a register in the register file – a 5 bit address  The memory address – a 32 bit address – is formed by adding the contents of the base address register to the offset value l A 16-bit field meaning access is limited to memory locations within a region of  2 13 or 8,192 words (  2 15 or 32,768 bytes) of the address in the base register

23 EI209 Chapter 2.23Haojin Zhu, SJTU, 2015  Load/Store Instruction Format (I format): lw $t0, 24($s3) Machine Language - Load Instruction 35 19 8 24 10 Memory dataword address (hex) 0x00000000 0x00000004 0x00000008 0x0000000c 0xf f f f f f f f $s3 0x12004094 24 10 + $s3 =... 0001 1000 +... 1001 0100... 1010 1100 = 0x120040ac $t0

24 EI209 Chapter 2.24Haojin Zhu, SJTU, 2015 Byte Addresses  Since 8-bit bytes are so useful, most architectures address individual bytes in memory l Alignment restriction - the memory address of a word must be on natural word boundaries (a multiple of 4 in MIPS-32)  Big Endian: leftmost byte is word address IBM 360/370, Motorola 68k, MIPS, Sparc, HP PA  Little Endian:rightmost byte is word address Intel 80x86, DEC Vax, DEC Alpha (Windows NT) msblsb 3 2 1 0 little endian byte 0 0 1 2 3 big endian byte 0

25 EI209 Chapter 2.25Haojin Zhu, SJTU, 2015 Aside: Loading and Storing Bytes  MIPS provides special instructions to move bytes lb$t0, 1($s3) #load byte from memory sb$t0, 6($s3) #store byte to memory 0x28 19 8 16 bit offset  What 8 bits get loaded and stored? l load byte places the byte from memory in the rightmost 8 bits of the destination register -what happens to the other bits in the register? l store byte takes the byte from the rightmost 8 bits of a register and writes it to a byte in memory -what happens to the other bits in the memory word?

26 EI209 Chapter 2.26Haojin Zhu, SJTU, 2015 Example of Loading and Storing Bytes  Given following code sequence and memory state what is the state of the memory after executing the code? add$s3, $zero, $zero lb$t0, 1($s3) sb$t0, 6($s3) Memory 0x 0 0 9 0 1 2 A 0 DataWord Address (Decimal) 0 4 8 12 16 20 24 0x F F F F F F F F 0x 0 1 0 0 0 4 0 2 0x 1 0 0 0 0 0 1 0 0x 0 0 0 0 0 0 0 0  What value is left in $t0?  What if the machine was little Endian?  What word is changed in Memory and to what?

27 EI209 Chapter 2.27Haojin Zhu, SJTU, 2015 Example of Loading and Storing Bytes  Given following code sequence and memory state what is the state of the memory after executing the code? add$s3, $zero, $zero lb$t0, 1($s3) sb$t0, 6($s3) Memory 0x 0 0 9 0 1 2 A 0 DataWord Address (Decimal) 0 4 8 12 16 20 24 0x F F F F F F F F 0x 0 1 0 0 0 4 0 2 0x 1 0 0 0 0 0 1 0 0x 0 0 0 0 0 0 0 0 $t0 = 0x00000090 mem(4) = 0xFFFF90FF mem(4) = 0xFF12FFFF  What value is left in $t0?  What if the machine was little Endian?  What word is changed in Memory and to what? $t0 = 0x00000012

28 EI209 Chapter 2.28Haojin Zhu, SJTU, 2015 Speed of Registers vs. Memory  Given that l Registers: 32 words (128 Bytes) l Memory: Billions of bytes (2 GB to 8 GB on laptop)  and the RISC principle is… l Smaller is faster  How much faster are registers than memory??  About 100-500 times faster! l in terms of latency of one access

29 EI209 Chapter 2.29Haojin Zhu, SJTU, 2015 addi$sp, $sp, 4#$sp = $sp + 4 slti $t0, $s2, 15#$t0 = 1 if $s2<15  Machine format (I format): MIPS Immediate Instructions 0x0A 18 8 0x0F  Small constants are used often in typical code  Possible approaches? l put “typical constants” in memory and load them l create hard-wired registers (like $zero) for constants like 1 l have special instructions that contain constants !  The constant is kept inside the instruction itself! l Immediate format limits values to the range +2 15 –1 to -2 15 what about upper 16 bits?

30 EI209 Chapter 2.30Haojin Zhu, SJTU, 2015  We'd also like to be able to load a 32 bit constant into a register, for this we must use two instructions  a new "load upper immediate" instruction lui $t0, 1010101010101010  Then must get the lower order bits right, use ori $t0, $t0, 1010101010101010 Aside: How About Larger Constants? 16 0 8 1010101010101010 2 1010101010101010 00000000000000001010101010101010 0000000000000000 1010101010101010 why can’t addi be used as the second instruction for this 32 bit constant?

31 EI209 Chapter 2.31Haojin Zhu, SJTU, 2015 Review: Unsigned Binary Representation HexBinaryDecimal 0x000000000…00000 0x000000010…00011 0x000000020…00102 0x000000030…00113 0x000000040…01004 0x000000050…01015 0x000000060…01106 0x000000070…01117 0x000000080…10008 0x000000090…10019 … 0xFFFFFFFC1…1100 0xFFFFFFFD1…1101 0xFFFFFFFE1…1110 0xFFFFFFFF1…1111 2 32 - 1 2 32 - 2 2 32 - 3 2 32 - 4 2 32 - 1 1 1 1... 1 1 1 1 bit 31 30 29... 3 2 1 0 bit position 2 31 2 30 2 29... 2 3 2 2 2 1 2 0 bit weight 1 0 0 0... 0 0 0 0 - 1

32 EI209 Chapter 2.32Haojin Zhu, SJTU, 2015 Possible Representations of Computers Sign Mag.Two’s Comp.One’s Comp. 1000 = -8 1111 = -71001= -71000 = -7 1110 = -61010 = -61001 = -6 1101 = -51011 = -51010 = -5 1100 = -4 1011 = -4 1011 = -31101 = -31100 = -3 1010 = -21110 = -21101 = -2 1001 = -11111 = -11110 = -1 1000 = -01111 = -0 0000 = +00000 = 00000 = +0 0001 = +1 0010 = +2 0011 = +3 0100 = +4 0101 = +5 0110 = +6 0111 = +7  Which one is best? Why?

33 EI209 Chapter 2.33Haojin Zhu, SJTU, 2015 Review: Signed Binary Representation 2’sc binarydecimal 1000-8 1001-7 1010-6 1011-5 1100-4 1101-3 1110-2 1111 00000 00011 00102 00113 01004 01015 01106 01117 2 3 - 1 = -(2 3 - 1) = -2 3 = 1010 complement all the bits 1011 and add a 1 complement all the bits 0101 and add a 1 0110 why it holds?

34 EI209 Chapter 2.34Haojin Zhu, SJTU, 2015 MIPS Shift Operations  Need operations to pack and unpack 8-bit characters into 32-bit words  Shifts move all the bits in a word left or right sll $t2, $s0, 8#$t2 = $s0 << 8 bits srl $t2, $s0, 8#$t2 = $s0 >> 8 bits  Instruction Format (R format)  Such shifts are called logical because they fill with zeros l Notice that a 5-bit shamt field is enough to shift a 32-bit value 2 5 – 1 or 31 bit positions 0 16 10 8 0x00

35 EI209 Chapter 2.35Haojin Zhu, SJTU, 2015 MIPS Logical Operations  There are a number of bit-wise logical operations in the MIPS ISA and $t0, $t1, $t2#$t0 = $t1 & $t2 or $t0, $t1, $t2#$t0 = $t1 | $t2 nor $t0, $t1, $t2#$t0 = not($t1 | $t2)  Instruction Format (R format) andi $t0, $t1, 0xFF00#$t0 = $t1 & ff00 ori $t0, $t1, 0xFF00#$t0 = $t1 | ff00  Instruction Format (I format) 0 9 10 8 0 0x24 0x0D 9 8 0xFF00

36 EI209 Chapter 2.36Haojin Zhu, SJTU, 2015  MIPS conditional branch instructions: bne $s0, $s1, Lbl#go to Lbl if $s0  $s1 beq $s0, $s1, Lbl#go to Lbl if $s0=$s1 Ex: if (i==j) h = i + j; bne $s0, $s1, Lbl1 add $s3, $s0, $s1 Lbl1:... MIPS Control Flow Instructions  Instruction Format (I format): 0x05 16 17 16 bit offset  How is the branch destination address specified?

37 EI209 Chapter 2.37Haojin Zhu, SJTU, 2015 Specifying Branch Destinations  Use a register (like in lw and sw) added to the 16-bit offset l which register? Instruction Address Register (the PC) -its use is automatically implied by instruction -PC gets updated (PC+4) during the fetch cycle so that it holds the address of the next instruction l limits the branch distance to -2 15 to +2 15 -1 (word) instructions from the (instruction after the) branch instruction, but most branches are local anyway PC Add 32 offset 16 32 00 sign-extend from the low order 16 bits of the branch instruction branch dst address ? Add 4 32

38 EI209 Chapter 2.38Haojin Zhu, SJTU, 2015  We have beq, bne, but what about other kinds of branches (e.g., branch-if-less-than)? For this, we need yet another instruction, slt  Set on less than instruction: slt $t0, $s0, $s1 # if $s0 < $s1 then # $t0 = 1else # $t0 = 0  Instruction format (R format):  Alternate versions of slt slti $t0, $s0, 25# if $s0 < 25 then $t0=1... sltu $t0, $s0, $s1# if $s0 < $s1 then $t0=1... sltiu $t0, $s0, 25# if $s0 < 25 then $t0=1... 2 In Support of Branch Instructions 0 16 17 8 0x24

39 EI209 Chapter 2.39Haojin Zhu, SJTU, 2015 Aside: More Branch Instructions  Can use slt, beq, bne, and the fixed value of 0 in register $zero to create other conditions less than blt $s1, $s2, Label less than or equal to ble $s1, $s2, Label greater than bgt $s1, $s2, Label great than or equal to bge $s1, $s2, Label slt $at, $s1, $s2#$at set to 1 if bne $at, $zero, Label#$s1 < $s2  Such branches are included in the instruction set as pseudo instructions - recognized (and expanded) by the assembler Its why the assembler needs a reserved register ( $at )

40 EI209 Chapter 2.40Haojin Zhu, SJTU, 2015 MIPS Instructions Consider a comparison instruction: slt $t0, $t1, $zero and $t1 contains the 32-bit number 1111 01…01 What gets stored in $t0?

41 EI209 Chapter 2.41Haojin Zhu, SJTU, 2015 MIPS Instructions Consider a comparison instruction: slt $t0, $t1, $zero and $t1 contains the 32-bit number 1111 01…01 What gets stored in $t0? The result depends on whether $t1 is a signed or unsigned number – the compiler/programmer must track this and accordingly use either slt or sltu slt $t0, $t1, $zero stores 1 in $t0 sltu $t0, $t1, $zero stores 0 in $t0

42 EI209 Chapter 2.42Haojin Zhu, SJTU, 2015 Bounds Check Shortcut  Treating signed numbers as if they were unsigned gives a low cost way of checking if 0 ≤ x < y (index out of bounds for arrays) sltu $t0, $s1, $t2 # $t0 = 0 if # $s1 > $t2 (max) # or $s1 < 0 (min) beq $t0, $zero, IOOB# go to IOOB if # $t0 = 0  The key is that negative integers in two’s complement look like large numbers in unsigned notation. Thus, an unsigned comparison of x < y also checks if x is negative as well as if x is less than y.

43 EI209 Chapter 2.43Haojin Zhu, SJTU, 2015  MIPS also has an unconditional branch instruction or jump instruction: j label#go to label Other Control Flow Instructions  Instruction Format (J Format): 0x02 26-bit address PC 4 32 26 32 00 from the low order 26 bits of the jump instruction Why shift left by two bits?

44 EI209 Chapter 2.44Haojin Zhu, SJTU, 2015 Aside: Branching Far Away  What if the branch destination is further away than can be captured in 16 bits?  The assembler comes to the rescue – it inserts an unconditional jump to the branch target and inverts the condition beq$s0, $s1, L1 becomes bne$s0, $s1, L2 jL1 L2:

45 EI209 Chapter 2.45Haojin Zhu, SJTU, 2015 Compiling Another While Loop  Compile the assembly code for the C while loop where i is in $s3, k is in $s5, and the base address of the array save is in $s6 while (save[i] == k) i += 1;

46 EI209 Chapter 2.46Haojin Zhu, SJTU, 2015 Compiling Another While Loop  Compile the assembly code for the C while loop where i is in $s3, k is in $s5, and the base address of the array save is in $s6 while (save[i] == k) i += 1; Loop:sll$t1, $s3, 2 add$t1, $t1, $s6 lw$t0, 0($t1) bne$t0, $s5, Exit addi$s3, $s3, 1 jLoop Exit:...

47 Can make a “for” loop with only j Can make a “for” loop with only beq Can make an unconditional branch from a conditional branch instruction ☐ ☐ ☐ ☐ Which of the following is FALSE?

48 EI209 Chapter 2.48Haojin Zhu, SJTU, 2015 Review: MIPS Addressing Modes Illustrated 1. Register addressing op rs rt rd funct Register word operand op rs rt offset 2. Base (displacement) addressing base register Memory word or byte operand 3. Immediate addressing op rs rt operand 4. PC-relative addressing op rs rt offset Program Counter (PC) Memory branch destination instruction 5. Pseudo-direct addressing op jump address Program Counter (PC) Memory jump destination instruction||

49 EI209 Chapter 2.49Haojin Zhu, SJTU, 2015 End of Lecture #1  The next topic is procedure

50 EI209 Chapter 2.50Haojin Zhu, SJTU, 2015  MIPS procedure call instruction: jalProcedureAddress#jump and link  Saves PC+4 in register $ra to have a link to the next instruction for the procedure return  Machine format (J format):  Then can do procedure return with a jr$ra#return  Instruction format (R format): Instructions for Accessing Procedures 0x03 26 bit address 0 31 0x08

51 EI209 Chapter 2.51Haojin Zhu, SJTU, 2015 Six Steps in Execution of a Procedure 1. Main routine (caller) places parameters in a place where the procedure (callee) can access them $a0 - $a3 : four argument registers 2. Caller transfers control to the callee 3. Callee acquires the storage resources needed 4. Callee performs the desired task 5. Callee places the result value in a place where the caller can access it $v0 - $v1 : two value registers for result values 6. Callee returns control to the caller $ra : one return address register to return to the point of origin

52 EI209 Chapter 2.52Haojin Zhu, SJTU, 2015 MIPS Function Call Conventions  Registers faster than memory, so use them  $a0–$a3 : four argument registers to pass parameters  $v0–$v1 : two value registers to return values  $ra : one return address register to return to the point of origin  (7 + $zero +$at of 32, 23 left!)  $t0-$t9: 10 x temporaries (intermediates)  $s0-$s7: 8 x “saved” temporaries (program variables)  18 registers  32 – (18 + 9) = 5 left

53 EI209 Chapter 2.53Haojin Zhu, SJTU, 2015 Notes on Functions  Calling program (caller) puts parameters into registers $a0-$a3 and uses jal X to invoke X (callee)  Must have register in computer with address of currently executing instruction l Instead of Instruction Address Register (better name), historically called Program Counter (PC) l It’s a program’s counter; it doesn’t count programs!  jr $ra puts address inside $ra into PC  What value does jal X place into $ra ? ???? Student Roulette?

54 EI209 Chapter 2.54Haojin Zhu, SJTU, 2015 Notes on Functions  Calling program (caller) puts parameters into registers $a0-$a3 and uses jal X to invoke X (callee)  Must have register in computer with address of currently executing instruction l Instead of Instruction Address Register (better name), historically called Program Counter (PC) l It’s a program’s counter, it doesn’t count programs!  jr $ra puts address inside $ra into PC  What value does jal X place into $ra ? Next PC PC + 4

55 EI209 Chapter 2.55Haojin Zhu, SJTU, 2015 Aside: Spilling Registers  What if the callee needs to use more registers than allocated to argument and return values? l callee uses a stack – a last-in-first-out queue low addr high addr $sp  One of the general registers, $sp ( $29 ), is used to address the stack (which “grows” from high address to low address) l add data onto the stack – push $sp = $sp – 4 data on stack at new $sp l remove data from the stack – pop data from stack at $sp $sp = $sp + 4 top of stack (28 out of 32, 4 left!)

56 EI209 Chapter 2.56Haojin Zhu, SJTU, 2015 Compiling a C Leaf Procedure  Leaf procedures are ones that do not call other procedures. Give the MIPS assembler code for int leaf_ex (int g, int h, int i, int j) {int f; f = (g+h) – (i+j); return f;} where g, h, i, and j are in $a0, $a1, $a2, $a3

57 EI209 Chapter 2.57Haojin Zhu, SJTU, 2015 Compiling a C Leaf Procedure  Leaf procedures are ones that do not call other procedures. Give the MIPS assembler code for int leaf_ex (int g, int h, int i, int j) { int f; f = (g+h) – (i+j); return f;} where g, h, i, and j are in $a0, $a1, $a2, $a3

58 EI209 Chapter 2.58Haojin Zhu, SJTU, 2015 Leaf Procedure Example  C code: int leaf_example (int g, h, i, j) { int f; f = (g + h) - (i + j); return f; } l Arguments g, …, j are passed in $a0, …, $a3 l f in $s0 (we need to save $s0 on stack – we will see why later) l Results are returned in $v0, $v1 $a0 $a1 $a2 $a3 $v0 $v1 argument registers result registers procedure

59 EI209 Chapter 2.59Haojin Zhu, SJTU, 2015 Procedure Call Instructions  Procedure call: jump and link jal ProcedureLabel l Address of following instruction put in $ra l Jumps to target address  Procedure return: jump register jr $ra l Copies $ra to program counter l Can also be used for computed jumps -e.g., for case/switch statements Example:

60 EI209 Chapter 2.60Haojin Zhu, SJTU, 2015 Leaf Procedure Example  MIPS code: leaf_example: addi $sp, $sp, -4 sw $s0, 0($sp) add $t0, $a0, $a1 add $t1, $a2, $a3 sub $s0, $t0, $t1 add $v0, $s0, $zero lw $s0, 0($sp) addi $sp, $sp, 4 jr $ra Save $s0 on stack Procedure body Restore $s0 Result Return

61 EI209 Chapter 2.61Haojin Zhu, SJTU, 2015 Procedure Call Mechanics Old Stack Frame arg registers return address Saved registers local variables New Stack Frame $fp $sp Low Address High Address compiler ISA HW compiler addressing $gp PC $sp stack dynamic data static data text reserved System Wide Memory Map

62 EI209 Chapter 2.62Haojin Zhu, SJTU, 2015 Example of the Stack Frame Call Sequence 1. place excess arguments 2. save caller save registers ($a0-$a3, $t0-$t9) 3. jal 4. allocate stack frame 5. save callee save registers ($s0-$s9, $fp, $ra) 6 set frame pointer Return 1. place function argument in $v0 2. restore callee save registers 3. restore $fp 4. pop frame 5. jr $31 arg 1 arg 2.. callee saved registers caller saved registers local variables.. $fp $ra $s0-$s9 $a0-$a3 $t0-$t9 $fp $sp

63 EI209 Chapter 2.63Haojin Zhu, SJTU, 2015 Nested Procedures  What happens to return addresses with nested procedures? int rt_1 (int i) { if (i == 0) return 0; else return rt_2(i-1); } caller:jalrt_1 next:... rt_1:bne$a0, $zero, to_2 add$v0, $zero, $zero jr$ra to_2:addi$a0, $a0, -1 jalrt_2 jr$ra rt_2:...

64 EI209 Chapter 2.64Haojin Zhu, SJTU, 2015 Nested Procedures Outcome caller:jalrt_1 next:... rt_1:bne$a0, $zero, to_2 add$v0, $zero, $zero jr$ra to_2:addi$a0, $a0, -1 jalrt_2 jr$ra rt_2:...  On the call to rt_1, the return address ( next in the caller routine) gets stored in $ra. What happens to the value in $ra (when i != 0 ) when rt_1 makes a call to rt_2 ?

65 EI209 Chapter 2.65Haojin Zhu, SJTU, 2015 Saving the Return Address, Part 1  Nested procedures ( i passed in $a0, return value in $v0 ) rt_1:bne$a0, $zero, to_2 add$v0, $zero, $zero jr$ra to_2:addi$sp, $sp, -8 sw$ra, 4($sp) sw$a0, 0($sp) addi$a0, $a0, -1 jalrt_2 bk_2:lw$a0, 0($sp) lw$ra, 4($sp) addi$sp, $sp, 8 jr$ra  Save the return address (and arguments) on the stack low addr high addr  $sp $ra old TOS

66 EI209 Chapter 2.66Haojin Zhu, SJTU, 2015 Saving the Return Address, Part 1  Nested procedures ( i passed in $a0, return value in $v0 ) rt_1:bne$a0, $zero, to_2 add$v0, $zero, $zero jr$ra to_2:addi$sp, $sp, -8 sw$ra, 4($sp) sw$a0, 0($sp) addi$a0, $a0, -1 jalrt_2 bk_2:lw$a0, 0($sp) lw$ra, 4($sp) addi$sp, $sp, 8 jr$ra  Save the return address (and arguments) on the stack caller rt addr high addr  $sp low addr old $a0 $ra caller rt addr bk_2 old TOS

67 EI209 Chapter 2.67Haojin Zhu, SJTU, 2015 Saving the Return Address, Part 2  Nested procedures ( i passed in $a0, return value in $v0 ) rt_1:bne$a0, $zero, to_2 add$v0, $zero, $zero jr$ra to_2:addi$sp, $sp, -8 sw$ra, 4($sp) sw$a0, 0($sp) addi$a0, $a0, -1 jalrt_2 bk_2:lw$a0, 0($sp) lw$ra, 4($sp) addi$sp, $sp, 8 jr$ra  Save the return address (and arguments) on the stack low addr high addr  $sp $ra old TOS

68 EI209 Chapter 2.68Haojin Zhu, SJTU, 2015 Saving the Return Address, Part 2  Nested procedures ( i passed in $a0, return value in $v0 ) rt_1:bne$a0, $zero, to_2 add$v0, $zero, $zero jr$ra to_2:addi$sp, $sp, -8 sw$ra, 4($sp) sw$a0, 0($sp) addi$a0, $a0, -1 jalrt_2 bk_2:lw$a0, 0($sp) lw$ra, 4($sp) addi$sp, $sp, 8 jr$ra  Save the return address (and arguments) on the stack caller rt addr high addr  $sp low addr old $a0 $ra bk_2caller rt addr old TOS

69 EI209 Chapter 2.69Haojin Zhu, SJTU, 2015 MIPS Register Convention NameRegister Number UsagePreserve on call? $zero0the constant 0n.a. $v0 - $v12-3returned valuesno $a0 - $a34-7argumentsyes $t0 - $t78-15temporariesno $s0 - $s716-23saved valuesyes $t8 - $t924-25temporariesno $gp28global pointeryes $sp29stack pointeryes $fp30frame pointeryes $ra31return addressyes

70 EI209 Chapter 2.70Haojin Zhu, SJTU, 2015 Allocating Space on Stack  C has two storage classes: automatic and static l Automatic variables are local to function and discarded when function exits l Static variables exist across exits from and entries to procedures  Use stack for automatic (local) variables that don’t fit in registers  Procedure frame or activation record: segment of stack with saved registers and local variables  Some MIPS compilers use a frame pointer ( $fp ) to point to first word of frame  (29 of 32, 3 left!)

71 EI209 Chapter 2.71Haojin Zhu, SJTU, 2015 Stack Before, During, After Call

72 EI209 Chapter 2.72Haojin Zhu, SJTU, 2015 Compiling a Recursive Procedure  A procedure for calculating factorial int fact (int n) { if (n < 1) return 1; else return (n * fact (n-1)); }  A recursive procedure (one that calls itself!) fact (0) = 1 fact (1) = 1 * 1 = 1 fact (2) = 2 * 1 * 1 = 2 fact (3) = 3 * 2 * 1 * 1 = 6 fact (4) = 4 * 3 * 2 * 1 * 1 = 24...  Assume n is passed in $a0 ; result returned in $v0

73 EI209 Chapter 2.73Haojin Zhu, SJTU, 2015 Compiling a Recursive Procedure fact:addi$sp, $sp, -8#adjust stack pointer sw$ra, 4($sp)#save return address sw$a0, 0($sp)#save argument n slti$t0, $a0, 1#test for n < 1 beq$t0, $zero, L1#if n >=1, go to L1 addi$v0, $zero, 1#else return 1 in $v0 addi$sp, $sp, 8#adjust stack pointer jr$ra#return to caller (1 st ) L1:addi$a0, $a0, -1#n >=1, so decrement n jalfact#call fact with (n-1) #this is where fact returns bk_f:lw$a0, 0($sp)#restore argument n lw$ra, 4($sp)#restore return address addi$sp, $sp, 8#adjust stack pointer mul$v0, $a0, $v0#$v0 = n * fact(n-1) jr$ra#return to caller (2 nd ) Using stacks to save existing data in $ra and $a0 Make the branch Call procedure Fact(n-1) Return Fact(1) Return after procedure Fact(n-1)

74 EI209 Chapter 2.74Haojin Zhu, SJTU, 2015 A Look at the Stack for $a0 = 2, Part 1  $sp $ra $a0 $v0 old TOS  Stack state after execution of the first encounter with jal (second call to fact routine with $a0 now holding 1) l saved return address to caller routine (i.e., location in the main routine where first call to fact is made) on the stack saved original value of $a0 on the stack

75 EI209 Chapter 2.75Haojin Zhu, SJTU, 2015 A Look at the Stack for $a0 = 2, Part 1  $sp $ra $a0 $v0  $sp caller rt addr $a0 = 2 21 bk_f old TOS  Stack state after execution of the first encounter with jal (second call to fact routine with $a0 now holding 1) l saved return address to caller routine (i.e., location in the main routine where first call to fact is made) on the stack saved original value of $a0 on the stack

76 EI209 Chapter 2.76Haojin Zhu, SJTU, 2015 A Look at the Stack for $a0 = 2, Part 2  $sp $ra $a0 $v0 old TOS  Stack state after execution of the second encounter with jal (third call to fact routine with $a0 now holding 0) save return address of instruction in caller routine (instruction after jal ) on the stack save previous value of $a0 on the stack

77 EI209 Chapter 2.77Haojin Zhu, SJTU, 2015 A Look at the Stack for $a0 = 2, Part 2 $ra $a0 $v0  $sp caller rt addr $a0 = 2 10 bk_f old TOS  $sp $a0 = 1 bk_f  Stack state after execution of the second encounter with jal (third call to fact routine with $a0 now holding 0) saved return address of instruction in caller routine (instruction after jal ) on the stack saved previous value of $a0 on the stack

78 EI209 Chapter 2.78Haojin Zhu, SJTU, 2015 A Look at the Stack for $a0 = 2, Part 3  $sp $ra $a0 $v0 old TOS  Stack state after execution of the first encounter with the first jr ( $v0 initialized to 1) l stack pointer updated to point to third call to fact

79 EI209 Chapter 2.79Haojin Zhu, SJTU, 2015 A Look at the Stack for $a0 = 2, Part 3 $ra $a0 $v0  $sp bk_f $a0 = 1 0 old TOS  $sp $a0 = 0 bk_f $a0 = 2 caller rt addr 1  $sp  Stack state after execution of the first encounter with the first jr ( $v0 initialized to 1) l stack pointer updated to point to third call to fact

80 EI209 Chapter 2.80Haojin Zhu, SJTU, 2015 A Look at the Stack for $a0 = 2, Part 4  $sp $ra $a0 $v0 old TOS  Stack state after execution of the first encounter with the second jr (return from fact routine after updating $v0 to 1 * 1) return address to caller routine ( bk_f in fact routine) restored to $ra from the stack previous value of $a0 restored from the stack l stack pointer updated to point to second call to fact

81 EI209 Chapter 2.81Haojin Zhu, SJTU, 2015 A Look at the Stack for $a0 = 2, Part 4 $ra $a0 $v0  $sp bk_f $a0 = 1 0 old TOS $a0 = 0 bk_f $a0 = 2 caller rt addr 1  $sp 1 bk_f 1 * 1  Stack state after execution of the first encounter with the second jr (return from fact routine after updating $v0 to 1 * 1) return address to caller routine ( bk_f in fact routine) restored to $ra from the stack previous value of $a0 restored from the stack l stack pointer updated to point to second call to fact

82 EI209 Chapter 2.82Haojin Zhu, SJTU, 2015 A Look at the Stack for $a0 = 2, Part 5  $sp $ra $a0 $v0 old TOS  Stack state after execution of the second encounter with the second jr (return from fact routine after updating $v0 to 2 * 1 * 1) return address to caller routine (main routine) restored to $ra from the stack original value of $a0 restored from the stack l stack pointer updated to point to first call to fact

83 EI209 Chapter 2.83Haojin Zhu, SJTU, 2015 A Look at the Stack for $a0 = 2, Part 5 $ra $a0 $v0  $sp bk_f $a0 = 1 1 old TOS $a0 = 0 bk_f $a0 = 2 caller rt addr 1 * 1  $sp 2 bk_f 2 * 1 * 1  Stack state after execution of the second encounter with the second jr (return from fact routine after updating $v0 to 2 * 1 * 1) return address to caller routine (main routine) restored to $ra from the stack original value of $a0 restored from the stack l stack pointer updated to point to first call to fact caller rt addr

84 EI209 Chapter 2.84Haojin Zhu, SJTU, 2015 Optimized Function Convention To reduce expensive loads and stores from spilling and restoring registers, MIPS divides registers into two categories: 1. Preserved across function call l Caller can rely on values being unchanged $ra, $sp, $gp, $fp, “saved registers” $s0 - $s7 2. Not preserved across function call l Caller cannot rely on values being unchanged Return value registers $v0, $v1, Argument registers $a0 - $a3, “temporary registers” $t0 - $t9

85 EI209 Chapter 2.85Haojin Zhu, SJTU, 2015 Where is the Stack in Memory?  MIPS convention  Stack starts in high memory and grows down l Hexadecimal (base 16) : 7fff fffc hex  MIPS programs (text segment) in low end l 0040 0000 hex  static data segment (constants and other static variables) above text for static variables MIPS convention global pointer ( $gp ) points to static l (30 of 32, 2 left! – will see when talk about OS)  Heap above static for data structures that grow and shrink ; grows up to high addresses

86 EI209 Chapter 2.86Haojin Zhu, SJTU, 2015 MIPS Memory Allocation 2/1/2016Fall 2012 -- Lecture #786

87 jal saves PC+1 in %ra The callee can use temporary registers (%ti) without saving and restoring them MIPS uses jal to invoke a function and jr to return from a function ☐ ☐ ☐ ☐ Which statement is FALSE?

88 EI209 Chapter 2.88Haojin Zhu, SJTU, 2015 Assignment II Chapter II: 2.4, 2.6, 2.14, 2.20 Submission Deadline: Oct 30


Download ppt "EI209 Chapter 2.1Haojin Zhu, SJTU, 2015 EI 209 Computer Organization Fall 2015 Chapter 2: Instructions: Language of the Computer Haojin Zhu (http://tdt.sjtu.edu.cn/~hjzhu/"

Similar presentations


Ads by Google