1 Lecture 2: MIPS Instruction Set Today’s topic:  MIPS instructions Reminder: sign up for the mailing list cs3810 Reminder: set up your CADE accounts.

Slides:



Advertisements
Similar presentations
Henk Corporaal TUEindhoven 2011
Advertisements

1 Lecture 3: MIPS Instruction Set Today’s topic:  More MIPS instructions  Procedure call/return Reminder: Assignment 1 is on the class web-page (due.
C OMPUTER O RGANIZATION AND D ESIGN The Hardware/Software Interface 5 th Edition Chapter 2 Instructions: Language of the Computer.
Lecture 5: MIPS Instruction Set
The University of Adelaide, School of Computer Science
Lecture 9: MIPS Instruction Set
1 Lecture 4: Procedure Calls Today’s topics:  Procedure calls  Large constants  The compilation process Reminder: Assignment 1 is due on Thursday.
Lecture 8 Sept 23 Completion of Ch 2 translating procedure into MIPS role of compilers, assemblers, linking, loading etc. pitfalls, conclusions Chapter.
Lecture 6: MIPS Instruction Set Today’s topic –Control instructions –Procedure call/return 1.
CS1104 – Computer Organization PART 2: Computer Architecture Lecture 5 MIPS ISA & Assembly Language Programming.
1 Nested Procedures Procedures that don't call others are called leaf procedures, procedures that call others are called nested procedures. Problems may.
CPS3340 COMPUTER ARCHITECTURE Fall Semester, /17/2013 Lecture 12: Procedures Instructor: Ashraf Yaseen DEPARTMENT OF MATH & COMPUTER SCIENCE CENTRAL.
The University of Adelaide, School of Computer Science
The University of Adelaide, School of Computer Science
CS3350B Computer Architecture Winter 2015 Lecture 4
Lecture 8: MIPS Instruction Set
Lec 9Systems Architecture1 Systems Architecture Lecture 9: Assemblers, Linkers, and Loaders Jeremy R. Johnson Anatole D. Ruslanov William M. Mongan Some.
ENEE350 Spring07 1 Ankur Srivastava University of Maryland, College Park Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005.”
1 Lecture 5: MIPS Examples Today’s topics:  the compilation process  full example – sort in C Reminder: 2 nd assignment will be posted later today.
S. Barua – CPSC 440 CHAPTER 2 INSTRUCTIONS: LANGUAGE OF THE COMPUTER Goals – To get familiar with.
1 Lecture 2: MIPS Instruction Set Today’s topic:  MIPS instructions Reminder: sign up for the mailing list cs3810 Reminder: set up your CADE accounts.
RISC Concepts, MIPS ISA and the Mini–MIPS project
Lecture 5 Sept 14 Goals: Chapter 2 continued MIPS assembly language instruction formats translating c into MIPS - examples.
COMPUTER ARCHITECTURE & OPERATIONS I Instructor: Hao Ji.
COMPUTER ARCHITECTURE & OPERATIONS I Instructor: Hao Ji.
Lecture 7: MIPS Instruction Set Today’s topic –Procedure call/return –Large constants Reminders –Homework #2 posted, due 9/17/
13/02/2009CA&O Lecture 04 by Engr. Umbreen Sabir Computer Architecture & Organization Instructions: Language of Computer Engr. Umbreen Sabir Computer Engineering.
Computer Architecture Instruction Set Architecture Lynn Choi Korea University.
Adapted from Computer Organization and Design, Patterson & Hennessy, UCB ECE232: Hardware Organization and Design Part 7: MIPS Instructions III
Lecture 4: MIPS Instruction Set Reminders: –Homework #1 posted: due next Wed. –Midterm #1 scheduled Friday September 26 th, 2014 Location: TODD 430 –Midterm.
April 23, 2001Systems Architecture I1 Systems Architecture I (CS ) Lecture 9: Assemblers, Linkers, and Loaders * Jeremy R. Johnson Mon. April 23,
Lecture 4: MIPS Instruction Set
Chapter 2 CSF 2009 The MIPS Assembly Language. Stored Program Computers Instructions represented in binary, just like data Instructions and data stored.
MS108 Computer System I Lecture 3 ISA Prof. Xiaoyao Liang 2015/3/13 1.
Computer Organization CS224 Fall 2012 Lessons 7 and 8.
Computer Architecture CSE 3322 Lecture 4 Assignment: 2.4.1, 2.4.4, 2.6.1, , Due 2/10/09
Chapter 2 — Instructions: Language of the Computer — 1 Conditional Operations Branch to a labeled instruction if a condition is true – Otherwise, continue.
CDA 3101 Spring 2016 Introduction to Computer Organization
1 Lecture 6: Assembly Programs Today’s topics:  Large constants  The compilation process  A full example  Intro to the MARS simulator.
Adapted from Computer Organization and Design, Patterson & Hennessy, UCB ECE232: Hardware Organization and Design Part 8: MIPS Procedures and Recursion.
Chapter 2 Instructions: Language of the Computer.
MIPS Assembly.
Lecture 5: Procedure Calls
Computer Architecture & Operations I
Lecture 3: MIPS Instruction Set
Lecture 6: Assembly Programs
Computer Architecture Instruction Set Architecture
Lecture 7: MARS, Computer Arithmetic
Computer Architecture Instruction Set Architecture
Morgan Kaufmann Publishers
Lecture 4: MIPS Instruction Set
RISC Concepts, MIPS ISA Logic Design Tutorial 8.
Procedures (Functions)
ECE/CS 552: Instruction Sets – MIPS
Instructions - Type and Format
Lecture 4: MIPS Instruction Set
Henk Corporaal TUEindhoven 2010
The University of Adelaide, School of Computer Science
ECE232: Hardware Organization and Design
The University of Adelaide, School of Computer Science
Lecture 7: Examples, MARS, Arithmetic
Instruction encoding The ISA defines Format = Encoding
Lecture 5: Procedure Calls
Chapter 2 Instructions: Language of the Computer part 2
Computer Organization and Design Assembly & Compilation
Computer Instructions
Computer Architecture
Lecture 3: MIPS Instruction Set
Lecture 6: Assembly Programs
Computer Architecture
Presentation transcript:

1 Lecture 2: MIPS Instruction Set Today’s topic:  MIPS instructions Reminder: sign up for the mailing list cs3810 Reminder: set up your CADE accounts (EMCB 224)

2 Recap Knowledge of hardware improves software quality: compilers, OS, threaded programs, memory management Important trends: growing transistors, move to multi-core, slowing rate of performance improvement, power/thermal constraints, long memory/disk latencies

3 Instruction Set Understanding the language of the hardware is key to understanding the hardware/software interface A program (in say, C) is compiled into an executable that is composed of machine instructions – this executable must also run on future machines – for example, each Intel processor reads in the same x86 instructions, but each processor handles instructions differently Java programs are converted into portable bytecode that is converted into machine instructions during execution (just-in-time compilation) What are important design principles when defining the instruction set architecture (ISA)?

4 Instruction Set Important design principles when defining the instruction set architecture (ISA):  keep the hardware simple – the chip must only implement basic primitives and run fast  keep the instructions regular – simplifies the decoding/scheduling of instructions

5 A Basic MIPS Instruction C code: a = b + c ; Assembly code: (human-friendly machine instructions) add a, b, c # a is the sum of b and c Machine code: (hardware-friendly machine instructions) Translate the following C code into assembly code: a = b + c + d + e;

6 Example C code a = b + c + d + e; translates into the following assembly code: add a, b, c add a, b, c add a, a, d or add f, d, e add a, a, e add a, a, f Instructions are simple: fixed number of operands (unlike C) A single line of C code is converted into multiple lines of assembly code Some sequences are better than others… the second sequence needs one more (temporary) variable f

7 Subtract Example C code f = (g + h) – (i + j); Assembly code translation with only add and sub instructions:

8 Subtract Example C code f = (g + h) – (i + j); translates into the following assembly code: add t0, g, h add f, g, h add t1, i, j or sub f, f, i sub f, t0, t1 sub f, f, j Each version may produce a different result because floating-point operations are not necessarily associative and commutative… more on this later

9 Operands In C, each “variable” is a location in memory In hardware, each memory access is expensive – if variable a is accessed repeatedly, it helps to bring the variable into an on-chip scratchpad and operate on the scratchpad (registers) To simplify the instructions, we require that each instruction (add, sub) only operate on registers Note: the number of operands (variables) in a C program is very large; the number of operands in assembly is fixed… there can be only so many scratchpad registers

10 Registers The MIPS ISA has 32 registers (x86 has 8 registers) – Why not more? Why not less? Each register is 32-bit wide (modern 64-bit architectures have 64-bit wide registers) A 32-bit entity (4 bytes) is referred to as a word To make the code more readable, registers are partitioned as $s0-$s7 (C/Java variables), $t0-$t9 (temporary variables)…

11 Memory Operands Values must be fetched from memory before (add and sub) instructions can operate on them Load word lw $t0, memory-address Store word sw $t0, memory-address How is memory-address determined? Register Memory Register Memory

12 Memory Address The compiler organizes data in memory… it knows the location of every variable (saved in a table)… it can fill in the appropriate mem-address for load-store instructions int a, b, c, d[10] Memory … Base address

13 Immediate Operands An instruction may require a constant as input An immediate instruction uses a constant number as one of the inputs (instead of a register operand) addi $s0, $zero, 1000 # the program has base address # 1000 and this is saved in $s0 # $zero is a register that always # equals zero addi $s1, $s0, 0 # this is the address of variable a addi $s2, $s0, 4 # this is the address of variable b addi $s3, $s0, 8 # this is the address of variable c addi $s4, $s0, 12 # this is the address of variable d[0]

14 Memory Instruction Format The format of a load instruction: destination register source address lw $t0, 8($t3) any register a constant that is added to the register in brackets

15 Example Convert to assembly: C code: d[3] = d[2] + a;

16 Example Convert to assembly: C code: d[3] = d[2] + a; Assembly: # addi instructions as before lw $t0, 8($s4) # d[2] is brought into $t0 lw $t1, 0($s1) # a is brought into $t1 add $t0, $t0, $t1 # the sum is in $t0 sw $t0, 12($s4) # $t0 is stored into d[3] Assembly version of the code continues to expand!

17 Recap – Numeric Representations Decimal Binary Hexadecimal (compact representation) 0x 23 or 23 hex 0-15 (decimal)  0-9, a-f (hex)

18 Instruction Formats Instructions are represented as 32-bit numbers (one word), broken into 6 fields R-type instruction add $t0, $s1, $s bits 5 bits 5 bits 5 bits 5 bits 6 bits op rs rt rd shamt funct opcode source source dest shift amt function I-type instruction lw $t0, 32($s3) 6 bits 5 bits 5 bits 16 bits opcode rs rt constant

19 Logical Operations Logical ops C operators Java operators MIPS instr Shift Left << << sll Shift Right >> >>> srl Bit-by-bit AND & & and, andi Bit-by-bit OR | | or, ori Bit-by-bit NOT ~ ~ nor

20 Control Instructions Conditional branch: Jump to instruction L1 if register1 equals register2: beq register1, register2, L1 Similarly, bne and slt (set-on-less-than) Unconditional branch: j L1 jr $s0 Convert to assembly: if (i == j) f = g+h; else f = g-h;

21 Control Instructions Conditional branch: Jump to instruction L1 if register1 equals register2: beq register1, register2, L1 Similarly, bne and slt (set-on-less-than) Unconditional branch: j L1 jr $s0 Convert to assembly: if (i == j) bne $s3, $s4, Else f = g+h; add $s0, $s1, $s2 else j Exit f = g-h; Else: sub $s0, $s1, $s2 Exit:

22 Example Convert to assembly: while (save[i] == k) i += 1; i and k are in $s3 and $s5 and base of array save[] is in $s6

23 Example Convert to assembly: while (save[i] == k) i += 1; i and k are in $s3 and $s5 and base of array save[] is in $s6 Loop: sll $t1, $s3, 2 add $t1, $t1, $s6 lw $t0, 0($t1) bne $t0, $s5, Exit addi $s3, $s3, 1 j Loop Exit:

24 Title Bullet

25 Lecture 2: MIPS Instruction Set Today’s topic:  MIPS instructions Reminder: sign up for the mailing list cs3810 Reminder: set up your CADE accounts (EMCB 224)

26 Recap Knowledge of hardware improves software quality: compilers, OS, threaded programs, memory management Important trends: growing transistors, move to multi-core, slowing rate of performance improvement, power/thermal constraints, long memory/disk latencies

27 Instruction Set Understanding the language of the hardware is key to understanding the hardware/software interface A program (in say, C) is compiled into an executable that is composed of machine instructions – this executable must also run on future machines – for example, each Intel processor reads in the same x86 instructions, but each processor handles instructions differently Java programs are converted into portable bytecode that is converted into machine instructions during execution (just-in-time compilation) What are important design principles when defining the instruction set architecture (ISA)?

28 Instruction Set Important design principles when defining the instruction set architecture (ISA):  keep the hardware simple – the chip must only implement basic primitives and run fast  keep the instructions regular – simplifies the decoding/scheduling of instructions

29 A Basic MIPS Instruction C code: a = b + c ; Assembly code: (human-friendly machine instructions) add a, b, c # a is the sum of b and c Machine code: (hardware-friendly machine instructions) Translate the following C code into assembly code: a = b + c + d + e;

30 Example C code a = b + c + d + e; translates into the following assembly code: add a, b, c add a, b, c add a, a, d or add f, d, e add a, a, e add a, a, f Instructions are simple: fixed number of operands (unlike C) A single line of C code is converted into multiple lines of assembly code Some sequences are better than others… the second sequence needs one more (temporary) variable f

31 Subtract Example C code f = (g + h) – (i + j); Assembly code translation with only add and sub instructions:

32 Subtract Example C code f = (g + h) – (i + j); translates into the following assembly code: add t0, g, h add f, g, h add t1, i, j or sub f, f, i sub f, t0, t1 sub f, f, j Each version may produce a different result because floating-point operations are not necessarily associative and commutative… more on this later

33 Operands In C, each “variable” is a location in memory In hardware, each memory access is expensive – if variable a is accessed repeatedly, it helps to bring the variable into an on-chip scratchpad and operate on the scratchpad (registers) To simplify the instructions, we require that each instruction (add, sub) only operate on registers Note: the number of operands (variables) in a C program is very large; the number of operands in assembly is fixed… there can be only so many scratchpad registers

34 Registers The MIPS ISA has 32 registers (x86 has 8 registers) – Why not more? Why not less? Each register is 32-bit wide (modern 64-bit architectures have 64-bit wide registers) A 32-bit entity (4 bytes) is referred to as a word To make the code more readable, registers are partitioned as $s0-$s7 (C/Java variables), $t0-$t9 (temporary variables)…

35 Memory Operands Values must be fetched from memory before (add and sub) instructions can operate on them Load word lw $t0, memory-address Store word sw $t0, memory-address How is memory-address determined? Register Memory Register Memory

36 Memory Address The compiler organizes data in memory… it knows the location of every variable (saved in a table)… it can fill in the appropriate mem-address for load-store instructions int a, b, c, d[10] Memory … Base address

37 Immediate Operands An instruction may require a constant as input An immediate instruction uses a constant number as one of the inputs (instead of a register operand) addi $s0, $zero, 1000 # the program has base address # 1000 and this is saved in $s0 # $zero is a register that always # equals zero addi $s1, $s0, 0 # this is the address of variable a addi $s2, $s0, 4 # this is the address of variable b addi $s3, $s0, 8 # this is the address of variable c addi $s4, $s0, 12 # this is the address of variable d[0]

38 Memory Instruction Format The format of a load instruction: destination register source address lw $t0, 8($t3) any register a constant that is added to the register in brackets

39 Example Convert to assembly: C code: d[3] = d[2] + a;

40 Example Convert to assembly: C code: d[3] = d[2] + a; Assembly: # addi instructions as before lw $t0, 8($s4) # d[2] is brought into $t0 lw $t1, 0($s1) # a is brought into $t1 add $t0, $t0, $t1 # the sum is in $t0 sw $t0, 12($s4) # $t0 is stored into d[3] Assembly version of the code continues to expand!

41 Recap – Numeric Representations Decimal Binary Hexadecimal (compact representation) 0x 23 or 23 hex 0-15 (decimal)  0-9, a-f (hex)

42 Instruction Formats Instructions are represented as 32-bit numbers (one word), broken into 6 fields R-type instruction add $t0, $s1, $s bits 5 bits 5 bits 5 bits 5 bits 6 bits op rs rt rd shamt funct opcode source source dest shift amt function I-type instruction lw $t0, 32($s3) 6 bits 5 bits 5 bits 16 bits opcode rs rt constant

43 Logical Operations Logical ops C operators Java operators MIPS instr Shift Left << << sll Shift Right >> >>> srl Bit-by-bit AND & & and, andi Bit-by-bit OR | | or, ori Bit-by-bit NOT ~ ~ nor

44 Control Instructions Conditional branch: Jump to instruction L1 if register1 equals register2: beq register1, register2, L1 Similarly, bne and slt (set-on-less-than) Unconditional branch: j L1 jr $s0 Convert to assembly: if (i == j) f = g+h; else f = g-h;

45 Control Instructions Conditional branch: Jump to instruction L1 if register1 equals register2: beq register1, register2, L1 Similarly, bne and slt (set-on-less-than) Unconditional branch: j L1 jr $s0 Convert to assembly: if (i == j) bne $s3, $s4, Else f = g+h; add $s0, $s1, $s2 else j Exit f = g-h; Else: sub $s0, $s1, $s2 Exit:

46 Example Convert to assembly: while (save[i] == k) i += 1; i and k are in $s3 and $s5 and base of array save[] is in $s6

47 Example Convert to assembly: while (save[i] == k) i += 1; i and k are in $s3 and $s5 and base of array save[] is in $s6 Loop: sll $t1, $s3, 2 add $t1, $t1, $s6 lw $t0, 0($t1) bne $t0, $s5, Exit addi $s3, $s3, 1 j Loop Exit:

48 Title Bullet

49 Lecture 4: Procedure Calls Today’s topics:  Procedure calls  Large constants  The compilation process Reminder: Assignment 1 is due on Thursday

50 Recap The jal instruction is used to jump to the procedure and save the current PC (+4) into the return address register Arguments are passed in $a0-$a3; return values in $v0-$v1 Since the callee may over-write the caller’s registers, relevant values may have to be copied into memory Each procedure may also require memory space for local variables – a stack is used to organize the memory needs for each procedure

51 The Stack The register scratchpad for a procedure seems volatile – it seems to disappear every time we switch procedures – a procedure’s values are therefore backed up in memory on a stack Proc A’s values Proc B’s values Proc C’s values … High address Low address Stack grows this way Proc A call Proc B … call Proc C … return

52 Example 1 int leaf_example (int g, int h, int i, int j) { int f ; f = (g + h) – (i + j); return f; }

53 Example 1 int leaf_example (int g, int h, int i, int j) { int f ; f = (g + h) – (i + j); return f; } leaf_example: addi $sp, $sp, -12 sw $t1, 8($sp) sw $t0, 4($sp) sw $s0, 0($sp) add $t0, $a0, $a1 add $t1, $a2, $a3 sub $s0, $t0, $t1 add $v0, $s0, $zero lw $s0, 0($sp) lw $t0, 4($sp) lw $t1, 8($sp) addi $sp, $sp, 12 jr $ra Notes: In this example, the procedure’s stack space was used for the caller’s variables, not the callee’s – the compiler decided that was better. The caller took care of saving its $ra and $a0-$a3.

54 Example 2 int fact (int n) { if (n < 1) return (1); else return (n * fact(n-1)); }

55 Example 2 int fact (int n) { if (n < 1) return (1); else return (n * fact(n-1)); } fact: addi $sp, $sp, -8 sw $ra, 4($sp) sw $a0, 0($sp) slti $t0, $a0, 1 beq $t0, $zero, L1 addi $v0, $zero, 1 addi $sp, $sp, 8 jr $ra L1: addi $a0, $a0, -1 jal fact lw $a0, 0($sp) lw $ra, 4($sp) addi $sp, $sp, 8 mul $v0, $a0, $v0 jr $ra Notes: The caller saves $a0 and $ra in its stack space. Temps are never saved.

56 Memory Organization The space allocated on stack by a procedure is termed the activation record (includes saved values and data local to the procedure) – frame pointer points to the start of the record and stack pointer points to the end – variable addresses are specified relative to $fp as $sp may change during the execution of the procedure $gp points to area in memory that saves global variables Dynamically allocated storage (with malloc()) is placed on the heap Stack Dynamic data (heap) Static data (globals) Text (instructions)

57 Dealing with Characters Instructions are also provided to deal with byte-sized and half-word quantities: lb (load-byte), sb, lh, sh These data types are most useful when dealing with characters, pixel values, etc. C employs ASCII formats to represent characters – each character is represented with 8 bits and a string ends in the null character (corresponding to the 8-bit number 0)

58 Example Convert to assembly: void strcpy (char x[], char y[]) { int i; i=0; while ((x[i] = y[i]) != `\0’) i += 1; }

59 Example Convert to assembly: void strcpy (char x[], char y[]) { int i; i=0; while ((x[i] = y[i]) != `\0’) i += 1; } strcpy: addi $sp, $sp, -4 sw $s0, 0($sp) add $s0, $zero, $zero L1: add $t1, $s0, $a1 lb $t2, 0($t1) add $t3, $s0, $a0 sb $t2, 0($t3) beq $t2, $zero, L2 addi $s0, $s0, 1 j L1 L2: lw $s0, 0($sp) addi $sp, $sp, 4 jr $ra

60 Large Constants Immediate instructions can only specify 16-bit constants The lui instruction is used to store a 16-bit constant into the upper 16 bits of a register… thus, two immediate instructions are used to specify a 32-bit constant The destination PC-address in a conditional branch is specified as a 16-bit constant, relative to the current PC A jump (j) instruction can specify a 26-bit constant; if more bits are required, the jump-register (jr) instruction is used

61 Starting a Program C Program Assembly language program Object: machine language moduleObject: library routine (machine language) Executable: machine language program Memory Compiler Assembler Linker Loader x.c x.s x.o x.a, x.so a.out

62 Role of Assembler Convert pseudo-instructions into actual hardware instructions – pseudo-instrs make it easier to program in assembly – examples: “move”, “blt”, 32-bit immediate operands, etc. Convert assembly instrs into machine instrs – a separate object file (x.o) is created for each C file (x.c) – compute the actual values for instruction labels – maintain info on external references and debugging information

63 Role of Linker Stitches different object files into a single executable  patch internal and external references  determine addresses of data and instruction labels  organize code and data modules in memory Some libraries (DLLs) are dynamically linked – the executable points to dummy routines – these dummy routines call the dynamic linker-loader so they can update the executable to jump to the correct routine

64 Full Example – Sort in C Allocate registers to program variables Produce code for the program body Preserve registers across procedure invocations void sort (int v[], int n) { int i, j; for (i=0; i<n; i+=1) { for (j=i-1; j>=0 && v[j] > v[j+1]; j-=1) { swap (v,j); } void swap (int v[], int k) { int temp; temp = v[k]; v[k] = v[k+1]; v[k+1] = temp; }

65 The swap Procedure Register allocation: $a0 and $a1 for the two arguments, $t0 for the temp variable – no need for saves and restores as we’re not using $s0-$s7 and this is a leaf procedure (won’t need to re-use $a0 and $a1) swap: sll $t1, $a1, 2 add $t1, $a0, $t1 lw $t0, 0($t1) lw $t2, 4($t1) sw $t2, 0($t1) sw $t0, 4($t1) jr $ra

66 The sort Procedure Register allocation: arguments v and n use $a0 and $a1, i and j use $s0 and $s1; must save $a0 and $a1 before calling the leaf procedure The outer for loop looks like this: (note the use of pseudo-instrs) move $s0, $zero # initialize the loop loopbody1: bge $s0, $a1, exit1 # will eventually use slt and beq … body of inner loop … addi $s0, $s0, 1 j loopbody1 exit1: for (i=0; i<n; i+=1) { for (j=i-1; j>=0 && v[j] > v[j+1]; j-=1) { swap (v,j); }

67 The sort Procedure The inner for loop looks like this: addi $s1, $s0, -1 # initialize the loop loopbody2: blt $s1, $zero, exit2 # will eventually use slt and beq sll $t1, $s1, 2 add $t2, $a0, $t1 lw $t3, 0($t2) lw $t4, 4($t2) bgt $t3, $t4, exit2 … body of inner loop … addi $s1, $s1, -1 j loopbody2 exit2: for (i=0; i<n; i+=1) { for (j=i-1; j>=0 && v[j] > v[j+1]; j-=1) { swap (v,j); }

68 Saves and Restores Since we repeatedly call “swap” with $a0 and $a1, we begin “sort” by copying its arguments into $s2 and $s3 – must update the rest of the code in “sort” to use $s2 and $s3 instead of $a0 and $a1 Must save $ra at the start of “sort” because it will get over-written when we call “swap” Must also save $s0-$s3 so we don’t overwrite something that belongs to the procedure that called “sort”

69 Saves and Restores sort: addi $sp, $sp, -20 sw $ra, 16($sp) sw $s3, 12($sp) sw $s2, 8($sp) sw $s1, 4($sp) sw $s0, 0($sp) move $s2, $a0 move $s3, $a1 … move $a0, $s2 # the inner loop body starts here move $a1, $s1 jal swap … exit1: lw $s0, 0($sp) … addi $sp, $sp, 20 jr $ra 9 lines of C code  35 lines of assembly

70 Relative Performance Gcc optimization Relative Cycles Instruction CPI performance count none B 115B 1.38 O B 37B 1.79 O B 40B 1.66 O B 45B 1.46 A Java interpreter has relative performance of 0.12, while the Jave just-in-time compiler has relative performance of 2.13 Note that the quicksort algorithm is about three orders of magnitude faster than the bubble sort algorithm (for 100K elements)

71 Title Bullet