CSC 3210 Computer Organization and Programming

1 CSC 3210 Computer Organization and Programming
Chapter 2 SPARC Architecture Dr. Anu Bourgeois

2 Introduction
SPARC is a load/store architecture
Registers are used for all arithmetic and logical operations
32 registers are available at a time
Only load and store instructions access memory

3 Registers
Registers are accessed directly for rapid computation
32 registers, divided into 4 sets:
-- Global: %g0 - %g7
-- Out: %o0 - %o7
-- In: %i0 - %i7
-- Local: %l0 - %l7
%g0 always returns 0
%o6, %o7, %i6, %i7: do not use
Register size = 32 bits each
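As a rough illustration of how the four sets share one 5-bit register number, the C sketch below assumes the usual SPARC numbering (globals 0-7, outs 8-15, locals 16-23, ins 24-31). The function names are made up for this example and are not part of any toolchain.

```c
/* Register set letter for register number r (0..31), assuming the usual
   SPARC numbering: globals 0-7, outs 8-15, locals 16-23, ins 24-31.
   Returns '?' when r is out of range. */
char reg_set_letter(int r)
{
    static const char sets[4] = { 'g', 'o', 'l', 'i' };
    return (r >= 0 && r <= 31) ? sets[r / 8] : '?';
}

/* Index of register r within its set (0..7), or -1 when out of range. */
int reg_set_index(int r)
{
    return (r >= 0 && r <= 31) ? r % 8 : -1;
}
```

Under that assumption, register 14 is reported as set 'o', index 6, i.e. %o6.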

4 Table of Registers

5 SPARC Assembler
The SPARC assembler, as, is a 2-pass assembler
First pass:
-- Updates the location counter without paying attention to undefined labels used as operands
-- Defines each label symbol as the current value of the location counter
Second pass:
-- Values are substituted in for labels
-- Ignores labels followed by colons (label definitions)
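The two passes can be sketched in C as follows. This is a toy model, not the real as: it assumes every statement occupies 4 bytes, and the struct and function names are hypothetical. It shows why a forward reference to a label can be resolved only in the second pass.

```c
#include <string.h>

#define MAX_SYMS 16

/* One source statement: an optional label defined on the line and an
   optional label used as an operand (NULL when absent). */
struct stmt { const char *label; const char *operand; };

struct symtab { const char *name[MAX_SYMS]; int addr[MAX_SYMS]; int n; };

/* Address bound to name, or -1 when the symbol is undefined. */
static int lookup(const struct symtab *t, const char *name)
{
    for (int i = 0; i < t->n; i++)
        if (strcmp(t->name[i], name) == 0)
            return t->addr[i];
    return -1;
}

/* Pass 1: advance the location counter by 4 bytes per statement (an
   assumption of this sketch) and bind each label to the counter's value.
   Operands are not examined, so undefined labels do not matter yet. */
void pass1(const struct stmt *prog, int n, struct symtab *t)
{
    int loc = 0;
    t->n = 0;
    for (int i = 0; i < n; i++, loc += 4)
        if (prog[i].label && t->n < MAX_SYMS) {
            t->name[t->n] = prog[i].label;
            t->addr[t->n] = loc;
            t->n++;
        }
}

/* Pass 2: substitute an address for each operand label. out[i] is -1 when
   statement i has no label operand or the label is undefined. Label
   definitions themselves are skipped here. */
void pass2(const struct stmt *prog, int n, const struct symtab *t, int *out)
{
    for (int i = 0; i < n; i++)
        out[i] = prog[i].operand ? lookup(t, prog[i].operand) : -1;
}

/* Example: statement 0 refers forward to "next", defined at statement 2. */
int resolve_forward_reference(void)
{
    const struct stmt prog[] = {
        { "start", "next" }, { NULL, NULL }, { "next", NULL }
    };
    struct symtab t;
    int out[3];
    pass1(prog, 3, &t);
    pass2(prog, 3, &t, out);
    return out[0];
}
```

In the example, "next" is the third statement, so the first pass binds it to location 8 and the second pass substitutes 8 for the operand of statement 0.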

6 Assembly Language Programs
Programs are line based
Use mnemonics, which generate machine code upon assembling
Statements may be labeled
Comments: ! or /* … */
    /* instructions to add and to subtract
       the contents of %o0 and %o1 */
    start:  add %o0, %o1, %l0   ! l0 = o0 + o1
            sub %o0, %o1, %l1   ! l1 = o0 - o1

7 Pseudo-ops
Statements that do not generate machine code
e.g. data definitions, statements that provide the assembler with information
Generally start with a period
    a: .word 3
Can be labeled
    .global main
    main:

8 Compiling Code: 2-step process
The C compiler calls as and produces the object files
Object files are the machine code
It next calls the linker to combine the .o files with library routines to produce the executable program, a.out

9 Compiling a C program
    %gcc -S program.c
produces the .s assembly language file
    %gcc expr.s -o expr
assembles the program and produces the executable file
NOTE: You will only do this for the 1st assignment

10 Start of Execution
The C compiler expects execution to start at an address labeled main
The label must be at the first statement to execute and must be declared global
    .global main
    main: save %sp, -96, %sp
The save instruction provides space to save registers for the debugger

11 Macros
If we have macros defined, then the program should be a .m file
We can expand the macros to produce a .s file by running m4 first
    % m4 expr.m > expr.s
    % gcc expr.s -o expr

12 SPARC Instructions
3 operands: 2 source operands and 1 destination operand
Source registers are unchanged
Result stored in destination register
Constants: -4096 ≤ c < 4096
    op reg_rs1, reg_rs2, reg_rd
    op reg_rs1, imm, reg_rd
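The limit on constants comes from the 13-bit signed immediate field of the instruction word. A minimal C check of whether a constant fits, with a function name chosen only for this example:

```c
#include <stdbool.h>

/* True when c fits a 13-bit signed immediate field: -4096 <= c < 4096. */
bool fits_simm13(long c)
{
    return c >= -4096 && c < 4096;
}
```

A constant outside this range cannot be written directly as the imm operand and has to be built in a register first.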

13 Sample Instructions
clr reg_rd
    Clears a register to zero
mov reg_or_imm, reg_rd
    Copies content of source to destination
add reg_rs1, reg_or_imm, reg_rd
    Adds oper1 + oper2 -> destination
sub reg_rs1, reg_or_imm, reg_rd
    Subtracts oper1 - oper2 -> destination

14 Multiply and Divide
No instruction available in SPARC
Use a function call instead
Must use %o0 and %o1 for sources; %o0 holds the result
a = b * c:
    mov b, %o0
    mov c, %o1
    call .mul
a = b ÷ c:
    mov b, %o0
    mov c, %o1
    call .div

15 Instruction Cycle
Instruction cycle broken into 4 stages:
Instruction fetch
-- Fetch & decode instruction, obtain any operands, update PC
Execute
-- Execute arithmetic instruction, compute branch target address, compute memory address
Memory access
-- Access memory for load or store instruction; fetch instruction at target of branch instruction
Store results
-- Write instruction results back to register file

16 Pipelining
SPARC is a RISC machine: want to complete one instruction per cycle
Overlap stages of different instructions to achieve parallel execution
Can obtain a speedup of up to a factor of 4
Hardware does not have to run 4 times faster: break the h/w into 4 parts that run concurrently

17 Pipelining
Sequential: each h/w stage is idle 75% of the time. time_ex = 4 * i
Parallel: each h/w stage is working once the pipeline is filled. time_ex = 3 + i
(i = number of instructions executed)
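The two timing formulas can be compared directly. The C sketch below hard-codes the 4-stage pipeline of these slides and ignores stalls; the function names are illustrative.

```c
/* Cycles to run n instructions when each one passes through all 4 stages
   before the next starts: 4 * n. */
long sequential_cycles(long n)
{
    return 4 * n;
}

/* Cycles to run n instructions when the stages overlap: 3 cycles to fill
   the pipeline, then one instruction completes per cycle. */
long pipelined_cycles(long n)
{
    return n > 0 ? 3 + n : 0;
}

/* Ratio of the two times for n > 0; it approaches 4 as n grows. */
double pipeline_speedup(long n)
{
    return (double)sequential_cycles(n) / (double)pipelined_cycles(n);
}
```

For 100 instructions this gives 400 cycles against 103, a speedup of about 3.88, which is why the factor of 4 is an upper bound rather than a guarantee.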

18 Data Dependencies – Load Delay Problem
    ld [%o0], %o1
    add %o1, %o2, %o2
The add needs %o1 before the load has brought the value in from memory

19 Branch Delay Problem
Branch target address is not available until after execution of the branch instruction
Insert a branch delay slot instruction

20 Branch delays
Try to place a useful instruction after the branch; a nop can also be used
The instruction following a branch instruction will always be fetched
Updating the PC determines which instruction to fetch next

21 Determine if branch taken
    cmp %l0, %l1
    bg next
    mov %l2, %l3
    sub %l3, 20, %l4
Condition true: branch to next
Condition false: continue to sub
Pipeline stages (F E M W), one instruction entering per cycle:
    cmp   F E M W
    bg      F E M W
    mov       F E M W
    ???         F E M W
Fetch: fetch instruction from memory[PC], obtain operands, update PC (PC++)
Execute: determine if branch taken; update if true (target -> PC)

22 Actual SPARC Code: expr.m

23 Expanding Macros
After running through m4:
    %m4 expr.m > expr.s
Produce executable:
    %gcc expr.s -o expr
Execute file:
    %./expr

24 The Debugger – gdb
Used to verify correctness and find bugs
Can also execute a program, stop execution at any point, and single-step execution
After assembling the program and placing the output into expr, launch gdb:
    %gdb expr
To run code in gdb, type “r”:
    (gdb) r

25 gdb Commands
Breakpoints can be set at any address to stop execution in order to check the status of the program and registers
To set a breakpoint at a label:
    (gdb) b main
    Breakpoint 1 at 0x106a8
    (gdb)
Typing “c” continues execution until it reaches the next breakpoint or the end of the code
Can print the contents of a register:
    (gdb) p $l1
    $2 = -8
Best way to learn is by practice

26 Filling Delay Slots
The call instruction is a delayed control transfer instruction: it changes the address from which future instructions will be fetched
The following instruction is called a delayed instruction, and is located in the delay slot
The delayed instruction is executed before the branch/call happens
Using a nop for the delay slot still wastes a cycle
Instead, we may be able to move the instruction prior to the branch instruction into the delay slot
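The order of execution around a delayed control transfer can be modelled with two program counters: the current one and the next one. The C sketch below is a toy model with hypothetical names that uses instruction indices instead of byte addresses. It shows the instruction in the delay slot running before the first instruction at the target.

```c
enum kind { OP, TRANSFER, HALT };

/* target is used only when kind == TRANSFER. */
struct insn { enum kind kind; int target; };

/* Run prog, writing the index of each executed instruction to trace.
   Returns the number of instructions executed (at most max). */
int run_delayed(const struct insn *prog, int n, int *trace, int max)
{
    int pc = 0, npc = 1, count = 0;
    while (pc >= 0 && pc < n && count < max) {
        const struct insn *in = &prog[pc];
        trace[count++] = pc;
        if (in->kind == HALT)
            break;
        int next = npc + 1;          /* default: keep going sequentially */
        if (in->kind == TRANSFER)
            next = in->target;       /* takes effect after the delay slot */
        pc = npc;                    /* the already-fetched instruction runs */
        npc = next;
    }
    return count;
}

/* Example program: index 1 transfers to index 4, index 2 is its delay slot.
   Returns the k-th executed index, or -1 if fewer were executed. */
int delay_slot_demo(int k)
{
    const struct insn prog[] = {
        { OP, 0 }, { TRANSFER, 4 }, { OP, 0 }, { HALT, 0 }, { OP, 0 }, { HALT, 0 }
    };
    int trace[8];
    int count = run_delayed(prog, 6, trace, 8);
    return (k >= 0 && k < count) ? trace[k] : -1;
}
```

In the example the executed order is 0, 1, 2, 4, 5: the instruction at index 2 runs even though the transfer at index 1 has already redirected fetching to index 4.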

27 Filling Delay Slots
Move sub instructions to the delay slots to eliminate nop instructions
    .global main
    main: save %sp, -96, %sp
          mov 9, %l0         ! initialize x
          sub %l0, 1, %o0    ! (x - 1) into %o0
          call .mul
          sub %l0, 7, %o1    ! (x - 7) into %o1
          call .div
          sub %l0, 11, %o1   ! (x - 11) into %o1, the divisor
          mov %o0, %l1       ! store it in y
          mov , %g1          ! trap dispatch
          ta                 ! trap to system

28 Filling Delay Slots
Executing the mov instruction, while fetching the sub instruction
    .global main
    main: save %sp, -96, %sp
          mov 9, %l0         ! initialize x                      <- EXECUTE
          sub %l0, 1, %o0    ! (x - 1) into %o0                  <- FETCH
          call .mul
          sub %l0, 7, %o1    ! (x - 7) into %o1
          call .div
          sub %l0, 11, %o1   ! (x - 11) into %o1, the divisor
          mov %o0, %l1       ! store it in y
          mov , %g1          ! trap dispatch
          ta                 ! trap to system

29 Filling Delay Slots
Now executing the sub instruction, while fetching the call instruction
    .global main
    main: save %sp, -96, %sp
          mov 9, %l0         ! initialize x
          sub %l0, 1, %o0    ! (x - 1) into %o0                  <- EXECUTE
          call .mul          !                                   <- FETCH
          sub %l0, 7, %o1    ! (x - 7) into %o1
          call .div
          sub %l0, 11, %o1   ! (x - 11) into %o1, the divisor
          mov %o0, %l1       ! store it in y
          mov , %g1          ! trap dispatch
          ta                 ! trap to system

30 Filling Delay Slots
Now executing the call instruction, while fetching the sub instruction
    .global main
    main: save %sp, -96, %sp
          mov 9, %l0         ! initialize x
          sub %l0, 1, %o0    ! (x - 1) into %o0
          call .mul          !                                   <- EXECUTE
          sub %l0, 7, %o1    ! (x - 7) into %o1                  <- FETCH
          call .div
          sub %l0, 11, %o1   ! (x - 11) into %o1, the divisor
          mov %o0, %l1       ! store it in y
          mov , %g1          ! trap dispatch
          ta                 ! trap to system
Execution of call will update the PC to fetch from the mul routine, but since sub was already fetched, it will be executed before any instruction from the mul routine

31 Filling Delay Slots
Now executing the sub instruction, while fetching from the mul routine
    .global main
    main: save %sp, -96, %sp
          mov 9, %l0         ! initialize x
          sub %l0, 1, %o0    ! (x - 1) into %o0
          call .mul
          sub %l0, 7, %o1    ! (x - 7) into %o1                  <- EXECUTE
          call .div
          sub %l0, 11, %o1   ! (x - 11) into %o1, the divisor
          mov %o0, %l1       ! store it in y
          mov , %g1          ! trap dispatch
          ta                 ! trap to system
          ……
    .mul: save               !                                   <- FETCH
          …..

32 Filling Delay Slots
Now executing the save instruction, while fetching the next instruction from the mul routine
    .global main
    main: save %sp, -96, %sp
          mov 9, %l0         ! initialize x
          sub %l0, 1, %o0    ! (x - 1) into %o0
          call .mul
          sub %l0, 7, %o1    ! (x - 7) into %o1
          call .div
          sub %l0, 11, %o1   ! (x - 11) into %o1, the divisor
          mov %o0, %l1       ! store it in y
          mov , %g1          ! trap dispatch
          ta                 ! trap to system
          ……
    .mul: save               !                                   <- EXECUTE
          …..                !                                   <- FETCH

33 Filling Delay Slots
While executing the last instruction of the mul routine, will come back to main and fetch the call .div instruction
    .global main
    main: save %sp, -96, %sp
          mov 9, %l0         ! initialize x
          sub %l0, 1, %o0    ! (x - 1) into %o0
          call .mul
          sub %l0, 7, %o1    ! (x - 7) into %o1
          call .div          !                                   <- FETCH
          sub %l0, 11, %o1   ! (x - 11) into %o1, the divisor
          mov %o0, %l1       ! store it in y
          mov , %g1          ! trap dispatch
          ta                 ! trap to system
          ……
    .mul: save
          …..                !                                   <- EXECUTE
At this point %o0 has the result from the multiply routine; this is the first operand for the divide routine
The subtract instruction will compute the 2nd operand before starting execution of the divide routine

34 2.9 Branching
Instructions for testing and branching:
2.9.1 Testing
The information about the state of execution of an instruction is saved in the following flags:
Z   zero       whether the result was zero
N   negative   whether the result was negative
V   overflow   whether the result was too large for the register
C   carry      whether the result generated a carry out
Special add and sub instructions: ‘cc’ is appended to the mnemonic, and the instruction sets condition codes Z, N, V, and C to save the state of execution. E.g.
    addcc reg_rs1, reg_or_imm, reg_rd
    subcc reg_rs1, reg_or_imm, reg_rd
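How the four flags follow from a 32-bit add or subtract can be sketched in C. The struct and function names below are invented for this example. The sketch assumes the usual definitions: V is signed overflow, and C after a subtract indicates a borrow.

```c
#include <stdbool.h>
#include <stdint.h>

struct icc { bool n, z, v, c; };

/* Condition codes after a 32-bit a + b. */
struct icc addcc_flags(uint32_t a, uint32_t b)
{
    uint32_t r = a + b;
    struct icc f;
    f.n = (r >> 31) & 1;                       /* sign bit of the result */
    f.z = (r == 0);
    f.v = ((~(a ^ b) & (a ^ r)) >> 31) & 1;    /* same-sign operands, result sign differs */
    f.c = r < a;                               /* carry out of bit 31 */
    return f;
}

/* Condition codes after a 32-bit a - b. */
struct icc subcc_flags(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;
    struct icc f;
    f.n = (r >> 31) & 1;
    f.z = (r == 0);
    f.v = (((a ^ b) & (a ^ r)) >> 31) & 1;     /* operand signs differ, result sign differs from a */
    f.c = a < b;                               /* borrow, treating operands as unsigned */
    return f;
}
```

For instance, adding 1 to 0x7fffffff sets N and V, while subtracting 5 from 5 sets only Z.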

35 2.9.2 Branches
Branch instructions are similar to call instructions: they specify the label of the destination instruction
These too are delayed control transfer instructions
Branch instructions test the condition codes in order to determine if the branching condition exists:
    b_icc label
where b_icc stands for one of the branches testing the integer condition codes

36 Table of signed number branches
Assembler Mnemonic   Unconditional Branches
ba                   Branch always, goto
bn                   Branch never
Assembler Mnemonic   Signed Arithmetic Branches
bl                   Branch on less than zero
ble                  Branch on less or equal to zero
be                   Branch on equal to zero
bne                  Branch on not equal to zero
bge                  Branch on greater or equal to zero
bg                   Branch on greater than zero
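Each mnemonic in the table corresponds to a test on the N, Z, and V flags. The C sketch below uses the standard signed-comparison conditions (less than is N xor V). The function name and the string-based interface are assumptions made for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Whether branch mnemonic m is taken for the given N, Z, V flags.
   Unknown mnemonics are treated as not taken. */
bool branch_taken(const char *m, bool n, bool z, bool v)
{
    bool lt = (n != v);                    /* signed less than: N xor V */
    if (strcmp(m, "ba") == 0)  return true;
    if (strcmp(m, "bn") == 0)  return false;
    if (strcmp(m, "bl") == 0)  return lt;
    if (strcmp(m, "ble") == 0) return z || lt;
    if (strcmp(m, "be") == 0)  return z;
    if (strcmp(m, "bne") == 0) return !z;
    if (strcmp(m, "bge") == 0) return !lt;
    if (strcmp(m, "bg") == 0)  return !(z || lt);
    return false;
}
```

After comparing 3 with 5 the result is negative with no overflow (N set, Z and V clear), so bl, ble, and bne are taken while bg, bge, and be are not.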

44 2.10 Control statements
2.10.1 While:
The condition of a while loop is evaluated before the loop body is executed; if the condition is not met, the loop, including its first instruction, is not executed.
Consider the following while loop in C:
    while (a <= 17) {
        a = a + b;
        c++;
    }
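Before writing assembly it can help to restate the loop with explicit branches. The C sketch below is only one possible layout, with the test at the top and a complemented condition; the struct and function names are hypothetical, and it assumes b > 0 so that the loop terminates.

```c
struct while_state { int a, c; };

/* while (a <= 17) { a = a + b; c++; } rewritten with explicit branches. */
struct while_state while_with_branches(int a, int b, int c)
{
    struct while_state s;
test:
    if (a > 17) goto done;    /* complemented condition, like cmp then bg done */
    a = a + b;
    c++;
    goto test;                /* like ba test */
done:
    s.a = a;
    s.c = c;
    return s;
}
```

Starting from a = 0, b = 5, c = 0 the body runs four times, leaving a = 20 and c = 4; starting from a = 18 the body is never entered.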

50 Annulled conditional branches
All conditional branches may be annulled. If annulled, the delay instruction is executed when the branch is taken, but not when the branch is not taken.
Note that the delay slot instruction is still fetched; only its execution is annulled, wasting a cycle.
The flow chart of the annulled conditional branches is given in the next slide.
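The rule for conditional branches reduces to one boolean expression. A small C sketch, with a function name chosen for this example only:

```c
#include <stdbool.h>

/* Whether the delay slot instruction of a conditional branch is executed.
   Without the annul bit it always executes; with the annul bit it
   executes only when the branch is taken. */
bool delay_slot_executes(bool annul, bool taken)
{
    return !annul || taken;
}
```

The only combination that skips the delay instruction is an annulled branch that is not taken.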

51 [Flow chart: annulled conditional branches]

52 Do Consider a Do loop:

54 For
For structure in C:
    for ( ex1; ex2; ex3 ) st
Express the above definition as:
    ex1;
    while ( ex2 ) {
        st
        ex3;
    }
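As a concrete instance of this rewriting, the two C functions below compute the same sum, one with a for loop and one with the equivalent while loop. The example loop and the function names are illustrative and not taken from the slides.

```c
/* for form: ex1 is i = 0, ex2 is i < n, ex3 is i++, st is sum += i. */
int sum_with_for(int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += i;
    return sum;
}

/* The same loop as ex1; while (ex2) { st ex3; }. */
int sum_with_while(int n)
{
    int sum = 0;
    int i = 0;              /* ex1 */
    while (i < n) {         /* ex2 */
        sum += i;           /* st  */
        i++;                /* ex3 */
    }
    return sum;
}
```

The rewriting holds as long as st contains no continue statement: in the for form a continue still runs ex3, whereas in the while form it would skip it.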

55 Thus the translation of for (a=1; a<= b; a++) c *= a; would be:

56 If Then
The statement following the relational expression is to be branched over if the condition is not true. To accomplish this, we logically complement the sense of the branch that follows the evaluation of the relational expression and precedes the code for the statement.
Table of complements of the branches
Condition   Complement
bl          bge
ble         bg
be          bne
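The complement table can be read in both directions, since each pair is symmetric. A C lookup sketch, with a hypothetical function name:

```c
#include <stddef.h>
#include <string.h>

/* Complement of a signed branch mnemonic, or NULL when m is not one of
   the six mnemonics in the table. Each pair is looked up both ways. */
const char *complement_branch(const char *m)
{
    static const char *pairs[3][2] = {
        { "bl", "bge" }, { "ble", "bg" }, { "be", "bne" }
    };
    for (int i = 0; i < 3; i++) {
        if (strcmp(m, pairs[i][0]) == 0) return pairs[i][1];
        if (strcmp(m, pairs[i][1]) == 0) return pairs[i][0];
    }
    return NULL;
}
```

So a source condition of a > b, which would be tested with bg, is coded with ble to branch over the statement when the condition is false.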

57 For example, to translate

59 If Else
An if-else statement allows us to do better with regard to filling the delay slot. Consider:
    if ((a + b) >= c) {
        a += b;
        c++;
    } else {
        a -= b;
        c--;
    }
    c += 10;

60 We complement the initial test so that, if the condition is false, we branch over the then code to the else code.
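The branch structure this describes can be sketched in C with goto statements, using the if-else statement a += b; c++; versus a -= b; c--; followed by c += 10. The struct and function names are invented for the example, and delay slots are not modelled.

```c
struct if_else_state { int a, c; };

/* if ((a + b) >= c) { a += b; c++; } else { a -= b; c--; } c += 10;
   rewritten with a complemented test and explicit branches. */
struct if_else_state if_else_with_branches(int a, int b, int c)
{
    struct if_else_state s;
    if (a + b < c) goto else_part;   /* complement of >=, like bl else_part */
    a += b;                          /* then code */
    c++;
    goto next;                       /* like ba next: skip the else code */
else_part:
    a -= b;                          /* else code */
    c--;
next:
    c += 10;
    s.a = a;
    s.c = c;
    return s;
}
```

With a = 1, b = 2, c = 3 the then code runs (a becomes 3, c becomes 14); with a = 1, b = 1, c = 5 the else code runs (a becomes 0, c becomes 14).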

63 Annulled unconditional branch

64 Summary of gdb commands

