1
Chapter 3 x86 Assembly Language 2
ECE 485/585 Microprocessors Chapter 3 x86 Assembly Language 2 Herbert G. Mayer, PSU Status 10/27/2016
2
Syllabus
Motivation
Integer Multiply
Integer Divide
Conditional Branch
Loop Constructs
Memory Access
Call and Return
Procedure PutDec
Summary
Appendix
Bibliography
3
Motivation
In another handout about x86 assembly language we cover modules, character and string output, and writing assembler procedures.
Here we cover integer arithmetic including divide and loops, and develop a more complex program to output signed integer numbers.
Since integer multiplication can generate results that are twice as long in bits as either source operand, the machine instructions for integer multiply (and, conversely, for integer divide) must make special provisions for the length of operands.
Note: multiplication generates a double-precision result! Sometimes a divide follows, cutting double precision back to single precision.
4
X86 Integer Multiply and Divide
5
Integer Multiply
Our first project is 16-bit signed integer multiplication.
To track all minute detail of the result, including overflow, sign of the result, etc., we use the small x86 machine model with 16-bit operands, in Intel parlance known as words.
In that model, the smallest negative integer is -32768, the largest is 32767.
The same principles apply to the newer model with 64-bit precision integers.
6
Integer Multiply
Under Microsoft’s assembler, the opcodes are mul for unsigned and imul for signed integer multiplication.
One operand is the ax register; it is always implied.
The second operand may be a memory location or a register.
A literal operand is not permitted in the small mode, but is OK on a 32-bit μP.
The result (product) is in the register pair ax and dx.
There also exists a byte version of the multiply, in which case the implied operand is in al, the other operand is a byte memory location or a byte register, and the result (product) is in ax.
In the code sample below we multiply literal 10, moved into register bx, with the contents of the other, implied operand, which is always register ax.
7
Integer Multiply
; integer multiplication on x86:
; multiply literal 10 with contents of ax
; ax holds a copy of memory location MAX
        mov     bx, 10          ; a literal is in bx
        mov     ax, MAX         ; signed word at location MAX
        imul    bx              ; note that 1 operand is implied
                                ; 32-bit product is in ax + dx
                                ; hi order bits in dx
. . .
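The double-length product can be modeled in C. The following is a minimal sketch, not part of the original handout: the names `RegPair`, `imul16`, and `fits_in_word` are hypothetical, and the sketch assumes the usual two's complement representation.

```c
#include <stdint.h>

/* Register pair dx:ax after a 16-bit signed multiply (hypothetical helper type). */
typedef struct {
    uint16_t ax;   /* low-order 16 bits of the product  */
    uint16_t dx;   /* high-order 16 bits of the product */
} RegPair;

/* Model of "imul bx": signed ax times signed bx, 32-bit product split into dx:ax. */
RegPair imul16(int16_t ax, int16_t bx)
{
    int32_t  product = (int32_t)ax * (int32_t)bx;  /* always fits in 32 bits */
    uint32_t bits    = (uint32_t)product;          /* two's complement bit pattern */
    RegPair  r;
    r.ax = (uint16_t)(bits & 0xFFFFu);
    r.dx = (uint16_t)(bits >> 16);
    return r;
}

/* Nonzero when the product fits in ax alone, i.e. dx only repeats the sign of ax. */
int fits_in_word(RegPair r)
{
    uint16_t sign_fill = (r.ax & 0x8000u) ? 0xFFFFu : 0x0000u;
    return r.dx == sign_fill;
}
```

For example, `imul16(32767, 10)` yields dx = 0004h and ax = 0FFF6h (the product 327670), so `fits_in_word` reports that a single word cannot hold the result.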
8
Integer Divide: cwd
Just as integer multiply creates a signed integer double-word result in register pair ax and dx, the integer divide instruction assumes the numerator to be in the pair ax and dx.
If that numerator is a single-precision operand, it must be explicitly extended.
The denominator may be in a register or in memory.
To create a sign-extended double-register operand in the ax-dx pair from single-precision ax, x86 provides a special convert-word-to-double instruction, cwd.
cwd has no explicit operand; both are implied!
The assumed source operand is the value in ax; ax is unchanged.
The assumed destination is the register pair ax-dx, i.e. the sign of ax is extended into dx.
9
Integer Divide: cwd
; memory location B_word holds one operand: numerator
; numerator is copied i.e. moved into register ax
; but first convert single to double precision
        mov     ax, B_word      ; signed word at B_word in ax
        cwd                     ; convert word to double-word
                                ; sign of ax extended into dx
; ditto with byte-sized operands
        mov     al, a_byte      ; signed byte a_byte into al
        cbw                     ; convert byte to word
; now the numerator can be used as operand in divide
. . .
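The effect of cwd and cbw on the registers can be sketched in C. This is an illustration only: the function names `cwd_dx`, `cbw_ax`, and `pair_value` are made up for this sketch, and converting the bit pattern back to a signed value assumes two's complement, as on x86.

```c
#include <stdint.h>

/* Model of cwd: returns the value placed in dx for a given ax; ax is unchanged. */
uint16_t cwd_dx(uint16_t ax)
{
    return (ax & 0x8000u) ? 0xFFFFu : 0x0000u;   /* copy the sign bit 16 times */
}

/* Model of cbw: returns the new ax for a given al (sign of al fills ah). */
uint16_t cbw_ax(uint8_t al)
{
    uint16_t ah = (al & 0x80u) ? 0xFF00u : 0x0000u;
    return (uint16_t)(ah | al);
}

/* Reassemble dx:ax into the signed 32-bit value the pair represents. */
int32_t pair_value(uint16_t dx, uint16_t ax)
{
    uint32_t bits = ((uint32_t)dx << 16) | ax;
    return (int32_t)bits;   /* assumes two's complement conversion */
}
```

With ax = 0FFF6h (the word -10), `cwd_dx` gives 0FFFFh, and the pair still represents -10 as a double word.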
10
Integer Divide
Integer divide needs 2 operands.
The numerator is in ax and is extended into dx to create a double word.
The other operand may be a memory location or a register.
Opcode div is for unsigned divide, and idiv is for signed integer division.
In the example, assume the numerator to be in memory location A_wd.
The denominator is at memory location B_wd.
The quotient ends up in ax, and the remainder in dx.
11
Integer Divide
; signed integer divide on x86:
; assume operands to be in locations A_wd and B_wd
;
        mov     ax, A_wd        ; signed word at A_wd in ax
        cwd                     ; sign of A_wd 16 times in dx
        idiv    B_wd            ; quotient A_wd/B_wd in ax
                                ; remainder A_wd/B_wd in dx
                                ; flags are undefined after idiv:
                                ; test ax to see: negative?
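A C sketch of the same divide follows. It relies on the fact that C99 integer division truncates toward zero and gives the remainder the sign of the dividend, which matches idiv. The names `DivResult` and `idiv16` are hypothetical; the two asserted preconditions stand in for the cases in which the hardware raises a divide error.

```c
#include <assert.h>
#include <stdint.h>

/* Quotient (ax) and remainder (dx) after a signed 16-bit divide. */
typedef struct {
    int16_t quotient;    /* ends up in ax */
    int16_t remainder;   /* ends up in dx */
} DivResult;

/* Model of "cwd" followed by "idiv": sign-extend the numerator, then divide. */
DivResult idiv16(int16_t numerator, int16_t denominator)
{
    DivResult r;
    int32_t wide = numerator;                 /* plays the role of cwd */

    assert(denominator != 0);                 /* hardware: divide error */
    assert(!(numerator == INT16_MIN && denominator == -1)); /* quotient would not fit in ax */

    r.quotient  = (int16_t)(wide / denominator);   /* truncates toward zero */
    r.remainder = (int16_t)(wide % denominator);   /* sign follows the numerator */
    return r;
}
```

For instance, `idiv16(-7, 2)` gives quotient -3 and remainder -1, not -4 and 1.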
12
Memory Access On the x86 Microprocessor
13
Memory Access
Key components of any computer are processor and memory.
Memory is referenced implicitly (on CISC architectures) and explicitly, by instructions that read data from and write data to memory.
Explicit accesses are called loads (for reading) and stores (for writing).
Assemblers provide explicit instructions for these operations: mov on x86.
Implicit memory accesses occur in machine instructions whose operands may be memory cells.
On RISC systems these implicit references generally do not exist; instead, all memory traffic is funneled exclusively through loads and stores.
14
Memory Access
In an assembler program, memory locations (both for data and code) are generally referred to symbolically.
This improves readability and allows for relocation, i.e. the linker and loader have a certain degree of freedom of placement in physical memory.
However, explicit memory addressing via hard-coded numbers is possible.
On a hypothetical machine, ld r1, 1000 could mean: load the word in memory location 1000 into register r1.
Some assembly languages provide syntax to render indirection explicit; for example, the load operation ld r1, (1000) uses parentheses to allude to this indirection.
15
Memory Access
A common paradigm is to reference symbolic memory names (labels) indirectly.
What is wanted is not the label’s memory address, but the contents of memory at that label’s location.
For example, if the offset of datum foo is 10,000, then the operation ld r2, foo does not mean: load literal 10,000 into register r2.
Instead, foo is used indirectly: the word at that address is referenced and loaded into register r2.
When the address is really wanted, IBM 370 for example uses a special load address, while the masm assembler for the x86 architecture uses the seg or offset operator to express that indirection is not wanted.
Instead, the segment register portion of the address, or the offset portion of the address, is wanted.
16
Memory Access
During indirect memory references it is sometimes desirable to index.
Indexing modifies an otherwise fixed memory address. Typically, the modifier resides in a register.
If the value in that register is modified from iteration to iteration, the indexing operation can access memory in some sequential order, say in increasing (or decreasing) fashion.
The fixed step between such sequential memory addresses is known as the stride.
For example, if register r2 holds value 4, then the instruction ld r1, foo[r2] means: fetch the word located in memory 4 bytes further than offset foo, and load that word into register r1.
17
Memory Access
In addition to indexing through a register, many architectures (and thus assemblers) allow the offset to be modified by an additional literal index.
The literal value is encoded into the instruction and is referred to as an immediate operand.
Immediate values are usually small, since the architecture often provides just a few bits to hold them.
On some architectures this immediate operand may be signed; on others only unsigned literal modifiers are possible.
18
Memory Access
Memory holds the data to be manipulated.
Intermediate results must also be stored in memory.
Registers usually are in short supply, contrasted with the amount of addressable memory.
During computation, data must be brought from peripherals to memory, then from memory to registers.
After computation, data are moved from registers to memory, then to peripherals, e.g. printers, discs, etc.
Access to memory is 1 to 2 orders of magnitude slower than access to a register!
Often a cache helps overcome the speed bottleneck of memory accesses.
19
Memory Access Indexing on x86 with masm
Indirect memory references are the default semantics in the masm assembler for the x86 architecture.
On nasm and masm, indirection can be expressed explicitly via the [ ] operator pair.
The move instruction mov below is really a load:
data_seg1 segment
foo     dw      , 0, . . .
data_seg1 ends
. . .
        mov     ax, foo         ; indirection implied in masm
        mov     ax, [foo]       ; explicit indirection in masm
20
Memory Access Indexing on x86 with nasm
The above mov code loads the word at data segment location foo into register ax, regardless of whether [] is used.
In the nasm assembler the instruction
        mov     ax, foo         ; load offset of address in nasm!!!
loads the address of the memory location into register ax, while the nasm instruction
        mov     ax, [foo]       ; loads contents at address in nasm
loads the contents of memory location foo into ax.
Assembler differences can be very subtle, and very dangerous!!
21
Memory Access Indexing on x86
A handy programming tool that makes indexing convenient is the modification of address labels by registers, literals, or a combination.
Clearly, the underlying computer architecture must support this, i.e. there must be instructions in place that allow indexed load and store operations.
Some architectures (including IBM 360 and x86) allow multiple registers to modify (to index) an address label.
These registers are referred to as base registers and index registers.
Base register means that some base address sits in that register.
22
Memory Access Indexing on x86
However, in the x86 architecture, as long as an address expression includes a data memory label, that label is the base address, with the following provisos.
If l1 and l2 are address labels, and c1 and c2 numeric literal constants, then:
l1 + c1         ; is address of location l1 plus c1
l1 - c2         ; is address of location l1 minus c2
l1 - l2         ; is a pure, raw numeric value: l1 - l2
[l2 + c1]       ; is the memory contents at l2 + c1
[l1 - c2]       ; is the memory contents at l1 - c2
l1 + l2         ; is illegal on x86
; difference of labels [l1 - l2] ok: base address
; is canceled out. Sum [l1 + l2] not meaningful: base
; address would be included twice!
23
Memory Access Indexing on x86
On orthogonal architectures, user-visible registers should be usable both for indexing and as regular operands.
Practical limitations often force a compromise: e.g. on the x86 architecture, only certain registers can be used for indexing, listed below:
address expression + one of ( bx, bp, si, and di ) on x86
An address expression, indexed via suitable index registers, may also contain a literal modifier, even both, making indexing practical and easy for array accesses.
Note: it is possible to use up to 2 index and base registers in a single address expression, but only with the following restriction:
address expression + two of ( ( bx or bp ) and ( si or di ) ) on x86
24
Memory Access Indexing on x86
An address expression such as [min_data+bx+si+2] is allowed, while [min_data+bx+bx] is not permitted due to multiple uses of the bx register.
The samples here assume min_data to be a data label in the data segment.
Complex arithmetic expressions with all typical arithmetic operators are allowed in x86 assemblers, as long as the resulting value is reducible to a single numeric value at assembly time.
Thus, an expression like [chars+bx+si+2*3+4] is legal, provided that chars is a defined label.
25
Memory Access Implicit Segment Register
Data declared in the data (.data) segment (p. 27 below) are the hexadecimal digits ‘0’ .. ‘f’.
User-defined macro Put_Char (ibid.) uses system service call 021h for single-character output.
See the sample use of bx as index register (p. 28).
Only base and index registers can be used for this purpose; cx, for example, cannot.
Memory operands (data labels) are used indirectly.
Indirection is expressed explicitly via the [ and ] operators.
This is not necessary for memory operands in Microsoft SW, where indirection is assumed!
Since it is needed in nasm and on Unix systems, we recommend always using the [ ] operator for portability!
26
Memory Access Implicit Segment Register
Using explicit brackets to allude to indirection, such as [chars], also improves readability.
Indirect offset and index register are both allowed.
Either, both, or neither may be modified by an immediate operand.
Immediate operands are limited to 16 bits in size.
The order of offset and index is arbitrary.
The output of the program below is: you figure it out, students, at least the first few characters.
27
Memory Access
; Purpose: memory references, indexing
start   macro                   ; no parameters
        mov     ax, @data       ; predefined macro
        mov     ds, ax          ; now data segment reg set
        endm                    ; end macro: start
termin  macro   ret_code        ; 1 parameter: return code
        mov     ah, Term_Code   ; we wanna terminate, ah + al
        mov     al, ret_code    ; any errors? If /= 0
        int     21h             ; call DOS for help
        endm                    ; end macro: termin
Put_Char macro  char            ; output char passed as parameter
        mov     ah, Cout_Code   ; tell DOS: Char out
        mov     dl, char        ; char into required byte reg dl
        int     21h             ; and call DOS 21 to print char
        endm                    ; end macro Put_Char
Cout_Code = 2h
Term_Code = 4ch
        .model  small
        .data
chars   db      "0123456789abcdef"
28
Memory Access
        .code
main:   start
        mov     bx, 2               ; index char '2' in chars
        mov     cl, 'h'
        Put_Char cl                 ; o.k. since cl holds char
        Put_Char 'm'
        Put_Char chars              ; not good programming
        Put_Char chars[bx]          ; shows partial indirection
        Put_Char [chars]            ; explicit
        Put_Char [chars+1]          ; explicit
        Put_Char [chars+bx]
        Put_Char chars[bx+2]
        Put_Char [chars+bx+3]
        Put_Char [bx][chars]
        Put_Char [chars]+[bx]
        Put_Char [bx+4][chars]
        Put_Char [bx+3][chars+2]
done:   termin  0                   ; no errors if we reach done
29
Memory Access Explicit Segment Register, Answer
Data segment character string: “0123456789abcdef”
And the answer, i.e. the output, is below:
hm02012452267
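The answer can be cross-checked with a small C sketch that evaluates each address expression as an array index. It assumes the data string holds the sixteen hex digits ‘0’..‘f’ and that bx is 2, as in the program; the function name `indexing_demo` is made up for this illustration.

```c
#include <stddef.h>

/* Stand-in for the data segment string at label chars. */
static const char chars[] = "0123456789abcdef";

/* Collects the characters the Put_Char sequence prints.
   out must have room for 13 characters plus the terminating NUL. */
void indexing_demo(char out[14])
{
    size_t bx = 2;                 /* mov bx, 2 */
    size_t n  = 0;
    char   cl = 'h';               /* mov cl, 'h' */

    out[n++] = cl;                 /* Put_Char cl               */
    out[n++] = 'm';                /* Put_Char 'm'              */
    out[n++] = chars[0];           /* Put_Char chars            */
    out[n++] = chars[bx];          /* Put_Char chars[bx]        */
    out[n++] = chars[0];           /* Put_Char [chars]          */
    out[n++] = chars[1];           /* Put_Char [chars+1]        */
    out[n++] = chars[bx];          /* Put_Char [chars+bx]       */
    out[n++] = chars[bx + 2];      /* Put_Char chars[bx+2]      */
    out[n++] = chars[bx + 3];      /* Put_Char [chars+bx+3]     */
    out[n++] = chars[bx];          /* Put_Char [bx][chars]      */
    out[n++] = chars[bx];          /* Put_Char [chars]+[bx]     */
    out[n++] = chars[bx + 4];      /* Put_Char [bx+4][chars]    */
    out[n++] = chars[bx + 3 + 2];  /* Put_Char [bx+3][chars+2]  */
    out[n]   = '\0';
}
```

All bracketed parts of an operand simply add up to one offset, which is why the order of label, register, and literal does not matter.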
30
Memory Access Units of Words
31
Word Access
Goal: to reference memory as words.
Output these integers as decimal numbers.
Use the PutDec() assembler procedure, still to be designed, to print decimal numbers.
Macros start and termin are as before.
Use register bx again as index register.
The data segment defines some decimal and some hex literals.
Data label nums defines an array of integer words.
32
Word Access
Observe that modifications to the index register are done in steps of 2.
The stride of a word is 2 on x86!
Note that words initialized via hex literals are still printed as signed integers.
The intended output is noted in the comments of the code on the following slides.
33
Word Access
; Purpose: word memory references, indexing
start   macro                   ; no parameters
        mov     ax, @data       ; predefined macro
        mov     ds, ax          ; now data segment reg set
        endm                    ; end macro: start
termin  macro   ret_code        ; 1 parameter: return code
        mov     ah, Term_Code   ; terminate, ah + al
        mov     al, ret_code    ; any errors? If /= 0
        int     21h             ; call DOS for help
        endm                    ; end macro: termin
Term_Code = 4ch
        .model  small
        .data
nums    dw      511, 512, 513, 1023, 1024, 1025
w1      dw      0deadh
w2      dw      0beefh
w3      dw      0c0edh
w4      dw      0babeh
34
Word Access, Cont’d
        .code
        extrn   PutDec : near
main:   start
        mov     bx, 2               ; use bx as index register
        mov     ax, nums
        call    PutDec              ; output is: 511
        mov     ax, [nums + 2]
        call    PutDec              ; output is: 512
        mov     ax, [nums + bx]
        mov     ax, [nums][bx + 2]
        call    PutDec              ; output is: 513
        mov     ax, [nums+2][bx+6]
        call    PutDec              ; output is: 1025
35
Word Access, Cont’d
        mov     nums, 0deadh
        mov     ax, nums
        call    PutDec              ; output is: -8531
        mov     ax, w1
        mov     ax, [w2+bx+2]
        call    PutDec              ; output is: -17730
        mov     ax, [w1+6]
done:   termin  0                   ; no errors if we reach
        end     main                ; start here!
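The byte offsets and the signed reading of hex words can be checked with a C sketch. It assumes that the assembler lays out nums and w1..w4 contiguously in declaration order; the names `data_seg`, `NUMS`, `W1`, `W2`, and `word_at` are hypothetical, and the conversion to a signed word assumes two's complement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Image of the data segment: nums (6 words) followed by w1, w2, w3, w4. */
static const uint16_t data_seg[] = {
    511, 512, 513, 1023, 1024, 1025,     /* nums            */
    0xdead, 0xbeef, 0xc0ed, 0xbabe       /* w1, w2, w3, w4  */
};

/* Byte offsets of the labels within the segment (stride of a word is 2). */
enum { NUMS = 0, W1 = 12, W2 = 14 };

/* Fetch the word at a byte offset and read it as a signed 16-bit integer. */
int16_t word_at(size_t byte_offset)
{
    assert(byte_offset % 2 == 0);                    /* sketch handles aligned words only */
    assert(byte_offset / 2 < sizeof data_seg / sizeof data_seg[0]);
    return (int16_t)data_seg[byte_offset / 2];       /* assumes two's complement */
}
```

With bx = 2, the operand [nums+2][bx+6] is byte offset 10, the sixth word, and [w2+bx+2] is byte offset 18, which is w4.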
36
Loop Constructs
37
Comparison
By default, a machine executes one instruction after another, in sequence.
That sequence can be changed via branches.
Branches are also known as jumps, depending on the manufacturer.
Unconditional branches transfer control to their defined destination.
Conditional branches make this change in control flow only if their associated condition holds!
They are needed to construct loops.
38
Comparison
How does the microprocessor “know” when or whether a condition is true?
The CPU has flags that specify this condition, and instructions that test for the condition.
Typical conditions are zero, negative, positive, overflow, carry, etc.
Symbolic flags are CF, ZF, OF.
These can be used as operands in conditional branches, conditional calls, etc.
39
Conditional Branch
-- a sample C source program snippet
if ( a > b ) {
    max = a;
}else{
    max = b;
} //end if;
; corresponding x86 assembler snippet:
; larger of a, b, into memory loc “max”, better: [max]
        mov     ax, [a]         ; contents mem. location a
        cmp     ax, [b]         ; mem. location b
        jle     b_is_max        ; jump to b_is_max if a <= b
        mov     [max], ax
        jmp     end_if          ; jump around else
b_is_max:                       ; this is else
        mov     ax, [b]
        mov     [max], ax       ; store b as the maximum
end_if:
. . .
40
Loops
Operations are performed repeatedly via loops.
Higher-level languages use defined, structured loops.
Loops include Repeat, While, and For Statements.
We introduce the x86 loop instruction.
Generally a loop body is repeated until a particular value (sentinel) is found.
A loop body entered unconditionally is akin to a Repeat Statement, or do while in C.
41
x86 Loop
Typical pattern of the x86 loop instruction follows:
next:
        . . .
        loop    next    ; is executed: if --cx then goto next;
This loop is characteristic of the do{ } while() of C.
42
Skip Loop Constructs For ECE
43
x86 Loop
Loops allow the repeated operation of their bodies.
Repetition is based on a condition, or on a defined number of steps, which in effect defines that condition.
On the x86 architecture, the cx register functions as the counter for counted loops, with the loop opcode.
On x86 the counted loop is executed by the loop instruction, assuming the loop count in cx.
During each execution of the loop opcode, the value in cx is decremented by 1.
As long as cx is not 0 after the decrement, execution continues at the place of the loop label.
Else execution continues at the next instruction after the loop opcode.
44
x86 Loop
; demonstrate the x86 “loop” instruction
; assumes count to be in cx
; when loop is executed: decrement cx
; once cx is 0, continue at instruction after loop
; else branch to label
; place 10 into cx to define loop steps
        mov     cx, 10
again:                          ; a label! Note the colon :
        mov     ax, cx          ; print value in ax
        call    PutDec          ; via PutDec procedure
        loop    again           ; check, if need to loop more
; prints the numbers 10 down to 1, but NOT 0
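The decrement-then-test behavior of loop corresponds to a C do-while over a 16-bit counter. The sketch below, with the hypothetical name `loop_body_runs`, counts how often the body executes for a given starting value of cx.

```c
#include <stdint.h>

/* Returns how many times a loop body runs when it ends in "loop label"
   and cx holds the given value on entry. */
uint32_t loop_body_runs(uint16_t cx)
{
    uint32_t runs = 0;
    do {
        runs++;            /* the loop body itself                    */
        cx--;              /* loop: first decrement cx (wraps at 0) ... */
    } while (cx != 0);     /* ... then branch back while cx is nonzero */
    return runs;
}
```

Starting with cx = 10 the body runs 10 times. Starting with cx = 0, the decrement wraps to 0FFFFh and the body runs 65536 times, which is why a zero count has to be tested before the loop is entered.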
45
First Loop
We define a string in the data segment, all ‘0’..’f’ digits.
The data area is named ‘chars’ and is used as an address (data offset).
The sentinel for loop termination is ‘#’.
Register bx is used as index register.
Note that only bx, si, di, and bp can be used for indexing on x86.
Practice the cmp instruction, which compares by subtracting, and then sets flags.
Get to know the conditional jump (jcc) and the unconditional jump (jmp).
See the use of labels as destinations of jumps.
Output of program is: 0123456789abcdef
46
First Loop
; Source file: loop1.asm
; Purpose: use, syntax of indexing array w. sentinel
Start   macro                   ; no parameters
        mov     ax, @data       ; predefined macro
        mov     ds, ax          ; now data segment reg set
        endm                    ; end macro: start
Termin  macro   ret_code        ; 1 parameter: return code
        mov     ah, 4ch         ; terminate: set ah + al
        mov     al, ret_code    ; any errors? If /= 0
        int     21h             ; call sys sw for help
        endm                    ; end macro: termin
Char_Out = 2h
Sentin   = '#'
        .model  small
        .data
chars   db      "0123456789abcdef", Sentin
47
First Loop
        .code
main:   start
        mov     ah, Char_Out    ; set up ah for sys
        mov     bx, 0           ; to index string, init 0
next:   mov     dl, chars[bx]   ; find next char
        inc     bx              ; increment index reg bx
        cmp     dl, Sentin      ; found sentinel?
        je      done            ; yep, so stop
        int     21h             ; nope, so print it
        jmp     next            ; try next; could be sentinel
done:   termin  0               ; no errors if we reach
        end     main            ; start here!
48
Second Loop
Again we define a character string in the data segment, all ‘0’..’f’ hex digits.
This time we use no sentinel.
Assume that the loop is executed exactly 16 times, and that this count is known a priori, i.e. it is a countable loop.
Again we use register bx as index register.
Learn the loop instruction, which combines tracking the loop count with a conditional branch.
The loop instruction on x86 subtracts 1 from cx each time it is executed.
If cx is then 0, fall through; else branch to the target, which is part of the instruction.
Output of program is: 0123456789abcdef
49
Second Loop
; Source file: loop2.asm
; Purpose: use, syntax of indexing char array
; loop is "countable": we know # of elements
; before start of loop; we know at assembly time
. . . same macros start, termin
Char_Out = 2h
Num_El   = 10h                  ; 16 elements in chars array[]
        .model  small
        .data
chars   db      "0123456789abcdef"
50
Second Loop
        .code                   ; abbreviation
main:   start
        mov     ah, Char_Out    ; set up ah for system call
        mov     bx, 0           ; initial index off 'chars'
        mov     cx, Num_El      ; know # iterations a priori
next:   mov     dl, chars[bx]   ; find next char
        inc     bx              ; increment index register
        int     21h             ; print it
        loop    next            ; try next one; could be 0: end
51
Third Loop
Again we define a character string in the data segment, all ‘0’..’f’ hex digits, with no sentinel.
Assume the iteration count is not known a priori.
Again use register bx as index register.
We must check whether cx is less than or equal to zero.
Caution: if cx were negative, this would be bad news, as looping would be excessive!
Good that x86 provides a special opcode, jcxz.
The loop instruction on x86 subtracts 1 from cx, so cx should start with a positive value.
New instruction jcxz: if cx is already zero at the start, branch and don’t enter the loop body.
Output of program is: 0123456789abcdef
52
Third Loop
; Source file: loop3.asm
; Purpose: use, syntax of indexing char array
        .model  small
        .data
chars   db      "0123456789abcdef"
        .code
main:   start
        mov     ah, Char_Out    ; set up ah for DOS call
        mov     bx, 0           ; initial index off 'chars'
; assume that # read at run time
; fake this reading by brute-force setting
; but the point is: The # could be non-positive!
53
Third Loop
        mov     cx, 16          ; pretend we read value of cx
        cmp     cx, 0           ; then test if cx < 0
        jl      done_neg        ; if it is, jump
        jcxz    done_zero       ; if it is zero, jump also
; if we reach this: cx is positive
next:   mov     dl, [chars][bx] ; find next char
        inc     bx              ; increment index register
        int     21h             ; output next character
        loop    next            ; try next one; could be end
done:   termin  0               ; no errors if we reach
done_neg:
        termin  1               ; another error code. Not 0
done_zero:
        termin  2               ; and yet another error
        end     main            ; start here!
54
X86 Call and Return
55
Call and Return
High-level programming requires logical (and physical) modularization to render the overall programming task manageable.
The key tool for logical modularization is the creation of procedures (in some languages called subroutines, functions, etc.) with their associated calls and returns.
This section introduces calling and returning, also known as context switching.
We’ll use the term procedure generically to mean procedure, function, or subroutine, unless the particular meaning is needed.
56
Call and Return
It is not feasible to express a complete program as a single procedure when the program is large.
Logical modules reduce the complexity of the programming task.
They allow re-use and reincarnation of the same procedure through parameterization.
A High-Level language should hide the detail of the call/return mechanism; not so in assembler.
For example, the manipulation of the stack through push and pop operations should be hidden.
However, some aspects of the context switch should be reflected in the High-Level language, in particular the call and return.
57
Call and Return
As in High-Level language programs, procedures are a key syntax tool to modularize.
Physical modules (procedures) encapsulate data and actions that belong together.
Physical modules (delineated by the proc and endp keywords) are the language tool to define modules.
Procedures can be called via the call opcode, parameterized by the procedure name, e.g.: call PutDec
Procedures return via the ret instruction.
If they return a result to the calling environment, we refer to them as functions.
A return ends up at the instruction after the call.
58
Call and Return Stack Frame
The Stack Pointer identifies the top of the current stack, and also the top of the current Stack Frame.
The stack pointer may vary often during an invocation.
The stack pointer changes upon call, return, push, pop, and explicit assignments.
The base pointer does not vary during a call.
The base pointer is set up only once, at the start of a call.
The base pointer is changed again at return, to the value of the previous base pointer, the dynamic link.
Parameters can be addressed relative to the base pointer in one direction.
59
Call and Return Stack Frame
Locals (and temps) can be addressed relative to the base pointer in the opposite direction of formal parameters.
It is possible to do without the base pointer, which is useful when registers are scarce, as on x86; this is not discussed here.
However, that scheme is difficult, since the compiler (or human programmer) must keep the dynamic size of the stack in mind at every moment of code generation.
60
Call and Return Stack Frame: stack grows upwards
61
Call and Return Before Call
Push actual parameters: this changes the stack.
Track the size of the actual parameters pushed.
In most languages the actual size is fixed; not so in C: it is allowed in C to pass a smaller number of actuals than formally declared.
This has an interesting side effect on the order: inverted in C. That belongs to CS, not ECE.
The base pointer still points to the Stack Marker of the caller before the call instruction is executed.
After the last actual parameter is pushed, one flexible part of the Stack Frame is complete.
62
Call and Return Call
Push the instruction pointer (ip), AKA pc.
The address of the instruction after the call must be saved as return address, i.e. logically pc + 1.
This uses a pre-defined location in the Stack Marker.
Other places in the Stack Marker are: the function return value and the Dynamic Link = copy of the old base pointer.
Set the instruction pointer (ip, AKA pc) to the address of the destination (callee).
The x86 architecture has 24 flavors of call instructions.
63
Call and Return Procedure Entry
Push the Base Pointer; this is the dynamic link, on x86 held in the bp register.
Set the Base Pointer bp to the value of the Stack Pointer sp.
Now the new Stack Frame is being addressed by bp.
The fixed part of the stack, the Stack Marker, is built.
Allocate space for local variables, if any.
This (logically) increases the sp pointer; on x86 it actually decreases sp.
This establishes the 3rd part of the Stack Frame, variable in size: the space for locals, saved registers, and temps.
64
Call and Return Return
Pop locals, temps, and saved registers off the stack.
This frees the second variable-size area from the Stack Frame.
Pop the registers to be restored.
Pop the top of stack value into the Base Pointer (bp).
This uses the Dynamic Link to reactivate the previous, i.e. the caller’s, Stack Frame.
Pop the top of stack value into the instruction pointer; that’s where the return address was stored.
65
Call and Return Return
This sets the ip register back to the instruction after the call, whose address had been saved in the Stack Marker.
The return instruction (on x86: ret) does this!
Either the caller or a suitable argument of the return instruction frees the space allocated for actual parameters.
The x86 architecture allows an argument to the ret instruction, freeing that number of bytes off the stack.
66
Call and Return Code
1a. Procedure Entry, No Locals, Save Regs
        push    bp              ; save dyn link in Stack Marker
        mov     bp, sp          ; establish new Frame: point to S.M.
        push    ax              ; save ax if needed by callee, opt.
        push    bx              ; ditto for bx
67
Call and Return Code
1b. Procedure Exit, No Locals, Restore Regs
        pop     bx              ; restore bx if was used by callee
        pop     ax              ; ditto for ax
        pop     bp              ; must find back old Stack Frame
        ret     args            ; ip to instruction after call
                                ; free args
68
Call and Return Code
2a. Procedure Entry, With Locals, No Regs
        push    bp              ; save dyn link in Stack Marker
        mov     bp, sp          ; establish new Frame: point to S.M.
        sub     sp, 24          ; allocate 24 bytes uninitialized
                                ; space for locals
69
Call and Return Code
2b. Procedure Exit, With Locals, No Regs
        mov     sp, bp          ; free all locals and temps
        pop     bp              ; must find old S.F., RA on top
        ret     args            ; ip to instruction after call
                                ; free args
70
Call and Return Code
3a. Procedure Entry, With Locals, Save Regs
        push    bp              ; save dyn link in Stack Marker
        mov     bp, sp          ; establish new Frame: point to S.M.
        sub     sp, 24          ; allocate 24 bytes uninitialized
                                ; space for locals
        push    ax              ; save ax if needed by callee, opt.
        push    bx              ; ditto for bx, i.e. another temp
71
Call and Return Code
3b. Procedure Exit, With Locals, Restore Regs
        pop     bx              ; restore bx if was used by callee
        pop     ax              ; ditto for ax
        mov     sp, bp          ; free all locals and other temps
        pop     bp              ; must find back old S.F., RA on top
        ret     args            ; ip to instruction after call
                                ; free args
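The entry and exit sequences 3a and 3b can be traced on a toy stack in C. This is a simplified sketch under stated assumptions: the `Machine` type and the helpers `push16`, `pop16`, `read16`, and `call_with_frame` are invented for the illustration, the call passes a single word argument, and near calls with 2-byte return addresses are assumed.

```c
#include <stdint.h>

enum { STACK_BYTES = 256 };

/* A toy 16-bit machine: a byte-addressed stack plus the registers used above. */
typedef struct {
    uint8_t  mem[STACK_BYTES];
    uint16_t sp, bp, ax, bx, ip;
} Machine;

static void push16(Machine *m, uint16_t v)
{
    m->sp = (uint16_t)(m->sp - 2);              /* x86 stack grows toward lower addresses */
    m->mem[m->sp]     = (uint8_t)(v & 0xFFu);   /* little-endian: low byte first */
    m->mem[m->sp + 1] = (uint8_t)(v >> 8);
}

static uint16_t read16(const Machine *m, uint16_t addr)
{
    return (uint16_t)(m->mem[addr] | (m->mem[addr + 1] << 8));
}

static uint16_t pop16(Machine *m)
{
    uint16_t v = read16(m, m->sp);
    m->sp = (uint16_t)(m->sp + 2);
    return v;
}

/* One call with a single word argument; returns the argument as the callee sees it. */
uint16_t call_with_frame(Machine *m, uint16_t arg, uint16_t return_address)
{
    uint16_t seen_arg;

    push16(m, arg);                     /* caller: push actual parameter      */
    push16(m, return_address);          /* call:   push ip                    */

    push16(m, m->bp);                   /* 3a: push bp  (dynamic link)        */
    m->bp = m->sp;                      /*     mov bp, sp                     */
    m->sp = (uint16_t)(m->sp - 24);     /*     sub sp, 24  (locals)           */
    push16(m, m->ax);                   /*     push ax                        */
    push16(m, m->bx);                   /*     push bx                        */

    seen_arg = read16(m, (uint16_t)(m->bp + 4));  /* [bp+4]: above link and return address */
    m->ax = 0xAAAA;                     /* callee body clobbers the registers */
    m->bx = 0xBBBB;

    m->bx = pop16(m);                   /* 3b: pop bx                         */
    m->ax = pop16(m);                   /*     pop ax                         */
    m->sp = m->bp;                      /*     mov sp, bp  (free locals)      */
    m->bp = pop16(m);                   /*     pop bp                         */
    m->ip = pop16(m);                   /*     ret 2: pop return address ...  */
    m->sp = (uint16_t)(m->sp + 2);      /*     ... and free the 2-byte arg    */
    return seen_arg;
}
```

After the call, sp, bp, ax, and bx hold the same values as before it, which is the invariant the matching entry and exit sequences are meant to guarantee.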
72
Call and Return Recursive Factorial in C
// source: fact.c
. . .
unsigned fact( unsigned arg )
{ // fact
    if ( arg <= 1 ) {
        return 1;
    }else{
        return fact( arg - 1 ) * arg;
    } //end if
} //end fact
73
Call and Return Recursive Factorial in x86
; Source file: fact.asm
penter  macro
        push    bp
        mov     bp, sp
        push    bx
        push    cx
        push    dx
        endm
pexit   macro   args
        pop     dx
        pop     cx
        pop     bx
        pop     bp
        ret     args
        endm
Errcode = 4ch
MAX     = 9d
        .model  small
        .stack  100h
        .data
arg     dw      0
74
Call and Return Recursive Factorial in x86
        .code
        extrn   uPutDec : near
; assume arg on tos
; return fact( int arg ) in ax
rfact   proc
        penter
        mov     ax, [bp+4]      ; arg 4 bytes before dyn link
        cmp     ax, 1           ; argument > 1?
        jg      recurse         ; if so: recursive call
base:   mov     ax, 1           ; No: then 0!=1!=1
        pexit   2               ; done, free 2 bytes = arg
recurse:
        mov     ax, [bp+4]      ; recurse; get next arg
        dec     ax              ; but decrement first
        push    ax              ; and pass on stack
        call    rfact           ; recurse!
        mov     cx, [bp+4]      ; product in ax, * arg
        mul     cx              ; product in ax
        pexit   2               ; and done
rfact   endp
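Since rfact keeps only the low word of each product in ax, it is worth checking how far a 16-bit factorial stays exact. The C sketch below is an illustration, with hypothetical names `fact16` and `largest_exact_arg`; it mirrors the recursion but truncates each product to 16 bits.

```c
#include <stdint.h>

/* Recursive factorial whose result is truncated to 16 bits,
   like the low word left in ax after "mul cx". */
uint16_t fact16(uint16_t arg)
{
    if (arg <= 1) {
        return 1;                                   /* 0! = 1! = 1 */
    }
    return (uint16_t)((uint32_t)fact16((uint16_t)(arg - 1)) * arg);
}

/* Largest argument whose exact factorial still fits in an unsigned word. */
uint16_t largest_exact_arg(void)
{
    uint32_t exact = 1;     /* exact factorial of n */
    uint16_t n = 1;
    while (exact * (uint32_t)(n + 1) <= 0xFFFFu) {
        n++;
        exact *= n;
    }
    return n;
}
```

The largest exact case is 8! = 40320; 9! = 362880 no longer fits in a word. This is consistent with MAX = 9 in the listing, since the driver loop stops before arg reaches MAX.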
75
Call and Return Recursive Factorial in x86
drive_r proc
        mov     arg, 0          ; initial memory
        mov     ax, 0           ; initial value again
        mov     bp, sp          ; no space for locals needed
again_r:
        cmp     arg, MAX
        jge     done_r
; ax holds argument to be factorialized :-)
        push    ax              ; argument on stack
        call    rfact           ; now ax holds factorial value
        call    uPutDec         ; print next result
        inc     arg             ; compute next fact(arg)
        mov     ax, arg         ; pass in ax
        jmp     again_r
done_r: ret
drive_r endp
76
Design Asm Procedure PutDec
77
Design PutDec Goal Definition
Design an assembly language procedure that prints a passed integer value in decimal notation.
Values are passed in a machine register.
Values may be positive or negative.
Use x86 small arithmetic, i.e. 16-bit integer precision, to easily track overflow and the minimum and maximum integer values.
We’ll proceed stepwise:
Printing a character
Printing a decimal digit, given an integer value 0..9
Finally printing the complete integer
78
Design PutDec
Define Macro Put_Ch to print one character
; character is passed in dl
; fiddle with ax, dx; restore before finishing
Put_Ch  macro   char            ; 'char' is char to be printed
        push    ax              ; save ax
        push    dx              ; ditto for dx; use only dl
        mov     dl, char        ; move into formal parameter
        mov     ah, 02h         ; tell system SW whom to call
        int     21h             ; call system SW, e.g. DOS
        pop     dx              ; restore
        pop     ax              ; ditto
        endm
79
Design PutDec
Print integer value 0..9 in dl as a character
; assume integer 0..9 to be in dl
; convert to ASCII character
; simple: just add '0'
;
p_char: add     dl, '0'         ; convert int to char
        Put_Ch  dl              ; previously defined macro
80
Design PutDec
Print rightmost digit of number in ax in decimal
; ax holds non-negative integer value
; but is a binary number, i.e. binary 0..9; need ASCII
        mov     bx, 10          ; base 10 is in bx
        sub     dx, dx          ; make ax-dx a double word
        div     bx              ; unsigned divide ax by 10
; remainder is in dx; known to be < 10, so dl holds it
        add     dl, '0'         ; make int a printable char
        Put_Ch  dl              ; print that char
81
Asm Source For Procedure PutDec
82
PutDec Asm Code: Macros
; Purpose: print various signed 16-bit numbers
start   macro
        mov     ax, @data       ; typical for MS system SW
        mov     ds, ax
        endm
finish  macro                   ; also MS system SW
        mov     ax, 4c00h
        int     21h
        endm
Put_Ch  macro   char            ; 'char' char is printed
        push    ax              ; save cos ax is overwritten
        push    dx              ; ditto for dx
        mov     dl, char        ; move character into parameter
        mov     ah, 02h         ; tell DOS who
        int     21h             ; call DOS
        pop     dx              ; restore
        pop     ax              ; ditto
        endm
83
PutDec Asm Code: Macros
Put_Str macro   str_addr        ; print string at 'str_addr'
        push    ax              ; save
        push    dx              ; save
        mov     dx, offset str_addr
        mov     ah, 09h         ; DOS proc id
        int     21h             ; call DOS
        pop     dx              ; restore
        pop     ax              ; ditto
        endm
base_10 = 10
        .model  small
        .stack  500
        .data
min_num db      '-32768$'       ; end strings with '$'
num_is  db      'the number is: $'
cr_lf   db      10, 13, '$'     ; magic numbers for lf, cr
84
PutDec Asm Code: Body
        .code
; ax value printed as a decimal integer number
PutDec  proc
; special case -32768 cannot be negated
        cmp     ax, -32768      ; is it special case?
        jne     do_work         ; nope, so do your real job
        Put_Str min_num         ; yep: so print it and be done
        ret                     ; done. Printed -32768
do_work:                        ; ax is NOT -32768; is it negative?
        push    ax
        push    bx
        push    cx
        push    dx
        cmp     ax, 0           ; negative number?
        jge     positive        ; if negative: invert sign, print -
        neg     ax              ; here the inversion
        Put_Ch  '-'             ; but first print - sign
positive:
        sub     cx, cx          ; cx counts steps = # digits
        mov     bx, base_10     ; divisor is 10
; now we know number in ax is non-negative
85
PutDec Asm Code: Body
; continue with non-negative number
push_m: sub  dx, dx             ; zero dx, so dx:ax is a double word
        div  bx                 ; unsigned divide o.k.
        add  dl, '0'            ; make number a char
        push dx                 ; save; order reversed
        inc  cx                 ; count steps
        cmp  ax, 0              ; finally done?
        jne  push_m             ; if not, do next step
        ; now all chars are stored on stack in l-2-r order
pop_m:  pop  dx                 ; pop to dx; dl of interest
        Put_Ch dl               ; print it as char
        loop pop_m              ; more work? If so, do again
done:   pop  dx                 ; restore what you messed up
        pop  cx                 ; ditto
        pop  bx
        pop  ax
        ret                     ; return to caller
PutDec  endp
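As a cross-check of the control flow, the whole procedure can be modeled in C. This is only a sketch under assumptions: the name put_dec is hypothetical, and it writes into a caller-supplied buffer instead of calling DOS. The local array stands in for the hardware stack and count stands in for cx.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C model of PutDec. Writes the decimal text of value into
   out (at least 7 bytes) and returns the number of characters written,
   not counting the terminating '\0'. */
int put_dec(int16_t value, char *out)
{
    char stack[5];            /* a 16-bit value has at most 5 digits */
    int count = 0;            /* plays the role of cx */
    int pos = 0;
    uint16_t n;

    assert(out != 0);
    if (value == INT16_MIN) { /* -32768 cannot be negated in 16 bits */
        strcpy(out, "-32768");
        return 6;
    }
    if (value < 0) {
        out[pos++] = '-';     /* print the sign */
        value = (int16_t)-value;
    }
    n = (uint16_t)value;
    do {                      /* push_m: runs at least once, so 0 prints "0" */
        stack[count++] = (char)('0' + n % 10);
        n /= 10;
    } while (n != 0);
    while (count > 0) {       /* pop_m: last pushed digit comes out first */
        out[pos++] = stack[--count];
    }
    out[pos] = '\0';
    return pos;
}
```

With a buffer such as char buf[8], put_dec(-45, buf) fills buf with "-45". Pushing the digits and then popping them is what reverses the right-to-left order produced by repeated division.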
86
PutDec Asm Code: Driver
; output readable string. Print #, carriage-return
;
next_n  proc
        Put_Str num_is          ; print message
        call PutDec             ; print #
        Put_Str cr_lf           ; cr lf
        ret
next_n  endp                    ; repeat label before endp

num     macro val               ; just to practice macros
        mov  ax, val            ; PutDec expects # in ax
        call next_n             ; message, print #, cr lf
        endm
87
PutDec Asm Code: Main
main    proc                    ; entry point under Windows
        start                   ; set up for OS
        ; exercise all kinds of cases, including corner cases
        num                     ; all macro expansions
        num                     ; ditto
        num                     ; put this # into ax
        num
        num  1
        num
        num  0
        num  0ffh
        finish
main    endp
        end  main               ; this IDs the entry point
                                ; can be different name
88
Appendix: Some Definitions
89
Definitions Activation Record
Synonym for Stack Marker.
90
Definitions Base Address
Memory address of an aggregate area. Usually a segment register, AKA base register, is used to hold a base address. Addressing can then proceed relative to such a base address.
91
Definitions Base Pointer
An address pointer (often implemented via a dedicated register) that identifies an agreed-upon area in the Stack Frame of an executing procedure. On the x86 architecture this is implemented via the dedicated bp register.
92
Definitions Binding
Procedures may have parameters. Formal parameters express attributes such as type and name. At the place of call, these formal parameters receive initial values through so-called actual parameters. Sometimes an actual parameter is solely the address of the true object referenced during the call. The association of actual to formal parameters is referred to as parameter binding.
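A small C sketch may make the two kinds of binding concrete. The function bump and its parameter names are hypothetical. Here value is bound to a copy of the actual parameter, while ref is bound to the address of the caller's object.

```c
#include <assert.h>

/* Hypothetical example: value and ref are the formal parameters. */
int bump(int value, int *ref)
{
    assert(ref != 0);
    value += 1;     /* changes only the callee's private copy */
    *ref  += 1;     /* changes the caller's object through its address */
    return value;
}
```

After int a = 5, b = 5; a call bump(a, &b) returns 6, leaves a at 5, and changes b to 6, because only b was passed by address.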
93
Definitions Branch
Transfer of control to a destination that is generally not the instruction following the branch. Synonym: Jump. The destination is an explicit or implicit operand of the branch instruction.
94
Definitions Call
Transfer of control (a.k.a. context switch) to the destination named by the operand of the call instruction. A call expects that after completion, the program resumes execution at the instruction after the call instruction.
95
Definitions Countable Loop
Loop in which the number of iterations can be computed (is known) before the loop body starts. Thus the loop must include code to change the remaining loop count, and a check to test whether the final count has been reached.
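A minimal C sketch of a countable loop follows; the function sum_first and the variable remaining are hypothetical names chosen for illustration. The count is fixed before the body starts, and each iteration both tests and decrements it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: add up the first count elements of data. */
long sum_first(const int16_t *data, unsigned count)
{
    long total = 0;
    unsigned remaining = count;   /* number of iterations is known up front */

    assert(data != 0 || count == 0);
    while (remaining != 0) {      /* check whether the final count is reached */
        total += *data++;         /* loop body: the actual work */
        remaining--;              /* change the remaining loop count */
    }
    return total;
}
```

This sketch tests the count before the body, so a count of 0 runs the body zero times.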
96
Definitions Dynamic Link
Location in the Stack Marker pointing to the Stack Frame of the calling procedure. This caller is temporarily dormant; i.e. it is the callee's stack frame that is active. Since the caller also has a Dynamic Link object, all currently incomplete Stack Frames are linked together via this data structure.
97
Definitions For Loop
High-level construct implementing a countable loop. The x86 loop instruction is a key component for writing countable loops.
98
Definitions Frame Pointer
Synonym for Base Pointer.
99
Definitions Hand-Manufactured Loop
Most general type of loop: the number of iterations cannot be computed before, not even during, the execution of the loop. Generally, the number of iterations depends on data that are input via read operations. Also, the number of steps may depend on the precision of a computed (floating-point) result and thus is not known until the end.
100
Definitions Immediate Operand
Operand encoded as part of the instruction. No load is needed to get the immediate value; instead, it is immediately available in the instruction proper. Since instructions have a limited number of bits, the size of immediate operands usually is limited to a fraction of a natural machine instruction, or word.
101
Definitions Load
Operation to move (read) data from memory to the processor. Usually the destination is a register. The source address is communicated in an immediate operand, or in another register, or indirectly through a register.
102
Definitions Loop Body
Program portion executed repeatedly. This is the actual work to be accomplished. The rest is loop overhead. The goal is to minimize that overhead.
103
Definitions Offset
A distance in memory units away from the base address. On a byte-addressable microprocessor an offset is a distance in units of bytes. Offset is frequently defined as a distance from a base register, on x86 from a segment register.
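For the 16-bit real-mode setting used in the code of this chapter, a segment register value and an offset combine into a physical address as segment times 16 plus offset. The C sketch below illustrates that rule; the function name real_mode_address is hypothetical, and the sketch deliberately ignores address wrap-around on processors with only 20 address lines.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: physical address for x86 real mode,
   computed as segment * 16 + offset. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;   /* shift by 4 bits = times 16 */
}
```

For example, segment 1234h with offset 0010h gives the physical address 12350h.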
104
Definitions Pop
Stack operation that frees data from the stack. Often, the data popped off are assigned to some other object. Other times, data are just popped because they are no longer needed, in which case only the stack space is freed. This can also be accomplished by changing the value of the stack pointer. Often the memory location is not overwritten by a pop, i.e. the data just stay. But the memory area is no longer considered part of the active stack.
105
Definitions Push
Stack operation that reserves temporary space on the stack. Generally, the space reserved on the stack through a push is initialized with the argument of the push operation. Other times, a push just reserves space on the stack for data to be initialized at a later time. Note that on the x86 architecture a push decreases the top of stack pointer (sp value).
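The push and pop behavior described above can be sketched as a tiny C model of a stack that grows toward lower addresses. All names here (WordStack, STACK_WORDS, stack_init, stack_push, stack_pop) are hypothetical; sp is an array index standing in for the sp register.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 8            /* hypothetical, tiny stack for illustration */

typedef struct {
    uint16_t mem[STACK_WORDS];
    int sp;                      /* index of last pushed word; STACK_WORDS when empty */
} WordStack;

void stack_init(WordStack *s)
{
    s->sp = STACK_WORDS;         /* empty stack: sp sits just past the highest word */
}

void stack_push(WordStack *s, uint16_t v)
{
    assert(s->sp > 0);           /* there must be room left */
    s->sp--;                     /* push first decreases sp ... */
    s->mem[s->sp] = v;           /* ... then stores the argument */
}

uint16_t stack_pop(WordStack *s)
{
    uint16_t v;

    assert(s->sp < STACK_WORDS); /* stack must not be empty */
    v = s->mem[s->sp];           /* read the top word ... */
    s->sp++;                     /* ... then free it; memory is not cleared */
    return v;
}
```

After a pop, the old value is still present in mem, but it lies outside the active stack and the next push overwrites it.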
106
Definitions Repeat Loop
Loop in which the body is entered unconditionally, and thus executed at least once. The number of iterations is generally not known until the loop terminates. The termination condition is computed logically at the physical end of the loop.
107
Definitions Return
Transfer of control after completion of a call. Usually, this is accomplished through a return instruction. The return instruction assumes the return address to be saved in a fixed part of the stack frame, called the Stack Marker.
108
Definitions Return Value
The value returned by a function call. If the return value is a composite data structure, then the location set aside for the function return value is generally a pointer to the actual data. When no value is returned, we refer to the callee as a procedure.
109
Definitions Stack
AKA runtime stack. Run time data structure that grows and shrinks during program execution. It generally holds data (parameters, locals, temps) and control information (return addresses, links). Operations that change the stack include push, pop, call, return, and the like.
110
Definitions Stack Frame
Run time data structure associated with an active procedure or function. A Stack Frame is composed of the procedure parameters, the Stack Marker, local data, and space for temporary data, including saved registers.
111
Definitions Stack Marker
Run time data structure on the stack associated with a Stack Frame. The Stack Marker holds fixed information, whose structure is known a priori. This includes the return address, the static link, and the dynamic link. In some implementations, the Stack Marker also holds an entry for the function return value and the saved registers.
112
Definitions Stack Pointer
AKA top of stack pointer. A pointer (typically implemented via a register) that addresses the last element allocated (pushed) on top of the stack. On the x86 architecture this is implemented via the sp register. It is also possible to have the Stack Pointer refer to the next free location (if any) on the stack in case another push operation needs stack space.
113
Definitions Static Link
An address in the Stack Marker that points to the Frame Pointer of the last invocation of the procedure which lexically surrounds the one executing currently. This is necessary only for high level languages that allow statically nested scopes, such as Ada, Algol, and Pascal. This is not needed in more restricted languages such as C or C++.
114
Definitions Store
Operation to move data to memory. Such moves are named writes or stores. Usually the source is a register, holding the value to be stored. The target is a memory location, whose address is held in some register. Some architectures allow the target address to be an immediate operand; not so on RISC architectures.
115
Definitions Stride
Distance in number of bytes from one element to the next of the same type. For example, the stride of an integer array on the x86 architecture is 2 for signed and unsigned words. Note that x86 calls a unit of 2 bytes a word; most architectures have 4-byte words. It is 4 for double words on x86.
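The stride values can be observed directly in C by measuring the byte distance between two neighboring array elements. The function names word_stride and dword_stride are hypothetical; int16_t and int32_t stand in for the x86 word and double word.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte distance between neighboring 16-bit elements. */
size_t word_stride(void)
{
    int16_t words[2] = {0, 0};
    return (size_t)((const char *)&words[1] - (const char *)&words[0]);
}

/* Hypothetical helper: byte distance between neighboring 32-bit elements. */
size_t dword_stride(void)
{
    int32_t dwords[2] = {0, 0};
    return (size_t)((const char *)&dwords[1] - (const char *)&dwords[0]);
}
```

On a byte-addressable machine with 8-bit bytes, word_stride() returns 2 and dword_stride() returns 4, matching the values given above.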
116
Definitions Top of Stack
Stack location of the last allocated (pushed) object
117
Definitions While Loop
Loop in which the body is entered after first checking whether the condition for execution is true. If false, the body is not executed. This is also used as the termination criterion. The number of iterations is generally not known until the loop terminates.
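The difference between a While Loop and a Repeat Loop shows up when the condition is false from the start. The C sketch below counts decimal digits both ways; digits_while and digits_repeat are hypothetical names.

```c
/* Hypothetical example: While Loop, condition checked before the body. */
unsigned digits_while(unsigned n)
{
    unsigned count = 0;
    while (n != 0) {          /* for n == 0 the body never runs */
        n /= 10;
        count++;
    }
    return count;
}

/* Hypothetical example: Repeat Loop, condition checked after the body. */
unsigned digits_repeat(unsigned n)
{
    unsigned count = 0;
    do {                      /* body runs at least once */
        n /= 10;
        count++;
    } while (n != 0);
    return count;
}
```

For the input 0 the While Loop version counts no digits, while the Repeat Loop version counts one. That is why the push_m loop in PutDec tests its condition at the end: the value 0 still has to produce one printed digit.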