IA32 programming for Linux Concepts and requirements for writing Linux assembly language programs for Pentium CPUs.


1 IA32 programming for Linux
Concepts and requirements for writing Linux assembly language programs for Pentium CPUs

2 A source program’s format
Source-file: a pure ASCII-character textfile
Is created using a text-editor (such as ‘vi’)
You cannot use a ‘word processor’ (why?)
Program consists of series of ‘statements’
Each program-statement fits on one line
Program-statements all have same layout
Design in 1950s was for IBM punch-cards

3 Statement Layout (1950s)
Each ‘statement’ consisted of four ‘fields’
Fields appear in a prescribed left-to-right order
These four fields were named (in order):
-- the ‘label’ field
-- the ‘opcode’ field
-- the ‘operand’ field
-- the ‘comment’ field
In many cases some fields could be left blank
Extreme case (very useful): whole line is blank!

4 The ‘as’ program
The ‘assembler’ is a computer program
It accepts a specified text-file as its input
It must be able to ‘parse’ each statement
It can produce onscreen ‘error messages’
It can generate an ELF-format output file (that file is known as an ‘object module’)
It can also generate a ‘listing file’ (optional)

5 The ‘label’ field
A label is a ‘symbol’ followed by a colon (‘:’)
The programmer invents his own ‘symbols’
Symbols can use letters and digits, plus a very small number of ‘special’ characters ( ‘.’, ‘_’, ‘$’ )
A ‘symbol’ is allowed to be of arbitrary length
The Linux assembler (‘as’) was designed for translating source-text produced by a high-level language compiler (such as ‘cc’)
But humans can also write such files directly

6 The ‘opcode’ field
Opcodes are predefined symbols that are recognized by the GNU assembler
There are two categories of ‘opcodes’ (called ‘instructions’ and ‘directives’)
‘Instructions’ represent operations that the CPU is able to perform (e.g., ‘add’, ‘inc’)
‘Directives’ are commands that guide the work of the assembler (e.g., ‘.globl’, ‘.int’)

7 Instructions vs Directives
Each ‘instruction’ gets translated by ‘as’ into a machine-language statement that will be fetched and executed by the CPU when the program runs (i.e., at ‘runtime’)
Each ‘directive’ modifies the behavior of the assembler (i.e., at ‘assembly time’)
With GNU assembly language, they are easy to distinguish: directives begin with ‘.’

8 A list of the Pentium opcodes
An ‘official’ list of the instruction codes can be found in Intel’s programmer manuals: http://developer.intel.com
But it’s three volumes, nearly 1000 pages (it describes ‘everything’ about Pentiums)
An ‘unofficial’ list of (most) Intel instruction codes can fit on one sheet, front and back: http://www.jegerlehner/intel/

9 The AT&T syntax
The GNU assembler uses AT&T syntax (instead of official Intel/Microsoft syntax), so the opcode names differ slightly from the names that you will see on those lists:

Intel-syntax    AT&T-syntax
------------    --------------
ADD             addb/addw/addl
INC             incb/incw/incl
CMP             cmpb/cmpw/cmpl

10 The UNIX culture
Linux is intended to be a version of UNIX (so that UNIX-trained users already know Linux)
UNIX was developed at AT&T (in the early 1970s) and AT&T’s computers were built by DEC, thus UNIX users learned DEC’s assembly language
Intel was an early ally of DEC’s competitor, IBM, which deliberately used ‘incompatible’ designs
Also: an ‘East Coast’ versus ‘West Coast’ thing (California, versus New York and New Jersey)

11 Bytes, Words, Longwords
CPU Instructions usually operate on data-items
Only certain sizes of data are supported:
-- BYTE: one byte consists of 8 bits
-- WORD: consists of two bytes (16 bits)
-- LONGWORD: uses four bytes (32 bits)
With AT&T’s syntax, an instruction’s name also incorporates its effective data-size (as a suffix)
With Intel syntax, data-size usually isn’t explicit, but is inferred by context (i.e., from operands)

12 The ‘operand’ field
Operands can be of several types:
-- a CPU register may hold the datum
-- a memory location may hold the datum
-- an instruction can have ‘built-in’ data
-- frequently there are multiple data-items
-- and sometimes there are no data-items
An instruction’s operands usually are ‘explicit’, but in a few cases they also could be ‘implicit’

13 Examples of operands
Some instructions that have two operands:
    movl %ebx, %ecx
    addl $4, %esp
Some instructions that have one operand:
    incl %eax
    pushl $fmt
An instruction that lacks explicit operands:
    ret

14 The ‘comment’ field
An assembly language program often can be hard for a human being to understand
Even a program’s author may not be able to recall his programming idea after a while
So programmer ‘comments’ can be vital
A comment begins with the ‘#’ character
The assembler disregards all comments (but they will appear in program listings)

15 ‘Directives’
Sometimes called ‘pseudo-instructions’
They tell the assembler what to do
The assembler will recognize them
Their names begin with a dot (‘.’)
Examples: ‘.section’, ‘.global’, ‘.int’, …
The names of valid directives appear in the table-of-contents of the GNU manual

16 New program example
Let’s look at a demo program (‘squares.s’)
It prints out a mathematical table showing some numbers and their squares
But it doesn’t use any multiplications!
It uses an algorithm based on algebra: (n+1)^2 - n^2 = n + n + 1
If you already know the square of a given number n, you can get the square of the next number n+1 by just doing additions

17 A program with a ‘loop’
Here’s our program idea (expressed in C)
    int num = 1, val = 1;
    do {
        printf( " %d %d \n", num, val );
        val += num + num + 1;
        num += 1;
    } while ( num <= 20 );

18 Some new ‘directives’
‘.equ’ – equates a symbol to a value:
    .equ max, 20
‘.globl’ – just an alternative to ‘.global’:
    .globl main

19 Some new ‘instructions’
‘inc’ – adds one to the specified operand:
    incl arg
‘cmp’ – compares two specified operands:
    cmpl $max, arg
‘jle’ – jump (to a specified instruction) if condition ‘less than or equal to’ is true:
    jle again

20 In-class Exercise
Can you write a program that prints out a table showing powers of 2? (Very useful for computer science students to have handy)
Can you see how to do it without using any ‘multiply’ operations – just additions?
Hint: study the ‘squares.s’ source-code
Then write your own ‘powers.s’ solution
Turn in printouts (source and its output)

