Assembly Language for x86 Processors 6th Edition

1 Assembly Language for x86 Processors 6th Edition
Kip Irvine Chapter 1: Introduction to ASM Slides prepared by the author Revision date: 2/15/2010 (c) Pearson Education, All rights reserved. You may modify and copy this slide show for your personal use, or for use in the classroom, as long as this copyright statement, the author's name, and the title are not changed.

2 The Bottom-Up Approach
We can study computer architectures by starting with the basic building blocks: transistors and logic gates. From these we build more complex circuits: flip-flops, registers, multiplexers, decoders, adders, ... From those we can build computer components: memory, processor, I/O controllers, ... These are used in turn to build a computer system. This was the approach taken in your first course, Computer Architecture I: Digital Design.

3 The Top-Down Approach
In this course we will study computer architectures from the programmer's view. We study the actions that the processor needs to do to execute tasks written in high-level languages (HLL) like C/C++, Pascal, ... To accomplish this we need to learn the set of basic actions that the processor can perform (its instruction set) and learn how an HLL compiler decomposes each HLL command into processor instructions.

4 The Top-Down Approach (Cont.)
We can learn the basic instruction set of a processor either at the machine language level (but reading individual bits is tedious for humans) or at the assembly language level, which is the symbolic equivalent of machine language (understandable by humans). Hence we will learn how to program a processor in assembly language to perform tasks that are normally written in an HLL. We will learn what is going on beneath the HLL interface.

5 Welcome to Assembly Language
How does assembly language (AL) relate to machine language? How do C++ and Java relate to AL? Is AL portable? Why learn AL? Irvine, Kip R. Assembly Language for Intel-Based Computers 6/e, 2010.

6 Levels and Languages
High-level language program -> (compiler) -> assembly language program -> (assembler) -> machine language program. The compiler translates each HLL statement into one or more assembly language instructions. The assembler translates each assembly language instruction into one machine language instruction. Each processor instruction can be written either in machine language form or in assembly language form. Example, for the Intel Pentium:
MOV AL, 5 ; assembly language
10110000 00000101 ; machine language
Hence we will use assembly language.

7 Translating Languages
English: Display the sum of A times B plus C.
C++: cout << (A * B + C);
Assembly Language:
mov eax,A
mul B
add eax,C
call WriteInt
Intel Machine Language:
A1 00000000
F7 25 00000004
03 05 00000008
E8 00500000

8 Assembly Language Today
A program written directly in assembly language has the potential to have a smaller executable and to run faster than an HLL program, but it takes too long to write a large program in assembly language. Only time-critical procedures are written in assembly language (optimization for speed). Assembly language is often used in embedded system programs stored in PROM chips: computer cartridge games, microcontrollers, ... Remember: you will learn assembly language to learn how high-level language code gets translated into machine language, i.e. to learn the details hidden in HLL code.

9 Comparing ASM to High-Level Languages

10 Specific Machine Levels
(Descriptions of individual levels follow.)

11 High-Level Language
Level 4. Application-oriented languages: C++, Java, Pascal, Visual Basic, ... Programs compile into assembly language (Level 3).

12 Assembly Language
Level 3. Instruction mnemonics that have a one-to-one correspondence to machine language. Programs are translated into Instruction Set Architecture level machine language (Level 2). To be learned in 03-60-266.

13 Instruction Set Architecture (ISA)
Level 2. Also known as conventional machine language. Executed by Level 1 (Digital Logic), the hardware (taught in 03-60-265).

14 Digital Logic
Level 1: the digital system seen in 03-60-265. CPU, constructed from digital logic gates. System bus. Memory. Implemented using bipolar transistors.

15 Basic Microcomputer Design
Central Processor Unit: the clock synchronizes CPU operations; the control unit (CU) coordinates the sequence of execution steps; the ALU performs arithmetic and logic operations. Bus: transfers data between different parts of the computer (data bus, control bus, and address bus). Irvine, Kip R. Assembly Language for x86 Processors 6/e, 2010.

16 Instruction Execution Cycle [Fetch-and-Execute Cycle]
Loop:
  Fetch the next instruction, then increment IP (the Instruction Pointer)
  Decode the instruction
  If a memory operand is needed, fetch the operand's value from memory
  Execute the instruction
  If the result is a memory operand, store the output to memory
Continue loop
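
The cycle can be sketched as a toy interpreter. This is only an illustration under stated assumptions: the instruction tuples, the `run` function, and the single `eax` register are hypothetical and do not model real x86 encodings.

```python
# Toy fetch-decode-execute loop (hypothetical instruction format, not real x86).
def run(program, memory):
    regs = {"eax": 0}
    ip = 0  # instruction pointer
    while ip < len(program):
        instr = program[ip]       # fetch the next instruction
        ip += 1                   # then increment IP
        op, operand = instr       # decode
        if op == "load":          # memory operand needed: fetch its value
            regs["eax"] = memory[operand]
        elif op == "add":         # memory operand needed: fetch and add
            regs["eax"] += memory[operand]
        elif op == "store":       # result is a memory operand: store it
            memory[operand] = regs["eax"]
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs, memory

program = [("load", "A"), ("add", "B"), ("store", "C"), ("halt", None)]
regs, memory = run(program, {"A": 2, "B": 3, "C": 0})
print(memory["C"])  # 5
```

A real processor overlaps these steps and decodes binary opcodes, but the order of the steps is the same as in the loop above.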

17 The Platform We Will Use
Assembly language and machine language are processor specific. We will write code for Intel's 80x86 (x >= 3) IA-32 family: Intel 80386, 486, ..., Pentium, ... The assembler places its machine code into an object file, which is OS specific. Our code will run (only) on Windows, and it will crash on DOS. Our programs will be Win32 console applications: programs for which all I/O operations are character-based. They run in an MS-DOS box, but they are not DOS programs (they do not use DOS calls).

18 The Intel X86 Family
8086, 80286, 80386, 80486, Pentium, ... The instruction set of each x86 processor is backward compatible with all of its predecessors. New instructions are introduced with each new processor.

20 Basic Program Execution Registers
Registers: high-speed memories located in the CPU. Registers for the 8086 and 80286 are 16 bits wide. Registers for the IA-32 family are 32 bits wide.

21 General-Purpose Registers
8 registers are used for arithmetic and data movement. A register can be referred to by its 8-bit name, 16-bit name, or 32-bit name (for example AL/AH, AX, EAX); this applies to EAX, EBX, ECX, and EDX only.
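
The relation between the 32-bit, 16-bit, and 8-bit names can be sketched with bit masks. This is a minimal illustration; the function name `sub_registers` is hypothetical and the value is arbitrary.

```python
def sub_registers(eax):
    """Return the AX, AH, and AL views of a 32-bit EAX value."""
    eax &= 0xFFFFFFFF
    ax = eax & 0xFFFF        # low 16 bits of EAX
    ah = (ax >> 8) & 0xFF    # bits 8-15
    al = ax & 0xFF           # bits 0-7
    return ax, ah, al

ax, ah, al = sub_registers(0x12345678)
print(f"AX={ax:04X}h AH={ah:02X}h AL={al:02X}h")  # AX=5678h AH=56h AL=78h
```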

22 Index and Base Registers
Some registers have only a 16-bit name for their lower half. The EBP/ESP registers are used as pointers to the stack. The ESI/EDI registers are used for fast memory indexing.

23 Some Specialized Register Uses (1 of 2)
General-purpose: EAX, accumulator; ECX, loop counter; ESP, stack pointer; ESI and EDI, index registers; EBP, extended frame pointer (stack). Segment (each stores the address of a memory segment): CS, code segment; DS, data segment; SS, stack segment; ES, FS, GS, additional segments.

24 Segment Registers
Each program is subdivided into logical parts called SEGMENTS: the code segment (CS), the stack segment (SS), and the data segments (DS, ES, FS, and GS). Real-address mode: segment registers hold the "base address" of these program segments. Protected mode: segment registers hold pointers into a segment descriptor table. Segment registers are 16 bits wide.

25 Some Specialized Register Uses (2 of 2)
EIP (instruction pointer): stores the address of the next instruction to be executed; it is called IP on the 8086. EFLAGS: control flags control the operation of the CPU; status flags reflect the outcome of CPU operations. Each flag is a single binary bit: set flag = 1 and clear flag = 0.

26 EFLAGS's Status Flags
Carry CF: unsigned arithmetic out of range
Overflow OF: signed arithmetic out of range
Sign SF: result is negative
Zero ZF: result is zero
Auxiliary Carry AF: carry from bit 3 to bit 4
Parity PF: sum of 1 bits is an even number
Direction DF: (CPU control flag) process arrays up or down?
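
As an illustration, the status flags for an 8-bit addition can be computed as follows. This is a sketch for 8-bit operands only; the function name `add8_flags` is hypothetical, and the flag definitions follow the descriptions listed on this slide.

```python
def add8_flags(a, b):
    """Add two 8-bit values; return (result, flags) as an 8-bit ADD would set them."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": int(total > 0xFF),                               # unsigned out of range
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),  # signed out of range
        "SF": result >> 7,                                     # copy of the sign bit
        "ZF": int(result == 0),                                # result is zero
        "AF": int((a & 0x0F) + (b & 0x0F) > 0x0F),             # carry from bit 3 to bit 4
        "PF": int(bin(result).count("1") % 2 == 0),            # even number of 1 bits
    }
    return result, flags

# 7Fh + 01h = 80h: signed overflow (OF=1, SF=1) but no unsigned carry (CF=0)
print(add8_flags(0x7F, 0x01))
# FFh + 01h = 00h: unsigned carry (CF=1) and zero result (ZF=1), no signed overflow
print(add8_flags(0xFF, 0x01))
```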

27 Floating-Point UNIT, MMX, XMM Registers
Eight 80-bit floating-point data registers, ST(0), ST(1), ..., ST(7), arranged in a stack and used for all floating-point arithmetic. Eight 64-bit MMX registers. Eight 128-bit XMM registers for single-instruction multiple-data (SIMD) operations.

29 Logical and Physical Addresses
Addresses specify the location of instructions and data. Addresses that specify an absolute location in main memory are physical addresses (called linear addresses); they appear on the address bus. Addresses that specify a location relative to a point in the program are logical (or virtual) addresses; they are the addresses used in the code and are independent of the structure of main memory. Each logical address for the x86 consists of 2 parts: a segment number, used to specify a (logical) part of the program (it determines the physical address of the segment), and an offset number, used to specify a location relative to the beginning of the segment.

30 Segmented Memory
Segmented memory addressing: the absolute (linear) address is a combination of a 16-bit segment value added to a 16-bit offset.
[Figure: one segment within the range of linear addresses]

31 Address Translation and Running Modes
The translation from logical to physical addresses is done at run time. The way in which this address translation is done depends on the running mode of the x86. Two different running modes exist for the x86: real mode (supported by every x86) and protected mode (all x86 except the 8086). You will use protected mode.

32 Address Translation in Real Mode
The 16-bit segment number (contained in a segment register) is first multiplied by 16 to give the 20-bit physical address of the first byte of the referenced segment (Seg_adr). Then we add the 16-bit offset address (Off_adr) to obtain the 20-bit physical address of the referenced data (or instruction): Seg_adr + Off_adr. Ex: if CS contains 15A6h (in hexadecimal) and IP contains 0012h, then the physical address of the instruction to be executed next is just 15A60h + 0012h = 15A72h.

33 Calculating Linear Addresses
Given a segment address, multiply it by 16 (add a hexadecimal zero), and add it to the offset. Example: convert 08F1:0100 to a linear address.
Adjusted segment value: 0 8 F 1 0
Add the offset:           0 1 0 0
Linear address:         0 9 0 1 0
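
The real-mode calculation is short enough to check in code. This is a minimal sketch; `real_mode_linear` is a hypothetical name, and masking the result to 20 bits is an assumption that models the 20 address lines of the 8086.

```python
def real_mode_linear(segment, offset):
    """Real-mode translation: segment * 16 + offset, kept to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF

print(f"{real_mode_linear(0x15A6, 0x0012):05X}h")  # 15A72h
print(f"{real_mode_linear(0x08F1, 0x0100):05X}h")  # 09010h
```

Shifting left by 4 bits is the same as multiplying by 16, which is why the adjusted segment value is simply the segment with one hexadecimal zero appended.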

34 Characteristics of (Archaic) Real Mode
Real mode can address only up to 1 MB of physical memory (it uses a 20-bit address for the referenced segment). It does not support multitasking: only 1 process at a time is active. No protection is provided: a program can write anywhere (and corrupt the operating system). The 8086 runs only in this mode. DOS is a real-mode operating system. Our programs will not run in this archaic mode; they will run in protected mode, which does not suffer from any of these limitations.

35 Address Translation in Protected Mode
The logical/virtual address of a referenced word is given by a pair of numbers (segment, offset). The segment number is contained in a segment register and is used to select (or index) an entry in a segment table (called a descriptor table). Hence, a segment register is also called a selector. The selected entry (the descriptor) contains the base address and length of the referenced segment. The 32-bit base address is added to the 32-bit offset to form a 32-bit linear address, which is viewed as three fields (P1, P2, D). P1 indexes a page directory (in memory) to obtain the base address of a second page table, which is indexed by P2 to give the physical address of the page; D is the offset of the referenced word within that page.
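
Splitting a linear address into its three fields can be sketched as below. The slides name the fields P1, P2, and D but do not give their widths; the 10/10/12-bit split used here is an assumption based on standard IA-32 paging with 4096-byte pages, and `split_linear` is a hypothetical name.

```python
def split_linear(addr):
    """Split a 32-bit linear address into (P1, P2, D), assuming 10/10/12 bits."""
    p1 = (addr >> 22) & 0x3FF   # index into the page directory
    p2 = (addr >> 12) & 0x3FF   # index into the second-level page table
    d = addr & 0xFFF            # offset within the 4096-byte page
    return p1, p2, d

print(split_linear(0x00403004))  # (1, 3, 4)
```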

36 Intel 386 Address Translation
[Figure: Intel 386 address translation, with the linear address split into the fields P1, P2, and D]

37 Characteristics of Protected Mode
4 GB addressable RAM (00000000h to FFFFFFFFh). Each program is assigned a memory partition which is protected from other programs. Designed for multitasking. Supported by Linux and MS-Windows.

38 Characteristics of Protected mode
Segment descriptor tables. Program structure: code, data, and stack areas; CS, DS, SS segment descriptors; global descriptor table (GDT). MASM programs use the Microsoft flat memory model.

39 The FLAT Memory Model
The segmentation part is hidden from the programmer when the base address of each segment descriptor is the same. Each selector then points to the same segment, so that code, data, and stack share the same segment. Protection bits (read-only, read-write) in each descriptor can still be used. This is done by Windows, Linux, FreeBSD, ... The offset part of the logical address is then equivalent to the linear address (P1, P2, D), so only the offset part of the logical address is used to specify the location of a referenced word. The address space is then said to be FLAT. All our programs will use the FLAT memory model.

40 Flat Segment Model
Single global descriptor table (GDT). All segments are mapped to the entire 32-bit address space.

41 Multi-Segment Model
Each program has a local descriptor table (LDT), which holds a descriptor for each segment used by the program.

42 Paging
Supported directly by the CPU. Divides each segment into 4096-byte blocks called pages. The total size of all running programs can be larger than physical memory: part of a running program is in memory, part is on disk. Virtual memory manager (VMM): OS utility that manages the loading and unloading of pages. Page fault: issued by the CPU when a page must be loaded from disk.

43 What do these numbers represent?

45 Review: Data Representation
Binary numbers; translating between binary and decimal; binary addition; integer storage sizes; hexadecimal integers; translating between decimal and hexadecimal; hexadecimal subtraction; signed integers; binary subtraction; character storage.

46 Memory Units for the Intel x86
The smallest addressable unit is the BYTE: 1 byte = 8 bits. For the x86, the following units are used: 1 word = 2 bytes; 1 double word = 2 words (= 32 bits); 1 quad word = 2 double words.

47 Data Representation
To obtain the value contained in a block of memory we need to choose an interpretation. Ex: the memory content 01000001 can represent either the number 65 or the ASCII code of the character "A". Only the programmer can provide the interpretation.

48 Number Systems
A written number is meaningful only with respect to a base. To tell the assembler which base we use: hexadecimal 25 is written as 25h; octal 25 is written as 25o or 25q; binary 1010 is written as 1010b; decimal 1010 is written as 1010 or 1010d. You already know how to convert from one base to another (if not, review your class notes).

49 Binary Numbers
Digits are 1 and 0: 1 = true, 0 = false. MSB: most significant bit. LSB: least significant bit. Bit numbering: bits are numbered from 0, starting at the LSB.

50 Binary Numbers
Each digit (bit) is either 1 or 0. Each bit represents a power of 2. Every binary number is a sum of powers of 2.

51 Translating Binary to Decimal
Weighted positional notation shows how to calculate the decimal value of each binary bit: $dec = (D_{n-1} \times 2^{n-1}) + (D_{n-2} \times 2^{n-2}) + \cdots + (D_1 \times 2^1) + (D_0 \times 2^0)$, where D = binary digit. Binary 00001001 = decimal 9: $(1 \times 2^3) + (1 \times 2^0) = 9$.

52 Translating Unsigned Decimal to Binary
Repeatedly divide the decimal integer by 2. Each remainder is a binary digit in the translated value, produced from the least significant bit upward: 37 = 100101.
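
Both directions of the conversion can be sketched in a few lines. The function names are hypothetical; the first applies weighted positional notation and the second applies repeated division by 2 to the value 37 used on this slide.

```python
def binary_to_decimal(bits):
    """Sum each binary digit times its power of 2 (weighted positional notation)."""
    n = len(bits)
    return sum(int(ch) * 2 ** (n - 1 - i) for i, ch in enumerate(bits))

def decimal_to_binary(n):
    """Repeatedly divide by 2; the remainders are the bits, LSB first."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, rem = divmod(n, 2)
        digits.append(str(rem))
    return "".join(reversed(digits))

print(binary_to_decimal("00001001"))  # 9
print(decimal_to_binary(37))          # 100101
```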

53 Binary Addition
Starting with the LSB, add each pair of digits, and include the carry if present.

54 Integer Storage Sizes
Standard sizes: byte (8 bits), word (16 bits), doubleword (32 bits), quadword (64 bits). What is the largest unsigned integer that may be stored in 20 bits?

55 Hexadecimal Integers
Binary values are represented in hexadecimal.

56 Translating Binary to Hexadecimal
Each hexadecimal digit corresponds to 4 binary bits. Example: Translate the binary integer to hexadecimal:

57 Converting Hexadecimal to Decimal
Multiply each digit by its corresponding power of 16: $dec = (D_3 \times 16^3) + (D_2 \times 16^2) + (D_1 \times 16^1) + (D_0 \times 16^0)$. Hex 1234 equals $(1 \times 16^3) + (2 \times 16^2) + (3 \times 16^1) + (4 \times 16^0)$, or decimal 4,660. Hex 3BA4 equals $(3 \times 16^3) + (11 \times 16^2) + (10 \times 16^1) + (4 \times 16^0)$, or decimal 15,268.

58 Powers of 16
Used when calculating hexadecimal values up to 8 digits long:
16^0 = 1
16^1 = 16
16^2 = 256
16^3 = 4,096
16^4 = 65,536
16^5 = 1,048,576
16^6 = 16,777,216
16^7 = 268,435,456

59 Converting Decimal to Hexadecimal
Repeatedly divide the decimal integer by 16; each remainder is a hexadecimal digit, produced from the least significant digit upward: decimal 422 = 1A6 hexadecimal.
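
The hexadecimal conversions work the same way as the binary ones, with base 16 instead of 2. A minimal sketch with hypothetical function names, using the values 422 and 3BA4 from these slides:

```python
HEX_DIGITS = "0123456789ABCDEF"

def decimal_to_hex(n):
    """Repeatedly divide by 16; the remainders are the hex digits, lowest first."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, rem = divmod(n, 16)
        digits.append(HEX_DIGITS[rem])
    return "".join(reversed(digits))

def hex_to_decimal(text):
    """Multiply each hex digit by its corresponding power of 16 and sum."""
    n = len(text)
    return sum(HEX_DIGITS.index(ch) * 16 ** (n - 1 - i)
               for i, ch in enumerate(text.upper()))

print(decimal_to_hex(422))     # 1A6
print(hex_to_decimal("3BA4"))  # 15268
```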

60 Hexadecimal Addition
Divide the sum of two digits by the number base (16). The quotient becomes the carry value, and the remainder is the sum digit.
Carry:   1       1
        36  28  28  6A
      + 42  45  58  4B
        78  6D  80  B5
In the last column, A + B = 21 decimal, and 21 / 16 = 1, rem 5. Important skill: programmers frequently add and subtract the addresses of variables and instructions.
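
The digit-by-digit rule can be sketched as follows. The function name `hex_add` is hypothetical, and the sketch assumes both operands have the same number of digits.

```python
def hex_add(a, b):
    """Add two equal-length hex strings digit by digit, right to left."""
    digits = "0123456789ABCDEF"
    carry = 0
    out = []
    for x, y in zip(reversed(a.upper()), reversed(b.upper())):
        # quotient is the carry, remainder is the sum digit
        carry, rem = divmod(digits.index(x) + digits.index(y) + carry, 16)
        out.append(digits[rem])
    if carry:
        out.append(digits[carry])
    return "".join(reversed(out))

print(hex_add("6A", "4B"))  # B5
print(hex_add("28", "58"))  # 80
```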

61 Hexadecimal Subtraction
When a borrow is required from the digit to the left, add 16 (decimal) to the current digit's value: 16 + 5 = 21. Example: C6 75 - A2 47 = 24 2E. Practice: The address of var1 is 00400020. The address of the next variable after var1 is 0040006A. How many bytes are used by var1?

62 Integer Representations
Two different representations exist for integers. The signed representation: in that case the most significant bit (MSB) represents the sign; the number is positive (or zero) if MSB = 0 and negative if MSB = 1. The unsigned representation: in that case all the bits are used to represent a magnitude; it is thus always a positive number or zero.

63 Signed Integers
The highest bit indicates the sign: 1 = negative, 0 = positive. If the highest digit of a hexadecimal integer is > 7, the value is negative. Examples: 8A, C5, A2, 9D.

64 Forming the Two's Complement
Negative numbers are stored in two's complement notation, which represents the additive inverse. To form the two's complement, invert every bit and add 1. Note that 00000001 + 11111111 = 00000000 (ignoring the carry out of the highest bit).

65 Binary Subtraction
When subtracting A - B, convert B to its two's complement, then add A to (-B). Practice: Subtract 0101 from 1001.
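
Subtraction by adding the two's complement can be sketched for 8-bit values as below. The function names are hypothetical, and the operands 12 and 3 are chosen for illustration so that the practice question is left to the reader.

```python
def twos_complement(value, bits=8):
    """Invert every bit and add 1, keeping the result within `bits` bits."""
    mask = (1 << bits) - 1
    return ((value ^ mask) + 1) & mask

def subtract(a, b, bits=8):
    """Compute a - b as a + (-b), discarding the carry out of the top bit."""
    mask = (1 << bits) - 1
    return (a + twos_complement(b, bits)) & mask

print(f"{twos_complement(0b00000001):08b}")         # 11111111
print(f"{subtract(0b00001100, 0b00000011):08b}")    # 00001001
```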

66 Learn How To Do the Following:
Form the two's complement of a hexadecimal integer; convert signed binary to decimal; convert signed decimal to binary; convert signed decimal to hexadecimal; convert signed hexadecimal to decimal.

67 Ranges of Signed Integers
The highest bit is reserved for the sign. This limits the range: an n-bit signed integer can hold values from $-2^{n-1}$ to $2^{n-1} - 1$. Practice: What is the largest positive value that may be stored in 20 bits?

68 Signed and Unsigned Interpretation
To obtain the value of an integer in memory we need to choose an interpretation. Ex: a byte of memory containing 11111111b can represent either one of these numbers: -1 if a signed interpretation is used, or 255 if an unsigned interpretation is used. Only the programmer can provide an interpretation of the content of memory.
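
The two interpretations of the same byte can be shown directly. This is a small sketch with hypothetical helper names; the last line uses Python's built-in `int.from_bytes` as a cross-check.

```python
def as_unsigned(byte):
    """Interpret the low 8 bits as an unsigned magnitude."""
    return byte & 0xFF

def as_signed(byte):
    """Interpret the low 8 bits as a two's complement signed value."""
    byte &= 0xFF
    return byte - 256 if byte & 0x80 else byte

raw = 0b11111111
print(as_unsigned(raw))  # 255
print(as_signed(raw))    # -1
print(int.from_bytes(bytes([raw]), "little", signed=True))  # -1
```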

69 Maximum and Minimum Values
The MSB of a signed integer is used for its sign, so fewer bits are left for its magnitude. Ex: for a signed byte: smallest positive = 00000000b; largest positive = 01111111b = 127; largest negative = -1 = 11111111b; smallest negative = 10000000b = -128. Exercise 2: give the smallest and largest positive and negative values for A) a signed word and B) a signed double word.
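
The limits follow from the number of bits, so they can be computed rather than memorized. A minimal sketch with hypothetical function names, shown here for a byte only:

```python
def signed_range(bits):
    """Smallest and largest values of a two's complement integer of `bits` bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

def unsigned_range(bits):
    """Smallest and largest values of an unsigned integer of `bits` bits."""
    return 0, 2 ** bits - 1

print(signed_range(8))    # (-128, 127)
print(unsigned_range(8))  # (0, 255)
```

Calling the same functions with 16 or 32 gives a way to check answers for words and double words.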

70 Character Representation
Each character is represented by a 7-bit code called the ASCII code. ASCII codes run from 00h to 7Fh (h = hexadecimal). Only codes from 20h to 7Eh represent printable characters; the rest are control codes (used for printing, transmission, ...). An extended character set is obtained by setting the most significant bit (MSB) to 1 (codes 80h to FFh), so that each character is stored in 1 byte. This part of the code depends on the OS used. For Windows, we find accented characters, Greek symbols, and some graphic characters.

71 The ASCII Character Set
CR = “carriage return” (Windows: move to beginning of line). LF = “line feed” (Windows: move directly one line below). SPC = “blank space”.

72 Text Files
These are files containing only printable ASCII characters (for the text) and non-printable ASCII characters to mark each end of line. But different conventions are used for indicating an “end of line”: Windows: <CR>+<LF>; UNIX: <LF>; MAC: <CR>. This is at the origin of many problems encountered during transfers of text files from one system to another.
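
One common way to avoid these transfer problems is to normalize every end-of-line marker to a single convention. The sketch below converts all three conventions to <LF>; the function name `normalize_newlines` is hypothetical.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert <CR><LF> and lone <CR> end-of-line markers to <LF>."""
    # Replace the two-byte Windows marker first so its <CR> is not converted twice.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

print(normalize_newlines(b"one\r\ntwo\rthree\n"))  # b'one\ntwo\nthree\n'
```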

73 Character Storage
Character sets: Standard ASCII (0 – 127), Extended ASCII (0 – 255), ANSI (0 – 255), Unicode (0 – 65,535). Null-terminated string: array of characters followed by a null byte. Using the ASCII table: see the back inside cover of the book.

74 Numeric Data Representation
Pure binary: can be calculated directly. ASCII binary: a string of binary digits. ASCII decimal: a string of digits, such as "65". ASCII hexadecimal: a string of digits, such as "9C".

75 Boolean Operations
NOT, AND, OR; operator precedence; truth tables.

76 Boolean Algebra
Based on symbolic logic, designed by George Boole. Boolean expressions are created from NOT, AND, and OR.

77 NOT
Inverts (reverses) a boolean value. Truth table for the Boolean NOT operator:
X   NOT X
F   T
T   F
[Figure: digital gate diagram for NOT]

78 AND
Truth table for the Boolean AND operator:
X   Y   X AND Y
F   F   F
F   T   F
T   F   F
T   T   T
[Figure: digital gate diagram for AND]

79 OR
Truth table for the Boolean OR operator:
X   Y   X OR Y
F   F   F
F   T   T
T   F   T
T   T   T
[Figure: digital gate diagram for OR]

80 Operator Precedence
Examples showing the order of operations: NOT has the highest precedence, followed by AND, then OR.

81 Truth Tables (1 of 3)
A Boolean function has one or more Boolean inputs and returns a single Boolean output. A truth table shows all the inputs and outputs of a Boolean function. Example: ¬X ∨ Y

82 Truth Tables (2 of 3)
Example: X ∧ ¬Y

83 Truth Tables (3 of 3)
Example: (Y ∧ S) ∨ (X ∧ ¬S), a two-input multiplexer.
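
A truth table for the multiplexer can be generated by enumerating every input combination. This sketch assumes the function is (Y AND S) OR (X AND NOT S), the usual two-input multiplexer in which S selects between X and Y; the name `mux` is hypothetical.

```python
from itertools import product

def mux(x, y, s):
    """Two-input multiplexer: returns y when s is true, otherwise x."""
    return (y and s) or (x and not s)

print("X     Y     S     out")
for x, y, s in product([False, True], repeat=3):
    print(f"{x!s:5} {y!s:5} {s!s:5} {mux(x, y, s)}")
```

Each of the eight rows corresponds to one line of the truth table: when S is false the output copies X, and when S is true it copies Y.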

84 Summary
Assembly language helps you learn how software is constructed at the lowest levels. Assembly language has a one-to-one relationship with machine language. Each layer in a computer's architecture is an abstraction of a machine; layers can be hardware or software. Boolean expressions are essential to the design of computer hardware and software.

85 What do these numbers represent?
