© 2006 Pearson Education, Upper Saddle River, NJ 07458. All Rights Reserved. Brey: The Intel Microprocessors, 7e. Chapter 2: The Microprocessor and its Architecture.

Presentation transcript:

Chapter 2: The Microprocessor and its Architecture
The Intel 8086, 80X86, and Pentium Family

Contents
- Internal architecture of the microprocessor:
  - The programmer's model, i.e. the register model
  - The processor (organization) model
- Memory addressing with segmentation:
  - In the real mode
  - In the protected mode
- Memory addressing with paging

Objectives for this Chapter
- Describe the function and purpose of the program-visible registers
- Describe the Flags register and the purpose of the flag bits
- Describe how memory is accessed using segmentation in both the real mode and the protected mode
- Describe the program-invisible registers
- Describe the structures and operation of the memory paging mechanism
- Describe the organizational processor model
- Briefly review the evolution of the 80X86 architecture

The Intel Family
[Figure: the Intel processor family from 1978 to 2000, including microcontrollers. Addressable memory in bytes = 2^A, increasing across the family.]

Programming Model
- General-purpose registers
- Special-purpose registers
- Segment registers
- 80386 and above:
  - 32-bit registers (except the segment registers)
  - Two additional segment registers: FS and GS

General-Purpose Registers
- The top portion of the programming model contains the general-purpose registers: EAX, EBX, ECX, EDX, EBP, ESI, and EDI
- They can hold both data and address offsets
- Although general in nature, each has a special purpose and name:
  - EAX: accumulator. Also used as AX (16 bits), AH (8 bits), and AL (8 bits)
  - EBX: base index, often used to address memory (BX, BH, and BL)

General-Purpose Registers (continued)
  - ECX: count, for shifts, rotates, and loops (CX, CH, and CL)
  - EDX: data, used with multiply and divide (DX, DH, and DL)
  - EBP: base pointer, used to address stack data (BP)
  - ESI: source index (SI) for memory locations, e.g. with string instructions
  - EDI: destination index (DI) for memory locations
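
The way AX, AH, and AL overlay the low bits of EAX can be sketched in a few lines of Python. This is only an illustration of the bit layout described above; the function name `sub_registers` is made up for this example and does not come from the slides.

```python
def sub_registers(eax: int) -> dict:
    """Return the 16-bit and 8-bit views that overlay a 32-bit EAX value."""
    eax &= 0xFFFFFFFF          # keep only 32 bits
    ax = eax & 0xFFFF          # AX is the low 16 bits of EAX
    return {
        "EAX": eax,
        "AX": ax,
        "AH": (ax >> 8) & 0xFF,  # AH is the high byte of AX
        "AL": ax & 0xFF,         # AL is the low byte of AX
    }


views = sub_registers(0x12345678)
print({name: hex(value) for name, value in views.items()})
```

The same layout applies to EBX, ECX, and EDX; the pointer and index registers (EBP, ESI, EDI) have only a 16-bit alias in this model.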

Special-Purpose Registers
- ESP, EIP, and EFLAGS; each has a specific task:
  - ESP: stack pointer (SP). Offset to the top of the stack in the stack segment; used with SS. Used with procedure calls
  - EIP: instruction pointer (IP). Offset to the next instruction of the program in the code segment; used with CS
  - EFLAGS: indicates the latest conditions (state) of the microprocessor (FLAGS)

EFLAGS
[Figure: EFLAGS register bit layout (80386DX).]

The Flags
Basic flag bits (8086 etc.):
- C: carry/borrow from the last operation
- P: parity (little used today)
- A: auxiliary carry. Half-carry between bits 3 and 4, used with BCD arithmetic
- Z: zero
- S: sign
- O: overflow
- D: direction. Determines the auto-increment/decrement direction for the SI and DI registers with string instructions
- I: interrupt. Enables (using STI) or disables (using CLI) the processing of hardware interrupts arriving at the INTR input pin of the processor
- T: trap. Turns the trapping interrupt (for program debugging) on or off
Flag bits act as outputs, inputs, or both:
- Output bits are determined by the last operation
- Input bits are set or reset explicitly by the programmer
- Some flag bits can be both, e.g. the C flag
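
As a small illustration of the basic flag bits, the sketch below decodes a 16-bit FLAGS value into the nine flags listed above. The bit positions are the standard x86 ones (C = bit 0, P = 2, A = 4, Z = 6, S = 7, T = 8, I = 9, D = 10, O = 11); the helper names are made up for this example.

```python
# Standard x86 bit positions of the basic flags in the FLAGS register.
FLAG_BITS = {"C": 0, "P": 2, "A": 4, "Z": 6, "S": 7, "T": 8, "I": 9, "D": 10, "O": 11}


def decode_flags(flags: int) -> dict:
    """Map each flag letter to 0 or 1 for the given FLAGS value."""
    return {name: (flags >> bit) & 1 for name, bit in FLAG_BITS.items()}


def flags_set(flags: int) -> list:
    """List the letters of the flags that are set."""
    return [name for name, value in decode_flags(flags).items() if value]


print(flags_set(0x0246))  # parity, zero, and interrupt-enable are set
```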

Newer Flag Bits
- IOPL: 2-bit I/O privilege level in protected mode
- NT: nested task
- RF: resume flag (used with debugging)
- VM: virtual mode. Multiple DOS programs, each with a 1 MB memory partition, under Windows
- AC: alignment check. Detects a word or doubleword access at an address that is not on the proper boundary
- VIF: virtual interrupt flag
- VIP: virtual interrupt pending
- ID: indicates that the CPUID instruction is supported. CPUID gives information on the CPU version and manufacturer

Segment Registers
The segment registers are:
- CS (code)
- DS (data)
- ES (extra data; used as the destination for some string instructions)
- SS (stack)
- FS and GS: additional segment registers on the 80386 and above
Segment registers define the start of a section (segment) of memory for a program. Each register points to the start of a segment in memory. A segment is either:
- 64K (2^16) bytes of fixed length (real mode), or
- up to 4G (2^32) bytes of variable length (protected mode)
All code (programs) resides in a code segment.

Real Mode Memory Addressing
- Used by the DOS operating system
- The only mode available on the 8086: 20-bit address bus (1 MB), 16-bit data bus, 16-bit registers
- In later processors, real mode memory is the first 1M (2^20) bytes of the memory system (real, conventional, or DOS memory)
- Real mode 20-bit addresses are obtained by combining a segment number (in a segment register) with an offset address (in another processor register)
- The 16-bit segment number has 0H appended (equivalently, it is multiplied by 10H, i.e. 16 decimal) to form the 20-bit segment start address
- The effective memory address (EA) = this 20-bit segment start address + the 16-bit offset address
- For the 8086, the segment length is 2^16 = 64K bytes (determined by the size of the offset registers)

[Figure: a 64 KB segment inside the 1 MB memory. The 16-bit segment number in the segment register has 4 bits (0H) appended and is added to the 16-bit offset, giving the 20-bit (5 hex digit) physical memory address, the EA, of the byte accessed.]

Effective Address Calculations
EA = segment register (SR) x 10H + offset
(a) SR = 1000H: segment start = 10000H
(b) SR = AAF0H: AAF00H + 0134H = AB034H
(c) SR = 1200H: 12000H + FFF0H = 21FF0H
Q: Is 3FC81H a valid start address of a segment?
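
The calculation above can be checked with a short Python sketch. The function names are made up for this example, and the wrap to 20 bits is an assumption that matches the 8086's 20-bit address bus.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """EA = segment x 10H + offset, kept to 20 bits as on an 8086."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    return ((segment << 4) + offset) & 0xFFFFF


def is_valid_segment_start(address: int) -> bool:
    """A real-mode segment can start only on a 16-byte boundary."""
    return 0 <= address <= 0xFFFFF and (address & 0xF) == 0


print(hex(real_mode_address(0xAAF0, 0x0134)))  # example (b)
print(hex(real_mode_address(0x1200, 0xFFF0)))  # example (c)
print(is_valid_segment_start(0x3FC81))         # the question above
```

Under this rule 3FC81H cannot be a segment start, because its last hex digit is not 0.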

Overlapping Segments
- How to detect overlap? Compute the top of each segment and compare it with the start of the next one
- Top of CS: 090F0H + FFFFH = 190EFH
- Code should be limited to the portion of the code segment that lies below the start of the next segment, to avoid the effects of segment overlap
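
A minimal sketch of the overlap test, assuming fixed 64 KB real-mode segments and ignoring wrap-around at the 1 MB boundary. The function names and the second segment value (0A0FH) are illustrative and are not taken from the slide.

```python
SEGMENT_SIZE = 0x10000  # 64 KB, fixed in real mode


def segment_range(segment: int) -> tuple:
    """Return (first, last) physical address covered by a real-mode segment."""
    start = segment << 4
    return start, start + SEGMENT_SIZE - 1


def segments_overlap(seg_a: int, seg_b: int) -> bool:
    """Two segments overlap if their address ranges intersect."""
    a_start, a_end = segment_range(seg_a)
    b_start, b_end = segment_range(seg_b)
    return a_start <= b_end and b_start <= a_end


def usable_bytes(lower_seg: int, upper_seg: int) -> int:
    """Bytes of the lower segment that lie below the start of the upper one."""
    gap = (upper_seg - lower_seg) << 4
    return min(max(gap, 0), SEGMENT_SIZE)


print([hex(a) for a in segment_range(0x090F)])  # CS = 090FH, as on the slide
print(segments_overlap(0x090F, 0x0A0F), hex(usable_bytes(0x090F, 0x0A0F)))
```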

Defaults
Default segment numbers are held in:
- CS for program (code)
- SS for stack
- DS for data
- ES for string (destination) data
Default offset addresses that go with them:

Segment  16-bit offset (8086 etc.)            32-bit offset (80386 and above)                        Purpose
CS       IP                                   EIP                                                    Program
SS       SP, BP                               ESP, EBP                                               Stack
DS       BX, DI, SI, 8-bit or 16-bit number   EBX, EDI, ESI, EAX, ECX, EDX, 8-bit or 32-bit number   Data
ES       DI, with string instructions         EDI, with string instructions                          String destination

Convention example: EA = CS:[IP]. The segment number is in a segment register; the offset is a literal or is held in a CPU register.

Addressing Modes Summary

Segmentation: Pros and Cons
Advantages:
- Allows easy and efficient relocation of code and data
- To relocate code or data, only the number in the relevant segment register needs to be changed
Consequences:
- A program can be located anywhere in memory without any changes to it (addresses are not absolute, but offsets relative to the start of segments)
- The program writer need not worry about the actual memory structure (map) of the computer used to execute it
Disadvantages:
- Complex hardware for address generation
- Address computation delay on every memory access
- Software limitation: program size is limited by the segment size (64 KB with the 8086)

Limitations of the Real Mode Segmentation Scheme
- Segment size is fixed at, and limited to, 64 KB
- A segment cannot begin at an arbitrary memory address: with 20-bit memory addressing it can begin only at addresses ending in 0H, i.e. at 16-byte intervals
- The principle is difficult to apply to the 80286 and above, which use 24-bit and 32-bit addresses while the segment registers remain 16 bits wide (the segment number would need 00H or 0000H appended)
- No protection mechanisms: programs can overwrite operating system code segments and corrupt them
- Solution: use memory segmentation in the protected mode

Protected Mode Segmentation
Primarily, what is needed:
- Flexible definition of the segment starting address
- Flexible definition of the segment size
- Protection mechanisms that prevent programs from corrupting the code and data of each other and of the operating system

Basic Segmentation in the Protected Mode
[Figure: address translation. The segment number in the segment register (processor) selects a segment descriptor in the segment descriptor table (memory). The descriptor supplies the segment start address, the maximum allowed offset, and the access information, which are combined with the offset from the offset register.]
- The scheme also checks privileges and access rights to prevent programs from corrupting other programs or the operating system

Protected Mode: 80286 and above
- Domain of the Windows operating system
- 32-bit addressing: 4G of memory, with 2G for the system and 2G for the application
- Protected mode still uses segment and offset addresses, but:
  - Segment definition is through a more complex selector/descriptor mechanism (greater flexibility)
  - The offset address is 16 bits (286) or 32 bits (386 and above, e.g. the EIP register)
- Descriptors are placed in descriptor tables in main memory
- Protection is provided by restricting access to memory segments through:
  - Privilege levels
  - Access rights

Descriptors Specify Memory Segments
- The segment number (still in a 16-bit segment register, e.g. DS) defines the segment through a selector/descriptor, not directly as in real mode (less direct, but more flexible)
- 16-bit segment register = 13-bit descriptor selector + 1-bit descriptor table selector + 2-bit requested privilege level
- 2^13 = 8192 descriptors per table
- The table selector bit chooses between the global table (1 table, segments available to all tasks) and a local table (1 table for each task, segments local to each task)
- How many segments can be defined in total?
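
The three fields of the 16-bit selector can be pulled apart as in the sketch below. The field layout (index in bits 15 to 3, table selector in bit 2, RPL in bits 1 to 0) is the standard x86 one; the function and key names are made up for this example.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit protected-mode selector into its three fields."""
    selector &= 0xFFFF
    return {
        "index": selector >> 3,                            # 13-bit descriptor number
        "table": "LDT" if selector & 0b100 else "GDT",     # 1-bit table selector
        "rpl": selector & 0b11,                            # 2-bit requested privilege level
    }


print(decode_selector(0x0008))
print(decode_selector(0x001F))

# One global and one local table of 2**13 descriptors each are reachable.
print(2 * 2**13)
```

With one global and one local table visible to a task at a time, this gives 2 x 8192 = 16K segments, the figure used later for the 64 TB virtual space.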

8-byte Segment Descriptors
Each 8-byte segment descriptor entry in the table contains:
- Base address (start address of the segment); its size equals the width of the µP address bus
- Limit (maximum offset, i.e. the offset of the last address in the segment); segment size = 1 + limit
- Privilege level and access rights for this segment
So a segment can start at any location and have a specified length.
80286 descriptor:
- Base: 3 bytes (24-bit addressing)
- Limit: 2 bytes (16 bits); segment size 1 B to 64 KB
- Max limit = max offset
- Note the provision for upward compatibility (286 software runs on higher processors)
80386 and above descriptor:
- Base: 4 bytes (32-bit addressing)
- Limit: 2 1/2 bytes (20 bits); size 1 B to 1 MB. With the G (4K multiplier) bit = 1: 4 KB to 4 GB
- Max limit < max offset
- Additional bits: segment availability, and instruction mode (16/32 bits)
- The access rights byte contains the 2-bit descriptor privilege level

Protected Mode: 80386 and above (Pentium class)
- The base is a 32-bit address at which the memory segment starts
- The limit is a 20-bit number. When added to the base, it addresses the last location in the segment
- The limit has a modifier bit called granularity (G):
  - If G = 0: no change
  - If G = 1: the limit has FFFH appended, i.e. the segment size is multiplied by 4K
- With the limit specifying a 1 MB segment and G = 1 (i.e. the 4K multiplier): max segment size = 4K x 1 MB = 4 GB
- With 16K segments like this, the system can address 16K x 4 GB = 64 TB (not necessarily all of it in physical memory)

80386 and above: Example
Descriptor has: base = 23000000H, limit = 012FFH
With G = 0:
- Segment start = 23000000H
- Segment end = 23000000H + 012FFH = 230012FFH
- Segment size = 12FFH + 1H = 1300H (= 19 x 256 bytes)
With G = 1 (append FFFH to the limit in the descriptor, so the actual limit = 012FFFFFH):
- Segment start = 23000000H
- Segment end = 23000000H + 012FFFFFH = 242FFFFFH
- Segment size = 12FFFFFH + 1H = 1300000H = 2^12 x 1300H = 4K x 1300H
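
The effect of the G bit on the example above can be reproduced with a short sketch. The function name `segment_bounds` is made up for this example; it only models the base/limit arithmetic, not the rest of the descriptor.

```python
def segment_bounds(base: int, limit: int, g: int) -> tuple:
    """Return (start, end, size) of a segment from its base, 20-bit limit, and G bit."""
    limit &= 0xFFFFF
    # With G = 1 the limit has FFFH appended (4 KB granularity).
    effective_limit = ((limit << 12) | 0xFFF) if g else limit
    return base, base + effective_limit, effective_limit + 1


for g in (0, 1):
    start, end, size = segment_bounds(0x23000000, 0x012FF, g)
    print(g, hex(start), hex(end), hex(size))
```

With the largest limit (FFFFFH) and G = 1, the same function gives a size of 100000000H bytes, i.e. the 4 GB maximum.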

Protected Mode Segmentation Example
[Figure: a 16-bit segment register selects 8-byte segment descriptor #1 from the GDT in main memory (descriptors #0, #1, and #2 are shown, starting at the GDT base address). The descriptor holds the limit, a 24-bit base address, the access rights byte, and most significant bytes that are always 0s for upward compatibility. The offset is then applied within the selected segment.]
- Segment size = limit + 1 = FFH + 1 = 100H bytes
- Because each descriptor in the table is 8 bytes wide, the selector with 000b appended is used as an offset from the GDT (or LDT) base address to point to the start of the required segment descriptor (= segment #)
- Questions: What is the RPL value? What is the selector value? Are we using the global or the local descriptor table?
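
Locating a descriptor from a selector is a single multiply-and-add, sketched below. The table base address 100000H is a made-up value for illustration; the function name is also hypothetical.

```python
DESCRIPTOR_SIZE = 8  # each descriptor is 8 bytes wide


def descriptor_address(table_base: int, selector: int) -> int:
    """Address of the descriptor chosen by a selector in the GDT or LDT."""
    index = (selector & 0xFFFF) >> 3        # drop the table-selector and RPL bits
    return table_base + index * DESCRIPTOR_SIZE  # same as appending 000b to the index


GDT_BASE = 0x100000  # hypothetical table base, for illustration only
print(hex(descriptor_address(GDT_BASE, 0x0008)))  # descriptor #1
print(hex(descriptor_address(GDT_BASE, 0x0013)))  # descriptor #2, RPL = 3
```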

The Access Rights Byte: 80286 and higher*
- This is the access rights byte in the 8-byte segment descriptor
- DPL (descriptor privilege level; 00 = highest privilege) is compared with the requested privilege level (RPL) in the segment register specifying this segment. Access to the segment is allowed only if the RPL has higher or equal privilege relative to the DPL, subject to the state of the C bit if applicable
- E: not code (0) or code (1)
- Non-code segment (data or stack): ED = expand direction for the segment; W = write permission
- Code segment: C bit; R = read permission
* 80386 and higher have 4 more access rights bits

Privilege Levels
[Figure: hardware privilege comparator. The DPL (in the descriptor) is compared with the RPL (in the segment register); access to the segment is allowed when RPL <= DPL numerically.]
- 00: highest privilege
- 11: lowest privilege
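
The comparison can be written as a one-line check. This sketch follows the simplified rule on these slides (RPL against DPL only); real processors also take the current privilege level of the running code into account, which is left out here. The function name is made up for this example.

```python
def access_allowed(rpl: int, dpl: int) -> bool:
    """Simplified check from the slides: 0 is most privileged, 3 is least."""
    if not (0 <= rpl <= 3 and 0 <= dpl <= 3):
        raise ValueError("privilege levels are 2-bit values")
    return rpl <= dpl  # numerically smaller means more privileged


print(access_allowed(0, 3))  # a level-0 request for a level-3 segment
print(access_allowed(3, 0))  # a level-3 request for a level-0 segment
```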

Types of Descriptor Tables in Memory
One Global Descriptor Table (GDT): 64 KB max; start and limit are cached in GDTR. It holds:
1. Descriptors for all global segments, common to all tasks (cached on the processor for the 6 segments currently in use: CS, ..., GS)
2. For each task:
   - A descriptor for the task's task state segment (TSS) in memory. The TSS holds all information about the task, e.g. processor registers, LDT selector, etc. (The descriptor is cached on the processor for the currently running task, selected by the TR register)
   - A descriptor for the task's Local Descriptor Table (LDT) in memory. The LDT holds descriptors for all segments local to that task. (The descriptor is cached on the processor for the currently running task, selected by the LDTR register)
One Interrupt Descriptor Table (IDT): 64 KB max; start and limit are cached in IDTR
- 8-byte descriptors called "interrupt gates" define the attributes and starting addresses of the interrupt service routines for up to 256 hardware and software interrupts
Several local segment descriptor tables, one table for each task
- The GDT contains descriptors for these tables, as mentioned in 2 above

Types of Descriptor Tables in Memory (continued)
[Figure: the 16-bit selector in a segment register picks an 8-byte descriptor (base, limit, access for segment i), either from the global table or from the LDT of the current task. There is one LDT for each task (task 0 to task j); the GDT holds the base, limit, and access for LDT j, selected through LDTR. The interrupt number selects an interrupt descriptor. The physical address of the addressed byte is calculated from the base of segment i and the offset, within the code, data, stack, and extra segments of the memory system.]

Program-Invisible Registers (caches)
[Figure: visible and invisible µP registers and the descriptor tables in main memory.]
- Visible µP registers: the 16-bit segment selectors, the task selector (Task Register), and the LDT selector (LDT Register)
- Invisible µP registers (not seen by the programmer):
  - Segment descriptor cache for the 6 segments currently in use, loaded from the GDT or LDT tables in memory every time the segment number changes
  - Cache for the task state segment (TSS) descriptor and for the descriptor of the local segment descriptor table of the currently executing task, loaded from the GDT in memory every time the task changes
  - Base addresses (24 or 32 bits) and limits of the GDT and of the descriptor table for interrupts (IDT)
- In main memory: the Global Descriptor Table (GDT), holding descriptors for segments, tasks, and LDTs; the descriptor table for interrupts; each task's TSS and LDT
- The LDT for a task is accessed through a descriptor in the GDT

Memory Management Paging: 80386 and above
- The paging mechanism translates a logical (virtual, linear) address generated by the program into a physical (real) address that accesses a storage location in memory
- The address space consists of pages of bytes: virtual pages and physical pages (frames) of the same size (e.g. 4K bytes)
- Translation is done from virtual to physical pages
- Linear pages may or may not reside in physical memory. If a page is not in memory (a page fault occurs), it is brought into memory for use
- Paging applies to both real and protected modes in the 80386 and above
- Paging can be enabled or disabled (using bit 31 of control register CR0):
  - If disabled, the address computed with segmentation is the physical address
  - If enabled, paging operates on the virtual address obtained with segmentation to provide the physical address

Example of 1-Level Paging
[Figure: address translation in the Memory Management Unit (MMU). The 32-bit logical (linear, virtual, programmer) byte address from the processor is split into a logical page number and an offset. The page table maps the logical page number to a physical frame number, which is combined with the same offset to give the physical byte address sent to memory; if the page is not in memory it is fetched from the hard disk.]
- Logical pages and physical frames are each 4 KB; "frames" are physical pages
- The page table maps logical page numbers to the corresponding physical frame numbers
- The offset part is the same for both

2-Level Paging
- 2^32 = 4 GB address space = 1M pages x 4K bytes
- The 1-level approach requires a single contiguous page table (but maybe we do not need to do all translations!)
- 2-level paging uses several smaller page tables (up to 1K tables), each providing translation for 1K pages
- The smaller page tables can easily fit in memory, and we can use as many of them as needed
- The start address of each page table in use is kept in one directory table
- For 2-level paging, the 32-bit linear byte address (as generated by the processor) is divided into three parts:
  - Directory: 10 bits, determines which page table in the page table directory
  - Page table: 10 bits, determines which page in that page table
  - Memory offset: 12 bits, determines which byte in that page (the same for both logical and physical pages)
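
The 10/10/12 split of the linear address can be sketched as follows. The function name `split_linear` and the sample address are made up for this example.

```python
def split_linear(address: int) -> tuple:
    """Split a 32-bit linear address into (directory, page table, offset) fields."""
    address &= 0xFFFFFFFF
    directory = (address >> 22) & 0x3FF  # top 10 bits: which page table
    table = (address >> 12) & 0x3FF      # next 10 bits: which page in that table
    offset = address & 0xFFF             # low 12 bits: which byte in the page
    return directory, table, offset


print(split_linear(0x00C03FFF))  # directory 3, page 3, offset FFFH
```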

2-Level Memory Paging: 80386 and above
[Figure: the 32-bit linear byte address (2^32 = 4 G virtual bytes) is split into three fields. The first selects a page table number in the page table directory (1K x 4 bytes, in memory, located at the page table directory base address), which holds 1024 page table start addresses. The second selects a page number in that page table (1K x 4 bytes = 4 KB, in memory, 1024 page entries per table), where the linear-to-physical page mapping is done and which gives the start address of the physical page. The third is the offset of the addressed physical byte in that page. Pages are paged in and out of memory.]
- 1024 x 1024 = 1M pages of 4K bytes each
- 1024 x 1024 x 4K = 4 G physical bytes

80386 and above: Paging Control Registers
[Figure: layouts of the control registers on the µP, including one marked Pentium only.]
- Paging is controlled by four control registers, CR0 to CR3, on the µP
- CR0, bit 31: 1 = paging; 0 = no paging (the address generated by segmentation is taken as the physical address)
- CR2: linear address corresponding to the most recent page fault
- CR3: most significant 20 bits of the 32-bit start address of the page table directory; the remaining 12 bits are set to 0s (000H)

Formats
Format for the linear address:
- 10 bits: page table number
- 10 bits: page number in the table
- 12 bits: offset
Format for an entry in the page table directory or in a page table (each entry is 32 bits, i.e. 4 bytes):
- 20 bits: most significant bits of the 32-bit start address of a page table or of a physical page (the low 12 bits of the address are 000H)
- 12 bits: attribute bits for the corresponding page table or page in memory
If the page is not in memory, a page fault interrupt occurs to bring it in from mass storage.

Memory Paging: 80386 and above
[Figure: two-level translation of a 32-bit linear address (from segmentation; 4 G virtual bytes) into a physical byte address.]
- CR3 supplies the upper 20 bits of the base address of the page directory; 000H (12 bits) is appended
- The 10-bit directory field, with 00b appended (4 bytes per entry), is added to that base to select one of the 1024 page table entries in the directory
- The selected directory entry supplies the upper 20 bits of a page table base (000H appended); the 10-bit page field, with 00b appended, is added to select one of the 1024 page entries
- The selected page entry supplies the 20-bit physical page number; the 12-bit offset is appended to address one byte in the 4K byte page
- 1024 x 1024 = 1M pages of 4K bytes; 1024 x 1024 x 4K = 4 G physical bytes
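
The two-level walk described above can be sketched with a dictionary standing in for physical memory. This is a simplified model under stated assumptions: entries are 32-bit values keyed by their physical address, bit 0 of an entry is the present bit, all other attribute bits are ignored, and the addresses in the example (directory at 10000H, page table at 11000H, physical page C8000H) are made up for illustration.

```python
PRESENT = 0x1            # bit 0 of a directory or page table entry
ADDRESS_MASK = 0xFFFFF000  # upper 20 bits hold the table or page address


class PageFault(Exception):
    """Raised when a directory or page table entry is not present."""


def translate(linear: int, cr3: int, memory: dict) -> int:
    """Walk the page directory and page table to get a physical address."""
    directory = (linear >> 22) & 0x3FF
    table = (linear >> 12) & 0x3FF
    offset = linear & 0xFFF

    pde = memory.get((cr3 & ADDRESS_MASK) + directory * 4, 0)
    if not pde & PRESENT:
        raise PageFault(hex(linear))
    pte = memory.get((pde & ADDRESS_MASK) + table * 4, 0)
    if not pte & PRESENT:
        raise PageFault(hex(linear))
    return (pte & ADDRESS_MASK) | offset


# Hypothetical memory image: directory entry 0 points to a page table at 11000H,
# whose entry 1 maps linear page 1 to physical page C8000H.
memory = {0x00010000: 0x00011001, 0x00011004: 0x000C8001}
print(hex(translate(0x00001FFF, 0x00010000, memory)))
```

Any linear address whose directory or page entry is missing from the dictionary raises `PageFault`, which stands in for the page fault interrupt.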

Memory Paging: Example
[Figure: worked translation of a linear (logical) byte address into the corresponding physical byte address. The base address of the page table directory has 000H appended; the directory holds the base addresses for page tables 0, 1, 2, and so on; the selected entry of page table 0 gives the physical page, and the 12-bit offset selects the physical byte within that 4K byte page.]

Memory Space Required for the Page Directory and the Page Tables
To page the full linear address space of 4 GB:
- Each page is 4 KB, so we need to map (translate addresses for) 1M pages
- Each page table holds translations for 1K pages
- Number of page tables required = 1M pages / 1K pages = 1K tables
- Page tables: 1K tables x (1K x 4 bytes) = 4 MB = 4096 KB
- Page table directory: 1K x 4 bytes = 4 KB
- Total: 4100 KB (decimal)
This is a considerable amount of memory, so some operating systems do not support paging for the total memory space. For example, Windows 3.1 pages only 16 MB (i.e. only 16M/4K = 4K pages):
- This requires only 4 page tables, occupying 4 x (1K x 4 B) = 16 KB of memory
- The page directory table uses 4 x 4 B = 16 B
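
The arithmetic above can be reproduced with a small helper. The function and key names are made up for this example; the page size, entry size, and entries per table are the values used on the slide.

```python
def paging_overhead(mapped_bytes: int, page_size: int = 4096,
                    entry_size: int = 4, entries_per_table: int = 1024) -> dict:
    """Memory needed for page tables and directory entries to map a linear range."""
    pages = mapped_bytes // page_size
    tables = -(-pages // entries_per_table)  # ceiling division
    return {
        "pages": pages,
        "tables": tables,
        "table_bytes": tables * entries_per_table * entry_size,
        "directory_bytes": tables * entry_size,  # one directory entry per table
    }


print(paging_overhead(4 * 1024**3))   # full 4 GB linear space
print(paging_overhead(16 * 1024**2))  # 16 MB, as in the Windows 3.1 example
```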

Speeding Up the Paging Mechanism
- Paging requires accessing the page table directory and a page table (in main memory) to generate the physical address of the required memory location. This slows down memory access
- To speed this up, a fast associative mapping cache memory stores the most recent page address translations, which are also likely to be needed in the near future
- The processor uses a 32-entry TLB (translation look-aside buffer) to hold the physical addresses of the 32 most recently used memory pages (translation = mapping from logical to physical)
Scenarios:
1. Hit in the TLB cache: use the address translation in the TLB
2. Miss in the TLB, hit in the page tables: get the translation from the page tables in memory and place it in the TLB (e.g. by replacing the least-recently-used entry there)
3. Miss in the TLB, miss in the page tables (page fault): bring the page from mass storage into physical memory and update both the page tables and the TLB
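
A minimal sketch of scenarios 1 and 2, using a 1-level page table (a plain dictionary) and least-recently-used replacement. The class name `TLB`, its attributes, and the small capacity used in the example are made up for illustration; a missing page table entry simply raises `KeyError`, standing in for the page fault of scenario 3.

```python
from collections import OrderedDict


class TLB:
    """Tiny translation look-aside buffer with least-recently-used replacement."""

    def __init__(self, capacity: int = 32):
        self.capacity = capacity
        self.entries = OrderedDict()  # logical page number -> physical frame number
        self.hits = 0
        self.misses = 0

    def lookup(self, page: int, page_table: dict) -> int:
        if page in self.entries:            # scenario 1: TLB hit
            self.entries.move_to_end(page)  # mark as most recently used
            self.hits += 1
            return self.entries[page]
        self.misses += 1                    # scenario 2: TLB miss, page table hit
        frame = page_table[page]            # KeyError here stands in for a page fault
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        self.entries[page] = frame
        return frame


tlb = TLB(capacity=2)
page_table = {0: 10, 1: 11, 2: 12}
for page in (0, 1, 0, 2, 1):
    tlb.lookup(page, page_table)
print(tlb.hits, tlb.misses, list(tlb.entries))
```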

TLB Operation: Speeding Up Paging
[Figure: flow of a translation through the TLB, assuming 1-level page tables for simplicity. (1) The TLB is searched associatively; on a hit it supplies the corresponding physical frame number. (2) On a TLB miss with a page table hit, the entry is added to the TLB. (3) On a TLB miss and a page table miss (P bit = 0), a page fault brings the page into memory, and an entry is added to both the page table and the TLB.]
- Note: frame = physical page
- Note: the TLB provides a 1-level mapping

Organizational Model of the Processor
- Functional aspects: how the processor actually functions
- The internal organization is determined by the functionality required
[Figure: a microprocessor-based system, e.g. a microcomputer. The microprocessor connects to memory and I/O devices through the external buses, including the control bus.]
Two main tasks for the microprocessor in a µP-based system:
1. Interface with external peripherals
2. Execute instructions

The 8086 Processor Model (Organization)
Early pipelining attempts. Two main functional units:
- The Bus Interface Unit (BIU)
- The Execution Unit (EU)
The BIU generates memory and I/O addresses for reading code and transferring data to and from the processor.
The EU receives code and data from the BIU, executes the instructions, and stores results in the general-purpose registers.
Pipelined architecture: two, hopefully independent, operations are executed at the same time by two separate units:
- Fetch, by the BIU
- Execute, by the EU

The 8086 Processor Model
[Figure: block diagram of the BIU and the EU (with its ALU), joined by a FIFO instruction/operand queue and connected to the external µP buses and control.]
- FIFO instruction/operand queue: the BIU fills it by fetches from memory; the EU empties it by executing instructions
- Interfacing (BIU): generates all timing and control signals for reads, writes, etc.; synchronizes data transfers with all system modules
- Execution (EU): recognizes, decodes, and executes fetched program instructions; has no direct interaction with the external µP buses

The 8086 Processor Model: Pipeline Inefficiency
Common scenarios that cause pipeline inefficiency:
- The operand is not in the queue
- Jump or branch instructions
- Long-executing instructions, e.g. 83 clock cycles for execution vs. 4 cycles for a fetch: the BIU fills the buffer and then waits idly
[Figure: timing diagrams comparing non-pipelined operation with the pipelined 8086 (fetch-execute overlap), with wasted fetches and executes marked by *. Operand case: 1. operand not in queue, 2. fetch operand, 3. execute. Jump case: a. the instruction turns out to be a jump, so the fetches after it are wasted, b. start fetching at the correct target location, c. execute at last.]
RISC and modern architectures:
- Reduce fetches from memory (operate mostly on registers)
- Speed up memory fetches (cache)
- Use small instructions (both in length and in execution time)
- Finer pipeline stages (super pipelining: 486)
- Multiple pipelines (superscalar: P5)
- Predict how the jump will go (branch prediction)
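
The benefit and the limits of fetch-execute overlap can be illustrated with a deliberately simple timing model. This is only a sketch under strong assumptions: every fetch takes the same number of cycles, the BIU prefetches exactly one instruction ahead while the EU executes, and there are no jumps or operand fetches. The function names are made up for this example.

```python
def non_pipelined_cycles(fetch: int, exec_times: list) -> int:
    """Each instruction is fetched, then executed, with no overlap."""
    return sum(fetch + e for e in exec_times)


def pipelined_cycles(fetch: int, exec_times: list) -> int:
    """Fetch of the next instruction overlaps execution of the current one."""
    if not exec_times:
        return 0
    total = fetch  # the first fetch cannot be overlapped
    for i, e in enumerate(exec_times):
        is_last = i == len(exec_times) - 1
        # While executing, the next fetch runs in parallel; the slower one dominates.
        total += e if is_last else max(e, fetch)
    return total


print(non_pipelined_cycles(4, [4, 4, 4]), pipelined_cycles(4, [4, 4, 4]))
print(non_pipelined_cycles(4, [83, 4]), pipelined_cycles(4, [83, 4]))
```

With balanced stages the overlap saves a third of the cycles in the first case, while the 83-cycle instruction in the second case leaves the fetch unit idle for most of the time, so the gain is small.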

Evolution of the 80X86 Intel Processors
[Figure: evolution of the 80X86 processors toward greater throughput (e.g. in MIPS, MFLOPS): 2-stage pipelining with 1 execution unit; super pipelining with 5-stage pipelines; superscalar P5 with 2 execution units; Pentium Pro, Pentium II, and III with 12-stage pipelining and 3 execution units; multi-core architecture, e.g. Itanium 2, with multiple processors on chip.]

Example: 486 Processor Model
[Figure: block diagram of the 486 processor with the EU, on-chip units, and main memory; colored units are new additions, including super pipelining.]

Chapter 2 Summary
- Described the µP programming model and the purpose and function of the program-visible registers
- Described the Flags register and the purpose of each flag bit
- Described how memory is accessed using segmentation, both in the real mode and in the protected mode
- Described the program-invisible registers
- Described the structures and operation of the memory paging mechanism
- Described the organizational model of the 8086 µP
- Reviewed the evolution of the 80X86 architecture: pipelining, super pipelining, superscalar, multi-core