© 2006 Pearson Education, Upper Saddle River, NJ 07458. All Rights Reserved. Brey: The Intel Microprocessors, 7e. Chapter 2: The Microprocessor and its Architecture.


1 Chapter 2: The Microprocessor and its Architecture. The Intel 8086, 80X86, and Pentium Family

2 Contents
- Internal architecture of the microprocessor:
  - The programmer's model, i.e. the register model
  - The processor (organization) model
- Memory addressing with segmentation:
  - In the real mode
  - In the protected mode
- Memory addressing with paging

3 Objectives for this Chapter
- Describe the function and purpose of program-visible registers
- Describe the Flags register and the purpose of flag bits
- Describe how memory is accessed using segmentation in both the real mode and the protected mode
- Describe the program-invisible registers
- Describe the structures and operation of the memory paging mechanism
- Describe the organizational processor model
- Briefly review the evolution of the 80X86 architecture

4 The Intel Family
[Figure: the Intel processor family from 1978 to 2000, including microcontrollers. Addressable memory in bytes (= 2^A) increases along the family.]

5 Programming Model
[Figure: programming model showing the general-purpose registers, the special-purpose registers, and the segment registers.]
80386 and above:
- 32-bit registers (except the segment registers)
- Two additional segment registers: FS and GS

6 General-Purpose Registers
- The top portion of the programming model contains the general-purpose registers: EAX, EBX, ECX, EDX, EBP, ESI, and EDI
- They can carry both data and address offsets
- Although general in nature, each has a special purpose and name:
  - EAX: accumulator. Also used as AX (16-bit), AH (8-bit), and AL (8-bit)
  - EBX: base index, often used to address memory (BX, BH, and BL)

7 General-Purpose Registers (continued)
  - ECX: count, for shifts, rotates, and loops (CX, CH, and CL)
  - EDX: data, used with multiply and divide (DX, DH, and DL)
  - EBP: base pointer, used to address stack data (BP)
  - ESI: source index (SI) for memory locations, e.g. with string instructions
  - EDI: destination index (DI) for memory locations
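The way AX, AH, and AL overlay EAX can be illustrated with a little bit masking. This is only a sketch in Python, not processor code; the function name `sub_registers` is made up for the illustration.

```python
def sub_registers(eax: int) -> dict:
    """Split a 32-bit EAX value into its AX, AH, and AL views."""
    eax &= 0xFFFFFFFF      # keep only 32 bits
    ax = eax & 0xFFFF      # AX is the low 16 bits of EAX
    ah = (ax >> 8) & 0xFF  # AH is bits 15..8
    al = ax & 0xFF         # AL is bits 7..0
    return {"EAX": eax, "AX": ax, "AH": ah, "AL": al}


if __name__ == "__main__":
    for name, value in sub_registers(0x12345678).items():
        print(name, hex(value))
```

The same masks apply to EBX, ECX, and EDX; EBP, ESI, and EDI only have a 16-bit view (BP, SI, DI).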

8 Special-Purpose Registers
ESP, EIP, and EFLAGS. Each has a specific task:
- ESP: stack pointer. Offset to the top of the stack in the stack segment (used with SS). Used with procedure calls (SP)
- EIP: instruction pointer. Offset to the next instruction in a program in the code segment (used with CS) (IP)
- EFLAGS: indicates the latest conditions (state) of the microprocessor (FLAGS)

9 EFLAGS
[Figure: EFLAGS register bit layout; label: 80386DX.]

10 The Flags
Basic flag bits (8086 etc.):
- C: carry/borrow from the last operation
- P: parity flag (little used today)
- A: auxiliary flag, the half-carry between bits 3 and 4, used with BCD arithmetic
- Z: zero
- S: sign
- O: overflow
- D: direction. Determines the auto-increment/decrement direction for the SI and DI registers with string instructions
- I: interrupt. Enables (using STI) or disables (using CLI) the processing of hardware interrupts arriving at the INTR input pin of the processor
- T: trap. Turns the trapping interrupt (for program debugging) on or off
Flag bits are of two kinds:
- Output bits: determined by the last operation
- Input bits: set or reset explicitly by the programmer
- Some flag bits can be both, e.g. the C flag
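To make the output flags concrete, here is a small Python sketch that computes C, P, A, Z, S, and O for an 8-bit addition. The function name `add8_flags` is hypothetical; the flag definitions follow the usual x86 rules (P is set when the low byte of the result has an even number of 1 bits).

```python
def add8_flags(a: int, b: int) -> dict:
    """Add two 8-bit values and report the result with the status flags."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    return {
        "result": result,
        "C": int(total > 0xFF),                              # carry out of bit 7
        "P": int(bin(result).count("1") % 2 == 0),           # even parity
        "A": int((a & 0xF) + (b & 0xF) > 0xF),               # half-carry, bit 3 to 4
        "Z": int(result == 0),                               # result is zero
        "S": result >> 7,                                    # copy of bit 7
        "O": int(((a ^ result) & (b ^ result) & 0x80) != 0)  # signed overflow
    }


if __name__ == "__main__":
    print(add8_flags(0x7F, 0x01))  # signed overflow: 127 + 1
    print(add8_flags(0xFF, 0x01))  # unsigned carry: 255 + 1
```

The D, I, and T flags are not produced by arithmetic; they are input bits that the programmer sets or clears.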

11 Newer Flag Bits
- IOPL: 2-bit I/O privilege level in protected mode
- NT: nested task
- RF: resume flag (used with debugging)
- VM: virtual mode. Multiple DOS programs, each with a 1 MB memory partition, in Windows
- AC: alignment check. Detects addressing memory on the wrong boundary for words/doublewords
- VIF: virtual interrupt flag
- VIP: virtual interrupt pending
- ID: the CPUID instruction is supported. The instruction gives information on the CPU version and manufacturer

12 Segment Registers
The segment registers are:
- CS (code)
- DS (data)
- ES (extra data; used as the destination for some string instructions)
- SS (stack)
- FS and GS: additional segment registers on the 80386 and above
Segment registers define the start of a section (segment) of memory for a program. A segment is either:
- 64K (2^16) bytes of fixed length (real mode), or
- Up to 4G (2^32) bytes of variable length (protected mode)
All code (programs) resides in a code segment. Each register points to the start of a segment in memory.

13 Real Mode Memory Addressing
- Used by the DOS operating system
- The only mode available on the 8086/8088: 20-bit address bus (1 MB), 16-bit data bus (8-bit external data bus on the 8088), 16-bit registers
- In later processors, real mode memory is the first 1M (2^20) bytes of the memory system (real, conventional, DOS memory)
- Real mode 20-bit addresses are obtained by combining a segment number (in a segment register) and an offset address (in another processor register)
- The 16-bit segment register value has 0H (0000 in binary) appended to it, i.e. it is multiplied by 10H (16 decimal), to form the 20-bit start-of-segment address
- The effective memory address (EA) = this 20-bit segment start address + the 16-bit offset address in another processor register
- For the 8086, the segment length is fixed at 2^16 = 64K bytes (determined by the size of the offset registers)

14 Real Mode Address Formation
[Figure: the 16-bit segment number in the segment register, with 4 bits (0H) appended, is added to the 16-bit offset to give the 20-bit (5 hex digit) physical memory address, the EA of the byte accessed within the 1 MB memory. Each segment spans 64 KB.]

15 Effective Address Calculations
EA = segment register (SR) x 10H + offset
(a) SR: 1000H, offset 0023H: 10000H + 0023H = 10023H
(b) SR: AAF0H, offset 0134H: AAF00H + 0134H = AB034H
(c) SR: 1200H, offset FFF0H: 12000H + FFF0H = 21FF0H
Q: Is 3FC81H a valid start address of a segment?
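The calculation on this slide can be checked with a few lines of Python. This is a minimal sketch; the function names are invented for the illustration, and the 20-bit mask models the 8086, whose address bus has only 20 lines.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Return the 20-bit address for a 16-bit segment and 16-bit offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Appending 0H to the segment number is a shift left by 4 bits (x 10H).
    return ((segment << 4) + offset) & 0xFFFFF


def is_valid_segment_start(address: int) -> bool:
    """A real-mode segment can only start on a 16-byte boundary."""
    return 0 <= address <= 0xFFFF0 and address % 16 == 0


if __name__ == "__main__":
    for seg, off in [(0x1000, 0x0023), (0xAAF0, 0x0134), (0x1200, 0xFFF0)]:
        print(f"{seg:04X}:{off:04X} -> {real_mode_address(seg, off):05X}")
    print(is_valid_segment_start(0x3FC81))
```

Under this model, a start address is valid only if its last hex digit is 0, which is the property the slide's question is probing.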

16 Overlapping segments
- How to detect overlap?
- Top of CS: 090F0H + FFFFH = 190EFH
- Code should be limited to only the non-overlapped portion of the code segment, to avoid the effects of segment overlap
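One way to detect overlap is to compare segment start addresses: two 64 KB real-mode segments overlap when their starts are less than 64 KB apart. The sketch below assumes that rule and ignores wraparound at the top of the 1 MB space; the function names are hypothetical.

```python
SEGMENT_SIZE = 0x10000  # 64 KB, fixed in real mode


def segment_range(segment: int) -> tuple:
    """Return (first, last) address covered by a real-mode segment."""
    start = (segment & 0xFFFF) << 4
    return start, start + SEGMENT_SIZE - 1


def segments_overlap(seg_a: int, seg_b: int) -> bool:
    """True if the two 64 KB segments share at least one byte."""
    start_a, _ = segment_range(seg_a)
    start_b, _ = segment_range(seg_b)
    return abs(start_a - start_b) < SEGMENT_SIZE


if __name__ == "__main__":
    first, last = segment_range(0x090F)
    print(f"{first:05X}..{last:05X}")
    print(segments_overlap(0x090F, 0x0A0F))
```

For the slide's code segment, `segment_range(0x090F)` gives 090F0H to 190EFH, matching the sum shown above.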

17 Defaults
Default segment numbers are in:
- CS for program (code)
- SS for stack
- DS for data
- ES for string (destination) data
Default offset addresses that go with them:
Segment | Offset (16-bit): 8088, 8086, 80286 | Offset (32-bit): 80386 and above | Purpose
CS | IP | EIP | Program
SS | SP, BP | ESP, EBP | Stack
DS | BX, DI, SI, 8-bit or 16-bit number | EBX, EDI, ESI, EAX, ECX, EDX, 8-bit or 32-bit number | Data
ES | DI, with string instructions | EDI, with string instructions | String destination
Convention example: EA = CS:[IP], where the segment number is in a segment register and the offset is a literal or is in a CPU register.

18 Addressing Modes Summary

19 Segmentation: Pros and Cons
Advantages:
- Allows easy and efficient relocation of code and data
- To relocate code or data, only the number in the relevant segment register needs to be changed
Consequences:
- A program can be located anywhere in memory without making any changes to it (addresses are not absolute, but offsets relative to the start of segments)
- The program writer need not worry about the actual memory structure (map) of the computer used to execute it
Disadvantages:
- Complex hardware for address generation
- Address computation delay for every memory access
- Software limitation: program size is limited by the segment size (64 KB with the 8086)

20 Limitations of the above real mode segmentation scheme
- Segment size is fixed at, and limited to, 64 KB
- A segment cannot begin at an arbitrary memory address: with 20-bit memory addressing, it can only begin at addresses ending in 0H, i.e. at 16-byte intervals
- The principle is difficult to apply to the 80286 and above, which use 24-bit and 32-bit addresses but still have 16-bit segment registers (the figure shows 00H and 0000H as what would have to be appended)
- No protection mechanisms: programs can overwrite operating system code segments and corrupt them
Solution: use memory segmentation in the protected mode

21 Protected Mode Segmentation
Primarily, what is needed:
- Flexible definition of the segment starting address
- Flexible definition of the segment size
- Protection mechanisms that prevent programs from corrupting the code and data of each other and of the operating system

22 Basic Segmentation in the Protected Mode
[Figure: address translation between the processor and memory. The segment number (from the segment register) indexes the segment descriptor table; the selected segment descriptor supplies the access information, the segment start address, and the maximum allowed offset. The offset (from the offset register) is combined with the segment start address.]
The scheme also checks privileges and access rights to prevent programs from corrupting other programs or the operating system.

23 Protected Mode: 80286 and above
- Domain of the Windows operating system
- 32-bit addressing: 4G of memory, with 2G for the system and 2G for the application
- Protected mode still uses segment and offset addresses, but:
  - Segment definition is through a more complex selector/descriptor mechanism (greater flexibility)
  - Offset address: 16 bits (286) or 32 bits (386 and above, e.g. the EIP register)
- Descriptors are placed in descriptor tables in main memory
- Protection is provided by restricting access to memory segments through:
  - Privilege levels
  - Access rights

24 Descriptors specify memory segments
- The segment number (still in a 16-bit segment register, e.g. DS) defines the segment through a selector/descriptor mechanism, not directly as in real mode (less direct, but more flexible)
- 16-bit segment register = 13-bit descriptor selector + 1-bit descriptor table selector + 2-bit requested privilege level
- The 13-bit selector addresses 2^13 = 8192 descriptors per table
- The table selector bit chooses between the global descriptor table (1 table, segments available to all tasks) and the local descriptor table (1 table for each task, segments local to each task)
- Q: How many segments can be defined in total?
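The three fields of a protected-mode segment register can be pulled apart with shifts and masks. The sketch below uses the standard x86 layout (bits 15..3 index, bit 2 table indicator with 0 = GDT and 1 = LDT, bits 1..0 RPL); the names `Selector`, `decode_selector`, and `descriptor_offset` are made up for the illustration.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int  # 13-bit descriptor number within the table
    table: str  # "GDT" or "LDT"
    rpl: int    # 2-bit requested privilege level


def decode_selector(value: int) -> Selector:
    """Split a 16-bit segment register value into its three fields."""
    value &= 0xFFFF
    index = value >> 3
    table = "LDT" if (value >> 2) & 1 else "GDT"
    rpl = value & 0b11
    return Selector(index, table, rpl)


def descriptor_offset(value: int) -> int:
    """Byte offset of the descriptor from the table base (8 bytes each)."""
    return decode_selector(value).index * 8


if __name__ == "__main__":
    print(decode_selector(0x000F))
    print(descriptor_offset(0x000F))
```

With 8192 descriptors in each of the two tables visible to a task, the index and table bits together can name 2 x 8192 = 16384 segments.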

25 8-byte Segment Descriptors
Each 8-byte segment descriptor entry in the table contains:
- Base address (start address of the segment); its size matches the µP address bus
- Limit (maximum offset, i.e. the offset of the end address of the segment); segment size = 1 + limit
- Privilege level and access rights to this segment, including the 2-bit descriptor privilege level
So a segment can start at any location and have a specified length.
80286 descriptor:
- Base: 3 bytes, giving 24-bit addressing
- Limit: 2 bytes (16 bits), giving segment sizes from 1 B to 64 KB (max limit = max offset)
- Note the provision for upward compatibility (286 software runs on higher processors)
80386 and above descriptor:
- Base: 4 bytes, giving 32-bit addressing
- Limit: 2 1/2 bytes (20 bits), giving segment sizes from 1 B to 1 MB (max limit < max offset); with the G (4K multiplier) bit = 1: 4 KB to 4 GB
- Additional bits: segment availability, and instruction mode (16/32 bits)

26 Protected Mode: 80386 and above (Pentium class)
- The base is a 32-bit address at which the memory segment starts
- The limit is a 20-bit number. When added to the base, it addresses the last location in the segment
- The limit has a modifier bit called Granularity (G):
  - If G = 0: no change
  - If G = 1: FFFH is appended to the limit, i.e. the segment size is multiplied by 4K
- With the limit specifying 1 MB segments and G = 1 (i.e. the 4K multiplier): max segment size = 4K x 1 MB = 4 GB
- With 16K segments like this, the system can address 16K x 4 GB = 64 TB (not necessarily all of it in physical memory)

27 80386 and above: Example
Descriptor has: base = 23000000H, limit = 012FFH
With G = 0:
- Segment start = 23000000H
- Segment end = 23000000H + 012FFH = 230012FFH
- Segment size = 12FFH + 1H = 1300H (= 19 x 256 bytes)
With G = 1 (FFFH is appended to the limit in the descriptor, so the actual limit = 012FFFFFH):
- Segment start = 23000000H
- Segment end = 23000000H + 012FFFFFH = 242FFFFFH
- Segment size = 12FFFFFH + 1H = 1300000H = 2^12 x 1300H = 4K x 1300H
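The example above can be reproduced with a short Python sketch. The function name `segment_bounds` is hypothetical; the rule it encodes is the one on the slides (with G = 1, FFFH is appended to the 20-bit limit).

```python
def segment_bounds(base: int, limit: int, granular: bool) -> tuple:
    """Return (start, end, size) of a segment from its descriptor fields."""
    limit &= 0xFFFFF  # the limit field is 20 bits wide
    # With G = 1 the limit is extended by appending FFFH (a 12-bit shift).
    effective_limit = (limit << 12) | 0xFFF if granular else limit
    start = base & 0xFFFFFFFF
    return start, start + effective_limit, effective_limit + 1


if __name__ == "__main__":
    for g in (False, True):
        start, end, size = segment_bounds(0x23000000, 0x012FF, g)
        print(f"G={int(g)}: {start:08X}..{end:08X}, size {size:X}H")
```

Passing the largest limit, FFFFFH, with G = 1 gives a size of 100000000H bytes, i.e. the 4 GB maximum quoted earlier.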

28 Protected Mode Segmentation Example (processor: 80286)
[Figure: a 16-bit segment register selects 8-byte segment descriptor #1 from a descriptor table in main memory (descriptors #0, #1, and #2 are shown above the GDT base address). The descriptor holds the 24-bit base address, the limit, and the access rights byte; its most significant part is always 0's for upward compatibility. The offset is applied to the selected segment. Numeric values in the figure were lost in the transcript.]
- Segment size = Limit + 1 = FFH + 1 = 100H bytes
- Because each descriptor in the table is 8 bytes wide, the selector with 000b appended is used as an offset from the GDT (or LDT) base address to point to the start of the required segment descriptor (= segment number)
- Questions: What is the RPL value? What is the selector value? Are we using the global or the local descriptor table?

29 The Access Rights Byte: 80286 and higher*
[Figure: bit fields of the access rights byte in the 8-byte segment descriptor, shown for a non-code (data or stack) segment and for a code segment. Labels: DPL (00: highest privilege); E: not code (0) or code (1); ED: expand direction for the segment; W/R.]
- The DPL will be compared with the requested privilege level (RPL) in the segment register specifying this segment
- Access to the segment is allowed only if the RPL has higher or equal privilege to the DPL, subject to the state of the C bit if applicable
* 80386 and higher have 4 more access rights bits

30 Privilege Levels
- 00: highest privilege
- 01
- 10
- 11: lowest privilege
[Figure: hardware privilege comparator. Inputs: DPL (in the descriptor) and RPL (in the segment register). If RPL <= DPL numerically, access to the segment is allowed.]
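The comparator on this slide reduces to one numeric comparison, since a smaller number means higher privilege. The sketch below models only that simplified rule; the real hardware check also involves the current privilege level of the running code and the C bit, both of which are left out here. The function name `access_allowed` is made up.

```python
def access_allowed(rpl: int, dpl: int) -> bool:
    """Simplified check: the requester must be at least as privileged."""
    for name, level in (("RPL", rpl), ("DPL", dpl)):
        if level not in range(4):
            raise ValueError(f"{name} must be a 2-bit value (0..3)")
    # Level 0 is the most privileged, so "higher or equal privilege"
    # means a numerically smaller or equal level.
    return rpl <= dpl


if __name__ == "__main__":
    print(access_allowed(rpl=0, dpl=3))  # system code touching a user segment
    print(access_allowed(rpl=3, dpl=0))  # user code touching a system segment
```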

31 Types of Descriptor Tables in memory
One Global Descriptor Table (GDT) (64 KB max; start and limit are cached in the GDTR), holding:
1. Descriptors for all global segments, common to all tasks (cached on the processor for the currently used 6 segments, CS to GS)
2. For each task:
  - A descriptor for the task's task state segment (TSS) in memory. The TSS holds all information about the task, e.g. processor registers, LDT selector, etc. (the descriptor is cached on the processor for the currently running task, selected by the TR register)
  - A descriptor for the task's Local Descriptor Table (LDT) in memory. The LDT holds descriptors for all local segments for that task (the descriptor is cached on the processor for the currently running task, selected by the LDTR register)
One Interrupt Descriptor Table (IDT) (64 KB max; start and limit are cached in the IDTR):
- 8-byte descriptors called "interrupt gates" that define the attributes and starting addresses of the interrupt service routines for up to 256 hardware and software interrupts
Several local segment descriptor tables, one table for each task:
- The GDT contains descriptors for these tables, as mentioned in 2 above

32 Types of Descriptor Tables in memory
[Figure: a 16-bit selector in a segment register picks an 8-byte descriptor (base, limit, access for segment i) from a descriptor table. There is one LDT for each task (task 0 to task j); the GDT holds the base, limit, and access for LDT j, and the LDTR holds the table base address and limit for the current task. The interrupt descriptor table is indexed by interrupt number. The physical address is calculated from the base of segment i and the offset, giving the addressed byte in the memory system (code, data, stack, extra segments).]

33 Program Invisible Registers (caches)
[Figure: visible and invisible µP registers and the tables in main memory.]
- Visible µP registers: the 16-bit segment selectors, the task selector (task register), and the LDT selector (LDT register)
- Invisible µP registers (not seen by the programmer):
  - Descriptor cache for the currently used 6 segments, loaded from the GDT or LDT tables in memory every time the segment number changes
  - Cache for the task state segment (TSS) descriptor and the descriptor of the local segment descriptor table for the currently executing task, loaded from the GDT in memory every time the task changes
  - Base addresses (24 or 32 bits) for the GDT and IDT tables
- In main memory: the Global Descriptor Table (GDT) for segments, tasks, and LDTs; the descriptor table for interrupts; the task's TSS; the task's LDT
- The LDT for a task is accessed through a descriptor in the GDT

34 Memory Management: Paging (80386 and above)
- The paging mechanism translates a logical (virtual, linear) address generated by the program to a physical (real) address that accesses a storage location in memory
- The address space consists of pages of bytes: virtual pages and physical pages (frames) of the same size (e.g. 4K bytes)
- Translation is done from virtual to physical pages
- Linear pages may or may not reside in physical memory
- If a page is not in memory (a page fault occurs), it is brought into memory for use
- Paging applies to both real and protected modes in the 80386 and above
- Paging can be enabled or disabled (using bit 31 of control register CR0):
  - If disabled, the address computed with segmentation is the physical address
  - If enabled, paging operates on the virtual address obtained with segmentation to provide the physical address

35 Example of 1-level Paging: Address Translation (in the Memory Management Unit, MMU)
[Figure: the 32-bit byte address from the processor (logical, linear, virtual, or programmer address) is split into a logical page number and an offset. The page table maps the logical page number to a physical frame number, and the result is the physical byte address sent to memory, or the page is fetched from the hard disk.]
- Logical pages and physical frames are each 4 KB; "frames" are physical pages
- The page table maps logical page numbers to the corresponding physical frame numbers
- The offset part is the same for both

36 2-Level Paging
- 2^32 = 4 GB address space, i.e. 1 M pages of 4K bytes
- The 1-level approach requires a single contiguous page table (but maybe we do not need to do all translations!)
- 2-level paging uses several smaller page tables (up to 1K tables), each providing translation for 1K pages
- The smaller page tables can easily fit in memory, and we can use as many of them as needed
- The start address for each page table used is kept in one directory table
For 2-level paging, the 32-bit linear byte address (as generated by the processor) is divided into three parts:
- Directory: 10 bits, determines which page table in the page table directory
- Page table: 10 bits, determines which page in that page table
- Memory offset: 12 bits, determines which byte in that page (same for both logical and physical pages)
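The 10/10/12 split of the linear address is easy to express with shifts and masks. A minimal Python sketch follows; the function name `split_linear_address` is chosen for the illustration.

```python
def split_linear_address(linear: int) -> tuple:
    """Return (directory, table, offset) for a 32-bit linear address."""
    linear &= 0xFFFFFFFF
    directory = linear >> 22        # top 10 bits: which page table
    table = (linear >> 12) & 0x3FF  # middle 10 bits: which page in that table
    offset = linear & 0xFFF         # low 12 bits: which byte in the page
    return directory, table, offset


if __name__ == "__main__":
    d, t, o = split_linear_address(0x00C03FFF)
    print(d, t, hex(o))
```

Each 10-bit field can take 1024 values, which is where the 1K page tables of 1K entries each come from.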

37 2-Level Memory Paging: 80386 and above
[Figure: the 32-bit linear byte address (2^32 = 4 G virtual bytes) is split into 10, 10, and 12 bits.
- First 10 bits: which page table? (page table number in the directory). The page table directory, located in memory at its base address, is 1K x 4 bytes and holds 1024 page table start addresses
- Next 10 bits: which page in that page table? (page number in the page table). Each page table is 1K x 4 bytes = 4 KB in memory and holds 1024 page entries, each the start address of a physical page. The page mapping (linear to physical) is done here
- Last 12 bits: which byte in that page? (offset in the page), selecting the addressed physical byte
- 1024 x 1024 = 1 M pages of 4K bytes; 1024 x 1024 x 4K = 4 G physical bytes
- Pages are paged in and out of memory]

38 80386 and above: Control Registers
Paging is controlled by four control registers, CR0 to CR3, on the µP.
[Figure: control register layout; part of the figure is marked "Pentium only".]
- CR0, bit 31: 1 = paging; 0 = no paging (the address generated by segmentation is considered the physical address)
- CR2: linear address corresponding to the most recent page fault
- CR3: most significant 20 bits of the 32-bit start address of the page table directory; the remaining 12 bits are set to 0's (000H)

39 Formats
Format for the linear address:
- Page table number: 10 bits
- Page number in table: 10 bits
- Offset: 12 bits
Format for an entry in the page table directory or in a page table (each entry is 32 bits, i.e. 4 bytes):
- 20 bits: the most significant 20 bits of the 32-bit start address of a page table or of a physical page (the low 12 bits of that address are 000H)
- 12 bits: attribute bits for the corresponding page table or page in memory
- If the page is not in memory, a page fault interrupt occurs to bring it in from mass storage

40 Memory Paging: 80386 and above
[Figure: full translation of a 32-bit linear address from segmentation (4 G of virtual bytes), split into 10, 10, and 12 bits.
- CR3 supplies 20 bits; appending 12 bits (000H) gives the base address of the page directory in memory
- The 10-bit directory field, with 00b appended, is added to that base to select one of the 1024 page table entries (4 bytes each): which page table in the directory?
- The selected entry supplies 20 bits; appending 000H gives the base of the page table. The 10-bit page field, with 00b appended, is added to select one of the 1024 page entries: which page in that page table?
- The selected page entry supplies the 20-bit physical page number; combined with the 12-bit offset it gives the addressed physical byte: which byte in that page?
- 1024 x 1024 = 1 M pages of 4K bytes; 1024 x 1024 x 4K = 4 G physical bytes
- The linear page number is replaced by the physical page number; the 12-bit offset is unchanged]
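The walk through the directory and a page table can be simulated in a few lines. In this Python sketch, memory is modelled as a dictionary from physical address to 32-bit entry, bit 0 of an entry is treated as the present (P) bit, and all addresses in the example are invented for illustration; none of them come from the slides.

```python
PRESENT = 0x1  # bit 0 of a directory or page table entry


class PageFault(Exception):
    """Raised when a needed page table or page is not present."""


def translate(linear: int, cr3: int, memory: dict) -> int:
    """Translate a 32-bit linear address using two-level 4 KB paging."""
    directory = (linear >> 22) & 0x3FF
    table = (linear >> 12) & 0x3FF
    offset = linear & 0xFFF

    # Entries are 4 bytes wide, so the index is multiplied by 4 (00b appended).
    pde = memory.get((cr3 & 0xFFFFF000) + directory * 4, 0)
    if not pde & PRESENT:
        raise PageFault(f"page table {directory} not present")

    pte = memory.get((pde & 0xFFFFF000) + table * 4, 0)
    if not pte & PRESENT:
        raise PageFault(f"page {table} of table {directory} not present")

    return (pte & 0xFFFFF000) | offset


# Hypothetical layout: directory at 00010000H, page table 0 at 00011000H,
# and linear page 0 mapped to physical page 00110H.
EXAMPLE_CR3 = 0x00010000
EXAMPLE_MEMORY = {
    0x00010000: 0x00011000 | PRESENT,  # directory entry 0
    0x00011000: 0x00110000 | PRESENT,  # page table 0, entry 0
}

if __name__ == "__main__":
    print(hex(translate(0x00000ABC, EXAMPLE_CR3, EXAMPLE_MEMORY)))
```

A dictionary keeps the example small; a missing key stands for an entry whose P bit is 0, which is what would trigger the page fault interrupt.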

41 Memory Paging: Example
[Figure: worked example of translating a linear (logical) byte address to the corresponding physical byte. Most numeric values were lost in the transcript. Labels that survive: base address of the page table directory (320H appears next to it, with 000H appended); the table directory holding the base addresses for page tables 0, 1, 2, ...; start of page table 0; physical page 00110H; page 00000 (physical); pages of 4K bytes; physical byte C8FFF; linear page mapped to physical page.]

42 Memory space required to accommodate the page directory and the page tables
To page the full linear address space of 4 GB:
- Each page is 4 KB, so we need to map (translate addresses for) 1M pages
- Each page table holds translations for 1K pages
- Number of page tables required = 1M pages / 1K pages = 1K tables
- Page tables: 1K tables x (1K x 4 B) = 4 MB = 4096 KB
- Page table directory: 1K x 4 B = 4 KB
- Total: 4100 KB
This is a considerable amount of memory. So, some operating systems do not support paging for the total memory space, e.g. Windows 3.1 pages only 16 MB (i.e. only 16M/4K = 4K pages):
- This requires only 4 page tables, occupying 4 x (1K x 4 B) = 16 KB of memory
- The page directory table is 4 x 4 B = 16 B
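The arithmetic above generalizes to any amount of paged memory. The Python sketch below counts only the directory entries actually used, as the slide does for the 16 MB case; the function name `paging_overhead` is hypothetical.

```python
PAGE_SIZE = 4 * 1024      # bytes per page
ENTRY_SIZE = 4            # bytes per directory or page table entry
ENTRIES_PER_TABLE = 1024  # entries in one page table


def paging_overhead(mapped_bytes: int) -> dict:
    """Memory needed for the page tables and the used directory entries."""
    pages = -(-mapped_bytes // PAGE_SIZE)    # ceiling division
    tables = -(-pages // ENTRIES_PER_TABLE)  # ceiling division
    return {
        "pages": pages,
        "tables": tables,
        "table_bytes": tables * ENTRIES_PER_TABLE * ENTRY_SIZE,
        "directory_bytes": tables * ENTRY_SIZE,
    }


if __name__ == "__main__":
    print(paging_overhead(4 * 1024 ** 3))   # full 4 GB linear space
    print(paging_overhead(16 * 1024 ** 2))  # 16 MB, as in Windows 3.1
```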

43 Speeding Up the Paging Mechanism
- Paging requires accessing the page table directory and a page table (in main memory) to generate the physical memory address of the required memory location
- This slows down memory access
- To speed this up, a fast associative mapping cache memory is used to store the most recent page address translations, which are also likely to be accessed in the near future
- The 80386 uses a 32-entry TLB (translation look-aside buffer) for holding the physical addresses of the most recently used 32 memory pages (translation = mapping from logical to physical)
Scenarios:
1. Hit in the TLB cache: use the address translation in the TLB
2. Miss in the TLB, hit in the page tables: get the translation from the page tables in memory and place it in the TLB (e.g. by replacing the least-recently-used existing entry there)
3. Miss in the TLB, miss in the page tables (page fault): bring the page from mass storage into physical memory and update both the page tables and the TLB
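The three scenarios can be sketched as a small simulation. This Python example assumes a 1-level page table held in a dictionary and least-recently-used replacement in the TLB; the class and function names, and the `load_from_disk` callback, are invented for the illustration and do not describe the actual 80386 hardware.

```python
from collections import OrderedDict


class TLB:
    """Tiny fully associative TLB with least-recently-used replacement."""

    def __init__(self, capacity: int = 32):
        self.capacity = capacity
        self.entries = OrderedDict()  # page number -> frame number

    def lookup(self, page: int):
        if page in self.entries:
            self.entries.move_to_end(page)  # mark as most recently used
            return self.entries[page]
        return None

    def insert(self, page: int, frame: int) -> None:
        if page in self.entries:
            self.entries.move_to_end(page)
        elif len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        self.entries[page] = frame


def translate_page(page, tlb, page_table, load_from_disk):
    """Return (frame, scenario) following the three cases above."""
    frame = tlb.lookup(page)
    if frame is not None:
        return frame, "tlb hit"
    if page in page_table:
        frame = page_table[page]
        tlb.insert(page, frame)
        return frame, "page table hit"
    frame = load_from_disk(page)  # page fault: bring the page into memory
    page_table[page] = frame
    tlb.insert(page, frame)
    return frame, "page fault"


if __name__ == "__main__":
    tlb, table = TLB(capacity=2), {1: 10}
    for page in (1, 1, 2, 3, 1):
        print(page, translate_page(page, tlb, table, lambda p: p + 100))
```

With a capacity of 2, the last lookup of page 1 misses in the TLB because page 1 was evicted, but it still hits in the page table.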

44 TLB Operation: Speeding up paging
[Figure: flowchart of TLB operation, assuming 1-level page tables for simplicity (the TLB provides 1-level mapping; frame = physical page).
- Path 1: the associative search hits in the TLB and yields the corresponding physical frame number
- Path 2: TLB miss but page table hit: add the entry to the TLB
- Path 3: TLB miss and page table miss (P bit = 0), i.e. a page fault: bring the page into memory, add an entry to the page table, and add an entry to the TLB]

45 Organizational Model of the Processor
- Functional aspects: how the processor actually functions
- Internal organization is determined by the functionality required
[Figure: microprocessor-based system, e.g. a microcomputer: the microprocessor connected through the external buses (including the control bus) to memory and I/O devices.]
Two main tasks for the microprocessor in a µP-based system:
1. Interface with external peripherals
2. Execute instructions

46 The 8086 processor model (Organization)
Early pipelining attempts. Two main functional units:
- The Bus Interface Unit (BIU)
- The Execution Unit (EU)
The BIU generates memory and I/O addresses for reading code and transferring data to/from the processor.
The EU receives code and data from the BIU, executes the instructions, and stores results in the general-purpose registers.
Pipelined architecture: two, hopefully independent, operations are executed at the same time by two separate units:
- Fetch by the BIU
- Execute by the EU

47 The 8086 processor model
[Figure: block diagram of the 8086 with the BIU connected to the external µP buses and control, the EU containing the ALU, and a FIFO instruction/operand queue between them.]
- FIFO instruction/operand queue: the BIU fills it by fetches from memory; the EU empties it by executing instructions
- Interfacing (BIU): generates all timing and control signals for reads, writes, etc.; synchronizes data transfers with all system modules
- Execution (EU): recognizes, decodes, and executes fetched program instructions; has no direct interaction with the external µP buses

48 The 8086 processor model
Common scenarios that cause pipeline inefficiency:
- Operand is not in the queue
- Jump or branch instructions
- Long-executing instructions, e.g. 83 clock cycles for execution vs. 4 cycles for a fetch: the BIU fills the buffer and waits idly
[Figure: timing of non-pipelined vs. pipelined (8086) operation with fetch-execute overlap. Marked events: 1. operand not in queue, 2. fetch operand, 3. execute; a. the instruction turns out to be a jump, b. start fetching at the correct target location, c. execute at last. * = wasted fetches and executes (inefficiency), e.g. wasted fetches after a jump instruction.]
RISC and modern architectures:
- Reduce fetches from memory (operate mostly on registers)
- Speed up memory fetches (cache)
- Use small instructions (both in length and in execution time)
- Finer pipeline stages (super pipelined: 486)
- Multiple pipelines (superscalar: P5)
- Predict how the jump will go (branch prediction)
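The benefit of fetch-execute overlap, and its limits with long instructions, can be estimated with a toy timing model. The Python sketch below assumes an idealized two-stage pipeline in which the fetch of the next instruction starts when the current one begins executing, with no jumps and no queue deeper than one instruction. These assumptions and the function names are made up for the illustration and are far simpler than the real 8086.

```python
def non_pipelined_cycles(instructions) -> int:
    """Each instruction is fetched, then executed, strictly in sequence."""
    return sum(fetch + execute for fetch, execute in instructions)


def pipelined_cycles(instructions) -> int:
    """Fetch of instruction i+1 overlaps the execution of instruction i."""
    if not instructions:
        return 0
    total = instructions[0][0]  # the first fetch cannot be overlapped
    for i, (_, execute) in enumerate(instructions):
        has_next = i + 1 < len(instructions)
        next_fetch = instructions[i + 1][0] if has_next else 0
        # The stage lasts as long as the slower of the two overlapped steps.
        total += max(execute, next_fetch)
    return total


if __name__ == "__main__":
    balanced = [(4, 4)] * 3       # (fetch cycles, execute cycles)
    lopsided = [(4, 83), (4, 4)]  # one long-executing instruction
    for program in (balanced, lopsided):
        print(non_pipelined_cycles(program), pipelined_cycles(program))
```

In this model the balanced program drops from 24 to 16 cycles, while the program with the 83-cycle instruction only drops from 95 to 91, because the fetch unit sits idle during the long execution.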

49 Evolution of the 80X86 Intel Processors
[Figure: evolution toward greater throughput (e.g. in MIPS, MFLOPS). Labels in the figure: 2-stage pipelining; 5-stage pipelining; 12-stage pipelining; 1, 2, and 3 execution units; super pipelining; superscalar (P5); Pentium Pro, Pentium II, III; the ratios 0.25:1 and 0.5:1; multi-core architecture, e.g. Itanium 2: multiple processors on chip.]

50 Example: 486 Processor Model
[Figure: block diagram of the 486 processor; colored units are new additions. Labels in the figure: super pipelining, EU, on chip, main memory.]

51 Chapter 2 Summary
- Described the µP programming model and the purpose and function of program-visible registers
- Described the Flags register and the purpose of each flag bit
- Described how memory is accessed using segmentation, both in the real mode and the protected mode
- Described the program-invisible registers
- Described the structures and operation of the memory paging mechanism
- Described the organizational model of the 8086 µP
- Reviewed the evolution of the 80X86 architecture: pipelining, super pipelining, superscalar, multi-core

