1
Memory Addressing 09 October 2015
2
Topics
- x86 Memory Addressing
- Segmentation
- Paging
- Kernel Modules
3
The x86 operating modes
- Real mode (entered at power on)
- Protected mode
- Virtual 8086 mode
- System Management mode
- IA-32e mode, with its Compatibility mode and 64-bit mode sub-modes
4
Why ‘mode’ matters
Key differences among the x86 modes:
- How memory is addressed and mapped
- What instruction-set is available
- Which registers are accessible
- Which ‘exceptions’ may be generated
- What data-structures are required
- How task-switching can be accomplished
- How interrupts will be processed
5
Mode transitions
- The processor starts up in ‘real mode’
- Mode-transitions normally happen under program control (except for transitions to System Management Mode)
- Details of programming a mode-change depend on which modes are involved
- Some mode-transfers aren’t possible (and some mode-changes aren’t documented)
6
Enabling ‘protected-mode’
- Protected-mode was first introduced in the 80286 processor (used in the IBM-PC/AT)
- Intel added some special system registers to support protected-mode, and to control the transition from the power-on ‘real-mode’:
  - Global Descriptor Table register (GDTR)
  - Interrupt Descriptor Table register (IDTR)
  - Local Descriptor Table register (LDTR)
  - Task Register (TR)
  - Machine Status Word (MSW)
7
80x86 Processor Architecture
- 8085 (review) – typical, single segment
- 8086/88 – pipeline + segments
- 80286/386 – real (8086)/protected mode
- 80386 – MMU (+paging)
- 80486 – cache memory
- Pentium
- P6 (Pentium Pro, II, Celeron, III, Xeon, …)
- Pentium 4, Core 2 – 64-bit extension
8
Address binding
- The instructions that make up a program are produced by system software (a compiler and/or an assembler).
- When the compiler or assembler translates a module, it does not know where the module will be loaded in physical memory.
- Address translation can be dynamic or static.
9
x86 Memory Address Types
- Logical Address
  - The addresses found in machine code are logical.
  - Consists of a segment plus an offset within that segment.
- Linear (Virtual) Address
  - Single 32-bit integer.
  - Translated by the paging unit into a physical address.
- Physical Address
  - Used to address memory cells in memory chips.
10
Paging vs segmentation
One major difference between paging and segmentation:
- paging splits RAM into equal-sized chunks called pages
- segmentation splits memory into chunks of arbitrary sizes
Equal-sized chunks are the main advantage of paging, since they make memory more manageable.
11
Dynamics of Segmentation
- Typically, each process has its own segment table.
- As with paging, each segment table entry contains a present (valid-invalid) bit and a modified bit.
- If the segment is in main memory, the entry contains the starting address and the length of that segment.
- Other control bits may be present if protection and sharing are managed at the segment level.
- Logical-to-physical address translation is similar to paging, except that the offset is added to the starting address (instead of appended).
12
A logical address consists of a two-tuple: <segment-number, offset>
- Segment table – maps two-dimensional logical addresses to one-dimensional physical addresses; each table entry has:
  - base – the starting physical address where the segment resides in memory
  - limit – the length of the segment
- Segment-table base register (STBR) points to the segment table’s location in memory
- Segment-table length register (STLR) indicates the number of segments used by a program; segment number s is legal if s < STLR
13
Address Translation in a Segmentation System
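The segment-table lookup described on the previous slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `SegmentEntry`, `translate`, and the table contents are hypothetical names and values, and the length of the Python list stands in for the STLR.

```python
from typing import NamedTuple


class SegmentEntry(NamedTuple):
    base: int   # starting physical address of the segment
    limit: int  # length of the segment in bytes


def translate(segment_table, s, offset):
    """Map a logical address <s, offset> to a physical address."""
    stlr = len(segment_table)      # stands in for the STLR
    if s >= stlr:                  # segment number s is legal only if s < STLR
        raise IndexError("illegal segment number")
    entry = segment_table[s]
    if offset >= entry.limit:      # offset must fall inside the segment
        raise ValueError("offset beyond segment limit")
    return entry.base + offset     # the offset is added to the base


# Hypothetical two-segment table used only for this example.
table = [SegmentEntry(base=0x4000, limit=0x1000),
         SegmentEntry(base=0x9000, limit=0x0400)]
print(hex(translate(table, 1, 0x10)))  # 0x9010
```

Both checks come before the addition, which mirrors the order in which an out-of-range segment number or offset would be rejected.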
14
Simple Combined Segmentation and Paging
- The Segment Base is the physical address of the page table of that segment.
- Present/modified bits appear only in page table entries.
- Protection and sharing info most naturally resides in the segment table entry, e.g. a read-only/read-write bit, a kernel/user bit...
15
Address Translation in combined Segmentation/Paging
16
Segment Registers
- Six registers: cs, ss, ds, es, fs, gs
- Each contains a 16-bit segment selector
- 3 segment registers are special purpose (cs, ss, ds)
  - cs: points to the code segment
    - Also includes the 2-bit Current Privilege Level (CPL)
    - CPL is the ring value (ring 0 is kernel, 3 is user)
  - ss: points to the stack segment
  - ds: points to the data segment
17
Segmentation in Hardware
- Logical address
  - Segment id: 16-bit (Segment Selector)
  - Offset: 32-bit
- Segment selector
  - Index: of the segment descriptor
  - TI (Table Indicator): GDT or LDT
  - RPL (Requestor Privilege Level)
- Segmentation registers
  - Hold segment selectors
  - Special purpose: cs (code segment), ss (stack segment), ds (data segment)
  - General purpose: es, fs, gs
18
Segment Descriptors
Segment descriptors are stored in
- GDT: Global Descriptor Table
- LDT: Local Descriptor Table (one per process)
- The address and size of the GDT and LDT are held in the processor control registers gdtr and ldtr
Fields in a segment descriptor
- Base: 32-bit linear address
- G granularity flag: 1-bit (segment size expressed in bytes or in 4KB units)
- Limit: 20-bit offset
- S system flag: 1-bit (system segment or not)
- Type: 4-bit
  - Code, Data, Task State (TSSD), Local Descriptor Table (LDTD)
  - Data Segment Descriptors describe stack segments as well as other types of data segments.
19
- DPL (descriptor privilege level): 2-bit
  - Restricts access to the segment
- Segment-present flag: 1-bit (in memory or not)
- D or B flag: 1-bit (depending on code or data)
- Reserved bit (bit 53): 0
- AVL flag: 1-bit (ignored by Linux)
20
Segment Descriptor Format
21
Fast Access to Segment Descriptors
- For each of the six programmable segmentation registers, the 80x86 provides an additional nonprogrammable register, which is loaded with the segment descriptor every time a segment selector is loaded into a segment register
- Later address translations for that segment can then proceed without accessing the GDT or LDT in memory
22
Segment Selector and Segment Descriptor
23
Logical to Linear Address Translation
- Logical Address = 16-bit segment selector + 32-bit offset
- Segment Selector:
  - 13-bit index into the GDT or LDT
  - Table Indicator (TI) flag (0 = GDT, 1 = LDT)
  - Requestor Privilege Level (RPL), which is the CPL when the selector is used
- Translation:
  1. Select the GDT or LDT based on TI
  2. Multiply the index by the 8-byte descriptor length to locate the descriptor in the table
  3. Look up the base in the segment descriptor
  4. Linear = segment base + offset
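The four translation steps can be traced with a small Python sketch. It is a simplified model, not the hardware algorithm: each descriptor table is represented as a hypothetical dictionary from a descriptor's byte offset to its segment base, limit and privilege checks are omitted, and the selector and base values are made up for the example.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into (index, ti, rpl)."""
    index = (selector >> 3) & 0x1FFF  # upper 13 bits: descriptor index
    ti = (selector >> 2) & 0x1        # Table Indicator: 0 = GDT, 1 = LDT
    rpl = selector & 0x3              # Requestor Privilege Level
    return index, ti, rpl


def logical_to_linear(selector, offset, gdt, ldt):
    """Translate selector:offset into a 32-bit linear address."""
    index, ti, _rpl = decode_selector(selector)
    table = ldt if ti else gdt               # step 1: pick GDT or LDT
    descriptor_offset = index * 8            # step 2: descriptors are 8 bytes long
    base = table[descriptor_offset]          # step 3: read the segment base
    return (base + offset) & 0xFFFFFFFF      # step 4: base + offset


# Hypothetical GDT with one descriptor at byte offset 0x20 (index 4).
gdt = {0x20: 0x10000000}
print(decode_selector(0x23))                          # (4, 0, 3)
print(hex(logical_to_linear(0x23, 0x1234, gdt, {})))  # 0x10001234
```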
24
Segmentation in Linux
- Used in a limited way
- Linux prefers paging to segmentation
  - Memory management is simpler
  - Portability to a wide range of architectures, such as RISC
- Linux 2.6 uses segmentation only when required
25
4 main segments used by Linux
- Kernel code segment: __KERNEL_CS macro
- Kernel data segment: __KERNEL_DS macro
- User code segment: __USER_CS macro
- User data segment: __USER_DS macro
26
Linux GDTs
- One GDT per CPU
  - Stored in the cpu_gdt_table array
  - Addresses/sizes stored in the cpu_gdt_descr array
- 18 segment descriptors
  - 4 user/kernel code/data segments
  - Task state segment (TSS): init_tss array
  - A default LDT: default_ldt
  - 3 Thread-Local-Storage (TLS) segments
  - 3 segments related to APM (Advanced Power Management)
  - 5 segments related to PnP (Plug and Play)
  - A special TSS segment to handle “double fault” exceptions
27
Linux GDTs
28
Linux LDTs
- Default LDT: stored in the default_ldt array
- 5 entries included, but only two are effectively used by the kernel
  - A call gate for iBCS executables
  - A call gate for Solaris/x86 executables
- Call gates: a mechanism provided by 80x86 microprocessors to change the privilege level of the CPU
29
Linear Address
- The base address obtained from the segment descriptor is added to the offset
- The resulting address is referred to as a linear address
- This is the address that is translated by the paging hardware
30
Linear to Physical Translation
- Handled by the paging unit.
- Divides memory into 4KB pages.
- Linear address is divided into 3 fields
  - Directory: most significant 10 bits
  - Table: middle 10 bits
  - Offset: least significant 12 bits
- Page Directory
  - Every active process must have a Page Directory.
  - The cr3 register points to the address of the in-use PD.
  - Page tables are allocated when needed.
31
Paging in Hardware
- Early architectures used 1-level page tables
- VAX, x86 used 2-level page tables
- SPARC uses 3-level page tables
- Alpha uses 4-level page tables
32
Paging in Hardware
- Pages: linear addresses grouped in fixed-length intervals
- Page frames (physical pages): RAM partitioned into fixed-length blocks
- In 80x86 processors, paging is enabled by setting the PG flag of control register cr0
- 4KB pages
33
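Control Registers
[Figure: layout of the control registers]
- CR3: Page Directory Base Register (PDBR), followed by reserved bits
- CR2: page fault linear address
- CR1: reserved
- CR0: PG flag in the most significant bit, reserved bits, then the ET, TS, EM, MP and PE flags in the low-order bits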
34
Regular Paging
- 4KB pages
- Linear address: 32-bit
  - Directory: 10 bits
  - Table: 10 bits
  - Offset: 12 bits
- Two-step translation
  - Page directory: physical address stored in the cr3 register
  - Page table
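As a rough illustration of the two-step translation, the sketch below splits a 32-bit linear address into its 10/10/12-bit fields and walks a toy page directory. The nested dictionaries and the frame number are hypothetical stand-ins for the in-memory tables that cr3 would point to, and flag bits and page faults are ignored.

```python
def split_linear(addr):
    """Return (directory, table, offset) for a 32-bit linear address."""
    directory = (addr >> 22) & 0x3FF  # most significant 10 bits
    table = (addr >> 12) & 0x3FF      # middle 10 bits
    offset = addr & 0xFFF             # least significant 12 bits
    return directory, table, offset


def walk(page_directory, addr):
    """Two-step lookup: page directory entry, then page table entry."""
    directory, table, offset = split_linear(addr)
    page_table = page_directory[directory]  # step 1: find the page table
    frame = page_table[table]               # step 2: find the page frame number
    return (frame << 12) | offset           # frame base plus offset in the page


# Hypothetical mapping: directory entry 0x80, table entry 0x21 -> frame 0x12345.
page_directory = {0x80: {0x21: 0x12345}}
print([hex(f) for f in split_linear(0x20021406)])  # ['0x80', '0x21', '0x406']
print(hex(walk(page_directory, 0x20021406)))       # 0x12345406
```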
35
x86 Paging
Figure 2-7 from Understanding the Linux Kernel, 3/e
36
Same structure for the entries in page directories and page tables
- Present flag: in memory or not
- 20 most significant bits of a page frame physical address
- Accessed flag: used to select pages to be swapped out
- Dirty flag: applies only to page table entries
- Read/write flag: access right of the page
- User/supervisor flag: privilege level
- PCD and PWT flags: hardware cache
- Page size flag: applies only to page directory entries (2MB or 4MB)
- Global flag: applies only to page table entries (to prevent frequently used pages from being flushed from the TLB cache)
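To make the entry layout concrete, here is a small Python sketch that decodes a 32-bit entry into its frame address and flag names. The bit positions follow the standard 32-bit x86 entry format (Present in bit 0 up to Global in bit 8); the dictionary keys and the sample entry value are hypothetical names chosen for this example.

```python
# Flag name -> bit mask in a 32-bit page directory / page table entry.
ENTRY_FLAGS = {
    "present": 1 << 0,
    "read_write": 1 << 1,
    "user_supervisor": 1 << 2,
    "pwt": 1 << 3,
    "pcd": 1 << 4,
    "accessed": 1 << 5,
    "dirty": 1 << 6,       # meaningful only in page table entries
    "page_size": 1 << 7,   # meaningful only in page directory entries
    "global": 1 << 8,      # meaningful only in page table entries
}


def decode_entry(entry):
    """Return (frame_address, set_of_flag_names) for a 32-bit entry."""
    frame_address = entry & 0xFFFFF000  # 20 most significant bits
    flags = {name for name, mask in ENTRY_FLAGS.items() if entry & mask}
    return frame_address, flags


frame, flags = decode_entry(0x12345067)
print(hex(frame), sorted(flags))
```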
37
Q1. Suppose you had a computer that supported virtual memory and had 32-bit virtual addresses and 4KB (2^12 byte) pages. If a process actually uses 1024 (2^10) pages of its virtual address space, how much space would be occupied by the page table for that process if a single-level page table was used? Assume each page table entry occupies 4 bytes.
38
Q2. Assuming a page size of 1 KB and that each page table entry (PTE) takes 4 bytes, how many levels of page tables would be required to map a 34-bit address if every page table fits into a single page? Be explicit in your explanation.
39
Q3. In a 32-bit machine we subdivide the virtual address into 4 segments as follows:
10-bit, 8-bit, 6-bit, 8-bit
We use a 3-level page table, such that the first 10 bits are for the first level, and so on.
a) What is the page size in such a system?
b) What is the size of a page table for a process that has 256K of memory starting at address 0?
c) What is the size of a page table for a process that has a code segment of 48K starting at address 0x , a data segment of 600K starting at address 0x and a stack segment of 64K starting at address 0xf and growing upward?
40
Q4. A computer system has a 36-bit virtual address space with a page size of 8K, and 4 bytes per page table entry.
a) How many pages are in the virtual address space?
b) If the average process size is 8GB, would you use a one-level or a two-level page table?
41
Extended Paging
42
Extended Paging
- Used to translate large contiguous linear address ranges
- Page size: 4MB
- Linear address: 32 bits
  - Directory: 10 bits
  - Offset: 22 bits
- Page directory entries for extended paging are the same as for regular paging, except that:
  - The Page Size flag must be set
  - Only the 10 most significant bits of the 20-bit physical address field are significant
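With extended paging the page directory entry maps a 4MB page directly, so the translation is a single step. The following sketch assumes the Page Size flag sits in bit 7 of the entry, as in the standard 32-bit x86 format; the function names and the sample entry are hypothetical.

```python
PAGE_SIZE_FLAG = 1 << 7  # Page Size flag in a page directory entry


def directory_index(linear):
    """Most significant 10 bits of the linear address."""
    return (linear >> 22) & 0x3FF


def translate_extended(pde, linear):
    """Translate a linear address through a 4MB page directory entry."""
    if not pde & PAGE_SIZE_FLAG:
        raise ValueError("entry does not map a 4MB page")
    frame_base = pde & 0xFFC00000  # only the top 10 address bits are significant
    offset = linear & 0x003FFFFF   # 22-bit offset inside the 4MB page
    return frame_base | offset


# Hypothetical entry: 4MB frame at 0x00C00000, flags Page Size | Read/Write | Present.
pde = 0x00C00083
print(hex(directory_index(0x40123456)))          # 0x100
print(hex(translate_extended(pde, 0x40123456)))  # 0xd23456
```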
43
Hardware Protection scheme for Paging
- Only two privilege levels are associated with pages, selected by the user/supervisor flag
- Only two types of access rights are associated with pages, selected by the read/write flag
44
Physical Address Extension (PAE) Paging Mechanism
- Starting with the Pentium Pro, the number of address pins was increased from 32 to 36
- Up to 64GB of RAM can be addressed
- PAE is activated by setting the PAE flag in the cr4 control register
45
When mapping 4KB pages
- Bits 31-30: point to entries in the PDPT
- Bits 29-21: point to entries in the page directory
- Bits 20-12: point to entries in the page table
- Bits 11-0: offset
When mapping 2MB pages
- Bits 20-0: offset
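The PAE bit fields listed above can be extracted as follows. This is a minimal sketch: `split_pae` and its `large_page` parameter are hypothetical names, and the address is an arbitrary example value.

```python
def split_pae(addr, large_page=False):
    """Split a 32-bit linear address into its PAE fields.

    Returns (pdpt, directory, table, offset) for 4KB pages and
    (pdpt, directory, offset) for 2MB pages.
    """
    pdpt = (addr >> 30) & 0x3         # bits 31-30
    directory = (addr >> 21) & 0x1FF  # bits 29-21
    if large_page:
        return pdpt, directory, addr & 0x1FFFFF  # bits 20-0
    table = (addr >> 12) & 0x1FF      # bits 20-12
    offset = addr & 0xFFF             # bits 11-0
    return pdpt, directory, table, offset


print(split_pae(0xC0201ABC))                   # (3, 1, 1, 2748)
print(split_pae(0xC0201ABC, large_page=True))  # (3, 1, 6844)
```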
46
Paging for 64-bit Architectures
Two-level paging is not suitable The number of levels depends on the type of processor 3-level: alpha, ia64, ppc64, sh64 4-level: x86_64
47
Hardware Cache
- Subset of lines: direct mapped, fully associative, N-way set associative
48
- Cache hit
- Cache miss
- Write through
- Write back
- Cache snooping
- TLB: Translation Lookaside Buffers
49
Paging in Linux
50
Paging in Linux
- A common paging model for both 32-bit and 64-bit architectures
  - Up to Linux version , 3-level paging
  - Starting with , 4-level paging
- With no PAE, 2-level paging is enough
  - Linux essentially eliminates the Page Upper Directory and Page Middle Directory fields
- With PAE, 3-level paging is used
  - Page Global Directory: x86’s PDPT
  - Page Upper Directory: (eliminated)
  - Page Middle Directory: x86’s Page Directory
  - Page Table: x86’s Page Table
51
Page Table Handling Functions/Macros (see pp. 57-61)
- Macros for simplifying page table handling:
  - PAGE_SHIFT, PMD_SHIFT, PUD_SHIFT, PGDIR_SHIFT
  - PTRS_PER_PTE, PTRS_PER_PMD, PTRS_PER_PUD, PTRS_PER_PGD
- Data structures for page table handling:
  - pte_t, pmd_t, pud_t, pgd_t
  - pgprot_t
- Macros for page table type conversions:
  - __pte, __pmd, __pud, __pgd, __pgprot
  - pte_val, pmd_val, pud_val, pgd_val, pgprot_val
52
- Macros and functions to read or modify page table entries:
  - pte_none, pmd_none, pud_none, pgd_none
  - pte_clear, pmd_clear, pud_clear, pgd_clear
  - set_pte, set_pmd, set_pud, set_pgd
  - pte_same(a,b), pmd_large(e)
  - pte_present, pmd_present, pud_present, pgd_present
- Macros to check page table entries:
  - pmd_bad, pud_bad, pgd_bad
- Functions to query the current value of any flag in a page table entry:
  - pte_user(), pte_read(), pte_write(), pte_exec(), pte_dirty(), pte_young(), pte_file()
- Functions to set the value of flags in a page table entry:
  - mk_pte_huge(), pte_wrprotect(), pte_rdprotect(), pte_exprotect(), pte_mkwrite(), pte_mkread(), pte_mkexec(), pte_mkdirty(), pte_mkclean(), pte_mkyoung(), pte_mkold(), pte_modify(p,v), ptep_set_wrprotect(), ptep_set_access_flags(), ptep_mkdirty(), ptep_test_and_clear_dirty(), ptep_test_and_clear_young()
53
- Macros to combine/extract a page address into/from a page entry:
  - mk_pte, mk_pte_phys, pte_page(), pmd_page(), pgd_offset(p,a), pmd_offset(p,a), pte_offset(p,a)
- Functions to create and delete page table entries:
  - pgd_alloc(m), pud_alloc(m,p,a), pmd_alloc(m,p,a), pte_alloc(m,p,a)
  - pte_free(p), pmd_free(x), pud_free(x), pgd_free(p), free_one_pmd(), free_one_pgd(), clear_page_tables()
54
Physical Address Layout
- In general, the Linux kernel is installed in RAM starting from the second megabyte (0x00100000)
- Page frame 0: used by the BIOS (to store the system configuration detected during POST)
- 0x000a0000 to 0x000fffff: reserved for the BIOS (to map ISA cards)
- Some page frames within the first megabyte may be reserved by specific computer models
55
The First 768 Page Frames (3MB) in Linux 2.6
56
Page Table Entries
- Present Flag: if 0, the page is not in memory, so the paging unit stores the linear address in cr2 and generates exception 14 (Page Fault) on access.
- Offset: least significant 12 bits of the address.
- Accessed Flag: set when the paging unit accesses the page.
- Dirty Flag: set when a write operation is performed on the page.
- Read/Write Flag: protection flag; is the page read-only or read/write?
- User/Supervisor Flag: privilege level required to access the page or page table.
- Page Size Flag: if 1, page directory entries refer to large (4MB) pages.
57
Physical Address Extension (PAE)
- Allows access to 2^36 bytes = 64GB of physical RAM.
- Splits memory into 2^24 pages.
- Page table entries are expanded to handle 24-bit page frame addressing.
- Page Directory Pointer Table (PDPT)
  - New level of page table with 4 entries.
  - The cr3 register points to the PDPT.
- Linear addresses are still 32 bits long
  - PDPT: most significant 2 bits
  - Directory: next 9 bits
  - Table: next 9 bits
  - Offset: least significant 12 bits
58
Paging Considerations
- Locality, VM and Thrashing
- Prepaging (Anticipatory Paging)
- Page size issue
- TLB reach
- Program structure
- I/O interlock
- Copy-on-Write
- Memory-Mapped Files
59
Working Sets
- The working set size is the number of pages in the working set: the number of pages touched in the interval [t-Δ+1..t].
- The working set size changes with program locality.
  - During periods of poor locality, more pages are referenced, so the working set size is larger.
- Goal: keep the WS of each process in memory.
  - E.g. if Σ WSi over all runnable processes i > physical memory, then suspend a process
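The working-set definition and the suspension rule can be expressed in a few lines of Python. The sketch assumes reference times are numbered from 1 and that memory is measured in page frames; the function names and the reference string are hypothetical.

```python
def working_set(references, t, delta):
    """Pages touched in the interval [t - delta + 1 .. t] (times start at 1)."""
    start = max(1, t - delta + 1)
    return set(references[start - 1:t])


def must_suspend(working_set_sizes, frames):
    """True when the combined working sets no longer fit in physical memory."""
    return sum(working_set_sizes) > frames


# Hypothetical page reference string for one process.
refs = [1, 2, 1, 3, 4, 4, 4, 3, 4, 4]
print(len(working_set(refs, 5, 4)))   # 4 pages: poor locality
print(len(working_set(refs, 10, 4)))  # 2 pages: good locality
print(must_suspend([4, 2], frames=5)) # True
```

The same process needs four frames around t = 5 but only two around t = 10, which is the variation with locality that the slide describes.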
60
Prepaging
- Can help reduce the large number of page faults that occur at process startup or resumption.
- Prepage all or some of the pages a process will need, before they are referenced.
- But if prepaged pages are unused, I/O and memory were wasted.
- Assume s pages are prepaged and a fraction α of the pages are used:
  - Is the cost of the s * α saved page faults greater or less than the cost of prepaging s * (1 - α) unnecessary pages?
  - α near zero ⇒ prepaging loses.
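The trade-off can be written as a direct comparison. In this sketch `fault_cost` (the cost of servicing one avoided page fault) and `prepage_cost` (the cost of bringing in one unneeded page) are hypothetical parameters in arbitrary but equal units; the slide does not give values for them.

```python
def prepaging_pays_off(s, alpha, fault_cost, prepage_cost):
    """Compare the faults saved with the work wasted when prepaging s pages."""
    saved = s * alpha * fault_cost            # s * alpha page faults avoided
    wasted = s * (1 - alpha) * prepage_cost   # s * (1 - alpha) pages never used
    return saved > wasted


print(prepaging_pays_off(100, 0.9, fault_cost=1.0, prepage_cost=1.0))  # True
print(prepaging_pays_off(100, 0.1, fault_cost=1.0, prepage_cost=1.0))  # False
```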
61
TLB Reach
- The amount of memory accessible from the TLB.
- Ideally, the working set of each process is stored in the TLB; otherwise there is a high rate of TLB misses.
- TLB Reach = (TLB Size) x (Page Size)
- Increase the size of the TLB: might be expensive.
- Increase the Page Size: this may lead to an increase in internal fragmentation, as not all applications require a large page size.
- Provide Multiple Page Sizes: this allows applications that require larger page sizes the opportunity to use them without an increase in fragmentation.
A. Frank - P. Weisberg
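A quick calculation shows how strongly the page size affects TLB reach. The entry count and page sizes below are illustrative values chosen for the example, not figures for any particular processor.

```python
KB = 1024
MB = 1024 * KB


def tlb_reach(entries, page_size):
    """TLB Reach = (TLB Size) x (Page Size), in bytes."""
    return entries * page_size


print(tlb_reach(128, 4 * KB) // KB, "KB")  # 512 KB with 4KB pages
print(tlb_reach(128, 4 * MB) // MB, "MB")  # 512 MB with 4MB pages
```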
62
Program Structure
    int[][] A = new int[1024][1024];
- Each row is stored in one page.
- Program 1 (column by column): 1024 x 1024 page faults, assuming the process has fewer than 1024 frames
    for (int j = 0; j < A.length; j++)
        for (int i = 0; i < A.length; i++)
            A[i][j] = 0;   // touches a different page on every access
- Program 2 (row by row): 1024 page faults
    for (int i = 0; i < A.length; i++)
        for (int j = 0; j < A.length; j++)
            A[i][j] = 0;   // finishes one page before moving to the next
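The two fault counts can be checked with a small simulation. The sketch below assumes LRU replacement and a process that holds fewer frames than the array has rows; neither assumption is stated on the slide, and `count_faults` is a hypothetical helper. A 64 x 64 array is used so the example runs instantly, but the same n x n versus n pattern holds for 1024.

```python
from collections import OrderedDict


def count_faults(n, frames, column_first):
    """Zero an n x n array whose rows each fill one page; count LRU page faults."""
    resident = OrderedDict()  # pages in memory, least recently used first
    faults = 0
    for outer in range(n):
        for inner in range(n):
            # The row index selects the page. Program 1 varies the row in the
            # inner loop; Program 2 varies it in the outer loop.
            page = inner if column_first else outer
            if page in resident:
                resident.move_to_end(page)        # mark as most recently used
            else:
                faults += 1
                if len(resident) >= frames:
                    resident.popitem(last=False)  # evict least recently used
                resident[page] = True
    return faults


print(count_faults(64, 8, column_first=True))   # 4096 = 64 x 64
print(count_faults(64, 8, column_first=False))  # 64
```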
63
The address sequence generated by tracing a particular program executing in a pure demand paging system with 100 bytes per page is 0100, 0200, 0430, 0499, 0510, 0530, 0560, 0120, 0220, 0240, 0260, 0320, Suppose that the memory can store only one page and that, if x is the address which causes a page fault, the bytes from addresses x to x + 99 are loaded into memory. How many page faults will occur?
64
A CPU generates 32-bit virtual addresses. The page size is 4 KB. The processor has a translation look-aside buffer (TLB) which can hold a total of 128 page table entries and is 4-way set associative. The minimum size of the TLB tag is:
65
Consider a system with a single level paging scheme in which a regular memory access takes 150 nanoseconds, and servicing a page fault takes 8 milliseconds. An average instruction takes 100 nanoseconds of CPU time and two memory accesses. The TLB hit ratio is 90% and the page fault rate is one in every 10,000 instructions. What is the effective average instruction execution time?