MAMAS – Computer Architecture Address spaces and Memory management Dr

Name: MAMAS – Computer Architecture Address spaces and Memory management Dr
Uploaded: 2017-12-15T06:38:50+00:00
Duration: PTM24S40
Channel: Lindsey Plough
Description: MAMAS – Computer Architecture Address spaces and Memory management Dr

MAMAS – Computer Architecture Address spaces and Memory management Dr
MAMAS – Computer Architecture Address spaces and Memory management Dr. Avi Mendelson Some of the slides were taken from: (1) Jim Smith (2) Patterson presentations.

Memory System Problems
Different Programs have different memory requirements How to manage program placement? Different machines have different amount of memory How to run the same program on many different machines? At any given time each machine runs a different set of programs How to fit the program mix into memory? Reclaiming unused memory? Moving code around? The amount of memory consumed by each program is dynamic (changes over time) How to effect changes in memory location: add or subtract space? Program bugs can cause a program to generate reads and writes outside the program address space How to protect one program from another?

Address Spaces

Address Space of a single process (Linux)
Each process has a fixed size address space that can be divided into: Symbol tables and other management areas Code Data “static data” allocated by compiler “Dynamic data allocated by “malloc” Stack Managed automatically memory invisible to user code kernel virtual memory %esp stack the “brk” ptr runtime heap (via malloc) uninitialized data (.bss) initialized data (.data) program text (.text) forbidden

Virtual Memory Main memory can act as a cache for the secondary storage (disk) Divide memory (virtual and physical) into fixed size blocks (Pages, Frames) Pages in Virtual space, Frames in Physical space Page size = Frame size Page size is a power of 2: page size = 2k All pages in the virtual address space are contiguous Pages can be mapped into physical Frames in any order Some of the pages are in main memory (DRAM), some of the pages are on disk. Physical Addresses Virtual Addresses Address Translation Disk Addresses

Managing Multiple Virtual Address spaces
Each process “sees” a private address space of 4G. 2G private address space the process can use 2G system address space, shared by all processes, and can be accessed only if the system at “supervisor” mode (kernel mode). All processes are sharing the same (small) physical memory Small part of the address space is kept in main memory (DRAM), most of the address space is either not mapped or is kept on disk. All programs are written using Virtual Memory Address Space The hardware does on-the-fly translation between virtual and physical address spaces. 2GB 2GB U Area 2GB U Area U Area 2GB Virtual address spaces

How a process is created (exec)
The executable file contains the code, the initial values for the data area and initial tables (symbol tables etc). A page table is created to map the virtual addresses to the physical addresses At the start time, no page is in the main memory, so Code pages are pointed to the disk where the code section within the file is The initial data pages (.data) are mapped to the .data section within the executable file. The system may allocate pages for initial stack space, data space (.BSS) and for the heap area (this is a performance optimization, the system can avoid it). As the execution continues, the system allocates pages in the main memory, copies their content from the disk and change the pointer to point to the main memory. When a page is replaced from the main memory, the system keeps it in a special place on the disk, called swap area.

Virtual Memory can help to share address spaces
If several processes map different pages to the same physical page, the data/code of that page is shared (we will discuss how the OS takes advantage of this feature later on) Physical Address Space (DRAM) Virtual Address Space for Process 1: Address Translation VP 1 PP 2 VP 2 ... N-1 (e.g., read only or library code) PP 7 Virtual Address Space for Process 2: VP 1 VP 2 PP 10 ... M-1 N-1

VM can help process protection
Page table entry contains access rights information hardware enforces this protection (trap into OS if violation occurs) Page Tables Memory Physical Addr Read? Write? PP 9 Yes No PP 4 XXXXXXX VP 0: VP 1: VP 2: • 0: 1: N-1: Process i: Physical Addr Read? Write? PP 6 Yes PP 9 No XXXXXXX • VP 0: VP 1: VP 2: Process j:

Virtual Memory Implementation

Page Tables Page Table Physical Memory Disk Virtual page number
Physical Page Or Disk Address Virtual page number Physical Memory Valid 1 1 1 1 1 1 1 Disk 1 1

Virtual to Physical Address translation
Virtual Address 3 1 3 2 9 2 8 2 7 1 5 1 4 1 3 1 2 1 1 1 9 8 3 2 1 Virtual page number Page offset Page Table 2 9 2 8 2 7 1 5 1 4 1 3 1 2 1 1 1 9 8 3 2 1 Physical page number Page offset Physical Addresses Page size: 212 byte =4K byte

The Page Table Virtual Address Physical Address Virtual Page Number
31 12 11 Virtual Page Number Page offset V D AC Frame number Page table base reg Access Control Dirty bit 1 Valid bit 29 12 11 Physical Frame Number Page offset Physical Address

Address Mapping Algorithm
If V = 1 then page is in main memory at frame address stored in table  Access data else (page fault) need to fetch page from disk  causes a trap, usually accompanied by a context switch: current process is suspended while page is fetched from disk Access Control (R = Read-only, R/W = read/write, X = execute only) If kind of access not compatible with specified access rights then protection_violation_fault  causes trap to hardware, or software fault handler Missing item fetched from secondary memory only on the occurrence of a fault  demand load policy

Managing virtual addresses
Virtual memory management is mainly done by the OS. Deciding which pages should be in memory and which – on disk. Deciding when to write pages to disk. Deciding what access rights are assigned to pages. Et cetera, et cetera. The exact algorithms are out of the scope of this course. Here, we study two hardware mechanisms Pseudo LRU mechanism – to decide which pages to keep TLB – to perform fast lookup

Page Replacement Algorithm (pseudo LRU)
Not Recently Used (NRU) Associated with each page is a reference flag such that ref flag = 1 if the page has been referenced in recent past If replacement is needed, choose any page frame such that its reference bit is 0. This is a page that has not been referenced in the recent past Clock implementation of NRU: 1 0 page table entry Ref bit last replaced pointer (lrp) if replacement is to take place, advance lrp to next entry (mod table size) until one with a 0 bit is found; this is the target for replacement; As a side effect, all examined PTE's have their reference bits set to zero. Possible optimization: search for a page that is both not recently referenced AND not dirty

Page Faults Page faults: the data is not in memory  retrieve it from disk The CPU must detect situation The CPU cannot remedy the situation (has no knowledge of the disk) CPU must trap to the operating system so that it can remedy the situation Pick a page to discard (possibly writing it to disk) Load the page in from disk Update the page table Resume to program so HW will retry and succeed! Page fault incurs a huge miss penalty Pages should be fairly large (e.g., 4KB) Can handle the faults in software instead of hardware Page fault causes a context switch Using write-through is too expensive so we use write-back

Virtual Memory Unix (VAX)
Swappable area VAX was the first system to run UNIX, and so many of the UNIX mechanisms are based on the VAX implementation. Unix distinguishes between 3 different address spaces, each of them is controlled by a separate page table The system address space – one per system User code + data address space – one per process User Stack address space – one per process. Only the system page table (one table) resists in the main memory all the other tables and address spaces are pageable. code stack code stack User space System space User PT Run Kernel Sys page table PP table

VM in VAX: Address Format
Virtual Address 31 30 29 9 8 Virtual Page Number Page offset P0 process space code and data P1 process space stack S0 system space S1 not used Physical Address 29 9 8 Physical Frame Number Page offset Page size: 29 = 512 bytes

Page Table Entry (PTE) This is the structure of all the entries in each of the page tables. V PROT M Z OWN S S Physical Frame Number 31 20 Valid bit =1 if page mapped to main memory, otherwise page on the swap area and the address indicates where I can find the page on the disk 4 Protection bits Modified bit 3 ownership bits Indicate if the line was cleaned (zero)

Address Translation The System address space is divided into “resident” part that always exists in the main memory and the rest of the 2G address space of the system is swappable. User space is always swappable. The system’s page table is part of the resident area and the SBR register points on its physical base address. For each process, there are two dedicated registers: P0BR: points to the virtual address (in the system’s address space) of the code segment page table of the user P1BR: points to the virtual address (in the system’s address space) of the stack segment page table of the user

System Space - Address Translation
From the virtual address of the system’s address space we can extract the virtual page number (VPN). Assuming that each entry in the system address space is 4 bytes, the PTE that contains the physical address of the line exists in SBR+VPN*4. If V=1, we can extract the physical address of the page. If V=0, it causes a page fault and we need to bring the page from the disk. SBR VPN*4 PFN 10 offset 8 29 9 VPN 31

P0/P1 Address Translation
Address translation need to be done in few phases: Find the address of the proper PTE in the user level page table. Since the PTE address is within the system’s address space, check if the page that contains the PTE is in the main memory If page exists, check if the user page is in the main memory If the page does not exist, causes a page fault to bring the value of the PTE and only than, we can check if the address exists in the memory or not. We may have up to 2 page-faults in the translation process of the singe access to a user level address.

P0/P1 Space Address Translation (cont)
00 offset 8 29 9 VPN 31 10 Offset’ 8 29 9 VPN’ 31 P0BR+VPN*4 SBR VPN’*4 PFN’ Offset’ 8 29 9 PFN’ Physical addr of PTE PFN Offset 8 29 9 PFN

Handling large address space
For a large virtual address space, we may get a large page table, for example: For a 2 GB virtual address space, page size of 4KB, we get 231/212=219 entries Assuming each entry is 4 bytes wide, page table alone will occupy 2MB in main memory Two proposed techniques to handle it: Hierarchical page tables “segmentation”

Large Address Spaces Solution: use two-level Page Tables
Master page table resides in physical memory Secondary page tables reside in virtual address space Virtual address format At the lower levels of the tree we will keep only the segments of the tables we use 4 bytes 4KB 1K PTEs P1 index P2 index page offest 10 12

Virtual Memory – Memory-management
We do not need to map the addresses between the TOS (Top Of Stack) and the Break point The system needs to know how to map new regions when needed Virtual Addresses Physical Addresses Address Translation Disk Addresses

The VAX Solution Segmentation
p0 Map only the sections the program uses Need to extend at run time Stack manipulation SBRK command p1

How a process is forked When a process is forked, the new process and the old processes starts at the same point with the same state. The only difference between the two processes is that the Fork command returns the PID to the original process and 0 for the “new born” process. When a process is forked, we try to avoid to copy physical pages At the initial point, the translation tables are copied All pages in the memory that belong to the original process are marked as “read-only” Only when one of the processes tries to change the page, we copy the page to private copies and allow private writable copies. (this mechanism is called COW – Copy On Write)

TLB – hardware support for address translation
TLB is a cache that keeps the last translations the system made, using the virtual address as an index If the translation was found in the TLB, no further translation is needed. If it misses, we need to continue the process as described before.

Making Address Translation Fast
TLB (Translation Lookaside Buffer) is a cache for recent address translations: TLB Valid Tag Physical Page Virtual page number 1 Physical Memory 1 1 1 1 Page Table Valid 1 1 Disk 1 1 1 1 Physical Page Or Disk Address 1 1 1

P0 space Address translation Using TLB
VPN offset Memory Access Process TLB Access Process TLB hit? No Calculate PTE virtual addr (in S0): P0BR+4*VPN Yes System TLB Access Get PTE of req page from the proc. TLB System TLB hit? Access Sys Page Table in SBR+4*VPN(PTE) No Yes PFN Get PTE from system TLB Calculate physical address PFN Get PTE of req page from the process Page table Access Memory

Virtual Memory And Cache
Yes No TLB Access TLB Hit ? Access Page Table Cache Virtual Address Hit ? Memory Physical Addresses Data TLB access is serial with cache access

Overlapped TLB & Cache Access
Virtual Memory view of a Physical Address 3 2 1 9 8 5 4 7 Physical page number Page offset Cache view of a Physical Address 3 2 1 9 8 5 4 7 Tag # Set Disp In the above example #Set is not contained within the Page Offset The #Set is not known Until the Physical page number is known Cache can be accessed only after address translation done.

Overlapped TLB & Cache Access (cont)
Virtual Memory view of a Physical Address 3 2 1 9 8 5 4 7 Physical page number Page offset Cache view of a Physical Address 3 2 1 9 8 5 4 7 Tag # Set Disp In the above example #Set is contained within the Page Offset The #Set is known immediately Cache can be accessed in parallel with address translation First the tags from the appropriate set are brought Tag comparison takes place only after the Physical page number is known (after address translation done).

Overlapped TLB & Cache Access (cont)
First Stage Virtual Page Number goes to TLB for translation Page offset goes to cache to get all tags from appropriate set Second Stage The Physical Page Number, obtained from the TLB, goes to the cache for tag comparison. Limitation: Cache < (page size * associativity) How can we overcome this limitation ? Fetch 2 sets and mux after translation Increase associativity

Virtual memory and process switch
Whenever we switch the execution between processes we need to: Save the state of the process Load the state of the new process Make sure that the P0BPT and P1BPT registers are pointing to the new translation tables Clean the TLB We do not need to clean the cache (Why????)

MAMAS – Computer Architecture Address spaces and Memory management Dr

Similar presentations

Presentation on theme: "MAMAS – Computer Architecture Address spaces and Memory management Dr"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

MAMAS – Computer Architecture Address spaces and Memory management Dr

Similar presentations

Presentation on theme: "MAMAS – Computer Architecture Address spaces and Memory management Dr"— Presentation transcript:

Similar presentations

About project

Feedback