Presentation on theme: "MAMAS – Computer Architecture Address spaces and Memory management Dr"— Presentation transcript:
1 MAMAS – Computer Architecture Address spaces and Memory management Dr MAMAS – Computer Architecture Address spaces and Memory management Dr. Avi MendelsonSome of the slides were taken from:(1) Jim Smith (2) Patterson presentations.
2 Memory System Problems Different Programs have different memory requirementsHow to manage program placement?Different machines have different amount of memoryHow to run the same program on many different machines?At any given time each machine runs a different set of programsHow to fit the program mix into memory? Reclaiming unused memory? Moving code around?The amount of memory consumed by each program is dynamic (changes over time)How to effect changes in memory location: add or subtract space?Program bugs can cause a program to generate reads and writes outside the program address spaceHow to protect one program from another?
4 Address Space of a single process (Linux) Each process has a fixed size address space that can be divided into:Symbol tables and other management areasCodeData“static data” allocated by compiler“Dynamic data allocated by “malloc”StackManaged automaticallymemory invisible to user codekernel virtual memory%espstackthe “brk” ptrruntime heap (via malloc)uninitialized data (.bss)initialized data (.data)program text (.text)forbidden
5 Virtual MemoryMain memory can act as a cache for the secondary storage (disk)Divide memory (virtual and physical) into fixed size blocks (Pages, Frames)Pages in Virtual space, Frames in Physical spacePage size = Frame sizePage size is a power of 2: page size = 2kAll pages in the virtual address space are contiguousPages can be mapped into physical Frames in any orderSome of the pages are in main memory (DRAM), some of the pages are on disk.Physical AddressesVirtual AddressesAddressTranslationDisk Addresses
6 Managing Multiple Virtual Address spaces Each process “sees” a private address space of 4G.2G private address space the process can use2G system address space, shared by all processes, and can be accessed only if the system at “supervisor” mode (kernel mode).All processes are sharing the same (small) physical memorySmall part of the address space is kept in main memory (DRAM), most of the address space is either not mapped or is kept on disk.All programs are written using Virtual Memory Address SpaceThe hardware does on-the-fly translation between virtual and physical address spaces.2GB2GBU Area2GBU AreaU Area2GBVirtual address spaces
7 How a process is created (exec) The executable file contains the code, the initial values for the data area and initial tables (symbol tables etc).A page table is created to map the virtual addresses to the physical addressesAt the start time, no page is in the main memory, soCode pages are pointed to the disk where the code section within the file isThe initial data pages (.data) are mapped to the .data section within the executable file.The system may allocate pages for initial stack space, data space (.BSS) and for the heap area (this is a performance optimization, the system can avoid it).As the execution continues, the system allocates pages in the main memory, copies their content from the disk and change the pointer to point to the main memory.When a page is replaced from the main memory, the system keeps it in a special place on the disk, called swap area.
8 Virtual Memory can help to share address spaces If several processes map different pages to the same physical page, the data/code of that page is shared (we will discuss how the OS takes advantage of this feature later on)PhysicalAddressSpace (DRAM)Virtual Address Space for Process 1:Address TranslationVP 1PP 2VP 2...N-1(e.g., read only or library code)PP 7Virtual Address Space for Process 2:VP 1VP 2PP 10...M-1N-1
9 VM can help process protection Page table entry contains access rights informationhardware enforces this protection (trap into OS if violation occurs)Page TablesMemoryPhysical AddrRead?Write?PP 9YesNoPP 4XXXXXXXVP 0:VP 1:VP 2:•0:1:N-1:Process i:Physical AddrRead?Write?PP 6YesPP 9NoXXXXXXX•VP 0:VP 1:VP 2:Process j:
14 Address Mapping Algorithm If V = 1 thenpage is in main memory at frame address stored in table Access dataelse (page fault)need to fetch page from disk causes a trap, usually accompanied by a context switch:current process is suspended while page is fetched from diskAccess Control (R = Read-only, R/W = read/write, X = execute only)If kind of access not compatible with specified access rights then protection_violation_fault causes trap to hardware, or software fault handlerMissing item fetched from secondary memory only on the occurrence of a fault demand load policy
15 Managing virtual addresses Virtual memory management is mainly done by the OS.Deciding which pages should be in memory and which – on disk.Deciding when to write pages to disk.Deciding what access rights are assigned to pages.Et cetera, et cetera.The exact algorithms are out of the scope of this course.Here, we study two hardware mechanismsPseudo LRU mechanism – to decide which pages to keepTLB – to perform fast lookup
16 Page Replacement Algorithm (pseudo LRU) Not Recently Used (NRU)Associated with each page is a reference flag such that ref flag = 1 if the page has been referenced in recent pastIf replacement is needed, choose any page frame such that its reference bit is 0.This is a page that has not been referenced in the recent pastClock implementation of NRU:1 0page table entryRef bitlast replaced pointer (lrp)if replacement is to take place,advance lrp to next entry (modtable size) until one with a 0 bitis found; this is the target forreplacement; As a side effect,all examined PTE's have theirreference bits set to zero.Possible optimization: search for a page that is both not recently referenced AND not dirty
17 Page FaultsPage faults: the data is not in memory retrieve it from diskThe CPU must detect situationThe CPU cannot remedy the situation (has no knowledge of the disk)CPU must trap to the operating system so that it can remedy the situationPick a page to discard (possibly writing it to disk)Load the page in from diskUpdate the page tableResume to program so HW will retry and succeed!Page fault incurs a huge miss penaltyPages should be fairly large (e.g., 4KB)Can handle the faults in software instead of hardwarePage fault causes a context switchUsing write-through is too expensive so we use write-back
18 Virtual Memory Unix (VAX) Swappable areaVAX was the first system to run UNIX, and so many of the UNIX mechanisms are based on the VAX implementation.Unix distinguishes between 3 different address spaces, each of them is controlled by a separate page tableThe system address space – one per systemUser code + data address space – one per processUser Stack address space – one per process.Only the system page table (one table) resists in the main memory all the other tables and address spaces are pageable.codestackcodestackUser spaceSystem spaceUser PTRunKernelSys page tablePP table
19 VM in VAX: Address Format Virtual Address31302998Virtual Page NumberPage offsetP0 process space code and dataP1 process space stackS0 system spaceS1 not usedPhysical Address2998Physical Frame NumberPage offsetPage size: 29 = 512 bytes
20 Page Table Entry (PTE)This is the structure of all the entries in each of the page tables.V PROT M Z OWN S SPhysical Frame Number3120Valid bit =1 if page mapped to main memory, otherwise page on the swap area and the address indicates where I can find the page on the disk4 Protection bitsModified bit3 ownership bitsIndicate if the line was cleaned (zero)
21 Address TranslationThe System address space is divided into “resident” part that always exists in the main memory and the rest of the 2G address space of the system is swappable.User space is always swappable.The system’s page table is part of the resident area and the SBR register points on its physical base address.For each process, there are two dedicated registers:P0BR: points to the virtual address (in the system’s address space) of the code segment page table of the userP1BR: points to the virtual address (in the system’s address space) of the stack segment page table of the user
22 System Space - Address Translation From the virtual address of the system’s address space we can extract the virtual page number (VPN).Assuming that each entry in the system address space is 4 bytes, the PTE that contains the physical address of the line exists in SBR+VPN*4.If V=1, we can extract the physical address of the page.If V=0, it causes a page fault and we need to bring the page from the disk.SBRVPN*4PFN10offset8299VPN31
23 P0/P1 Address Translation Address translation need to be done in few phases:Find the address of the proper PTE in the user level page table.Since the PTE address is within the system’s address space, check if the page that contains the PTE is in the main memoryIf page exists, check if the user page is in the main memoryIf the page does not exist, causes a page fault to bring the value of the PTE and only than, we can check if the address exists in the memory or not.We may have up to 2 page-faults in the translation process of the singe access to a user level address.
24 P0/P1 Space Address Translation (cont) 00offset8299VPN3110Offset’8299VPN’31P0BR+VPN*4SBRVPN’*4PFN’Offset’8299PFN’Physical addrof PTEPFNOffset8299PFN
25 Handling large address space For a large virtual address space, we may get a large page table, for example:For a 2 GB virtual address space, page size of 4KB, we get 231/212=219 entriesAssuming each entry is 4 bytes wide, page table alone will occupy 2MB in main memoryTwo proposed techniques to handle it:Hierarchical page tables“segmentation”
26 Large Address Spaces Solution: use two-level Page Tables Master page table resides in physical memorySecondary page tables reside in virtual address spaceVirtual address formatAt the lower levels of the tree we will keep only the segments of the tables we use4 bytes4KB1KPTEsP1 indexP2 indexpage offest1012
27 Virtual Memory – Memory-management We do not need to map the addresses between the TOS (Top Of Stack) and the Break pointThe system needs to know how to map new regions when neededVirtual AddressesPhysical AddressesAddressTranslationDisk Addresses
28 The VAX Solution Segmentation p0Map only the sections the program usesNeed to extend at run timeStack manipulationSBRK commandp1
29 How a process is forkedWhen a process is forked, the new process and the old processes starts at the same point with the same state. The only difference between the two processes is that the Fork command returns the PID to the original process and 0 for the “new born” process.When a process is forked, we try to avoid to copy physical pagesAt the initial point, the translation tables are copiedAll pages in the memory that belong to the original process are marked as “read-only”Only when one of the processes tries to change the page, we copy the page to private copies and allow private writable copies. (this mechanism is called COW – Copy On Write)
30 TLB – hardware support for address translation TLB is a cache that keeps the last translations the system made, using the virtual address as an indexIf the translation was found in the TLB, no further translation is needed.If it misses, we need to continue the process as described before.
31 Making Address Translation Fast TLB (Translation Lookaside Buffer) is a cache for recent address translations:TLBValid Tag Physical PageVirtual page number1Physical Memory1111Page TableValid11Disk1111Physical PageOrDisk Address111
32 P0 space Address translation Using TLB VPN offsetMemory AccessProcessTLB AccessProcessTLB hit?NoCalculate PTE virtual addr(in S0): P0BR+4*VPNYesSystem TLB AccessGet PTE of reqpage from theproc. TLBSystemTLB hit?Access Sys PageTable inSBR+4*VPN(PTE)NoYesPFNGet PTE from system TLBCalculate physicaladdressPFNGet PTE of req page fromthe process Page tableAccess Memory
33 Virtual Memory And Cache YesNoTLB AccessTLB Hit ?AccessPage TableCacheVirtual AddressHit ?MemoryPhysicalAddressesDataTLB access is serial with cache access
34 Overlapped TLB & Cache Access Virtual Memory view of a Physical Address32198547Physical page numberPage offsetCache view of a Physical Address32198547Tag# SetDispIn the above example #Set is not contained within the Page OffsetThe #Set is not known Until the Physical page number is knownCache can be accessed only after address translation done.
35 Overlapped TLB & Cache Access (cont) Virtual Memory view of a Physical Address32198547Physical page numberPage offsetCache view of a Physical Address32198547Tag# SetDispIn the above example #Set is contained within the Page OffsetThe #Set is known immediatelyCache can be accessed in parallel with address translationFirst the tags from the appropriate set are broughtTag comparison takes place only after the Physical page number is known (after address translation done).
36 Overlapped TLB & Cache Access (cont) First StageVirtual Page Number goes to TLB for translationPage offset goes to cache to get all tags from appropriate setSecond StageThe Physical Page Number, obtained from the TLB,goes to the cache for tag comparison.Limitation: Cache < (page size * associativity)How can we overcome this limitation ?Fetch 2 sets and mux after translationIncrease associativity
37 Virtual memory and process switch Whenever we switch the execution between processes we need to:Save the state of the processLoad the state of the new processMake sure that the P0BPT and P1BPT registers are pointing to the new translation tablesClean the TLBWe do not need to clean the cache (Why????)