Presentation on theme: "Main Memory Operating System Concepts chapter 8 CS 355"— Presentation transcript:
1 Main Memory
Operating System Concepts, chapter 8
CS 355 Operating Systems
Dr. Matthew Wright
2 Background
A process generates a stream of memory requests. Memory access takes many CPU clock cycles, during which the processor may have to stall as it waits for data.
The operating system and hardware work together to:
Provide the data that the process requests
Protect data from improper access by another process
We will ignore details about caches, because they are managed by the hardware.
3 Address Binding
Address binding involves converting a variable in the source program to a physical address in memory. Binding can be done at any of the following stages:
Compile time: If the memory location of the program is known when it is compiled, then absolute code can be generated. If the location changes, the code must be recompiled.
Load time: If the memory location is not known at compile time, the compiler must generate relocatable code. Binding is then delayed until the program is loaded into memory.
Execution time: If the process can be moved during its execution from one memory segment to another, then binding must be delayed until run time. This is the most common method, though it requires hardware support.
4 Base and Limit Registers
Each process has a separate memory space.
Protection can be achieved by using a base register and a limit register.
Only the operating system (in kernel mode) can set the base and limit registers.
If the CPU requests an absolute memory location M, the hardware verifies that base ≤ M < base + limit.
If this is not the case, the hardware generates a trap to the operating system.
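The range check on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `is_legal_access` and the sample register values are assumptions, and real hardware performs the comparison on every access and traps rather than returning a flag.

```python
def is_legal_access(base: int, limit: int, address: int) -> bool:
    """Return True when base <= address < base + limit."""
    return base <= address < base + limit


# Hypothetical register values, chosen only for illustration.
BASE, LIMIT = 300040, 120900

for addr in (300040, 420939, 420940, 1000):
    outcome = "ok" if is_legal_access(BASE, LIMIT, addr) else "trap to OS"
    print(addr, outcome)
```

Note that the upper bound is exclusive: the address `base + limit` itself is already outside the process's memory space.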
5 Logical and Physical Addresses
Logical address: generated by a process running on the CPU, also called a virtual address
Physical address: seen by the memory unit
Memory-Management Unit (MMU): hardware device that maps logical addresses to physical addresses
A simple MMU adds the value stored in a relocation register to each logical address that arrives from the CPU.
6 Fundamental Problem
Problems:
Main memory is not large enough to hold all the programs on a system.
Main memory is volatile.
Memory requirements of processes may change over time.
Solutions:
Swapping: moving processes between memory and disk
Paging: allocating fixed-size blocks of memory to processes
Segmentation: allocating blocks of memory to processes according to the needs of constructs within each process
7 Swapping
A process can be swapped temporarily out of memory to a backing store, and then brought back into memory for continued execution.
Backing store: fast disk large enough to accommodate copies of all memory images for all users
Roll out, roll in: swapping variant used for priority-based scheduling algorithms; a lower-priority process is swapped out so that a higher-priority process can be loaded and executed
The major part of swap time is transfer time. Total transfer time is directly proportional to the amount of memory swapped.
Swapping is slow, but it can be necessary if the main memory cannot hold all running processes.
Modified versions of swapping are found on many systems (e.g., UNIX, Linux, and Windows).
8 Memory Allocation
How do we allocate memory to processes?
Contiguous memory allocation: each process is contained in a single contiguous section of memory.
The OS reserves some memory for itself (often with low addresses).
When the scheduler selects a process for execution, the dispatcher sets the relocation (base) and limit registers.
For protection, every requested address is checked against these registers.
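A minimal sketch of how the relocation and limit registers could be used together under contiguous allocation. It assumes the common scheme in which the logical address is compared with the limit first and then added to the relocation value; the names `relocate` and `AddressTrap` and the sample numbers are illustrative, not taken from the slides.

```python
class AddressTrap(Exception):
    """Stands in for the hardware trap raised on an illegal address."""


def relocate(logical: int, relocation: int, limit: int) -> int:
    """Map a logical address to a physical one, or trap if out of range."""
    # Protection check: the logical address must lie in [0, limit).
    if not 0 <= logical < limit:
        raise AddressTrap(f"logical address {logical} outside limit {limit}")
    # Relocation: physical address = logical address + relocation register.
    return logical + relocation


# Hypothetical values the dispatcher might load for one process.
print(relocate(346, relocation=14000, limit=1000))  # 14346
```

Because only the registers change on a context switch, the same program can run unmodified wherever the OS places it.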
9 Memory Allocation
We could divide memory into various partitions, each of which holds a process.
Variable-partition scheme: the OS keeps a table indicating which parts of memory are occupied and which are available.
A block of available memory is called a hole.
If a new process arrives, the OS finds a hole for it.
If a hole is too big for the process, the OS splits it into two parts: one for the process, and one that is returned to the set of holes.
Two adjacent holes can be combined to make one larger hole.
10 Memory Allocation
Dynamic storage-allocation problem: How do we choose where to place a process from a list of holes?
First fit: allocate the first hole that is big enough
Best fit: allocate the smallest hole that is big enough (requires searching the entire list of holes, unless they are ordered by size)
Worst fit: allocate the largest hole (also requires searching the entire list; produces the largest leftover hole)
First fit and best fit are generally better than worst fit in terms of speed and storage utilization. First fit is generally faster.
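The three placement strategies, together with the hole-splitting step from the previous slide, can be sketched as follows. The hole list, the request size, and all function names are hypothetical, and holes are kept in a plain Python list ordered by address rather than in whatever structure a real allocator would use.

```python
from typing import List, Optional, Tuple

Hole = Tuple[int, int]  # (start address, size)


def first_fit(holes: List[Hole], request: int) -> Optional[int]:
    """Index of the first hole that is big enough, or None."""
    for i, (_, size) in enumerate(holes):
        if size >= request:
            return i
    return None


def best_fit(holes: List[Hole], request: int) -> Optional[int]:
    """Index of the smallest hole that is big enough, or None."""
    fits = [(size, i) for i, (_, size) in enumerate(holes) if size >= request]
    return min(fits)[1] if fits else None


def worst_fit(holes: List[Hole], request: int) -> Optional[int]:
    """Index of the largest hole, provided it is big enough, or None."""
    fits = [(size, i) for i, (_, size) in enumerate(holes) if size >= request]
    return max(fits)[1] if fits else None


def allocate(holes: List[Hole], index: int, request: int) -> int:
    """Carve `request` units out of holes[index]; return the start address."""
    start, size = holes[index]
    if size == request:
        holes.pop(index)  # the hole is used up exactly
    else:
        holes[index] = (start + request, size - request)  # leftover hole
    return start


# Hypothetical free list and request size.
holes = [(0, 100), (200, 500), (800, 200), (1100, 300), (1500, 600)]
print(first_fit(holes, 212), best_fit(holes, 212), worst_fit(holes, 212))
```

With this list, a request of 212 units lands in the 500-unit hole under first fit, the 300-unit hole under best fit, and the 600-unit hole under worst fit.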
11 Fragmentation
Over time, memory becomes fragmented.
External fragmentation: enough total memory space exists to satisfy a request, but it is not contiguous (many small holes)
50-percent rule: statistically, if N partitions are allocated via first fit, another 0.5N partitions are lost to fragmentation.
Internal fragmentation: allocated memory may be slightly larger than requested memory; this size difference is memory internal to a partition, but not being used
Example: A process requests 8424 bytes, and we have a hole of 8428 bytes. It's not worth keeping track of a 4-byte hole, so we allocate the entire 8428 bytes, and we lose 4 bytes to internal fragmentation.
We could reduce external fragmentation by compaction: shuffling memory contents to place all free memory together in one large block.
Compaction is slow, and it is possible only if address binding is done at execution time.
12 Paging
Paging allows the physical address space of a process to be noncontiguous and avoids external fragmentation.
Paging is handled by the hardware, together with the operating system.
Idea:
Divide physical memory into fixed-size blocks called frames.
Divide logical memory into blocks called pages, of the same size as the frames.
To run a process, place each of its pages into an available frame in memory.
Set up a table to translate logical to physical addresses.
13 Paging Hardware
Every address from the CPU is divided into a page number and a page offset.
A page size that is a power of 2 makes this particularly easy, because the split then falls on a bit boundary.
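When the page size is a power of 2, splitting an address into page number and offset needs only a shift and a mask. The sketch below assumes a 12-bit offset (4 KB pages) purely as an example; `split_address` is an illustrative name.

```python
from typing import Tuple


def split_address(address: int, offset_bits: int) -> Tuple[int, int]:
    """Split a logical address into (page number, page offset)."""
    page = address >> offset_bits                # high-order bits
    offset = address & ((1 << offset_bits) - 1)  # low-order bits
    return page, offset


page, offset = split_address(0x12345, offset_bits=12)
print(hex(page), hex(offset))  # 0x12 0x345
```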
14 Paging Example
In this example:
Logical memory contains 16 bytes: 4 pages of 4 bytes each.
Page table contains 4 entries.
Physical memory consists of 32 bytes: 8 frames of 4 bytes each.
In reality, pages are often 4 KB or 8 KB, though they can be much larger.
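The small configuration above (4-byte pages, 4 page-table entries, 8 frames) can be walked through in code. The slide text does not preserve the actual page-to-frame mapping, so the table contents below are an assumption made only to have something to translate.

```python
PAGE_SIZE = 4              # bytes per page and per frame
NUM_FRAMES = 8             # 32 bytes of physical memory
PAGE_TABLE = [5, 6, 1, 2]  # hypothetical: page i lives in frame PAGE_TABLE[i]


def page_translate(logical: int) -> int:
    """Translate a logical address (0..15) to a physical address (0..31)."""
    if not 0 <= logical < PAGE_SIZE * len(PAGE_TABLE):
        raise ValueError(f"logical address {logical} out of range")
    page, offset = divmod(logical, PAGE_SIZE)
    frame = PAGE_TABLE[page]
    return frame * PAGE_SIZE + offset


for logical in (0, 3, 4, 13):
    print(logical, "->", page_translate(logical))
```

Under this assumed mapping, logical address 13 is page 3, offset 1, which lands in frame 2 at physical address 9.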
15 Paging: User View
A user program views its memory as one contiguous space.
All aspects of paging are hidden from the user program.
A user program cannot access memory that it does not own, for it can only access the pages provided by its page table.
The operating system uses the page table to translate addresses when it interacts with a process's memory (e.g., when a process makes a system call that requests I/O).
The operating system also keeps track of which frames are available in a frame table.
16 Paging: Hardware Support
Memory address translation through the page table must be fast.
A small page table can be implemented as a set of registers.
Modern systems allow large page tables (e.g., 1 million entries), which are kept in main memory. A page-table base register (PTBR) points to the page table.
Problem: If the page table is stored in memory, then accessing a memory location requires two memory accesses (one for the page-table entry, another for the actual memory location), which would be intolerably slow.
Solution: Use a small, fast hardware cache called a translation look-aside buffer (TLB) to store commonly used page-table entries.
When the CPU generates a logical address, the system checks to see if its page number is in the TLB. If so, the frame number is quickly available.
If the page number is not in the TLB (a TLB miss), then the system consults the page table in memory.
After a TLB miss, the system adds the page/frame combination to the TLB, replacing an existing entry if the TLB is full.
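The hit, miss, and replacement behavior described on this slide can be modeled with a small software cache. The slide does not say which entry is replaced when the TLB is full, so least-recently-used replacement is an assumption here, as are the class name `TLB` and the sample page table.

```python
from collections import OrderedDict
from typing import Dict


class TLB:
    """Toy TLB: a bounded page -> frame cache with LRU replacement (assumed)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: "OrderedDict[int, int]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, page: int, page_table: Dict[int, int]) -> int:
        if page in self.entries:
            # TLB hit: the frame number is available without a memory access.
            self.hits += 1
            self.entries.move_to_end(page)  # mark as most recently used
            return self.entries[page]
        # TLB miss: consult the page table in memory, then cache the result.
        self.misses += 1
        frame = page_table[page]
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        self.entries[page] = frame
        return frame


# Hypothetical page table and reference string.
page_table = {0: 5, 1: 6, 2: 1}
tlb = TLB(capacity=2)
for page in (0, 0, 1, 2, 0):
    tlb.lookup(page, page_table)
print(tlb.hits, tlb.misses)
```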
18 Paging: Multiple Processes
Address-space identifiers (ASIDs) are basically process identifiers that allow the TLB to contain entries for multiple processes simultaneously.
In this case, the TLB stores an ASID with each entry.
The ASID of the currently running process is compared with the ASID stored in the TLB entry to provide memory protection.
If the ASIDs don't match, the memory access attempt is treated as a TLB miss.
If the TLB does not support ASIDs, it must be flushed (erased) with each context switch.
19 Paging: TLB Performance
Hit ratio: the percentage of times that a requested page number is found in the TLB
Effective memory-access time: average time required for a memory access
Example: suppose it takes 15 nanoseconds to search the TLB and 100 ns to access memory.
Memory access takes 115 ns if a TLB hit occurs.
Memory access takes 215 ns if a TLB miss occurs.
If the hit ratio is 80%, then: effective access time = 0.80 × 115 + 0.20 × 215 = 135 ns
If the hit ratio is 95%, then: effective access time = 0.95 × 115 + 0.05 × 215 = 120 ns
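Effective access time is a weighted average of the hit and miss costs, which is easy to check numerically. The sketch below uses the slide's 15 ns TLB search and 100 ns memory access as defaults; the function name is an assumption.

```python
def effective_access_time(hit_ratio: float,
                          tlb_ns: float = 15.0,
                          memory_ns: float = 100.0) -> float:
    """Average memory-access time in nanoseconds for a given TLB hit ratio."""
    hit_time = tlb_ns + memory_ns        # TLB search + one memory access
    miss_time = tlb_ns + 2 * memory_ns   # TLB search + page table + data
    return hit_ratio * hit_time + (1.0 - hit_ratio) * miss_time


for ratio in (0.80, 0.95):
    print(f"{ratio:.0%}: {effective_access_time(ratio):.1f} ns")
```

Raising the hit ratio from 80% to 95% cuts the average from about 135 ns to about 120 ns, so most of the two-access penalty disappears.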
20 Paging: Protection
The page table stores various protection bits along with each entry.
One special bit indicates whether a page is read-only.
The valid-invalid bit indicates whether a page is in the logical address space of a given process.
Since few processes use their entire available address space, most of the page table could be unused. The page-table length register (PTLR) indicates the size of the page table, and is checked against every logical address.
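One way to picture these checks is as a sequence of tests applied before a frame number is handed back. The entry layout, the exception name, and the order of the checks below are all assumptions made for illustration; real hardware encodes these bits inside each page-table entry.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageTableEntry:
    frame: int
    valid: bool = True       # page belongs to the process's address space
    read_only: bool = False  # writes are not permitted


class ProtectionFault(Exception):
    """Stands in for the trap raised on an illegal access."""


def check_access(table: List[PageTableEntry], ptlr: int,
                 page: int, write: bool) -> int:
    """Return the frame for `page`, or raise ProtectionFault."""
    if not 0 <= page < ptlr:
        raise ProtectionFault("page number beyond page-table length")
    entry = table[page]
    if not entry.valid:
        raise ProtectionFault("page is marked invalid")
    if write and entry.read_only:
        raise ProtectionFault("write to a read-only page")
    return entry.frame


# Hypothetical three-entry page table.
table = [
    PageTableEntry(frame=3),
    PageTableEntry(frame=7, read_only=True),
    PageTableEntry(frame=0, valid=False),
]
print(check_access(table, ptlr=3, page=1, write=False))  # 7
```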
21 Paging: Shared Pages
Paging makes it relatively easy for processes to share common code.
For example, multiple copies of a text editor might be running at the same time, sharing the same program code.
Shared code must be read-only.
Only one copy of the shared code needs to be kept in memory, and each user's page table maps to the common code.
See the example on the following slide…
23 Paging: Hierarchical Page Tables
With the large address space of modern computers, the page table itself can become too large to store in one contiguous block of memory.
One solution is to use two-level paging: paging the page table.
Example: 32-bit logical address space, 4 KB page size. The 20-bit page number is split into a 10-bit outer index p1 and a 10-bit inner index p2, followed by a 12-bit page offset d.
This is called a forward-mapped page table.
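For a 32-bit address with 4 KB pages, the usual two-level layout is a 10-bit outer index p1, a 10-bit inner index p2, and a 12-bit offset d. The sketch below assumes that layout and models both table levels as nested dictionaries; the function names and the sample mapping are hypothetical.

```python
from typing import Dict, Tuple

OFFSET_BITS = 12  # 4 KB pages
INDEX_BITS = 10   # bits per page-table level


def split_two_level(address: int) -> Tuple[int, int, int]:
    """Split a 32-bit logical address into (p1, p2, d)."""
    d = address & ((1 << OFFSET_BITS) - 1)
    p2 = (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    p1 = (address >> (OFFSET_BITS + INDEX_BITS)) & ((1 << INDEX_BITS) - 1)
    return p1, p2, d


def translate_two_level(outer: Dict[int, Dict[int, int]], address: int) -> int:
    """Walk outer table -> inner table -> frame, then add the offset."""
    p1, p2, d = split_two_level(address)
    inner = outer[p1]  # first memory access: outer page table
    frame = inner[p2]  # second memory access: inner page table
    return (frame << OFFSET_BITS) | d


# Hypothetical mapping: outer entry 3, inner entry 5 -> frame 77.
outer_table = {3: {5: 77}}
address = (3 << 22) | (5 << 12) | 0x2A
print(split_two_level(address), translate_two_level(outer_table, address))
```

Only the inner tables that are actually in use need to exist, which is what keeps the structure small for sparse address spaces.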
24 Paging: Hierarchical Page Tables
For a system with a 64-bit logical address space, two-level paging is not enough.
We could divide the outer page table into smaller pages, resulting in three-level paging.
Often, even three-level paging is not enough.
To keep every page table small, a 64-bit system could require seven levels of paging, which would be much too slow.
25 Paging: Hashed Page Tables
For address spaces larger than 32 bits, a hashed page table may be used.
Each entry in the hash table contains a linked list of elements (to handle collisions).
Each element contains a virtual page number, the value of the mapped page frame, and a pointer to the next element in the list.
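A hashed page table is essentially a chained hash map keyed by virtual page number. In the sketch below a Python list stands in for each linked list, and the hash is simply the page number modulo the bucket count; both choices, along with the class and method names, are simplifying assumptions.

```python
from typing import List, Optional, Tuple


class HashedPageTable:
    """Chained hash table mapping virtual page numbers to frames."""

    def __init__(self, num_buckets: int = 8) -> None:
        # Each bucket holds (virtual page number, frame) pairs.
        self.buckets: List[List[Tuple[int, int]]] = [
            [] for _ in range(num_buckets)
        ]

    def _chain(self, vpn: int) -> List[Tuple[int, int]]:
        return self.buckets[vpn % len(self.buckets)]

    def map(self, vpn: int, frame: int) -> None:
        chain = self._chain(vpn)
        for i, (existing, _) in enumerate(chain):
            if existing == vpn:
                chain[i] = (vpn, frame)  # update an existing mapping
                return
        chain.append((vpn, frame))       # new element at the end of the chain

    def lookup(self, vpn: int) -> Optional[int]:
        # Walk the chain, comparing virtual page numbers.
        for existing, frame in self._chain(vpn):
            if existing == vpn:
                return frame
        return None


table = HashedPageTable()
table.map(1, 10)
table.map(9, 20)  # collides with page 1 when there are 8 buckets
print(table.lookup(1), table.lookup(9), table.lookup(17))
```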
26 Paging: Inverted Page Tables
Usual design:
Each process has its own page table.
Page table entries contain physical addresses, indexed by virtual addresses.
Drawback: a page table may contain many unused entries, and may consume lots of memory.
Inverted page table:
A single, system-wide page table, with one entry for each physical frame of memory
Page table contains virtual addresses, indexed by physical addresses
Decreases the amount of memory required to store the page table
Increases the amount of time required to search the page table
Uses a hash function to make searching faster
Difficult to implement shared memory with this solution
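The inverted layout can be sketched as a list with one slot per physical frame, where each slot records which process and virtual page occupy that frame. This version searches linearly to keep the idea visible; it deliberately omits the hash function mentioned above, and the names are illustrative.

```python
from typing import List, Optional, Tuple


class InvertedPageTable:
    """One entry per physical frame: entries[frame] = (pid, page) or None."""

    def __init__(self, num_frames: int) -> None:
        self.entries: List[Optional[Tuple[int, int]]] = [None] * num_frames

    def map(self, frame: int, pid: int, page: int) -> None:
        self.entries[frame] = (pid, page)

    def lookup(self, pid: int, page: int) -> Optional[int]:
        # Linear search: the index of the matching entry is the frame number.
        for frame, entry in enumerate(self.entries):
            if entry == (pid, page):
                return frame
        return None


ipt = InvertedPageTable(num_frames=8)
ipt.map(frame=5, pid=1, page=0)
ipt.map(frame=2, pid=2, page=0)
print(ipt.lookup(1, 0), ipt.lookup(2, 0), ipt.lookup(3, 0))
```

Since each frame slot names exactly one (process, page) pair, mapping the same frame into two address spaces does not fit naturally, which is the shared-memory difficulty noted above.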
28 Segmentation
Users view memory as a collection of variable-sized segments, with no ordering among segments.
Examples of segments: objects, methods, variables, etc.
With segmentation, a logical address space is a collection of segments.
Each segment has a name and a length.
A user specifies each address by two quantities: a segment name and an offset.
The compiler creates segments, and the loader assigns segment numbers to segment names.
29 Segmentation: Hardware
A segment table maps segment numbers and offsets to physical addresses.
Each entry contains a segment base (the address of the segment in memory) and a segment limit (the length of the segment).
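Segment-table translation combines a limit check with a base addition, one pair per segment. The table contents and names below are hypothetical and serve only to show the mechanics.

```python
from typing import List, Tuple


class SegmentationFault(Exception):
    """Stands in for the trap raised on an illegal segment or offset."""


# Hypothetical segment table: index = segment number, value = (base, limit).
SEGMENT_TABLE: List[Tuple[int, int]] = [
    (1400, 1000),
    (6300, 400),
    (4300, 400),
]


def seg_translate(segment: int, offset: int) -> int:
    """Translate (segment, offset) to a physical address, or trap."""
    if not 0 <= segment < len(SEGMENT_TABLE):
        raise SegmentationFault(f"no such segment: {segment}")
    base, limit = SEGMENT_TABLE[segment]
    if not 0 <= offset < limit:
        raise SegmentationFault(f"offset {offset} exceeds limit {limit}")
    return base + offset


print(seg_translate(2, 53))  # 4353
```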
31 Paging vs. Segmentation
Paging:
Invisible to the user program
Divides physical memory into blocks (frames) that hold pages
Pages are of uniform, predetermined size
Logical address is a positive integer
Segmentation:
Visible to the user program
Divides logical memory into blocks (segments)
Segments are of arbitrary sizes
Logical address is an ordered pair: (segment, offset)
In both cases, the operating system works together with the hardware to translate logical addresses into physical addresses.
32 Example: Intel Pentium
The Intel Pentium supports both pure segmentation and segmentation with paging.
Overview:
CPU generates a logical address, which goes to the segmentation unit.
Segmentation unit produces a linear address, which goes to the paging unit.
Paging unit generates a physical address in main memory.
33 Example: Pentium Segmentation
The logical address space of a process is divided into two partitions.
One partition holds segments that are private to that process, indexed by the local descriptor table (LDT).
The other partition holds segments that are shared, indexed by the global descriptor table (GDT).
Logical addresses consist of a selector and an offset.
The CPU has six segment registers to hold segment selectors, so a process can address six segments at a time.
34 Example: Pentium Paging
The Pentium allows page sizes of either 4 KB or 4 MB.
The page directory includes a page size bit that indicates the size of a given page.
Page tables can be swapped to disk, and another bit in the directory entry indicates whether the page table it points to is in memory or on disk.
35 Example: Linux on the Pentium
Linux is designed to run on various processors, so it uses minimal segmentation.
On the Pentium, Linux uses six segments:
1. A segment for kernel code
2. A segment for kernel data
3. A segment for user code
4. A segment for user data
5. A task-state segment
6. A default LDT segment
Linux uses a three-level paging model that works well for 32-bit and 64-bit architectures, with linear addresses divided into four parts: global directory, middle directory, page table, and offset.
On the Pentium, the size of the middle directory is zero bits, so the paging is effectively two-level.
Each task in Linux has its own set of page tables.
36 Example: Three-level Paging in Linux
On the Pentium, the size of the middle directory is zero.