
1 Chapter 9: Main Memory

2 Chapter 9: Memory Management
- Background
- Swapping
- Contiguous Memory Allocation
- Segmentation
- Paging
- Structure of the Page Table
- Example: The Intel 32 and 64-bit Architectures
- Example: ARM Architecture

3 Objectives
- To provide a detailed description of various ways of organizing memory hardware
- To discuss various memory-management techniques, including paging and segmentation
- To provide a detailed description of the Intel Pentium, which supports both pure segmentation and segmentation with paging

4 Background
- Program must be brought (from disk) into memory and placed within a process for it to be run
- Main memory and registers are the only storage the CPU can access directly
- Memory unit only sees a stream of addresses + read requests, or address + data and write requests
- Register access in one CPU clock (or less)
- Main memory can take many cycles, causing a stall
- Cache sits between main memory and CPU registers
- Protection of memory required to ensure correct operation

5 Background
- Kernel vs. user mode
- What is an address space? How is it implemented?
- Physical memory:
  - No protection
  - Limited size
  - Sharing visible to programs
  - Easy to share data between programs
- Abstraction: virtual memory
  - Each program isolated from all others and from the OS
  - Illusion of infinite memory
  - Transparent – can’t tell if memory is shared
  - Ability to share code, data

6 Virtualizing Resources
- Physical Reality: Different processes/threads share the same hardware
  - Need to multiplex CPU (just finished: scheduling)
  - Need to multiplex use of memory (today)
  - Need to multiplex disk and devices (later in term)
- Why worry about memory sharing?
  - The complete working state of a process and/or kernel is defined by its data in memory (and registers)
  - Consequently, cannot just let different threads of control use the same memory
  - Physics: two different pieces of data cannot occupy the same locations in memory
  - Probably don’t want different threads to even have access to each other’s memory (protection)

7 Recall: Single and Multithreaded Processes
- Threads encapsulate concurrency
  - “Active” component of a process
- Address spaces encapsulate protection
  - Keeps a buggy program from trashing the system
  - “Passive” component of a process

8 Important Aspects of Memory Multiplexing
- Controlled overlap:
  - Separate state of threads should not collide in physical memory. Obviously, unexpected overlap causes chaos!
  - Conversely, would like the ability to overlap when desired (for communication)
- Translation:
  - Ability to translate accesses from one address space (virtual) to a different one (physical)
  - When translation exists, the processor uses virtual addresses, physical memory uses physical addresses
  - Side effects: can be used to avoid overlap; can be used to give a uniform view of memory to programs
- Protection:
  - Prevent access to private memory of other processes
  - Different pages of memory can be given special behavior (read only, invisible to user programs, etc.)
  - Kernel data protected from user programs
  - Programs protected from themselves

9 Base and Limit Registers
- A pair of base and limit registers define the logical address space
- CPU must check every memory access generated in user mode to be sure it is between base and limit for that user

10 Hardware Address Protection

11 Address Binding
- Programs on disk, ready to be brought into memory to execute, form an input queue
- Without support, a program must be loaded into address 0000
  - Inconvenient to have the first user process physical address always at 0000
  - How can it not be?
- Further, addresses are represented in different ways at different stages of a program’s life
  - Source code addresses usually symbolic
  - Compiled code addresses bind to relocatable addresses, i.e. “14 bytes from beginning of this module”
  - Linker or loader will bind relocatable addresses to absolute addresses
  - Each binding maps one address space to another

12 Binding of Instructions and Data to Memory
- Binding of instructions and data to addresses: choose addresses for instructions and data from the standpoint of the processor
- Could we place data1, start, and/or checkit at different addresses? Yes
- When? Compile time / Load time / Execution time
- Related: which physical memory locations hold particular instructions or data?
- Assembly fragment from the slide (the accompanying figure paired it with its machine encodings, e.g. 8C2000C0 at 0x900, 0C000340 at 0x904, 1420FFFF at 0x90C):

      data1:   dw   32
      start:   lw   r1, 0(data1)
               jal  checkit
      loop:    addi r1, r1, -1
               bnz  r1, r0, loop
               ...
      checkit: ...

13 Binding of Instructions and Data to Memory
- Address binding of instructions and data to memory addresses can happen at three different stages:
  - Compile time: If memory location known a priori, absolute code can be generated; must recompile code if starting location changes
  - Load time: Must generate relocatable code if memory location is not known at compile time
  - Execution time: Binding delayed until run time if the process can be moved during its execution from one memory segment to another; needs hardware support for address maps (e.g., base and limit registers)

14 Multistep Processing of a User Program
- Preparation of a program for execution involves components at:
  - Compile time (i.e. “gcc”)
  - Link/Load time (unix “ld” does link)
  - Execution time (e.g. dynamic libs)
- Addresses can be bound to final values anywhere in this path
  - Depends on hardware support
  - Also depends on operating system
- Dynamic Libraries
  - Linking postponed until execution
  - Small piece of code, stub, used to locate the appropriate memory-resident library routine
  - Stub replaces itself with the address of the routine, and executes the routine

15 Operating system organizations
- Uniprogramming without protection
  - Early personal computer operating systems: the application always runs at the same place in physical memory, because applications run one at a time (the application is given the illusion of a dedicated machine by giving it the reality of a dedicated machine)
  - For example, load the application into low memory and the operating system into high memory
  - Application can address any physical memory location
- (Figure: physical memory from 0x000000 to 0xFFFFFF, application in low memory, OS in high memory)

16 Operating system organizations
- Multiprogramming without protection: Linker-loader
  - Can multiple programs share physical memory, without hardware translation?
  - Yes: when a program is copied into memory, its addresses are changed (loads, stores, jumps) to use the addresses where the program lands in memory
  - This is done by a linker-loader. Used to be very common.
- (Figure: physical memory from 0x000000 to 0xFFFFFF, Application1 at 0x000000, Application2 at 0x20000, OS in high memory)

17 Multiprogramming w/out protection (cont.)
UNIX ld does the linking portion of this (despite its name deriving from loading!): compiler generates each .o file with code that starts at location 0. How do you create an executable from this? Scan through each .o, changing addresses to point to where each module goes in larger program (requires help from compiler to say where all the relocatable addresses are stored).
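
To make that concrete, here is a toy C sketch of the relocation pass; the image layout, relocation table, and function names are invented for illustration and do not correspond to any real object-file format:

    /* Toy relocation pass of a linker-loader: each .o is compiled as if it
     * were loaded at address 0, so every slot listed in the module's
     * relocation table is patched by adding the address where the module
     * actually lands in the larger program. */
    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>

    void relocate(uint8_t *image, const size_t *reloc_offsets, size_t nrelocs,
                  uint32_t load_address)
    {
        for (size_t i = 0; i < nrelocs; i++) {
            uint32_t slot;
            /* memcpy avoids unaligned-access problems on strict hardware */
            memcpy(&slot, image + reloc_offsets[i], sizeof slot);
            slot += load_address;          /* rebase from 0 to the real location */
            memcpy(image + reloc_offsets[i], &slot, sizeof slot);
        }
    }

    int main(void)
    {
        uint8_t image[8]  = {0};               /* pretend module image        */
        size_t  fixups[]  = {0, 4};            /* two address slots at 0 and 4 */
        relocate(image, fixups, 2, 0x20000);   /* module lands at 0x20000      */
        return 0;
    }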

18 Multiprogrammed OS with protection
- Goal of protection:
  - Keep user programs from crashing/corrupting the OS
  - Keep user programs from crashing/corrupting each other
- How is protection implemented?
  - Hardware support: address translation
  - Dual mode operation: kernel vs. user mode
- (Figure: layered structure – the application and application library run in user mode; the portable OS layer and the machine-dependent OS layer run in kernel mode, on top of the hardware)

19 Address translation
- Address space: literally, all the addresses a program can touch; all the state that a program can affect or be affected by
- Restrict what a program can do by restricting what it can touch!
- Hardware translates every memory reference from virtual addresses to physical addresses; software sets up and manages the mapping in the translation box
- (Figure: the CPU issues virtual addresses; the Translation Box (MMU) converts them to physical addresses for physical memory; data reads and writes pass through untranslated)

20 Address translation Two views of memory:
- View from the CPU – what the program sees: virtual memory
- View from memory – physical memory
- The translation box converts between the two views
- Translation helps implement protection because there’s no way for programs to even talk about other programs’ addresses; no way for them to touch operating system code or data
- Translation can be implemented in any number of ways – typically, by some form of table lookup (we’ll discuss various options for implementing the translation box later)
- Separate table for each user address space

21 Logical vs. Physical Address Space
- The concept of a logical address space that is bound to a separate physical address space is central to proper memory management
  - Logical address – generated by the CPU; also referred to as virtual address
  - Physical address – address seen by the memory unit
- Logical and physical addresses are the same in compile-time and load-time address-binding schemes; logical (virtual) and physical addresses differ in the execution-time address-binding scheme
- Logical address space is the set of all logical addresses generated by a program
- Physical address space is the set of all physical addresses corresponding to those logical addresses

22 Memory-Management Unit (MMU)
- Hardware device that at run time maps virtual to physical addresses
- Many methods possible, covered in the rest of this chapter
- To start, consider a simple scheme where the value in the relocation register is added to every address generated by a user process at the time it is sent to memory (see the sketch below)
  - Base register now called relocation register
  - MS-DOS on Intel 80x86 used 4 relocation registers
- The user program deals with logical addresses; it never sees the real physical addresses
  - Execution-time binding occurs when a reference is made to a location in memory
  - Logical addresses are bound to physical addresses
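
As a rough illustration, the whole scheme fits in a few lines of C; the relocation value 14000 and the logical address 346 are just example numbers, not values from this deck:

    /* Simple relocation-register scheme: every logical address is checked
     * against the limit and then has the relocation (base) value added on
     * its way to memory.  Illustrative only. */
    #include <stdint.h>
    #include <stdio.h>

    static const uint32_t relocation = 14000;  /* where the process sits in memory   */
    static const uint32_t limit      = 3000;   /* size of its logical address space  */

    static uint32_t translate(uint32_t logical)
    {
        if (logical >= limit) {
            /* real hardware traps to the operating system here */
            fprintf(stderr, "addressing error: %u >= limit %u\n", logical, limit);
            return UINT32_MAX;
        }
        return logical + relocation;           /* physical address sent to memory */
    }

    int main(void)
    {
        printf("logical 346 -> physical %u\n", translate(346));  /* prints 14346 */
        return 0;
    }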

23 Dynamic Loading
- Routine is not loaded until it is called
- Better memory-space utilization; an unused routine is never loaded
- All routines kept on disk in relocatable load format
- Useful when large amounts of code are needed to handle infrequently occurring cases
- No special support from the operating system is required
  - Implemented through program design
  - OS can help by providing libraries to implement dynamic loading

24 Dynamic Linking
- Static linking – system libraries and program code combined by the loader into the binary program image
- Dynamic linking – linking postponed until execution time (a user-level example follows below)
  - Small piece of code, stub, used to locate the appropriate memory-resident library routine
  - Stub replaces itself with the address of the routine, and executes the routine
  - Operating system checks if the routine is in the process’s memory address space; if not in the address space, add it to the address space
- Dynamic linking is particularly useful for libraries
  - System also known as shared libraries
  - Consider applicability to patching system libraries
  - Versioning may be needed
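
On a UNIX-like system the same idea is visible from user code through the dlopen/dlsym interface. A minimal sketch follows; the library name libm.so.6 and the symbol cos are just convenient examples, and on older systems the program must be linked with -ldl:

    /* Open a shared library at run time, look up one routine, and call it. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <dlfcn.h>

    int main(void)
    {
        void *lib = dlopen("libm.so.6", RTLD_LAZY);   /* load the shared math library */
        if (!lib) {
            fprintf(stderr, "dlopen failed: %s\n", dlerror());
            return EXIT_FAILURE;
        }

        /* dlsym returns the routine's address once it is resident in memory */
        double (*cosine)(double) = (double (*)(double))dlsym(lib, "cos");
        if (cosine)
            printf("cos(0.0) = %f\n", cosine(0.0));

        dlclose(lib);
        return 0;
    }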

25 Swapping
- A process can be swapped temporarily out of memory to a backing store, and then brought back into memory for continued execution
  - Total physical memory space of processes can exceed physical memory
- Backing store – fast disk large enough to accommodate copies of all memory images for all users; must provide direct access to these memory images
- Roll out, roll in – swapping variant used for priority-based scheduling algorithms; lower-priority process is swapped out so higher-priority process can be loaded and executed
- Major part of swap time is transfer time; total transfer time is directly proportional to the amount of memory swapped
- System maintains a ready queue of ready-to-run processes which have memory images on disk

26 Swapping (Cont.)
- Does the swapped-out process need to swap back in to the same physical addresses?
  - Depends on the address binding method
  - Plus consider pending I/O to / from process memory space
- Modified versions of swapping are found on many systems (i.e., UNIX, Linux, and Windows)
  - Swapping normally disabled
  - Started if more than a threshold amount of memory is allocated
  - Disabled again once memory demand is reduced below the threshold

27 Schematic View of Swapping

28 Context Switch Time including Swapping
- If the next process to be put on the CPU is not in memory, need to swap out a process and swap in the target process
- Context switch time can then be very high
  - 100MB process swapping to hard disk with transfer rate of 50MB/sec
  - Swap out time of 2000 ms
  - Plus swap in of a same-sized process
  - Total context switch swapping component time of 4000 ms (4 seconds) – see the arithmetic below
- Can reduce this by reducing the size of memory swapped – by knowing how much memory is really being used
  - System calls to inform the OS of memory use via request_memory() and release_memory()
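
The 4-second figure is just the transfer-time arithmetic spelled out (assuming transfer time dominates and ignoring seek and rotational latency):

    \[
      t_{\text{out}} = \frac{100\ \text{MB}}{50\ \text{MB/s}} = 2\ \text{s}, \qquad
      t_{\text{switch}} \approx t_{\text{out}} + t_{\text{in}} = 2\ \text{s} + 2\ \text{s} = 4\ \text{s}
    \]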

29 Context Switch Time and Swapping (Cont.)
- Other constraints as well on swapping
  - Pending I/O – can’t swap out as I/O would occur to the wrong process
  - Or always transfer I/O to kernel space, then to the I/O device
    - Known as double buffering; adds overhead
- Standard swapping not used in modern operating systems
  - But a modified version is common: swap only when free memory is extremely low

30 Swapping on Mobile Systems
- Not typically supported
  - Flash memory based
    - Small amount of space
    - Limited number of write cycles
    - Poor throughput between flash memory and CPU on mobile platform
- Instead use other methods to free memory if low
  - iOS asks apps to voluntarily relinquish allocated memory
    - Read-only data thrown out and reloaded from flash if needed
    - Failure to free can result in termination
  - Android terminates apps if free memory is low, but first writes application state to flash for fast restart
- Both OSes support paging as discussed below

31 Dual mode operation
- Can an application modify its own translation tables?
  - If it could, it could get access to all of physical memory
  - Has to be restricted somehow
- Dual-mode operation
  - When in the OS, can do anything (kernel mode)
  - When in a user program, restricted to only touching that program’s memory (user mode)
- Hardware requires the CPU to be in kernel mode to modify the address translation tables

32 Dual mode operation In Nachos, as well as most OS’s:
- OS runs in kernel mode (untranslated addresses)
- User programs run in user mode (translated addresses)
- Want to isolate each address space so its behavior can’t do any harm, except to itself
- A couple of issues:
  - How to share the CPU between the kernel and user programs
  - How do programs interact?
  - How does one switch between kernel and user modes when the CPU gets shared between the OS and a user program?
    - OS -> user (kernel mode -> user mode)
    - User -> OS (user mode -> kernel mode)

33 Dual mode operation Kernel -> user:
- To run a user program, create a thread to:
  - Allocate and initialize the address space control block
  - Read the program off disk and store it in memory
  - Allocate and initialize the translation table (point it to the program memory)
  - Run the program (or return to user level after calling the OS with a system call):
    - Set machine registers
    - Set the hardware pointer to the translation table
    - Set the processor status word (from kernel mode to user mode)
    - Jump to the start of the program

34 Dual mode operation User-> kernel:
- How does the user program get back into the kernel?
- Voluntarily user->kernel: system call – a special instruction to jump to a specific operating system handler (see the example below)
  - Just like doing a procedure call into the operating system kernel – the program asks the OS kernel: please do something on the procedure’s behalf
- Can the user program call any routine in the OS?
  - No. Just specific ones the OS says are OK (registered trap handlers)
  - Always start running the handler at the same place; otherwise, problems!
- How does the OS know that system call arguments are as expected?
  - It can’t – the OS kernel has to check all arguments – otherwise, a bug in a user program can crash the kernel
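
For instance, the handler behind write() is one such registered entry point; in this minimal example the kernel validates the descriptor, the buffer address, and the length before doing any work on the program's behalf:

    /* A voluntary user->kernel crossing: write() traps into the kernel, which
     * checks all three arguments before copying the buffer out. */
    #include <unistd.h>

    int main(void)
    {
        const char msg[] = "hello from user mode\n";
        write(1, msg, sizeof msg - 1);    /* fd 1 = standard output */
        return 0;
    }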

35 Dual mode operation User-> kernel:
- Involuntarily user->kernel: hardware interrupt, or program exception
  - Examples of program exceptions:
    - Bus error (bad address – e.g., unaligned access)
    - Segmentation fault (out of range address)
    - Page fault (important for providing the illusion of infinite memory)
- On a system call, interrupt, or exception, the hardware atomically:
  - Sets processor status to kernel mode
  - Changes the execution stack to an OS kernel stack
  - Saves the current program counter
  - Jumps to the handler routine in the OS kernel
- The handler saves the previous state of any registers it uses

36 Dual mode operation
- Context switching between programs: same as with threads, except now also save and restore the pointer to the translation table
- To resume a program, re-load its registers, restore the processor status word and the hardware pointer to the translation table, and jump to the old PC
- How does a system call pass arguments? Two choices:
  - Use registers. Can’t pass very much that way.
  - Write into user memory; the kernel copies it into its own memory (see the sketch below)
    - Except: user addresses are translated, kernel addresses are untranslated
    - The addresses the kernel sees are not the same addresses as what the user sees!
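
A schematic, hypothetical kernel-side handler for the second choice might look like this; user_range_ok() is a stand-in for the real check against the caller's translation table, and for simplicity the sketch pretends user and kernel addresses coincide:

    /* Hypothetical handler: the user passes a pointer into its own memory,
     * the kernel validates it and copies the bytes into kernel memory before
     * using them, so a bad pointer is rejected rather than crashing the kernel. */
    #include <stddef.h>
    #include <string.h>
    #include <stdio.h>

    #define MAX_ARG 256

    /* Stand-in check; a real kernel consults the process's translation table. */
    static int user_range_ok(const void *uaddr, size_t len)
    {
        return uaddr != NULL && len <= MAX_ARG;
    }

    static int sys_example(const char *user_buf, size_t len)
    {
        char kbuf[MAX_ARG];

        if (!user_range_ok(user_buf, len))
            return -1;                      /* bad argument: reject the call */

        memcpy(kbuf, user_buf, len);        /* copy into kernel-owned memory */
        printf("kernel received %zu bytes\n", len);
        return 0;
    }

    int main(void)
    {
        return sys_example("ls -l", 5);     /* pretend user-level invocation */
    }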

37 Communication between address spaces
- How do two address spaces communicate?
  - Can’t do it directly if the address spaces don’t share memory
  - Instead, all inter-address space (in UNIX, inter-process) communication has to go through the kernel, via system calls
- Models of inter-address space communication:
  - Byte stream producer/consumer. For example, communicate through pipes connecting stdin/stdout (see the example below).
  - Message passing (send/receive). Will explain later how you can use this to build the remote procedure call (RPC) abstraction, so that you can have one program call a procedure in another.
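
A small C example of the byte-stream model: parent and child never touch each other's memory, the bytes travel through the kernel via a pipe (error handling kept minimal):

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void)
    {
        int fd[2];
        if (pipe(fd) < 0)
            return 1;

        if (fork() == 0) {                    /* child: the consumer */
            char buf[64];
            close(fd[1]);
            ssize_t n = read(fd[0], buf, sizeof buf - 1);
            if (n > 0) {
                buf[n] = '\0';
                printf("child got: %s\n", buf);
            }
            _exit(0);
        }

        close(fd[0]);                         /* parent: the producer */
        const char msg[] = "hello via the kernel";
        write(fd[1], msg, strlen(msg));
        close(fd[1]);
        wait(NULL);
        return 0;
    }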

38 Models of inter-address space communication (cont’d)
- File system (read and write files). The file system is shared state! (Even though it exists outside of any address space.)
- “Shared memory” – alternately, on most UNIXes, you can ask the kernel to set up the address spaces to share a region of memory (a sketch follows below), but that violates the whole notion of why we have address spaces – to protect each program from bugs in the other programs.
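
For completeness, a minimal sketch of that shared-memory escape hatch on a UNIX-like system: the kernel is asked (here with an anonymous MAP_SHARED mapping created before fork) to map one region into two address spaces, so a write by one process is visible to the other:

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/mman.h>
    #include <sys/wait.h>

    int main(void)
    {
        char *region = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (region == MAP_FAILED)
            return 1;

        if (fork() == 0) {                    /* child writes into the region */
            strcpy(region, "written by the child");
            _exit(0);
        }

        wait(NULL);                           /* parent sees the child's write */
        printf("parent reads: %s\n", region);
        munmap(region, 4096);
        return 0;
    }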

39 Communication between address spaces
In any of these, once you allow communication, bugs from one program can propagate to those it communicates with, unless each program verifies that its input is as expected. So why do UNIXes support shared memory? One reason is that it provides a cheap way to simulate threads on systems that don’t support them: Each UNIX process = Heavyweight thread.

40 An Example of Application – Kernel Interaction: Shells and UNIX fork
- Shell – a user program (not part of the kernel!)
  - Prompts the user to type a command
  - Does a system call to run the command
- In Nachos, the system call to run a command is simply “exec”
- But UNIX works a bit differently than Nachos
- UNIX idea: separate notion of fork vs. exec
  - Fork – create a new process, an exact copy of the current one
  - Exec – change the current process to run a different program

41 Shells and UNIX fork
- To run a program in UNIX (as sketched in the code below):
  - Fork a process
  - In the child, exec the program
  - In the parent, wait for the child to finish
- UNIX fork:
  - Stop the current process
  - Create an exact copy + mark one register in the child
  - Put it on the ready list
  - Resume the original
- The original has code/data/stack. The copy has exactly the same thing!
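
A minimal shell-like skeleton of that sequence; "/bin/ls" is just an example command:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    int main(void)
    {
        pid_t pid = fork();                 /* create an exact copy of this process */

        if (pid == 0) {
            /* child: replace the copy with a different program */
            execl("/bin/ls", "ls", "-l", (char *)NULL);
            perror("exec failed");          /* only reached if exec fails */
            _exit(1);
        } else if (pid > 0) {
            int status;
            waitpid(pid, &status, 0);       /* parent: wait for the child to finish */
            printf("child exited with status %d\n", WEXITSTATUS(status));
        } else {
            perror("fork failed");
        }
        return 0;
    }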

42 Shells and UNIX fork
- Only difference between child and parent: UNIX changes one register in the child before resuming it
- Child process: exec the program:
  - Stop the process
  - Copy the new program over the current one
  - Resume at location 0
- Justification was to allow I/O (pipes, redirection, etc.) to be set up between fork and exec
  - The child can access the shell’s data structures to see whether there is any I/O redirection, and then sets it up before exec
- Nachos simply combines UNIX fork and exec into one operation

43 Protection without hardware support
- Does protection require hardware support? In other words, do we really need hardware address translation and an unprivileged user mode?
- No! Can put two different programs in the same hardware address space, and be guaranteed that they can’t trash each other’s code or data
- Two approaches: strong typing and software fault isolation

44 Protection via strong typing
- Restrict the programming language to make it impossible to misuse data structures, so you can’t express a program that would trash another program, even in the same address space
- Examples of strongly typed languages include LISP, Cedar, Ada, Modula-3, and most recently, Java
- Note: nothing prevents shared data from being trashed, and that includes the data that exists in the file system
  - Even in UNIX, there is nothing to keep programs you run from deleting all your files (but at least they can’t crash the OS!)

45 Protection via strong typing
- Java’s solution: programs written in Java can be downloaded and run safely, because the language/compiler/runtime prevents the program (also called an applet) from doing anything bad (for example, it can’t make system calls, so it can’t touch files)
- Java also defines a portable virtual machine layer, so any Java program can run anywhere, dynamically compiled onto the native machine
- (Figure: Java operating system structure – an application written in Java runs on the Java runtime library, which runs on the native operating system, in kernel mode or unprotected)

46 Protection via software fault isolation
- Language-independent approach: have the compiler generate object code that provably can’t step out of bounds
- Easy for the compiler to statically check that the program doesn’t make any native system calls
- How does the compiler prevent a pointer from being misused, or a jump to an arbitrary place in the (unprotected) OS?
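
One classic answer, sketched here as an illustration rather than as this deck's specific scheme, is sandboxing by address masking: the compiler inserts a couple of instructions before every store and indirect jump that force the target address into the module's own region, so even a corrupted pointer cannot reach outside it. The region tag and mask below are example values:

    #include <stdint.h>
    #include <stdio.h>

    #define REGION_BITS  0x20000000u    /* example: this module's segment tag      */
    #define OFFSET_MASK  0x000FFFFFu    /* low bits select a byte within the region */

    /* Overwrite the high bits so the address always lands inside the region. */
    static inline uint32_t sandbox(uint32_t addr)
    {
        return (addr & OFFSET_MASK) | REGION_BITS;
    }

    int main(void)
    {
        uint32_t wild = 0xDEADBEEFu;    /* an arbitrary (possibly corrupted) pointer */
        printf("0x%08X is forced to 0x%08X\n", wild, sandbox(wild));
        return 0;
    }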

