
1 11 Chapter 4: Memory Instructor: Hengming Zou, Ph.D. In Pursuit of Absolute Simplicity

2 22 Content Basic memory management Swapping Virtual memory Page replacement algorithms Implementation issues Segmentation

3 33 Memory Management Ideally programmers want memory that is –large –fast –non-volatile

4 44 Memory Hierarchy Small amount of fast, expensive memory –cache Some medium-speed, medium-price memory –main memory Gigabytes of slow, cheap memory –disk storage

5 55 Virtual Memory An illusion provided to the applications An address space can be larger than –the amount of physical memory on the machine –This is called virtual memory Memory manager handles the memory hierarchy

6 66 Basic Memory Management Mono-programming –without Swapping or Paging Multi-programming with Fixed Partitions –Swapping may be used

7 77 Uni-programming One process runs at a time –One process occupies memory at a time Always load the process into the same memory spot –And reserve some space for the OS

8 88 Uni-programming Three simple ways of organizing memory for –an operating system with one user process (Figure: three layouts, each splitting the address range 0 to 0xFFF… between the user program and the operating system, with the OS held either in RAM or in ROM.)

9 99 Uni-programming Achieves address independence by –Loading the process into the same physical memory location Problems with uni-programming? –Processes must be loaded in their entirety (what if there is not enough space?) –Wastes resources (both CPU and memory)

10 10 Multi-programming More than one process is in memory at a time Need to support address translation –The address in an instruction may not be the final address Need to support protection –Each process must be unable to access other processes' space

11 11 Multiprogramming with Fixed Partitions Two options exist for fixed memory partitions Separate input queues for each partition –Incoming processes are allocated to a fixed partition Single input queue –Incoming processes can go to any partition –Assume the partition is big enough to hold the process

12 12 Multiprogramming with Fixed Partitions

13 13 Benefit of Multiprogramming Improves resource utilization –All resources can be kept busy with proper care Improves response time –Don't need to wait for the previous process to finish to receive feedback from the computer

14 14 CPU Utilization of Multiprogramming Assume processes spend 80% of their time waiting for I/O CPU utilization for uni-programming: 20% CPU utilization for two processes: 36% –Rough estimate only (i.e. ignores overhead) –CPU utilization = 1 - 0.8^2 = 0.36 CPU utilization for three processes: 1 - 0.8^3 = 48.8%
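This estimate generalizes: if each of n processes waits for I/O a fraction p of the time, the chance that all n are waiting at once is p^n, so CPU utilization is roughly 1 - p^n. A minimal sketch of the calculation (illustrative only, ignoring scheduling overhead):

  #include <math.h>
  #include <stdio.h>

  /* Estimated CPU utilization with n processes, each spending
     fraction p of its time waiting for I/O. */
  double cpu_utilization(double p, int n) {
      return 1.0 - pow(p, n);   /* p^n = probability all n wait at once */
  }

  int main(void) {
      for (int n = 1; n <= 3; n++)
          printf("n=%d: %.1f%%\n", n, 100.0 * cpu_utilization(0.8, n));
      /* prints 20.0%, 36.0%, 48.8% -- the numbers on this slide */
      return 0;
  }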

15 15 CPU Utilization of Multiprogramming (Figure: CPU utilization as a function of the degree of multiprogramming.)

16 16 Multiprogramming System Performance Given: –Arrival and work requirements of 4 jobs –CPU utilization for 1 – 4 jobs with 80% I/O wait Plot: –Sequence of events as jobs arrive and finish –Show amount of CPU time jobs get in each interval

17 17 Multiprogramming System Performance

18 18 Multiprogramming Issues Cannot be sure where a program will be loaded –Address locations of variables and code routines cannot be absolute Solution: –Use base and limit values –An address is added to the base value to map to a physical addr –Address locations larger than the limit value are an error

19 19 Multiprogramming Issues Protecting processes from each other –Must keep a program out of other processes' partitions Solution: –Address translation –Must translate addresses issued by a process so they don't conflict with addresses issued by other processes

20 20 Address Translation Static address translation –Translate addresses before execution –Translation remains constant during execution Dynamic address translation –Translate addresses during execution –Translation may change during execution

21 21 Address Translation Is it possible to: –Run two processes at the same time (both in memory), and provide address independence with only static address translation? Does this achieve the other address space abstractions? –No (i.e. does not offer address protection)

22 22 Address Translation Achieving all the address space abstractions (protection and independence) requires doing some work on every memory reference Solution: –Dynamic address translation

23 23 Dynamic Address Translation Translate every memory reference from virtual address to physical address Virtual address: –An address viewed by the user process –The abstraction provided by the OS Physical address –An address viewed by the physical memory

24 24 Dynamic Address Translation (Figure: the user process issues a virtual address; the translator (MMU) converts it into a physical address, which goes to physical memory.)

25 25 Benefit of Dynamic Translation Enforces protection –One process can't even refer to another process's address space Enables virtual memory –A virtual address only needs to be in physical memory when it's being accessed –Change translations on the fly as different virtual addresses occupy physical memory

26 26 Dynamic Address Translation Does dynamic address translation require hardware support? –It's better to have it, but it is not absolutely necessary

27 27 Implement Translator Lots of ways to implement the translator Tradeoffs among: –Flexibility (e.g. sharing, growth, virtual memory) –Size of translation data –Speed of translation

28 28 Base and Bounds The simplest solution Load each process into contiguous regions of physical memory Prevent each process from accessing data outside its region

29 29 Base and Bounds
if (virtual address >= bound) {    // outside the valid range [0, bound)
    trap to kernel; kill process (core dump)
} else {
    physical address = virtual address + base
}

30 30 Base and Bounds Process has the illusion of running on its own dedicated machine with memory [0, bound) (Figure: virtual memory addresses 0 to bound map to the physical region from base to base + bound within physical memory.)

31 31 Base and Bounds This is similar to a linker-loader –But it also protects processes from each other Only the kernel can change base and bounds During a context switch, must change all translation data (base and bounds registers)

32 32 Base and Bounds What to do when address space grows?

33 33 Pros of Base and Bounds Low hardware cost – 2 registers, adder, comparator Low overhead –Add and compare on each memory reference

34 34 Cons of Base and Bounds Hard for a single address space to be larger than physical memory But sum of all address spaces can be larger than physical memory –Swap an entire address space out to disk –Swap address space for new process in

35 35 Cons of Base and Bounds Can't share part of an address space between processes (Figure: process 1 and process 2 each have code and data in their virtual address spaces; physical memory holds one copy of the shared code plus data (P1) and data (P2).) Does this work under base and bounds?

36 36 Cons of Base and Bounds Solution: use 2 sets of base and bounds –one for the code section –one for the data section

37 37 Swapping Memory allocation changes as: –Processes come into memory –Processes leave memory Moving processes out of memory and bringing processes into memory is called swapping

38 38 Swapping

39 39 Swapping Problem with the previous situation? Difficult to grow process space –i.e. stack, data, etc. Solution: –Allocate space for a growing data segment –Allocate space for growing stack & data segments

40 40 Swapping

41 41 Memory Mgmt Issues Need to keep track of memory used and available –Bit map approach –Linked list approach

42 42 Memory Mgmt with Bit Maps Divide memory into allocation units Use one bit to mark whether each unit is allocated or not

43 43 Memory Mgmt with Linked Lists Divide memory into allocation units Use a linked list to record which chunks of units are allocated and which are available

44 44 Memory Mgmt with Bit Maps (Figure: (a) memory allocation, (b) bit-map representation, (c) linked list representation.)

45 45 Memory Mgmt with Linked Lists What happens when units are released? May need to merge adjacent free entries in the linked list

46 46 Memory Mgmt with Linked Lists

47 47 External Fragmentation Processes come and go Can leave a mishmash of available mem regions Some regions may be too small to be of any use

48 48 External Fragmentation P1 start: 100 KB (phys. mem 0-99 KB) P2 start: 200 KB (phys. mem 100-299 KB) P3 start: 300 KB (phys. mem 300-599 KB) P4 start: 400 KB (phys. mem 600-999 KB) P3 exits (frees phys. mem 300-599 KB) P5 start: 100 KB (phys. mem 300-399 KB) P1 exits (frees phys. mem 0-99 KB) P6 start: 300 KB

49 49 External Fragmentation 300 KB are free (400-599 KB; 0-99 KB) –but not contiguous This is called external fragmentation –wasted memory between allocated regions Can waste lots of memory

50 50 Strategies to Minimize Fragmentation Best fit: –Allocate the smallest memory region that can satisfy the request (least amount of wasted space) First fit: –Allocate the memory region that you find first that can satisfy the request
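A minimal sketch of the two strategies over a singly linked free list (the node layout is hypothetical, and splitting of an allocated region is omitted):

  #include <stddef.h>

  struct free_region {                 /* hypothetical free-list node */
      size_t start, size;
      struct free_region *next;
  };

  /* First fit: return the first region large enough for the request. */
  struct free_region *first_fit(struct free_region *list, size_t request) {
      for (struct free_region *r = list; r != NULL; r = r->next)
          if (r->size >= request)
              return r;
      return NULL;                     /* no region can satisfy the request */
  }

  /* Best fit: return the smallest region that is still large enough. */
  struct free_region *best_fit(struct free_region *list, size_t request) {
      struct free_region *best = NULL;
      for (struct free_region *r = list; r != NULL; r = r->next)
          if (r->size >= request && (best == NULL || r->size < best->size))
              best = r;
      return best;
  }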

51 51 Strategies to Minimize Fragmentation In the worst case, must re-allocate existing memory regions –by copying them to another area (compaction)

52 52 Problems of Fragmentation Hard to grow address space –Might have to move to different region of physical memory (which is slow) How to extend more than one contiguous data structure in virtual memory?

53 53 Paging Allocate physical memory in terms of fixed-size chunks of memory (called pages) –fixed unit makes it easier to allocate –any free physical page can store any virtual page Virtual address –virtual page # (high bits of address, e.g. bits 31-12) –offset (low bits of addr, e.g. bits 11-0 for 4 KB page)
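With that layout, splitting a 32-bit virtual address is one shift and one mask. A sketch (assuming 4 KB pages):

  #include <stdint.h>

  #define PAGE_SHIFT 12                        /* 4 KB page = 2^12 bytes */
  #define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)  /* low 12 bits = offset   */

  uint32_t vpn(uint32_t va)    { return va >> PAGE_SHIFT; }  /* bits 31-12 */
  uint32_t offset(uint32_t va) { return va & PAGE_MASK;   }  /* bits 11-0  */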

54 54 Paging Processes access memory by virtual addresses Each virtual memory reference is translated into physical memory reference by the MMU

55 55 Paging

56 56 Paging Page translation process:
if (virtual page is invalid or non-resident or protected) {
    trap to OS fault handler
} else {
    physical page # = pageTable[virtual page #].physPageNum
}

57 57 Paging What must be changed on a context switch? –Page table, registers, cache image –Possibly memory images Each virtual page can be in physical memory or paged out to disk

58 58 Paging How does the processor know that a virtual page is not in physical memory? –Through a valid/invalid bit in the page table Pages can have different protections –e.g. read, write, execute –This information is also kept in the page table

59 59 Valid vs. Resident Resident means a virtual page is in memory –It is NOT an error for a program to access non-resident pages Valid means a virtual page is currently legal for the program to access

60 60 Valid vs. Resident Who makes a virtual page resident/non-resident? Who makes a virtual page valid/invalid? Why would a process want one of its virtual pages to be invalid?

61 61 Page Table Used to keep track of the virtual-to-physical page map One entry for each virtual page Also keeps other relevant information, such as read/write/execute protection and valid bits The MMU uses it to perform address translation

62 62 Page Table Typical page table entry

63 63 Paging The relation between virtual addresses and physical memory addresses is given by the page table

64 64 Paging The internal operation of the MMU with 16 4-KB pages

65 65 Paging Pros and Cons + simple memory allocation + can share lots of small pieces of an addr space + easy to grow the address space –Simply add a virtual page to the page table –and find a free physical page to hold the virtual page before accessing it

66 66 Paging Pros and Cons Problems with paging? The size of the page table could be enormous Take a 32-bit virtual address for example Assume the page size is 4 KB Then there are 2^20 (about one million) virtual pages For a 64-bit virtual address?

67 67 Paging Pros and Cons The solution? Use multi-level translation! Break page tables into 2 or more levels The top-level page table always resides in memory Second-level page tables are kept in memory as needed

68 68 Multi-level Translation Standard page table is a simple array –one degree of indirection Multi-level translation changes this into a tree –multiple degrees of indirection

69 69 Multi-level Translation Example: two-level page table Index into the level 1 page table using virtual address bits 31-22 Index into the level 2 page table using virtual address bits 21-12 Page offset: bits 11-0 (4 KB page)
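A sketch of the two-level walk under that 10/10/12 split (the entry layout and names are assumptions for illustration):

  #include <stddef.h>
  #include <stdint.h>

  /* Hypothetical entry: a valid bit plus a 20-bit physical frame number. */
  typedef struct { unsigned valid : 1; unsigned frame : 20; } pte_t;

  /* l1[i] points to a level-2 table, or is NULL if none is allocated. */
  int translate(pte_t **l1, uint32_t va, uint32_t *pa) {
      pte_t *l2 = l1[(va >> 22) & 0x3FF];     /* level 1 index: bits 31-22 */
      if (l2 == NULL)
          return -1;                          /* no level-2 table: fault   */
      pte_t pte = l2[(va >> 12) & 0x3FF];     /* level 2 index: bits 21-12 */
      if (!pte.valid)
          return -1;                          /* invalid or non-resident   */
      *pa = ((uint32_t)pte.frame << 12) | (va & 0xFFF); /* frame + offset  */
      return 0;
  }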

70 70 Multi-level Translation

71 71 Multi-level Translation What info is stored in the level 1 page table? –Information concerning secondary-level page tables What info is stored in the level 2 page table? –Virtual-to-physical page mappings

72 72 Multi-level Translation This is a two-level tree (Figure: high virtual address bits index the level 1 page table; each entry points to a level 2 page table or is NULL; the next bits index the level 2 page table, which yields the physical page #.)

73 73 Multi-level Translation How does this allow the translation data to take less space? How to use share memory when using multi-level page tables? What must be changed on a context switch?

74 74 Multi-level Translation Another alternative: use segments in place of the level-1 page table This still uses pages at level 2 (i.e. breaks each segment into pages)

75 75 Multi-level Translation Pros and cons + space-efficient for sparse address spaces + easy memory allocation + lots of ways to share memory - two extra lookups per memory reference

76 76 Inverted Page Table An alternate solution to the big table size problem Rather than storing the virtual-to-physical mapping We store the physical-to-virtual mapping This significantly reduces the page table size
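Conceptually there is one entry per physical frame, and translating means searching for the (process, virtual page) pair that owns a frame. A sketch of the naive linear search (field names are illustrative; real systems hash on the virtual page # to avoid scanning):

  #include <stdint.h>

  struct ipt_entry {            /* one entry per physical frame */
      int      pid;             /* owning process               */
      uint32_t vpn;             /* virtual page held in frame   */
      int      in_use;
  };

  /* Return the frame # holding (pid, vpn), or -1, meaning page fault. */
  int ipt_lookup(struct ipt_entry *ipt, int nframes, int pid, uint32_t vpn) {
      for (int f = 0; f < nframes; f++)
          if (ipt[f].in_use && ipt[f].pid == pid && ipt[f].vpn == vpn)
              return f;
      return -1;
  }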

77 77 Inverted Page Tables

78 78 Comparing Basic Translation Schemes Base and bound: –unit (and swapping) is an entire address space Segments: unit (and swapping) is a segment –a few large, variable-sized segments per addr space Page: unit (and swapping/paging) is a page –lots of small, fixed-sized pages per address space –How to modify paging to take less space?

79 79 Translation Speed Translation when using paging involves 1 or more additional memory references –This can be a big issue if not taken care of How to speed up the translation process? Solution: –Translation look-aside buffer

80 80 Translation Look-aside Buffer Facility to speed up memory access Abbreviated as TLB The TLB caches translations from virtual page # to physical page # The TLB conceptually caches the entire page table entry, e.g. dirty bit, reference bit, protection

81 81 Translation Look-aside Buffer If the TLB contains the entry you're looking for –can skip all the translation steps above On a TLB miss, figure out the translation by –getting the user's page table entry, –storing it in the TLB, then restarting the instruction
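A sketch of that lookup with a tiny software-managed TLB (the structure is illustrative; a real TLB is associative hardware):

  #include <stdint.h>

  #define TLB_SIZE 64
  struct tlb_entry { uint32_t vpn, pfn; int valid; };
  static struct tlb_entry tlb[TLB_SIZE];

  /* Returns 1 and fills *pfn on a hit; 0 means a miss, after which the
     page table is walked, the entry installed, and the access restarted. */
  int tlb_lookup(uint32_t vpn, uint32_t *pfn) {
      for (int i = 0; i < TLB_SIZE; i++)
          if (tlb[i].valid && tlb[i].vpn == vpn) {
              *pfn = tlb[i].pfn;
              return 1;
          }
      return 0;
  }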

82 82 A TLB to Speed Up Paging

83 83 Translation Look-aside Buffer Does this change what happens on a context switch?

84 84 Replacement One design dimension in virtual memory is –which page to replace when you need a free page Goal is to reduce the number of page faults –i.e. cases where a page to be accessed is not in memory

85 85 Replacement Modified page must first be saved –unmodified just overwritten Better not to choose an often used page –will probably need to be brought back in soon

86 86 Replacement Algorithms Random replacement Optimal replacement NRU (not recently used) replacement FIFO (first in first out) replacement Second chance replacement

87 87 Replacement Algorithms LRU (least recently used) replacement Clock replacement Working set replacement Working set clock replacement

88 88 Random Replacement Randomly pick a page to replace Easy to implement, but poor results

89 89 Optimal Replacement Replace the page needed at the farthest point in the future –i.e. the page that won't be used for the longest time –this yields the minimum number of misses –but requires knowledge of the future Forecasting the future is difficult, if possible at all

90 90 NRU Replacement Replace page not recently used Each page has Reference bit, Modified bit –bits are set when page is referenced, modified

91 91 NRU Replacement Pages are classified into four classes: –0: not referenced, not modified –1: not referenced, modified –2: referenced, not modified –3: referenced, modified NRU removes a page at random –from the lowest-numbered non-empty class
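The class number falls out of the two bits directly; NRU then evicts a random page from the lowest-numbered non-empty class. A one-line sketch:

  /* Class 0: R=0,M=0   Class 1: R=0,M=1
     Class 2: R=1,M=0   Class 3: R=1,M=1 */
  int nru_class(int referenced, int modified) {
      return 2 * referenced + modified;
  }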

92 92 FIFO Replacement Replace the page that was brought into memory the longest time ago Maintain a linked list of all pages –in order they came into memory Page at beginning of list replaced

93 93 FIFO Replacement Unfortunately, this can replace popular pages that were brought into memory a long time ago (and have been used frequently since then)

94 94 Second Chance Algorithm A modification of FIFO Same as FIFO, but a page is evicted only if its R bit is 0 If the R bit is 1, the page is put at the back of the list –And the R bit is cleared i.e. the page brought in the longest time ago but with its R bit set is given a second chance
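A sketch over a FIFO queue of pages (hypothetical node type; the queue is assumed non-empty):

  struct page { int r_bit; struct page *next; };

  /* head/tail form the FIFO queue, oldest page at the head. */
  struct page *second_chance_victim(struct page **head, struct page **tail) {
      for (;;) {
          struct page *p = *head;        /* oldest page */
          *head = p->next;
          if (p->r_bit) {                /* referenced: second chance */
              p->r_bit = 0;              /* clear R bit */
              p->next = NULL;
              if (*head == NULL)         /* p was the only page */
                  *head = p;
              else
                  (*tail)->next = p;     /* move to the back of the list */
              *tail = p;
          } else {
              return p;                  /* R bit 0: evict this page */
          }
      }
  }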

95 95 Second Chance Algorithm (Figure: the page list when a fault occurs at time 20 and A has its R bit set; the numbers above the pages are loading times.)

96 96 LRU Replacement LRU stands for Least Recently Used Use past references to predict the future –temporal locality If a page hasn't been used for a long time –it probably won't be used again for a long time

97 97 LRU Replacement LRU is an approximation to OPT Can we approximate LRU to make it easier to implement without increasing miss rate too much? Basic idea is to replace an old page –not necessarily the oldest page

98 98 LRU Replacement Must keep a linked list of pages –most recently used at the front, least at the rear –update this list on every memory reference !! Alternatively, use a counter in each page table entry –choose the page with the lowest counter value –periodically zero the counters

99 99 Implementing LRU with Matrix Another option is to use an n×n matrix –Here n is the number of pages in the virtual space The matrix is set to zero initially Whenever a page k is referenced: –Row k is set to all ones, then column k is set to all zeros Whenever we need to pick a page to evict: –Pick the one whose row holds the smallest value
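A sketch with each matrix row stored as one machine word (works for n ≤ 64 pages; NPAGES is an illustrative value):

  #include <stdint.h>

  #define NPAGES 8
  static uint64_t row[NPAGES];             /* row[k] = matrix row of page k */

  /* On a reference to page k: row k := all ones, column k := all zeros. */
  void lru_reference(int k) {
      row[k] = (1ull << NPAGES) - 1;       /* set row k to all ones */
      for (int i = 0; i < NPAGES; i++)
          row[i] &= ~(1ull << k);          /* clear column k        */
  }

  /* Victim = page whose row, read as a binary number, is smallest. */
  int lru_victim(void) {
      int victim = 0;
      for (int i = 1; i < NPAGES; i++)
          if (row[i] < row[victim])
              victim = i;
      return victim;
  }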

100 100 Implementing LRU with Matrix Pages referenced in order 0,1,2,3,2,1,0,3,2,3

101 101 Implementing LRU with Aging Each page corresponds to a shift register –The shift register is initially set to zero At every clock tick, the value of each shift register is shifted one bit right, and the R bit is inserted as the leftmost bit of the corresponding register Whenever we need to pick a page to evict: –Pick the one with the smallest number
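A sketch of one clock tick and of victim selection, with 8-bit counters (the r_bit array stands in for the reference bits set by the MMU):

  #include <stdint.h>

  #define NPAGES 8
  static uint8_t age[NPAGES];     /* one shift register per page    */
  static int     r_bit[NPAGES];   /* reference bits, set by the MMU */

  /* Every clock tick: shift right, insert the R bit at the leftmost
     position, then clear the R bits for the next interval. */
  void aging_tick(void) {
      for (int i = 0; i < NPAGES; i++) {
          age[i] = (uint8_t)((age[i] >> 1) | (r_bit[i] << 7));
          r_bit[i] = 0;
      }
  }

  /* Victim = page with the smallest counter (least recently used). */
  int aging_victim(void) {
      int victim = 0;
      for (int i = 1; i < NPAGES; i++)
          if (age[i] < age[victim])
              victim = i;
      return victim;
  }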

102 102 Implementing LRU with Aging

103 103 The Clock Algorithm Maintain a referenced bit for each resident page –set automatically when the page is referenced The reference bit can be cleared by the OS The resident pages are organized into a circular list (a clock) A clock hand points to one of the pages

104 104 The Clock Algorithm To find a page to evict: –look at the page being pointed to by the clock hand reference=0 means the page hasn't been accessed since the last sweep: evict it reference=1 means the page has been accessed since the last sweep. What to do?

105 105 The Clock Algorithm

106 106 The Clock Algorithm Can this loop forever? What if it finds all pages referenced since the last sweep? New pages are put behind the clock hand, with reference=1
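No, it cannot loop forever: a page with R=1 has its bit cleared as the hand passes, so after at most one full sweep some page has R=0. A sketch over a circular list (hypothetical node type):

  struct cpage { int r_bit; struct cpage *next; };   /* circular list */

  struct cpage *clock_victim(struct cpage **hand) {
      while ((*hand)->r_bit) {
          (*hand)->r_bit = 0;           /* referenced: clear and skip     */
          *hand = (*hand)->next;
      }
      struct cpage *victim = *hand;     /* R=0: not used since last sweep */
      *hand = (*hand)->next;            /* new pages go in behind the hand */
      return victim;
  }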

107 107 The Clock Algorithm Why is hardware support needed to maintain the reference bit? How can you identify an old page?

108 108 The Working Set Algorithm The working set is the set of pages used by the k most recent memory references w(k,t) is the size of the working set at time t

109 109 The Working Set Algorithm

110 110 The Working Set Algorithm The working set changes as time passes, but its size stabilizes as k (the number of most recent references considered) grows

111 111 The Working Set Clock Algorithm Combines the working set algorithm with the clock algorithm Pages are organized into a clock cycle Each page has a time of last use and an R bit Whenever we need to evict a page: –Inspect pages starting from the one pointed to by the clock hand –The first page with its R bit 0 that is outside the working set is evicted


113 113 Page Replacement Algorithm Review

114 114 Design Issues for Paging Systems Thrashing Local versus Global Allocation Policies Page size OS Involvement with Paging Page fault handling

115 115 Thrashing What happens when a working set is bigger than the available memory frames? Thrashing! i.e. constant page faults to bring pages in and out Should avoid thrashing at all costs

116 116 Local versus Global Allocation When evicting a page, do we only look at pages of the same process for possible eviction? –Local allocation policy Or do we look at the whole memory for a victim? –Global allocation policy

117 117 Local versus Global Allocation (Figure: local policy vs. global policy.)

118 118 Local versus Global Allocation In a global allocation policy, can use PFF to manage the allocation –PFF = page fault frequency If PFF is large, allocate more memory frames Otherwise, decrease the number of frames Goal is to maintain an acceptable PFF
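A sketch of the per-process control loop (the thresholds are illustrative values, not from the slide):

  #define PFF_HIGH 10.0   /* faults/sec above which we add a frame   */
  #define PFF_LOW   1.0   /* faults/sec below which we take one away */

  /* Called periodically; pff = recent page faults per second. */
  int adjust_frames(double pff, int frames) {
      if (pff > PFF_HIGH)
          return frames + 1;   /* faulting too often: grant a frame */
      if (pff < PFF_LOW && frames > 1)
          return frames - 1;   /* plenty of memory: reclaim a frame */
      return frames;           /* PFF acceptable: leave it alone    */
  }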

119 119 Local versus Global Allocation (Figure: page fault rate as a function of the # of page frames assigned.)

120 120 Page Size What happens if page size is small? What happens if page size is really big? Could we use a large page size but let other processes use the leftover space in the page? Page size is typically a compromise –e.g. 4 KB or 8 KB

121 121 Page Size What happens to paging if the virtual address space is sparse? –most of the address space is invalid, –with scattered valid regions

122 122 Small Page Size Advantages –less internal fragmentation –better fit for various data structures, code sections –less unused program in memory Disadvantages –programs need many pages, larger page tables

123 123 Page Size Therefore, to decide a good page size, one needs to balance page table size and internal fragmentation

124 124 Page Size Overhead due to the page table and internal fragmentation can be calculated as: overhead = s·e/p + p/2, where: s = average process size in bytes p = page size in bytes e = page table entry size in bytes Page table space: s·e/p Internal fragmentation: p/2

125 125 Page Size Overhead is minimized when: p = sqrt(2·s·e) (found by setting the derivative of s·e/p + p/2 with respect to p to zero)
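As an illustrative worked example (numbers assumed, not from the slide): with s = 1 MB = 2^20 bytes and e = 8 bytes per entry, p = sqrt(2 × 2^20 × 8) = sqrt(2^24) = 4096 bytes, so a 4 KB page minimizes the combined overhead.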

126 126 Fixed vs. Variable Size Partitions Fixed size (pages) must be compromise –too small a size leads to a large translation table –too large a size leads to internal fragmentation

127 127 Fixed vs. Variable Size Partitions Variable size (segments) can adapt to the need –but it's hard to pack these variable-size partitions into physical memory –leading to external fragmentation

128 128 Load Control Despite good designs, system may still thrash When PFF algorithm indicates: –some processes need more memory –but no processes need less

129 129 Load Control Solution: Reduce the # of processes competing for memory –swap 1 or more to disk, divide up the pages they held –reconsider the degree of multiprogramming

130 130 Separate Instruction and Data Spaces With a combined instruction and data space, programmers have to fit everything into 1 space By separating instruction and data spaces, we: –Allow programmers more freedom –Facilitate sharing of program text (code)

131 131 Separate Instruction and Data Spaces

132 132 OS Involvement with Paging Four times when the OS is involved with paging

133 133 OS Involvement with Paging Process creation –determine program size –create page table Process execution –MMU reset for new process –TLB flushed

134 134 OS Involvement with Paging Page fault time –determine the virtual address that causes the fault –swap target page out, needed page in Process termination time –release page table, pages

135 135 Page Fault Handling Hardware traps to kernel General registers saved OS determines which virtual page needed OS checks validity of address, seeks page frame If selected frame is dirty, write it to disk

136 136 Page Fault Handling OS schedules the new page to be brought in from disk Page tables updated Faulting instruction backed up to where it began Faulting process scheduled Registers restored Program continues

137 137 Instruction Backup An instruction causing a page fault

138 138 Locking Pages in Memory Sometimes may need to lock a page in memory –i.e. prohibit its eviction from memory

139 139 Locking Pages in Memory A process issues a call to read from a device into a buffer –while it waits for the I/O, another process starts up –and has a page fault –the buffer of the first process may be chosen to be paged out Need a way to mark some pages as locked –exempted from being eviction targets

140 140 Backing Store When a page is paged out, where does it go on disk? Two options: –A specially designated area: the swap area –Or the program's normal location in the file system

141 141 Backing Store When using a swap area, there are two options: Allocate swap area for the entire process –Do this at load time, before execution Allocate swap area only for the parts of the process that are currently paged out to disk –Load the process into memory first; then, as pages get paged out, allocate swap space for them

142 142 Backing Store (a) Paging to a static swap area (b) Backing up pages dynamically

143 143 Separation of Policy and Mechanism An important technique for managing complexity When applied to memory management: Have most functions of the memory manager run in user space Leave only the fault handler and page table management inside the kernel

144 144 Separation of Policy and Mechanism Page fault handling with an external pager

145 145 Separation of Policy and Mechanism Where does the paging algorithm reside? In the kernel or in user space? If in user space, need a way to pass the R and M bits from the kernel to user space If in the kernel, the external pager wouldn't be clean

146 146 Belady's Anomaly

147 147 Belady's Anomaly (Figure: FIFO with 3 page frames vs. FIFO with 4 page frames; P's mark which page references cause page faults.)

148 148 Page Out What to do with a page when it's evicted? –i.e. do we write it back to disk or simply discard it? Why not write pages to disk on every store? –It would cost CPU time

149 149 Page Out While an evicted page is being written to disk, the page being brought into memory must wait May be able to reduce total work by giving preference to dirty pages –e.g. could evict clean pages before dirty pages If the system is idle, time might profitably be spent writing back dirty pages

150 150 Page Table Contents Data stored in the hardware page table: Resident bit: –true if the virtual page is in physical memory Physical page # (if in physical memory) Dirty bit: –set by MMU when page is written

151 151 Page Table Contents Reference bit: –set by MMU when page is read or written Protection bits (readable, writable) –set by operating system to control access to page –Checked by hardware on each access

152 152 Page Table Contents Does the hardware page table need to store the disk block # for non-resident virtual pages? Really need hardware to maintain a dirty bit? How to reduce # of faults required to do this? Do we really need hardware to maintain a reference bit?

153 153 MMU Memory management unit MMU is responsible for checking: –if the page is resident –if the page protections allow this access –and setting the dirty/reference bits

154 154 MMU If page is resident and access is allowed, –MMU translates virtual address into physical address –using info from the TLB and page table Then MMU issues the physical memory address to the memory controller

155 155 MMU If page is not resident, or protection bits disallow the access –the MMU generates an exception (page fault)

156 156 Segmentation In a paging system, each process occupies one virtual address space This may be inconvenient, since different sections of the process can grow or shrink independently

157 157 Segmentation One-dimensional address space with growing tables One table may bump into another

158 158 Segmentation The solution is segmentation!

159 159 Segmentation Segment: a region of contiguous memory space Segmentation divides both physical and virtual memory into segments Each segment is dedicated to one or more sections of a process In pure segmentation, each segment is kept in memory as a whole (there is no paging within segments)

160 160 Segmentation Allows each table to grow or shrink independently

161 161 Segmentation Let's generalize this to allow multiple segments –described by a table of base & bound pairs (Table: segment # with a base, bound, and description per entry; the rows are a code segment, a data segment, an unused segment, and a stack segment.)

162 162 Segmentation (Figure: virtual memory segments 0 (code), 1 (data), and 3 (stack) map into separate regions of physical memory; offsets within each segment run from 0 up to that segment's bound, e.g. up to 4ff in segment 1.)

163 163 Segmentation Note that not all virtual addresses are valid –e.g. no valid data in segment 2; –no valid data in segment 1 above 4ff Valid means the region is part of the process's virtual address space

164 164 Segmentation Invalid means the virtual address is illegal for the process to access Accesses to an invalid address will cause the OS to take corrective measures –usually a core dump

165 165 Segmentation Protection: –different segments can have different protection –e.g. code can be read-only (allows inst. fetch, load) –e.g. data is read/write (allows fetch, load, store) In contrast, base & bounds gives same protection to entire address space

166 166 Segmentation In segmentation, a virtual address takes the form: –(virtual segment #, offset) Could specify the virtual segment # via –The high bits of the address, –Or a special register, –Or implicitly in the instruction opcode

167 167 Implementation of Pure Segmentation

168 168 Segmentation What must be changed on a context switch?

169 169 Pros and Cons of Segmentation + works well for sparse address spaces –with big gaps of invalid areas + easy to share whole segments without sharing entire address space - complex memory allocation

170 170 Compare Paging and Segmentation
–Need the programmer be aware that this technique is being used? Paging: No. Segmentation: Yes
–How many linear address spaces? Paging: 1. Segmentation: Many
–Can the total address space exceed the size of physical memory? Paging: Yes. Segmentation: Yes
–Can procedures and data be distinguished & separately protected? Paging: No. Segmentation: Yes
–Can tables whose size fluctuates be accommodated easily? Paging: No. Segmentation: Yes
–Is sharing of procedures between users facilitated? Paging: No. Segmentation: Yes
–Why was this technique invented? Paging: to get a large linear address space without buying more memory. Segmentation: to allow programs & data to be broken up into logically independent address spaces and to aid sharing & protection

171 171 Segmentation Can a single address space be larger than physical memory? How to make memory allocation easy and –allow an address space to be larger than physical memory?

172 172 Segmentation with Paging

173 173 Segmentation with Paging The descriptor segment points to page tables The page tables point to physical frames MULTICS used this method

174 174 SP Example: MULTICS A 34-bit MULTICS virtual address: an 18-bit segment number, then the address within the segment, split into a 6-bit page number and a 10-bit offset within the page

175 175 SP Example: MULTICS (Figure: the 18-bit segment number selects a descriptor in the descriptor segment; the descriptor points to the segment's page table; the 6-bit page number indexes that page table to find the page frame; the 10-bit offset selects the word within the frame.)

176 176 SP Example: MULTICS TLB

177 177 SP Example: MULTICS Simplified version of the MULTICS TLB The existence of 2 page sizes makes the actual TLB more complicated

178 178 SP Example: Pentium Pentium virtual memory contains two tables: Global Descriptor Table: –Describes system segments, including OS Local Descriptor Table: –Describes segments local to each program

179 179 SP Example: Pentium A Pentium selector contains a bit to indicate whether the segment is local or global: 13 bits of LDT or GDT entry number, 1 bit for the table (0 = GDT / 1 = LDT), and 2 bits of privilege level (0-3)

180 180 SP Example: Pentium Pentium code segment descriptor (Data segments differ slightly)

181 181 SP Example: Pentium Conversion of a (selector, offset) pair to a linear address

182 182 Pentium Address Mapping

183 183 Protection on the Pentium (Figure: protection levels 0-3.)

184 184 Translation data Where is translation data kept? How can the kernel refer to translation data? –Translation data is not in any process's address space –it's in physical (i.e. un-translated) memory

185 185 Translation data The kernel can issue un-translated addresses –i.e. bypass the translator The kernel can map physical memory into a portion of its address space How does the kernel access a user's address space?

186 186 Kernel vs. User Mode Who sets up the data used by translator? Kernel is allowed to modify any memory –including translation tables

187 187 Kernel vs. User Mode How does machine know kernel is running? Machine must know to allow kernel to –bypass translator, and –execute privileged instructions (e.g. halt, I/O) Need hardware support: –two processor modes (kernel and user)

188 188 Kernel vs. User Mode How have we handled the problem of protection so far?

189 189 Kernel vs. User Mode Implement protection by translating all addresses –But who can modify the data used by the translator? Only the kernel can modify the translator's data –but how does the processor know if the kernel is running? A mode bit distinguishes between kernel and user –But who is allowed to modify the mode bit?

190 190 Switch from User Process into Kernel What causes a switch from a user process into the kernel? Let's look at an example: –Sequence of events when calling cin

191 191 Event Sequence When Calling CIN C++ code calls cin cin is a standard library function that calls read() read() is a standard library function that –executes the assembly-language instruction syscall –with parameters (SYS_read, file number, size) –parameters are in registers or on the stack

192 192 Event Sequence When Calling CIN When the processor executes the syscall instruction –it traps to the kernel at a pre-specified location The kernel syscall handler receives the trap –and calls the kernel's read() function

193 193 What happens when trapping to kernel Set processor mode bit to kernel Save current registers –SP, PC, general purpose registers Set SP to the kernel's stack Change address spaces to the kernel's space –by changing some data used by the translator Jump to the kernel exception handler

194 194 Questions Does this look familiar? How does processor know exception handlers address?

195 195 Passing Arguments to System Call Can store arguments in registers or memory –according to agreed-upon convention If pass arguments via memory –which address space holds the arguments? How does kernel access users address space?

196 196 Passing Arguments to System Call Kernel cannot assume arguments are valid It must be paranoid and check them all –or process could crash kernel with bogus arguments

197 197 Process Creation Processes are created and destroyed all the time Many steps are involved in process creation

198 198 Steps in Process Creation Allocate process control block Read code from disk and store into memory Initialize machine registers Initialize translator data (page table and PTBR) Set processor mode bit to user Jump to start of program

199 199 Steps in Process Creation Need hardware support for the last few steps Otherwise a processor executing in user mode can't access the kernel's jump instruction Switching from the kernel to a user process –e.g. after a system call completes –is the same as the last 4 steps above

200 200 Multi-process Issues How to allocate physical memory between processes? Resource allocation is an issue whenever sharing a single resource among multiple users –e.g. CPU scheduling Often a tradeoff between globally optimal (best overall performance) and fairness

201 201 Replacement Policy Global replacement: –Consider all pages equally for eviction Local replacement: –Only consider pages belonging to the process needing a new page when looking for a page to evict –But how to set the # of pages assigned to a process?

202 202 Replacement Policy Generally, global has lower overall miss rate –but local is more fair

203 203 Thrashing What would happen with lots of big processes, all actively using lots of virtual memory? Usually, performance degrades rapidly as you go from having all programs fit in memory to not quite fitting in memory This is called thrashing

204 204 Thrashing Average access time = –hit rate × hit time + miss rate × miss time –e.g. hit time = 0.0001 ms, miss time = 10 ms 100% hit rate: –average access time is 0.0001 ms 99% hit rate: –0.99 × 0.0001 + 0.01 × 10 ≈ 0.1 ms (about 1000× slower) 90% hit rate: –0.9 × 0.0001 + 0.1 × 10 ≈ 1 ms

205 205 Solutions to Thrashing If a single process is actively using more pages than can fit –there's no solution –that process (at least) will thrash

206 206 Solutions to Thrashing If the cause is the combination of several processes Can alleviate thrashing by swapping all pages of a process out to disk –That process won't run at all –but other processes will run much faster –Overall performance improves

207 207 Working Set What's meant by a process actively using a lot of virtual pages? Working set: all pages used in the last T seconds –or the last T instructions A larger working set means –the process needs more physical memory to run well –i.e. to avoid thrashing

208 208 Working Set Sum of all working sets should fit in memory –otherwise system will thrash Only run a set of processes whose working sets all fit in memory –this is called a balance set How to measure size of working set for a process?

209 209 Examples of Process Creation Unix separates process creation into two steps Unix fork: creates a new process (with 1 thread) –The address space of the new process is a copy of the parent's

210 210 Examples of Process Creation Unix exec: overlays the new process's address space with –the specified program, and jumps to its starting PC –this loads the new program

211 211 Examples of Process Creation Example: Parent process wants to fork a child to do a task. Any problem with having the new process be an exact copy of the parent?

212 212 Examples of Process Creation Why does Unix fork copy the parent's entire address space, just to throw it out and start with a new address space? Unix provides the semantics of copying the parent's entire address space –but does not physically copy the data until needed (copy-on-write)

213 213 Examples of Process Creation Separating fork and exec gives maximum flexibility for the parent process to pass information to the child Common special case: fork a new process that runs the same code as parent

214 214 Alternative Process Creation Windows creates processes with a single call –CreateProcess Unix's approach gives the flexibility of sharing arbitrary data with the child process The Windows approach allows the program to share the most common data via parameters

215 215 Implementing a Shell The shell provides the user interface –sh, csh, tcsh, bash, zsh, etc. Windows Explorer is similar –looks like part of the operating system –but we now know enough to write a shell as a standard user program How to write a shell?
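A minimal sketch of the core loop, using only fork, exec, and wait (no argument parsing, pipes, or job control; each input line is treated as a bare command name):

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <sys/wait.h>
  #include <unistd.h>

  int main(void) {
      char line[256];
      for (;;) {
          printf("$ ");                          /* prompt */
          if (fgets(line, sizeof line, stdin) == NULL)
              break;                             /* EOF: exit the shell */
          line[strcspn(line, "\n")] = '\0';      /* strip trailing newline */
          if (line[0] == '\0')
              continue;
          pid_t pid = fork();                    /* child = copy of shell */
          if (pid == 0) {
              execlp(line, line, (char *)NULL);  /* overlay with program  */
              perror("exec failed");             /* runs only on error    */
              exit(1);
          }
          waitpid(pid, NULL, 0);                 /* parent waits for child */
      }
      return 0;
  }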

216 Computer Changes Life

