Presentation on theme: "CS161 Spring 2014 — VM Internals. Topics: Hardware support for virtual memory. Learning Objectives: Specify what the hardware must provide to enable virtual memory; discuss the hardware mechanisms that make virtual memory perform well."— Presentation transcript:

1 CS161 Spring 2014 (2/27/14). VM Internals. Topics: Hardware support for virtual memory. Learning Objectives: Specify what the hardware must provide to enable virtual memory. Discuss the hardware mechanisms that make virtual memory perform well.

2 OS Designer Humor. "Hey, what if we translated every user address?" "Wow, that sounds really slow!" "What do we usually do to improve performance?" "Well, we could try a cache…"

3 Everything's a Cache. [Diagram: the memory hierarchy, from CPU registers through the L1 cache, L2 cache, and main memory down to disk.]

4 A Virtual Address Cache. A translation lookaside buffer (TLB) is a cache of virtual-to-physical address translations, implemented in hardware. It contains a few tens to a few hundreds of mappings between virtual and physical addresses. But a process may have many, many more mappings; how can such a small number help us?

5 A Virtual Address Cache. A translation lookaside buffer (TLB) is a cache of virtual-to-physical address translations, implemented in hardware. It contains a few tens to a few hundreds of mappings between virtual and physical addresses. But a process may have many, many more mappings; how can such a small number help us? Programs exhibit locality! At any one time, you are only executing out of a few pages, and you are only accessing data from a few pages. Reuse a mapping many, many times and then replace it when you no longer need it.
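
As a concrete, purely illustrative sketch of what the slide describes, here is a tiny software model of a TLB as a cache of virtual-to-physical page mappings. The 64-entry size, field names, and fully linear search are assumptions for clarity, not the hardware layout:

    #include <stdint.h>
    #include <stdbool.h>

    #define TLB_ENTRIES 64          /* "a few tens to a few hundreds" of mappings */

    struct tlb_entry {
        uint32_t vpn;               /* virtual page number */
        uint32_t ppn;               /* physical page number */
        bool     valid;             /* does this slot hold a real mapping? */
    };

    static struct tlb_entry tlb[TLB_ENTRIES];

    /* Look up a virtual page number; returns true and fills *ppn on a hit. */
    static bool tlb_lookup(uint32_t vpn, uint32_t *ppn)
    {
        for (int i = 0; i < TLB_ENTRIES; i++) {
            if (tlb[i].valid && tlb[i].vpn == vpn) {
                *ppn = tlb[i].ppn;  /* hit: reuse the cached translation */
                return true;
            }
        }
        return false;               /* miss: must consult the page table */
    }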

6 The MMU. [Diagram: the CPU issues virtual addresses to the MMU, which consults its TLB (falling back to the regular translation path on a miss) and sends physical addresses to memory.]

7 The TLB. [Diagram: the virtual address 10480003 is split into a page number (1048) and an offset (0003); the page number is compared against the TLB's VA/PA/valid entries. Nothing matches: Miss! Trap!]

8 A TLB Miss (in software). [Diagram: the page number is compared against all TLB entries in parallel; on a miss the hardware traps, saving the exception PC and the faulting address 10480003. The software handler indexes the page table with the faulting page number, finds physical page 001D, and performs a TLB refill.]
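
The software refill path the slide describes might look roughly like the following sketch; page_table_lookup, tlb_write, page_fault, and current_process are hypothetical names standing in for the real MIPS exception handler and coprocessor-0 mechanisms:

    #include <stdint.h>
    #include <stddef.h>

    struct proc;                                          /* opaque process descriptor (assumed) */
    struct pte { uint32_t ppn; int present; };            /* hypothetical page-table entry */
    struct pte *page_table_lookup(struct proc *p, uint32_t vpn);  /* assumed helper */
    void tlb_write(uint32_t vpn, uint32_t ppn);                   /* assumed helper */
    void page_fault(uint32_t vaddr);                              /* assumed helper */
    extern struct proc *current_process;                          /* assumed */

    /* Invoked from the trap handler on a TLB miss (a sketch, not real MIPS code). */
    void tlb_miss_handler(uint32_t faulting_vaddr)
    {
        uint32_t vpn = faulting_vaddr >> 12;              /* assuming 4 KB pages */
        struct pte *pte = page_table_lookup(current_process, vpn);

        if (pte == NULL || !pte->present) {
            page_fault(faulting_vaddr);                   /* no mapping at all: a real page fault */
            return;
        }
        tlb_write(vpn, pte->ppn);                         /* refill, then retry the instruction */
    }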

9 TLB Refill. [Diagram: the mapping 1048 → 001D from the page table is written into a free TLB slot and marked valid.]

10 TLB Hit (1). [Diagram: virtual address 10480003 is split into page 1048 and offset 0003; the page number matches a valid TLB entry mapping 1048 → 001D, so physical page 001D is concatenated with the unchanged offset to form the physical address.]

11 TLB Hit (2). [Diagram: a second lookup, this time for virtual address 104C0003, goes through the same steps: the page number 104C is compared against the TLB entries and, on a valid match, the entry's physical page number is combined with the offset 0003 to form the physical address.]

12 When your TLB Fills … [Diagram: a lookup for virtual page 3246 misses, but every TLB entry (FD04→2340, 0191→140E, 0003→001D, 4223→1403) is already valid.] We know how to refill, but there are no empty spots!

13 TLB Eviction. How do we decide which entry to evict? The goal is to evict something you are unlikely to need soon. Thoughts? We may also be constrained by where we are allowed to place an entry …

14 TLB Eviction. How do we decide which entry to evict? The goal is to evict something you are unlikely to need soon. If we knew which entry would be used farthest in the future, we would like to evict that one, because it's doing us the least good. If our fortune telling isn't as good as we would like, how might we guess? How about the entry that has been unused the longest? That would lead us to want to do LRU replacement (see the sketch below). We may also be constrained by where we are allowed to place an entry: where we place entries is determined by how we find entries in the TLB.
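
One hedged way to picture LRU replacement in software: stamp each entry with a logical clock on every use and evict the entry with the oldest stamp. Real TLBs approximate this in hardware, if they track recency at all; the structure below is an illustrative assumption:

    #include <stdint.h>

    #define TLB_ENTRIES 64

    struct tlb_entry {
        uint32_t vpn, ppn;
        int      valid;
        uint64_t last_used;                      /* updated on every hit */
    };

    static struct tlb_entry tlb[TLB_ENTRIES];
    static uint64_t now;                         /* logical clock, bumped on each access */

    /* Pick a victim: a free slot if one exists, otherwise the least recently used entry. */
    static int tlb_choose_victim(void)
    {
        int victim = 0;
        for (int i = 0; i < TLB_ENTRIES; i++) {
            if (!tlb[i].valid)
                return i;                        /* free slot: no eviction needed */
            if (tlb[i].last_used < tlb[victim].last_used)
                victim = i;                      /* older than the current candidate */
        }
        return victim;
    }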

15 TLB Searching. How do we quickly look up a virtual page number? Linear search?

16 TLB Searching. How do we quickly look up a virtual page number? Linear search? Very simple! Likely to be really slow! Direct mapped: let each page number map to a particular slot in the TLB. Typically select the slot by extracting a few bits from the page number. [Diagram: a 16-bit page number with candidate index fields marked: high bits? low bits? middle bits?]

17 Direct Mapping: Which bits? High bits? Low bits? Middle bits? A mixture of high and low bits?

18 Direct Mapping: Which bits? High bits: all the pages in a particular segment map to the same entry. Low bits: great for sequential access, but "obvious" places in different segments will map to the same place (e.g., the beginning of code and the beginning of data). Middle bits: sensitive to segment sizes and the actual mapping. Mixture of high and low bits: works well using mostly low bits and adding in a high bit to differentiate segments (a sketch of these choices follows below).
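
A small sketch of how the index bits might be extracted for each of the choices above; the 64-entry size and the particular bit positions are illustrative assumptions, not any specific processor's scheme:

    #include <stdint.h>

    #define TLB_ENTRIES 64               /* 6 index bits, chosen for illustration */
    #define INDEX_BITS  6

    /* Low bits: good for sequential access, but the start of every
     * segment tends to land in the same slot. */
    static unsigned index_low(uint32_t vpn)
    {
        return vpn & (TLB_ENTRIES - 1);
    }

    /* High bits: every page within a segment maps to the same entry.
     * vpn_bits is the width of the page number (assumed >= INDEX_BITS). */
    static unsigned index_high(uint32_t vpn, unsigned vpn_bits)
    {
        return (vpn >> (vpn_bits - INDEX_BITS)) & (TLB_ENTRIES - 1);
    }

    /* Mixture: mostly low bits, with the top bit of the page number
     * folded in to tell segments apart. */
    static unsigned index_mixed(uint32_t vpn, unsigned vpn_bits)
    {
        uint32_t high = (vpn >> (vpn_bits - 1)) & 1;
        return (vpn ^ (high << (INDEX_BITS - 1))) & (TLB_ENTRIES - 1);
    }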

19 Direct Mapping: Thrashing (1). One of the problems with direct-mapped caches is that they are subject to thrashing: that is, repeated missing and eviction of the same pages. Consider a sequence of instructions whose instruction addresses and data addresses map to the same entry in the TLB. For example: for (i = 0; i < 1024; i++) sum += array[i]; Let's assume that this loop starts at PC 12340000 and the array resides at 32040000. Our page numbers are 1234 (0001 0010 0011 0100, the instruction page) and 3204 (0011 0010 0000 0100, the data page); with low-bit indexing in this small TLB, both map to the same TLB entry!

20 Direct Mapping: Thrashing (2). The loop body: 12340000: lw r1, 0(r0); 12340004: add r2, r2, r1; 12340008: br loop; 1234000C: add r0, r0, 4, with r0 holding the array address 32040000. [Diagram: a four-entry direct-mapped TLB alongside a trace table of PC, VPN, PPN, and hit/miss; the first instruction fetch from page 1234 (physical page 2244) hits.]

21 Direct Mapping: Thrashing (2), continued. [Diagram: the completed trace. Instruction fetches from page 1234 translate to physical page 2244 and data accesses to page 3204 translate to physical page 3579; because both pages index the same TLB entry, switching between them keeps evicting the other mapping, so the trace alternates between misses and hits instead of hitting steadily.]

22 TLB Hit Ratios. TLB performance or effectiveness is expressed in terms of a hit rate: the percentage of times that an access hits in the TLB. TLB hit rate = (# TLB hits) / (# TLB accesses). What is the hit rate on the previous slide? Even small TLBs typically achieve about a 98% hit rate; anything less is intolerable! How do they do it?
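
As a worked example with made-up numbers (not the trace from the thrashing slides): if a program makes 1,000,000 memory accesses and 980,000 of them find their translation in the TLB, the hit rate is 980,000 / 1,000,000 = 98%, and only the remaining 2% of accesses pay the slower page-table or OS refill cost.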

23 TLB: From Direct Mapped to Set Associative. The fundamental problem we have is that any address can go in only one place. When more than one address that you need frequently maps to that one place, you are out of luck. Solution: provide more flexibility. Let an address map to multiple locations and search those locations in parallel.

24 2-Way Set Associative TLB. [Diagram: two parallel banks of (VPN, PPN, Valid) entries, eight rows each. The index bits of the VPN select one row; a comparator checks the VPN against both entries in that row, and the matching entry supplies the physical page number.]
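
A sketch of the two-way lookup in software terms; the eight sets match the figure's eight rows, but the names and structure are illustrative assumptions (and the hardware compares both ways concurrently rather than in a loop):

    #include <stdint.h>
    #include <stdbool.h>

    #define SETS 8                                /* 8 rows x 2 ways, as in the figure */
    #define WAYS 2

    struct tlb_entry { uint32_t vpn, ppn; bool valid; };

    static struct tlb_entry tlb[SETS][WAYS];

    /* The index bits pick one set; both ways in that set are checked against the VPN. */
    static bool tlb_lookup(uint32_t vpn, uint32_t *ppn)
    {
        unsigned set = vpn & (SETS - 1);          /* low bits of the VPN */
        for (int way = 0; way < WAYS; way++) {    /* done in parallel in hardware */
            if (tlb[set][way].valid && tlb[set][way].vpn == vpn) {
                *ppn = tlb[set][way].ppn;
                return true;
            }
        }
        return false;
    }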

25 Higher Associativity? If two-way is good, wouldn't four-way be better? How about 8-way? How about fully associative, where any entry can go in any location? Older, tiny TLBs were, in fact, fully associative, implemented with a content-addressable memory (CAM). CAMs are very expensive and become slower as you increase their size. In practice, most processors use a direct-mapped or two-way set associative TLB. Once you have a few hundred entries, it doesn't seem to matter.

26 Back to TLB Eviction. Recall (slide 13) that we started to look at how TLBs were arranged to answer the question, "How do we select an entry to evict when the TLB is full?" Answer: you do not have many choices. In a direct-mapped TLB? In a 2-way TLB? What do you suppose you do?

27 The MMU and Context Switching. What happens to the state in the TLB on a process switch? How could you design a smarter TLB? We've seen that traps must do a lot of work in saving and restoring state. This makes context switching expensive. What are some of the other (hidden) costs of context switching?

28 The MMU and Context Switching. What happens to the state in the TLB on a process switch? It must be invalidated! How could you design a smarter TLB? Add an address-space ID (ASID) to each TLB entry. We've seen that traps must do a lot of work in saving and restoring state, which makes context switching expensive. What are some of the other (hidden) costs of context switching? If the last process used all of the TLB, you will take a lot of TLB faults as you start running.
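
One way to picture the "smarter TLB" mentioned above: tag each entry with an address-space ID so a context switch only changes the current ASID instead of invalidating everything. A minimal sketch, with field names and sizes assumed for illustration:

    #include <stdint.h>
    #include <stdbool.h>

    #define TLB_ENTRIES 64

    struct tlb_entry {
        uint32_t vpn, ppn;
        uint8_t  asid;                 /* which address space owns this mapping */
        bool     valid;
    };

    static struct tlb_entry tlb[TLB_ENTRIES];
    static uint8_t current_asid;       /* set by the OS on each context switch */

    static bool tlb_lookup(uint32_t vpn, uint32_t *ppn)
    {
        for (int i = 0; i < TLB_ENTRIES; i++) {
            /* An entry only matches if it belongs to the running address space. */
            if (tlb[i].valid && tlb[i].asid == current_asid && tlb[i].vpn == vpn) {
                *ppn = tlb[i].ppn;
                return true;
            }
        }
        return false;
    }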

29 Address Translation Summary (1). [Diagram: a virtual address arrives; 1) look it up in the TLB (fast); on a TLB hit, the physical address is produced immediately.]

30 Address Translation Summary (2). [Diagram: a virtual address arrives; 1) look it up in the TLB (fast); on a TLB miss, 2) look it up in the hardware page table (slower); if the mapping is in the table, load it into the TLB, and the retried lookup hits and produces the physical address.]

31 Address Translation Summary (3). [Diagram: a virtual address arrives; 1) look it up in the TLB (fast); on a miss, 2) look it up in the hardware page table (slower); if the mapping is not in the table, 3) ask the operating system for help (slowest); the OS loads the page table, the TLB is loaded, and the retried lookup hits and produces the physical address.]
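
Putting the three summary slides together, the whole path might be sketched as follows; tlb_lookup, page_table_lookup, tlb_refill, and os_handle_fault are hypothetical stand-ins for the hardware and OS pieces, and the 4 KB page size is an assumption:

    #include <stdint.h>
    #include <stdbool.h>

    /* Hypothetical helpers standing in for the hardware and OS mechanisms. */
    bool tlb_lookup(uint32_t vpn, uint32_t *ppn);          /* 1) fast */
    bool page_table_lookup(uint32_t vpn, uint32_t *ppn);   /* 2) slower */
    uint32_t os_handle_fault(uint32_t vaddr);              /* 3) slowest */
    void tlb_refill(uint32_t vpn, uint32_t ppn);

    uint32_t translate(uint32_t vaddr)
    {
        uint32_t vpn = vaddr >> 12, offset = vaddr & 0xFFF;   /* assume 4 KB pages */
        uint32_t ppn;

        if (!tlb_lookup(vpn, &ppn)) {                 /* 1) TLB miss */
            if (page_table_lookup(vpn, &ppn)) {       /* 2) mapping is in the table */
                tlb_refill(vpn, ppn);
            } else {                                  /* 3) ask the OS for help */
                ppn = os_handle_fault(vaddr);         /* OS loads the page table */
                tlb_refill(vpn, ppn);
            }
        }
        return (ppn << 12) | offset;                  /* physical address */
    }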

32 Hardware/Software Boundary. [Diagram: the translation path split across the hardware/software line: the TLB is hardware, the operating system is software, and the hardware-walked page table, which is not found on all processors, sits at the boundary.]

33 HW versus SW TLB Fault Handling. Software (MIPS R2000): advantages are simplicity (of the hardware), flexibility, being better able to monitor page accesses, and being good for homework assignments in operating systems; disadvantage: slow? Hardware (x86): advantage: speed; disadvantages: less control, and the hardware dictates the page table structure.

