CSE 490/590 Computer Architecture Virtual Memory II


1 CSE 490/590 Computer Architecture Virtual Memory II
Steve Ko, Computer Science and Engineering, University at Buffalo

2 Last time…
Virtual memory organization: linear page table, hierarchical page table
Page-table walk: in software or in hardware
TLB: caches address translations

3 Virtual Address Caches
[Figure: conventional organization, CPU ⇒ TLB ⇒ physical cache ⇒ primary memory, versus a virtual cache placed before the TLB, CPU ⇒ virtual cache ⇒ TLB ⇒ primary memory (as in the StrongARM)]
Alternative: place the cache before the TLB.
(+) One-step process in case of a hit
(-) Cache needs to be flushed on a context switch unless address space identifiers (ASIDs) are included in the tags
(-) Aliasing problems due to the sharing of pages
(-) Maintaining cache coherence (see later in course)

4 Aliasing in Virtual-Address Caches
[Figure: the page table maps VA1 and VA2 to the same PA, so the virtual cache holds a 1st copy of the data tagged by VA1 and a 2nd copy tagged by VA2]
Two virtual pages share one physical page, e.g., two processes sharing the same file map the same memory segment to different parts of their address spaces.
A virtual cache can then have two copies of the same physical data. Writes to one copy are not visible to reads of the other!
General solution: disallow aliases to coexist in the cache.
Software (i.e., OS) solution for direct-mapped caches: VAs of shared pages must agree in the cache index bits; this ensures all VAs accessing the same PA will conflict in a direct-mapped cache (early SPARCs). A small sketch of this rule follows below.
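A minimal sketch of the "agree in the index bits" rule, assuming a 16 KB direct-mapped virtual cache with 32-byte blocks; the geometry and the helper name are illustrative, not from the slides.

#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry: 16 KB direct-mapped virtual cache, 32-byte blocks,
   so index + block offset occupy the low 14 bits of an address. */
#define BLOCK_BITS 5
#define INDEX_BITS 9
#define COLOR_MASK (((uintptr_t)1 << (INDEX_BITS + BLOCK_BITS)) - 1)

/* Early-SPARC-style rule: two virtual addresses that alias the same physical
   data are safe only if they agree in the bits that pick the cache block.
   They then map to the same location, so two copies can never be resident
   in the cache at the same time. */
static bool alias_safe(uintptr_t va1, uintptr_t va2)
{
    return (va1 & COLOR_MASK) == (va2 & COLOR_MASK);
}

In practice the OS enforces this by choosing the virtual addresses of shared mappings to be congruent modulo the cache size (here 16 KB).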

5 Concurrent Access to TLB & Cache
[Figure: the VA's VPN goes to the TLB while the virtual index (L bits) and block offset (b bits) go to a direct-mapped cache of 2^L blocks of 2^b bytes; the TLB's PPN is compared against the cache's physical tag, with k = number of page-offset bits]
Index L is available without consulting the TLB, so cache and TLB accesses can begin simultaneously.
Tag comparison is made after both accesses are completed.
Cases: L + b = k, L + b < k, L + b > k (the bit slicing for the first case is sketched below).
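A small sketch of the bit slicing behind this parallel lookup, assuming 4 KB pages (k = 12), 32-byte blocks (b = 5), and 128 sets (L = 7), so that L + b = k and the index lies entirely inside the untranslated page offset; the constants are illustrative only.

#include <stdint.h>

#define B 5     /* log2(block size in bytes)   */
#define L 7     /* log2(number of cache sets)  */
#define K 12    /* log2(page size): L + B == K */

/* Virtual index: usable before translation because, with L + B <= K,
   it comes entirely from the page offset, which the TLB does not change. */
static inline uint32_t cache_index(uint32_t va) {
    return (va >> B) & ((1u << L) - 1);
}

/* Physical tag: the PA bits above index + offset. The PPN from the TLB
   is needed before this comparison can complete. */
static inline uint32_t phys_tag(uint32_t pa) {
    return pa >> (L + B);
}

/* Hit test once both the TLB (giving pa) and the cache (giving the stored
   tag for the indexed set) have responded. */
static inline int cache_hit(uint32_t stored_tag, uint32_t pa) {
    return stored_tag == phys_tag(pa);
}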

6 Virtual-Index Physical-Tag Caches: Associative Organization
[Figure: the VA's VPN goes to the TLB while L = k − b bits of the page offset index a 2^a-way set-associative, virtually-indexed cache; the TLB's PPN is compared against 2^a physical tags]
Consider 4-Kbyte pages and caches with 32-byte blocks:
32-Kbyte cache ⇒ 2^a = 8
4-Mbyte cache ⇒ 2^a = 1024. No!
After the PPN is known, 2^a physical tags are compared.
How does this scheme scale to larger caches? (The arithmetic is checked in the short example below.)
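A quick check of those numbers. The page, block, and cache sizes come from the slide; the program itself is only an illustration.

#include <stdio.h>

int main(void) {
    const unsigned page_bytes  = 4096;
    const unsigned block_bytes = 32;
    /* The index may only use page-offset bits, so at most page/block sets. */
    const unsigned sets = page_bytes / block_bytes;             /* 128 sets    */
    const unsigned cache_bytes[] = { 32u << 10, 4u << 20 };     /* 32 KB, 4 MB */

    for (int i = 0; i < 2; i++) {
        unsigned ways = cache_bytes[i] / (sets * block_bytes);  /* required 2^a */
        printf("%8u-byte cache needs %u-way associativity\n",
               cache_bytes[i], ways);
    }
    return 0;   /* prints 8 for the 32 KB cache and 1024 for the 4 MB cache */
}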

7 Concurrent Access to TLB & Large L1
The problem with L1 > page size:
[Figure: a virtually-indexed, direct-mapped L1 PA cache whose index uses a bits of the VPN in addition to the page offset; two entries, VA1 and VA2, can each hold (PPNa, Data) while the TLB translates VPN ⇒ PPN and the physical tag is compared for a hit]
Can VA1 and VA2 both map to PA?

8 A solution via Second Level Cache
[Figure: CPU with register file, split L1 instruction and data caches, a unified L2 cache, and memory behind it]
Usually a common L2 cache backs up both the instruction and data L1 caches.
L2 is "inclusive" of both the instruction and data caches.

9 Anti-Aliasing Using L2: MIPS R10000
[Figure: the virtually-indexed, direct-mapped L1 PA cache from the previous slide, plus a direct-mapped L2 indexed by PA; the a bits of the VPN used in the L1 index are stored into the L2 tag]
Suppose VA1 and VA2 both map to PA, and VA1 is already in L1 and L2 (VA1 ≠ VA2).
After VA2 is resolved to PA, a collision will be detected in L2.
VA1 will be purged from L1 and L2, and VA2 will be loaded ⇒ no aliasing!

10 Virtually-Addressed L1: Anti-Aliasing using L2
[Figure: a virtually-indexed, virtually-tagged L1 VA cache alongside a physically-indexed, physically-tagged L2 PA cache; the L2 entry for PA also records VA1, the virtual address under which the data currently sits in L1]
A physically-addressed L2 can also be used to avoid aliases in a virtually-addressed L1.
L2 "contains" L1.

11 Page Fault Handler
When the referenced page is not in DRAM:
The missing page is located (or created).
It is brought in from disk, and the page table is updated.
Another job may be run on the CPU while the first job waits for the requested page to be read from disk.
If no free pages are left, a page is swapped out (pseudo-LRU replacement policy).
Since it takes a long time to transfer a page (milliseconds), page faults are handled completely in software by the OS.
An untranslated addressing mode is essential to allow the kernel to access page tables.
(A toy sketch of this sequence follows below.)
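A toy, self-contained model of that sequence, assuming a tiny 4-frame memory; every structure, constant, and the round-robin victim choice (standing in for pseudo-LRU) is an illustrative simplification, not any real OS's code.

#include <stdbool.h>
#include <stdio.h>

#define NUM_VPAGES 16   /* virtual pages                 */
#define NUM_FRAMES  4   /* physical frames in "DRAM"     */

typedef struct { bool valid; int ppn; int disk_page; } pte_t;

static pte_t page_table[NUM_VPAGES];
static int   frame_owner[NUM_FRAMES];   /* VPN held by each frame, -1 if free  */
static int   victim_hand;               /* round-robin stand-in for pseudo-LRU */

static void swap_out(int frame)         { printf("  write frame %d to disk\n", frame); }
static void swap_in(int dpn, int frame) { printf("  read disk page %d into frame %d\n", dpn, frame); }

/* Handle a reference to a page that is not in DRAM. */
static void handle_page_fault(int vpn)
{
    int frame = -1;
    for (int f = 0; f < NUM_FRAMES; f++)            /* 1. look for a free frame */
        if (frame_owner[f] < 0) { frame = f; break; }

    if (frame < 0) {                                /* 2. none free: evict      */
        frame = victim_hand;
        victim_hand = (victim_hand + 1) % NUM_FRAMES;
        page_table[frame_owner[frame]].valid = false;
        swap_out(frame);                            /* disk I/O takes msecs;    */
    }                                               /* OS runs another job      */

    swap_in(page_table[vpn].disk_page, frame);      /* 3. bring the page in     */
    page_table[vpn].valid = true;                   /* 4. update the page table */
    page_table[vpn].ppn   = frame;
    frame_owner[frame]    = vpn;
}                                                   /* 5. restart the instruction */

int main(void)
{
    for (int f = 0; f < NUM_FRAMES; f++) frame_owner[f] = -1;
    for (int v = 0; v < NUM_VPAGES; v++) page_table[v] = (pte_t){ false, -1, v };

    const int refs[] = { 0, 1, 2, 3, 4, 1, 5 };     /* the last two faults evict pages */
    for (unsigned i = 0; i < sizeof refs / sizeof refs[0]; i++)
        if (!page_table[refs[i]].valid) {
            printf("page fault on VPN %d\n", refs[i]);
            handle_page_fault(refs[i]);
        }
    return 0;
}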

12 Swapping a Page of a Page Table
A PTE in primary memory contains primary or secondary memory addresses.
A PTE in secondary memory contains only secondary memory addresses.
⇒ A page of a PT can be swapped out only if none of its PTEs point to pages in primary memory. Why?
What is the worst thing you can do with respect to storing page tables? Storing on disk a page of the page table whose entries point to physical memory.

13 Virtual Memory Use Today - 1
Desktops/servers have full demand-paged virtual memory:
Portability between machines with different memory sizes
Protection between multiple users or multiple tasks
Share small physical memory among active tasks
Simplifies implementation of some OS features
Vector supercomputers have translation and protection but not demand-paging (older Crays: base & bound; Japanese machines and Cray X1/X2: pages):
Don't waste expensive CPU time thrashing to disk (make jobs fit in memory)
Mostly run in batch mode (run a set of jobs that fits in memory)
Difficult to implement restartable vector instructions

14 Virtual Memory Use Today - 2
Most embedded processors and DSPs provide physical addressing only:
Can't afford the area/speed/power budget for virtual memory support
Often there is no secondary storage to swap to!
Programs are custom written for a particular memory configuration in the product
Difficult to implement restartable instructions for exposed architectures

15 CSE 490/590 Administrivia
Midterm on Friday, 3/4
Project 1 deadline: Friday, 3/11
Quiz 1 regrading ⇒ Jangyoung
CSE machines are available for projects: thin clients & SSH only for simulation (Linux & Windows); 216 Bell for board

16 Address Translation in CPU Pipeline
[Figure: five-stage pipeline PC ⇒ Decode ⇒ Execute ⇒ Memory ⇒ Writeback, with an Inst TLB and Inst. Cache at fetch and a Data TLB and Data Cache at the memory stage; either TLB access can raise a TLB miss, page fault, or protection violation]
Software handlers need a restartable exception on page fault or protection violation.
Handling a TLB miss needs a hardware or software mechanism to refill the TLB.
Need mechanisms to cope with the additional latency of a TLB:
slow down the clock
pipeline the TLB and cache access
virtual-address caches
parallel TLB/cache access

17 Address Translation: putting it all together
[Flow diagram:
Virtual Address ⇒ TLB Lookup (hardware).
On a miss ⇒ Page Table Walk (hardware or software). If the page is ∉ memory ⇒ Page Fault: the OS loads the page (software) and the instruction is restarted (soft and hard page faults). If the page is ∈ memory ⇒ Update TLB (hardware or software) and retry the lookup.
On a hit ⇒ Protection Check (hardware). If denied ⇒ Protection Fault ⇒ SEGFAULT (software). If permitted ⇒ Physical Address (to cache).]
(A compact model of this flow is sketched below.)
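A compact, self-contained model of that decision flow. The structures, the direct-mapped 16-entry TLB, and the linear page table are simplifying assumptions; in real hardware the hit path and the fault handlers are of course not one C function.

#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)
#define TLB_SIZE   16

typedef enum { XLATE_OK, XLATE_PAGE_FAULT, XLATE_PROT_FAULT } xlate_t;

typedef struct { bool in_memory; bool writable; uint32_t ppn; } pte_t;
typedef struct { bool valid; bool writable; uint32_t vpn, ppn; } tlb_entry_t;

static tlb_entry_t tlb[TLB_SIZE];
static pte_t page_table[1u << 20];          /* linear table, for simplicity   */

static xlate_t do_translate(uint32_t va, bool is_write, uint32_t *pa)
{
    uint32_t vpn = va >> PAGE_SHIFT;

    tlb_entry_t *e = &tlb[vpn % TLB_SIZE];  /* TLB lookup (hardware)          */
    if (!e->valid || e->vpn != vpn) {       /* miss -> page-table walk        */
        pte_t *pte = &page_table[vpn];      /* (hardware or software)         */
        if (!pte->in_memory)
            return XLATE_PAGE_FAULT;        /* OS loads page, restarts insn   */
        *e = (tlb_entry_t){ true, pte->writable, vpn, pte->ppn };  /* refill  */
    }
    if (is_write && !e->writable)           /* protection check               */
        return XLATE_PROT_FAULT;            /* -> SEGFAULT                    */
    *pa = (e->ppn << PAGE_SHIFT) | (va & PAGE_MASK);
    return XLATE_OK;                        /* physical address (to cache)    */
}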

18 Translation Lookaside Buffers
Address translation is very expensive! In a two-level page table, each reference becomes several memory accesses (e.g., 3 memory references, plus up to 2 page faults / disk accesses).
Solution: cache translations in a TLB (translation caches of this kind were actually used in IBM machines before paged memory).
TLB hit ⇒ single-cycle translation
TLB miss ⇒ page-table walk to refill the TLB
[Figure: the VPN of the virtual address (VPN, offset) is matched against TLB entries holding V, R, W, D status bits, a tag, and a PPN (VPN = virtual page number, PPN = physical page number); on a hit the physical address is (PPN, offset). A possible entry layout is sketched below.]
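The entry format drawn on the slide, written out as a C struct; the 20-bit tag and PPN widths assume a 32-bit machine with 4 KB pages and are only illustrative.

#include <stdint.h>

/* One TLB entry: V (valid), R (read), W (write), D (dirty) status bits,
   a tag holding the VPN, and the cached PPN. */
typedef struct {
    uint32_t v   : 1;
    uint32_t r   : 1;
    uint32_t w   : 1;
    uint32_t d   : 1;
    uint32_t tag : 20;   /* VPN (virtual page number)   */
    uint32_t ppn : 20;   /* PPN (physical page number)  */
} tlb_entry_t;

/* On a hit (v set and tag equal to the VPN of the access) the physical
   address is (ppn << 12) | page_offset; on a miss, a page-table walk
   refills the entry. */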

19 Linear Page Table
Page Table Entry (PTE) contains:
A bit to indicate if a page exists
PPN (physical page number) for a memory-resident page
DPN (disk page number) for a page on the disk
Status bits for protection and usage
The OS sets the Page Table Base Register whenever the active user process changes.
[Figure: the VPN of the virtual address (VPN, offset) indexes, relative to the PT Base Register, a linear page table whose entries hold a PPN or a DPN; the selected PPN plus the offset forms the address of the data word in the data pages. A small sketch of this organization follows below.]
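A tiny sketch of the linear organization; the field names and widths are illustrative assumptions, not a real ISA's PTE format.

#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)

/* One entry of the linear table. */
typedef struct {
    bool     resident;                  /* page exists in primary memory     */
    bool     readable, writable, dirty; /* status bits for protection/usage  */
    uint32_t ppn;                       /* physical page number if resident  */
    uint32_t dpn;                       /* disk page number otherwise        */
} pte_t;

/* Set by the OS whenever the active user process changes. */
static pte_t *pt_base_register;

/* Linear organization: one PTE per virtual page, indexed directly by VPN. */
static bool translate_linear(uint32_t va, uint32_t *pa)
{
    pte_t *pte = &pt_base_register[va >> PAGE_SHIFT];
    if (!pte->resident)
        return false;                   /* fault: page is on disk at pte->dpn */
    *pa = (pte->ppn << PAGE_SHIFT) | (va & PAGE_MASK);
    return true;
}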

20 Hierarchical Page Table
[Figure: a 32-bit virtual address split as bits 31-22 = p1 (10-bit L1 index), bits 21-12 = p2 (10-bit L2 index), and bits 11-0 = offset. The Root of the Current Page Table (a processor register) points to the Level 1 page table in physical memory; each L1 entry points to a Level 2 page table, whose PTEs select the data pages. A PTE may refer to a page in primary memory, a page in secondary memory, or a nonexistent page.]

21 Hierarchical Page Table
A program that traverses the page table needs a "no translation" addressing mode.
[Figure: the same two-level structure as the previous slide: p1 (bits 31-22) indexes the Level 1 page table, p2 (bits 21-12) indexes a Level 2 page table, and the 12-bit offset selects the byte within the data page. A sketch of the walk follows below.]
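A sketch of the two-level walk implied by the figure; the in-memory PTE layout and the use of ordinary pointers (rather than the untranslated addressing mode the slide calls for) are simplifying assumptions.

#include <stddef.h>
#include <stdint.h>

#define L1_INDEX(va)    (((va) >> 22) & 0x3FF)  /* bits 31-22: 10-bit L1 index */
#define L2_INDEX(va)    (((va) >> 12) & 0x3FF)  /* bits 21-12: 10-bit L2 index */
#define PAGE_OFFSET(va) ((va) & 0xFFF)          /* bits 11-0                   */

typedef struct { int in_memory; uint32_t ppn; } pte_t;
typedef pte_t l2_table_t[1024];

/* Walk the two-level table whose root (the processor register on the slide)
   is `l1`. */
static pte_t *walk(l2_table_t *const *l1, uint32_t va)
{
    l2_table_t *l2 = l1[L1_INDEX(va)];          /* L1 entry -> L2 page table  */
    if (l2 == NULL)
        return NULL;                            /* PTE of a nonexistent page  */
    return &(*l2)[L2_INDEX(va)];                /* L2 entry for this page     */
}

static int translate_two_level(l2_table_t *const *l1, uint32_t va, uint32_t *pa)
{
    pte_t *pte = walk(l1, va);
    if (pte == NULL || !pte->in_memory)
        return -1;                              /* nonexistent, or page is in secondary memory */
    *pa = (pte->ppn << 12) | PAGE_OFFSET(va);
    return 0;
}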

22 Acknowledgements
These slides contain material developed and copyright by Krste Asanovic (MIT/UCB) and David Patterson (UCB), and also by: Arvind (MIT), Joel Emer (Intel/MIT), James Hoe (CMU), and John Kubiatowicz (UCB).
MIT material derived from course 6.823; UCB material derived from course CS252.

