1 COMPSYS 304 Computer Architecture: Memory Management Units
  (title slide photo: Reefed down - heading for Great Barrier Island)

2 Memory Management Unit: Physical and Virtual Memory
- Physical memory presents a flat address space
  - Addresses 0 to 2^p - 1, where p = number of bits in an address word
- User programs accessing this space → conflicts in
  - multi-user systems (eg Unix)
  - multi-process systems (eg real-time systems)
- Virtual Address Space
  - Each user has a "private" address space
  - Addresses 0 to 2^q - 1
  - q and p are not necessarily the same!

3 Memory Management Unit: Virtual Address Space
- Each user has a "private" address space
  [diagram: separate virtual address spaces per user, eg User D's address space]

4 Virtual Addresses
- CPU (running user programs) emits a Virtual Address
- MMU translates it to a Physical Address
- Operating System allocates pages of physical memory to users

5 Virtual Addresses
- Mappings between user space and physical memory are created by the OS

6 Memory Management Unit (MMU)
- Responsible for VIRTUAL → PHYSICAL address mapping
- Sits between CPU and cache
- Cache operates on Physical Addresses (mostly - there is some research on virtually-addressed caches)
  [diagram: CPU →(VA)→ MMU →(PA)→ Cache → Main Memory, carrying data (D) or instructions (I)]

7 MMU - operation
- OS constructs page tables - one for each user
- The page number field of the virtual address selects a page table entry
- The page table entry contains the physical page address
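As a concrete sketch of this lookup (not the lecture's own code; names like PAGE_BITS and page_table are illustrative assumptions): the virtual address is split into a page number and an offset, the page number indexes the page table, and the resulting frame number is spliced back onto the offset.

    /* Minimal single-level translation sketch (illustrative names and sizes). */
    #include <stdint.h>

    #define PAGE_BITS   13                    /* k: 8 kbyte pages              */
    #define PAGE_SIZE   (1u << PAGE_BITS)
    #define OFFSET_MASK (PAGE_SIZE - 1)

    extern uint32_t page_table[];             /* one frame number per virtual page */

    uint32_t translate(uint32_t va)
    {
        uint32_t vpn    = va >> PAGE_BITS;    /* virtual page number: top q-k bits */
        uint32_t offset = va & OFFSET_MASK;   /* byte offset within the page       */
        uint32_t pfn    = page_table[vpn];    /* PTE holds the physical frame no.  */
        return (pfn << PAGE_BITS) | offset;   /* physical address                  */
    }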

8 MMU - operation
  [diagram: virtual address split into a (q-k)-bit page number and a k-bit offset; the page number indexes the page table to yield the physical page address]

9 MMU - Virtual memory space
- Page Table Entries can also point to disc blocks
- Valid bit
  - Set: page is in memory; address is the physical page address
  - Cleared: page is "swapped out"; address is a disc block address
- MMU hardware generates a page fault when a swapped-out page is requested
- Allows virtual memory space to be larger than physical memory
  - Only the "working set" is in physical memory
  - The remainder is on the paging disc
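One way to picture such an entry (a sketch only, not any real machine's PTE layout; the field widths are assumptions):

    /* Illustrative page table entry: the valid bit selects how the
       address field is interpreted. */
    #include <stdint.h>

    typedef struct {
        uint32_t valid : 1;    /* 1: addr is a physical frame number */
        uint32_t dirty : 1;    /* page modified since it was loaded  */
        uint32_t addr  : 30;   /* frame number (valid=1) or          */
                               /* disc block number (valid=0)        */
    } pte_t;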

10 MMU - Page Faults
- Page Fault Handler
  - Part of the OS kernel
  - Finds a usable physical page (LRU algorithm)
  - Writes it back to disc if modified
  - Reads the requested page from the paging disc
  - Adjusts page table entries
- Memory access is then re-tried

11 Page Fault
  [diagram: page fault path - the (q-k)-bit page number selects a PTE whose valid bit is clear, so its address field gives the disc block to fetch]

12 MMU - Page Faults (continued)
- Handling a page fault can be an expensive process!
- It is usual to allow page tables to be swapped out too
  → a page fault can be generated on the page tables themselves!
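A schematic of the handler's steps from slide 10, reusing the pte_t sketch above; the helper names (choose_victim_lru, frame_owner, swap_slot, disc_read/disc_write) are assumptions for illustration, not a real kernel's API:

    extern pte_t    pt[];                      /* the faulting user's page table  */
    extern int      choose_victim_lru(void);   /* pick a usable physical page     */
    extern pte_t   *frame_owner(int frame);    /* PTE currently using that frame  */
    extern uint32_t swap_slot(pte_t *e);       /* disc block backing a page       */
    extern void     disc_write(int frame, uint32_t block);
    extern void     disc_read(uint32_t block, int frame);

    void page_fault_handler(uint32_t vpn)
    {
        int frame = choose_victim_lru();          /* find a usable physical page  */
        pte_t *victim = frame_owner(frame);

        if (victim->dirty)
            disc_write(frame, swap_slot(victim)); /* write back if modified       */
        victim->addr  = swap_slot(victim);        /* PTE now names the disc block */
        victim->valid = 0;

        disc_read(pt[vpn].addr, frame);           /* read requested page from disc */
        pt[vpn].addr  = frame;                    /* adjust page table entries     */
        pt[vpn].dirty = 0;
        pt[vpn].valid = 1;
        /* return from the exception: the memory access is re-tried */
    }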

13 MMU - practicalities
- Page size 8 kbyte → k = 13
- p = 32 → p - k = 19
- So page table size is 2^19 ≈ 0.5 x 10^6 entries
- Each entry 4 bytes → 0.5 x 10^6 x 4 = 2 Mbytes!
- Page tables can take a lot of memory!
- Larger page sizes reduce page table size, but can waste space
  - access one byte → the whole page is needed!
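The same arithmetic as a quick check (a throwaway computation, not course code):

    #include <stdio.h>

    int main(void)
    {
        unsigned k = 13, p = 32;                 /* 8 kbyte pages, 32-bit addresses */
        unsigned long entries = 1ul << (p - k);  /* 2^19 = 524288 entries           */
        unsigned long bytes   = entries * 4;     /* 4 bytes per entry               */
        printf("%lu entries, %lu Mbytes\n", entries, bytes >> 20);  /* 524288, 2 */
        return 0;
    }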

14 MMU - practicalities
- Page tables are stored in main memory
  - they're too large to be in smaller memories!
- The MMU needs to read the page table for address translation
  → address translation can require additional memory accesses!
- What about a >32-bit address space?

15 MMU - Virtual Address Space Size
- Segments, etc
- Virtual address spaces > 2^32 bytes are common
  - Intel x86: 46-bit address space
  - PowerPC: 52-bit address space
  [diagram: PowerPC address formation]

16 MMU - Protection
- Page table entries carry access control bits:
  - Read
  - Write
  - Read/Write
  - Execute only
- Typical use:
  - Program ("text" in Unix): Execute only or Read only
  - Data: Read/Write
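A sketch of how an MMU might test these bits on each access (the flag values and names are assumptions, not a real architecture's encoding):

    #include <stdint.h>
    #include <stdbool.h>

    enum { PTE_R = 1u << 0, PTE_W = 1u << 1, PTE_X = 1u << 2 };

    /* Returns false (raise a protection fault) unless this access
       type is permitted by the page's access control bits. */
    bool access_ok(uint32_t pte_flags, bool is_write, bool is_fetch)
    {
        if (is_fetch) return pte_flags & PTE_X;  /* instruction fetch */
        if (is_write) return pte_flags & PTE_W;  /* store             */
        return pte_flags & PTE_R;                /* load              */
    }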

17 MMU - Alternative Page Table Styles
- Inverted Page Tables
  - One page table entry (PTE) per page of physical memory
  - MMU has to search for the correct VA entry
    - PowerPC hashes the VA to a PTE address: PTE address = h(VA), where h is a hash function
  - Hashing → collisions
- Hash functions in hardware
  - "hash" n bits to produce m bits, usually m < n
  - fewer bits reduces information content

18 MMU - Alternative Page Table Styles (continued)
- Hash functions in hardware
  - "hash" n bits to produce m bits, usually m < n
  - fewer bits reduces information content: there are only 2^m distinct values now!
  - some n-bit patterns will reduce to the same m-bit pattern
- Trivial example: 2 bits → 1 bit with xor, h(x1 x0) = x1 xor x0

      y    h(y)
      00   0
      01   1
      10   1
      11   0

  - Collisions: 00 and 11 both map to 0; 01 and 10 both map to 1

19 MMU - Alternative Page Table Styles (continued)
- With hashing, colliding PTEs are linked together
  - a PTE contains tags (like a cache) and link bits
  - the MMU searches the linked list to find the correct entry
- Trade-off: smaller page tables / longer searches
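A software model of the hashed, linked lookup (structure, table size and hash are illustrative assumptions, not the PowerPC's actual PTE format):

    #include <stdint.h>
    #include <stddef.h>

    typedef struct ipte {
        uint32_t     vpn_tag;  /* full virtual page number, like a cache tag */
        uint32_t     pfn;      /* physical frame number                      */
        struct ipte *next;     /* link to the next PTE that hashed here      */
    } ipte_t;

    extern ipte_t *hash_table[1 << 16];          /* h(VA) indexes this table */

    static unsigned h(uint32_t vpn) { return vpn & 0xFFFF; }   /* toy hash */

    /* Search the collision chain for the entry matching this VPN. */
    ipte_t *lookup(uint32_t vpn)
    {
        for (ipte_t *e = hash_table[h(vpn)]; e != NULL; e = e->next)
            if (e->vpn_tag == vpn)
                return e;
        return NULL;                             /* not mapped: page fault */
    }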

20 MMU - Sharing
- Can map virtual addresses from different user spaces to the same physical pages
  → sharing of data
- Commonly done for frequently used programs (Unix "sticky bit")
  - text editors, eg vi
  - compilers
  - any other program used by multiple users
→ more efficient memory use
→ less paging
→ better use of cache

21 Address Translation - Speeding it up
- Two or more memory accesses for each datum?
  - page table: 1 - 3 (single to 3-level tables)
  - actual data: 1
  → system running slower than an 8088?
- Solution: Translation Look-Aside Buffer (TLB or TLAB)
  - small cache of recently-used page table entries
  - usually fully-associative
  - can be quite small!
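A minimal model of a fully-associative TLB lookup (the entry count and field names are assumptions for illustration):

    #include <stdint.h>
    #include <stdbool.h>

    #define TLB_ENTRIES 64

    typedef struct {
        bool     valid;
        uint32_t vpn;    /* virtual page number (the tag) */
        uint32_t pfn;    /* cached physical frame number  */
    } tlb_entry_t;

    static tlb_entry_t tlb[TLB_ENTRIES];

    /* Fully associative: compare the VPN against every entry.
       In hardware all the comparisons happen in parallel. */
    bool tlb_lookup(uint32_t vpn, uint32_t *pfn_out)
    {
        for (int i = 0; i < TLB_ENTRIES; i++) {
            if (tlb[i].valid && tlb[i].vpn == vpn) {
                *pfn_out = tlb[i].pfn;
                return true;              /* TLB hit                     */
            }
        }
        return false;                     /* miss: walk the page tables */
    }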

22 Address Translation - Speeding it up
- TLB sizes:

      Processor              Year   Entries
      MIPS R4000             1992   48
      MIPS R10000            1996   64
      HP PA7100              1993   120
      Pentium 4 (Prescott)   2006   64

- One page table entry per page of data
- Locality of reference: programs spend a lot of time in the same memory region
  → TLB hit rates tend to be very high (98%)
  → compensates for the cost of a miss (many memory accesses - but for only 2% of references to memory!)

23 TLB - Coverage
- How much of the address space is "covered" by the TLB?
  - ie how large is the address space for which the TLB will ensure fast translations?
  - all other accesses must fetch PTEs from cache, main memory or (hopefully not too often!) disc
- Coverage = number of TLB entries x page size
  - eg 128-entry TLB, 8 kbyte pages: Coverage = 2^7 x 2^13 = 2^20 = 1 Mbyte
- Critical factor in program performance
  - working data set size > TLB coverage could be bad news!
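The coverage arithmetic as a quick check (illustrative throwaway code):

    #include <stdio.h>

    int main(void)
    {
        unsigned long entries   = 1ul << 7;    /* 128-entry TLB */
        unsigned long page_size = 1ul << 13;   /* 8 kbyte pages */
        /* coverage = entries x page size = 2^20 bytes = 1 Mbyte */
        printf("coverage = %lu bytes\n", entries * page_size);
        return 0;
    }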

24 TLB - Coverage
- Luckily, sequential access is fine!
- Example: a large (several Mbyte) array of doubles (8-byte floating point values)
  - 8 kbyte pages → 1024 doubles/page
- Sequential access, eg sum all values:

      for (j = 0; j < n; j++)
          sum = sum + x[j];

- First access to each page → TLB miss
  - but now the TLB is loaded with the entry for this page, so
- Next 1023 accesses → TLB hits
  → TLB hit rate = 1023/1024 ~ 99.9%

25 TLB - Avoid thrashing!
- An n x n float matrix accessed by columns
- For n ≥ 2048, what is the TLB hit rate?
  - each element is on a new page! (a row of 2048 4-byte floats fills one 8 kbyte page, so column accesses stride a whole page)
  - TLB hit rate = 0%
- General rule: access matrices by rows if possible
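To make the contrast concrete, a sketch of the two traversal orders (sizes assume the 8 kbyte pages used throughout these slides):

    #define N 2048
    float a[N][N];              /* each row is 2048 x 4 bytes = one 8 kbyte page */

    float sum_by_rows(void)     /* ~1 TLB miss per 2048 accesses */
    {
        float sum = 0.0f;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                sum += a[i][j]; /* consecutive addresses, same page */
        return sum;
    }

    float sum_by_columns(void)  /* every access touches a new page: TLB thrashes */
    {
        float sum = 0.0f;
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                sum += a[i][j]; /* stride of 8 kbytes between accesses */
        return sum;
    }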

26 Memory Hierarchy - Operation
  [diagram legend: "Search" = look up (standard page table) or search (inverted page table); "Hit" = page found in RAM]

