Efficient Virtual Memory for Big Memory Servers. U Wisc and HP Labs, ISCA’13. Architecture Reading Club, Summer'13.

1 Efficient Virtual Memory for Big Memory Servers (U Wisc and HP Labs, ISCA’13). Architecture Reading Club, Summer'13

2 Key points
- Big memory workloads: Memcached, databases, graph analysis.
- Analysis shows:
  - TLB misses can account for up to 51% of execution time.
  - The rich features of paged VM are not needed by most applications.
- Proposal: Direct Segments
  - Paged VM as usual where needed.
  - Segmentation where possible.
- For big memory workloads, this eliminates 99% of data TLB misses.

3 Main Memory Mgmt Trends
- Physical memory has grown from a few MBs to a few GBs, and now to several TBs.
- Over the same period, the size of the DTLB has remained fairly unchanged:
  - Pentium III: 72, Pentium 4: 64, Nehalem: 96, Ivy Bridge: 100
- Workloads were also better behaved in the past (higher locality).
- So higher memory capacity + constant TLB size + misbehaving apps = more TLB misses.
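To put the DTLB sizes above in perspective, here is a rough sketch of TLB reach (entries × page size). It assumes the listed figures are entry counts and that every entry maps one 4 KB base page. The name `tlb_reach_bytes` is made up for this illustration.

```python
# Rough TLB-reach arithmetic: reach = number of entries * page size.
# Assumption: each entry maps a single 4 KB base page.
PAGE_SIZE = 4 * 1024

# DTLB sizes as listed on the slide, read here as entry counts.
DTLB_ENTRIES = {
    "Pentium III": 72,
    "Pentium 4": 64,
    "Nehalem": 96,
    "Ivy Bridge": 100,
}

def tlb_reach_bytes(entries, page_size=PAGE_SIZE):
    """Memory covered by a TLB when every entry holds a distinct page."""
    return entries * page_size

for cpu, entries in DTLB_ENTRIES.items():
    print(f"{cpu}: {tlb_reach_bytes(entries) // 1024} KB")
```

Under these assumptions even the largest TLB listed covers only 400 KB, a vanishing fraction of a memory measured in GBs or TBs.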

4 So how bad is it really?

5 Main Features of Paged VM
Feature                 Analysis                               Verdict
Swapping                No swapping                            Not required
Per-page access perms   99% of pages are read-write            Overkill
Fragmentation mgmt.     Very little OS-visible fragmentation   Per-page reallocation is not important

6 Main Memory Allocation

7 Paged VM: why is it needed?
- Shared memory regions for inter-process communication.
- Code regions are protected by per-page R/W permissions.
- Copy-on-write uses per-page R/W permissions to implement fork lazily.
- Guard pages at the end of thread stacks.
[Figure: virtual address (VA) space layout, with labels: dynamically allocated heap region, paging valuable, paging not needed, code, constants, shared memory, mapped files, guard pages, stack.]

8 Direct Segments
- Hybrid paged + segmented memory (not one on top of the other).

9 Address Translation

10 OS Support: Handling Physical Memory
- Set up the Direct Segment registers:
  - BASE = start VA of the Direct Segment
  - LIMIT = end VA of the Direct Segment
  - OFFSET = BASE – start PA of the Direct Segment
  - Save and restore the register values as part of process metadata on context switch.
- Create a contiguous physical memory region:
  - Reserve at startup: big memory apps know their memory requirement at startup.
  - Memory compaction: its latency is insignificant for long-running jobs.
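A minimal sketch of how the register values could drive translation is shown below. It is an illustration, not the hardware design from the talk. The class name `DirectSegment`, the field `phys_start`, and the example addresses are hypothetical, and LIMIT is treated as an exclusive bound. To stay independent of the sign convention chosen for OFFSET, the sketch stores the physical start address and adds the distance from BASE.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class DirectSegment:
    base: int        # BASE: start VA of the direct segment
    limit: int       # LIMIT: end VA of the direct segment (exclusive in this sketch)
    phys_start: int  # start PA of the contiguous backing region

    def contains(self, va: int) -> bool:
        """True if va falls inside the direct segment."""
        return self.base <= va < self.limit

    def translate(self, va: int) -> Optional[int]:
        """Return the PA for va, or None if va must use ordinary paging."""
        if not self.contains(va):
            return None
        # Same distance from the segment start in virtual and physical space.
        return self.phys_start + (va - self.base)

# Hypothetical register values, chosen only for the example.
seg = DirectSegment(base=0x1000_0000, limit=0x5000_0000, phys_start=0x8000_0000)
print(hex(seg.translate(0x1000_2000)))  # address inside the segment
print(seg.translate(0x0040_0000))       # outside: falls back to paging
```

Because the whole segment is described by three values, saving and restoring it on a context switch amounts to copying those values in and out of the process metadata.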

11 OS Support: Handling Virtual Memory
- Primary regions
  - Abstraction presented to the application.
  - Contiguous virtual address space backed by a Direct Segment.
- What goes in the primary region
  - Dynamically allocated R/W memory.
  - The application can indicate what it needs to put in the primary region.
- The size of the primary region is set to a very high value so that it can accommodate all of physical memory if need be.
  - 64-bit VA supports 128TB of virtual memory, so VA space practically never runs out.

12 Evaluation Methodology
- Implement the Primary Region in the kernel.
- Find the number of TLB misses that would be served by the (non-existent) direct segments.
  - x86 uses a hardware page-table walker.
  - The authors trap all TLB misses by duping the system into believing that the PTE residing in memory is invalid.
  - In the handler, they:
    - touch the page with the faulting address;
    - mark the PTE invalid again.
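The counting step can be sketched as follows: given the addresses of trapped TLB misses, count how many fall inside the range a direct segment would cover. This is only an illustration of the bookkeeping, not the authors' kernel code. The function name, the trace, and the address range are invented for the example.

```python
def fraction_in_segment(miss_addresses, base, limit):
    """Fraction of TLB-miss addresses that lie in [base, limit)."""
    if not miss_addresses:
        return 0.0
    covered = sum(1 for va in miss_addresses if base <= va < limit)
    return covered / len(miss_addresses)

# Hypothetical trace of faulting virtual addresses.
trace = [0x1000_0000, 0x1000_4000, 0x2FFF_F000, 0x0040_1000]

# Hypothetical direct-segment range [0x1000_0000, 0x3000_0000).
print(fraction_in_segment(trace, 0x1000_0000, 0x3000_0000))  # 0.75
```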

13 Results

14 Results

15 Why not large pages?
- Huge pages do not automatically scale.
  - Scaling needs a new page size and/or more TLB entries.
- TLBs depend on access locality.
- Fixed, ISA-defined, sparse page sizes
  - e.g., 4KB, 2MB, 1GB
  - Pages need to be aligned at page-size boundaries.
- Multiple page sizes introduce TLB tradeoffs.
  - Fully associative vs. set-associative designs.
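The alignment constraint can be made concrete with a small sketch. For a memory region and a page size, it computes the part of the region that whole, aligned pages can cover, and how many pages a given length needs. The helper names and the example region are hypothetical.

```python
KB, MB, GB = 1 << 10, 1 << 20, 1 << 30

def aligned_span(start, length, page_size):
    """Largest sub-range of [start, start + length) made of whole aligned pages.

    Returns (first, last) with last exclusive, or None if no aligned page fits.
    """
    first = -(-start // page_size) * page_size          # round start up
    last = ((start + length) // page_size) * page_size  # round end down
    return (first, last) if last > first else None

def pages_needed(length, page_size):
    """Number of pages of page_size needed to cover length bytes."""
    return -(-length // page_size)

# Hypothetical 2 GB region that starts at 1.5 GB.
start, length = 1536 * MB, 2 * GB
print(aligned_span(start, length, GB))     # only the middle 1 GB is 1GB-aligned
print(pages_needed(length, 4 * KB))        # 4KB pages needed for the region
print(pages_needed(length, 2 * MB))        # 2MB pages needed for the region
```

In this example only half of the region can use a 1GB page. The two 512 MB ends must fall back to smaller pages, which is the kind of restriction a direct segment with arbitrary BASE and LIMIT avoids.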

16 Virtual Memory Basics
[Figure: core, cache, TLB (Translation Lookaside Buffer), Process 1, Process 2, virtual address space, physical memory, page table.]