© Imperial College London Composing Shared Memory Allocators Andy Cheadle Imperial College London.



Motivation
Increasing number of industrial applications driving bespoke database development
  – Pros: performance giving competitive advantage
  – Cons: development intensive, costly, high risk
Telecoms - Intelligent Network Softswitch
  – In-memory Subscriber Database
  – In-memory Call Context Database
  – Scheduled Internal Triggers
Fault-tolerant shared memory cluster services
  – Distributed cluster locks

Application Architecture
[Diagram: processes sharing the in-memory databases]
Database manager
  – Replicator
  – Garbage Collector
Database writer
  – 100ms timeslice
Service creation & Provisioning
Call Processing
  – 4 x 16 threads
Databases: Call Context, Scheduled Internal Triggers, Subscriber
=> Multiple readers / writers: serious potential for contention

Custom Allocators
Should we be writing them?
  – Can they compete?
  – Bugs
Production memory allocators:
  – Macros/monolithic functions to avoid overhead
  – Complicated
  – Not retargetable
  – Optimisations lead to obscure code
=> Hard to write, reuse and maintain

Custom Allocators
Huge design space to explore
  – Excellent survey of Wilson et al.
  – Fast allocation policies
  – Efficient freeing mechanisms
  – Minimise fragmentation
Develop within budget and on time!
  – If possible, use off-the-shelf software
  – Heap Layers!
=> For shared memory, we have no choice!

Contribution & Highlights
Shared memory heap layers infrastructure
  – cf. Heap layers infrastructure of Berger et al.
  – Composable, extensible and high-performance
  – Exploit ‘mixin’ composition of C++ templates
In-memory database engine infrastructure
  – Persistence
  – 32- / 64-bit address space segment manager
      Segment pinning, concurrent access, record locks
  – Shared memory ‘smart’ pointers
      Location independence
  – Tolerance to record corruption
      Checksumming, multi-mode garbage collector

C++ Classes
Fixed inheritance hierarchy
Single class inheritance instantiation
[Diagram: inheritance hierarchies built from classes A, B, C and D]

Heap Layer Mixin Classes
C++ mixins yield highly configurable, layered, collaboration-based software components
  – Not stand-alone classes: each implements a small functional ‘slice’
1. Derive a concrete composed class from a number of mixin classes using multiple inheritance and abstract subclasses
2. Parameterized inheritance:

    template <class FoundationHeap>
    class CustomHeap : public FoundationHeap {
      inline void *malloc( size_t size ) {
        …
        ptr = FoundationHeap::malloc( size’ );
        return f( ptr );
      }
    };
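A minimal sketch of parameterized inheritance, assuming two illustrative layers that do not appear in the slides: `ZeroingHeap` and `CountingHeap`, stacked on a hypothetical `MallocHeap` base that simply forwards to the C library. Each layer overrides only the calls it cares about and delegates the rest to the layer beneath it, with no virtual dispatch involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical bottom layer: obtains memory from the C library.
class MallocHeap {
public:
    void *malloc(std::size_t size) { return std::malloc(size); }
    void free(void *ptr) { std::free(ptr); }
};

// Mixin layer: zero-fills every block returned by the layer beneath it.
// free() is inherited unchanged from SuperHeap.
template <class SuperHeap>
class ZeroingHeap : public SuperHeap {
public:
    void *malloc(std::size_t size) {
        void *ptr = SuperHeap::malloc(size);
        if (ptr != nullptr) std::memset(ptr, 0, size);
        return ptr;
    }
};

// Mixin layer: counts the blocks that are currently allocated.
template <class SuperHeap>
class CountingHeap : public SuperHeap {
public:
    void *malloc(std::size_t size) {
        void *ptr = SuperHeap::malloc(size);
        if (ptr != nullptr) ++live_;
        return ptr;
    }
    void free(void *ptr) {
        if (ptr != nullptr) {
            assert(live_ > 0);
            --live_;
        }
        SuperHeap::free(ptr);
    }
    std::size_t liveAllocations() const { return live_; }

private:
    std::size_t live_ = 0;
};

// The concrete heap is chosen purely by nesting template arguments.
using ExampleHeap = CountingHeap<ZeroingHeap<MallocHeap>>;
```

Swapping the order of the two layers, or dropping one, only changes the `using` line; the layer classes themselves stay untouched.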

Naïve Shared Memory Allocator
[Diagram: layer stack, from the application interface down to the operating system]
malloc() / free()
SplittingHeap
CoalescingHeap
SegregatedHeap
FreeListHeap
BumpPointerHeap
SharedMemoryHeap
shmget(id, size, …) / shmctl(…, IPC_RMID, …)

Naïve Shared Memory Allocator

    StatisticsHeap<
      SplittingHeap<
        CoalescingHeap<
          SegregatedHeap<
            StatisticsHeap<
              FreeListHeap<
                BumpPointerHeap<
                  SharedMemoryHeap<
                    NullBlockHeap > > > > > > > >
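To make the bottom of such a stack concrete, here is a hedged sketch of a free-list layer sitting on a bump-pointer layer. Everything in it is an assumption made for illustration: `ArenaHeap` is a fixed in-process buffer standing in for a shared memory segment, the free list handles a single block size, and it links blocks with raw pointers, which a real shared memory allocator could not do because segments may be mapped at different addresses in each process.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical bottom layer: a fixed buffer standing in for a shared segment.
template <std::size_t ArenaSize>
class ArenaHeap {
public:
    char *arenaStart() { return arena_; }
    char *arenaEnd() { return arena_ + ArenaSize; }

private:
    alignas(std::max_align_t) char arena_[ArenaSize];
};

// Bump-pointer layer: hands out consecutive aligned chunks, never reclaims.
template <class SuperHeap>
class BumpPointerHeap : public SuperHeap {
public:
    void *malloc(std::size_t size) {
        const std::size_t align = alignof(std::max_align_t);
        const std::size_t rounded = (size + align - 1) / align * align;
        if (next_ == nullptr) next_ = SuperHeap::arenaStart();
        const std::size_t remaining =
            static_cast<std::size_t>(SuperHeap::arenaEnd() - next_);
        if (rounded > remaining) return nullptr;
        void *ptr = next_;
        next_ += rounded;
        return ptr;
    }
    void free(void *) {}  // individual blocks cannot be returned to a bump pointer

private:
    char *next_ = nullptr;
};

// Free-list layer: recycles freed blocks of one fixed size before asking
// the layer beneath it for fresh memory.
template <std::size_t BlockSize, class SuperHeap>
class FreeListHeap : public SuperHeap {
    struct FreeBlock { FreeBlock *next; };
    static_assert(BlockSize >= sizeof(FreeBlock), "block must hold a link");

public:
    void *malloc(std::size_t size) {
        if (size > BlockSize) return nullptr;
        if (head_ != nullptr) {
            FreeBlock *block = head_;
            head_ = block->next;
            return block;
        }
        return SuperHeap::malloc(BlockSize);
    }
    void free(void *ptr) {
        if (ptr == nullptr) return;
        head_ = new (ptr) FreeBlock{head_};  // push onto the free list
    }

private:
    FreeBlock *head_ = nullptr;
};

using ExampleHeap = FreeListHeap<64, BumpPointerHeap<ArenaHeap<1024>>>;
```

A freed block is reused by the next request that fits, so the bump pointer only advances when the free list is empty.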

Shared Memory Complications
Address space limitations
  – Unlimited number of database segments
Location independence
  – Arbitrary address mapping of segments
No virtual method dispatch
  – Location independence
Collaboration-based subclass method chaining is too restrictive
  – Nullary constructors
  – Fixed Heap Layer interface
  – Template parameters and callback functions

Segment Manager Design
[Diagram: a process address space (stack, C heap, mapped segments at arbitrary base addresses) alongside the DB Manager and Segment Manager. The Segment Manager holds segment info 0 … segment info n - 1; segments are marked locked, pinned or evicted and are tracked on an LRU queue.]
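The pinned / evicted / LRU queue labels suggest bookkeeping along the following lines. This is only a sketch under stated assumptions: the class name, the integer segment ids and the `maxMapped` limit are invented here, and the actual mapping and unmapping of segments is left out so that only the pin-count and eviction logic remains.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <unordered_map>

// Hypothetical bookkeeping for a bounded set of mapped segments.
class SegmentManager {
public:
    explicit SegmentManager(std::size_t maxMapped) : maxMapped_(maxMapped) {}

    // Map the segment if needed and pin it. Fails only when the table is
    // full and every mapped segment is pinned.
    bool pin(int segmentId) {
        auto it = mapped_.find(segmentId);
        if (it == mapped_.end()) {
            if (mapped_.size() >= maxMapped_ && !evictOne()) return false;
            lru_.push_front(segmentId);
            it = mapped_.emplace(segmentId, Entry{0, lru_.begin()}).first;
        } else {
            // Already mapped: move it to the most-recently-used position.
            lru_.splice(lru_.begin(), lru_, it->second.lruPos);
        }
        ++it->second.pinCount;
        return true;
    }

    void unpin(int segmentId) {
        auto it = mapped_.find(segmentId);
        assert(it != mapped_.end() && it->second.pinCount > 0);
        --it->second.pinCount;
    }

    bool isMapped(int segmentId) const { return mapped_.count(segmentId) != 0; }

private:
    struct Entry {
        int pinCount;
        std::list<int>::iterator lruPos;
    };

    // Evict the least recently used segment that nobody has pinned.
    bool evictOne() {
        for (auto rit = lru_.rbegin(); rit != lru_.rend(); ++rit) {
            auto it = mapped_.find(*rit);
            if (it->second.pinCount == 0) {
                lru_.erase(it->second.lruPos);
                mapped_.erase(it);
                return true;
            }
        }
        return false;
    }

    std::size_t maxMapped_;
    std::list<int> lru_;                      // front = most recently used
    std::unordered_map<int, Entry> mapped_;
};
```

Bounding the number of simultaneously mapped segments is one way a 32-bit process could work with more database segments than its address space can hold at once.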

Allocator Design
[Diagram: a control segment holding a lock, the global allocator, segment info and a most-free queue (labels: rank, isolated, pinned, deferred); data segments each holding a lock, a local allocator and free space.]

Memory Block Design
Knuth Boundary Tags:
[Diagram: block layout with the fields size, wasted, next, prev and data; part of the header is marked "freelist only"]
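As a rough illustration of why boundary tags help, the sketch below stamps a size tag at both ends of each block so that a freed block can find and merge with free neighbours in constant time. The layout is simplified and assumed: it keeps only the size and an allocated bit, and omits the wasted, next and prev fields named on the slide. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

using Tag = std::size_t;
constexpr Tag kAllocatedBit = 1;  // block sizes are even, so bit 0 is spare

inline Tag readTag(const unsigned char *at) {
    Tag tag;
    std::memcpy(&tag, at, sizeof tag);
    return tag;
}

inline void writeTag(unsigned char *at, Tag tag) {
    std::memcpy(at, &tag, sizeof tag);
}

// Stamp matching header and footer tags on a block of `size` total bytes.
inline void setBlock(unsigned char *block, std::size_t size, bool allocated) {
    assert(size % 2 == 0 && size >= 2 * sizeof(Tag));
    const Tag tag = size | (allocated ? kAllocatedBit : 0);
    writeTag(block, tag);
    writeTag(block + size - sizeof(Tag), tag);
}

inline std::size_t blockSize(const unsigned char *block) {
    return readTag(block) & ~kAllocatedBit;
}

inline bool isAllocated(const unsigned char *block) {
    return (readTag(block) & kAllocatedBit) != 0;
}

// The previous block's footer sits immediately before this block's header.
inline unsigned char *previousBlock(unsigned char *block) {
    const Tag footer = readTag(block - sizeof(Tag));
    return block - (footer & ~kAllocatedBit);
}

// Free a block and merge it with any free neighbour inside the heap bounds.
// Returns the start of the resulting free block.
inline unsigned char *freeAndCoalesce(unsigned char *block,
                                      unsigned char *heapStart,
                                      unsigned char *heapEnd) {
    std::size_t size = blockSize(block);
    unsigned char *next = block + size;
    if (next < heapEnd && !isAllocated(next)) size += blockSize(next);
    if (block > heapStart) {
        unsigned char *prev = previousBlock(block);
        if (!isAllocated(prev)) {
            size += blockSize(prev);
            block = prev;
        }
    }
    setBlock(block, size, false);
    return block;
}
```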

Pointers and Locks
Smart pointer classes
  – Scoped
  – Reference counter ‘pins’ segment in memory
  – Location independent:

    template<class T, int TagWidth, int OffsetWidth, int BlockSize,
             T *(*dereferenceFunction)
               (isnmTaggedHeapOffset<TagWidth, OffsetWidth, BlockSize>&)>
    class isnmTaggedHeapOffsetPtr {…};

    [Diagram: pointer word split into Tag and Offset fields]
Stealable, atomic, CPU yielding spinlocks
  – OS support, Atomic Ops library, Assembler
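A hedged sketch of the location-independent pointer idea: the pointer stores a (tag, offset) pair rather than an address, and a dereference function supplied as a template parameter turns it into a real address for the current process. The names `TaggedOffset`, `TaggedOffsetPtr`, `g_segmentBase` and `derefInt` are invented for this example; the `BlockSize` parameter and the reference-counted pinning mentioned on the slide are left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical packed value: the tag names a segment, the offset counts
// bytes from wherever that segment is mapped in this process.
template <int TagWidth, int OffsetWidth>
class TaggedOffset {
    static_assert(TagWidth > 0 && OffsetWidth > 0 &&
                      TagWidth + OffsetWidth <= 64,
                  "tag and offset must fit in one 64-bit word");

public:
    TaggedOffset(std::uint64_t tag, std::uint64_t offset)
        : bits_((tag << OffsetWidth) | (offset & kOffsetMask)) {
        assert(tag < (std::uint64_t{1} << TagWidth));
        assert(offset <= kOffsetMask);
    }
    std::uint64_t tag() const { return bits_ >> OffsetWidth; }
    std::uint64_t offset() const { return bits_ & kOffsetMask; }

private:
    static constexpr std::uint64_t kOffsetMask =
        (std::uint64_t{1} << OffsetWidth) - 1;
    std::uint64_t bits_;
};

// Smart pointer that resolves its target through a caller-supplied function,
// so no absolute address is ever stored in shared memory.
template <class T, int TagWidth, int OffsetWidth,
          T *(*Dereference)(const TaggedOffset<TagWidth, OffsetWidth> &)>
class TaggedOffsetPtr {
public:
    explicit TaggedOffsetPtr(TaggedOffset<TagWidth, OffsetWidth> where)
        : where_(where) {}
    T &operator*() const { return *Dereference(where_); }
    T *operator->() const { return Dereference(where_); }

private:
    TaggedOffset<TagWidth, OffsetWidth> where_;
};

// Hypothetical per-process table recording where each segment is mapped.
static unsigned char *g_segmentBase[4] = {nullptr, nullptr, nullptr, nullptr};

using Offset = TaggedOffset<2, 30>;

inline int *derefInt(const Offset &where) {
    return reinterpret_cast<int *>(g_segmentBase[where.tag()] + where.offset());
}

using IntPtr = TaggedOffsetPtr<int, 2, 30, derefInt>;
```

Because resolution goes through the per-process table, the same stored pointer value stays valid when a segment is remapped at a different base address.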

Shared Memory Heap Layer Interface

    class isnmNullMallocHeap : public isnmCommonNullHeap {
      inline void *malloc( isnm_size_t size ) {…}
      inline void *calloc( isnm_size_t nmemb, isnm_size_t size ) {…}
      inline void *realloc( void *ptr, isnm_size_t size ) {…}
      inline void free( void *ptr ) {…}
      inline void *malloc( isnm_size_t size, bool tryLock ) {…}
      inline void *calloc( isnm_size_t nmemb, isnm_size_t size, bool tryLock ) {…}
    };

Shared Memory Heap Layer Interface

    template <class MemBlock>
    class isnmNullBlockHeap : public isnmCommonNullHeap {
      inline MemBlock *mallocBlock( isnm_size_t size ) {…}
      inline MemBlock *callocBlock( isnm_size_t nmemb, isnm_size_t size ) {…}
      inline MemBlock *reallocBlock( MemBlock *ptr, isnm_size_t size ) {…}
      inline void freeBlock( MemBlock *ptr ) {…}
      inline MemBlock *mallocBlock( isnm_size_t size, bool tryLock ) {…}
      inline MemBlock *callocBlock( isnm_size_t nmemb, isnm_size_t size, bool tryLock ) {…}
    };

Shared Memory Heap Layer Interface

    class isnmCommonNullHeap {
      inline int lock( void ) {}
      inline int trylock( void ) {}
      inline int unlock( void ) {}
      inline void setHeapSize( isnm_size_t size ) {}
      inline isnm_size_t getHeapSize( void ) const {}
      inline void setHeapLimits( char **start, char **end) {}
      inline void getHeapLimits( char **start, char **end) {}
      inline void setFreeSpaceAvailable( isnm_size_t freeSpace ){}
      inline isnm_size_t getFreeSpaceAvailable( void ) {}
      inline isnm_size_t getNumFreeListBlocks( void ) {}
      inline void setNumAllocations( isnm_size_t allocations ) {}
      inline isnm_size_t getNumAllocations( void ) {}
      inline void setMinimumAllocated( isnm_size_t allocated ) {}
      inline isnm_size_t getMinimumAllocated( void ) {}
      inline isnm_size_t getAverageAllocated( void ) {}

Shared Memory Heap Layer Interface

    class isnmCommonNullHeap {
      inline void setMaximumAllocated( isnm_size_t allocd ) {}
      inline isnm_size_t getMaximumAllocated( void ) {}
      inline void setTotalAllocated( isnm_size_t allocd ) {}
      inline isnm_size_t getTotalAllocated( void ) {}
      inline void setTotalHeaderAllocated( isnm_size_t allocd ) {}
      inline isnm_size_t getTotalHeaderAllocated( void ) {}
      inline isnm_size_t getTotalUserSpaceAllocated( void ) {}
      inline void setTotalSizeWasted( isnm_size_t wasted ) {}
      inline isnm_size_t getTotalSizeWasted( void ) {}
      inline void setTotalAlignWasted( isnm_size_t wasted ) {}
      inline isnm_size_t getTotalAlignWasted( void ) {}
      inline isnm_size_t getTotalWasted( void ) {}
      inline void getHeapStatistics( isnmHeapStatistics *stats,
                                     isnm_ssize_t heapId,
                                     isnm_ssize_t subHeapId,
                                     int *numStatLayers,
                                     int *freeIndex ) {}

Shared Memory Heap Layer Interface

    class isnmCommonNullHeap {
      inline void getContiguousHeapStatistics(…) {}
      inline void setTotalGarbage( isnm_size_t garbage ) {}
      inline isnm_size_t getTotalGarbage( void ) {}
      inline void freeAll( void ) {}
      inline void setValidWord( isnm_size_t word ) {}
      inline isnm_size_t getValidWord( void ) {}
      inline void validate( void ) {}
      inline void invalidate( void ) {}
      inline bool verifyValid( void ) {}
      inline void setCheckSum( isnm_size_t checkSum ) {}
      inline isnm_size_t getCheckSum( void ) {}
      isnm_size_t calculateCheckSum( void ) {}
      inline bool verifyCheckSum( void ) {}
      inline void resetCheckSum( void ) {}
    };
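The checksum calls in this interface could be used roughly as follows to detect corrupted heap metadata. The slides do not say which checksum is computed or over which fields, so this sketch assumes a hypothetical `HeapControl` record and swaps in the well-known 64-bit FNV-1a hash purely for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical control record protected by a checksum.
struct HeapControl {
    std::uint64_t heapSize = 0;
    std::uint64_t freeSpace = 0;
    std::uint64_t numAllocations = 0;
    std::uint64_t checkSum = 0;
};

// Fold the eight bytes of one word into a running 64-bit FNV-1a hash.
inline std::uint64_t fnv1aWord(std::uint64_t hash, std::uint64_t word) {
    for (int i = 0; i < 8; ++i) {
        hash ^= (word >> (8 * i)) & 0xffu;
        hash *= 0x100000001b3ULL;  // FNV-1a 64-bit prime
    }
    return hash;
}

// Hash every field except the stored checksum itself.
inline std::uint64_t calculateCheckSum(const HeapControl &control) {
    std::uint64_t hash = 0xcbf29ce484222325ULL;  // FNV-1a 64-bit offset basis
    hash = fnv1aWord(hash, control.heapSize);
    hash = fnv1aWord(hash, control.freeSpace);
    hash = fnv1aWord(hash, control.numAllocations);
    return hash;
}

// Call after every legitimate update to the control record.
inline void resetCheckSum(HeapControl &control) {
    control.checkSum = calculateCheckSum(control);
}

// Call before trusting the record, for example after another process crashed.
inline bool verifyCheckSum(const HeapControl &control) {
    return control.checkSum == calculateCheckSum(control);
}
```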

Memory Block Interface

    class isnmMemBlock {
      explicit isnmMemBlock( isnm_size_t size = 0 ) {}
      inline void setSize( isnm_size_t size ) {}
      inline isnm_size_t getSize( void ) {}
      inline void setSizeWasted( isnm_size_t wasted ) {}
      inline isnm_size_t getSizeWasted( void ) {}
      inline static isnm_size_t getHeaderSize( void ) {}
      inline static isnm_size_t getHeaderFooterSize( void ) {}
      inline static isnm_size_t getFreeHeaderSize( void ) {}
      inline static isnm_size_t getFreeHeaderFooterSize( void ){}
      inline isnm_size_t getHeaderTag() {}
      inline isnm_size_t getFooterTag() {}
      inline bool isAllocated( void ) {}
      inline void allocate( isnm_size_t size ) {}
      inline void deallocate( isnm_size_t size ) {}
    };

Memory Block Implementation

    #ifdef IMDB_USE_TREE
    typedef isnmBoundaryMemBlock<
              isnmTreapMemBlock<isdbShmPtr,
                isnmSingleLinkedMemBlock<isdbShmPtr,
                  isnmWastedMemBlock<
                    isnmSizedMemBlock > > > > isdbMemBlock;
    #else
    typedef isnmBoundaryMemBlock<
              isnmDoubleLinkedMemBlock<isdbShmPtr,
                isnmWastedMemBlock<
                  isnmSizedMemBlock > > > isdbMemBlock;
    #endif

Allocator Implementation

    typedef isnmCheckSumLayer<BASE_ALLOCATOR_CHECKSUM_VALUE,
              isnmNullBlockHeap > isdbRootHeap;

    typedef isnmBumpPointerHeap<isdbShmPtr, isdbMemBlock,
              IMDB_SHM_PTR_ALIGN_BOUNDARY, false, 100,
              isdbRootHeap> isdbBumpPointerHeap;

    typedef isnmFirstFitFreeListHeap<isdbMemBlock,
              isnmFIFOOrderedMemBlockList<isdbMemBlock, isdbShmPtr,
                isnmDLMemBlockList >,
              SIMDB_SHM_PTR_ALIGN_BOUNDARY, false,
              isnmNullBlockHeap > isdbFreeListHeap;

Allocator Implementation

    typedef isnmSegregatedHeap<IMDB_NUM_SIZE_CLASSES,
              getSizeClassFromSize,
              getHeapLimitPercentageForSizeClass,
              isdbShmPtr, isdbMemBlock,
              isdbFreeListHeap, 100,
              isdbBumpPointerHeap> isdbSegregatedHeap;

    typedef isnmStatisticsHeap< isdbMemBlock,
              isnmBoundaryCoalescingHeap<isdbMemBlock,
                isnmSplittingHeap< isdbMemBlock,
                  IMDB_BLOCK_SIZE_THRESHOLD,
                  isdbSegregatedHeap> > > isdbShmBlockHeap;
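The segregated heap above takes a size-class callback as a template argument. A minimal sketch of that pattern follows, assuming power-of-two size classes and a much simpler parameter list than the real `isnmSegregatedHeap`; `sizeClassFromSize`, `CountingMallocHeap` and this `SegregatedHeap` are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

constexpr std::size_t kNumSizeClasses = 8;
constexpr std::size_t kMinBlockSize = 16;

// Class 0 holds requests up to 16 bytes, class 1 up to 32, and so on;
// the last class absorbs everything larger.
inline std::size_t sizeClassFromSize(std::size_t size) {
    std::size_t sizeClass = 0;
    std::size_t limit = kMinBlockSize;
    while (size > limit && sizeClass + 1 < kNumSizeClasses) {
        limit *= 2;
        ++sizeClass;
    }
    return sizeClass;
}

// Hypothetical per-class heap that records how often it is used.
class CountingMallocHeap {
public:
    void *malloc(std::size_t size) {
        ++allocations;
        return std::malloc(size);
    }
    void free(void *ptr) { std::free(ptr); }
    std::size_t allocations = 0;
};

// Dispatches each request to the sub-heap chosen by the callback, which is
// bound at compile time, so no virtual call is needed.
template <std::size_t NumClasses, std::size_t (*SizeClassOf)(std::size_t),
          class SubHeap>
class SegregatedHeap {
public:
    void *malloc(std::size_t size) {
        return heaps_[SizeClassOf(size)].malloc(size);
    }
    // The caller supplies the size so the block returns to its own sub-heap.
    void free(void *ptr, std::size_t size) {
        heaps_[SizeClassOf(size)].free(ptr);
    }
    SubHeap &subHeap(std::size_t sizeClass) { return heaps_[sizeClass]; }

private:
    SubHeap heaps_[NumClasses];
};
```

Passing the callback as a template parameter rather than a virtual method fits the earlier observation that virtual dispatch is unavailable for objects living in shared memory.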

Allocator Implementation

    typedef isnmLockedHeap<
              isnmMallocBlockInterfaceLayer<isdbMemBlock, isdbShmBlockHeap>,
              isdbLongLock> isdbHeapAllocator;

    typedef isnmSharedMemoryHeap<getSegmentManager,
              IMDB_MFSO_ORDER,
              isdbShmLocation,
              isdbHeapAllocator> isdbSharedMemoryHeap;
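A locking layer such as `isnmLockedHeap` can be sketched as one more mixin that brackets each call with a lock, including the non-blocking `tryLock` overload from the interface slides. The sketch assumes an in-process `std::atomic<bool>` as the lock word and hypothetical class names; a lock shared between processes would have to live in the shared segment, and the "stealable" recovery behaviour is not modelled.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <thread>

// Hypothetical spinlock that yields the CPU while it waits.
class YieldingSpinLock {
public:
    void lock() {
        while (locked_.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    bool tryLock() { return !locked_.exchange(true, std::memory_order_acquire); }
    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Hypothetical bottom layer used only to make the example complete.
class MallocHeap {
public:
    void *malloc(std::size_t size) { return std::malloc(size); }
    void free(void *ptr) { std::free(ptr); }
};

// Mixin layer: serialises access to the heap beneath it.
template <class SuperHeap, class Lock>
class LockedHeap : public SuperHeap {
public:
    void *malloc(std::size_t size) {
        lock_.lock();
        void *ptr = SuperHeap::malloc(size);
        lock_.unlock();
        return ptr;
    }
    // With tryLock set, return nullptr at once instead of waiting.
    void *malloc(std::size_t size, bool tryLock) {
        if (!tryLock) return malloc(size);
        if (!lock_.tryLock()) return nullptr;
        void *ptr = SuperHeap::malloc(size);
        lock_.unlock();
        return ptr;
    }
    void free(void *ptr) {
        lock_.lock();
        SuperHeap::free(ptr);
        lock_.unlock();
    }
    Lock &heapLock() { return lock_; }

private:
    Lock lock_;
};
```

Because the lock type is a template parameter, switching between a spinlock, a mutex or a semaphore only changes the typedef that composes the heap.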

Conclusion
Tuned allocator development is hard and time-consuming work!
Heap layer infrastructures vastly reduce this burden
Shared memory heap layers eliminate a significant portion of in-memory database engine design
I love using heap layers!

Further Work
Benchmarking and Performance
  – Segment manager
  – Freelist management: list or tree? Iterator type?
  – Garbage collector
Resilience to failure
  – Minimise critical sections
  – Enhanced failure detection
  – Failure recovery

References
Dynamic Storage Allocation: A Survey and Critical Review
  – International Workshop on Memory Management, 1995
  – Wilson, Johnstone, Neely, Boles
Composing High-Performance Memory Allocators
  – International Conference on Programming Language Design and Implementation, 2001
  – Berger, Zorn, McKinley

Shared Memory Heap Layers Library

    AtomicHeapLock.h            HeapPtr.h                     SegregatedHeap.h
    BestFitFreeListHeap.h       HeapStatisticsLayer.h         SemaphoreHeapLock.h
    BitmapHeap.h                IntegrousLayer.h              SharedMemoryHeap.h
    BlockHeap.h                 LockedHeap.h                  SharedMemoryManager.h
    BlockingHeapLock.h          MallocBlockInterfaceLayer.h   SharedMemorySegment.h
    BoundaryCoalescingHeap.h    MallocHeap.h                  SharedMemorySegmentManager.h
    BumpPointerHeap.h           MemBlock.h                    SingleFitFreeListHeap.h
    ChainedSlottedBlockHeap.h   MemBlockList.h                SingleInstanceHeap.h
    CheckSumLayer.h             MemBlockTree.h                SlottedBlockHeap.h
    ClassContainerHeap.h        MemoryManager.h               SlottedHeap.h
    CoalescingHeap.h            MemoryZeroingHeap.h           SpinningHeapLock.h
    FirstFitFreeListHeap.h      MutexHeapLock.h               SplittingHeap.h
    FreeListHeap.h              NextFitFreeListHeap.h         StatisticsHeap.h
    FreeTreeHeap.h              NullHeap.h                    StrictSegregatedHeap.h
    GCSupportLayer.h            PThreadOwnerHeapLock.h        ThreadOwnerHeapLock.h
    HeapLayers.h                RecursiveHeapLock.h           TypeVariateSegregatedHeap.h
    HeapLayersSupport.h         STLAllocator.h                WorstFitFreeListHeap.h