Composing Shared Memory Allocators
Andy Cheadle, Imperial College London
© Imperial College London

Motivation
Increasing number of industrial applications driving bespoke database development
  – Pros: performance giving competitive advantage
  – Cons: development intensive, costly, high risk
Telecoms: Intelligent Network Softswitch
  – In-memory Subscriber Database
  – In-memory Call Context Database
  – Scheduled Internal Triggers
Fault-tolerant shared memory cluster services
  – Distributed cluster locks

Application Architecture
[Diagram. Components and annotations:]
Database manager
  – Replicator
  – Garbage Collector
Database writer
  – 100ms timeslice
Service creation & Provisioning
Call Processing
  – 4 x 16 threads
Databases: Call Context, Scheduled Internal Triggers, Subscriber
→ Multiple readers / writers
  – Serious potential for contention

Custom Allocators
Should we be writing them?
  – Can they compete?
  – Bugs
Production memory allocators:
  – Macros/monolithic functions to avoid overhead
  – Complicated
  – Not retargetable
  – Optimisations lead to obscure code
→ Hard to write, reuse and maintain

Custom Allocators
Huge design space to explore
  – Excellent survey by Wilson et al.
  – Fast allocation policies
  – Efficient freeing mechanisms
  – Minimise fragmentation
Develop within budget and on time!
  – If possible, use off-the-shelf software
  – Heap Layers!
→ For shared memory, we have no choice!

Contribution & Highlights
Shared memory heap layers infrastructure
  – cf. Heap layers infrastructure of Berger et al.
  – Composable, extensible and high-performance
  – Exploit ‘mixin’ composition of C++ templates
In-memory database engine infrastructure
  – Persistence
  – 32- / 64-bit address space segment manager
      Segment pinning, concurrent access, record locks
  – Shared memory ‘smart’ pointers
      Location independence
  – Tolerance to record corruption
      Checksumming, multi-mode garbage collector

Heap Layer Mixin Classes
C++ mixins yield highly configurable, layered, collaboration-based software components
  – Not stand-alone classes: each implements a small functional ‘slice’
1. Derive a concrete composed class from a number of mixin classes using multiple inheritance and abstract subclasses
2. Parameterized inheritance:
    template <class FoundationHeap>
    class CustomHeap : public FoundationHeap {
    public:
      inline void *malloc( size_t size ) {
        …
        void *ptr = FoundationHeap::malloc( size’ );  // size’: the request size as adjusted by this layer
        return f( ptr );                              // f: this layer's post-processing of the block
      }
    };
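To make the parameterized-inheritance pattern concrete, here is a minimal sketch of two layers stacked over a foundation heap. The names MallocHeap, RoundUpHeap, CountingHeap and DemoHeap are hypothetical and chosen for illustration only; they are not taken from the library described in these slides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Foundation layer (hypothetical): obtains memory from the C heap.
class MallocHeap {
public:
    void *malloc(std::size_t size) { return std::malloc(size); }
    void free(void *ptr) { std::free(ptr); }
};

// Mixin layer: rounds each request up to a multiple of Alignment,
// then delegates to the layer it inherits from.
template <std::size_t Alignment, class SuperHeap>
class RoundUpHeap : public SuperHeap {
public:
    static std::size_t roundUp(std::size_t size) {
        return (size + Alignment - 1) / Alignment * Alignment;
    }
    void *malloc(std::size_t size) { return SuperHeap::malloc(roundUp(size)); }
};

// Mixin layer: counts live allocations around the calls it forwards.
template <class SuperHeap>
class CountingHeap : public SuperHeap {
public:
    void *malloc(std::size_t size) {
        void *ptr = SuperHeap::malloc(size);
        if (ptr != nullptr) ++live_;
        return ptr;
    }
    void free(void *ptr) {
        if (ptr != nullptr) --live_;
        SuperHeap::free(ptr);
    }
    std::size_t liveAllocations() const { return live_; }
private:
    std::size_t live_ = 0;
};

// The composed allocator is just a nested template instantiation.
using DemoHeap = CountingHeap<RoundUpHeap<16, MallocHeap>>;
```

Because every call is resolved statically through the base class, the compiler can inline the whole chain, and no virtual dispatch is involved.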

Naïve Shared Memory Allocator
[Diagram: stack of heap layers, annotated with shmget(id, size, …), shmctl(…, IPC_RMID, …), malloc() and free()]
SplittingHeap
CoalescingHeap
SegregatedHeap
FreeListHeap
BumpPointerHeap
SharedMemoryHeap

Naïve Shared Memory Allocator
StatisticsHeap<
  SplittingHeap<
    CoalescingHeap<
      SegregatedHeap<
        StatisticsHeap<
          FreeListHeap<
            BumpPointerHeap<
              SharedMemoryHeap< NullBlockHeap > > > > > > > >
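As an illustration of how two of these layers cooperate, the sketch below stacks a free-list layer over a bump-pointer layer. It makes simplifying assumptions that the slides do not state: a fixed in-process arena stands in for the shared memory segment, all blocks have one fixed size, and the class names (ArenaHeap, NaiveHeap) and members (kBlockSize, arenaStart, arenaEnd) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Bottom layer (assumption): a fixed arena standing in for a mapped segment.
template <std::size_t ArenaSize>
class ArenaHeap {
public:
    char *arenaStart() { return arena_; }
    char *arenaEnd() { return arena_ + ArenaSize; }
private:
    alignas(std::max_align_t) char arena_[ArenaSize];
};

// Bump-pointer layer: hands out fixed-size blocks sequentially.
template <std::size_t BlockSize, class SuperHeap>
class BumpPointerHeap : public SuperHeap {
    static_assert(BlockSize % alignof(std::max_align_t) == 0,
                  "blocks must preserve maximal alignment");
public:
    static constexpr std::size_t kBlockSize = BlockSize;
    void *malloc(std::size_t size) {
        if (size > BlockSize) return nullptr;
        if (next_ == nullptr) next_ = SuperHeap::arenaStart();
        if (static_cast<std::size_t>(SuperHeap::arenaEnd() - next_) < BlockSize)
            return nullptr;  // arena exhausted
        void *ptr = next_;
        next_ += BlockSize;
        return ptr;
    }
    void free(void *) {}  // a bump allocator cannot reuse memory by itself
private:
    char *next_ = nullptr;
};

// Free-list layer: recycles freed blocks before asking the layer below.
template <class SuperHeap>
class FreeListHeap : public SuperHeap {
public:
    void *malloc(std::size_t size) {
        if (size > SuperHeap::kBlockSize) return nullptr;
        if (head_ != nullptr) {
            Node *node = head_;
            head_ = node->next;
            return node;
        }
        return SuperHeap::malloc(size);
    }
    void free(void *ptr) {
        if (ptr == nullptr) return;
        head_ = new (ptr) Node{head_};  // thread the freed block onto the list
    }
private:
    struct Node { Node *next; };
    Node *head_ = nullptr;
};

using NaiveHeap = FreeListHeap<BumpPointerHeap<64, ArenaHeap<256>>>;
```

The free list stores raw pointers here, which is only valid within one process; the location-independence issue this raises for shared memory is the subject of the following slides.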

Shared Memory Complications
Address space limitations
  – Unlimited number of database segments
Location independence
  – Arbitrary address mapping of segments
No virtual method dispatch
  – Location independence: a vtable pointer stored in shared memory is not guaranteed to be valid in another process
Collaboration-based subclass method chaining is too restrictive
  – Nullary constructors
  – Fixed Heap Layer interface
  – Template parameters and callback functions

Segment Manager Design
[Diagram. Labels: stack; C heap; mapped segments …; DB Manager; Segment Manager; segment info 0 … segment info n - 1; arbitrary address base; locked; pinned; evicted; lru queue]
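The diagram labels suggest that segments are pinned while in use and that unpinned segments wait on an LRU queue until evicted. The sketch below shows one way such bookkeeping could work; it is an assumption about the design, not the author's implementation. The class SegmentCache and its methods are hypothetical, and real mapping and unmapping of segments is reduced to map insertions and erasures.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <unordered_map>

// Tracks which segments are "mapped", bounded by maxMapped.
// Pinned segments can never be evicted; unpinned ones queue in LRU order.
class SegmentCache {
public:
    explicit SegmentCache(std::size_t maxMapped) : maxMapped_(maxMapped) {}

    // Pins a segment, mapping it first if needed. Returns false when the
    // address-space budget is full and every mapped segment is pinned.
    bool pin(int segmentId) {
        auto it = mapped_.find(segmentId);
        if (it == mapped_.end()) {
            if (mapped_.size() >= maxMapped_ && !evictOne()) return false;
            mapped_[segmentId].pinCount = 1;  // map the segment and pin it
            return true;
        }
        if (it->second.pinCount == 0) lru_.erase(it->second.lruPos);
        ++it->second.pinCount;
        return true;
    }

    // Drops one pin; the last unpin makes the segment an eviction candidate.
    void unpin(int segmentId) {
        auto it = mapped_.find(segmentId);
        if (it == mapped_.end() || it->second.pinCount == 0) return;
        if (--it->second.pinCount == 0) {
            lru_.push_back(segmentId);
            it->second.lruPos = std::prev(lru_.end());
        }
    }

    bool isMapped(int segmentId) const { return mapped_.count(segmentId) != 0; }

private:
    struct Info {
        int pinCount = 0;
        std::list<int>::iterator lruPos;  // valid only while pinCount == 0
    };

    // Evicts the least recently unpinned segment, if there is one.
    bool evictOne() {
        if (lru_.empty()) return false;
        int victim = lru_.front();
        lru_.pop_front();
        mapped_.erase(victim);
        return true;
    }

    std::size_t maxMapped_;
    std::unordered_map<int, Info> mapped_;
    std::list<int> lru_;
};
```

A bounded cache of this kind is what lets a 32-bit process address more database segments than fit in its address space at once.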

Allocator Design
[Diagram. Labels: rank; isolated; pinned; deferred; most-free queue; control segment; data segments; lock; global allocator; local allocator; free space; segment info]

Pointers and Locks
Smart pointer classes
  – Scoped
  – Reference counter ‘pins’ segment in memory
  – Location independent:
    template<class T, int TagWidth, int OffsetWidth, int BlockSize,
             T *(*dereferenceFunction)(isnmTaggedHeapOffset<TagWidth, OffsetWidth, BlockSize>&)>
    class isnmTaggedHeapOffsetPtr {…};
    [Diagram: pointer word divided into Tag and Offset fields]
Stealable, atomic, CPU yielding spinlocks
  – OS support, Atomic Ops library, Assembler
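A location-independent pointer of this shape can be sketched as a packed word holding a segment tag and a block offset, resolved against a per-process table of segment base addresses. The sketch below is an assumption about how the Tag and Offset fields might be used; TaggedOffsetPtr and segmentBase are hypothetical names, and the per-process table replaces the dereferenceFunction callback of the original template.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical per-process table: where each segment tag is mapped locally.
inline char *&segmentBase(unsigned tag) {
    static char *bases[256] = {};
    return bases[tag];
}

// Stores (tag, offset) instead of an address, so the same bit pattern is
// meaningful in every process regardless of where the segment is mapped.
template <class T, int TagWidth, int OffsetWidth, int BlockSize>
class TaggedOffsetPtr {
    static_assert(TagWidth > 0 && TagWidth <= 8, "tag must index the 256-entry table");
    static_assert(OffsetWidth > 0 && TagWidth + OffsetWidth <= 64, "must fit in 64 bits");
    static_assert(BlockSize > 0, "offsets are counted in blocks");
public:
    // target must lie inside the segment at a multiple of BlockSize bytes.
    TaggedOffsetPtr(unsigned tag, T *target) {
        std::size_t byteOffset =
            static_cast<std::size_t>(reinterpret_cast<char *>(target) - segmentBase(tag));
        bits_ = (static_cast<std::uint64_t>(tag) << OffsetWidth) | (byteOffset / BlockSize);
    }
    unsigned tag() const { return static_cast<unsigned>(bits_ >> OffsetWidth); }
    std::uint64_t offset() const {
        return bits_ & ((static_cast<std::uint64_t>(1) << OffsetWidth) - 1);
    }
    // Resolves against the current process's mapping of the segment.
    T *get() const {
        return reinterpret_cast<T *>(segmentBase(tag()) + offset() * BlockSize);
    }
    T &operator*() const { return *get(); }
    T *operator->() const { return get(); }
private:
    std::uint64_t bits_;
};
```

Counting the offset in blocks rather than bytes lets a narrow offset field address a larger segment, at the cost of requiring block-aligned targets.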

Shared Memory Heap Layer Interface
class isnmNullMallocHeap : public isnmCommonNullHeap {
  inline void *malloc( isnm_size_t size ) {…}
  inline void *calloc( isnm_size_t nmemb, isnm_size_t size ) {…}
  inline void *realloc( void *ptr, isnm_size_t size ) {…}
  inline void free( void *ptr ) {…}
  inline void *malloc( isnm_size_t size, bool tryLock ) {…}
  inline void *calloc( isnm_size_t nmemb, isnm_size_t size, bool tryLock ) {…}
};

Shared Memory Heap Layer Interface (continued)
template <class MemBlock>
class isnmNullBlockHeap : public isnmCommonNullHeap {
  inline MemBlock *mallocBlock( isnm_size_t size ) {…}
  inline MemBlock *callocBlock( isnm_size_t nmemb, isnm_size_t size ) {…}
  inline MemBlock *reallocBlock( MemBlock *ptr, isnm_size_t size ) {…}
  inline void freeBlock( MemBlock *ptr ) {…}
  inline MemBlock *mallocBlock( isnm_size_t size, bool tryLock ) {…}
  inline MemBlock *callocBlock( isnm_size_t nmemb, isnm_size_t size, bool tryLock ) {…}
};

Shared Memory Heap Layer Interface (continued)
class isnmCommonNullHeap {
  inline int lock( void ) {}
  inline int trylock( void ) {}
  inline int unlock( void ) {}
  inline void setHeapSize( isnm_size_t size ) {}
  inline isnm_size_t getHeapSize( void ) const {}
  inline void setHeapLimits( char **start, char **end) {}
  inline void getHeapLimits( char **start, char **end) {}
  inline void setFreeSpaceAvailable( isnm_size_t freeSpace ) {}
  inline isnm_size_t getFreeSpaceAvailable( void ) {}
  inline isnm_size_t getNumFreeListBlocks( void ) {}
  inline void setNumAllocations( isnm_size_t allocations ) {}
  inline isnm_size_t getNumAllocations( void ) {}
  inline void setMinimumAllocated( isnm_size_t allocated ) {}
  inline isnm_size_t getMinimumAllocated( void ) {}
  inline isnm_size_t getAverageAllocated( void ) {}
  // (class continues on the next slide)

Shared Memory Heap Layer Interface (continued)
class isnmCommonNullHeap {
  inline void setMaximumAllocated( isnm_size_t allocd ) {}
  inline isnm_size_t getMaximumAllocated( void ) {}
  inline void setTotalAllocated( isnm_size_t allocd ) {}
  inline isnm_size_t getTotalAllocated( void ) {}
  inline void setTotalHeaderAllocated( isnm_size_t allocd ) {}
  inline isnm_size_t getTotalHeaderAllocated( void ) {}
  inline isnm_size_t getTotalUserSpaceAllocated( void ) {}
  inline void setTotalSizeWasted( isnm_size_t wasted ) {}
  inline isnm_size_t getTotalSizeWasted( void ) {}
  inline void setTotalAlignWasted( isnm_size_t wasted ) {}
  inline isnm_size_t getTotalAlignWasted( void ) {}
  inline isnm_size_t getTotalWasted( void ) {}
  inline void getHeapStatistics( isnmHeapStatistics *stats, isnm_ssize_t heapId,
                                 isnm_ssize_t subHeapId, int *numStatLayers, int *freeIndex ) {}
  // (class continues on the next slide)

Shared Memory Heap Layer Interface (continued)
class isnmCommonNullHeap {
  inline void getContiguousHeapStatistics(…) {}
  inline void setTotalGarbage( isnm_size_t garbage ) {}
  inline isnm_size_t getTotalGarbage( void ) {}
  inline void freeAll( void ) {}
  inline void setValidWord( isnm_size_t word ) {}
  inline isnm_size_t getValidWord( void ) {}
  inline void validate( void ) {}
  inline void invalidate( void ) {}
  inline bool verifyValid( void ) {}
  inline void setCheckSum( isnm_size_t checkSum ) {}
  inline isnm_size_t getCheckSum( void ) {}
  isnm_size_t calculateCheckSum( void ) {}
  inline bool verifyCheckSum( void ) {}
  inline void resetCheckSum( void ) {}
};
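The validity and checksum hooks in this interface suggest a layer that can detect a corrupted or half-initialised heap. The following is a minimal sketch of that idea under stated assumptions: the semantics of the valid word, the XOR-based checksum, and the names NullHeap, CountersHeap and CheckSumLayer are all hypothetical and are not taken from the library.

```cpp
#include <cassert>
#include <cstddef>

// Root of the chain: contributes nothing to the checksum.
struct NullHeap {
    std::size_t calculateCheckSum() const { return 0; }
};

// Hypothetical layer whose counters are folded into the checksum.
template <class SuperHeap>
class CountersHeap : public SuperHeap {
public:
    std::size_t totalAllocated = 0;
    std::size_t numAllocations = 0;
    std::size_t calculateCheckSum() const {
        return SuperHeap::calculateCheckSum()
             ^ (totalAllocated * 31u) ^ (numAllocations * 131u);
    }
};

// Stores a magic "valid word" plus a checksum over the layers below.
// (Assumption: a heap is trusted only if both match.)
template <std::size_t ValidWord, class SuperHeap>
class CheckSumLayer : public SuperHeap {
    static_assert(ValidWord != 0, "zero is reserved for the invalid state");
public:
    void validate() { validWord_ = ValidWord; }
    void invalidate() { validWord_ = 0; }
    bool verifyValid() const { return validWord_ == ValidWord; }
    void resetCheckSum() { checkSum_ = SuperHeap::calculateCheckSum(); }
    bool verifyCheckSum() const { return checkSum_ == SuperHeap::calculateCheckSum(); }
private:
    std::size_t validWord_ = 0;
    std::size_t checkSum_ = 0;
};
```

A process attaching to an existing segment could call verifyValid() and verifyCheckSum() before trusting the heap metadata, and hand the segment to a recovery path if either check fails.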

Memory Block Interface
class isnmMemBlock {
  explicit isnmMemBlock( isnm_size_t size = 0 ) {}
  inline void setSize( isnm_size_t size ) {}
  inline isnm_size_t getSize( void ) {}
  inline void setSizeWasted( isnm_size_t wasted ) {}
  inline isnm_size_t getSizeWasted( void ) {}
  inline static isnm_size_t getHeaderSize( void ) {}
  inline static isnm_size_t getHeaderFooterSize( void ) {}
  inline static isnm_size_t getFreeHeaderSize( void ) {}
  inline static isnm_size_t getFreeHeaderFooterSize( void ) {}
  inline isnm_size_t getHeaderTag() {}
  inline isnm_size_t getFooterTag() {}
  inline bool isAllocated( void ) {}
  inline void allocate( isnm_size_t size ) {}
  inline void deallocate( isnm_size_t size ) {}
};

Memory Block Implementation
#ifdef IMDB_USE_TREE
typedef isnmBoundaryMemBlock<
          isnmTreapMemBlock<isdbShmPtr,
            isnmSingleLinkedMemBlock<isdbShmPtr,
              isnmWastedMemBlock< isnmSizedMemBlock > > > > isdbMemBlock;
#else
typedef isnmBoundaryMemBlock<
          isnmDoubleLinkedMemBlock<isdbShmPtr,
            isnmWastedMemBlock< isnmSizedMemBlock > > > isdbMemBlock;
#endif
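The block header is itself composed from mixins, each adding one field and its accessors. The sketch below illustrates the idea with three simplified slices. It makes assumptions not stated in the slides: the allocated flag lives in the low bit of the size word (so sizes must be even), links are raw pointers rather than location-independent isdbShmPtr values, and the class names are shortened, hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstddef>

// Innermost slice: block size, with the low bit doubling as "allocated".
class SizedMemBlock {
public:
    std::size_t getSize() const { return bits_ & ~static_cast<std::size_t>(1); }
    bool isAllocated() const { return (bits_ & 1) != 0; }
    void allocate(std::size_t size) { bits_ = size | 1; }
    void deallocate(std::size_t size) { bits_ = size & ~static_cast<std::size_t>(1); }
private:
    std::size_t bits_ = 0;
};

// Adds bookkeeping for bytes lost to rounding inside the block.
template <class SuperBlock>
class WastedMemBlock : public SuperBlock {
public:
    std::size_t getSizeWasted() const { return wasted_; }
    void setSizeWasted(std::size_t wasted) { wasted_ = wasted; }
private:
    std::size_t wasted_ = 0;
};

// Adds a forward link so free blocks can be chained into a list.
template <class SuperBlock>
class SingleLinkedMemBlock : public SuperBlock {
public:
    SingleLinkedMemBlock *getNext() const { return next_; }
    void setNext(SingleLinkedMemBlock *next) { next_ = next; }
    static std::size_t getHeaderSize() { return sizeof(SingleLinkedMemBlock); }
private:
    SingleLinkedMemBlock *next_ = nullptr;
};

using DemoMemBlock = SingleLinkedMemBlock<WastedMemBlock<SizedMemBlock>>;
```

Swapping the list link for a tree link, as the IMDB_USE_TREE branch does, then changes the free-block index without touching the size or waste slices.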

Allocator Implementation
typedef isnmCheckSumLayer<BASE_ALLOCATOR_CHECKSUM_VALUE,
          isnmNullBlockHeap > isdbRootHeap;
typedef isnmBumpPointerHeap<isdbShmPtr, isdbMemBlock,
          IMDB_SHM_PTR_ALIGN_BOUNDARY, false, 100,
          isdbRootHeap> isdbBumpPointerHeap;
typedef isnmFirstFitFreeListHeap<isdbMemBlock,
          isnmFIFOOrderedMemBlockList<isdbMemBlock, isdbShmPtr, isnmDLMemBlockList >,
          IMDB_SHM_PTR_ALIGN_BOUNDARY, false,
          isnmNullBlockHeap > isdbFreeListHeap;

Allocator Implementation (continued)
typedef isnmSegregatedHeap<IMDB_NUM_SIZE_CLASSES,
          getSizeClassFromSize, getHeapLimitPercentageForSizeClass,
          isdbShmPtr, isdbMemBlock, isdbFreeListHeap, 100,
          isdbBumpPointerHeap> isdbSegregatedHeap;
typedef isnmStatisticsHeap< isdbMemBlock,
          isnmBoundaryCoalescingHeap<isdbMemBlock,
            isnmSplittingHeap< isdbMemBlock, IMDB_BLOCK_SIZE_THRESHOLD,
              isdbSegregatedHeap> > > isdbShmBlockHeap;
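The segregated heap is parameterised by a callback that maps a request size to a size class. The slides do not show that function, so the sketch below assumes power-of-two classes purely for illustration. sizeClassFromSize, kNumSizeClasses, kMinBlockSize, CountingMallocHeap and this simplified SegregatedHeap are hypothetical; unlike the original, free() here takes the size explicitly instead of reading it from the block header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

constexpr std::size_t kNumSizeClasses = 8;  // assumed class count
constexpr std::size_t kMinBlockSize = 16;   // assumed smallest class

// Class 0 holds requests up to 16 bytes, class 1 up to 32, and so on;
// the last class catches everything larger.
inline std::size_t sizeClassFromSize(std::size_t size) {
    std::size_t cls = 0;
    std::size_t limit = kMinBlockSize;
    while (limit < size && cls + 1 < kNumSizeClasses) {
        limit *= 2;
        ++cls;
    }
    return cls;
}

// Stand-in sub-heap that records how many blocks it currently owns.
struct CountingMallocHeap {
    std::size_t live = 0;
    void *malloc(std::size_t size) { ++live; return std::malloc(size); }
    void free(void *ptr) { --live; std::free(ptr); }
};

// Routes each request to the sub-heap that owns its size class.
template <std::size_t NumClasses, std::size_t (*ClassOf)(std::size_t), class SubHeap>
class SegregatedHeap {
public:
    void *malloc(std::size_t size) { return heaps_[ClassOf(size)].malloc(size); }
    void free(void *ptr, std::size_t size) { heaps_[ClassOf(size)].free(ptr); }
    SubHeap &subHeap(std::size_t cls) { return heaps_[cls]; }
private:
    SubHeap heaps_[NumClasses];
};
```

Passing the classifier as a template argument keeps the policy swappable while still letting the compiler inline it, which matches the "template parameters and callback functions" approach named earlier.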

Allocator Implementation (continued)
typedef isnmLockedHeap<
          isnmMallocBlockInterfaceLayer<isdbMemBlock, isdbShmBlockHeap>,
          isdbLongLock> isdbHeapAllocator;
typedef isnmSharedMemoryHeap<getSegmentManager, IMDB_MFSO_ORDER,
          isdbShmLocation, isdbHeapAllocator> isdbSharedMemoryHeap;
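A locking layer wraps every call to the heap below it in a lock, and the tryLock overloads seen in the interface suggest a non-blocking variant. The sketch below assumes that tryLock means "fail fast instead of waiting", which the slides do not confirm. YieldingSpinLock, MallocHeap and this LockedHeap are hypothetical illustrations, and the spinlock is neither stealable nor placed in shared memory.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <thread>

// Spinlock that yields the CPU while waiting instead of busy-looping.
class YieldingSpinLock {
public:
    void lock() {
        while (flag_.test_and_set(std::memory_order_acquire))
            std::this_thread::yield();
    }
    bool try_lock() { return !flag_.test_and_set(std::memory_order_acquire); }
    void unlock() { flag_.clear(std::memory_order_release); }
private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Stand-in for the unlocked allocator stack.
struct MallocHeap {
    void *malloc(std::size_t size) { return std::malloc(size); }
    void free(void *ptr) { std::free(ptr); }
};

// Serialises access to the heap it inherits from.
template <class SuperHeap, class Lock>
class LockedHeap : public SuperHeap {
public:
    void *malloc(std::size_t size) {
        std::lock_guard<Lock> guard(lock_);
        return SuperHeap::malloc(size);
    }
    // Assumed semantics: with tryLock set, return nullptr when contended.
    void *malloc(std::size_t size, bool tryLock) {
        if (!tryLock) return malloc(size);
        if (!lock_.try_lock()) return nullptr;
        void *ptr = SuperHeap::malloc(size);
        lock_.unlock();
        return ptr;
    }
    void free(void *ptr) {
        std::lock_guard<Lock> guard(lock_);
        SuperHeap::free(ptr);
    }
    Lock &heapLock() { return lock_; }
private:
    Lock lock_;
};
```

Placing the lock as the outermost allocator layer keeps the layers beneath it free of synchronisation code, so the same stack can be reused unlocked where only one writer exists.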

Conclusion
Tuned allocator development is hard and time-consuming work!
Heap layer infrastructures vastly reduce this burden
Shared memory heap layers eliminate a significant portion of in-memory database engine design
I love using heap layers!

Further Work
Benchmarking and Performance
  – Segment manager
  – Freelist management: list or tree? Iterator type?
  – Garbage collector
Resilience to failure
  – Minimise critical sections
  – Enhanced failure detection
  – Failure recovery

References
Wilson, Johnstone, Neely, Boles. Dynamic Storage Allocation: A Survey and Critical Review. International Workshop on Memory Management, 1995.
Berger, Zorn, McKinley. Composing High-Performance Memory Allocators. International Conference on Programming Language Design and Implementation, 2001.

Shared Memory Heap Layers Library
AtomicHeapLock.h            HeapPtr.h                     SegregatedHeap.h
BestFitFreeListHeap.h       HeapStatisticsLayer.h         SemaphoreHeapLock.h
BitmapHeap.h                IntegrousLayer.h              SharedMemoryHeap.h
BlockHeap.h                 LockedHeap.h                  SharedMemoryManager.h
BlockingHeapLock.h          MallocBlockInterfaceLayer.h   SharedMemorySegment.h
BoundaryCoalescingHeap.h    MallocHeap.h                  SharedMemorySegmentManager.h
BumpPointerHeap.h           MemBlock.h                    SingleFitFreeListHeap.h
ChainedSlottedBlockHeap.h   MemBlockList.h                SingleInstanceHeap.h
CheckSumLayer.h             MemBlockTree.h                SlottedBlockHeap.h
ClassContainerHeap.h        MemoryManager.h               SlottedHeap.h
CoalescingHeap.h            MemoryZeroingHeap.h           SpinningHeapLock.h
FirstFitFreeListHeap.h      MutexHeapLock.h               SplittingHeap.h
FreeListHeap.h              NextFitFreeListHeap.h         StatisticsHeap.h
FreeTreeHeap.h              NullHeap.h                    StrictSegregatedHeap.h
GCSupportLayer.h            PThreadOwnerHeapLock.h        ThreadOwnerHeapLock.h
HeapLayers.h                RecursiveHeapLock.h           TypeVariateSegregatedHeap.h
HeapLayersSupport.h         STLAllocator.h                WorstFitFreeListHeap.h
