1 Dynamic memory allocation and fragmentation
Seminar on Network and Operating Systems Group II

2 Schedule
Today (Monday): general memory allocation mechanisms; the buddy system.
Thursday: general object caching; slabs.

3 What is an allocator and what must it do?

4 Memory Allocator
Keeps track of memory in use and of free memory. Must be fast and waste little memory. Services the memory requests it receives. Tries to prevent the formation of memory "holes".
"For any possible allocation algorithm, there will always be a program behavior that forces it into severe fragmentation."
This is the general concept of what a memory allocator does, or at least what it is supposed to do. The allocator decides where in memory to place each block. It has access to all of the machine's memory and can request more memory from the operating system if necessary (see the sketch below).
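As a rough illustration (not from the slides), the external shape of such an allocator can be sketched in C. The names my_malloc/my_free and the use of sbrk to grow the heap are assumptions of this sketch, not part of the presentation:

    #include <stddef.h>
    #include <stdint.h>
    #include <unistd.h>   /* sbrk: POSIX, assumed available */

    /* Hypothetical allocator interface: the allocator tracks used and
     * free memory internally and chooses where each block is placed. */
    void *my_malloc(size_t size);   /* service a request */
    void  my_free(void *ptr);       /* return a block to the free pool */

    /* When no suitable free block exists, more memory is requested
     * from the operating system, e.g. via sbrk (or mmap). */
    static void *request_from_os(size_t bytes) {
        void *p = sbrk((intptr_t)bytes);
        return (p == (void *)-1) ? NULL : p;
    }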

5 The three levels of an allocator
Strategy: tries to find regularities in the incoming memory requests.
Policy: decides where and how to place blocks in memory (selected by the strategy).
Mechanism: the algorithms that implement the policy.
All basic memory allocators share this three-level design, which helps them do their job efficiently: the mechanism implements a placement policy, which in turn is motivated by a strategy for minimizing fragmentation.

6 Policy techniques
Policies rely mainly on two techniques to satisfy incoming requests: splitting and coalescing. Large blocks are split to satisfy small requests, and small adjacent blocks are coalesced to satisfy larger requests (a sketch of both follows).
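A minimal sketch of the two techniques in C, assuming a simple free-block header with the size stored inclusive of the header; the struct layout is illustrative, not that of any particular allocator:

    #include <stddef.h>

    /* Illustrative block header: total size (including the header)
     * and a flag saying whether the block is in use. */
    struct block {
        size_t size;
        int    in_use;
    };

    /* Splitting: carve a small request out of a large free block.
     * 'need' is assumed to include the header. The remainder
     * becomes a new, smaller free block. */
    static struct block *split(struct block *b, size_t need) {
        struct block *rest = (struct block *)((char *)b + need);
        rest->size   = b->size - need;
        rest->in_use = 0;
        b->size      = need;
        return rest;            /* the leftover free block */
    }

    /* Coalescing: merge a free block with the physically adjacent
     * block that follows it, when both are free. (Assumes the next
     * block lies within the managed heap.) */
    static void coalesce(struct block *b) {
        struct block *next = (struct block *)((char *)b + b->size);
        if (!b->in_use && !next->in_use)
            b->size += next->size;
    }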

7 Fragmentation: why is it a problem?

8 Fragmentation
Fragmentation is the inability to reuse memory that is free.
External fragmentation occurs when enough free memory is available in total but it is not contiguous: the heap contains many small holes.
Internal fragmentation arises when the allocated block is large enough for the request but bigger than what was asked for. Blocks are usually split to limit internal fragmentation.
Most programs need memory that can expand dynamically during execution. Internal fragmentation is often accepted in order to prevent external fragmentation, which is considered the more serious problem.
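For instance, if an allocator only hands out power-of-two block sizes, the internal fragmentation of a request is simply the rounding waste. A small illustration (the power-of-two policy here is just an example, not the slides' allocator):

    #include <stdio.h>

    /* Round a request up to the next power of two. */
    static size_t round_pow2(size_t n) {
        size_t s = 1;
        while (s < n) s <<= 1;
        return s;
    }

    int main(void) {
        size_t request = 100;
        size_t block   = round_pow2(request);   /* 128 */
        /* 28 bytes of the block are wasted: internal fragmentation. */
        printf("internal fragmentation: %zu bytes\n", block - request);
        return 0;
    }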

9 What causes fragmentation?
Isolated deaths: adjacent objects do not die at the same time.
Time-varying program behavior: memory requests change unexpectedly.
These are the most common causes of fragmentation. An allocator that can predict the deaths of objects can exploit that information to reduce fragmentation; a program might, for example, free many small blocks and then request large ones. Where possible, the allocator should try to exploit such patterns, or at least not let them undermine its strategy.

10 Why traditional approaches don't work
Program behavior is not predictable in general, and the ability to reuse memory depends on the future interaction between the program and the allocator: not just on the number and sizes of holes, but on the future requests of the program and the future responses of the allocator itself.
Suppose free memory consists of 100 blocks of size 10 and 200 blocks of size 20. Is that memory fragmented? If all future requests are of size 10, no. If all future requests are of size 30, yes, and that is a problem. Even a request sequence for exactly 100 blocks of size 10 and 200 blocks of size 20 depends on the order in which the requests arrive (best fit would handle that case). Everything depends on the moment-by-moment decisions of where to place blocks, which is why good placement policies are important.

11 How do we avoid fragmentation?
A single death is a tragedy. A million deaths is a statistic. -Joseph Stalin

12 Understanding program behavior
Real programs do not generally behave randomly, which is why no single allocator policy works for all cases. The most common behavioral patterns are:
Ramps: data structures that are accumulated slowly over the whole execution of the program.
Peaks: memory used in bursty patterns, usually while building up temporary data structures that are discarded before the end of execution.
Plateaus: data structures that are built quickly and then used for long periods of time.

13 Memory usage in the GNU C Compiler
[Figure: KBytes in use vs. allocation time in megabytes.]
Memory usage for the GNU C compiler compiling the largest file of its own source code (combine.c). The peak behavior is clearly visible.

14 Memory usage in the Grobner program
[Figure: KBytes in use vs. allocation time in megabytes.]
Memory usage for the Grobner program, which does a lot of data collection and rewriting, resulting in a typical ramp pattern (including some plateaus). A ramp or plateau profile has a very convenient property: if short-term fragmentation can be avoided, long-term fragmentation is not a problem, because nothing is freed until the end of the program, so reuse is not an issue.

15 Memory usage in the Espresso PLA Optimizer
[Figure: KBytes in use vs. allocation time in megabytes.]
Memory usage for a run of Espresso, an optimizer for programmable logic array designs. It shows that typical program behavior is hard, if not impossible, to predict.

16 Mechanisms
The most common mechanisms used:
Sequential fits
Segregated free lists
Buddy system
Bitmap fits
Index fits

17 Sequential fits
Based on a single linear list that stores all free memory blocks, usually circularly or doubly linked; this is the most commonly used family of mechanisms. Most implementations use the boundary tag technique: a tag added to each block records the block size and various flags, and free blocks also carry pointers to their neighbors on the list, allowing easy traversal. Boundary tags and a doubly linked list together make coalescing simple and very fast (a sketch of the block layout follows). Sequential fits do not scale well in terms of time cost: as the number of free blocks grows, the time to search the list may become excessively long.
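A sketch of a boundary-tagged free block in C, following the slide's description; exact layouts vary between real allocators, and the field names here are illustrative:

    #include <stddef.h>

    /* Boundary tag carried by every block: size and an in-use flag.
     * In many designs the tag is duplicated at the start (header)
     * and end (footer) of the block so both physical neighbors can
     * find it. */
    struct tag {
        size_t size;
        int    in_use;
    };

    /* A free block on the doubly linked free list. The list
     * pointers live inside the free block's unused payload. */
    struct free_block {
        struct tag         tag;
        struct free_block *prev;   /* previous free block on the list */
        struct free_block *next;   /* next free block on the list */
    };

    /* With a tag at the end of each block, the block physically
     * before 'b' can be inspected via its footer, which is what
     * makes coalescing with the left neighbor O(1). */
    static struct tag *left_footer(void *b) {
        return (struct tag *)((char *)b - sizeof(struct tag));
    }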

18 Sequential fits
Best fit, first fit, worst fit.
Next fit: uses a roving pointer for allocation.
Optimal fit: "samples" the list first to find a good enough fit.
Half fit: splits blocks of twice the requested size.
General policy rules for sequential fits: decide how the list should be ordered, and define a splitting threshold. A first-fit search is sketched below.
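To make the placement policies concrete, here is a first-fit sketch over a singly linked free list; best fit would scan the whole list and keep the smallest block that still fits, and next fit would resume from a roving pointer instead of the head. The node type is an assumption of the sketch:

    #include <stddef.h>

    struct fnode {                  /* free-list node: size + link */
        size_t        size;
        struct fnode *next;
    };

    /* First fit: return the first free block big enough for the
     * request, or NULL if none fits (the allocator would then grow
     * the heap or fail). */
    static struct fnode *first_fit(struct fnode *head, size_t need) {
        for (struct fnode *b = head; b != NULL; b = b->next)
            if (b->size >= need)
                return b;
        return NULL;
    }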

19 Segregated free lists
Use an array of lists, where each list holds free blocks of a particular size. Size classes are used for indexing, usually sizes that are powers of two (2 4 8 16 32 64 128 ...), and requested sizes are rounded up to the nearest available size. Segregated free lists add another dimension to sequential fits, making searching faster: each list is indexed by a size class, which groups blocks of similar size together (an indexing sketch follows).
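A small sketch of the power-of-two size-class indexing, assuming the class series 2, 4, 8, ... shown on the slide:

    #include <stddef.h>

    /* Map a requested size to the index of its power-of-two size
     * class: sizes 1..2 -> class 0 (2 bytes), 3..4 -> class 1
     * (4 bytes), 5..8 -> class 2 (8 bytes), and so on. The index
     * selects one of the free lists in the array. */
    static unsigned size_class(size_t n) {
        unsigned idx = 0;
        size_t   cap = 2;           /* smallest class in the series */
        while (cap < n) {
            cap <<= 1;              /* round up to the next class */
            idx++;
        }
        return idx;
    }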

20 Segregated free lists
Simple segregated storage: no splitting of free blocks; each list allocates objects of a single size, or a small range of sizes. This makes allocation fast but is subject to severe external fragmentation, since unused parts of blocks cannot be reused for other object sizes.
Segregated fit: splits larger blocks if there is no free block in the appropriate free list, using first fit or next fit to find a free block. Three variants exist:
Exact lists: a separate list for each possible block size.
Strict size classes with rounding: uses size classes (for example, powers of two); requested sizes are rounded up to match one of the sizes in the series, and when splitting, the resulting block sizes must themselves be present in the series.
Size classes with range lists: works like the previous variant, except each list may contain blocks of slightly different sizes.

21 Buddy system
A special case of segregated fit, using size classes with rounding, that supports limited but efficient splitting and coalescing. There is a separate free list for each allowable size, and block addresses are simple to compute (a sketch follows). When a large block is split in two, each part becomes the unique buddy of the other: a free block can only be merged with its unique buddy, and only when both buddies are entirely free can they be merged into the larger block.
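The "simple block address computation" is typically an exclusive-or: for binary buddies, a block's buddy lies at the block's own offset XORed with its size. A sketch, assuming offsets are measured from the start of the managed heap (an assumption of this illustration):

    #include <stddef.h>
    #include <stdint.h>

    /* If a block of size 'size' (a power of two) starts at offset
     * 'off' from the heap base, its unique buddy starts at
     * off XOR size: the two blocks are the two halves of the same
     * parent block, so they differ only in that one bit. */
    static uintptr_t buddy_of(uintptr_t off, size_t size) {
        return off ^ (uintptr_t)size;
    }

    /* On free, the allocator would merge only if the buddy is also
     * entirely free, then repeat one level up:
     *   if (is_free(buddy_of(off, size))) merge into 2*size block */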

22 Buddy system
[Figure: a 16 MB heap; a 3 MB request arrives; size classes 16 MB, 8 MB, 4 MB.]
23 Buddy system
[Figure: the 16 MB block is split into two free 8 MB buddies.]
24 Buddy system
[Figure: one 8 MB buddy is split into two free 4 MB buddies.]
25 Buddy system
[Figure: one 4 MB buddy is allocated to satisfy the 3 MB request; the other 4 MB buddy and the remaining 8 MB block stay free.]
Each level in the tree is a separate free list for a single size class. When an allocated block is freed, the allocator checks the status of its buddy (free or allocated); if and only if the buddy is also free can the two be merged into a larger block, and so on up the tree.

26 Binary buddies
The simplest implementation of the buddy system: all buddy sizes are powers of two, and each block is divided into two equal parts. Internal fragmentation is very high, expected around 28% and in practice usually higher, because every object size must be rounded up to the nearest size class and the gaps between classes are fairly large. Equal block sizes make address computation simple, because all buddies are aligned on a power-of-two boundary. Systems based on closer size classes may be similarly efficient if lookup tables are used to perform the size-class mappings.

27 Fibonacci buddies
Size classes are based on the Fibonacci series, giving a more closely spaced set of size classes and reducing internal fragmentation. Blocks can only be split into sizes that are also in the series, and a Fibonacci number can only be split into two unequal parts. This can be a disadvantage when allocating many equal-sized blocks.

28 Fibonacci buddies
Size series: 2 3 5 8 13 21 34 55 ...
Splitting blocks: a block of size 13 splits into blocks of 5 and 8; a block of size 21 splits into blocks of 8 and 13 (a sketch follows).
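A sketch of the splitting rule: a block whose size is the i-th Fibonacci class splits into the two preceding sizes in the series. The array and indices are illustrative:

    #include <stddef.h>
    #include <stdio.h>

    /* Size series from the slide: 2 3 5 8 13 21 34 55 ... */
    static const size_t fib[] = {2, 3, 5, 8, 13, 21, 34, 55};

    /* A block of size fib[i] (i >= 2) splits into two unequal
     * buddies of sizes fib[i-1] and fib[i-2]; e.g. 13 -> 8 + 5. */
    static void split_sizes(int i, size_t *lo, size_t *hi) {
        *hi = fib[i - 1];
        *lo = fib[i - 2];
    }

    int main(void) {
        size_t lo, hi;
        split_sizes(4, &lo, &hi);                        /* size 13 */
        printf("13 splits into %zu and %zu\n", lo, hi);  /* 5 and 8 */
        return 0;
    }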

29 Weighted buddies
Size classes are the powers of two, with an additional class of three times a power of two between each pair. This gives even closer size classes, reducing internal fragmentation further. There are two different splitting rules: 2^x blocks can only be split in half, while 2^x*3 blocks can be split in half or unevenly into two sizes.

30 Weighted buddies
Size series: ... 2 (2^1), 3 (2^0*3), 4 (2^2), 6 (2^1*3), 8 (2^3), 12 (2^2*3), 16 (2^4), 24 (2^3*3) ...
Splitting of 2^x*3 numbers: 6 splits into 3 and 3, or into 2 and 4.
2^x numbers can only be split in half. 2^x*3 numbers can be split in half or unevenly into two parts, leaving one block at 1/3 of the original size and the other at 2/3.
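A sketch of the uneven rule in C, classifying sizes and computing the 1/3 and 2/3 parts; free-list bookkeeping is omitted and the function names are assumptions:

    #include <stddef.h>

    /* Returns 1 if n is a power of two. */
    static int is_pow2(size_t n) { return n && (n & (n - 1)) == 0; }

    /* Legal splits in the weighted buddy system:
     *   2^x    -> 2^(x-1) + 2^(x-1)        (halves only)
     *   2^x*3  -> 2^(x-1)*3 + 2^(x-1)*3    (halves), or
     *          -> 2^x + 2^(x+1)            (1/3 and 2/3 of block)
     * e.g. 6 -> 3 + 3, or 6 -> 2 + 4. */
    static int uneven_split(size_t n, size_t *a, size_t *b) {
        if (is_pow2(n) || n % 3 != 0 || !is_pow2(n / 3))
            return 0;           /* only 2^x*3 blocks split unevenly */
        *a = n / 3;             /* 2^x     */
        *b = n - n / 3;         /* 2^(x+1) */
        return 1;
    }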

31 Double buddies
The last variation of the buddy system we discuss. Unlike the other implementations it uses two distinct binary buddy series, each with its own size classes: one list uses power-of-two sizes, the other uses power-of-two spacing offset by a factor x. The splitting rules: blocks can only be split in half, and split blocks stay in the same series.

32 Double buddies
Size series: ... 2 (2^1), 4 (2^2), 8 (2^3), 16 (2^4), 32 (2^5), 64 (2^6), 128 (2^7) ...
and: ... 3 (3*2^0), 6 (3*2^1), 12 (3*2^2), 24 (3*2^3), 48 (3*2^4), 96 (3*2^5), 192 (3*2^6) ...
In this example the second list is offset by 3, giving the same block sizes as the weighted buddy system. The difference lies in the splitting and merging rules: free space is not shared between the two lists, so when splitting a block of size 6 we cannot produce blocks of 2 and 4; since split blocks stay in their own series, 6 can only be split into 3 and 3. Requested sizes are rounded up to the nearest size class in either series, which reduces internal fragmentation by about 50%.

33 Deferred coalescing
Blocks are not merged as soon as they are freed. Instead the allocator keeps quick lists (subpools): an array of free lists, one for each size class whose coalescing is to be deferred. Freed blocks larger than the deferred sizes are returned to the general allocator. This is a common enhancement that can be combined with any basic allocator mechanism: recently freed blocks of predefined sizes are cached rather than merged immediately, which saves time when a program allocates many similar-sized blocks (a sketch of the free path follows). Deferred coalescing schemes vary in how often they coalesce items from the quick lists, which items are coalesced, the order in which items are chosen for coalescing, and the order in which items are allocated from the quick lists (LIFO or FIFO).
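A sketch of the free path with quick lists; the class count, the quick_class mapping, and general_free are hypothetical helpers assumed for the illustration:

    #include <stddef.h>

    #define QUICK_CLASSES 8        /* deferred size classes (assumed) */

    struct qnode { struct qnode *next; };

    /* One LIFO quick list per deferred size class. */
    static struct qnode *quick[QUICK_CLASSES];

    /* Hypothetical helpers provided by the general allocator: */
    extern int  quick_class(size_t size);      /* -1 if not deferred */
    extern void general_free(void *p, size_t size);

    /* Deferred coalescing: small freed blocks go onto a quick list
     * instead of being merged; larger blocks go back to the general
     * allocator, which coalesces as usual. */
    void deferred_free(void *p, size_t size) {
        int c = quick_class(size);
        if (c >= 0) {
            struct qnode *n = (struct qnode *)p;
            n->next  = quick[c];    /* LIFO push: reused soonest */
            quick[c] = n;
        } else {
            general_free(p, size);
        }
    }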

34 Deferred reuse
Recently freed blocks are not immediately reused; older free blocks are used instead of newly freed ones. This tends to compact long-lived memory blocks, but can cause increased fragmentation if only short-lived blocks are requested. Deferred reuse gives the neighbors of a freed block more time to die by delaying the block's reuse, increasing the chance that adjacent blocks can be merged.
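Deferred reuse can fall out of the free-list discipline alone: pushing freed blocks at the tail and allocating from the head gives FIFO order, so the oldest free blocks are reused first. A minimal sketch under that assumption:

    #include <stddef.h>

    struct node { struct node *next; };

    static struct node *head, *tail;   /* FIFO free list */

    /* Freed blocks join at the tail... */
    static void fifo_free(struct node *n) {
        n->next = NULL;
        if (tail) tail->next = n; else head = n;
        tail = n;
    }

    /* ...and allocation takes the oldest block from the head, so a
     * newly freed block sits unused while its neighbors get time to
     * die and become mergeable. */
    static struct node *fifo_alloc(void) {
        struct node *n = head;
        if (n) {
            head = n->next;
            if (!head) tail = NULL;
        }
        return n;
    }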

35 Discussion

36 Questions?
Why can deferred reuse cause increased fragmentation if only short-lived blocks are requested? Newly allocated objects will be placed in holes left by old objects that have died, mixing objects created by different phases (which may die at different times).
How can the order in which requests arrive affect memory fragmentation? A long run of small requests can prevent later large requests from being satisfied.
Why is fragmentation at peaks more important than at intervening points? Scattered holes in the heap "most of the time" are not a problem if they are filled before a peak is reached.

37 Questions?
When would deferred coalescing be likely to cause more fragmentation? Deferred coalescing may have significant effects on fragmentation by changing the allocator's decisions about which blocks of memory hold which objects; used memory can become scattered.
What is a possible disadvantage when splitting blocks using the Fibonacci buddy system? The remaining block is of a different size, which is less likely to be useful if the program allocates many objects of the same size.
In the double buddy system, why does the added size-class list reduce internal fragmentation by about 50%? Because the added list doubles the number of available size classes, reducing the size gap between classes.

