15.7 Buffer Management By Snigdha Rao Parvatneni SJSU ID: 008648978 Class Roll Number: 124 Course: CS257
Agenda Introduction Role of Buffer Manager Architecture of Buffer Management Buffer Management Strategies Relation between Physical Operator Selection and Buffer Management Example
Introduction Generally, we assume that operators on relations have some number of main-memory buffers available to store needed data. In practice, these buffers are rarely allocated to the operator in advance. The task of assigning main-memory buffers to processes is given to the buffer manager, which allocates memory to processes as needed while minimizing delays and unsatisfiable requests.
Role of Buffer Manager The buffer manager responds to requests for main-memory access to disk blocks. (Diagram omitted: requests to the buffer manager for access to disk blocks.)
Architecture of Buffer Management There are two broad architectures for a buffer manager:
– The buffer manager controls main memory directly, as in many relational DBMSs.
– The buffer manager allocates buffers in virtual memory and lets the operating system decide which buffers are in main memory and which are in the OS-managed disk swap space, as in many object-oriented DBMSs and main-memory DBMSs.
Problem Whichever approach is used, the buffer manager must limit the number of buffers in use so that they fit in the available main memory.
– When the buffer manager controls main memory directly: if requests exceed the available space, the buffer manager must select a buffer to empty by returning its contents to disk. If the block has not been changed, it is simply erased from main memory; if it has been changed, it must be written back to its place on disk.
– When the buffer manager allocates space in virtual memory: the buffer manager has the option of allocating more buffers than can actually fit in main memory. If all these buffers are really in use, there will be thrashing, an operating-system problem in which many blocks are moved in and out of the disk's swap space. The system then spends most of its time swapping blocks and gets very little work done.
Solution To resolve this problem, the number of buffers is set when the DBMS is initialized. Users need not worry about which mode of buffering is used; to them, there is simply a fixed-size buffer pool, that is, a set of buffers available to queries and other database actions.
Buffer Management Strategies The buffer manager must make a critical decision: which block to keep and which to discard when a buffer is needed for a newly requested block. For this it uses a buffer-replacement strategy. Some common strategies are:
– Least-Recently-Used (LRU)
– First-In-First-Out (FIFO)
– The Clock Algorithm ("Second Chance")
– System Control
Least-Recently-Used (LRU) This rule throws out the block that has not been read or written for the longest time. To do this, the buffer manager maintains a table indicating the last time the block in each buffer was accessed, and each database access must make an entry in this table, so a significant amount of effort is involved in maintaining this information. However, buffers that have not been used for a long time are less likely to be accessed soon than buffers that have been accessed recently, so LRU is an effective strategy.
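As an illustrative sketch (not from the slides), an LRU buffer pool can be modeled with an ordered dictionary: every access moves a block to the most-recently-used end, and eviction takes from the least-recently-used end. The class and block names here are hypothetical.

```python
from collections import OrderedDict

class LRUBufferPool:
    """Toy LRU buffer pool: maps block ids to their contents."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffers = OrderedDict()  # block_id -> contents, oldest first

    def access(self, block_id, read_block):
        """Return the block, loading (and possibly evicting) as needed.
        read_block is a callable that fetches the block from disk."""
        if block_id in self.buffers:
            self.buffers.move_to_end(block_id)  # now most recently used
            return self.buffers[block_id]
        if len(self.buffers) >= self.capacity:
            # Evict the least-recently-used block; a real DBMS would
            # first write it back to disk if it had been modified.
            self.buffers.popitem(last=False)
        self.buffers[block_id] = read_block(block_id)
        return self.buffers[block_id]

pool = LRUBufferPool(capacity=2)
pool.access("A", str.lower)
pool.access("B", str.lower)
pool.access("A", str.lower)   # touch A, so B becomes least recently used
pool.access("C", str.lower)   # evicts B, not A
print(list(pool.buffers))     # ['A', 'C']
```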
First-In-First-Out (FIFO) Under this rule, when a buffer is needed, the buffer whose block has occupied it the longest is emptied and reused for the new block. The buffer manager needs to know only the time at which the block currently occupying each buffer was loaded; an entry is made in the table when a block is read from disk, not every time it is accessed. FIFO therefore involves less maintenance than LRU, but it is more prone to mistakes.
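For contrast with LRU, a comparable FIFO sketch (hypothetical names again) records only load order, so re-accessing a block does not protect it from eviction:

```python
from collections import deque

class FIFOBufferPool:
    """Toy FIFO buffer pool: evicts the block loaded earliest."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.order = deque()   # block ids in load order, oldest first
        self.buffers = {}      # block_id -> contents

    def access(self, block_id, read_block):
        if block_id in self.buffers:          # a hit does NOT update order
            return self.buffers[block_id]
        if len(self.buffers) >= self.capacity:
            evicted_id = self.order.popleft() # evict the oldest-loaded block
            del self.buffers[evicted_id]
        self.order.append(block_id)
        self.buffers[block_id] = read_block(block_id)
        return self.buffers[block_id]

pool = FIFOBufferPool(capacity=2)
pool.access("A", str.lower)
pool.access("B", str.lower)
pool.access("A", str.lower)   # hit; load order is unchanged
pool.access("C", str.lower)   # evicts A (loaded first), unlike LRU
print(sorted(pool.buffers))   # ['B', 'C']
```

Note the final state: FIFO discards the recently used block "A", which is exactly the kind of mistake LRU avoids at the cost of more bookkeeping.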
The Clock Algorithm ("Second Chance") This is an efficient approximation of LRU and is commonly implemented. Think of the buffers as arranged in a circle, with an arrow pointing to one of them. The arrow rotates clockwise when it needs to find a buffer in which to place a disk block. Each buffer has an associated flag with value 0 or 1: buffers with flag 0 are vulnerable to having their contents transferred to disk, while buffers with flag 1 are not. Whenever a block is read into a buffer, or the contents of a buffer are accessed, its flag is set to 1.
Working of the Clock Algorithm Whenever a buffer is needed for a block, the arrow looks for the first 0 it can find, moving clockwise, and changes each flag it passes from 1 to 0. A block is thrown out of its buffer only if it remains unaccessed, i.e., its flag stays 0, for the time between two passes of the arrow: the first pass sets its flag from 1 to 0, and the second pass comes back to find the flag still 0.
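The two slides above can be sketched as a fixed ring of buffers with a sweeping hand; this is a toy illustration with hypothetical names, not a DBMS implementation:

```python
class ClockBufferPool:
    """Toy clock (second-chance) replacement over a fixed ring of buffers."""

    def __init__(self, capacity):
        self.blocks = [None] * capacity   # block id held by each buffer
        self.flags = [0] * capacity       # second-chance bits
        self.hand = 0                     # the clock "arrow"

    def access(self, block_id):
        """Return the index of the buffer now holding block_id."""
        if block_id in self.blocks:       # hit: set the flag to 1
            i = self.blocks.index(block_id)
            self.flags[i] = 1
            return i
        # Miss: sweep the hand until a flag-0 buffer is found,
        # clearing each flag from 1 to 0 as the hand passes it.
        while self.flags[self.hand] == 1:
            self.flags[self.hand] = 0
            self.hand = (self.hand + 1) % len(self.blocks)
        i = self.hand
        self.blocks[i] = block_id         # evict and reuse this buffer
        self.flags[i] = 1
        self.hand = (i + 1) % len(self.blocks)
        return i

pool = ClockBufferPool(capacity=3)
for b in ["A", "B", "C"]:
    pool.access(b)
pool.access("D")   # all flags were 1: hand clears them, wraps, evicts A
pool.access("B")   # hit: B's flag is set back to 1
pool.access("E")   # hand passes B (flag 1 cleared to 0) and evicts C instead
print(pool.blocks) # ['D', 'B', 'E']
```

The last access shows the "second chance": B survives the sweep because it was touched between two passes of the arrow, while C, unaccessed since it was loaded, is thrown out.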
System Control The query processor and other DBMS components can advise the buffer manager, allowing it to avoid some of the mistakes that occur with LRU, FIFO, or the clock algorithm. Some blocks cannot be moved out of main memory without first modifying other blocks that point to them; such blocks are called pinned blocks. The buffer manager must modify its buffer-replacement strategy to avoid expelling pinned blocks, which is why some blocks remain in main memory even though there is no technical reason not to write them to disk.
Relation Between Physical Operator Selection And Buffer Management The query optimizer selects the physical operators used to execute the query, and these operators expect a certain number of buffers during execution. However, the buffer manager does not guarantee that this many buffers will be available when the query is executed. Two questions arise in this situation:
– Can an algorithm adapt to changes in the number of available main-memory buffers?
– When fewer buffers are available than expected, some blocks must stay on disk rather than in main memory; how does the buffer-replacement strategy affect performance?
Example Block-based nested-loop join – the algorithm itself does not depend on the number of available buffers M, but its performance does. For each group of M-1 blocks of the outer-loop relation S, read those blocks into main memory and organize their tuples into a search structure whose key is the attribute(s) common to R and S. Then, for each block b of the inner-loop relation R, read b into main memory and, for each tuple t of b, find the tuples of S that join with t. S uses M-1 buffers and one buffer is reserved for R; the number of outer-loop iterations depends on the average number of buffers available at each iteration. If we pin the M-1 blocks used for S in one iteration of the outer loop, we cannot lose those buffers during that round. If more buffers become available, more blocks of R can be kept in memory. Will that improve the running time?
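The join described above can be sketched as follows; the block/tuple representation (lists of blocks, tuples as key-payload pairs) is a hypothetical simplification, with M as the buffer budget:

```python
def block_nested_loop_join(S_blocks, R_blocks, M, key_s, key_r):
    """Join R and S block-at-a-time using at most M buffers:
    M-1 for a chunk of S, one reserved for the current block of R."""
    result = []
    chunk = M - 1
    for i in range(0, len(S_blocks), chunk):
        # Load M-1 blocks of S and build a search structure on the join key.
        table = {}
        for block in S_blocks[i:i + chunk]:
            for s in block:
                table.setdefault(key_s(s), []).append(s)
        # Scan R one block at a time, probing the search structure.
        for block in R_blocks:
            for t in block:
                for s in table.get(key_r(t), []):
                    result.append((s, t))
    return result

# Tiny example: tuples are (join_key, payload) pairs, two tuples per block.
S = [[(1, "s1"), (2, "s2")], [(3, "s3"), (4, "s4")]]
R = [[(2, "r1"), (3, "r2")], [(5, "r3"), (1, "r4")]]
out = block_nested_loop_join(S, R, M=2,
                             key_s=lambda x: x[0], key_r=lambda x: x[0])
print(sorted(out))
```

With M=2 the chunk size is one block of S, so R is scanned once per block of S; a larger M shrinks the number of outer-loop iterations and hence the number of times R is re-read.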
Cases with LRU Case 1
– Suppose LRU is the buffer-replacement strategy and k buffers are available to hold blocks of R.
– If R is always read in the same order, the blocks remaining in the buffers at the end of an iteration of the outer loop are the last k blocks of R.
– The next iteration starts again from the beginning of R, so these k buffers are of no help and all k must be replaced.
Case 2
– A better implementation of nested-loop join under LRU visits the blocks of R in an alternating order: first to last, then last to first.
– The k blocks left in the buffers are then reused, saving k disk I/O's on each iteration except the first.
With Other Algorithms Other algorithms are also impacted by the fact that the number of available buffers can vary, and by the buffer-replacement strategy used by the buffer manager. In sort-based algorithms, when fewer buffers are available, we can change the size of the sublists; the major limitation is that we may be forced to create so many sublists that we cannot allocate a buffer to each sublist during the merging process. In hash-based algorithms, when fewer buffers are available, we can reduce the number of buckets, provided the buckets do not then become so large that they no longer fit into the allotted main memory.
References DATABASE SYSTEMS: The Complete Book, Second Edition, by Hector Garcia-Molina, Jeffrey D. Ullman, and Jennifer Widom