
1 CSC 213 Lecture 10: BTrees

2 Announcements: You should not need to do more than the lab exercise states. If it only says to add a CharRange, you should not need to define a CharClass. Use your finite state machine drawings and description of classes.

3 Red-Black Tree Follow-up: Delete 3, 6, 5, 2, 8, 4, 1, 7 from this tree. [Figure: red-black tree; node labels recovered from the slide: 4, 2, 1, 6, 5, 7, 8]

4 Lies My Professor Told Me: Big-Oh notation does not always correctly model algorithm performance. For example, big-Oh treats all memory accesses as equal, yet their costs differ widely. Register: 1 cycle. Cache: 20 cycles. RAM: 240 cycles. Hard drive: 200,000 cycles.
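To make the gap concrete, here is a small sketch that multiplies an access count by the per-access cycle costs listed on the slide. The class name, the method name, and the workload of 1,000 accesses are assumptions chosen for illustration only.

```java
final class AccessCost {
    // Cycle counts taken from the slide.
    static final long REGISTER = 1;
    static final long CACHE = 20;
    static final long RAM = 240;
    static final long HARD_DRIVE = 200_000;

    /** Total cycles when every access costs the same number of cycles. */
    static long totalCycles(long accesses, long cyclesPerAccess) {
        return accesses * cyclesPerAccess;
    }

    public static void main(String[] args) {
        long accesses = 1_000; // hypothetical workload, same big-Oh in both cases
        System.out.println("RAM:        " + totalCycles(accesses, RAM) + " cycles");
        System.out.println("Hard drive: " + totalCycles(accesses, HARD_DRIVE) + " cycles");
    }
}
```

Both runs perform the same number of accesses, so big-Oh rates them identically even though the cycle totals differ by a factor of several hundred.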

5 Paging == Bad: What happens when the heap needs more memory than is in RAM? Disk accesses dominate total running time. Execution can take 20 or more times longer.

6 Paging == Bad

7 Virtual Memory Organization: Virtual memory works by dividing memory into pages. The size of a page is constant throughout the system; 4096 bytes is used in a lot of systems. The operating system then handles each page separately. When a page is not being used, the operating system may evict it to disk. An evicted page must be reloaded whenever it is accessed.
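As a rough illustration of fixed-size pages, the sketch below maps an address to a page number and an offset within that page, assuming the 4096-byte page size mentioned on the slide. The class and method names are hypothetical, and real operating systems do this translation in hardware and kernel code rather than in application code.

```java
final class PageMath {
    static final int PAGE_SIZE = 4096; // bytes, the common size named on the slide

    /** Index of the page that contains the given address. */
    static long pageNumber(long address) {
        return address / PAGE_SIZE;
    }

    /** Position of the address inside its page. */
    static long pageOffset(long address) {
        return address % PAGE_SIZE;
    }

    /** Two addresses cost at most one page load only if they share a page. */
    static boolean samePage(long a, long b) {
        return pageNumber(a) == pageNumber(b);
    }

    public static void main(String[] args) {
        System.out.println(pageNumber(4100) + " / " + pageOffset(4100));
        System.out.println(samePage(100, 4000) + " " + samePage(4000, 5000));
    }
}
```

Objects that sit on the same page are loaded together, which is the property the later slides exploit.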

8 Problems with Trees: Trees are an important and common way to organize information. They provide consistent O(log n) access time, something we all like. But nodes contain only 1-3 entries and 2-4 children, and they get spread over the entire heap in no real order. This is a great way to torture a computer and make its hard drive beg for mercy. Good when using your roommate's machine, not so good for your own.

9 BTrees to the Rescue! BTrees are the real-world solution to paging. Apple uses them to track directories and files in their file system. They are the organizational scheme used within the MySQL database (i.e., the most popular database). It also makes julienne fries.

10 What is “the BTree?” BTrees are similar to (2,4) trees: all leaves are at the same level, and nodes contain a variable number of children and entries. But nodes are much, much bigger. We usually discuss a BTree of order m. Internal nodes have m / 2 to m entries. The root node has m or fewer entries.

11 BTree Order: Select the order to minimize paging. Size nodes so that a full node, including its entries and the references to its children, fills a page. Since each node other than the root has at least m / 2 entries, each page is at least 50% full. How many pages will we access searching for an element?
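One way to reason about the closing question is to count how many levels a BTree can have at most, since a search reads one node (one page) per level. The sketch below assumes the tallest possible tree: a root with a single entry and every other node holding only m / 2 entries. The class and method names are hypothetical, and the loop assumes sizes small enough not to overflow a long.

```java
final class BTreeHeight {
    /**
     * Largest number of levels a BTree of the given order can have while
     * holding n entries, i.e. the worst-case number of pages one search reads.
     */
    static int maxLevels(long n, int order) {
        if (n < 1 || order < 2) {
            throw new IllegalArgumentException("need n >= 1 and order >= 2");
        }
        long minEntriesPerNode = order / 2;     // emptiest legal non-root node
        long fanout = minEntriesPerNode + 1;    // children of such a node
        int levels = 1;                         // the root alone
        long fewestEntries = 1;                 // a root may hold a single entry
        long nodesOnNextLevel = 2;              // that root has two children
        while (true) {
            long fewestWithOneMoreLevel =
                    fewestEntries + nodesOnNextLevel * minEntriesPerNode;
            if (fewestWithOneMoreLevel > n) {
                return levels;                  // n entries cannot fill another level
            }
            levels++;
            fewestEntries = fewestWithOneMoreLevel;
            nodesOnNextLevel *= fanout;
        }
    }

    public static void main(String[] args) {
        System.out.println(maxLevels(1_000_000L, 200));
    }
}
```

Under these assumptions, a million entries in a tree of order 200 need at most three page reads per search, because a fourth level would require more than two million entries.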

12 Insertion into a BTree: Insertion begins as normal. Search through the tree to find the location to insert, then add the entry to the node where it belongs. Check for overflow: if the node now has m+1 entries, split it into two nodes of size m / 2 and promote the median (middle) entry into the parent. Then check whether this causes the parent to overflow.
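The overflow step can be sketched on a single node. The code below is a simplified illustration, not a full BTree: a node is modeled as a sorted list of int keys with no children, and the names SplitSketch, Split, and insert are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

final class SplitSketch {
    /** The two halves of an overflowing node plus the entry to promote. */
    static final class Split {
        final List<Integer> left;
        final int median;
        final List<Integer> right;

        Split(List<Integer> left, int median, List<Integer> right) {
            this.left = left;
            this.median = median;
            this.right = right;
        }
    }

    /**
     * Adds key to the sorted node. Returns a Split if the node now holds
     * order + 1 entries; the caller would replace the node with the two
     * halves and insert the median into the parent (which may overflow too).
     */
    static Optional<Split> insert(List<Integer> node, int key, int order) {
        int pos = Collections.binarySearch(node, key);
        if (pos < 0) {
            pos = -pos - 1;                 // convert to insertion point
        }
        node.add(pos, key);
        if (node.size() <= order) {
            return Optional.empty();        // no overflow
        }
        int mid = node.size() / 2;          // index of the median entry
        return Optional.of(new Split(
                new ArrayList<>(node.subList(0, mid)),
                node.get(mid),
                new ArrayList<>(node.subList(mid + 1, node.size()))));
    }
}
```

With an even order m, the overflowing node has m + 1 entries, so each half ends up with exactly m / 2 entries, matching the minimum allowed for a non-root node.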

13 Removal from a BTree: Swap the entry to be removed with its successor at the bottom level. If the node at the bottom level now has fewer than m / 2 entries: if possible, move an entry from a sibling to the parent and steal an entry from the parent; otherwise, combine the node with its sibling and steal an entry from the parent. Check whether this propagates the underflow to the parent.
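The two underflow repairs can be sketched for bottom-level nodes. This is a simplified illustration under several assumptions: nodes are sorted lists of int keys, only the left sibling is considered, and sep is the index of the parent entry that separates the left sibling from the underflowing node. All names are hypothetical.

```java
import java.util.List;

final class UnderflowSketch {
    /**
     * Repairs an underflowing bottom-level node using its left sibling.
     * Returns true if the nodes were combined, in which case the parent
     * lost an entry and the caller must check the parent for underflow.
     */
    static boolean fixUnderflow(List<Integer> left, List<Integer> parent,
                                int sep, List<Integer> node, int order) {
        if (left.size() > order / 2) {
            // Sibling can spare an entry: steal the separator from the
            // parent, then move the sibling's largest entry up to replace it.
            node.add(0, parent.get(sep));
            parent.set(sep, left.remove(left.size() - 1));
            return false;
        }
        // Sibling is at the minimum: combine both nodes and pull the
        // separator down from the parent to sit between them.
        left.add(parent.remove(sep));
        left.addAll(node);
        node.clear();
        return true;
    }
}
```

A combined node holds m / 2 entries from the sibling, one from the parent, and fewer than m / 2 from the underflowing node, so it never exceeds m entries.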

14 Choosing a Good m: How do we choose a BTree’s order? We want to minimize the number of disk accesses and maximize page usage. Select m so a full node fills a page. The smallest node (using m / 2 entries) uses ½ a page. What is the maximum number of pages used for any search, insert, or remove?
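Picking m so that a full node fills a page comes down to a short calculation. The sketch below assumes a full node stores m entries and m + 1 child references with no other overhead; the byte sizes in the example (16-byte entries, 8-byte references) are assumptions for illustration, not figures from the lecture.

```java
final class OrderChooser {
    /**
     * Largest order m such that m entries plus m + 1 child references
     * fit in one page: m * entryBytes + (m + 1) * childRefBytes <= pageSize.
     */
    static int orderFor(int pageSize, int entryBytes, int childRefBytes) {
        return (pageSize - childRefBytes) / (entryBytes + childRefBytes);
    }

    public static void main(String[] args) {
        // Hypothetical sizes: 4096-byte page, 16-byte entry, 8-byte reference.
        int m = orderFor(4096, 16, 8);
        int bytesUsed = m * 16 + (m + 1) * 8;
        System.out.println("order " + m + " uses " + bytesUsed + " of 4096 bytes");
    }
}
```

With these assumed sizes the order works out to 170, and a full node occupies 4088 of the 4096 bytes in the page.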

15 Using BTrees: One very common place to find BTrees is in databases. They often have too much data to fit in RAM and need a simple, efficient organization. But databases also need data to be stored permanently. This does not interact well with heap objects, since the heap is stored in RAM.

16 Database BTrees: Maintain the BTree in memory, but keep the records on disk. Entries include the ID and where in the file to find the record. Immediately write changes back to the disk, so we know that all updates will be saved. This also means we do not need to keep the file in any specific order.
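An entry of this kind needs only two fields. The class below is one possible shape, offered as a sketch: the int ID, the long file position, and the accessor names are assumptions, and ordering by ID is chosen because the BTree searches by ID.

```java
/** One BTree entry: a record's ID and where its record starts in the file. */
final class Entry implements Comparable<Entry> {
    private final int id;          // key the BTree is ordered by
    private final long position;   // byte offset of the record in the file

    Entry(int id, long position) {
        this.id = id;
        this.position = position;
    }

    int getId() {
        return id;
    }

    long getPosition() {
        return position;
    }

    /** Entries are ordered by ID only; the file position is just a payload. */
    @Override
    public int compareTo(Entry other) {
        return Integer.compare(id, other.id);
    }
}
```

Because the entry records where the data lives, the records themselves can be appended to the file in any order.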

17 RandomAccessFile: For this scheme to work, we cannot read and write the file sequentially. Instead, we must be able to jump around throughout the entire file. We also need a way to specify locations in the file. Java’s solution: the RandomAccessFile class.

18 RandomAccessFile: Instances can create new files or work with already existing files. RandomAccessFile raf = new RandomAccessFile("file.txt", "rw"); This creates file.txt if it does not already exist, allows read and write access to the file, and throws an IOException if a problem arises. We can now use the variable raf to access/modify the file.
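A minimal sketch of opening a file in "rw" mode follows. It uses try-with-resources so the file is closed automatically, and a temporary directory so the example does not touch real files; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

final class OpenExample {
    /** Opens (or creates) the file for reading and writing; returns its length. */
    static long openAndMeasure(Path path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw")) {
            return raf.length();   // 0 for a file that was just created
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempDirectory("raf-demo").resolve("file.txt");
        System.out.println("exists before: " + Files.exists(path));
        System.out.println("length:        " + openAndMeasure(path));
        System.out.println("exists after:  " + Files.exists(path));
    }
}
```

The constructor is declared to throw FileNotFoundException, which is a subclass of IOException, so catching or declaring IOException covers it.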

19 Reading RandomAccessFile: Read from a RandomAccessFile using normal file input methods. boolean readBoolean(), int readInt(), double readDouble(), … read and return the appropriate value. int read(byte[] b) reads up to b.length bytes and stores them in b; it returns the number of bytes read.

20 Writing to RandomAccessFile: Write to a RandomAccessFile using normal file output methods. void writeInt(int i), void writeDouble(double d), … write the appropriate value to the file. void write(byte[] b) writes the contents of the array b to the file.
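The read and write methods mirror each other, so values come back correctly only when they are read in the order and with the types they were written. The sketch below writes three values, reopens the file read-only, and reads them back; the class name, method name, and sample values are assumptions.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

final class RoundTrip {
    /** Writes an int, a double, and a boolean, then reads them back in order. */
    static String writeThenRead(Path path) throws IOException {
        try (RandomAccessFile out = new RandomAccessFile(path.toFile(), "rw")) {
            out.writeInt(42);        // 4 bytes
            out.writeDouble(2.5);    // 8 bytes
            out.writeBoolean(true);  // 1 byte
        }
        // Reopening starts again at position 0, so the reads line up
        // with the writes above.
        try (RandomAccessFile in = new RandomAccessFile(path.toFile(), "r")) {
            int i = in.readInt();
            double d = in.readDouble();
            boolean b = in.readBoolean();
            return i + " " + d + " " + b;
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("raf-roundtrip", ".dat");
        System.out.println(writeThenRead(path));
    }
}
```

The values are stored in binary form, not as text, which is why each type occupies a fixed number of bytes.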

21 Typical File I/O: Ordinarily we read and write files sequentially. RandomAccessFile raf = new …; while (c != 's') { c = raf.readChar(); } File contents: "This is an example file we access"

22 Typical File I/O: Ordinarily we read and write files sequentially. RandomAccessFile raf = new …; while (c != 's') { c = raf.readChar(); raf.writeChar(c); } File contents: "This is an example file we access"

23 Typical File I/O: Ordinarily we read and write files sequentially. RandomAccessFile raf = new …; while (c != 's') { c = raf.readChar(); raf.writeChar(c); } File contents: "TTis is an example file we access"

24 Typical File I/O: Ordinarily we read and write files sequentially. RandomAccessFile raf = new …; while (c != 's') { c = raf.readChar(); raf.writeChar(c); } File contents: "TTii is an example file we access"

25 Typical File I/O: Ordinarily we read and write files sequentially. RandomAccessFile raf = new …; while (c != 's') { c = raf.readChar(); raf.writeChar(c); } File contents: "TTii  s an example file we access"

26 Typical File I/O: Ordinarily we read and write files sequentially. RandomAccessFile raf = new …; while (c != 's') { c = raf.readChar(); raf.writeChar(c); } File contents: "TTii  ssan example file we access"

27 RandomAccessFile: RandomAccessFile includes the ability to position where we next read from or write to anywhere in the file. void seek(long pos) moves anywhere in the file. Positions are specified by the number of bytes from the beginning of the file.
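Since positions are byte counts, fixed-size records make the arithmetic easy: record i starts at i times the record size. The sketch below stores one int per record, which is an assumption made to keep the example short; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

final class SeekExample {
    static final int RECORD_BYTES = Integer.BYTES; // each record is one 4-byte int

    /** Jumps to record number index and reads it. */
    static int readRecord(RandomAccessFile raf, int index) throws IOException {
        raf.seek((long) index * RECORD_BYTES);
        return raf.readInt();
    }

    /** Jumps to record number index and overwrites it in place. */
    static void writeRecord(RandomAccessFile raf, int index, int value) throws IOException {
        raf.seek((long) index * RECORD_BYTES);
        raf.writeInt(value);
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("raf-seek", ".dat");
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw")) {
            for (int i = 0; i < 5; i++) {
                writeRecord(raf, i, i * 10);
            }
            System.out.println(readRecord(raf, 3)); // read without scanning records 0-2
            writeRecord(raf, 3, 99);                // overwrite only record 3
            System.out.println(readRecord(raf, 3));
        }
    }
}
```

Neither the read nor the overwrite touches the neighboring records, which is what separates this from sequential access.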

28 RandomAccessFile I/O: Ordinarily we read and write files sequentially. RandomAccessFile raf = new …; raf.seek(raf.length() - 2); /* readChar() reads 2 bytes */ c = raf.readChar(); raf.seek(0); raf.writeChar(c); File contents: "This is an example file we access"

29 RandomAccessFile I/O: Ordinarily we read and write files sequentially. RandomAccessFile raf = new …; raf.seek(raf.length() - 2); /* readChar() reads 2 bytes */ c = raf.readChar(); raf.seek(0); raf.writeChar(c); File contents: "shis is an example file we access"

30 So, how do we use this? We use these file positions to simplify our BTrees and entries. Each Entry contains the ID number and the position of the record within the file. We can even use this to simplify rebuilding our BTree whenever we start the program: record the contents of the BTree at the end of the file, store the number of elements in the BTree at the start of the file, and then find and read the BTree at startup.
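The save-and-reload idea can be sketched as follows. The file layout here is an assumption, not the course's required format: the first 4 bytes hold the entry count, records follow, and the index is appended as fixed-size (int ID, long position) pairs. A TreeMap stands in for the BTree, and all names are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Map;
import java.util.TreeMap;

final class IndexFile {
    static final int ENTRY_BYTES = Integer.BYTES + Long.BYTES; // int ID + long position

    /** Appends every entry to the end of the file and stores the count at the start. */
    static void saveIndex(RandomAccessFile raf, Map<Integer, Long> index) throws IOException {
        raf.seek(raf.length());              // the index goes after the records
        for (Map.Entry<Integer, Long> e : index.entrySet()) {
            raf.writeInt(e.getKey());
            raf.writeLong(e.getValue());
        }
        raf.seek(0);                         // header: number of entries
        raf.writeInt(index.size());
    }

    /** Reads the count from the start, then the entries from the end of the file. */
    static Map<Integer, Long> loadIndex(RandomAccessFile raf) throws IOException {
        raf.seek(0);
        int count = raf.readInt();
        // Entries have a fixed size, so the count tells us where the index begins.
        raf.seek(raf.length() - (long) count * ENTRY_BYTES);
        Map<Integer, Long> index = new TreeMap<>();
        for (int i = 0; i < count; i++) {
            int id = raf.readInt();
            long position = raf.readLong();
            index.put(id, position);
        }
        return index;
    }
}
```

This sketch assumes the first 4 bytes of the file were reserved for the count before any records were written. Saving a second time would append another copy of the index, so a fuller version would first cut off the old one, for example with setLength.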

31 Daily Quiz Write the Entry class that we would use with a disk-based BTree

