CSE 326: Data Structures Trees

Presentation transcript:

1 CSE 326: Data Structures Trees

2 Today: Splay Trees
Fast both in worst-case amortized analysis and in practice
Are used in the kernel of NT to keep track of process information!
Invented by Sleator and Tarjan (1985)
Details: Weiss 4.5 (basic splay trees), 11.5 (amortized analysis), 12.1 (better “top down” implementation)

3 Basic Idea
“Blind” rebalancing – no height info kept!
Worst-case time per operation is O(n)
Worst-case amortized time is O(log n)
Insert/find always rotates the accessed node to the root!
Good locality:
– Most commonly accessed keys move high in tree
– become easier and easier to find

4 Idea
You’re forced to make a really deep access:
Since you’re down there anyway, fix up a lot of deep nodes!
Move n to the root by a series of zig-zag and zig-zig rotations, followed by a final single rotation (zig) if necessary

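5 Zig-Zag*
[Figure: zig-zag step on node n, its parent p and grandparent g, with subtrees W, X, Y, Z, shown before and after; n ends up as the root of the subtree with p and g as its children. Legend: Helped (up 2, up 1), Unchanged, Hurt (down 1)]
* This is just a double rotation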

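6 Zig-Zig
[Figure: zig-zig step on node n, its parent p and grandparent g, with subtrees W, X, Y, Z, shown before and after; g, p, n lie on a straight path, and the step reverses their order so that n becomes the root of the subtree]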

7 Why Splaying Helps
Node n and its children are always helped (raised)
Except for the last step, nodes that are hurt by a zig-zag or zig-zig are later helped by a rotation higher up the tree!
Result:
– shallow nodes may increase depth by one or two
– helped nodes decrease depth by a large amount
If a node n on the access path is at depth d before the splay, it’s at about depth d/2 after the splay
– Exceptions are the root, the child of the root, and the node splayed
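
The zig, zig-zig and zig-zag steps can be tried out in a few lines of Python. This is a minimal sketch, not the course's or Weiss's implementation: the names `Node`, `rotate_up`, `splay` and `height` are made up for illustration, and the access path is kept in a list instead of parent pointers. It splays bottom-up, pairing steps from the accessed node upward and finishing with a single zig when one edge is left over.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def rotate_up(parent: Node, child: Node) -> Node:
    """Single rotation lifting child above parent; returns child."""
    if parent.left is child:
        parent.left, child.right = child.right, parent
    else:
        parent.right, child.left = child.left, parent
    return child


def splay(root: Optional[Node], key: int) -> Optional[Node]:
    """Move key (or the last node on its search path) to the root."""
    if root is None:
        return None
    path = [root]                      # nodes on the access path, root first
    while path[-1].key != key:
        node = path[-1]
        child = node.left if key < node.key else node.right
        if child is None:
            break                      # key absent: splay the last node reached
        path.append(child)
    n = path.pop()
    while path:
        p = path.pop()
        if not path:                   # zig: p is the root, one single rotation
            rotate_up(p, n)
            break
        g = path.pop()
        if (g.left is p) == (p.left is n):
            rotate_up(g, p)            # zig-zig: rotate the parent up first...
            rotate_up(p, n)            # ...then the node itself
        else:                          # zig-zag: an ordinary double rotation
            if g.left is p:
                g.left = rotate_up(p, n)
            else:
                g.right = rotate_up(p, n)
            rotate_up(g, n)
        if path:                       # hook n in where g used to hang
            gg = path[-1]
            if gg.left is g:
                gg.left = n
            else:
                gg.right = n
    return n


def height(t: Optional[Node]) -> int:
    return -1 if t is None else 1 + max(height(t.left), height(t.right))


# A degenerate right-leaning chain 1 -> 2 -> ... -> 7, then one deep access.
chain = None
for k in range(7, 0, -1):
    chain = Node(k, right=chain)
print("height before:", height(chain))
chain = splay(chain, 7)
print("root after:", chain.key, "height after:", height(chain))
```

Running it on the chain shows the effect described on the slide: the accessed node becomes the root and the nodes on the access path end up noticeably shallower, without any height information being stored.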

8 Splaying Example Find(6) zig-zig

9 Still Splaying 6 zig-zig

10 Almost There, Stay on Target zig

11 Splay Again Find(4) zig-zag

12 Example Splayed Out zig-zag

13 Locality
“Locality” – if an item is accessed, it is likely to be accessed again soon
– Why?
Assume m ≥ n accesses in a tree of size n
– Total worst case time is O(m log n)
– O(log n) per access amortized time
Suppose only k distinct items are accessed in the m accesses.
– Time is O(n log n + m log k)
– The n log n term covers getting those k items near the root; the m log k term applies once those k items are all at the top of the tree
– Compare with O(m log n) for an AVL tree

14 Splay Operations: Insert
To insert, could do an ordinary BST insert
– but would not fix up tree
– A BST insert followed by a find (splay)?
Better idea: do the splay before the insert!
How?

15 Split
Split(T, x) creates two BST’s L and R:
– All elements of T are in either L or R
– All elements in L are ≤ x
– All elements in R are ≥ x
– L and R share no elements
Then how do we do the insert?

16 Split
Split(T, x) creates two BST’s L and R:
– All elements of T are in either L or R
– All elements in L are ≤ x
– All elements in R are > x
– L and R share no elements
Then how do we do the insert?
Insert as root, with children L and R

17 Splitting in Splay Trees
How can we split?
– We have the splay operation
– We can find x or the parent of where x would be if we were to insert it into an ordinary BST
– We can splay x or the parent to the root
– Then break one of the links from the root to a child

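18 Split
[Figure: split(x) on tree T. The splay brings to the root either x or what would have been the parent of x. If the root is ≤ x, the link to its right child is broken; if the root is > x, the link to its left child is broken. The result is L (keys ≤ x) and R (keys > x)]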

19 Back to Insert
Insert(x):
– Split on x
– Join subtrees using x as root
[Figure: split(x) yields L (keys ≤ x) and R (keys > x); x becomes the new root with L and R as its children]

20 Insert Example Insert(5) split(5)

21 Splay Operations: Delete
[Figure: find(x) splays x to the root, with subtrees L (keys < x) and R (keys > x); deleting x leaves the two trees L and R]
Now what?

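22 Join
Join(L, R): given two trees such that L < R, merge them
Splay on the maximum element in L, then attach R
[Figure: splaying L brings its maximum to the root, which then has no right child; R is attached as the right subtree]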

23 Delete Completed
[Figure: find(x) splays x to the root of T, with subtrees L (keys < x) and R (keys > x); deleting x leaves L and R; Join(L, R) produces T - x]
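
Split, Insert, Join and Delete as described on these slides fit together as sketched below. All names (`Node`, `rotate_up`, `splay`, `split`, `insert`, `join`, `delete`) are illustrative assumptions; `splay` is a compact bottom-up version included only so the block runs on its own. Passing `float("inf")` to `splay` is a shortcut used here to reach the maximum of L; it assumes numeric keys.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def rotate_up(parent, child):
    """Single rotation lifting child above parent; returns child."""
    if parent.left is child:
        parent.left, child.right = child.right, parent
    else:
        parent.right, child.left = child.left, parent
    return child


def splay(root, key):
    """Bottom-up splay of key, or of the last node on its search path."""
    if root is None:
        return None
    path = [root]
    while path[-1].key != key:
        nxt = path[-1].left if key < path[-1].key else path[-1].right
        if nxt is None:
            break
        path.append(nxt)
    n = path.pop()
    while path:
        p = path.pop()
        if not path:                              # zig
            return rotate_up(p, n)
        g = path.pop()
        if (g.left is p) == (p.left is n):        # zig-zig
            rotate_up(g, p)
            rotate_up(p, n)
        else:                                     # zig-zag
            if g.left is p:
                g.left = rotate_up(p, n)
            else:
                g.right = rotate_up(p, n)
            rotate_up(g, n)
        if path:                                  # reattach below the old parent of g
            if path[-1].left is g:
                path[-1].left = n
            else:
                path[-1].right = n
    return n


def split(t, x):
    """Return (L, R): every key in L is <= x, every key in R is > x."""
    t = splay(t, x)
    if t is None:
        return None, None
    if t.key <= x:
        right, t.right = t.right, None            # break the link to the right child
        return t, right
    left, t.left = t.left, None                   # break the link to the left child
    return left, t


def insert(t, x):
    left, right = split(t, x)
    if left is not None and left.key == x:        # x already present: undo the split
        left.right = right
        return left
    return Node(x, left, right)                   # x becomes the root over L and R


def join(left, right):
    """Merge two trees where every key in left is below every key in right."""
    if left is None:
        return right
    left = splay(left, float("inf"))              # maximum of L moves to the root
    left.right = right                            # the maximum has no right child
    return left


def delete(t, x):
    t = splay(t, x)                               # find(x)
    if t is None or t.key != x:
        return t                                  # x is not in the tree
    return join(t.left, t.right)


def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)


tree = None
for k in [5, 2, 8, 1, 9, 3]:
    tree = insert(tree, k)
print(tree.key, inorder(tree))
tree = delete(tree, 8)
print(inorder(tree))
```

Note that `insert` leaves the new key at the root, so a key that was just inserted is found immediately on the next access.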

24 Delete Example Delete(4) find(4) Find max

25 Splay Trees, Summary
Splay trees are arguably the most practical kind of self-balancing trees
If the number of finds is much larger than n, then locality is crucial!
– Example: word-counting
They also support efficient Split and Join operations – useful for other tasks
– E.g., range queries

26 Dictionary & Search ADTs
Dictionary ADT (aka map ADT)
Stores values associated with user-specified keys
– keys may be any (homogeneous) comparable type
– values may be any (homogeneous) type
Search ADT (aka Set ADT): stores keys only

27 Dictionary & Search ADTs
Operations:
create : → dictionary
insert : dictionary × key × values → dictionary
find : dictionary × key → values
delete : dictionary × key → dictionary
Example contents:
kim chi: spicy cabbage
Kreplach: tasty stuffed dough
Kiwi: Australian fruit
Example calls:
insert(kohlrabi, upscale tuber)
find(kreplach) returns kreplach: tasty stuffed dough

28 Dictionary Implementations
Arrays:
– Unsorted
– Sorted
Linked lists
BST
– Random
– AVL
– Splay

29 Dictionary Implementations
          Arrays                      Lists          Binary Search Trees
          unsorted       sorted                      AVL         splay
insert    O(1)           O(n)         O(1)           O(log n)    O(log n) amortized
find      O(n)           O(log n)     O(n)           O(log n)    O(log n) amortized
delete    find + O(1)    O(n)         find + O(1)    O(log n)    O(log n) amortized
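
To make the sorted-array column concrete, here is a small sketch of the dictionary ADT on top of two parallel sorted Python lists. The class name `SortedArrayDict` and its method names are illustrative choices that follow the ADT operations above; `bisect.bisect_left` from the standard library does the binary search.

```python
import bisect


class SortedArrayDict:
    """Dictionary ADT backed by two parallel lists kept sorted by key."""

    def __init__(self):
        self._keys = []
        self._values = []

    def _locate(self, key):
        # Binary search: O(log n) comparisons.
        i = bisect.bisect_left(self._keys, key)
        found = i < len(self._keys) and self._keys[i] == key
        return i, found

    def find(self, key):
        i, found = self._locate(key)
        return self._values[i] if found else None

    def insert(self, key, value):
        i, found = self._locate(key)
        if found:
            self._values[i] = value
        else:
            # list.insert shifts the tail of the list: O(n) in the worst case.
            self._keys.insert(i, key)
            self._values.insert(i, value)

    def delete(self, key):
        i, found = self._locate(key)
        if found:
            # Deleting also shifts the tail: O(n) in the worst case.
            del self._keys[i]
            del self._values[i]


d = SortedArrayDict()
d.insert("kreplach", "tasty stuffed dough")
d.insert("kohlrabi", "upscale tuber")
d.insert("kim chi", "spicy cabbage")
print(d.find("kreplach"))
d.delete("kohlrabi")
print(d.find("kohlrabi"))
```

The find is cheap but every insert or delete may move a linear number of entries, which is the trade-off the table records for sorted arrays.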

30 The last dictionary we discuss: B-Trees
Suppose we want to store the data on disk
A disk access is a lot more expensive than one CPU operation
Example
– 1,000,000 entries in the dictionary
– An AVL tree requires log2(1,000,000) ≈ 20 disk accesses – this is expensive
Idea in B Trees:
– Increase the fan-out, decrease the height
– Make 1 node = 1 block

31 B-Trees Basics
All keys are stored at leaves
Nonleaf nodes have guidance keys, to help the search
Parameter d = the degree (book uses the order M = 2d+1)
Rules for Keys:
– The root is either a leaf, or has between 1 and 2d keys
– All other nodes (except the root) have between d and 2d keys
Rule for number of children:
– Each node (except leaves) has one more child than it has keys
Balance rule:
– The tree is perfectly balanced!

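32 B-Trees Basics
A non-leaf node:
[Figure: a non-leaf node holding keys; its child pointers cover the ranges k < 30, … <= k < 120, 120 <= k < 240, and 240 <= k]
A leaf node:
[Figure: a leaf node holding keys, with pointers to the record with key 40, the record with key 50, the record with key 60, and to the next leaf]
Then called a B+ tree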

33 B+Tree Example
d = 2 (M = 5)
Find the key …
[Figure: example B+ tree with the search path marked by comparisons against the guidance keys, e.g. “< 40”]

34 B+Tree Design
How large is d?
Example:
– Key size = 4 bytes
– Pointer size = 8 bytes
– Block size = 4096 bytes
2d × 4 + (2d+1) × 8 <= 4096
d = 170
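
The inequality can be solved for d directly: 2d × key + (2d+1) × pointer <= block gives d <= (block - pointer) / (2 × (key + pointer)). The helper below, with the hypothetical name `max_degree`, just evaluates that bound and checks it against the slide's numbers.

```python
def max_degree(block_size: int, key_size: int, pointer_size: int) -> int:
    """Largest d such that 2d keys and 2d+1 pointers fit in one block."""
    # 2d*key + (2d+1)*ptr <= block   <=>   d <= (block - ptr) / (2*(key + ptr))
    return (block_size - pointer_size) // (2 * (key_size + pointer_size))


def node_bytes(d: int, key_size: int, pointer_size: int) -> int:
    """Bytes used by a full node: 2d keys plus 2d+1 pointers."""
    return 2 * d * key_size + (2 * d + 1) * pointer_size


d = max_degree(block_size=4096, key_size=4, pointer_size=8)
print(d, node_bytes(d, 4, 8), node_bytes(d + 1, 4, 8))
```

With these sizes a full node of degree 170 fits in the 4096-byte block, while degree 171 would not.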

35 B+ Trees Depth
Assume d = 170
How deep is the B-tree?
Depth = 0 (just the root): at least 170 keys
Depth = 1: at least about 30 × 10^3 keys
Depth = 2: at least about 5 × 10^6 keys
Depth = 3: at least about 860 × 10^6 keys
Depth = 4: at least about 147 × 10^9 keys
Nobody has more keys!
With a B tree we can find any data item with at most 5 disk accesses!
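
The growth per level is easy to recompute. The sketch below makes a simplifying assumption that is not spelled out in the transcript: every node, the root included, holds at least d keys and so has at least d + 1 children, which gives roughly d × (d + 1)^depth keys at the leaf level. Under that assumption the results come out at the same order of magnitude as the slide's rounded figures. The function name `approx_min_keys` is made up for illustration.

```python
import math

D = 170  # degree from the block-size example


def approx_min_keys(depth: int, d: int = D) -> int:
    """Rough lower bound on leaf keys, assuming every node has >= d keys."""
    leaves = (d + 1) ** depth          # each level multiplies the node count by d + 1
    return d * leaves                  # each leaf holds at least d keys


for depth in range(5):
    print(depth, f"{approx_min_keys(depth):.2e}")

# For comparison: levels a balanced binary tree needs for one million keys.
binary_levels = math.ceil(math.log2(1_000_000))
print(binary_levels)
```

A depth-4 tree means five nodes on a root-to-leaf path, hence the five disk accesses on the slide, against about twenty for a balanced binary tree over just one million keys.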

36 Insertion in a B+ Tree
Insert (K, P)
Find leaf where K belongs, insert
If no overflow (2d keys or less), halt
If overflow (2d+1 keys), split node, insert in parent:
– If leaf, keep K3 too in right node
– When root splits, new root has 1 key only
[Figure: an overflowing node with keys K1 K2 K3 K4 K5 and pointers P0 P1 P2 P3 P4 P5 splits into a left node (K1 K2; P0 P1 P2) and a right node (K4 K5; P3 P4 P5); K3 is inserted into the parent]
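
The split rule can be sketched on the key list alone. This is an illustration under stated assumptions, not a full B+ tree: pointers are left out, and `split_overflowing` is a hypothetical helper that takes the 2d+1 sorted keys of an overflowing node and returns the left node's keys, the key handed to the parent, and the right node's keys.

```python
from typing import List, Tuple


def split_overflowing(keys: List[int], is_leaf: bool) -> Tuple[List[int], int, List[int]]:
    """Split a node holding 2d+1 sorted keys into (left, key_for_parent, right)."""
    if len(keys) % 2 != 1:
        raise ValueError("an overflowing node holds exactly 2d+1 keys")
    d = len(keys) // 2
    middle = keys[d]                 # K3 when d = 2
    left = keys[:d]                  # K1, K2
    if is_leaf:
        right = keys[d:]             # a leaf keeps K3 too, so the key stays at leaf level
    else:
        right = keys[d + 1:]         # an internal node only passes K3 upward
    return left, middle, right


print(split_overflowing([10, 20, 30, 40, 50], is_leaf=True))
print(split_overflowing([10, 20, 30, 40, 50], is_leaf=False))
```

Keeping the middle key in the right leaf is what preserves the property that all keys are stored at the leaves; in an internal node the middle key is only a guidance key, so it can move up without being duplicated.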