Presentation is loading. Please wait.

Presentation is loading. Please wait.

E.G.M. PetrakisTrees in Main Memory1 Balanced BST  Balanced BSTs guarantee O(logN) performance at all times  the height or left and right sub-trees are.

Similar presentations


Presentation on theme: "E.G.M. PetrakisTrees in Main Memory1 Balanced BST  Balanced BSTs guarantee O(logN) performance at all times  the height or left and right sub-trees are."— Presentation transcript:

1 E.G.M. PetrakisTrees in Main Memory1 Balanced BST  Balanced BSTs guarantee O(logN) performance at all times  the height or left and right sub-trees are about the same  simple BST are O(N) in the worst case  Categories of BSTs  AVL, SPLAY trees: dynamic environment  optimal trees: static environment

2 E.G.M. PetrakisTrees in Main Memory2 AVL Trees  AVL ( Adelson, Lelskii, Landis): the height of the left and right subtrees differ by at most 1  the same for every subtree  Number of comparisons for membership operations  best case: completely balanced  worst case: 1.44 log(N+2)  expected case: logN +.25 <=

3 E.G.M. PetrakisTrees in Main Memory3 AVL Trees  “—” : completely balanced sub-tree  “/” : the left sub-tree is 1 higher  “\” : the right sub-tree is 1 higher

4 E.G.M. PetrakisTrees in Main Memory4 AVL Trees

5 E.G.M. PetrakisTrees in Main Memory5 “//” : the left sub-tree is 2 higher “\\” : the right sub-tree is 2 higher critical node Non AVL Trees

6 E.G.M. PetrakisTrees in Main Memory6 Single Right Rotations  Insertions or deletions may result in non AVL trees => apply rotations to balance the tree T3T3 T1T1T2 Α Β T1T1 T3T3 Α Β

7 E.G.M. PetrakisTrees in Main Memory7 T1T1 Α Β Α T3T3T2 ΑB ΑΑ T3T3 T1T2 Single Left Rotations

8 E.G.M. PetrakisTrees in Main Memory8 4 2 4 2 1 single right rotation 2 1 insert 1 4 8 insert 9 4 8 single left rotation 8 4 4 9 9

9 E.G.M. PetrakisTrees in Main Memory9 Double Left Rotation  Composition of two single rotations (one right and one left rotation) ΑΒ T4 Α A C T1 ΑB T2 T3T3 or T4 Α Β C Α T1T1 or Τ2 T3T3 Α A T3T3 or ΑC Τ1 T4T4T2T2  

10 E.G.M. PetrakisTrees in Main Memory10 Example of Double Left Rotation 4 2 8 7 9 6 \\ / / == == 4 2 7 6 8 9 7 48 26 9   Critical node insert 6

11 E.G.M. PetrakisTrees in Main Memory11 Double Right Rotation  Composition of two single rotations (one left and one right rotation) ΑB ΑΑ T4T4 T1T1 ΑΒ T2 T3T3 or ΑC ΑB T4T4 or A T1 T3 Α C Β T3T3 or ΑA Τ4Τ4Τ4Τ4 T2T2 T1  

12 E.G.M. PetrakisTrees in Main Memory12 Insertion (deletion) in AVL 1.Follow the search path to verify that the key is not already there 2.Insert (delete) the key 3.Retreat along the search path and check the balance factor 4.Rebalance if necessary (see next)

13 E.G.M. PetrakisTrees in Main Memory13 Rebalancing  For every node reached coming up from its left sub-tree after insertion readjust balance factor  ‘=’ becomes ‘/’ => no operation  ‘\’ becomes ‘=’ => no operation  ‘/’ becomes ‘//’ => must be rebalanced!!  The “//” node becomes a “critical node”  Only the path from the critical node to the leaf has to be rebalanced!!  Rotation is applied only at the critical node!

14 E.G.M. PetrakisTrees in Main Memory14 Rebalancing (cont.)  The balance factor of the critical node determines what rotation is to take place  single or double  If the child and the grand child (inserted node) of the critical node are on the same direction (both “/”) => single rotation  Else => double rotation  Rebalance similarly if coming up from the right sub-tree (opposite signs)

15 E.G.M. PetrakisTrees in Main Memory15 Performance  Performance of membership operations on AVL trees:  easy for the worst case!  An AVL tree will never be more than 45% higher that its perfectly balanced counterpart (Adelson, Velskii, Landis):  log(N+1) <= h b (N) <= l.4404log(N+2) – 0.302

16 E.G.M. PetrakisTrees in Main Memory16 Worst case AVL  Sparse tree => each sub-tree has minimum number of nodes

17 E.G.M. PetrakisTrees in Main Memory17 Fibonacci Trees  T h : tree of height h  T h has two sub-trees, one with height h-1 and one with height h-2  else it wouldn’t have minimum number of nodes  T 0 is the empty sub-tree (also Fibonacci)  T 1 is the sub-tree with 1 node (also Fibonacci)

18 E.G.M. PetrakisTrees in Main Memory18 Fibonacci Trees (cont.)  Average height  N h number of nodes of T h  N h = N h-1 + N h-2 + 1  N 0 = 1  N 1 = 2  From which h <= 1.44 log(N+1)

19 E.G.M. PetrakisTrees in Main Memory19 single rotation 7 8 7 8 9 insert 9 8 97 More Examples

20 E.G.M. PetrakisTrees in Main Memory20 double rotation 6 8 6 8 7 insert 7 7 86 Examples (cont.)

21 E.G.M. PetrakisTrees in Main Memory21 double rotation 8 6 8 6 7 insert 7 7 86 Examples (cont.)

22 E.G.M. PetrakisTrees in Main Memory22 delete 6 6 7 9 8 7 9 8 8 9 7 single rotation Examples (cont.)

23 E.G.M. PetrakisTrees in Main Memory23 5 6 9 8 6 9 8 8 9 6 delete 5 7 7 7 single rotation Examples (cont.)

24 E.G.M. PetrakisTrees in Main Memory24 5 6 8 6 8 7 8 6 delete 5 7 7 double rotation Examples (cont.)

25 E.G.M. PetrakisTrees in Main Memory25 5 3 4 8 7 1010 9 6 11 2 delete 4 5 2 3 8 7 1010 9 6 11 1 1 delete 8 General Deletions

26 E.G.M. PetrakisTrees in Main Memory26 delete 8 delete 5 5 2 3 7 6 1010 9 11 1 delete 6 5 2 3 10 7 1 9 1 General Deletions (cont.)

27 E.G.M. PetrakisTrees in Main Memory27 delete 5 3 2 10 7 1 9 1 General Deletions (cont.)

28 E.G.M. PetrakisTrees in Main Memory28 Self Organizing Search  Splay Trees: adapt to query patterns  move to root (front): whenever a node is accessed move it to the root using rotations  equivalent to move-to-front in arrays  current node  insertions: inserted node  deletions: father of deleted node or null if this is the root  membership: the last accessed node

29 E.G.M. PetrakisTrees in Main Memory29 20 15 30 13 10 14 8 Search(10) first rotation 20 15 30 10 813 14 second rotation 20 3010 158 13 14 third rotation 10 820 3015 13 14

30 E.G.M. PetrakisTrees in Main Memory30 Splay Cases  If the current node q has no grandfather but it has father p => only one single rotation –two symmetric cases: L, R q a bc q ab c p p

31 E.G.M. PetrakisTrees in Main Memory31 gp q p q a b cd p ab c d LL gp p q a bc d p q a b c d RL LL symmetric of RR RL symmetric of RL  If p has also grandfather qp => 4 cases

32 E.G.M. PetrakisTrees in Main Memory32 1 aq 8 j i 2 7 h 6 g 3 c 4 d 5 e f current node 1 αq 8 j i 2 7 h 6 g c 4 d 5 e f 3 RR LL

33 E.G.M. PetrakisTrees in Main Memory33 1 aq 8 j i 2 7 6 g 3 c 4 5 e f b LR 1 αq 8 j i 2 7 6 h 3 c 4 5 e f d g RL a, b, c … are sub-trees

34 E.G.M. PetrakisTrees in Main Memory34 1 a q 8j i 2 7 6 h 3 c 4 5 e f b g d

35 E.G.M. PetrakisTrees in Main Memory35 Splay Performance  Splay trees adapt to unknown or changing probability distributions  Splay trees do not guarantee logarithmic cost for each access  AVL trees do!  asymptotic cost close to the cost of the optimal BST for unknown probability distributions  It can be shown that the “cost of m operations on an initially empty splay tree, where n are insertions is O(mlogn) in the worst case”

36 E.G.M. PetrakisTrees in Main Memory36 Optimal BST  Static environment: no insertions or deletions  Keys are accessed with various frequencies  Have the most frequently accessed keys near the root  Application: a symbol table in main memory

37 E.G.M. PetrakisTrees in Main Memory37 Searching  Given symbols a 1 < a 2 < ….< a n and their probabilities: p 1, p 2, … p n minimize cost  Successful search  Transform unsuccessful to successful  consider new symbols E 1, E 2, … E n -  … α 1 … α 2 …α i … α i + 1 ….…α n … α n + 1 E0E0 E1E1 E2E2 EiEi EnEn E i = (α i, α i + 1 ) E 0 = (- , α 1 )E n = (α n,  )

38 E.G.M. PetrakisTrees in Main Memory38 anan a n- 1 a n-2 EiEiEiEi failure node a n-3 unsuccessful search for all values in E i terminates on the same failure node (in fact, one node higher) Unsuccessful Search

39 E.G.M. PetrakisTrees in Main Memory39 (a 1, a 2, a 3 ) = (do, if, read) p i = q i = 1/7read read read if if if do do do cost = 15/7 cost = 13/7 cost = 15/7 Optimal BST if Example

40 E.G.M. PetrakisTrees in Main Memory40 read if do if read do cost = 15/7

41 E.G.M. PetrakisTrees in Main Memory41 Search Cost  If p i is the probability to search for a i and q i is the probability to search in E i then successful search unsuccessful search

42 E.G.M. PetrakisTrees in Main Memory42  In a BST, a subtree has nodes that are consecutive in a sorted sequence of keys (e.g. [5,26]) 1025 132426 1214 2020 25 132426 14 5 Observation 1

43 E.G.M. PetrakisTrees in Main Memory43  If T ij is a sub-tree of an optimal BST holding keys from i to j then  T ij must be optimal among all possible BSTs that store the same keys  optimality lemma: all sub-trees of an optimal BST are also optimal Observation 2

44 E.G.M. PetrakisTrees in Main Memory44 Optimal BST Construction 1)Construct all possible trees: NP-hard, there are trees!! 2)Dynamic programming solves the problem in polynomial time O(n 3 )  at each step, the algorithm finds and stores the optimal tree in each range of key values  increases the size of the range at each step until the whole range is obtained

45 E.G.M. PetrakisTrees in Main Memory45 Example (successful search only): 1)BSTs with1 node range 1102040 cost= 0.30.20.10.4 2)BSTs with2 nodesk=1-10 k=10-20 k=20-20 range 1-10range 20-40range 10-20 1 10 20 40 cost=0.3 1+0.2 2=0.7 cost=0.2+0.1 2=0.4 10 1 40 20 10 cost=0.2 1+0,3.2=0.8cost=0.1+0.2 2=0.5 optimal keys1102040 probabilities0,30,20,10,4 cost=0.1+0.8=0.9 cost=0.4+0.2=0.6 optimal

46 E.G.M. PetrakisTrees in Main Memory46 k=1-20 range 1-20 1 20 10 cost=0.1+2 0.3+3 0.2=1.3 1 20 10 cost=0.2+2(0.3+0.1)=1 1 10 20 k=10-40 range 10-40 10 40 20 cost=0.2+2 0.4+3 0.1=1.3 cost=0.3+2 0.2+3 0.1=1 20 1040 cost=0.1+2(0.2+0.4)=1.3 40 10 20 cost=0.4+2 0.2+30.1=1.1 optimal 3)BSTs with3 nodes

47 E.G.M. PetrakisTrees in Main Memory47 4)BSTs with4 nodes range 1-40 1 40 10 20 cost=0.3+2 0.4+3 0.2+4 0.1=2.1 10 1 40 20 cost=0.2+2(0.3+0.4)+3 0.1=1.9 20 40 1 1 10 OPTIMAL BST cost=0.1+2(0.3+0.4)+3 0.2=2.1 cost=0.4+2 0.3+3 0.2+4 0.1=2

48 E.G.M. PetrakisTrees in Main Memory48 Complexity  Compute all optimal BSTs for all C ij, i,j=1,2..n  Let m=j-i: number of keys in range C ij  n-m-1 C ij ’s must be computed  The one with the minimum cost must be found, this takes O(m(n-m-1)) time  For all C ij ’s it takes  There is a better O(n 2 ) algorithm by Knuth  There is also a O(n) greedy algorithm

49 E.G.M. PetrakisTrees in Main Memory49 Optimal BSTs  High probability keys should be near the root  But, the value of a key is also a factor  It may not be desirable to put the smallest or largest key close to the root => this may result in skinny trees (e.g., lists)


Download ppt "E.G.M. PetrakisTrees in Main Memory1 Balanced BST  Balanced BSTs guarantee O(logN) performance at all times  the height or left and right sub-trees are."

Similar presentations


Ads by Google