CSE332: Data Abstractions Lecture 6: Dictionaries; Binary Search Trees Tyler Robison Summer 2010 1.

Slides:



Advertisements
Similar presentations
COL 106 Shweta Agrawal and Amit Kumar
Advertisements

Comp 122, Spring 2004 Binary Search Trees. btrees - 2 Comp 122, Spring 2004 Binary Trees  Recursive definition 1.An empty tree is a binary tree 2.A node.
Trees Types and Operations
S. Sudarshan Based partly on material from Fawzi Emad & Chau-Wen Tseng
1 abstract containers hierarchical (1 to many) graph (many to many) first ith last sequence/linear (1 to 1) set.
Binary Trees, Binary Search Trees CMPS 2133 Spring 2008.
CSE332: Data Abstractions Lecture 10: More B-Trees Tyler Robison Summer
CSE332: Data Abstractions Lecture 9: B Trees Dan Grossman Spring 2010.
CSE332: Data Abstractions Lecture 6: Dictionaries; Binary Search Trees Dan Grossman Spring 2010.
CSE332: Data Abstractions Lecture 9: BTrees Tyler Robison Summer
CSE332: Data Abstractions Lecture 7: AVL Trees Tyler Robison Summer
Binary Heaps CSE 373 Data Structures Lecture 11. 2/5/03Binary Heaps - Lecture 112 Readings Reading ›Sections
CSE 326: Data Structures Lecture #7 Branching Out Steve Wolfman Winter Quarter 2000.
CSE 326 Binary Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001.
Version TCSS 342, Winter 2006 Lecture Notes Priority Queues Heaps.
CSE 326: Data Structures Binary Search Trees Ben Lerner Summer 2007.
BST Data Structure A BST node contains: A BST contains
CSE 326: Data Structures Lecture #7 Binary Search Trees Alon Halevy Spring Quarter 2001.
1 abstract containers hierarchical (1 to many) graph (many to many) first ith last sequence/linear (1 to 1) set.
CSE 326: Data Structures Lecture #8 Binary Search Trees Alon Halevy Spring Quarter 2001.
CSE373: Data Structures & Algorithms Lecture 5: Dictionaries; Binary Search Trees Aaron Bauer Winter 2014.
CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees Nicki Dell Spring 2014 CSE373: Data Structures & Algorithms1.
CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees Lauren Milne Summer
CSE373: Data Structures & Algorithms Lecture 7: AVL Trees Nicki Dell Spring 2014 CSE373: Data Structures & Algorithms1.
1 Search Trees - Motivation Assume you would like to store several (key, value) pairs in a data structure that would support the following operations efficiently.
1 Trees A tree is a data structure used to represent different kinds of data and help solve a number of algorithmic problems Game trees (i.e., chess ),
1 Binary Trees Informal defn: each node has 0, 1, or 2 children Informal defn: each node has 0, 1, or 2 children Formal defn: a binary tree is a structure.
CPSC 221: Data Structures Lecture #5 Branching Out Steve Wolfman 2014W1 1.
1 Searching Searching in a sorted linked list takes linear time in the worst and average case. Searching in a sorted array takes logarithmic time in the.
CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued Aaron Bauer Winter 2014 CSE373: Data Structures & Algorithms1.
Programming Abstractions Cynthia Lee CS106X. Topics:  Binary Search Tree (BST) › Starting with a dream: binary search in a linked list? › How our dream.
CSE373: Data Structures & Algorithms Lecture 6: Priority Queues Kevin Quinn Fall 2015.
Week 10 - Friday.  What did we talk about last time?  Graph representations  Adjacency matrix  Adjacency lists  Depth first search.
ADT Binary Search Tree Ellen Walker CPSC 201 Data Structures Hiram College.
Rooted Tree a b d ef i j g h c k root parent node (self) child descendent leaf (no children) e, i, k, g, h are leaves internal node (not a leaf) sibling.
CSE373: Data Structures & Algorithms Lecture 4: Dictionaries; Binary Search Trees Dan Grossman Fall 2013.
CSE373: Data Structures & Algorithms Lecture 7: AVL Trees Linda Shapiro Winter 2015.
Week 15 – Wednesday.  What did we talk about last time?  Review up to Exam 1.
Binary Search Trees.  Understand tree terminology  Understand and implement tree traversals  Define the binary search tree property  Implement binary.
1 Joe Meehean. A A B B D D I I C C E E X X A A B B D D I I C C E E X X  Terminology each circle is a node pointers are edges topmost node is the root.
1 CSE 326: Data Structures Trees Lecture 6: Friday, Jan 17, 2003.
CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees Linda Shapiro Winter 2015.
CSE373: Data Structures & Algorithms Lecture 5: Dictionary ADTs; Binary Trees Lauren Milne Summer 2015.
CSE373: Data Structures & Algorithms Lecture 4: Dictionaries; Binary Search Trees Kevin Quinn Fall 2015.
CS6045: Advanced Algorithms Data Structures. Dynamic Sets Next few lectures will focus on data structures rather than straight algorithms In particular,
BSTs, AVL Trees and Heaps Ezgi Shenqi Bran. What to know about Trees? Height of a tree Length of the longest path from root to a leaf Height of an empty.
CSE373: Data Structures & Algorithms Priority Queues
CSE373: Data Structures & Algorithms Lecture 5: Dictionary ADTs; Binary Trees Linda Shapiro Winter 2015.
CSE373: Data Structures & Algorithms
Instructor: Lilian de Greef Quarter: Summer 2017
Trees Chapter 15.
CSE373: Data Structures & Algorithms More Heaps; Dictionaries; Binary Search Trees Riley Porter Winter 2016 Winter 2017.
CSE373: Data Structures & Algorithms Lecture 5: Dictionary ADTs; Binary Trees Linda Shapiro Spring 2016.
CSE373: Data Structures & Algorithms
CSE373: Data Structures & Algorithms
Binary Search Trees.
CSE373: Data Structures & Algorithms
CSE373: Data Structures & Algorithms Lecture 7: AVL Trees
CSE373: Data Structures & Algorithms Lecture 4: Dictionaries; Binary Search Trees Hunter Zahn Summer 2016 Summer 2016.
Cse 373 April 26th – Exam Review.
Binary Search Trees Why this is a useful data structure. Terminology
CSE373: Data Structures & Algorithms Lecture 7: AVL Trees
Chapter 22 : Binary Trees, AVL Trees, and Priority Queues
Map interface Empty() - return true if the map is empty; else return false Size() - return the number of elements in the map Find(key) - if there is an.
CSE373: Data Structures & Algorithms Lecture 5: Dictionary ADTs; Binary Trees Catie Baker Spring 2015.
CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees
CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees
CMSC 341: Data Structures Priority Queues – Binary Heaps
Heaps and the Heapsort Heaps and priority queues
CSE 332: Data Abstractions Binary Search Trees
Presentation transcript:

CSE332: Data Abstractions Lecture 6: Dictionaries; Binary Search Trees Tyler Robison Summer

Where we are ADTs so far: 1. Stack: push, pop, isEmpty 2. Queue: enqueue, dequeue, isEmpty 3. Priority queue: insert, deleteMin Next: 4. Dictionary: associate keys with values  probably the most common, way more than priority queue  Ex: Binary Search Tree, HashMap 2 LIFO FIFO Min

The Dictionary (a.k.a. Map, a.k.a. Associative Array) ADT  Data:  set of (key, value) pairs  keys must be comparable ( or =)  Primary Operations:  insert(key,val): places (key,val) in map  If key already used, overwrites existing entry  find(key): returns val associated with key  delete(key) 3

Comparison: Set ADT vs. Dictionary ADT The Set ADT is like a Dictionary without any values  A key is present or not (no repeats) For find, insert, delete, there is little difference  In dictionary, values are “just along for the ride”  So same data-structure ideas work for dictionaries and sets  Java HashSet implemented using a HashMap, for instance Set ADT may have other important operations  union, intersection, is_subset  notice these are operators on 2 sets 4

Dictionary data structures Will spend the next week or two looking at three important dictionary data structures: 1. AVL trees  Binary search trees with guaranteed balancing 2. B-Trees  Also always balanced, but different and shallower  B!=Binary; B-Trees generally have large branching factor 3. Hashtables  Not tree-like at all Skipping: Other balanced trees (red-black, splay) 5

A Modest Few Uses Any time you want to store information according to some key and be able to retrieve it efficiently  Lots of programs do that!  Networks: router tables  Compilers: symbol tables  Databases, phone directories, associating username with profile, … 6

Some possible data structures Worst case for dictionary with n key/value pairs insert find delete  Unsorted linked-list  Unsorted array  Sorted linked list  Sorted array We’ll see a Binary Search Tree (BST) probably does better… But not in the worst case unless we keep it balanced *Correction: Given our policy of ‘no duplicates’, we would need to do O(n) work to check for a key’s existence before insertion 7 O(1)* O(n) O(n) O(n) O(n) O(n) O(n) O( log n) O(n)

Some tree terms (review… again)  A tree can be balanced or not  A balanced tree with n nodes has a height of O( log n)  Different tree data structures have different “balance conditions” to achieve this 8 A B DE C FG Balanced: n=7 h=2 B C D E F G A Unbalanced: n=7 h=6 

Binary Trees  Binary tree is empty or  a node (with data), and with  a left subtree (maybe empty)  a right subtree (maybe empty)  Representation: A B DE C F HG JI Data right pointer left pointer For a dictionary, data will include key and a value Ditched this representation for binary heaps, but it’s useful for BST 9

Binary Trees: Some Numbers Recall: height of a tree = longest path from root to leaf (counting # of edges) Operations tend to be a function of height For binary tree of height h:  max # of leaves:  max # of nodes:  min # of leaves:  min # of nodes: For n nodes, we cannot do better than O( log n) height, and we want to avoid O(n) height 2h2h 2 (h+1) – 1 1 h+1 10

Calculating height How do we find the height of a tree with root r ? int treeHeight(Node root) { ??? } 11

Calculating height How do we find the height of a tree with root r ? int treeHeight(Node root) { if(root == null) return -1; return 1 + max(treeHeight(root.left), treeHeight(root.right)); } Running time for tree with n nodes: O(n) – single pass over tree Note: non-recursive is painful – need your own stack of pending nodes; much easier to use recursion’s call stack 12

Tree Traversals A traversal is an order for visiting all the nodes of a tree  Pre-order:root, left subtree, right subtree +*245  In-order:left subtree, root, right subtree 2*4+5  Post-order: left subtree, right subtree, root 24*5+ + * 24 5 Expression tree 13

More on traversals void inOrdertraversal(Node t){ if(t != null) { traverse(t.left); process(t.element); traverse(t.right); } Sometimes order doesn’t matter Example: sum all elements Sometimes order matters Example: print tree with parent above indented children (pre-order) Example: print BST values in order (in- order) A B D E C F G A B DE C FG 14

Binary Search Tree  Structural property (“binary”)  each node has  2 children  Order property  all keys in left subtree smaller than node’s key  all keys in right subtree larger than node’s key  result: easy to find any given key

Are these BSTs?

Are these BSTs? Yep Nope

Find in BST, Recursive Data find(Key key, Node root){ if(root == null) return null; if(key < root.key) return find(key,root.left); if(key > root.key) return find(key,root.right); return root.data; } 18 Run-time (for worst-case)?

Find in BST, Iterative Data find(Key key, Node root){ while(root != null && root.key != key) { if(key < root.key) root = root.left; else if(key > root.key) root = root.right; } if(root == null) return null; return root.data; } 19 For iteratively calculating height & doing traversals, we needed a stack. Why do we not need one here?

Other “finding operations”  Find minimum node  Find maximum node  Find predecessor  Find successor

Insert in BST insert(13) insert(8) insert(31) How do we insert k elements to get a completely unbalanced tree? How do we insert k elements to get a balanced tree?

Lazy Deletion A general technique for making delete as fast as find :  Instead of actually removing the item just mark it deleted “Uh, I’ll do it later” Plusses:  Simpler  Can do removals later in batches  If re-added soon thereafter, just unmark the deletion Minuses:  Extra space for the “is-it-deleted” flag  Data structure full of deleted nodes wastes space  Can hurt run-times of other operations We’ll see lazy deletion in use later   22

(Non-lazy) Deletion in BST Why might deletion be harder than insertion? 10 23

Deletion  Removing an item disrupts the tree structure  Basic idea: find the node to be removed, then “fix” the tree so that it is still a binary search tree  Three cases:  node has no children (leaf)  node has one child  node has two children 24

Deletion – The Leaf Case delete(17) Just remove it

Deletion – The One Child Case delete(15) 26 Replace it with its child

Deletion – The Two Child Case What can we replace 5 with? 10 delete(5) 27

Deletion – The Two Child Case Idea: Replace the deleted node with a value guaranteed to be between the two child subtrees Options:  successor from right subtree: findMin(node.right)  predecessor from left subtree: findMax(node.left)  These are the easy cases of predecessor/successor Now delete the original node containing successor or predecessor  Leaf or one child case – easy cases of delete! 28

BuildTree for BST  BuildHeap equivalent for trees  Insert keys 1, 2, 3, 4, 5, 6, 7, 8, 9 into an empty BST  In order (and reverse order) not going to work well  Try a different ordering  median first, then left median, right median, etc.  5, 3, 7, 2, 1, 4, 8, 6, 9  What tree does that give us?  What big-O runtime? O(n log n), definitely better 29

Unbalanced BST  Balancing a tree at build time is insufficient, as sequences of operations can eventually transform that carefully balanced tree into the dreaded list  At that point, everything is O(n)   find  insert  delete

Balanced BST Observation  BST: the shallower the better!  For a BST with n nodes inserted in arbitrary order  Average height is O( log n) – see text for proof  Worst case height is O(n)  Simple cases such as inserting in key order lead to the worst-case scenario Solution: Require a Balance Condition that 1. ensures depth is always O( log n) – strong enough! 2. is easy to maintain – not too strong! 31

Potential Balance Conditions 1. Left and right subtrees of the root have equal number of nodes 2.Left and right subtrees of the root have equal height Too weak! Height mismatch example: Too weak! Double chain example: 32

Potential Balance Conditions 3. Left and right subtrees of every node have equal number of nodes 4. Left and right subtrees of every node have equal height Too strong! Only perfect trees (2 n – 1 nodes) Too strong! Only perfect trees (2 n – 1 nodes) 33

The AVL Tree Balance Condition Left and right subtrees of every node have heights differing by at most 1 Definition: balance(node) = height(node.left) – height(node.right) AVL property: for every node x, –1  balance(x)  1 That is, heights differ by at most 1  Ensures small depth  Will prove this by showing that an AVL tree of height h must have a number of nodes exponential in h  Easy (well, efficient) to maintain  Using single and double rotations  Perhaps not so easy to code…. 34 Have fun on project 2!