Advanced Implementation of Tables


Advanced Implementation of Tables Chapter 12

Chapter 12 -- Advanced Implementations of Trees Although we described the advantages of using the binary search tree, the efficiency of this implementation suffers when the tree loses its balance. This chapter introduces various search trees, which remain balanced in all situations. CS 308 Chapter 12 -- Advanced Implementations of Trees

Balanced Search Trees

As we saw in a previous chapter, the efficiency of the binary search tree is related to the tree’s height. The operations Retrieve, Insert, and Delete follow the path from the root of the tree to the node that contains the desired item (or to the parent of the item in the case of Insert). As we learned, the height of a BST is sensitive to the order in which you insert the items.

Consider a tree with the following nodes: 10, 20, 30, 40, 50, 60, and 70. Depending upon the insertion order, you could end up with anything from a balanced tree of three levels to a degenerate chain of seven levels.
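The effect of insertion order can be checked with a small sketch. The `BstNode` type and the `insert`, `height`, and `heightAfterInserting` helpers are hypothetical names introduced here for illustration, not code from the chapter. Height is counted in nodes along the longest root-to-leaf path.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// Hypothetical minimal BST node, for illustration only.
struct BstNode {
    int key;
    std::unique_ptr<BstNode> left, right;
    explicit BstNode(int k) : key(k) {}
};

// Standard unbalanced BST insertion (duplicates go to the right).
void insert(std::unique_ptr<BstNode>& root, int key) {
    if (!root) {
        root = std::make_unique<BstNode>(key);
    } else if (key < root->key) {
        insert(root->left, key);
    } else {
        insert(root->right, key);
    }
}

// Height counted in nodes: an empty tree has height 0.
int height(const std::unique_ptr<BstNode>& root) {
    if (!root) return 0;
    return 1 + std::max(height(root->left), height(root->right));
}

// Builds a BST by inserting the keys in the given order and returns its height.
int heightAfterInserting(const std::vector<int>& keys) {
    std::unique_ptr<BstNode> root;
    for (int k : keys) insert(root, k);
    return height(root);
}
```

Under these assumptions, inserting 10, 20, ..., 70 in sorted order gives a chain of height 7, while the order 40, 20, 60, 10, 30, 50, 70 gives a tree of height 3.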


Nodes of a 2-3 tree are similar to BST nodes.

A 2-3 tree could look like this: [figure: an example 2-3 tree]

Traversal: similar to a BST.

Searching: similar to a BST.

Inserting

Insertion is quite different. Let’s start with the following trees. To the left is a BST, to the right is a 2-3 tree with the same elements. [figure: a BST and a 2-3 tree holding the same items]

Insert 39, 38, 37, … , 32 (in that order). Here is what the trees will look like: [figure: both trees after the insertions]

So, let’s walk through the insertions and figure out how it works.

Insert 39: fairly straightforward. Now, what happens when you insert 38?

This is a little more difficult. You can’t insert a third value into a node, so you have to split it (and push up the middle value). So, split 38, 39, 40 and push up 39.

Insert 37: fairly straightforward.

Insert 36: this also creates a 3-value node, so you have to split it and push up the middle, which creates another 3-value node, so you split that one too and push up its middle.

Insert 35, 34, and 33. Now insert 32. [figure: the 2-3 tree after inserting 32]

The Algorithm

Locate the leaf node to insert into. If the leaf contains only one item before the insert (two after), you are done. If it contains two before (three after), you need to split the node and push the middle item up.

This process (pushing up) continues until a node is reached that has only one item in it before the push up. [figure: splitting an internal node]

Sometimes you may have to split all the way up to the root node.
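The algorithm above can be sketched as a recursive insertion that reports a split to its caller. This is a minimal illustration under stated assumptions, not the chapter's implementation: `Node23`, `insertBelow`, `insert23`, and `height23` are hypothetical names, items are plain `int` keys, and a node is allowed to hold three keys for a moment before it is split.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical 2-3 tree node: 1 or 2 keys; a leaf has no children,
// an internal node has keys.size() + 1 children.
struct Node23 {
    std::vector<int> keys;
    std::vector<std::unique_ptr<Node23>> kids;
};

// Inserts key below n. If n overflows to three keys it is split:
// n keeps the smallest key, the returned node takes the largest,
// and the middle key is handed to the caller through midUp.
// Returns nullptr when no split reaches this level.
std::unique_ptr<Node23> insertBelow(Node23& n, int key, int& midUp) {
    std::size_t i = 0;
    while (i < n.keys.size() && key > n.keys[i]) ++i;

    if (n.kids.empty()) {
        n.keys.insert(n.keys.begin() + i, key);           // leaf: add the item
    } else {
        int childMid = 0;
        auto childRight = insertBelow(*n.kids[i], key, childMid);
        if (!childRight) return nullptr;                   // child absorbed it
        n.keys.insert(n.keys.begin() + i, childMid);      // accept pushed-up key
        n.kids.insert(n.kids.begin() + i + 1, std::move(childRight));
    }
    if (n.keys.size() < 3) return nullptr;                 // still a legal node

    auto right = std::make_unique<Node23>();
    right->keys.push_back(n.keys[2]);
    if (!n.kids.empty()) {                                 // splitting an internal node
        right->kids.push_back(std::move(n.kids[2]));
        right->kids.push_back(std::move(n.kids[3]));
        n.kids.resize(2);
    }
    midUp = n.keys[1];
    n.keys.resize(1);
    return right;
}

void insert23(std::unique_ptr<Node23>& root, int key) {
    if (!root) {
        root = std::make_unique<Node23>();
        root->keys.push_back(key);
        return;
    }
    int mid = 0;
    auto right = insertBelow(*root, key, mid);
    if (right) {                                           // the split reached the root
        auto newRoot = std::make_unique<Node23>();
        newRoot->keys.push_back(mid);
        newRoot->kids.push_back(std::move(root));
        newRoot->kids.push_back(std::move(right));
        root = std::move(newRoot);
    }
}

// Height in nodes; every leaf of a 2-3 tree is at the same depth,
// so following the leftmost path is enough.
int height23(const Node23* n) {
    int h = 0;
    for (; n; n = n->kids.empty() ? nullptr : n->kids[0].get()) ++h;
    return h;
}
```

With this sketch, inserting 10, 20, ..., 70 in sorted order produces a tree of height 3 with 40 at the root, where the same order would degenerate a plain BST into a chain.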

Deletion

This is the inverse of the insertion. Start with the following: [figure: the initial 2-3 tree]

Now delete 70.

Step 1: move the item to a leaf by swapping it with its inorder successor.

Step 2: delete and fix from there.

Now delete 100. [figure: the tree after deleting 100]

Now delete 80. [figure: the tree after deleting 80]

The results of deleting 70, 100, and 80 from a BST and a 2-3 tree: [figure: both trees after the deletions]

Deletion Overview


Insertion into a 2-3-4 tree

Insert 60, 30, 10, and 20 into an empty tree. Now insert 50, 40, 70, 80, 15, 90, and 100.

50 and 40 are easy. Now insert 70.

Insert 70: split the 40-50-60 node (push the middle up), then insert 70. Insert 80 and 15 (easy). Now insert 90.

Insert 90: split the 60-70-80 node and push the middle up, then insert 90. Now insert 100. There is a trick here.

Insert 100: split the 4-node at the root, then do the insertion as before. The thing to remember: you split every 4-node on your way down for an insertion.

Splitting a 4-node at the root. [figure]

Splitting a 4-node whose parent is a 2-node. [figure]

Splitting a 4-node whose parent is a 3-node. [figure]
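The rule of splitting every 4-node on the way down can be sketched as follows. `Node234`, `splitChild`, and `insert234` are hypothetical names for this illustration, and keys are plain `int` values; the chapter's own classes are not shown in the slides.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical 2-3-4 tree node: 1 to 3 sorted keys; a leaf has no children.
struct Node234 {
    std::vector<int> keys;
    std::vector<std::unique_ptr<Node234>> kids;
    bool isLeaf() const { return kids.empty(); }
    bool isFour() const { return keys.size() == 3; }
};

// Splits the 4-node parent.kids[i]. The parent must not be a 4-node itself,
// which the top-down strategy guarantees.
void splitChild(Node234& parent, std::size_t i) {
    Node234& child = *parent.kids[i];
    auto right = std::make_unique<Node234>();
    right->keys.push_back(child.keys[2]);                 // largest key moves right
    if (!child.isLeaf()) {
        right->kids.push_back(std::move(child.kids[2]));
        right->kids.push_back(std::move(child.kids[3]));
        child.kids.resize(2);
    }
    int mid = child.keys[1];
    child.keys.resize(1);                                 // smallest key stays
    parent.keys.insert(parent.keys.begin() + i, mid);     // middle key moves up
    parent.kids.insert(parent.kids.begin() + i + 1, std::move(right));
}

void insert234(std::unique_ptr<Node234>& root, int key) {
    if (!root) {
        root = std::make_unique<Node234>();
        root->keys.push_back(key);
        return;
    }
    if (root->isFour()) {                                 // splitting at the root
        auto newRoot = std::make_unique<Node234>();
        newRoot->kids.push_back(std::move(root));
        splitChild(*newRoot, 0);
        root = std::move(newRoot);
    }
    Node234* n = root.get();
    while (true) {
        std::size_t i = 0;
        while (i < n->keys.size() && key > n->keys[i]) ++i;
        if (n->isLeaf()) {                                // room is guaranteed here
            n->keys.insert(n->keys.begin() + i, key);
            return;
        }
        if (n->kids[i]->isFour()) {                       // split before descending
            splitChild(*n, i);
            if (key > n->keys[i]) ++i;                    // pick the correct half
        }
        n = n->kids[i].get();
    }
}
```

Running the slide's sequence (60, 30, 10, 20, 50, 40, 70, 80, 15, 90, 100) through this sketch splits the 40-50-60 node when 70 arrives, the 60-70-80 node when 90 arrives, and the root when 100 arrives, leaving 50 alone in the root.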

Deleting from a 2-3-4 tree

The deletion algorithm has the same beginning as deletion from a 2-3 tree:
  Locate the node that contains item I.
  Locate the inorder successor of I.
  Swap the two items, so that I sits in a leaf.
  If the leaf with I is a 3-node or a 4-node, just remove I.
  Otherwise restructure: combine nodes and then remove I.

Red-Black Trees

A red-black tree represents a 2-3-4 tree as a BST with colored links. 4-nodes are easy: [figure: red-black representation of a 4-node]

3-nodes have options: [figure: alternative red-black representations of a 3-node]

Example: [figure: a 2-3-4 tree and the corresponding red-black tree]

Code for a Red-Black Tree Node

    enum Color {RED, BLACK};

    class TreeNode {
    private:
        TreeItemType Item;
        TreeNode *left, *right;
        Color leftColor, rightColor;   // colors of the links to the two children
        friend class RedBlackTree;
    };

Searching and traversing a Red-Black Tree

Since a red-black tree is a binary search tree, you can search and traverse it by using the algorithms for a binary search tree (you simply ignore the color of the pointers).

Inserting

Since a red-black tree actually represents a 2-3-4 tree, you simply need to adjust the 2-3-4 insertion algorithms to accommodate the red-black representation.
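Searching, as noted above, is ordinary BST search with the link colors ignored. The sketch below assumes a simplified stand-in for the slide's node: `RbNode` is a hypothetical struct with public members, and the item type is taken to be `int` so that the example is self-contained.

```cpp
enum Color { RED, BLACK };

// Assumed variant of the slide's TreeNode: public members, int items.
struct RbNode {
    int item;
    RbNode* left;
    RbNode* right;
    Color leftColor;
    Color rightColor;
};

// Ordinary BST search; leftColor and rightColor are never consulted.
// Returns the node holding key, or nullptr if key is absent.
const RbNode* search(const RbNode* node, int key) {
    while (node != nullptr && node->item != key) {
        node = (key < node->item) ? node->left : node->right;
    }
    return node;
}
```

The colors only matter to insertion and deletion, which have to preserve the correspondence with the underlying 2-3-4 tree.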

Inserting (cont.)

Recall that you split the 4-nodes you encounter on insertion.

If you split a 4-node whose parent is a 2-node, you push the middle item up into the parent. [figure]
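In the colored-link representation, this split needs only color changes. The sketch below is an illustration under assumptions: `RbNode` is a hypothetical public-member version of the slide's node with `int` items, a 4-node is taken to be a node whose two child links are both red, and the parent is assumed to be a 2-node (or absent, when the 4-node is the root). The 3-node parent cases, which can need more than color changes, are not covered.

```cpp
enum Color { RED, BLACK };

// Assumed variant of the slide's TreeNode: public members, int items.
struct RbNode {
    int item;
    RbNode* left;
    RbNode* right;
    Color leftColor;
    Color rightColor;
};

// A 4-node appears as a node whose two child links are both red.
bool isFourNode(const RbNode& n) {
    return n.left != nullptr && n.right != nullptr &&
           n.leftColor == RED && n.rightColor == RED;
}

// Splits the 4-node rooted at m. parent is the node whose link leads to m,
// or nullptr when m is the root. Assumes parent, if present, is a 2-node.
void splitFourNode(RbNode* parent, RbNode& m) {
    m.leftColor = BLACK;     // the two outer items become 2-nodes of their own
    m.rightColor = BLACK;
    if (parent == nullptr) return;            // root split: nothing to push into
    if (parent->left == &m) {
        parent->leftColor = RED;              // middle item joins the parent
    } else {
        parent->rightColor = RED;
    }
}
```

After the call, the parent has one red link and so represents a 3-node, which matches pushing the middle item up in the 2-3-4 tree.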

Splitting a 4-node whose parent is a 3-node is a little more complicated; there are a few cases. [figures: Case 1, Case 2, and Case 3]

AVL Trees

This chapter also introduces AVL trees, named for their inventors, Adel’son-Vel’skii and Landis.

Hashing

Hashing enables access to table items in time that is relatively constant and independent of the items.

Hash function: maps the search key of a table item into a location that will contain the item.

Hash table: an array that contains the table items, as assigned by a hash function.

A perfect hash function maps each search key into a unique location. This is possible if all the search keys are known.

Collisions occur when the hash function maps more than one item into the same array location.

Collision-resolution schemes assign locations in the hash table to items with different search keys when the items are involved in a collision.

Requirements for a hash function

A hash function should:
  Be easy and fast to compute.
  Place items evenly throughout the hash table.
  Involve the entire search key in its calculation.
  Use a prime base if it uses modulo arithmetic.

Simple hash functions

  Selecting digits: does not distribute items evenly.
  Folding.
  Modulo arithmetic: the table size should be prime.
  Converting a character string to an integer: if the search key is a character string, it can be converted into an integer before the hash function is applied.
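The simple hash functions listed above might look like the following sketch. The function names are hypothetical, folding is shown in its simplest form (adding the decimal digits), and the string conversion uses Horner's rule with a multiplier of 31, which is an assumption made here rather than the chapter's stated scheme.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Modulo arithmetic: h(key) = key mod tableSize. tableSize should be prime.
std::size_t hashModulo(unsigned long key, std::size_t tableSize) {
    assert(tableSize > 0);
    return key % tableSize;
}

// Folding: add the decimal digits of the key, then reduce modulo the table size.
std::size_t hashFolding(unsigned long key, std::size_t tableSize) {
    assert(tableSize > 0);
    unsigned long sum = 0;
    for (; key > 0; key /= 10) sum += key % 10;
    return sum % tableSize;
}

// String keys: fold the characters into an integer with Horner's rule,
// reducing at every step so the intermediate value stays small.
// The multiplier 31 is an arbitrary illustrative choice.
std::size_t hashString(const std::string& key, std::size_t tableSize) {
    assert(tableSize > 0);
    std::size_t h = 0;
    for (unsigned char c : key) h = (h * 31 + c) % tableSize;
    return h;
}
```

For example, with a prime table size of 101, `hashModulo(4567, 101)` and `hashModulo(7597, 101)` both yield 22: two different search keys map to the same location, which is a collision.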

Resolving Collisions

Approach 1: Open addressing

Open addressing is a category of collision-resolution schemes that probe for an empty, or open, location in the hash table. The size of the hash table has to be increased when the table becomes full.

Linear probing: searches the hash table sequentially, starting from the original location specified by the hash function.

Quadratic probing: searches the hash table beginning with the original location that the hash function specifies and continues at increments of 1^2, 2^2, 3^2, and so on.

Double hashing: uses two hash functions; the second determines the step between probes. This technique is also called rehashing.
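The three probe sequences differ only in how far the i-th probe lands from the home location. In this sketch all function names are hypothetical, and `secondHash` (7 minus the key mod 7) is just one illustrative choice of a second hash function that never returns zero.

```cpp
#include <cassert>
#include <cstddef>

// Location examined on the i-th probe; i = 0 is the home location
// computed by the primary hash function.
std::size_t linearProbe(std::size_t home, std::size_t i, std::size_t tableSize) {
    assert(tableSize > 0);
    return (home + i) % tableSize;            // home, home+1, home+2, ...
}

std::size_t quadraticProbe(std::size_t home, std::size_t i, std::size_t tableSize) {
    assert(tableSize > 0);
    return (home + i * i) % tableSize;        // home, home+1^2, home+2^2, ...
}

// Double hashing: the step comes from a second hash function and must not
// be a multiple of the table size, or the probe would never move.
std::size_t doubleHashProbe(std::size_t home, std::size_t step,
                            std::size_t i, std::size_t tableSize) {
    assert(tableSize > 0 && step % tableSize != 0);
    return (home + i * step) % tableSize;     // home, home+step, home+2*step, ...
}

// Illustrative second hash function: always in the range 1..7.
std::size_t secondHash(unsigned long key) {
    return 7 - key % 7;
}
```

With a table of size 11 and key 58, the home location is 58 mod 11 = 3 and the step is `secondHash(58)` = 5, so double hashing probes locations 3, 8, 2, and so on.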

Resolving Collisions, Approach 2: Restructuring the hash table

The hash table can accommodate more than one item in the same location.

Buckets: each location in the hash table is itself an array.

Separate chaining: each hash table location is a linked list. This successfully resolves collisions, and the size of the ADT table is dynamic.
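Separate chaining can be sketched with an array of linked lists. `ChainedTable` and its members are hypothetical names, keys and values are plain integers, and the hash function is simple modulo arithmetic; none of this is taken from the chapter's own table class.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <utility>
#include <vector>

// Minimal separate-chaining table: each array location holds a linked list
// of (key, value) pairs whose keys hash to that location.
class ChainedTable {
public:
    explicit ChainedTable(std::size_t tableSize) : chains_(tableSize) {
        assert(tableSize > 0);
    }

    // Inserts a new pair, or overwrites the value if the key is present.
    void insert(unsigned key, int value) {
        auto& chain = chains_[index(key)];
        for (auto& kv : chain) {
            if (kv.first == key) { kv.second = value; return; }
        }
        chain.emplace_front(key, value);      // colliding items share the chain
        ++count_;
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const int* find(unsigned key) const {
        for (const auto& kv : chains_[index(key)]) {
            if (kv.first == key) return &kv.second;
        }
        return nullptr;
    }

    // Removes the key if present; reports whether anything was removed.
    bool remove(unsigned key) {
        auto& chain = chains_[index(key)];
        for (auto it = chain.begin(); it != chain.end(); ++it) {
            if (it->first == key) { chain.erase(it); --count_; return true; }
        }
        return false;
    }

    std::size_t size() const { return count_; }

private:
    std::size_t index(unsigned key) const { return key % chains_.size(); }

    std::vector<std::list<std::pair<unsigned, int>>> chains_;
    std::size_t count_ = 0;
};
```

Because each chain grows as needed, the table can hold more items than it has array locations, at the cost of longer chains to search.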


The Efficiency of Hashing

An analysis of the average-case efficiency uses the load factor α: the ratio of the current number of items in the table to the maximum size of the array table. The load factor measures how full a hash table is and should not exceed 2/3.

Hashing efficiency for a particular search also depends on whether the search is successful. Unsuccessful searches generally require more time than successful searches.
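The load factor and its effect can be made concrete with the standard textbook approximations for the average number of comparisons per search. The slides' own efficiency figures did not survive in this transcript, so the formulas below are an assumption about what the chapter presents, and all function names are hypothetical. The open-addressing formulas require a load factor below 1, and the successful-search formula for double hashing requires it to be above 0.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Load factor: current number of items divided by the table size.
double loadFactor(std::size_t itemCount, std::size_t tableSize) {
    assert(tableSize > 0);
    return static_cast<double>(itemCount) / static_cast<double>(tableSize);
}

// Linear probing (requires a < 1).
double linearSuccessful(double a)   { return 0.5 * (1.0 + 1.0 / (1.0 - a)); }
double linearUnsuccessful(double a) { return 0.5 * (1.0 + 1.0 / ((1.0 - a) * (1.0 - a))); }

// Quadratic probing and double hashing (requires 0 < a < 1).
double doubleHashSuccessful(double a)   { return -std::log(1.0 - a) / a; }
double doubleHashUnsuccessful(double a) { return 1.0 / (1.0 - a); }

// Separate chaining (a may exceed 1).
double chainingSuccessful(double a)   { return 1.0 + a / 2.0; }
double chainingUnsuccessful(double a) { return a; }
```

At a load factor of 0.5, these approximations give 1.5 comparisons for a successful search and 2.5 for an unsuccessful one under linear probing, which illustrates why unsuccessful searches generally cost more.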


Table Traversal: An Inefficient Operation Under Hashing

For many applications, hashing provides the most efficient implementation. Hashing is not efficient for:
  Traversal in sorted order.
  Finding the item with the smallest or largest value in its search key.
  Range queries.
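The contrast shows up when the same operations are written against the C++ standard library's hash table (`std::unordered_map`) and its ordered associative container (`std::map`, typically implemented as a red-black tree). The helper names `smallestKey` and `keysInRange` are hypothetical, chosen for this sketch.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hash table: the items are in no useful order, so finding the smallest
// key means visiting every item.
int smallestKey(const std::unordered_map<int, std::string>& table) {
    assert(!table.empty());
    int best = table.begin()->first;
    for (const auto& kv : table) {
        if (kv.first < best) best = kv.first;
    }
    return best;
}

// Ordered table: the smallest key is simply the first item in sorted order.
int smallestKey(const std::map<int, std::string>& table) {
    assert(!table.empty());
    return table.begin()->first;
}

// Range query on the ordered table: jump to the first key >= lo,
// then walk forward in sorted order until the keys exceed hi.
std::vector<int> keysInRange(const std::map<int, std::string>& table,
                             int lo, int hi) {
    std::vector<int> keys;
    for (auto it = table.lower_bound(lo);
         it != table.end() && it->first <= hi; ++it) {
        keys.push_back(it->first);
    }
    return keys;
}
```

The hash table has no counterpart to `lower_bound`, so a range query on it would again have to examine every item.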

Summary

A 2-3 tree and a 2-3-4 tree are variants of a binary search tree in which balance is easily maintained.

The insertion and deletion algorithms for a 2-3-4 tree are more efficient than the corresponding algorithms for a 2-3 tree.

A red-black tree is a binary tree representation of a 2-3-4 tree that requires less storage than a 2-3-4 tree.

An AVL tree is a binary search tree that is guaranteed to remain balanced.

Hashing as a table implementation calculates where the data item should be rather than searching for it.

A hash function should be extremely easy to compute and should scatter the search keys evenly throughout the hash table.

A collision occurs when two different search keys hash into the same array location.

Hashing as a table implementation does not efficiently support operations that require the table items to be ordered, but it is simpler and faster than balanced search tree implementations when traversals are not important.
