TECH Computer Science: Dynamic Sets and Searching

Presentation transcript:

TECH Computer Science
Dynamic Sets and Searching
Analysis technique
  - Amortized analysis // average cost of each operation in the worst case
Dynamic sets // sets whose membership varies during computation
  - Array doubling
  - Implementing a stack with array doubling
Searching // does the key exist, and if so, where
  - Binary search trees
  - Hashing

Amortized Analysis
Provides the average cost of each operation in the worst case, over a sequence of successive operations.
Aggregate method
  - Show that a sequence of n operations takes worst-case time T(n) in total.
  - In the worst case, the average cost, or amortized cost, per operation is therefore T(n)/n.
Accounting method // spreading a large cost over time
  - amortized cost = actual cost + accounting cost
  - Assign different accounting costs to different operations so that:
    1. the sum of the accounting costs is nonnegative for any legal sequence of operations;
    2. the amortized cost of each operation is feasible to analyze.

Array Doubling
We don't know how big an array we might need when the computation begins.
If there is no more room for inserting new elements:
  - allocate a new array that is twice as large as the current array
  - transfer all the elements into the new array
Let t be the cost of transferring one element.
  - Suppose inserting the (n+1)st element triggers an array doubling.
  - This array-doubling operation costs t*n.
  - The previous array doublings cost t*n/2 + t*n/4 + t*n/8 + …, i.e. less than t*n in total.
  - The total transfer cost is therefore less than 2t*n.
  - The average transfer cost per insert operation is less than 2t.
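The bound above can be checked with a small simulation. This is a minimal sketch, assuming the array starts with capacity 1 and that each transferred element costs one unit (t = 1); the function name `count_transfers` is made up for illustration.

```python
def count_transfers(num_inserts, initial_capacity=1):
    """Return how many element transfers array doubling causes in total."""
    capacity = initial_capacity
    size = 0
    transfers = 0
    for _ in range(num_inserts):
        if size == capacity:       # no room left: double and copy everything
            transfers += size      # each existing element is transferred once
            capacity *= 2
        size += 1
    return transfers

# The total number of transfers stays below 2 * n for every n tried here.
for n in (1, 10, 1000, 1025):
    assert count_transfers(n) < 2 * n
```

Dividing the total by n gives fewer than 2 transfers per insert, which is the aggregate-method view of the same argument.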

Implementing Stack with array doubling
Array doubling is used behind the scenes to enlarge the array as necessary.
Actual cost
  - push or pop costs 1 when no enlarging of the array occurs
  - push costs 1 + t*n when array doubling is required (n elements are transferred)
Accounting scheme
  - the accounting cost of a push is 2t when no enlarging of the array occurs
  - the accounting cost of a push is -t*n + 2t when array doubling is required
The amortized cost of each push operation is therefore 1 + 2t in both cases.
From the time the stack is created, the sum of the accounting costs must never be negative.
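The accounting scheme can be traced in code. The sketch below is illustrative only: the class name `DoublingStack`, the starting capacity of 1, and the `balance` field that accumulates accounting costs are all assumptions, not part of the slides.

```python
class DoublingStack:
    def __init__(self, t=1):
        self.t = t              # cost of transferring one element (assumed unit)
        self.data = [None]      # backing array, assumed initial capacity 1
        self.size = 0
        self.balance = 0        # running sum of accounting costs

    def push(self, x):
        """Push x and return the actual cost of this push."""
        actual = 1
        if self.size == len(self.data):
            n = self.size
            self.data = self.data + [None] * n      # double, copying n elements
            actual += self.t * n
            self.balance += -self.t * n + 2 * self.t
        else:
            self.balance += 2 * self.t
        self.data[self.size] = x
        self.size += 1
        assert self.balance >= 0    # the scheme must never go negative
        return actual

    def pop(self):
        """Pop the top element (assumes the stack is not empty)."""
        self.size -= 1
        x = self.data[self.size]
        self.data[self.size] = None
        return x
```

Summing the values returned by `push` over any sequence gives a total actual cost of at most (1 + 2t) times the number of pushes, because the balance absorbs the difference.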

Searching: Binary Search Trees
Binary search tree property
  - A binary tree whose nodes have keys from an ordered set has the binary search tree property if the key at each node is greater than all the keys in its left subtree and less than or equal to all the keys in its right subtree.
  - In this case the binary tree is called a binary search tree.
An inorder traversal of a binary search tree produces a sorted list of keys.
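A short sketch of the property and the inorder claim follows. The `Node` class and the `insert` and `inorder` helpers are illustrative names; equal keys are sent to the right subtree to match the "less than or equal" half of the definition.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key and return the (possibly new) root of the subtree."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:                                   # equal keys go to the right
        root.right = insert(root.right, key)
    return root

def inorder(root):
    """Return the keys in left-subtree, node, right-subtree order."""
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)

keys = [50, 20, 70, 10, 30, 60, 80, 30]
root = None
for k in keys:
    root = insert(root, k)
assert inorder(root) == sorted(keys)
```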

Binary Search Trees, e.g.
[Figure: binary search trees with different degrees of balance; black dots denote empty trees.]

Binary Search Tree Retrieval
Element bstSearch(BinTree bst, Key K)
    Element found;
    if (bst == nil)
        found = null;
    else
        Element root = root(bst);
        if (K == root.key)
            found = root;
        else if (K < root.key)
            found = bstSearch(leftSubtree(bst), K);
        else
            found = bstSearch(rightSubtree(bst), K);
    return found;

Analysis of Binary Search Tree Retrieval
Measure the work by the number of internal nodes of the tree that are examined while searching for a key.
  - Let n be the number of internal nodes in the tree.
  - For a long chain tree structure, the worst case is Θ(n).
  - For a tree that is as balanced as possible, the worst case is Θ(lg n).
>> The objective is to make the tree as balanced as possible.
  - Technique: binary tree rotations
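The gap between the two cases shows up even on 15 keys. This sketch counts the nodes examined by a search; the helper names (`nodes_examined`, `balanced_order`, `build`) are assumptions made for the example.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root

def balanced_order(sorted_keys):
    """Insertion order (middle key first) that yields a balanced tree."""
    if not sorted_keys:
        return []
    mid = len(sorted_keys) // 2
    return ([sorted_keys[mid]]
            + balanced_order(sorted_keys[:mid])
            + balanced_order(sorted_keys[mid + 1:]))

def nodes_examined(root, key):
    """Count the internal nodes visited while searching for key."""
    count = 0
    while root is not None:
        count += 1
        if key == root.key:
            break
        root = root.left if key < root.key else root.right
    return count

keys = list(range(1, 16))                 # 15 keys
chain = build(keys)                       # sorted insertions give a long chain
balanced = build(balanced_order(keys))    # as balanced as possible
assert nodes_examined(chain, 15) == 15    # linear in n
assert nodes_examined(balanced, 15) == 4  # about lg(n + 1)
```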

Binary Tree Rotations
[Figure: left rotation on (15, 25)]
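Since the figure did not survive, here is a minimal sketch of a left rotation on the edge (15, 25). The surrounding tree (keys 10, 20, 30) is hypothetical, chosen only so that the rotation has subtrees to move; the names `Node` and `rotate_left` are likewise assumptions.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def rotate_left(parent):
    """Rotate the edge (parent, parent.right) left; return the new subtree root."""
    child = parent.right
    parent.right = child.left    # child's left subtree becomes parent's right subtree
    child.left = parent          # parent becomes child's left child
    return child

def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)

# Hypothetical tree: 15 is the root and 25 is its right child.
root = Node(15, Node(10), Node(25, Node(20), Node(30)))
before = inorder(root)
root = rotate_left(root)
assert root.key == 25 and root.left.key == 15
assert inorder(root) == before   # a rotation preserves the sorted order
```

The rotation changes the shape of the tree but not the inorder sequence, which is why it can be used for rebalancing.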

Making binary search trees as balanced as possible
Red-black tree
  - Let T be a red-black tree with n internal nodes. Then no node has depth greater than 2 lg(n + 1).

Hashing to aid searching
Imagine that we could assign a unique array index to every possible key that could occur in an application.
  - Locating, inserting, and deleting elements could then be done very easily and quickly.
  - However, the key space is much too large for this.
The purpose of hashing is to translate (by using a hash function) an extremely large key space into a reasonably small range of integers (called hash codes).
Hash table
  - an array H on indexes (hash codes) 0, …, h-1
  - the hash function maps a key into an integer in the range 0, …, h-1
  - each entry may contain one or more keys!
  - the hash function is a many-to-one function

Hash Table, e.g.
Data k: 1055, 1492, 1776, 1812, 1918, and 1945
Hash function: hashCode(k) = 5k mod 8
hashCode: key
  0: 1776
  1:
  2:
  3: 1055
  4: 1492, 1812 // Collision!
  5: 1945
  6: 1918
  7:
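The table above can be reproduced directly. In this sketch the function name `hash_code` and the dictionary-of-lists layout are illustrative choices.

```python
def hash_code(k, h=8):
    """The example hash function: 5k mod h."""
    return (5 * k) % h

keys = [1055, 1492, 1776, 1812, 1918, 1945]
table = {i: [] for i in range(8)}
for k in keys:
    table[hash_code(k)].append(k)

assert table[4] == [1492, 1812]   # the collision shown in the example
assert table[1] == [] and table[2] == [] and table[7] == []
```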

Handling Collisions: Closed Address Hashing
H[i] is a linked list
hashCode: key
  0: -> 1776
  1: ->
  2: ->
  3: -> 1055
  4: -> 1492 -> 1812
  5: -> 1945
  6: -> 1918
  7: ->
To search for a given key K, first compute its hash code, say i, then search through the linked list at H[i], comparing K with the keys of the elements in the list.
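The search procedure just described might look like the following. This is a sketch, not a definitive implementation: Python lists stand in for the linked lists, the class name `ChainedHashTable` is made up, and the hash function is assumed to be the 5k mod 8 of the earlier example.

```python
class ChainedHashTable:
    def __init__(self, h=8):
        self.h = h
        self.chains = [[] for _ in range(h)]   # lists stand in for linked lists

    def hash_code(self, key):
        return (5 * key) % self.h

    def insert(self, key):
        # New keys are placed at the end of the chain for their hash code.
        self.chains[self.hash_code(key)].append(key)

    def search(self, key):
        """Return (found, number of key comparisons made)."""
        comparisons = 0
        for stored in self.chains[self.hash_code(key)]:
            comparisons += 1
            if stored == key:
                return True, comparisons
        return False, comparisons

table = ChainedHashTable()
for k in (1055, 1492, 1776, 1812, 1918, 1945):
    table.insert(k)

assert table.search(1812) == (True, 2)   # second element in chain 4
assert table.search(5) == (False, 0)     # hash code 1, empty chain
```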

Analysis of Closed Address Hashing
Searching for a key
Basic operation: comparisons
  - Assume that computing a hash code costs the equivalent of a comparisons.
  - There are n elements in total stored in the table.
  - Each element is equally likely to be searched for.
Average number of comparisons for an unsuccessful search (after hashing)
  - $A_u(n) = n/h$
Average cost of a successful search
  - When key i (i = 1, …, n) was inserted at the end of a linked list, each linked list had average length (i - 1)/h.
  - The expected number of key comparisons to find key i is 1 + the number of comparisons made when inserting it at the end of its linked list.
  - $A_s(n) = \frac{1}{n} \sum_{i=1}^{n} \left( 1 + \frac{i-1}{h} \right) = 1 + \frac{n}{2h} - \frac{1}{2h}$
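The successful-search average can be checked with exact rational arithmetic. The sketch below evaluates the sum term by term and compares it with 1 + (n - 1)/(2h); the function names are illustrative.

```python
from fractions import Fraction

def avg_successful(n, h):
    """Average of 1 + (i - 1)/h over keys i = 1, ..., n, computed exactly."""
    return sum(1 + Fraction(i - 1, h) for i in range(1, n + 1)) / n

def closed_form(n, h):
    """1 + (n - 1)/(2h), i.e. 1 + n/(2h) - 1/(2h)."""
    return 1 + Fraction(n - 1, 2 * h)

for n, h in [(6, 8), (100, 64), (1000, 256)]:
    assert avg_successful(n, h) == closed_form(n, h)
```

For the six-key example with h = 8, this gives 21/16, a little over 1.3 comparisons per successful search.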

Assuming uniform distribution of hash codes
  - The hash code for each key in our set is equally likely to be any integer in the range 0, …, h-1.
If n/h is a constant, then
  - O(1) key comparisons can be achieved, on average, for both successful and unsuccessful searches.
Uniform distribution of hash codes depends on the choice of hash function.

Choosing a Hash Function // to achieve a uniform distribution of hash codes
If the key type is integer
  - hashCode(K) = (a K) mod h
  - Choose h as a power of 2, with h >= 8
  - Choose $a = 8 \lfloor h/23 \rfloor + 5$
If the key type is a string of characters, treat it as a sequence of integers $k_1, k_2, k_3, \dots, k_l$
  - $hashCode(K) = (a^l k_1 + a^{l-1} k_2 + \dots + a k_l) \bmod h$
Use array doubling whenever n/h (called the load factor, where n is the number of elements in the table) gets high, say 0.5.
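A sketch of both hash functions follows. It assumes characters are converted to integers with Python's `ord`, which the slides do not specify, and it evaluates the polynomial with Horner's rule; the function names are made up for the example.

```python
def multiplier(h):
    """a = 8 * floor(h / 23) + 5, for h a power of 2 with h >= 8."""
    return 8 * (h // 23) + 5

def int_hash(key, h):
    return (multiplier(h) * key) % h

def string_hash(text, h):
    """(a^l k_1 + a^(l-1) k_2 + ... + a k_l) mod h via Horner's rule."""
    a = multiplier(h)
    acc = 0
    for ch in text:
        acc = ((acc + ord(ch)) * a) % h   # ord() as character code is an assumption
    return acc

assert multiplier(8) == 5                  # matches 5k mod 8 from the example
assert int_hash(1812, 8) == 4
assert string_hash("ab", 64) == (21**2 * ord("a") + 21 * ord("b")) % 64
```

Reducing modulo h at every step keeps the intermediate values small without changing the final result.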

Handling Collisions: Open Address Hashing
Open address hashing is a strategy for storing all elements in the array of the hash table, rather than using linked lists to accommodate collisions.
  - If the hash cell corresponding to the hash code is occupied by a different element,
  - then a sequence of alternative locations for the current element is defined (by rehashing).
Rehashing by linear probing
  - rehash(j) = (j+1) mod h
  - where j is the location most recently probed
  - initially j = i, the hash code for K
Rehashing by double hashing
  - rehash(j, d) = (j + d) mod h
  - e.g. d = hashIncr(K) = (2K + 1) mod h
  - // computing an odd increment ensures that the whole hash table is accessed in the search (provided h is a power of 2)

Open Address Hashing, e.g. Linear probing
hashCode: key
  0: 1776
  1:
  2:
  3: 1055
  4: 1492
  5: 1945
  6: 1918
  7:
Now insert 1812: hashCode(1812) = 4, i.e. i = 4
  - h = 8, initially j = i = 4 // collision with 1492
  - rehash(j) = (j+1) mod h
  - rehash(4) = (4+1) mod 8 = 5 // collision again
  - rehash(5) = (5+1) mod 8 = 6 // collision again
  - rehash(6) = (6+1) mod 8 = 7 // empty, so put 1812 in cell 7
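The probing steps above can be sketched as follows. The helper names `probe_sequence` and `insert_open` are hypothetical, `None` marks an empty cell, and the hash function is assumed to be 5k mod h as in the example; the double-hashing increment is the (2K + 1) mod h given earlier.

```python
def probe_sequence(key, h, double_hashing=False):
    """Yield the cells probed for key: the hash code first, then the rehashes."""
    j = (5 * key) % h                               # hash code i
    d = (2 * key + 1) % h if double_hashing else 1  # odd increment, or 1 for linear
    for _ in range(h):
        yield j
        j = (j + d) % h

def insert_open(table, key, double_hashing=False):
    """Store key in the first empty cell on its probe sequence; return the index."""
    for j in probe_sequence(key, len(table), double_hashing):
        if table[j] is None:
            table[j] = key
            return j
    raise RuntimeError("hash table is full")

table = [1776, None, None, 1055, 1492, 1945, 1918, None]
assert list(probe_sequence(1812, 8))[:4] == [4, 5, 6, 7]
assert insert_open(table, 1812) == 7
```

With h a power of 2 and an odd increment, the double-hashing sequence visits every cell exactly once, for example cells 3, 2, 1, 0, 7, 6, 5, 4 for key 1055.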

Retrieval and Deletion under open address hashing
Retrieval
  - The retrieval procedure imitates the insertion procedure and stops the search as soon as an emptyCell is encountered.
Deletion of a key
  - We cannot simply delete the key and set the cell to emptyCell // this causes problems for the retrieval procedure
  - Instead, set the cell to a value indicating "obsolete".
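A minimal sketch of why the "obsolete" marker matters, using linear probing and the 5k mod 8 hash from the earlier example. The class name `OpenAddressTable` and the sentinels `EMPTY` and `OBSOLETE` are assumptions; the insert also assumes the key is not already in the table and reuses obsolete cells, which is one possible policy rather than something the slides prescribe.

```python
EMPTY = object()      # never-used cell
OBSOLETE = object()   # cell whose key was deleted

class OpenAddressTable:
    def __init__(self, h=8):
        self.h = h
        self.cells = [EMPTY] * h

    def _hash(self, key):
        return (5 * key) % self.h

    def insert(self, key):
        j = self._hash(key)
        for _ in range(self.h):
            if self.cells[j] is EMPTY or self.cells[j] is OBSOLETE:
                self.cells[j] = key
                return j
            j = (j + 1) % self.h
        raise RuntimeError("hash table is full")

    def search(self, key):
        """Return the cell index holding key, or None if it is absent."""
        j = self._hash(key)
        for _ in range(self.h):
            cell = self.cells[j]
            if cell is EMPTY:            # stop at the first empty cell
                return None
            if cell is not OBSOLETE and cell == key:
                return j
            j = (j + 1) % self.h         # skip obsolete or non-matching cells
        return None

    def delete(self, key):
        j = self.search(key)
        if j is not None:
            self.cells[j] = OBSOLETE     # do not reset to EMPTY
        return j

t = OpenAddressTable()
for k in (1055, 1492, 1776, 1918, 1945, 1812):
    t.insert(k)
t.delete(1945)                           # cell 5 lies on the probe path of 1812
assert t.search(1812) == 7               # still reachable past the obsolete cell
```

Had cell 5 been reset to empty, the search for 1812 would have stopped there and wrongly reported the key as missing.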