Algorithms and Data Structures Hash Tables and Associative Arrays.


1 Algorithms and Data Structures Hash Tables and Associative Arrays

2 Introduction
Hash tables are one implementation of associative arrays, or dictionaries. An associative array is an array with a potentially infinite or very large index set, of which only a small number of indices are actually used. The main idea is to map the large index set onto the small set of indices in use. N is the number of feasible elements; n is the number of elements actually used.

3 Exercises
– Write a program that reads a text file and outputs the 100 most frequent words in the text.
– Assume you have a large file consisting of triples (transaction, price, customer ID). Explain how to compute the total payment due for each customer. Your program should run in linear time.
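One possible approach to both exercises, sketched in Python with the built-in hash-based dictionary types. The function names, the word-splitting rule, and the in-memory inputs (a string and a list of tuples instead of files) are assumptions made for illustration; they are not part of the slides.

```python
import re
from collections import Counter, defaultdict


def most_frequent_words(text, k=100):
    """Return the k most frequent words as (word, count) pairs."""
    # Assumption: a "word" is a run of letters or apostrophes, case-insensitive.
    words = re.findall(r"[a-z']+", text.lower())
    # Counter is a dictionary (hash table) from word to count.
    return Counter(words).most_common(k)


def total_per_customer(triples):
    """Sum the price of every (transaction, price, customer_id) triple per customer."""
    totals = defaultdict(float)
    for _transaction, price, customer_id in triples:
        # One expected O(1) hash table update per triple gives
        # expected linear time overall.
        totals[customer_id] += price
    return dict(totals)


if __name__ == "__main__":
    print(most_frequent_words("a b a c a b", k=2))
    print(total_per_customer([("t1", 10.0, "c1"), ("t2", 5.0, "c2"), ("t3", 2.5, "c1")]))
```

The second function makes a single pass over the input and never sorts it, which is what keeps the running time linear in expectation.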

4 Why hashing?
Pros
– Fast: works on an array and uses a calculation to find the array location for both search and insert
– Easy to program
Cons
– The array has a limited size
– Performance depends on the data distribution
– Elements cannot be visited in any particular order
– A good hash function is needed
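The "calculation to find the array location" can be illustrated with a small polynomial string hash. This is only a sketch: the name `slot_for` and the multiplier 31 are assumptions chosen for the example, not something the slides specify.

```python
def slot_for(key: str, m: int) -> int:
    """Map a string key to a slot index in the range 0..m-1."""
    h = 0
    for ch in key:
        # Fold each character into the running hash, reducing modulo
        # the table size so the result is always a valid array index.
        h = (h * 31 + ord(ch)) % m
    return h


if __name__ == "__main__":
    table_size = 10
    for word in ["a", "ab", "hash"]:
        print(word, "->", slot_for(word, table_size))
```

Search and insert both call the same function, so a key is always looked up in the slot where it was stored.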

5 Collisions
Two or more keys are hashed to the same array element.
Solutions
– Open addressing: find another empty array element
  – Linear probing: performs poorly once large clusters form
  – Quadratic probing: avoids probing adjacent elements by stepping with squared distances instead of linear distances
  – Double hashing: uses a second hash function to choose the probe step when a collision happens
– Chaining: keep a linked list at each index
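The three open-addressing strategies differ only in which slot they try on the i-th attempt. A minimal sketch, assuming a table of size m, a primary hash value h, and (for double hashing) a second hash value h2 that is nonzero; the function names are made up for this example.

```python
def linear_probe(h: int, i: int, m: int) -> int:
    """i-th slot tried by linear probing: step by 1 each time."""
    return (h + i) % m


def quadratic_probe(h: int, i: int, m: int) -> int:
    """i-th slot tried by quadratic probing: step by the square of i."""
    return (h + i * i) % m


def double_hash_probe(h: int, h2: int, i: int, m: int) -> int:
    """i-th slot tried by double hashing: step size comes from a second hash."""
    return (h + i * h2) % m


if __name__ == "__main__":
    m, h, h2 = 11, 3, 5
    print([linear_probe(h, i, m) for i in range(4)])           # [3, 4, 5, 6]
    print([quadratic_probe(h, i, m) for i in range(4)])        # [3, 4, 7, 1]
    print([double_hash_probe(h, h2, i, m) for i in range(4)])  # [3, 8, 2, 7]
```

Linear probing visits neighbouring slots, which is why clusters grow; the other two sequences spread the probes across the table.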

6 Hashing with Chaining
Maintain a linked list for each array slot. With a random hash function, find and remove take expected O(1 + n/m) time, where n is the number of stored elements and m is the table size.
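A minimal sketch of hashing with chaining. It assumes Python's built-in `hash()` as the hash function and uses a Python list in place of a linked list for each chain; the class and method names are illustrative only.

```python
class ChainedHashTable:
    """Hash table with one chain (list of key/value pairs) per slot."""

    def __init__(self, m=8):
        self.m = m
        self.buckets = [[] for _ in range(m)]

    def _chain(self, key):
        # The hash value selects the slot; all colliding keys share its chain.
        return self.buckets[hash(key) % self.m]

    def insert(self, key, value):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)  # key already present: overwrite
                return
        chain.append((key, value))

    def find(self, key):
        # Cost is proportional to the chain length, about n/m on average.
        for k, v in self._chain(key):
            if k == key:
                return v
        return None

    def remove(self, key):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                return True
        return False
```

With m = 4, the keys 1 and 5 land in the same slot, so `find(5)` has to walk past the entry for 1 before it reaches its own.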

7 Hashing with Linear Probing
A form of open addressing (called open hashing in some texts). Find and insert are straightforward. How to remove?
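One common answer to the removal question is to mark a removed slot with a "tombstone" so that later searches keep probing past it. The sketch below takes that approach; it is not necessarily the method the slides or the referenced textbooks use, and the class name, sentinel objects, and fixed table size are assumptions for illustration.

```python
_EMPTY = object()    # slot has never held a key
_DELETED = object()  # tombstone: slot held a key that was removed


class LinearProbingTable:
    """Fixed-size open-addressing table using linear probing."""

    def __init__(self, m=8):
        self.m = m
        self.keys = [_EMPTY] * m
        self.values = [None] * m

    def _probe(self, key):
        # Visit slots h, h+1, h+2, ... wrapping around the table once.
        start = hash(key) % self.m
        for i in range(self.m):
            yield (start + i) % self.m

    def insert(self, key, value):
        first_free = None
        for j in self._probe(key):
            k = self.keys[j]
            if k is _EMPTY:
                if first_free is None:
                    first_free = j
                break  # key cannot be stored beyond an empty slot
            if k is _DELETED:
                if first_free is None:
                    first_free = j  # remember tombstone for reuse
            elif k == key:
                self.values[j] = value  # key already present: overwrite
                return
        if first_free is None:
            raise RuntimeError("table is full")
        self.keys[first_free] = key
        self.values[first_free] = value

    def find(self, key):
        for j in self._probe(key):
            k = self.keys[j]
            if k is _EMPTY:
                return None  # an empty slot ends the search
            if k is not _DELETED and k == key:
                return self.values[j]
        return None

    def remove(self, key):
        for j in self._probe(key):
            k = self.keys[j]
            if k is _EMPTY:
                return False
            if k is not _DELETED and k == key:
                # Leave a tombstone instead of emptying the slot, so keys
                # stored further along the probe sequence stay reachable.
                self.keys[j] = _DELETED
                self.values[j] = None
                return True
        return False
```

Simply emptying the slot would break searches: with m = 8, the keys 1, 9, and 17 occupy slots 1, 2, and 3, and clearing slot 2 would make a search for 17 stop too early.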

8 Figure 4.2

9 Chaining versus Linear Probing
Chaining
– Referential integrity: element locations do not change
– Linked lists do not guarantee contiguous physical memory allocation
– Search time is small even when the number of elements is close to the size of the table
– More overhead
– Harder implementation
Linear probing
– Element locations may change
– Contiguous physical memory access, thus better performance
– Search time is high when the number of elements is close to the size of the table
– Less overhead
– Easier implementation

10 References
Kurt Mehlhorn and Peter Sanders, Algorithms and Data Structures: The Basic Toolbox, Springer, 2008.
Robert Lafore, Data Structures & Algorithms in Java, SAMS, 2002.

