Java Methods Lookup Tables and Hashing Object-Oriented Programming

1 Java Methods: Object-Oriented Programming and Data Structures
3rd AP edition. Maria Litvin ● Gary Litvin. Lookup Tables and Hashing (a rather technical chapter). Copyright © 2015 by Maria Litvin, Gary Litvin, and Skylight Publishing. All rights reserved.

2 Objectives
Learn about lookup tables. Learn about hashing. Review java.util.HashSet and java.util.HashMap. You do not need to implement your own hash tables: case studies and labs use java.util.HashSet and java.util.HashMap. However, you need to understand how HashSet and HashMap work.

3 Lookup Tables
A lookup table is an array that helps to find data very quickly. The array stores references to data records (or some values). A data record is identified by some key. The value of a key is directly translated into an array index using a simple formula. A lookup table may waste some space, but it provides instantaneous (O(1)) access to the data: no search is needed.

4 Lookup Tables (cont’d)
Only one key can be mapped onto a particular index (no collisions). The index that corresponds to a key must fall into the valid range (from 0 to array.length-1). Access to data is “instantaneous” (O(1)). In certain applications a lookup table can be a 2-D array.

5 Lookup Tables: Example 1
Figure: zip codes (array indices) map onto the corresponding locales; some table entries remain unused.
Last time we checked, the lowest valid zip code was and the highest

6 Lookup Tables: Example 2
private static final int[] n_thPowerOf3 = { 1, 3, 9, 27, 81, 243, 729, 2187, 6561, 19683 };
...
// precondition: 0 <= n < 10
public int powOf3(int n) { return n_thPowerOf3[n]; }
It is not hard to calculate a power of 3. A lookup table is more appropriate when the function is harder to compute and we need to compute it frequently, but not necessarily with the highest precision. For example, we can approximate a function defined on [0, 1] by its values in 1000 points evenly distributed over [0, 1].
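
The tabulation idea can be sketched roughly as follows. This is only an illustration: the class name SineTable, the choice of sin(pi * x) as the "hard to compute" function, and rounding to the nearest tabulated point are all assumptions, not something taken from the slides.

```java
// Hypothetical sketch: approximates f(x) = sin(pi * x) on [0, 1]
// from 1000 precomputed values, evenly distributed over [0, 1].
class SineTable {
    private static final int POINTS = 1000;
    private static final double[] TABLE = new double[POINTS];

    static {
        // Point i corresponds to x = i / (POINTS - 1), so both 0 and 1 are tabulated.
        for (int i = 0; i < POINTS; i++) {
            TABLE[i] = Math.sin(Math.PI * i / (POINTS - 1.0));
        }
    }

    // precondition: 0 <= x <= 1
    static double approxSin(double x) {
        int index = (int) Math.round(x * (POINTS - 1)); // nearest tabulated point
        return TABLE[index];
    }

    public static void main(String[] args) {
        System.out.println("table: " + approxSin(0.25));
        System.out.println("exact: " + Math.sin(Math.PI * 0.25));
    }
}
```

The lookup costs one multiplication, one rounding, and one array access. The price is limited precision, which depends on how fast the function changes between neighboring points.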

7 Lookup Tables: Example 3
A palette of 256 colors used in a particular image: each of the palette entries corresponds to a triplet of RGB values. The “color scheme” in these slides also uses a small lookup table. It maps “logical colors” (heading color, text color, highlight color, etc.) onto a set of screen colors. Choosing a different scheme will change the appearance of all the slides.

8 Applications of Lookup Tables
Data retrieval; data compression and encryption; tabulating functions; color mapping. Also calculating distributions of values, such as frequencies of occurrence for different letters in a text.
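
Counting letter frequencies is perhaps the simplest of these applications to sketch. In the example below, the class and method names are made up for illustration. The key is a letter, and the "simple formula" that turns it into an index is ch - 'a'.

```java
// Hypothetical sketch: a 26-entry lookup table of letter counts.
class LetterFrequencies {
    // Returns counts[0..25] for the letters 'a'..'z'; case is ignored
    // and all other characters are skipped.
    static int[] count(String text) {
        int[] counts = new int[26];
        for (int i = 0; i < text.length(); i++) {
            char ch = Character.toLowerCase(text.charAt(i));
            if (ch >= 'a' && ch <= 'z') {
                counts[ch - 'a']++; // key -> index, no search needed
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = count("Lookup Tables and Hashing");
        for (char ch = 'a'; ch <= 'z'; ch++) {
            if (counts[ch - 'a'] > 0) {
                System.out.println(ch + ": " + counts[ch - 'a']);
            }
        }
    }
}
```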

9 Hash Tables
A hash table is similar to a lookup table. The value of a key is translated into an array index using a hash function. The index computed for a key must fall into the valid range. The hash function can map different keys onto the same array index; this situation is called a collision. Hash tables build upon the lookup table idea, only in a hash table several values can fall into the same location.

10 Hash Tables (cont’d)
The hash function should map the keys onto the array indices randomly and uniformly. A well-designed hash table and hash function minimize the number of collisions. There are two common techniques for resolving collisions: chaining and probing. Chaining is more common. The Java library uses chaining.

11 Chaining
Each element in the array is itself a collection, called a bucket (a list or a BST), which is searched for the desired key. The Java library implements a bucket as a singly-linked list (since Java 8, a bucket that grows large is converted into a balanced tree).
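
A minimal sketch of chaining, assuming string keys and a fixed number of buckets, might look like this. The class ChainedStringSet is hypothetical and much simpler than java.util.HashSet: it never resizes and supports only add and contains.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical sketch of a hash table with chaining (not the library's code).
class ChainedStringSet {
    private final List<LinkedList<String>> buckets = new ArrayList<>();
    private int size;

    ChainedStringSet(int numBuckets) {
        for (int i = 0; i < numBuckets; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    // Maps a key onto a bucket index in the range 0 .. numBuckets - 1.
    private int indexFor(String key) {
        return Math.floorMod(key.hashCode(), buckets.size());
    }

    // Adds the key unless it is already present; returns true if it was added.
    boolean add(String key) {
        LinkedList<String> bucket = buckets.get(indexFor(key));
        if (bucket.contains(key)) {   // search only this one bucket
            return false;
        }
        bucket.add(key);
        size++;
        return true;
    }

    boolean contains(String key) {
        return buckets.get(indexFor(key)).contains(key);
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        ChainedStringSet set = new ChainedStringSet(4);
        set.add("chaining");
        set.add("probing");
        System.out.println(set.contains("probing") + " " + set.contains("bucket"));
    }
}
```

With only a few buckets, collisions are unavoidable, but every key is still found because the search walks the whole bucket.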

12 Probing
If the place where we want to store the key is occupied by a different key, we store the new key in another location in the same array, computed using a certain probing formula. The probing function recalculates the index. The same procedure is applied when we retrieve a key. Probing is mentioned for the sake of completeness.
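
For comparison, here is a rough sketch of probing with the simplest probing formula, linear probing: try the next slot, wrapping around at the end of the array. The class ProbingIntSet is hypothetical; it has a fixed capacity and does not support removal, which would need extra bookkeeping.

```java
// Hypothetical sketch of a hash table with linear probing.
class ProbingIntSet {
    private final Integer[] slots;   // null marks an empty slot
    private int size;

    ProbingIntSet(int capacity) {
        slots = new Integer[capacity];
    }

    // Returns the index that holds the key, or the first empty slot on its
    // probe path, or -1 if the table is full and the key is absent.
    private int findSlot(int key) {
        int start = Math.floorMod(Integer.hashCode(key), slots.length);
        for (int step = 0; step < slots.length; step++) {
            int i = (start + step) % slots.length;   // the probing formula
            if (slots[i] == null || slots[i] == key) {
                return i;
            }
        }
        return -1;
    }

    // Adds the key unless it is already present; returns true if it was added.
    boolean add(int key) {
        int i = findSlot(key);
        if (i < 0) {
            throw new IllegalStateException("table is full");
        }
        if (slots[i] != null) {
            return false;
        }
        slots[i] = key;
        size++;
        return true;
    }

    // Retrieval follows the same probe path as insertion.
    boolean contains(int key) {
        int i = findSlot(key);
        return i >= 0 && slots[i] != null;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        ProbingIntSet set = new ProbingIntSet(5);
        set.add(1);
        set.add(6);   // collides with 1, so it is stored in the next free slot
        System.out.println(set.contains(6) + " " + set.contains(11));
    }
}
```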

13 java.util.HashSet<E> and java.util.HashMap<K,V> Classes
These classes implement the Set<E> and Map<K,V> interfaces, respectively, using hash tables (with chaining). This implementation may be more efficient than TreeSet and TreeMap: with a good hash function, adding, removing, and finding an item take O(1) time on average, compared with O(log n) for the tree-based classes. In many applications, TreeSet and TreeMap may be preferable because for them an iterator returns values / keys in ascending order.
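
The difference in iteration order can be seen in a small experiment such as the one below. The class and method names are made up for illustration; only the java.util classes are real.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.TreeSet;

// Hypothetical demo: the same words stored in a HashSet and in a TreeSet.
class SetOrderDemo {
    // Order of the result is unspecified: it depends on hash codes and table size.
    static List<String> inHashOrder(Collection<String> words) {
        return new ArrayList<>(new HashSet<>(words));
    }

    // Order of the result is ascending, as defined by String's compareTo.
    static List<String> inSortedOrder(Collection<String> words) {
        return new ArrayList<>(new TreeSet<>(words));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "apple", "fig", "apple");
        System.out.println("HashSet: " + inHashOrder(words));
        System.out.println("TreeSet: " + inSortedOrder(words));
    }
}
```

Both sets drop the duplicate "apple"; only the TreeSet guarantees the order in which the remaining three words come out.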

14 HashSet and HashMap (cont’d)
Collisions are resolved through chaining. The sizes of buckets must remain relatively small.
Load factor = (total number of items) / (number of buckets)
0.75 is considered a reasonable load factor, which means the number of values is about 3/4 of the number of buckets.

15 HashSet and HashMap (cont’d)
Fine tuning: load factor too large → lots of collisions; load factor too small → wasted space and slow iterations over the whole set. If the load factor exceeds the specified limit, the table is automatically rehashed into a larger table; rehashing is costly, so it should be avoided if possible (for example, by choosing a sufficient initial capacity). HashSet and HashMap have constructors that take a load factor limit as a parameter. The default is 0.75.
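
The arithmetic behind the load factor limit is simple enough to sketch. The helper names below are hypothetical, and capacityFor assumes, as a later slide states, that table sizes are powers of two.

```java
// Hypothetical helpers for reasoning about load factors.
class LoadFactorDemo {
    // Load factor = number of items / number of buckets.
    static double loadFactor(int items, int buckets) {
        return (double) items / buckets;
    }

    // Largest number of items that fits without exceeding the limit.
    static int maxItemsBeforeRehash(int buckets, double limit) {
        return (int) (buckets * limit);
    }

    // Smallest power-of-two table size that holds expectedItems within the limit.
    static int capacityFor(int expectedItems, double limit) {
        int capacity = 1;
        while (capacity * limit < expectedItems) {
            capacity *= 2;
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(maxItemsBeforeRehash(16, 0.75)); // prints 12
        System.out.println(capacityFor(100, 0.75));         // prints 256
    }
}
```

If the expected number of items is known in advance, passing a capacity like the one computed by capacityFor to the constructor is one way to avoid rehashing altogether.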

16 HashSet and HashMap (cont’d)
Objects in a HashSet or keys in a HashMap must have a reasonable hashCode method that overrides Object’s hashCode and helps calculate the hashing function. The hashCode method returns an int from the entire int range; it is later mapped onto the range of indices in a particular hash table. String, Integer, Double: each has a reasonable hashCode defined. The hash code must be mapped onto the range of indices in the table, from 0 to size - 1. This involves truncation. The Java library uses tables whose size is a power of two and doubles the size of the table when it needs to expand it.
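
Two ways of mapping a hash code onto the index range can be sketched as follows. The method names are hypothetical. The masking version works only when the table size is a power of two, and it is a simplification: the library's actual implementation also mixes the bits of the hash code before truncating, a step omitted here.

```java
// Hypothetical sketch: mapping an arbitrary int hash code onto 0 .. tableSize - 1.
class HashIndexDemo {
    // Works for any positive table size; floorMod never returns a negative value.
    static int indexByRemainder(Object key, int tableSize) {
        return Math.floorMod(key.hashCode(), tableSize);
    }

    // precondition: tableSize is a power of two.
    // Keeps only the low-order bits of the hash code (truncation).
    static int indexByMask(Object key, int tableSize) {
        return key.hashCode() & (tableSize - 1);
    }

    public static void main(String[] args) {
        Object[] keys = { "hello", 42, -7, 3.14 };
        for (Object key : keys) {
            System.out.println(key + " -> " + indexByRemainder(key, 16)
                    + " / " + indexByMask(key, 16));
        }
    }
}
```

For a power-of-two size the two methods give the same index, which is one reason such sizes are convenient: a bit mask is cheaper than a division.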

17 hashCode Examples
For String: hashCode = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] (where s[i] is Unicode for the i-th character in the string, n is the length of the string, and ^ denotes exponentiation).
For Person: public int hashCode() { return getFirstName().hashCode() + getLastName().hashCode(); }
In most cases a decent hashCode method for a class can be constructed by combining hashCode values for its fields.
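
The String hash code documented for java.lang.String is the polynomial s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], computed in int arithmetic. A short sketch can check that by evaluating the polynomial with Horner's rule and comparing the result with the library's value; the class and method names are made up for this illustration.

```java
// Hypothetical sketch: recomputes String's documented hash code by hand.
class StringHashDemo {
    static int stringHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);   // Horner's rule; int overflow wraps around
        }
        return h;
    }

    public static void main(String[] args) {
        String word = "hashing";
        System.out.println(stringHash(word) == word.hashCode()); // prints true
    }
}
```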

18 Consistency
HashSet / HashMap first use hashCode, then equals. TreeSet / TreeMap use only compareTo (or a comparator). For consistent behavior, these methods should agree with each other:
x.equals(y) ⇔ x.compareTo(y) == 0
x.equals(y) ⇒ x.hashCode() == y.hashCode()
Thus, it is a good idea for Comparable objects to also redefine the equals and hashCode methods.
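
A Person class that keeps the three methods in agreement might be sketched as below. The fields and the ordering (last name, then first name) are assumptions made for illustration; the hashCode is the one from the earlier slide.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: equals, hashCode, and compareTo all use the same two fields.
class Person implements Comparable<Person> {
    private final String firstName;
    private final String lastName;

    Person(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    String getFirstName() { return firstName; }
    String getLastName() { return lastName; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Person)) return false;
        Person p = (Person) other;
        return firstName.equals(p.firstName) && lastName.equals(p.lastName);
    }

    @Override
    public int hashCode() {
        return getFirstName().hashCode() + getLastName().hashCode();
    }

    @Override
    public int compareTo(Person other) {
        int diff = lastName.compareTo(other.lastName);
        return diff != 0 ? diff : firstName.compareTo(other.firstName);
    }

    public static void main(String[] args) {
        Person a = new Person("Ada", "Lovelace");
        Person b = new Person("Ada", "Lovelace");
        Set<Person> hashed = new HashSet<>();
        Set<Person> sorted = new TreeSet<>();
        hashed.add(a); hashed.add(b);
        sorted.add(a); sorted.add(b);
        // Both sets treat a and b as the same person.
        System.out.println(hashed.size() + " " + sorted.size()); // prints 1 1
    }
}
```

If equals were left as inherited from Object, the HashSet would hold two entries while the TreeSet would still hold one, which is exactly the kind of inconsistency to avoid.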

19 HashSet<E> Constructors
The loadFactor parameter is the load factor limit; if the table’s load factor exceeds this limit, the table is rehashed into a larger array.

20 HashMap<K,V> Constructors
Actually Java developers implemented Map first, then Set as a special case of Map.
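
The constructors of both classes follow the same pattern, as the sketch below tries to show. The constructor signatures are those of java.util.HashSet and java.util.HashMap; the class name, the countWords helper, and the particular capacities are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical demo of the HashSet and HashMap constructors.
class ConstructorDemo {
    // Counts occurrences of each word in a map created with an explicit
    // initial capacity and load factor limit.
    static Map<String, Integer> countWords(String... words) {
        Map<String, Integer> counts = new HashMap<>(64, 0.5f);
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Set<String> defaults = new HashSet<>();          // default capacity and load factor limit
        Set<String> roomy = new HashSet<>(256);          // initial capacity only
        Set<String> tuned = new HashSet<>(256, 0.5f);    // capacity and load factor limit
        Set<String> copy = new HashSet<>(Arrays.asList("a", "b", "a")); // from a collection

        System.out.println(copy.size());                 // prints 2
        System.out.println(countWords("to", "be", "or", "not", "to", "be").get("to")); // prints 2
        System.out.println(defaults.isEmpty() && roomy.isEmpty() && tuned.isEmpty());   // prints true
    }
}
```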

21 Review:
What is the main difference between a lookup table and a hash table? In a lookup table, only one value can be placed into each entry; in a hash table several values may be placed into the same entry.
What is a collision? A situation where two different values are mapped onto the same entry in the table.
Name two common techniques for resolving collisions. Chaining and probing.
What is a bucket? A collection that holds all the values that fall into the same hash table entry.
What is a load factor? The ratio of the number of items stored in the hash table to the table size.

22 Review (cont’d):
How is hash table performance affected when the load factor is too high? Too low? Too high: too many collisions, which slows data access. Too low: space is wasted and iterations over the whole set are slow.
What happens to a HashSet or a HashMap when the load factor exceeds the specified limit? A new table of a larger size (double the size) is allocated, and all the items are rehashed into the new table.
HashSet’s no-args constructor sets the initial capacity to 16 and the load factor limit to 0.75. How many items can be stored in this table before it is rehashed? 12 items.

23 Review (cont’d):
What is the sequence of values returned by an iterator for a HashSet? No particular order: it depends on the hash codes and the table size, not on the values themselves.
What is the range of values for the hashCode method? All integers (the entire int range).
Which method(s) of an object are used to find it in a TreeSet? A HashSet? TreeSet: compareTo; HashSet: first hashCode, then equals.

