Backup Slides

An Example of Hash Function Implementation

    struct MyStruct {
        string str;
        string item;
    };

    // The hash function takes key "obj.str" to index of bucket
    int hash( const MyStruct & obj )
    {
        int product = 1;
        int modulus = 0;
        for ( int i = 0; i < 3 && i < int( obj.str.length( ) ); i++ )
            product *= ( obj.str[ i ] - 64 );
        modulus = product % SIZE1;
        return modulus;
    }

Uniform Hashing
• When the elements are spread evenly (or nearly evenly) among the indexes of a hash table, it is called uniform hashing
• If the elements are spread evenly, so that the number of elements at any index is less than some small constant, a search can be done in Θ( 1 ) time
• The hash function largely determines whether or not we will have uniform hashing

Bad Hash Functions
• h( k ) = 5 is obviously a bad hash function: every key lands at the same index
• h( k ) = k % 100 could be a bad hash function if there is meaning attached to parts of a key
    • Consider that the key might be an employee id
    • The last two digits may give the state of birth

Ideal Hash Function for Uniform Hashing
• The hash table size should be a prime number that is not too close to a power of 2
    • 31 is a prime number but is too close to a power of 2
    • 97 is a prime number that is not too close to a power of 2
• A good hash function might be: h( k ) = k % 97
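
To make the contrast between h( k ) = k % 100 and h( k ) = k % 97 concrete, here is a minimal sketch that measures the longest chain a table would get with h( k ) = k % tableSize. The helper names longestChain and sampleIds are hypothetical, and the sample employee ids (a serial number followed by a two-digit state code, with only two states present) are invented for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the length of the longest chain produced when every key in `keys`
// is placed into a table of `tableSize` buckets using h( k ) = k % tableSize.
// Keys are assumed to be non-negative.
int longestChain( const std::vector<int> & keys, int tableSize )
{
    assert( tableSize > 0 );
    std::vector<int> counts( tableSize, 0 );
    for ( int k : keys )
        counts[ k % tableSize ]++;
    return *std::max_element( counts.begin( ), counts.end( ) );
}

// Hypothetical employee ids: the last two digits are a state code, and every
// employee in this sample comes from one of two states (codes 05 and 48).
std::vector<int> sampleIds( )
{
    std::vector<int> ids;
    for ( int serial = 10; serial < 60; serial++ ) {
        ids.push_back( serial * 100 + 5 );
        ids.push_back( serial * 100 + 48 );
    }
    return ids;
}
```

With these sample ids, a table size of 100 puts all 100 keys into just two chains of 50 elements each, while a table size of 97 leaves no chain longer than 2.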

Hash Functions Can be Made for Keys that are Strings

    int sum = 0;
    for ( int i = 0; i < int( str.length( ) ); i++ )
        sum += str[ i ];
    hash_index = sum % 97;
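
The fragment above can be wrapped into a complete function. This is only a sketch: the name hashString and the tableSize parameter (defaulting to 97) are assumptions, and the characters are treated as unsigned so that codes above 127 cannot make the sum negative.

```cpp
#include <cassert>
#include <string>

// Sums the character codes of str and reduces the sum modulo tableSize.
int hashString( const std::string & str, int tableSize = 97 )
{
    assert( tableSize > 0 );
    int sum = 0;
    for ( unsigned char c : str )   // unsigned keeps every code non-negative
        sum += c;
    return sum % tableSize;
}
```

Because addition ignores order, anagrams such as "abc" and "cba" produce the same sum (294) and therefore the same index (3 for a table size of 97).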

Speed vs. Memory Conservation
• Speed comes from reducing the number of collisions
• In a search, if there are no collisions, the first element in the linked list is the one we want to find (fast)
• Therefore, the greatest speed comes from making the hash table much larger than the number of keys (but there will still be an occasional collision)

Speed vs. Memory Conservation (cont.)
• Each empty LinkedList object in a hash table wastes 8 bytes of memory (4 bytes for the start pointer and 4 bytes for the current pointer)
• The best memory conservation comes from trying to reduce the number of empty LinkedList objects
• The hash table size would be made much smaller than the number of keys (there would still be an occasional empty linked list)

Hash Table Design
• Decide whether speed or memory conservation is more important (and how much more important) for the application
• Come up with a good table size which
    • Allows for the use of a good hash function
    • Strikes the appropriate balance between speed and memory conservation
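
One way to turn a rough target size into a table size that suits a modulus hash function is to round it up to a prime. The sketch below is an assumption about how that could be done, not part of the course code; isPrime and nextPrime are hypothetical helpers, and nextPrime does not check how close the result is to a power of 2.

```cpp
#include <cassert>

// Returns true if n is a prime number (trial division is fine for table sizes).
bool isPrime( int n )
{
    if ( n < 2 )
        return false;
    for ( int d = 2; d <= n / d; d++ )
        if ( n % d == 0 )
            return false;
    return true;
}

// Returns the smallest prime that is >= minimumSize.
int nextPrime( int minimumSize )
{
    assert( minimumSize >= 0 );
    int candidate = minimumSize < 2 ? 2 : minimumSize;
    while ( !isPrime( candidate ) )
        candidate++;
    return candidate;
}
```

For example, if speed matters more and a table of about 400 indexes is wanted, nextPrime( 400 ) gives 401; if memory matters more and about 90 indexes are wanted, nextPrime( 90 ) gives 97.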

Ideal Hash Tables
• Can we have a hash function which guarantees that there will be no collisions?
    • Yes: h( k ) = k
    • Each key k is unique; therefore, each index produced from h( k ) is unique
• Consider 300 employees that have a 4-digit id
    • A hash table size of 10000 with the hash function above guarantees the best possible speed

Ideal Hash Tables (cont.)
• Should we use LinkedList objects if there are no collisions?
• Suppose each Employee object takes up 100 bytes
• An array of 10000 Employee objects with only 300 used indexes will have 9700 unused indexes, each taking up 100 bytes
• Best to use LinkedList objects (in this case): the 9700 unused indexes will only use 8 bytes each
• Additional space can be saved by not storing the employee id in the object (if there are no collisions)
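
The arithmetic behind this comparison can be checked with a short sketch. The 100-byte object size and the 8-byte empty list come from the slides; the 4-byte next pointer in each node (nodeOverheadBytes) and both function names are assumptions made for illustration.

```cpp
#include <cassert>

// Bytes used by a plain array that holds one full object at every index.
long plainArrayBytes( long tableSize, long objectBytes )
{
    assert( tableSize >= 0 && objectBytes >= 0 );
    return tableSize * objectBytes;
}

// Bytes used by an array of linked lists: one list header per index plus one
// node per stored object. nodeOverheadBytes is an assumed per-node cost.
long chainedTableBytes( long tableSize, long usedCount, long objectBytes,
                        long headerBytes = 8, long nodeOverheadBytes = 4 )
{
    assert( tableSize >= 0 && usedCount >= 0 && objectBytes >= 0 );
    return tableSize * headerBytes
         + usedCount * ( objectBytes + nodeOverheadBytes );
}
```

For 10000 indexes and 300 employees, the plain array takes 1,000,000 bytes, while the table of linked lists takes 10000 × 8 + 300 × 104 = 111,200 bytes under these assumptions.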

Ideal Hash Tables (cont.)
• Can we have a hash table without any collisions and without any empty linked lists?
• Sometimes. Consider 300 employees with ids from 0 to 299. We can make a hash table of size 300 and use h( k ) = k
• LinkedList objects wouldn't be necessary and, in fact, would waste space
• It would also not be necessary to store the employee id in the object
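
A table of this kind is just an array indexed by the key. The following is a minimal sketch, with std::vector standing in for the course's Array class; the names Slot and DirectTable and the inUse flag are assumptions (the flag is only there so the sketch also behaves sensibly before every id has been inserted).

```cpp
#include <cassert>
#include <string>
#include <vector>

// One slot per possible id. The id itself is not stored: it is the index.
struct Slot {
    std::string name;
    bool inUse = false;   // assumed helper flag, not mentioned in the slides
};

class DirectTable
{
public:
    explicit DirectTable( int size )
    {
        assert( size > 0 );
        slots.resize( size );
    }

    // h( k ) = k: the id is used directly as the index, so no chain is needed.
    bool insert( int id, const std::string & name )
    {
        if ( !inRange( id ) )
            return false;
        slots[ id ].name = name;
        slots[ id ].inUse = true;
        return true;
    }

    bool retrieve( int id, std::string & name ) const
    {
        if ( !inRange( id ) || !slots[ id ].inUse )
            return false;
        name = slots[ id ].name;
        return true;
    }

private:
    bool inRange( int id ) const
    {
        return id >= 0 && id < static_cast<int>( slots.size( ) );
    }

    std::vector<Slot> slots;
};
```

A client would create DirectTable table( 300 ) and call table.insert( 299, "Ada" ); an id of 300 or more is rejected because it falls outside the array.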

Implementing a Hash Table
• We'll implement a HashTable with linked lists (chaining)
    • Without chaining, a hash table can become full
• If the client has the ideal hash table mentioned on the previous slide, the client would be better off just using an Array for the hash table

Implementing a Hash Function
• We shouldn't write the hash function
• The client should write the hash function that the client would like to use
• Then, the client should pass that hash function as a parameter into the constructor of the HashTable class
• This can be implemented with function pointers

Function Pointers
• A function pointer is a pointer that holds the address of a function
• The function can be called using the function pointer instead of the function name

Function Pointers (cont.)
• Example of a function pointer declaration:

    float (*funcptr) (string);

• funcptr is the name of the pointer; the name can be chosen like any other pointer name
• The parentheses around *funcptr are necessary; without them, the line would declare a function that returns a pointer to a float
• The return type of the function that funcptr can point to is given first (in this case, the return type is a float)
• The parameter list of a function that funcptr can point to is given last; in this case, there is only one parameter, of string type
• What would a function pointer declaration look like if the function it can point to has a void return type and accepts two integer parameters?

Function Pointers (cont.)

    void (*fp) (int, int);

    // A function that fp can point to
    void foo( int a, int b )
    {
        cout << "a is: " << a << endl;
        cout << "b is: " << b << endl;
    }

Assigning the Address of a Function to a Function Pointer

    void (*fp) (int, int);

    void foo( int a, int b )
    {
        cout << "a is: " << a << endl;
        cout << "b is: " << b << endl;
    }

    fp = foo;    // The address of foo is assigned to fp like this

Calling a Function by Using a Function Pointer
• Once the address of foo has been assigned to fp, the foo function can be called using fp like this:

    void (*fp) (int, int);

    void foo( int a, int b )
    {
        cout << "a is: " << a << endl;
        cout << "b is: " << b << endl;
    }

    fp( 5, 10 );
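
The pieces from the function pointer slides can be put together in one compilable sketch. The functions add, multiply, and apply are hypothetical names chosen for illustration; apply shows a function pointer used as a parameter, which is the form the HashTable constructor relies on.

```cpp
#include <cassert>

// Two functions with the same signature: int( int, int ).
int add( int a, int b ) { return a + b; }
int multiply( int a, int b ) { return a * b; }

// Calls whichever function `op` points to.
// Passing a null pointer is treated as a caller error.
int apply( int (*op)( int, int ), int a, int b )
{
    assert( op != nullptr );
    return op( a, b );      // equivalent to (*op)( a, b )
}
```

A caller passes the function by name, for example apply( add, 5, 10 ), which returns 15. A local pointer works the same way: after int (*fp)( int, int ) = multiply; the call fp( 5, 10 ) returns 50.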

Design of the HashTable Constructor
• Once the client designs the hash function, the client passes the name of the hash function as a parameter into the HashTable constructor
• The HashTable constructor accepts the parameter using a function pointer in this parameter location
• The address of the function is saved to a function pointer in the private section
• Then, the hash table can call the hash function that the client made by using the function pointer

HashTable.h

    #include "LinkedList.h"
    #include "Array.h"

    template <class DataType>
    class HashTable
    {
    public:
        HashTable( int (*hf)(const DataType &), int s );
        bool insert( const DataType & newObject );
        bool retrieve( DataType & retrieved );
        bool remove( DataType & removed );
        bool update( DataType & updateObject );
        void makeEmpty( );
    private:
        Array< LinkedList<DataType> > table;   // the space between the two > characters is necessary
        int (*hashfunc)(const DataType &);
    };

    #include "HashTable.cpp"

Clientele
• The LinkedList class is being used in the HashTable class, along with the Array class
• Note that when one writes a class, the clientele extends beyond the main programmers who might use the class
• The clientele extends to people who write other classes

HashTable Constructor

    template <class DataType>
    HashTable<DataType>::HashTable(
        int (*hf)(const DataType &), int s )
        : table( s )
    {
        hashfunc = hf;
    }

• The call table( s ) to the Array constructor creates an Array of LinkedLists of type DataType
• The DataType for Array is LinkedList<DataType> (DataType in Array is different from DataType in HashTable)
• In the Array constructor, an Array of size s is made, having LinkedList elements; when this array is created, the LinkedList constructor is called for each element

insert

    template <class DataType>
    bool HashTable<DataType>::insert(
        const DataType & newObject )
    {
        int location = hashfunc( newObject );
        if ( location < 0 || location >= table.length( ) )
            return false;
        // Keep in mind that table[ location ] is a LinkedList object.
        table[ location ].insert( newObject );
        return true;
    }

retrieve

    template <class DataType>
    bool HashTable<DataType>::retrieve(
        DataType & retrieved )
    {
        int location = hashfunc( retrieved );
        if ( location < 0 || location >= table.length( ) )
            return false;
        if ( !table[ location ].retrieve( retrieved ) )
            return false;
        return true;
    }

remove

    template <class DataType>
    bool HashTable<DataType>::remove(
        DataType & removed )
    {
        int location = hashfunc( removed );
        if ( location < 0 || location >= table.length( ) )
            return false;
        if ( !table[ location ].remove( removed ) )
            return false;
        return true;
    }

update

    template <class DataType>
    bool HashTable<DataType>::update(
        DataType & updateObject )
    {
        int location = hashfunc( updateObject );
        if ( location < 0 || location >= table.length( ) )
            return false;
        if ( !table[ location ].find( updateObject ) )
            return false;
        table[ location ].replace( updateObject );
        return true;
    }

makeEmpty

    template <class DataType>
    void HashTable<DataType>::makeEmpty( )
    {
        for ( int i = 0; i < table.length( ); i++ )
            table[ i ].makeEmpty( );
    }
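
Since the course's Array and LinkedList classes are not shown in these slides, the same design can be tried out with standard containers. The sketch below is an assumption-laden stand-in, not the HashTable class itself: SimpleHashTable, Animal, hashAnimal, and TABLE_SIZE are hypothetical names, std::vector replaces Array, and std::list replaces LinkedList. It keeps the two ideas from the slides: the client's hash function is stored as a function pointer, and an out-of-range index makes the operation fail.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <string>
#include <vector>

template <class DataType>
class SimpleHashTable
{
public:
    SimpleHashTable( int (*hf)( const DataType & ), int s ) : hashfunc( hf )
    {
        assert( hf != nullptr && s > 0 );
        table.resize( s );              // one empty chain per index
    }

    bool insert( const DataType & newObject )
    {
        int location = hashfunc( newObject );
        if ( !inRange( location ) )
            return false;
        table[ location ].push_front( newObject );
        return true;
    }

    // On success, copies the stored object whose key matches into `retrieved`.
    bool retrieve( DataType & retrieved ) const
    {
        int location = hashfunc( retrieved );
        if ( !inRange( location ) )
            return false;
        const std::list<DataType> & chain = table[ location ];
        auto it = std::find( chain.begin( ), chain.end( ), retrieved );
        if ( it == chain.end( ) )
            return false;
        retrieved = *it;
        return true;
    }

    // Removes the first stored object whose key matches `target`.
    bool remove( const DataType & target )
    {
        int location = hashfunc( target );
        if ( !inRange( location ) )
            return false;
        std::list<DataType> & chain = table[ location ];
        auto it = std::find( chain.begin( ), chain.end( ), target );
        if ( it == chain.end( ) )
            return false;
        chain.erase( it );
        return true;
    }

private:
    bool inRange( int location ) const
    {
        return location >= 0 && location < static_cast<int>( table.size( ) );
    }

    int (*hashfunc)( const DataType & );
    std::vector< std::list<DataType> > table;
};

// Hypothetical client type: name is the key, so == compares names only.
struct Animal {
    std::string name;
    int count;
    bool operator ==( const Animal & r ) const { return name == r.name; }
};

const int TABLE_SIZE = 97;

// Client-written hash function: sums the first three characters of the key.
int hashAnimal( const Animal & a )
{
    int sum = 0;
    for ( int i = 0; i < 3 && i < int( a.name.length( ) ); i++ )
        sum += static_cast<unsigned char>( a.name[ i ] );
    return sum % TABLE_SIZE;
}
```

A client would write SimpleHashTable<Animal> ht( hashAnimal, TABLE_SIZE ); and then insert an Animal. To look one up, the client fills in only the name of a second Animal and passes it to retrieve, which overwrites it with the stored object.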

Using HashTable

    #include <iostream>
    #include <string>
    #include "HashTable.h"

    using namespace std;

    struct MyStruct {
        string str;    // str will be the key
        int num;
        bool operator ==( const MyStruct & r ) const { return str == r.str; }
    };

• It is necessary to overload the == operator for the LinkedList functions
• In the actual code, a comment is placed above HashTable, telling the client that this is needed and what is required

Using HashTable (cont.)

    const int SIZE1 = 97, SIZE2 = 199;

    int hash1( const MyStruct & obj );
    int hash2( const MyStruct & obj );

    int main( )
    {
        HashTable<MyStruct> ht1( hash1, SIZE1 ),
                            ht2( hash2, SIZE2 );

Using HashTable (cont.)

        MyStruct myobj;

        myobj.str = "elephant";
        myobj.num = 25;
        ht1.insert( myobj );

        myobj.str = "giraffe";
        myobj.num = 50;
        ht2.insert( myobj );

        // … other code using the hash tables …

Using HashTable (cont.)

        return 0;
    }

    int hash1( const MyStruct & obj )
    {
        int sum = 0;
        for ( int i = 0; i < 3 && i < int( obj.str.length( ) ); i++ )
            sum += obj.str[ i ];
        return sum % SIZE1;
    }

