Hashing.

Motivating Applications: Large collections of data. Datasets are dynamic (insert, delete). Goal: efficient searching, insertion, and deletion. Hashing is ONLY applicable for exact-match searching.

Direct Address Tables: If the key domain is U, create an array T of size |U|. For each key k, store the object in T[k]. Supports insertion, deletion, and searching in O(1).

Direct Address Tables: Alg.: DIRECT-ADDRESS-SEARCH(T, k): return T[k]. Alg.: DIRECT-ADDRESS-INSERT(T, x): T[key[x]] ← x. Alg.: DIRECT-ADDRESS-DELETE(T, x): T[key[x]] ← NIL. Running time for these operations: O(1). Drawbacks: >> If U is large, e.g., the domain of integers, then T is large (sometimes infeasible). >> Limited to integer keys, and duplicate keys are not supported. The solution is to use hash tables.
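
The three direct-address operations above can be sketched in Python as follows. This is a minimal illustration, assuming integer keys in the range 0 to universe_size - 1; the class name, method names, and universe_size parameter are hypothetical and do not come from the slides.

```python
class DirectAddressTable:
    """Direct-address table for integer keys in range(universe_size)."""

    def __init__(self, universe_size):
        # One slot per possible key; None plays the role of NIL.
        self.slots = [None] * universe_size

    def search(self, key):
        # DIRECT-ADDRESS-SEARCH: return T[k]
        return self.slots[key]

    def insert(self, key, value):
        # DIRECT-ADDRESS-INSERT: T[key[x]] <- x
        self.slots[key] = value

    def delete(self, key):
        # DIRECT-ADDRESS-DELETE: T[key[x]] <- NIL
        self.slots[key] = None


table = DirectAddressTable(universe_size=10)
table.insert(3, "record for key 3")
print(table.search(3))  # record for key 3
table.delete(3)
print(table.search(3))  # None
```

Each operation is a single array access, which is where the O(1) bound comes from. The cost is the array itself: it needs one slot for every possible key, whether or not that key is ever stored.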

Direct Access Tables: Example. U is the domain (universe) of keys; K is the set of keys actually stored.

Hashing: A data structure that maps values from a certain domain or range to another domain or range. [Figure: a hash function maps a domain of string values to a domain of integer values (3, 15, 20, 55).]

Hashing: A data structure that maps values from a certain domain or range to another domain or range. [Figure: a hash function maps student IDs in the domain [950,000 … 960,000] to numbers in the range [0 … 10,000].]

Hash Tables: When |K| is much smaller than |U|, a hash table requires much less space than a direct-address table. It can reduce storage requirements to |K|. Search time can still be O(1), but in the average case, not the worst case.

Hash Tables: Main Idea. Use a hash function h to compute the slot for each key k, and store the element in slot h(k). Maintain a hash table of size m: T[0…m-1]. A hash function h transforms a key into an index in the hash table T[0…m-1]: h : U → {0, 1, . . . , m - 1}. We say that k hashes to slot h(k).

Hash Tables: Main Idea. [Figure: keys k1 … k5 from K (actual keys), a subset of U (universe of keys), are mapped by h into a hash table of size m with slots up to m - 1; h(k2) = h(k5).] >> m is much smaller than |U| (m << |U|). >> m can be even smaller than |K|.

Example: Back to the example of 100 students, each with a 9-digit SSN. All we need is a hash table of size 100.

What About Collisions? [Figure: keys k1 … k5 from K (actual keys), a subset of U (universe of keys), are mapped by h into table slots up to m - 1; k2 and k5 collide because h(k2) = h(k5).] A collision means that two or more keys go to the same slot.
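
To make collisions concrete, here is a small sketch that hashes a few 9-digit identifiers into a table of size 100 and reports every slot that receives two or more keys. The hash function key mod m, the helper names, and the sample identifiers are all made up for illustration; the slides do not fix a particular function at this point.

```python
from collections import defaultdict


def h(key, m):
    """Hypothetical hash function: key mod m."""
    return key % m


def find_collisions(keys, m):
    """Group keys by slot and return only the slots holding two or more keys."""
    slots = defaultdict(list)
    for key in keys:
        slots[h(key, m)].append(key)
    return {slot: ks for slot, ks in slots.items() if len(ks) > 1}


# Made-up 9-digit identifiers, for illustration only.
keys = [123456789, 987654321, 555000189, 111222333, 444555621]
print(find_collisions(keys, m=100))
# {89: [123456789, 555000189], 21: [987654321, 444555621]}
```

Even with only five keys and 100 slots, two pairs share a slot here, so a hash table needs a collision-handling strategy regardless of how much spare room it has.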

Handling Collisions: There are many ways to handle collisions: chaining, and open addressing (linear probing, quadratic probing, double hashing).

Chaining: Main Idea. Put all elements that hash to the same slot into a linked list (chain). Slot j contains a pointer to the head of the list of all elements that hash to j.

Chaining - Discussion. Choosing the size of the hash table: small enough not to waste space, large enough that the lists remain short; typically 10%-20% of the total number of elements. How should we keep the lists: ordered or not? Usually each list is an unsorted linked list.

Insertion in Hash Tables. Alg.: CHAINED-HASH-INSERT(T, x): insert x at the head of list T[h(key[x])]. Worst-case running time is O(1) when the list is not searched for duplicates. May or may not allow duplicates, depending on the application.

Deletion in Hash Tables. Alg.: CHAINED-HASH-DELETE(T, x): delete x from the list T[h(key[x])]. Need to find the element to be deleted. Worst-case running time: deletion depends on searching the corresponding list.

Searching in Hash Tables. Alg.: CHAINED-HASH-SEARCH(T, k): search for an element with key k in list T[h(k)]. Running time is proportional to the length of the list of elements in slot h(k). What are the worst case and the average case?
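
The chained insert, delete, and search operations can be sketched together as one small class. This is only an illustration under stated assumptions: keys are integers, the hash function is key mod m, and delete takes a key rather than a pointer to the element x as in the pseudocode. The names Node and ChainedHashTable are hypothetical.

```python
class Node:
    """One element of a chain: a key, its value, and a link to the next node."""

    def __init__(self, key, value, next_node=None):
        self.key = key
        self.value = value
        self.next = next_node


class ChainedHashTable:
    def __init__(self, m):
        self.m = m
        self.heads = [None] * m  # heads[j] points to the first node of chain j

    def _h(self, key):
        return key % self.m  # assumed hash function; requires integer keys

    def insert(self, key, value):
        # CHAINED-HASH-INSERT: new node becomes the head of its chain, O(1).
        # No duplicate check is made, so the same key may appear twice.
        j = self._h(key)
        self.heads[j] = Node(key, value, self.heads[j])

    def search(self, key):
        # CHAINED-HASH-SEARCH: walk the chain in slot h(k).
        node = self.heads[self._h(key)]
        while node is not None:
            if node.key == key:
                return node.value
            node = node.next
        return None

    def delete(self, key):
        # CHAINED-HASH-DELETE: find the node first, then unlink it.
        # Returns True if a node was removed, False if the key was absent.
        j = self._h(key)
        prev, node = None, self.heads[j]
        while node is not None:
            if node.key == key:
                if prev is None:
                    self.heads[j] = node.next
                else:
                    prev.next = node.next
                return True
            prev, node = node, node.next
        return False


table = ChainedHashTable(m=7)
table.insert(10, "a")
table.insert(17, "b")    # 10 and 17 both hash to slot 3
print(table.search(17))  # b
print(table.delete(10))  # True
print(table.search(10))  # None
```

Insertion touches only the head pointer, while search and delete both walk the chain, so their cost grows with the chain length.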

Analysis of Hashing with Chaining: Worst Case. [Figure: table T with slots up to m - 1 and a single chain.] All keys go to only one chain. Chain size is O(n). Searching is O(n) + time to apply h(k).

Analysis of Hashing with Chaining: Average Case. [Figure: table T with slots up to m - 1 and its chains.] With a good hash function and a uniform distribution of keys, any given element is equally likely to hash into any of the m slots, so all chains will have similar sizes. Let n be the total number of keys and m the hash table size. Average chain size: O(n/m). Average search time: O(n/m), the common case.

Analysis of Hashing with Chaining: Average Case. If m (# of slots) is proportional to n (# of keys), then n = O(m), so n/m = O(1). Searching therefore takes constant time on average.
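
The n/m figure can be checked with a short simulation. The sketch below models simple uniform hashing by assigning each of n keys to a slot chosen uniformly at random, which is an idealization rather than a real hash function; the function name, the seed, and the values of n and m are arbitrary choices for illustration.

```python
import random


def chain_lengths(n, m, seed=0):
    """Throw n keys into m slots uniformly at random; return each chain's length."""
    rng = random.Random(seed)
    lengths = [0] * m
    for _ in range(n):
        lengths[rng.randrange(m)] += 1
    return lengths


n, m = 10_000, 2_500
lengths = chain_lengths(n, m)
print(sum(lengths) / m)  # average chain length, always exactly n/m = 4.0
print(max(lengths))      # longest chain; varies with the seed
```

The average is n/m by construction, since every key lands in exactly one chain. The more informative number is the longest chain, which under uniform hashing stays far below the worst case of n.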

Hash Functions

Hash Functions: A hash function transforms a key (k) into a table address (0…m-1). What makes a good hash function? (1) Easy to compute. (2) Approximates a random function: for every input, every output is equally likely (simple uniform hashing). (3) Reduces the number of collisions.

Hash Functions: Goal: map a key k into one of the m slots in the hash table. Make the table size (m) a prime number; this avoids even numbers and powers of 2. Common function: h(k) = F(k) mod m, where F(k) is some function or operation on k (usually generating an integer). The output of the "mod" is a number in [0…m-1].
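
One way to see why a prime table size helps is to hash keys that share a common factor with a power-of-2 table size. In the sketch below, the keys are multiples of 8 and F(k) is taken to be k itself; the helper name slots_used and the sizes 16 and 17 are illustrative choices, not values from the slides.

```python
def slots_used(keys, m):
    """Count how many distinct slots the keys occupy under h(k) = k mod m."""
    return len({k % m for k in keys})


# 100 keys that are all multiples of 8.
keys = [8 * i for i in range(100)]
print(slots_used(keys, 16))  # 2
print(slots_used(keys, 17))  # 17
```

With m = 16, every key lands in slot 0 or slot 8, so 100 keys pile into two chains. With the prime m = 17, the same keys spread over all 17 slots, because 8 and 17 share no common factor.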

Examples of Hash Functions. Collection of images: F(k) = sum of the pixel colors; h(k) = F(k) mod m. Collection of strings: F(k) = sum of the ASCII values; h(k) = F(k) mod m. Collection of numbers: F(k) = k; h(k) = F(k) mod m.
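
The three example functions can be written out as below. This is a sketch under assumptions: an image is represented as a flat list of integer pixel values, strings are treated as sequences of character codes (which equal the ASCII values for ASCII text), and m = 101 is an arbitrary prime. The function names are hypothetical.

```python
def hash_number(k, m):
    # F(k) = k
    return k % m


def hash_string(s, m):
    # F(k) = sum of the character codes (ASCII values for ASCII text)
    return sum(ord(ch) for ch in s) % m


def hash_image(pixels, m):
    # F(k) = sum of the pixel values; pixels is assumed to be a flat list of ints
    return sum(pixels) % m


m = 101  # an arbitrary prime table size
print(hash_number(950123, m))             # 16
print(hash_string("abc", m))              # 92
print(hash_string("cab", m))              # 92
print(hash_image([10, 20, 30, 255], m))   # 12
```

The two string results show a limit of the sum-based F(k): it ignores character order, so any two anagrams always collide.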