Look-up problem: given an IP address, did we see this IP address before?

Presentation transcript:

Look-up problem: given an IP address, did we see this IP address before?

Hashing + chaining
IP address → hash function → index
Use the hashed IP address as an index into a table; each table entry holds a linked list of the IP addresses that map to that index.
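The look-up with chaining can be sketched as follows. This is a minimal illustration, not the presenter's code: the class name `ChainedHashTable`, the method `seen_before`, the table size 257, and the choice of "address as a 32-bit integer mod n" as hash function are all assumptions, and a Python list stands in for each linked list.

```python
import ipaddress


class ChainedHashTable:
    """Hash table with chaining: each slot holds the keys that hash to it."""

    def __init__(self, n=257):
        self.n = n                            # number of table entries (assumed)
        self.slots = [[] for _ in range(n)]   # one chain per entry

    def _index(self, ip):
        # Interpret the dotted-quad address as a 32-bit integer, reduce mod n.
        return int(ipaddress.IPv4Address(ip)) % self.n

    def seen_before(self, ip):
        """Return True if ip was recorded earlier; otherwise record it."""
        chain = self.slots[self._index(ip)]
        if ip in chain:          # walk the chain for this slot
            return True
        chain.append(ip)
        return False


table = ChainedHashTable()
first = table.seen_before("10.0.0.1")    # not seen yet, gets recorded
second = table.seen_before("10.0.0.1")   # now found in its chain
```

The cost of a look-up is the length of one chain, which is why the choice of hash function matters.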

How to choose a hash function?
x ∈ [0,1]: x ↦ ⌊x·n⌋ (depends on the distribution of the data)
interpret x as a number: x ↦ x mod n
IP addresses: n = 256 is bad (x mod 256 is just the last byte of the address); n = 257 good?
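A small experiment shows why the modulus matters for x ↦ x mod n. The address list below is hypothetical data chosen to be a bad case (256 hosts that all end in .1), and `bucket_counts` is an illustrative helper name.

```python
import ipaddress
from collections import Counter


def bucket_counts(addresses, n):
    """Count how many addresses land in each bucket under x -> x mod n."""
    return Counter(int(ipaddress.IPv4Address(a)) % n for a in addresses)


# Hypothetical input: 10.0.0.1, 10.0.1.1, ..., 10.0.255.1
addresses = [f"10.0.{i}.1" for i in range(256)]

used_256 = len(bucket_counts(addresses, 256))  # only the last byte survives
used_257 = len(bucket_counts(addresses, 257))  # all four bytes influence the index

print(used_256, used_257)  # prints "1 256"
```

With n = 256 every address falls into the same bucket; with n = 257 these particular addresses spread over 256 distinct buckets. A fixed modulus can still be defeated by other inputs, which motivates choosing the hash function at random.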

Universal hash functions
Choose a hash function "randomly".
n = number of entries in the hash table
U = the universe
h: U → {0,...,n-1} a hash function

Universal hash functions
Choose a hash function "randomly".
A set of hash functions H is universal if, for all x, y ∈ U with x ≠ y and a random h ∈ H,
P( h(x) = h(y) ) ≤ 1/n
n = number of entries in the hash table
U = the universe
h: U → {0,...,n-1} a hash function

Universal hash functions
A set of hash functions H is universal if, for all x, y ∈ U with x ≠ y and a random h ∈ H,
P( h(x) = h(y) ) ≤ 1/n
For IP addresses: choose a_1, a_2, a_3, a_4 ∈ {0,1,...,256}
(x_1, x_2, x_3, x_4) ↦ (a_1 x_1 + a_2 x_2 + a_3 x_3 + a_4 x_4) mod 257
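The family for IP addresses can be sketched as below: each call draws random coefficients a_1..a_4 from {0,...,256} and returns the corresponding hash function. The function names `random_ip_hash` and `collision_rate`, the trial count, and the seed are illustrative assumptions; the second function only estimates the collision probability empirically, it does not prove the bound.

```python
import random

P = 257  # table size; a prime just above the largest byte value 255


def random_ip_hash(rng=random):
    """Draw h(x1,x2,x3,x4) = (a1*x1 + a2*x2 + a3*x3 + a4*x4) mod 257."""
    a = [rng.randrange(P) for _ in range(4)]   # each a_i in {0,...,256}

    def h(ip):
        octets = [int(part) for part in ip.split(".")]
        return sum(ai * xi for ai, xi in zip(a, octets)) % P

    return h


def collision_rate(x, y, trials=20000, seed=0):
    """Estimate P(h(x) = h(y)) over random draws of h from the family."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        h = random_ip_hash(rng)
        hits += h(x) == h(y)
    return hits / trials


rate = collision_rate("10.0.0.1", "10.0.0.2")  # should be close to 1/257
```

For two distinct addresses the estimate should come out near 1/257 ≈ 0.0039, regardless of which two addresses are picked.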

Perfect hashing
Goal: worst-case O(1) search, space used O(m), for a static set of m elements

Perfect hashing
Goal: worst-case O(1) search, space used O(m), for a static set of m elements
n = m^2, i.e., space used Θ(m^2)
H = family of universal hash functions
∃ hash function h ∈ H with no collision

Perfect hashing
Goal: worst-case O(1) search, space used O(m)
n = m
H = family of universal hash functions
∃ h ∈ H such that Σ x_i^2 = O(m),
where x_1,...,x_n are the numbers of elements that map to 1,2,...,n

Perfect hashing
Goal: worst-case O(1) search, space used O(m)
n = m
H = family of universal hash functions
∃ h ∈ H such that Σ x_i^2 = O(m),
where x_1,...,x_n are the numbers of elements that map to 1,2,...,n
Entry i gets a secondary hash table of size x_i^2.
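The two-level construction can be sketched as follows, for a static set of distinct non-negative integer keys (for example IP addresses as 32-bit integers). Several things here are assumptions not taken from the slides: the universal family h(x) = ((a·x + b) mod P) mod n with a Mersenne prime P, the concrete threshold 4m standing in for "O(m)", and the names `draw_hash`, `build_perfect`, and `contains`. Both levels simply redraw a hash function until the required property holds.

```python
import random

P = (1 << 61) - 1  # a prime larger than any 32-bit key (assumed key range)


def draw_hash(n, rng):
    """Draw h(x) = ((a*x + b) mod P) mod n with random a, b."""
    a, b = rng.randrange(1, P), rng.randrange(P)
    return lambda x: ((a * x + b) % P) % n


def build_perfect(keys, seed=0):
    """Build a two-level table for a non-empty static set of integer keys."""
    rng = random.Random(seed)
    keys = list(set(keys))          # the construction needs distinct keys
    m = len(keys)
    # Level 1: n = m entries; redraw h until sum of x_i^2 <= 4m.
    while True:
        h = draw_hash(m, rng)
        buckets = [[] for _ in range(m)]
        for key in keys:
            buckets[h(key)].append(key)
        if sum(len(bucket) ** 2 for bucket in buckets) <= 4 * m:
            break
    # Level 2: entry i gets a collision-free table of size x_i^2.
    secondary = []
    for bucket in buckets:
        size = len(bucket) ** 2
        while True:
            g = draw_hash(size, rng) if size else None
            table = [None] * size
            collision = False
            for key in bucket:
                j = g(key)
                if table[j] is not None:
                    collision = True   # redraw g for this entry
                    break
                table[j] = key
            if not collision:
                break
        secondary.append((g, table))
    return h, secondary


def contains(structure, key):
    """Worst-case O(1) search: two hash evaluations and one comparison."""
    h, secondary = structure
    g, table = secondary[h(key)]
    return bool(table) and table[g(key)] == key


keys = [i * 2654435761 % 2**32 for i in range(1000)]   # 1000 distinct test keys
structure = build_perfect(keys)
total_cells = sum(len(table) for _, table in structure[1])
```

Because each secondary table has size x_i^2, the total number of secondary cells equals Σ x_i^2, which the level-1 loop keeps at most 4m.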

Bloom filter
n bits of storage
Goal: store an m-element subset of IP addresses
(Figure: IP address → HASH → bit array, initially all 0)

Bloom filter - insert
n bits of storage
(Figure: IP address → HASH → the selected bits are set to 1)
INSERT(x)
  for i from 1 to k do
    A(h_i(x)) ← 1

Bloom filter – member
n bits of storage
(Figure: IP address → HASH → the selected bits are inspected)
MEMBER(x)
  for i from 1 to k do
    if A(h_i(x)) = 0 then return FALSE
  return TRUE

Bloom filter – member
MEMBER(x)
  for i from 1 to k do
    if A(h_i(x)) = 0 then return FALSE
  return TRUE
Sometimes gives a false positive answer.
Error parameter: false positive probability
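INSERT and MEMBER translate almost line for line into code. The sketch below makes some assumptions beyond the slides: the k hash functions h_1..h_k are simulated by hashing the pair (i, x) with SHA-256, the bit array A is stored as one byte per bit for readability, and the class and method names are illustrative.

```python
import hashlib


class BloomFilter:
    def __init__(self, n, k):
        self.n = n                # number of bits of storage
        self.k = k                # number of hash functions
        self.A = bytearray(n)     # all bits start at 0 (one byte per bit)

    def _positions(self, x):
        # Stand-in for h_1(x),...,h_k(x): hash the pair (i, x), reduce mod n.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{x}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n

    def insert(self, x):
        for pos in self._positions(x):
            self.A[pos] = 1

    def member(self, x):
        # FALSE as soon as one probed bit is 0, TRUE otherwise.
        return all(self.A[pos] for pos in self._positions(x))


bf = BloomFilter(n=1000, k=5)
bf.insert("10.0.0.1")
hit = bf.member("10.0.0.1")    # inserted items are always reported as members
```

An inserted item can never be missed, since all of its bits were set; an item that was never inserted can still be reported as a member if other insertions happened to set all k of its bits.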

Bloom filter – analysis
Error parameter: false positive probability
m = number of items to be stored
n = number of bits of storage
k = number of hash functions

Bloom filter – analysis
Error parameter: false positive probability
m = number of items to be stored
n = number of bits of storage
k = number of hash functions
p = fraction of the bits still 0 (not filled) after the m insertions
p ≈ e^(-km/n)

Bloom filter – analysis
Error parameter: false positive probability
m = number of items to be stored
n = number of bits of storage
k = number of hash functions
p = fraction of the bits still 0 (not filled) after the m insertions
p ≈ e^(-km/n)
false positive probability ≈ (1-p)^k

Bloom filter – analysis
Error parameter: false positive probability
m = number of items to be stored
n = number of bits of storage
k = number of hash functions
optimal k ≈ 0.7 n/m (that is, k = (n/m) ln 2)
false positive rate ≈ 0.6185^(n/m)
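The analysis can be checked numerically. The sketch below uses the standard approximation with m items and n bits: about e^(-km/n) of the bits are still 0 after all insertions, a false positive needs all k probed bits to be 1, and the minimising choice is k = (n/m) ln 2 ≈ 0.7 n/m. The function name `bloom_parameters` and the example sizes are assumptions for illustration.

```python
import math


def bloom_parameters(m, n):
    """Return (k, false positive probability) for m items in n bits."""
    k = max(1, round(math.log(2) * n / m))   # optimal k, rounded to an integer
    p = math.exp(-k * m / n)                 # fraction of bits still 0
    return k, (1 - p) ** k                   # all k probed bits must be 1


k, fp = bloom_parameters(m=1000, n=10000)    # 10 bits per stored item
```

With 10 bits per item this gives k = 7 and a false positive probability of roughly 0.8%, in line with 0.6185^10.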