Symbol Tables


Symbol Tables
Symbol tables are used by compilers to keep track of information about variables, functions, class names, type names, temporary variables, etc. Typical symbol table operations are Insert, Delete, and Search. It's a dictionary structure!

Symbol Tables
What kind of information is usually stored in a symbol table?
- type
- storage class
- size
- scope
- stack frame offset
- register
We also need a way to keep track of reserved words.
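As a rough illustration of such an entry, the sketch below models one record and uses a plain Python dictionary for the three dictionary operations. The class name `SymbolInfo`, its field names, and the sample values are assumptions made for this example; they do not come from any particular compiler.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SymbolInfo:
    """Attributes recorded for one identifier (field names are illustrative)."""
    type: str
    storage_class: str
    size: int                           # size in bytes
    scope: int                          # nesting depth of the declaring block
    frame_offset: Optional[int] = None  # offset within the stack frame, if any
    register: Optional[str] = None      # register holding the value, if any


# A dictionary keyed by identifier name supports Insert, Search, and Delete.
symbol_table = {}
symbol_table["count"] = SymbolInfo("int", "auto", size=4, scope=1, frame_offset=-8)  # Insert
entry = symbol_table.get("count")       # Search: returns None if the name is absent
del symbol_table["count"]               # Delete
```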

Symbol Tables
Where is a symbol table stored?
- Array or linked list: simple, but linear lookup time. However, we may use a sorted array for reserved words, since they are generally few and known in advance.
- Balanced tree: O(lg n) lookup time.
- Hash table: the most common implementation, with O(1) expected time for dictionary operations.
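The sorted-array idea for reserved words can be sketched with a binary search. The word list below is a made-up sample, not the keyword set of any specific language, and `is_reserved` is a hypothetical helper name.

```python
from bisect import bisect_left

# Reserved words are few and known in advance, so the array is sorted once.
RESERVED = sorted(["break", "else", "for", "if", "return", "while"])


def is_reserved(word: str) -> bool:
    """Binary search in the sorted array: O(lg n) comparisons."""
    i = bisect_left(RESERVED, word)
    return i < len(RESERVED) and RESERVED[i] == word
```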

Hashing
Hash tables:
- Use an array of size m to store elements.
- Given a key k (the identifier name), use a function h to compute the index h(k) for that key.
- Collisions are possible: two keys may hash into the same slot.
Hash functions:
- A good hash function is easy to compute and avoids collisions (by breaking up patterns in the keys and uniformly distributing the hash values).

Hashing
In the following slides:
- k is a key
- h is the hash function, and h(k) is the slot it assigns to k
- m is the size of the hash table
- n is the number of keys in the hash table

Hashing
What makes a good hash function?
- It is easy to compute.
- It minimizes collisions.
hash = to chop into small pieces (Merriam-Webster)
     = to chop up any patterns in the keys so that the results are uniformly distributed (cs311)

Hashing
When the key is a string, we generally use the ASCII values of its characters in some way. Examples for k = c_1 c_2 c_3 ... c_x:
- $h(k) = (c_1 \cdot 128^{x-1} + c_2 \cdot 128^{x-2} + \dots + c_x \cdot 128^{0}) \bmod m$
- $h(k) = (c_1 + c_2 + \dots + c_x) \bmod m$
- $h(k) = (h_1(c_1) + h_2(c_2) + \dots + h_x(c_x)) \bmod m$, where each $h_i$ is an independent hash function.
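The first two string hashes above can be sketched as follows. The function names are chosen for this example, and the code assumes ASCII keys (for other characters `ord` can exceed 127). The base-128 version uses Horner's rule, reducing mod m at each step, so intermediate values stay small.

```python
def radix_hash(key: str, m: int) -> int:
    """Treat the key as a base-128 number and reduce it mod m (Horner's rule)."""
    h = 0
    for ch in key:
        h = (h * 128 + ord(ch)) % m
    return h


def additive_hash(key: str, m: int) -> int:
    """Sum the character codes mod m; any two anagrams collide."""
    return sum(ord(ch) for ch in key) % m
```

The additive version ignores character order, so "ab" and "ba" land in the same slot, while the base-128 version separates them.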

Hash functions: Truncation
Ignore part of the key and use the remaining part directly as the index. Example: if the keys are 8-digit numbers and the hash table has 1000 entries, then the first, fourth, and eighth digits could form the index. Not a very good method: it does not distribute keys uniformly.
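A minimal sketch of the truncation example, assuming keys are non-negative integers of at most 8 digits (shorter keys are zero-padded, which is a choice made here). The function name is hypothetical.

```python
def truncation_hash(key: int) -> int:
    """Index formed from the 1st, 4th, and 8th digits of an 8-digit key."""
    digits = f"{key:08d}"
    return int(digits[0] + digits[3] + digits[7])
```

Because five of the eight digits are ignored, keys such as 62538194 and 60039994 map to the same index, which illustrates why the distribution is poor.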

Hash functions: Folding
Break up the key into parts and combine them in some way. Example: if the keys are 9-digit numbers, break up a key into three 3-digit numbers and add them up.
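A sketch of the folding example. The sum of three 3-digit numbers can be as large as 2997, so this version reduces it mod m to get a valid index; that final reduction and the default table size of 1000 are assumptions of this example, not something the slide specifies.

```python
def folding_hash(key: int, m: int = 1000) -> int:
    """Split a 9-digit key into three 3-digit parts, add them, reduce mod m."""
    digits = f"{key:09d}"
    parts = [int(digits[i:i + 3]) for i in range(0, 9, 3)]
    return sum(parts) % m
```

For the key 123456789 the parts are 123, 456, and 789, whose sum 1368 reduces to index 368.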

Hash functions: Middle square
Compute k*k and pick some digits from the resulting number. Example: given a 9-digit key k and a hash table of size 1000, pick three digits from the middle of the number k*k. Works fairly well in practice if the keys do not have many leading or trailing zeroes.
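A sketch of the middle-square example for 9-digit keys. The square is zero-padded to 18 digits and the three digits starting at position 8 (counting from 0) are taken; which exact digits count as "the middle" is a choice made for this example.

```python
def mid_square_hash(key: int) -> int:
    """Square a 9-digit key and take three digits from the middle of the result."""
    squared = f"{key * key:018d}"   # a 9-digit key squared has at most 18 digits
    return int(squared[8:11])
```

For k = 123456789, k*k = 15241578750190521, and the selected digits are 875.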

Hash functions: Division
h(k) = k mod m
- Fast.
- Not all values of m are suitable. For example, powers of 2 should be avoided, because then k mod m is just the least significant bits of k.
- Good values for m are prime numbers.
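The problem with powers of 2 can be seen in a small sketch. The sample keys are made up for this example: they differ only in their high-order bits, so a table of size 16 sends all of them to the same slot, while a prime size spreads them out.

```python
def division_hash(key: int, m: int) -> int:
    """Division method: the remainder of the key divided by the table size."""
    return key % m


# Three keys whose low four bits are all zero.
keys = [0b1011_0000, 0b0110_0000, 0b1111_0000]

power_of_two_slots = [division_hash(k, 16) for k in keys]  # only the low 4 bits matter
prime_slots = [division_hash(k, 13) for k in keys]         # high bits influence the result
```

Here `power_of_two_slots` is `[0, 0, 0]` and `prime_slots` is `[7, 5, 6]`.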

Hash functions: Multiplication
$h(k) = \lfloor m \, (k c - \lfloor k c \rfloor) \rfloor$, $0 < c < 1$
In English:
- Multiply the key k by a constant c, 0 < c < 1.
- Take the fractional part of k·c.
- Multiply that by m.
- Take the floor of the result.
The value of m does not make a difference. Some values of c work better than others. A good value is

Hash functions: Multiplication
Example: suppose the size of the table, m, is 1301.
- For k = 1234, h(k) = 850
- For k = 1235, h(k) = 353
- For k = 1236, h(k) = 1157
- For k = 1237, h(k) = 660
- For k = 1238, h(k) = 164
- For k = 1239, h(k) = 968
- For k = 1240, h(k) = 471
Nice distribution!
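The multiplication method can be sketched as below. The constant used here, c = (√5 − 1)/2 ≈ 0.618, is an assumption: it is a commonly suggested choice, and with m = 1301 it gives 850, 353, 1157, 660, 164, 968, and 471 for the keys 1234 through 1240. The function name and the use of floating-point arithmetic are also choices made for this example.

```python
import math

# Assumed constant: the fractional part of the golden ratio.
C = (math.sqrt(5) - 1) / 2


def multiplication_hash(key: int, m: int, c: float = C) -> int:
    """Multiplication method: floor(m * fractional_part(key * c))."""
    frac = (key * c) % 1.0          # fractional part of key * c (key >= 0)
    return math.floor(m * frac)


slots = [multiplication_hash(k, 1301) for k in range(1234, 1241)]
```

Consecutive keys end up far apart in the table, which is the behavior the example is meant to show.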

Hash functions: Universal Hashing
Worst-case scenario: the chosen keys all hash to the same slot. This can be avoided if the hash function is not fixed:
- Start with a collection of hash functions.
- Select one at random and use that.
Good performance on average: the probability that the randomly chosen hash function exhibits the worst-case behavior is very low.
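One way to sketch "a collection of hash functions" is the family h(k) = ((a·k + b) mod p) mod m with p prime and a, b chosen at random. This particular family, the prime 2^31 − 1, and the helper name `make_universal_hash` are assumptions of this example; the slide does not say which collection it has in mind. The sketch also assumes integer keys smaller than p.

```python
import random

P = 2_147_483_647  # the prime 2**31 - 1; keys are assumed to be smaller than this


def make_universal_hash(m: int, rng=random):
    """Pick one function at random from the family ((a*k + b) mod P) mod m."""
    a = rng.randrange(1, P)   # a is drawn from 1 .. P-1
    b = rng.randrange(0, P)   # b is drawn from 0 .. P-1

    def h(key: int) -> int:
        return ((a * key + b) % P) % m

    return h


# The function is selected once, when the table is created, and then reused.
h = make_universal_hash(1301)
```

Because a and b are chosen at run time, an adversary who knows the family still cannot pick keys in advance that are guaranteed to collide.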

Load factor
Given a hash table of size m and n elements stored in it, we define the load factor of the table as α = n/m. The load factor gives us an indication of how full the table is. The possible values of the load factor depend on the method we use for resolving collisions.

Resolving collisions: Chaining
Put all the elements that collide in a chain (list) attached to the slot. The hash table is an array of linked lists. The load factor indicates the average number of elements stored in a chain; it could be less than, equal to, or larger than 1. (Chaining is also known as closed addressing.)

Resolving collisions: Chaining
- Insert/Delete/Lookup take expected O(1) time. This assumes that the chains are kept short: if the chains start becoming too long, the table must be enlarged and all the keys rehashed.
- Keep the lists doubly linked to facilitate deletions.
- The worst-case lookup time is linear.
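A compact sketch of a chained table with rehashing. Several details are assumptions made for brevity: chains are Python lists rather than doubly linked lists, Python's built-in `hash` stands in for h, and the table doubles whenever the load factor exceeds `max_load`. The class and method names are hypothetical.

```python
class ChainedHashTable:
    """Hash table with chaining; each slot holds a list of (key, value) pairs."""

    def __init__(self, m: int = 8, max_load: float = 1.0):
        self.slots = [[] for _ in range(m)]
        self.n = 0                      # number of stored keys
        self.max_load = max_load        # load factor that triggers a rehash

    def _chain(self, key):
        return self.slots[hash(key) % len(self.slots)]

    def search(self, key):
        for k, v in self._chain(key):
            if k == key:
                return v
        return None

    def insert(self, key, value):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:                # key already present: overwrite its value
                chain[i] = (key, value)
                return
        chain.append((key, value))
        self.n += 1
        if self.n / len(self.slots) > self.max_load:
            self._rehash(2 * len(self.slots))

    def delete(self, key) -> bool:
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                self.n -= 1
                return True
        return False

    def _rehash(self, new_m: int):
        """Enlarge the table and reinsert every key into its new slot."""
        old = self.slots
        self.slots = [[] for _ in range(new_m)]
        for chain in old:
            for k, v in chain:
                self.slots[hash(k) % new_m].append((k, v))
```

Doubling keeps the load factor bounded, so the chains stay short on average even as the number of keys grows.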

Resolving collisions: Chaining
Assumption: simple uniform hashing, i.e., any given key is equally likely to hash into any of the m slots.
Analysis of unsuccessful search: the average time to search unsuccessfully for key k equals the average time to search to the end of a chain. The average length of a chain is α. Total (average) time required: Θ(1 + α).

Resolving collisions: Chaining
Analysis of successful search: the expected number e of elements examined during a successful search for key k is one more than the expected number of elements examined when k was inserted. (It makes no difference whether we insert at the beginning or the end of the list.) Take the average, over the n items in the table, of 1 plus the expected length of the chain to which the i-th element was added.
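The average described above can be written out explicitly. This is a sketch under the stated assumptions: simple uniform hashing, with the i-th element added to a chain whose expected length at that moment is (i − 1)/m.

```latex
\begin{align*}
e &= \frac{1}{n}\sum_{i=1}^{n}\left(1 + \frac{i-1}{m}\right)
   = 1 + \frac{1}{nm}\sum_{i=1}^{n}(i-1) \\
  &= 1 + \frac{1}{nm}\cdot\frac{n(n-1)}{2}
   = 1 + \frac{n-1}{2m}
   = 1 + \frac{\alpha}{2} - \frac{1}{2m}
\end{align*}
```

Including the constant cost of computing h(k) does not change the order of growth.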

Resolving collisions: Chaining
Total time: Θ(1 + α)

Resolving collisions: Chaining
Both types of search take Θ(1 + α) time on average. If n = O(m), then α = O(1) and the total time for Search is O(1) on average.
- Insert: O(1) in the worst case
- Delete: O(1) in the worst case (given a pointer to the element in a doubly linked chain)

Resolving collisions: Chaining
Storage for the elements may be allocated and deallocated within the hash table itself by linking all unused slots into a free list.
Insert:
- If key k hashes into an empty slot h(k), put it there and set a flag to indicate that this is the actual position to which the element hashed.
- If h(k) is not empty and the element k1 it contains has its flag set, then use a slot off the free list to store k. Its flag should be unset.
- If h(k) is not empty and the element k1 it contains has its flag unset, then move k1 to another slot and store k in h(k).