Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See www.db-book.com for conditions on re-usewww.db-book.com Hashing.

Slides:



Advertisements
Similar presentations
CS4432: Database Systems II Hash Indexing 1. Hash-Based Indexes Adaptation of main memory hash tables Support equality searches No range searches 2.
Advertisements

©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Part C Part A:  Index Definition in SQL  Ordered Indices  Index Sequential.
Hashing Dashiell Fryer CS 157B Dr. Lee. Contents Static Hashing Static Hashing File OrganizationFile Organization Properties of the Hash FunctionProperties.
Quick Review of Apr 10 material B+-Tree File Organization –similar to B+-tree index –leaf nodes store records, not pointers to records stored in an original.
Indexing (Cont.) These slides are a modified version of the slides of the book “Database System Concepts” (Chapter 12), 5th Ed., McGraw-Hill,McGraw-Hill.
Chapter 11 Indexing and Hashing (2) Yonsei University 2 nd Semester, 2013 Sanghyun Park.
File Processing : Hash 2015, Spring Pusan National University Ki-Joune Li.
Chapter 12: Indexing and Hashing Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree Index Files Static.
Database System Concepts, 6 th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 11: Indexing.
CM20145 Indexing and Hashing
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree.
CIS552Indexing and Hashing1 Cost estimation Basic Concepts Ordered Indices B + - Tree Index Files B - Tree Index Files Static Hashing Dynamic Hashing Comparison.
Index Basic Concepts Indexing mechanisms used to speed up access to desired data. E.g., author catalog in library Search Key - attribute to set of attributes.
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree.
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree.
Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 12: Indexing and.
CST203-2 Database Management Systems Lecture 7. Disadvantages on index structure: We must access an index structure to locate data, or must use binary.
INDEXING AND HASHING.
Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 12: Indexing and.
Slides adapted from A. Silberschatz et al. Database System Concepts, 5th Ed. Indexing and Hashing Database Management Systems I Alex Coman, Winter 2006.
Arch (A) Tented Arch (T) Whorl (W) Loop (U or R) The four main classes of fingerprints Loop (60%) Arch/Tented Arch (6%) Whorl (34%) Other (Less than 1%)
B+-tree and Hash Indexes
Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 12: Indexing and.
Data Indexing Herbert A. Evans. Purposes of Data Indexing What is Data Indexing? Why is it important?
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Part A Part A:  Index Definition in SQL  Ordered Indices  Index Sequential.
Database Management Systems I Alex Coman, Winter 2006
Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 12: Indexing and.
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree.
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Part B Part A:  Index Definition in SQL  Ordered Indices  Index Sequential.
Quick Review of material covered Apr 8 B+-Tree Overview and some definitions –balanced tree –multi-level –reorganizes itself on insertion and deletion.
Indexing and Hashing.
Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 11: Storage and.
E.G.M. PetrakisHashing1 Hashing on the Disk  Keys are stored in “disk pages” (“buckets”)  several records fit within one page  Retrieval:  find address.
Ch12: Indexing and Hashing  Basic Concepts  Ordered Indices B+-Tree Index Files B+-Tree Index Files B-Tree Index Files B-Tree Index Files  Hashing Static.
Computing & Information Sciences Kansas State University Friday, 24 Oct 2008CIS 560: Database System Concepts Lecture 23 of 42 Friday, 24 October 2008.
Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 12: Indexing and.
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree.
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree.
Chapter 12: Indexing and Hashing
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree.
12.1 Chapter 12: Indexing and Hashing Spring 2009 Sections , , Problems , 12.7, 12.8, 12.13, 12.15,
©Silberschatz, Korth and Sudarshan11.1Database System Concepts Chapter 11: Storage and File Structure File Organization Organization of Records in Files.
Basic Concepts Indexing mechanisms used to speed up access to desired data. E.g., author catalog in library Search Key - attribute to set of attributes.
Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 12: Indexing and.
Indexing and hashing Azita Keshmiri CS 157B. Basic concept An index for a file in a database system works the same way as the index in text book. For.
Computing & Information Sciences Kansas State University Wednesday, 22 Oct 2008CIS 560: Database System Concepts Lecture 22 of 42 Wednesday, 22 October.
Indexing and Hashing By Dr.S.Sridhar, Ph.D.(JNUD), RACI(Paris, NICE), RMR(USA), RZFM(Germany) DIRECTOR ARUNAI ENGINEERING COLLEGE TIRUVANNAMALAI.
Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 12: Indexing and.
Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan Chapter 12: Indexing and Hashing.
Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 12: Indexing and.
Database System Concepts, 6 th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Indexing.
Hashing by Rafael Jaffarove CS157b. Motivation  Fast data access  Search  Insertion  Deletion  Ideal seek time is O(1)
Database System Concepts, 6 th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Module D: Hashing.
Computing & Information Sciences Kansas State University Monday, 31 Mar 2008CIS 560: Database System Concepts Lecture 25 of 42 Monday, 31 March 2008 William.
Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 12: Indexing and.
Database System Concepts, 6 th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 11: Indexing.
Chapter 5 Record Storage and Primary File Organizations
PART 4 DATA STORAGE AND QUERY. Chapter 12 Indexing and Hashing.
Database System Concepts ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 12: Indexing and Hashing.
Em Spatiotemporal Database Laboratory Pusan National University File Processing : Hash 2004, Spring Pusan National University Ki-Joune Li.
Dynamic Hashing (Chapter 12)
Chapter 12: Indexing and Hashing
Chapter 11: Indexing and Hashing
Dynamic Hashing.
Indexing And Hashing.
Hashing Chapter 11.
Indexing and Hashing Basic Concepts Ordered Indices
Indexing and Hashing B.Ramamurthy Chapter 11 2/5/2019 B.Ramamurthy.
2018, Spring Pusan National University Ki-Joune Li
Presentation transcript:

Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Hashing

©Silberschatz, Korth and Sudarshan12.2Database System Concepts - 5 th Edition, Oct 4, 2006 Static Hashing n A bucket is a unit of storage containing one or more records (a bucket is typically a disk block). n In a hash file organization we obtain the bucket of a record directly from its search-key value using a hash function. n Hash function h is a function from the set of all search-key values K to the set of all bucket addresses B. n Hash function is used to locate records for access, insertion as well as deletion. n Records with different search-key values may be mapped to the same bucket; thus entire bucket has to be searched sequentially to locate a record.

©Silberschatz, Korth and Sudarshan12.3Database System Concepts - 5 th Edition, Oct 4, 2006 Example of Hash File Organization n There are 10 buckets, n The binary representation of the ith character is assumed to be the integer i. n The hash function returns the sum of the binary representations of the characters modulo 10 l E.g. h(Perryridge) = 5 h(Round Hill) = 3 h(Brighton) = 3 Hash file organization of account file, using branch_name as key (See figure in next slide.)

©Silberschatz, Korth and Sudarshan12.4Database System Concepts - 5 th Edition, Oct 4, 2006 Example of Hash File Organization Hash file organization of account file, using branch_name as key (see previous slide for details).

©Silberschatz, Korth and Sudarshan12.5Database System Concepts - 5 th Edition, Oct 4, 2006 Hash Functions n Worst hash function maps all search-key values to the same bucket; this makes access time proportional to the number of search-key values in the file. n An ideal hash function is uniform, i.e., each bucket is assigned the same number of search-key values from the set of all possible values. n Ideal hash function is random, so each bucket will have the same number of records assigned to it irrespective of the actual distribution of search-key values in the file. n Typical hash functions perform computation on the internal binary representation of the search-key. l For example, for a string search-key, the binary representations of all the characters in the string could be added and the sum modulo the number of buckets could be returned..

©Silberschatz, Korth and Sudarshan12.6Database System Concepts - 5 th Edition, Oct 4, 2006 Handling of Bucket Overflows n Bucket overflow can occur because of l Insufficient buckets l Skew in distribution of records. This can occur due to two reasons:  multiple records have same search-key value  chosen hash function produces non-uniform distribution of key values n Although the probability of bucket overflow can be reduced, it cannot be eliminated; it is handled by using overflow buckets.

©Silberschatz, Korth and Sudarshan12.7Database System Concepts - 5 th Edition, Oct 4, 2006 Handling of Bucket Overflows (Cont.) n Overflow chaining – the overflow buckets of a given bucket are chained together in a linked list. n Above scheme is called closed hashing. l An alternative, called open hashing, which does not use overflow buckets, is not suitable for database applications.

©Silberschatz, Korth and Sudarshan12.8Database System Concepts - 5 th Edition, Oct 4, 2006 Hash Indices n Hashing can be used not only for file organization, but also for index- structure creation. n A hash index organizes the search keys, with their associated record pointers, into a hash file structure. n Strictly speaking, hash indices are always secondary indices l if the file itself is organized using hashing, a separate primary hash index on it using the same search-key is unnecessary. l However, we use the term hash index to refer to both secondary index structures and hash organized files.

©Silberschatz, Korth and Sudarshan12.9Database System Concepts - 5 th Edition, Oct 4, 2006 Example of Hash Index

©Silberschatz, Korth and Sudarshan12.10Database System Concepts - 5 th Edition, Oct 4, 2006 Deficiencies of Static Hashing n In static hashing, function h maps search-key values to a fixed set of B of bucket addresses. Databases grow or shrink with time. l If initial number of buckets is too small, and file grows, performance will degrade due to too much overflows. l If space is allocated for anticipated growth, a significant amount of space will be wasted initially (and buckets will be underfull). l If database shrinks, again space will be wasted. n One solution: periodic re-organization of the file with a new hash function l Expensive, disrupts normal operations n Better solution: allow the number of buckets to be modified dynamically.