Published by Homer Stafford. Modified over 9 years ago.
1
Hashed Files
Text Versus Binary
Meghan Cavanagh
2
Hashed Files
A hashed file is a file that is searched using one of the hashing methods.
The user gives the key, the hash function maps the key to an address and passes it to the operating system, and the record is then retrieved.
Mapping in a hashed file: Key -> Hash Function -> Address
3
Hashing Methods
Direct Hashing
Modulo Division
Digit Extraction
Mid-square
Folding
Rotational
Pseudorandom
4
Direct Hashing Method
The key is the address, without any algorithmic manipulation.
The file must contain a record for every possible key.
The method applies only in limited situations.
It is very powerful because it guarantees that there are no synonyms or collisions.
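A minimal sketch of direct hashing, assuming a hypothetical file that keeps one total per day of the month (keys 1 to 31). The names `DAYS`, `sales_by_day`, `direct_hash` and `record_sale` are illustrative and do not come from the slides.

```python
# Hypothetical example: one slot for every possible key (days 1..31).
DAYS = 31
sales_by_day = [0] * (DAYS + 1)  # slot 0 is unused so that key == address


def direct_hash(key: int) -> int:
    """Direct hashing: the key itself is the address, with no computation."""
    if not 1 <= key <= DAYS:
        raise ValueError("key is outside the address space")
    return key


def record_sale(day: int, amount: int) -> None:
    """Add an amount to the slot whose address equals the key."""
    sales_by_day[direct_hash(day)] += amount
```

Because every possible key owns its own slot, two different keys can never land on the same address, which is why the method has no synonyms.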
5
Modulo Division Method (division remainder hashing)
Divides the key by the file size and uses the remainder plus one for the address.
The algorithm works with any list size, but a prime list size produces fewer collisions than other list sizes.
In the equation below, list_size is the number of elements in the file:
address = key % list_size + 1
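The formula above can be sketched as a one-line function. The list size of 307 (a prime) and the sample key are assumptions chosen for illustration.

```python
def modulo_division_hash(key: int, list_size: int) -> int:
    """Return an address in the range 1..list_size."""
    return key % list_size + 1


# Assumed example values: a prime list size and a six-digit key.
LIST_SIZE = 307
address = modulo_division_hash(121267, LIST_SIZE)  # 121267 % 307 is 2, so address is 3
```

The "+ 1" shifts the remainder range 0..list_size-1 to addresses 1..list_size.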
6
Digit Extraction Method
Selected digits are extracted from the key and used as the address.
For example, to hash a six-digit employee number to a three-digit address, you could select the first, third and fourth digits and use them as the address:
125870 -> 158
122801 -> 128
121267 -> 112
123413 -> 134
7
Collision: occurs when a hashing algorithm produces an address for an insertion and that address is already occupied.
Synonyms: two or more keys that hash to the same home address.
Home Address: the first address produced by the hashing algorithm.
Prime Area: the memory that contains the home addresses.
8
Collision Resolution
Open Addressing Resolution: when a collision occurs, the prime area addresses are searched for an open or unoccupied record where the new data can be placed.
Linked List Resolution: the first record is stored in the home address, but it contains a pointer to the second record. Unlike open addressing, resolving a collision this way does not raise the probability of future collisions.
Bucket Hashing: uses a location (a bucket) that can accommodate multiple data units, which reduces collisions.
Combination Approaches: use several approaches together to resolve collisions.
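A minimal sketch of open addressing. The slides do not say how the prime area is searched, so this example assumes a linear probe (try the next address, wrapping around at the end). It also uses 0-based addresses so they can index a Python list directly. The class and method names are hypothetical.

```python
class OpenAddressTable:
    """Prime area of fixed size; collisions resolved by a linear probe (assumed)."""

    def __init__(self, list_size: int = 7):
        self.list_size = list_size
        self.slots = [None] * list_size  # each slot holds (key, record) or None

    def _home(self, key: int) -> int:
        # Home address by modulo division, 0-based for list indexing.
        return key % self.list_size

    def insert(self, key: int, record: str) -> int:
        """Store the record and return the address actually used."""
        home = self._home(key)
        for step in range(self.list_size):
            addr = (home + step) % self.list_size
            slot = self.slots[addr]
            if slot is None or slot[0] == key:
                self.slots[addr] = (key, record)
                return addr
        raise OverflowError("prime area is full")

    def find(self, key: int):
        """Return the record for key, or None if it is not stored."""
        home = self._home(key)
        for step in range(self.list_size):
            slot = self.slots[(home + step) % self.list_size]
            if slot is None:
                return None  # an empty slot ends the probe sequence
            if slot[0] == key:
                return slot[1]
        return None


table = OpenAddressTable(7)
first = table.insert(10, "record A")   # home address 10 % 7 = 3
second = table.insert(17, "record B")  # synonym of 10; placed at address 4
```

Keys 10 and 17 are synonyms here: both hash to home address 3, so the second insertion collides and is moved to the next open slot. This sketch has no deletion; removing records from an open-addressed table needs extra care so that later probes do not stop early.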
9
Text File
A file of characters.
Cannot contain integers, floating-point numbers or any other data structures in their internal memory format.
To store these data types, they must be converted to their character-equivalent formats.
The best-known text files are the file streams for keyboards, monitors and printers.
10
Binary Files
A collection of data stored in the internal format of the computer.
Data can be an integer, a floating-point number, a character or any other structured data (except a file).
Contains data that are meaningful only if they are properly interpreted by the program.
Textual data: 1 byte is used to represent one character.
Numeric data: 2 or more bytes are considered one data item.
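The difference between the two representations can be sketched with Python's standard `struct` module. The value 12345 and the choice of a 4-byte little-endian integer are assumptions made for illustration; the actual internal format depends on the machine and the language.

```python
import struct

value = 12345

# Text form: the number is converted to its character equivalent,
# one byte per character.
as_text = str(value).encode("ascii")   # b"12345", 5 bytes

# Binary form: the number is stored in an internal integer format
# (assumed here: 4-byte little-endian signed integer).
as_binary = struct.pack("<i", value)   # 4 bytes

# The binary bytes are meaningful only if interpreted correctly.
(as_int,) = struct.unpack("<i", as_binary)    # read back as an integer
(as_float,) = struct.unpack("<f", as_binary)  # same bytes misread as a float
```

Reading the same four bytes as a float yields an unrelated number, which illustrates why binary data must be properly interpreted by the program that reads it.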
11
The End