
Hashing

Dictionary
• A Dictionary is a collection of elements.
• Each element has a field called key.
– (Key, Data)
• Keys are usually distinct.

• Typical dictionary operations are:
– Insert
– Select (search)
– Delete
• Example: a collection of student records in a class:
– (S_rno, S_name, marks)
– All S_rno values are distinct
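
The three operations can be illustrated with Python's built-in dict, which is itself hash-based. This is only a minimal sketch; the roll numbers, names and marks below are made-up sample data, not taken from the slides.

```python
# Student records keyed by roll number (hypothetical sample data).
records = {}

# Insert: associate a key with its data.
records[101] = ("Asha", 82)
records[102] = ("Ravi", 74)

# Select: look up the data stored under a key.
name, marks = records[101]

# Delete: remove a key together with its data.
del records[102]

print(name, marks)     # Asha 82
print(102 in records)  # False
```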
Hashing
• A hash table is a data structure that stores elements and allows
insertions, searches and deletions to be performed in O(1) time on average.

• A hash table is an alternative way of representing a dictionary.

• In a hash table, a hash function is used to map keys to positions in
the table. This process is called hashing.

• A function that transforms a key into a table index is called a hash
function. If a function h applied to a key k returns an index i, then
i = h(k)
and h is called the hash function.
Hashing
• Why do we need hashing?
– A search on a sorted array using binary search takes
O(log2 n) time. Hashing can reduce the average search
time to O(1).
– Hashing is usually used to implement dictionaries.
• A good hash function:
– Should be easy and quick to compute.
– Should keep the number of collisions as low as possible.
Hashing Methods
Hashing methods:
– Division Methods
– Mid-Square Methods
– Folding Methods
Division Methods

• Division Method
– The key k is divided by some number m, and the remainder
is used as the hash address of k:
– h(k) = k mod m
– This gives indexes in the range 0 to m-1, so the
hash table should be of size m.

– For example:
Consider a hash table with 9 slots, i.e., m = 9.
Then the hash function maps the key 132 to slot 6:
h(132) = 132 % 9 = 6
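
The division method translates almost directly into code. The sketch below assumes non-negative integer keys; the function name division_hash is chosen here for illustration.

```python
def division_hash(k: int, m: int) -> int:
    """Division-method hash: the remainder of k divided by the table size m."""
    if m <= 0:
        raise ValueError("table size m must be positive")
    return k % m

print(division_hash(132, 9))  # 6

# Every result lies in 0..m-1, so a table of size m is sufficient.
print(all(0 <= division_hash(k, 9) < 9 for k in range(1000)))  # True
```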
Division Methods

• Example:
Let m = 100 and k = 2345.
h(2345) = 2345 % 100 = 45
Consider another key, k = 2445.
h(2445) = 2445 % 100 = 45
Both keys map to slot 45, so they collide.

• To reduce collisions in the division method, the best choice for m is
a prime number, which usually spreads the keys quite uniformly.
• Example:
– For a hash table of about 100 slots, it is better to choose a nearby
prime such as 97 or 101 instead of 100.
– h(2345) = 2345 % 97 = 17
– h(2445) = 2445 % 97 = 20
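
The effect of the modulus can be checked with a short experiment. In this sketch, 2345 and 2445 are the keys from the example; 2545 and 2645 are extra keys added here purely for illustration, and the helper name slots is hypothetical.

```python
def slots(keys, m):
    """Return the division-method slot k % m for each key."""
    return [k % m for k in keys]

# 2345 and 2445 come from the example; 2545 and 2645 are added for illustration.
keys = [2345, 2445, 2545, 2645]

print(slots(keys, 100))  # [45, 45, 45, 45]  (all keys collide)
print(slots(keys, 97))   # [17, 20, 23, 26]  (no collisions)
```

With m = 100 only the last two decimal digits of the key matter, so keys that share those digits always collide. A prime modulus such as 97 makes the slot depend on all digits of the key.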
Mid-Square Method
• The key k is multiplied by itself and the address is obtained by
selecting an appropriate number of digits from the middle of the
square.

• The number of digits selected depends on the size of the table.

• Example:
    k:      3205        7148        2345
    k²:     10272025    51093904    5499025
    h(k):   72          93          99
Mid-Square Method
• Note: If a two-digit address is required, the digits in
positions 4 and 5 of k², counting from the right, could be
chosen (e.g., 10272025 → 72).
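
A minimal sketch of the mid-square method under the convention in the note (positions counted from the right, starting at 1). The parameter names start and width are assumptions made for this illustration.

```python
def mid_square_hash(k: int, start: int = 4, width: int = 2) -> int:
    """Square k and return `width` digits of the square, beginning at
    position `start` counted from the right (rightmost digit = position 1)."""
    squared = k * k
    # Drop the (start - 1) rightmost digits, then keep the next `width` digits.
    return (squared // 10 ** (start - 1)) % 10 ** width

for key in (3205, 7148, 2345):
    print(key, key * key, mid_square_hash(key))
# 3205 10272025 72
# 7148 51093904 93
# 2345 5499025 99
```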
Folding Method
• In the first step, the key k is divided into a number of parts
k1, k2, k3, …, kn (from left to right), where each part has the same
number of digits except the last part, which may have fewer digits.

• In the second step, these parts are added together, and the hash
value is obtained by ignoring the final carry (if any).
• Example:
    k:              9235        714         71458
    Parts:          92, 35      71, 4       71, 45, 8
    Sum of parts:   127         75          124
    h(k):           27          75          24
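
The two folding steps can be sketched as follows. The sketch assumes non-negative integer keys split into two-digit parts, as in the example; "ignoring the carry" is implemented by keeping only the last part_len digits of the sum. The names folding_hash and part_len are illustrative.

```python
def folding_hash(k: int, part_len: int = 2) -> int:
    """Split the decimal digits of k into parts of `part_len` digits
    (left to right), add the parts, and drop any carry."""
    digits = str(k)
    parts = [int(digits[i:i + part_len])
             for i in range(0, len(digits), part_len)]
    # Keeping only the last `part_len` digits discards the carry.
    return sum(parts) % 10 ** part_len

for key in (9235, 714, 71458):
    print(key, folding_hash(key))
# 9235 27
# 714 75
# 71458 24
```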
Disadvantage of Hashing
• When an element is inserted, if it hashes to the same value as
an already inserted element, then we have a collision.

• Collision resolution techniques:
– Open Addressing
  » Linear Probing
  » Quadratic Probing
  » Double Hashing
– Chaining
Linear Probing
• Linear probing uses the following hash function:
h(k, i) = [h’(k) + i] mod m, for i = 0, 1, 2, …, m-1
where m is the size of the hash table, h’(k) = k mod m, and
i is the probe number (i = 0 for the first attempt).
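
A minimal sketch of insertion with linear probing, assuming integer keys and a table represented as a Python list in which None marks an empty slot. The keys 10, 17 and 24 and the table size 7 are chosen here for illustration because all three keys hash to slot 3.

```python
def linear_probe_insert(table: list, k: int) -> int:
    """Insert k using h(k, i) = (k mod m + i) mod m and return its slot."""
    m = len(table)
    for i in range(m):
        slot = (k % m + i) % m
        if table[slot] is None:   # first free slot in the probe sequence
            table[slot] = k
            return slot
    raise RuntimeError("hash table is full")

table = [None] * 7
placed = [linear_probe_insert(table, k) for k in (10, 17, 24)]
print(placed)  # [3, 4, 5]
```

Each colliding key simply moves to the next slot, which is why linear probing tends to build up runs of occupied slots.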
Quadratic Probing
• Quadratic probing uses the following hash function:
h(k, i) = [h’(k) + c1*i + c2*i²] mod m
where m is the size of the hash table, h’(k) = k mod m,
i is the probe number, and c1, c2 are constants (≠ 0).

• Q) Given the hash function h(k, i) = (h’(k) + i + i²) mod 11 with
h’(k) = k mod 11,
how many collisions occur when the following keys are stored in this order?
23, 12, 19, 11, 33, 16, 46, 37
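
The question can be worked through with a short simulation. The sketch assumes that every probe landing on an occupied slot counts as one collision; under a different counting convention the total may differ. The function name and return shape are illustrative.

```python
def quadratic_probe_insert(table, k, c1=1, c2=1):
    """Insert k using h(k, i) = (k mod m + c1*i + c2*i*i) mod m.
    Returns (slot, collisions), where collisions is the number of
    occupied slots probed before a free one was found."""
    m = len(table)
    for i in range(m):
        slot = (k % m + c1 * i + c2 * i * i) % m
        if table[slot] is None:
            table[slot] = k
            return slot, i
    raise RuntimeError("no free slot found within m probes")

table = [None] * 11
total = 0
for key in (23, 12, 19, 11, 33, 16, 46, 37):
    _, collisions = quadratic_probe_insert(table, key)
    total += collisions

print(total)  # 4
print(table)  # [11, 23, 33, 12, 46, 16, 37, None, 19, None, None]
```

Under this convention the keys 12, 33, 46 and 37 each hit one occupied slot before being placed, giving 4 collisions in total.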
Double Hashing and Chaining
• Double hashing uses a hash function of the form:
h(k, i) = [h1(k) + i*h2(k)] mod m, for i = 0, 1, 2, …, m-1
where m is the size of the hash table, i is the probe number,
and h1(k) = k mod m and h2(k) = k mod m’ are two
auxiliary hash functions. Here m’ is chosen to be slightly less
than m (i.e., m-1 or m-2).
– h2(k) must not evaluate to 0, otherwise every probe lands on the
same slot; a common variant is h2(k) = 1 + (k mod m’).
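
A minimal sketch of insertion with double hashing. It deliberately uses the variant h2(k) = 1 + (k mod m’) rather than the plain k mod m’ given above, so that the step size is never 0; this choice, the keys 23, 12 and 34, and the table size 11 are assumptions made for illustration.

```python
def double_hash_insert(table, k, m_prime=None):
    """Insert k using h(k, i) = (h1(k) + i*h2(k)) mod m and return its slot."""
    m = len(table)
    if m_prime is None:
        m_prime = m - 1           # slightly less than m
    h1 = k % m
    h2 = 1 + (k % m_prime)        # assumption: offset by 1 so the step is never 0
    for i in range(m):
        slot = (h1 + i * h2) % m
        if table[slot] is None:
            table[slot] = k
            return slot
    raise RuntimeError("no free slot found within m probes")

table = [None] * 11
placed = [double_hash_insert(table, k) for k in (23, 12, 34)]
print(placed)  # [1, 4, 6]
```

All three keys start at slot 1, but each uses a different step size (3 for key 12, 5 for key 34), so their probe sequences diverge instead of following one another.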

• Chaining:
– This is a technique used to resolve collisions.
– The idea is to store all items that hash to the same value
in the same list (typically a linked list) attached to that slot.
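
A minimal sketch of a chained hash table supporting the dictionary operations insert, select and delete. It assumes integer keys and the division method, and it uses Python lists in place of linked lists for brevity; the class name ChainedHashTable is illustrative.

```python
class ChainedHashTable:
    """Hash table with chaining: each slot holds a list of (key, data) pairs."""

    def __init__(self, m: int):
        self.m = m
        self.buckets = [[] for _ in range(m)]

    def insert(self, key: int, data) -> None:
        bucket = self.buckets[key % self.m]
        for idx, (k, _) in enumerate(bucket):
            if k == key:              # key already present: overwrite its data
                bucket[idx] = (key, data)
                return
        bucket.append((key, data))

    def select(self, key: int):
        for k, data in self.buckets[key % self.m]:
            if k == key:
                return data
        return None                   # key not found

    def delete(self, key: int) -> bool:
        bucket = self.buckets[key % self.m]
        for idx, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[idx]
                return True
        return False

t = ChainedHashTable(100)
t.insert(2345, "first")
t.insert(2445, "second")              # collides with 2345 in slot 45
print(t.buckets[45])                  # [(2345, 'first'), (2445, 'second')]
print(t.select(2445))                 # second
```

Both colliding keys stay retrievable because a lookup scans only the chain of the slot the key hashes to.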
