
CS 225 Lab #10 Hash Tables

Hash Tables

A hash table (or dictionary) is a data structure designed for O(1) average-case add, remove, and find operations when the search key is known.
If many keys collide (hash to the same value), the running times of all these operations may degenerate to O(n).
It is made up of an array of key-value pairs, a hash function, and a collision-handling scheme.

http://research.cs.vt.edu/AVresearch/hashing/openhash.php

Hash Function

Used to map the search key to the index of the slot in the table where the corresponding record is expected to be stored. The mapping may consist of two conversions:

map input type to integer
    the input type could be any primitive or user-defined type
    binary data can always be mapped to an integer
map integer to valid table index
    someInteger % tableSize
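The two conversions can be sketched in C++ as below. This is only an illustration under stated assumptions: the `Date` type and the helper names `toInteger` and `toIndex` are hypothetical and do not come from the lab code.

```cpp
#include <cstddef>

// Hypothetical user-defined key type, used only for illustration.
struct Date {
    unsigned int year;
    unsigned int month;
    unsigned int day;
};

// Conversion 1: map the input type to an integer.
unsigned int toInteger(const Date& key) {
    return key.year * 10000u + key.month * 100u + key.day;
}

// Conversion 2: map the integer to a valid table index in [0, tableSize - 1].
// Assumes tableSize > 0.
std::size_t toIndex(unsigned int someInteger, std::size_t tableSize) {
    return someInteger % tableSize;
}
```

Keeping the two steps separate means the same key-to-integer conversion can be reused with tables of different sizes.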

Hash Function (2)

A hash function maps each key k in our hash table to an index, an integer in the range [0, N-1], where N is the size of the hash table. For this assignment, we will be using the following hash functions:

Summing Components
Cyclic Shift
Polynomial Hash

Summing Components

This function maps strings to integers.

Algorithm:
    sum the ASCII values of each character of a string

Example:
    hash(dog) = 'd' + 'o' + 'g' = 100 + 111 + 103 = 314
    hash(god) = 'g' + 'o' + 'd' = 103 + 111 + 100 = 314

Regardless of the table size, these two keys will collide using this hash function!
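A minimal C++ sketch of summing components, assuming an ASCII-compatible character encoding. The function name `sumHash` is hypothetical, and the result still has to be reduced with `% tableSize` to get a table index.

```cpp
#include <string>

// Sum the character codes of every character in the key.
// Assumes an ASCII-compatible encoding.
unsigned int sumHash(const std::string& key) {
    unsigned int sum = 0;
    for (unsigned char c : key) {
        sum += c;
    }
    return sum;
}
```

Because addition ignores order, any two strings built from the same characters produce the same sum, which is why `sumHash("dog")` and `sumHash("god")` both give 314.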

Cyclic Shift

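This function maps strings to integers.

Algorithm:
    same as summing components, but perform a 5-bit cyclic shift on the sum before adding each character's ASCII value

    #include <string>
    using std::string;

    int hash(string const & key)
    {
        unsigned int h = 0;  // assumed to be 32 bits wide, so 5 + 27 covers the whole word
        for (string::size_type i = 0; i < key.size(); ++i)
        {
            h = (h << 5) | (h >> 27);  // rotate h left by 5 bits
            h += (unsigned int) key[i];
        }
        // calls the integer overload of hash (not shown here)
        return hash((int) h);
    }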
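To see what the shift buys, the sketch below computes only the raw cyclic-shift value, leaving out the final integer-to-index step. The name `cyclicShiftValue` is hypothetical, and `std::uint32_t` is an assumption made so that the 5-bit and 27-bit shifts add up to the full word width.

```cpp
#include <cstdint>
#include <string>

// Rotate the running value left by 5 bits, then add the next character code.
std::uint32_t cyclicShiftValue(const std::string& key) {
    std::uint32_t h = 0;
    for (unsigned char c : key) {
        h = (h << 5) | (h >> 27);  // 5-bit cyclic (rotating) left shift
        h += c;
    }
    return h;
}
```

With this sketch, "dog" gives 106055 and "god" gives 109124, so the two keys that collided under summing components now produce different values.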

Polynomial Hash

This function maps strings to integers.

Algorithm:
    a polynomial in some non-zero constant a, with a != 1, that takes the components (x[0], x[1], ..., x[k-1]) of the key as its coefficients. This can be represented mathematically as:

    h(x) = x[0]*a^(k-1) + x[1]*a^(k-2) + ... + x[k-2]*a + x[k-1]

    where k is the length of x

Example:
    Let a = 2.
    hash(man) = 'm'*a^(2) + 'a'*a + 'n' = 109*4 + 97*2 + 110 = 436 + 194 + 110 = 740
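One common way to evaluate such a polynomial is Horner's rule, which avoids computing the powers of a explicitly. The sketch below is an illustration rather than the lab's implementation: the name `polynomialHash` is hypothetical, and the constant a is passed in as a parameter.

```cpp
#include <string>

// Evaluate x[0]*a^(k-1) + x[1]*a^(k-2) + ... + x[k-1] with Horner's rule.
// Unsigned arithmetic wraps around on overflow for long keys.
unsigned int polynomialHash(const std::string& key, unsigned int a) {
    unsigned int h = 0;
    for (unsigned char c : key) {
        h = h * a + c;  // multiply the running value by a, then add the next component
    }
    return h;
}
```

For the example above, `polynomialHash("man", 2)` evaluates ((109*2) + 97)*2 + 110 = 740. Unlike summing components, the position of each character affects the result.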

Collision-Handling Schemes

A collision occurs when two keys hash to the same table index.
A collision-handling scheme is a technique that allows multiple entries whose keys hash to the same value to exist in the table at the same time.
For this lab we'll describe two simple schemes:

Separate Chaining
Linear Probing

Separate Chaining

Separate chaining is a simple collision-handling scheme where each table cell contains a linked list of entries. When a collision occurs, the new entry is added to that cell's linked list.

http://www.isr.umd.edu/~austin/ence200.d/java-examples.htm
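A minimal sketch of separate chaining in C++, assuming string keys, int values, and a summing-components hash. The class `ChainedTable` and its member names are hypothetical and are not the interface used in the lab.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal table mapping string keys to int values.
class ChainedTable {
public:
    // Assumes tableSize > 0.
    explicit ChainedTable(std::size_t tableSize) : cells_(tableSize) {}

    // Add the entry to the list in its cell, or update the value if the key exists.
    void insert(const std::string& key, int value) {
        std::list<std::pair<std::string, int>>& chain = cells_[index(key)];
        for (auto& entry : chain) {
            if (entry.first == key) {
                entry.second = value;
                return;
            }
        }
        chain.emplace_back(key, value);
    }

    // Return a pointer to the stored value, or nullptr if the key is absent.
    const int* find(const std::string& key) const {
        for (const auto& entry : cells_[index(key)]) {
            if (entry.first == key) {
                return &entry.second;
            }
        }
        return nullptr;
    }

private:
    // Summing-components hash reduced to a table index.
    std::size_t index(const std::string& key) const {
        std::size_t sum = 0;
        for (unsigned char c : key) {
            sum += c;
        }
        return sum % cells_.size();
    }

    std::vector<std::list<std::pair<std::string, int>>> cells_;
};
```

Inserting both "dog" and "god" places them in the same cell's list, and `find` walks that list to tell them apart. That walk is where the O(n) worst case comes from when many keys share a cell.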

Linear Probing

Linear probing is a simple collision-handling scheme where each table cell contains only one element, and collisions are handled by performing a linear search for the next empty cell. Each step of this search is called a probe, beginning with the 0th probe.

Linear Probing (2)

http://codeidol.com/java/javagenerics/Maps/Implementing-Map

Linear Probing (3)

The ith probe with key x is defined by the following function:

    H(x,i) = (h(x) + f(i)) % tableSize

where
    h(x) is the hashing function
    f(i) is the probing function

The probing function in Linear Probing is a linear function:

    f(i) = c*i + d

for some positive constant integers c and d
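The probe sequence can be sketched in C++ as below. Several things here are assumptions made for illustration: c = 1 and d = 1 are arbitrary positive constants, h(x) is taken to be the summing-components hash, an empty string marks an empty cell (so keys must be non-empty), and the function `insert` is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// h(x): summing-components hash, standing in for any hashing function.
std::size_t h(const std::string& x) {
    std::size_t sum = 0;
    for (unsigned char ch : x) {
        sum += ch;
    }
    return sum;
}

// f(i) = c*i + d, with c = 1 and d = 1 chosen only for illustration.
std::size_t f(std::size_t i) {
    const std::size_t c = 1;
    const std::size_t d = 1;
    return c * i + d;
}

// H(x, i) = (h(x) + f(i)) % tableSize
std::size_t H(const std::string& x, std::size_t i, std::size_t tableSize) {
    return (h(x) + f(i)) % tableSize;
}

// Place key in the first empty cell along its probe sequence.
// Returns the index used, or table.size() if no empty cell was found.
std::size_t insert(std::vector<std::string>& table, const std::string& key) {
    for (std::size_t i = 0; i < table.size(); ++i) {
        const std::size_t index = H(key, i, table.size());
        if (table[index].empty()) {
            table[index] = key;
            return index;
        }
    }
    return table.size();
}
```

In a table of size 10, "dog" lands at index 5 on its 0th probe ((314 + 1) % 10). "god" finds index 5 occupied and settles at index 6 on its 1st probe ((314 + 2) % 10).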

Hash Table Exercise

For the lab assignment you'll be

implementing parts of the separate chaining and linear probing collision-handling schemes
comparing the running times of different combinations of hash functions and collision-handling schemes
    use the 'time' command to measure running times
