ISAM vs Hash Indexing

Unit - V
1) Indexed Sequential Access Method (ISAM): ISAM is a method for
creating, maintaining, and manipulating indexes of key fields
extracted from data file records, so that required records can be
retrieved quickly.
• ISAM is an advanced sequential file organization method.
• IBM developed ISAM for mainframe computers.
• In ISAM, records are stored in the file sorted by primary key. For
each primary key, an index value is generated and mapped to the
record.
• This ISAM index value is simply the address of the record in the
file.
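The key-to-address mapping described above can be sketched in a few lines. This is a minimal illustration, not IBM's implementation: it assumes a sparse index (one entry per block, holding the block's first key), and the names `BLOCKS`, `INDEX_KEYS`, and `isam_lookup` as well as the sample records are hypothetical.

```python
from bisect import bisect_right

# Hypothetical data file: records sorted by primary key, grouped into blocks.
BLOCKS = [
    [(101, "Asha"), (105, "Bala"), (109, "Chitra")],
    [(112, "Dev"), (118, "Esha"), (121, "Farid")],
    [(130, "Gita"), (134, "Hari"), (140, "Indu")],
]

# Assumed sparse index: the first key of each block. The position of a key
# in this list plays the role of the block address.
INDEX_KEYS = [block[0][0] for block in BLOCKS]


def isam_lookup(key):
    """Return the record with the given primary key, or None if absent."""
    # Binary-search the index for the last block whose first key is <= key.
    block_no = bisect_right(INDEX_KEYS, key) - 1
    if block_no < 0:
        return None  # key is smaller than every stored key
    # Only this one block is scanned sequentially.
    for record in BLOCKS[block_no]:
        if record[0] == key:
            return record
    return None


print(isam_lookup(118))
```

The index is searched first and only one block of the file is read, which is where the fast retrieval comes from.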
Fig: Indexed Sequential Access Method (ISAM) File Organization
• Indexes can be built on any field that uniquely identifies a record
(or a set of records within the file) and whose type can be ordered.
• Multiple indexes therefore provide a high degree of flexibility for
accessing the data by searching on various attributes; this
organization also allows variable-length records (containing
different fields).
• Method of storing: An address index is appended to the record.
• Types: Dense, sparse, and multilevel indexing.
• Design: Complex.
• Storage cost: Higher, because the index is stored in addition to the
data.
• Advantages: Searching for records is fast. Suitable for large
databases. Any column can be used as the key column. Range searches
and partial-key searches are efficient.
• Disadvantages: Extra cost to maintain the index. The file must be
reorganized as records are inserted, updated, or deleted. The index
structure does not grow with the data.
Fig: Indexed File Organization
Advantages of ISAM
• Since the index holds the data block address of each record,
searching for a record in a large database is easy and quick, with no
need to scan the file. However, a suitable primary key must be
chosen to make ISAM efficient.
• ISAM allows any column to be used as a key field, and an index is
generated on it. In addition to the primary key and its index,
indexes can be generated on other fields too. Searches on columns
other than the primary key therefore also become efficient.
• ISAM supports range retrieval and partial-key retrieval of records.
Since the index is ordered by key value, the records for a given
range of values can be retrieved directly. In the same way, a
partial key value, such as student names starting with 'JA', can be
searched easily.
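Range and partial-key retrieval both rely on the index being sorted. The sketch below illustrates this with a sorted in-memory list standing in for the index; the student names and the functions `range_search` and `prefix_search` are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical index on student name, kept in sorted order.
NAMES = ["ANIL", "JACK", "JAMES", "JANE", "JOHN", "MARY", "RAVI"]


def range_search(keys, low, high):
    """Return every key k with low <= k <= high from a sorted list."""
    return keys[bisect_left(keys, low):bisect_right(keys, high)]


def prefix_search(keys, prefix):
    """Return every key that starts with prefix from a sorted list."""
    # All matches are adjacent, starting at the first key >= prefix.
    start = bisect_left(keys, prefix)
    end = start
    while end < len(keys) and keys[end].startswith(prefix):
        end += 1
    return keys[start:end]


print(prefix_search(NAMES, "JA"))
print(range_search(NAMES, "JAMES", "MARY"))
```

Because matching keys sit next to each other in sorted order, one binary search finds the start of the result and the rest is a short sequential read.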
Disadvantages of ISAM
• Maintaining the index has an extra cost: extra disk space is needed
to store the index values. With multiple key-index combinations, the
disk space needed increases further.
• As new records are inserted, the files have to be restructured to
maintain the sequence. Similarly, when a record is deleted, the
space it used needs to be released. Otherwise, the performance of
the database will degrade.
2) Hash-Based Indexing: Hashing is the process of mapping a key
value to a position in a table.
• Hashing is a technique for performing insertions, deletions, and
finds in constant average time, i.e. O(1).
• The hash function determines the position of a key in the array.
• Hashing is a widely used technique for implementing the Dictionary
ADT.
• Hash Table: The hash table ADT is an alternative solution with O(1)
expected query time and O(n + N) space, where n is the number of
stored keys and N is the size of the table. It works like an array,
but with a function that maps the large range of keys into a smaller
one.
• Hash Table Properties: The hash function behind a hash table should
provide
  - Determinism
  - Uniformity
  - Variable range
  - Data normalization
  - Continuity
• Hash tables favor efficient storage and retrieval of data lists that
are linear in nature.
Fig: Hash Table Data Organization (index values 0 to N-1, each
referring to a bucket of values/data)
• Collision resolution deals with keys that are mapped to the same
address.
• To handle two keys that hash to the same spot in the array, use
chaining: set up an array of links (a table), indexed by hash value,
to lists of items with the same hash value.
• All keys that map to the same hash value are kept in a list (or
"bucket"). Hashing a second key into a previously used slot is
called a collision.
• The hash table is implemented as an array of linked lists;
inserting an item that hashes to index i is simply an insertion into
the linked list at position i in the table.
• Store all elements that hash to the same slot in a linked list, and
store a pointer to the head of that list in the hash table slot.
• Ex: Keys 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 60
Hash function: n % 10

BUCKET: 0 --> 30 --> 60   (collision)
BUCKET: 1 --> 31
BUCKET: 2 --> 32
BUCKET: 3 --> 23 --> 33   (collision)
BUCKET: 4 --> 24
BUCKET: 5 --> 25
BUCKET: 6 --> 26
BUCKET: 7 --> 27
BUCKET: 8 --> 28
BUCKET: 9 --> 29
• To handle collisions, use collision resolution techniques.
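The bucket layout for the example keys and the hash function n % 10 can be reproduced with a short sketch of chaining. Python lists stand in for the linked lists here, and the function name `build_buckets` is hypothetical.

```python
def build_buckets(keys, table_size=10):
    """Place each key in the bucket chosen by key % table_size (chaining)."""
    buckets = [[] for _ in range(table_size)]
    for key in keys:
        # Keys with the same hash value are appended to the same chain.
        buckets[key % table_size].append(key)
    return buckets


KEYS = [23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 60]

for index, chain in enumerate(build_buckets(KEYS)):
    note = "   (collision)" if len(chain) > 1 else ""
    print(f"BUCKET: {index} --> " + " --> ".join(map(str, chain)) + note)
```

Buckets 0 and 3 each end up holding two keys, which is the collision case that chaining resolves.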
Factors affecting Hashing: Hashing offers excellent performance for
insertion and retrieval of data. Its performance depends on:
Choice of hash function
Collision resolution strategy
Load factor
Operations
Insertion( )
Deletion( )
Initialization( )
Searching( )
Display( )
Sorting( )
Update( )
Count( )
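Several of the operations listed above, together with the load factor, can be sketched as one small class. This is an illustrative sketch under stated assumptions, not a reference implementation: the class name `ChainedHashTable` and its method names are hypothetical, chaining is assumed as the collision resolution strategy, and the load factor is taken as the number of stored entries divided by the table size.

```python
class ChainedHashTable:
    """A small hash table that resolves collisions by chaining."""

    def __init__(self, size=10):
        # Initialization: one empty chain per slot.
        self.size = size
        self.buckets = [[] for _ in range(size)]
        self.count = 0

    def _slot(self, key):
        # Map the key to a slot index in the range 0 .. size - 1.
        return hash(key) % self.size

    def insert(self, key, value):
        """Insert a new key, or update the value of an existing key."""
        chain = self.buckets[self._slot(key)]
        for i, (stored_key, _) in enumerate(chain):
            if stored_key == key:
                chain[i] = (key, value)
                return
        chain.append((key, value))
        self.count += 1

    def search(self, key):
        """Return the value stored under key, or None if it is absent."""
        for stored_key, value in self.buckets[self._slot(key)]:
            if stored_key == key:
                return value
        return None

    def delete(self, key):
        """Remove key and return True, or return False if it is absent."""
        chain = self.buckets[self._slot(key)]
        for i, (stored_key, _) in enumerate(chain):
            if stored_key == key:
                del chain[i]
                self.count -= 1
                return True
        return False

    def load_factor(self):
        """Stored entries divided by table size (assumed definition)."""
        return self.count / self.size


table = ChainedHashTable(size=10)
table.insert(30, "thirty")
table.insert(60, "sixty")  # collides with 30 in slot 0
print(table.search(60), table.count, table.load_factor())
```

A higher load factor means longer chains on average, so the table size is one of the choices that determines how close the operations stay to constant time.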
Issues with Hashing
1. Multiple keys can hash to the same slot, so collisions are
possible. Hash functions should be designed to minimize collisions,
but avoiding them entirely is impossible.
2. Search costs Θ(n) time in the worst case; however, all operations
can be made to have an expected complexity of Θ(1).
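The gap between the worst case and the expected case can be made visible by counting key comparisons in a chained table. The sketch below is illustrative only; the function `probes_to_find` and both key sets are hypothetical.

```python
def probes_to_find(keys, target, table_size=10):
    """Count the key comparisons needed to find target in a chained table."""
    buckets = [[] for _ in range(table_size)]
    for key in keys:
        buckets[key % table_size].append(key)
    # Walk only the chain that the target hashes to.
    for probes, key in enumerate(buckets[target % table_size], start=1):
        if key == target:
            return probes
    return None


# Worst case: every key is a multiple of 10, so all land in bucket 0.
colliding = [10 * i for i in range(1, 21)]
# Well-spread case: 20 consecutive keys, two per bucket.
spread = list(range(1, 21))

print(probes_to_find(colliding, 200))
print(probes_to_find(spread, 20))
```

When all n keys share one bucket, the search degenerates into a linear scan of that chain; when the keys are spread evenly, the chain length depends only on the load factor.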
Characteristics
Easy to compute.
Easy to minimize collisions.
Easy to apply folding.
Easy to apply truncation.
Easy to apply modular arithmetic.
Easy to retrieve an element.
Easy to search.
Easy to sort (entries must be collected and sorted separately, since
a hash table keeps no key order).
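Folding, truncation, and modular arithmetic are simple ways to build a hash function. The sketch below shows one possible form of each for non-negative integer keys. The function names, the two-digit part length, and the choice of keeping the last digits in truncation are assumptions made for illustration.

```python
def modular_hash(key, table_size):
    """Modular arithmetic: the remainder of the key divided by table size."""
    return key % table_size


def truncation_hash(key, digits=2):
    """Truncation: keep only the last `digits` decimal digits of the key."""
    return key % (10 ** digits)


def folding_hash(key, part_len=2, table_size=100):
    """Folding: split the digits into parts, add them, reduce to table size."""
    text = str(key)
    parts = [int(text[i:i + part_len]) for i in range(0, len(text), part_len)]
    return sum(parts) % table_size


print(modular_hash(60, 10))
print(truncation_hash(123456))
print(folding_hash(123456))
```

For the key 123456, folding adds 12 + 34 + 56 = 102 and reduces it modulo 100, while truncation keeps only the digits 56.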
Advantages
1. Hashing is used to index and retrieve items in a database.
2. It is also used in many encryption algorithms.
3. It reduces time complexity.
4. It reduces space complexity.
5. It is used to implement dictionaries.
6. Hashing takes constant time per operation (on average).
Applications
• Relational DB query processing
• File organization, telephone directories
• Symbol table of a compiler
• Memory-management tables in operating systems
• Large-scale distributed systems
• Online spelling checkers
• Indexes
• Search engine databases
• Game programs (transposition table)