Sie sind auf Seite 1von 33

Module 1 MCA 202 ADMN 2009 - 10

DATA STRUCTURES

MODULE 1

ALGORITHM AND ARRAYS

INTRODUCTION AND OVERVIEW

DATA

The term data means value or set of values. It can be used in both singular and plural
terms. The following are some examples of data:

 34

 12/01/1965

 ISBN 81-203-0000-0

 │││

 21, 25, 67, 89, 90

 Pascal

In the above mentioned example values are represented in different ways. Here each
value or a collection of all such value is termed data.

ENTITY

An entity is one that has certain attributes and which may be assigned values. For
example an employee in an organization is an entity, and the possible attributes are:

Entity : EMPLOYEE

Attributes : NAME DOB SEX DESIGNATION

Values : RAVI.A 30/12/81 M DIRECTOR

All the employees in an organization constitute an entity set. Each attribute of an


entity set has a range of values and this range is called the domain of the attribute.

INFORMATION

The term information is used for data with its attribute(s).In other words, information
can be defined as meaningful data or processed data.

For example the set of data represented above becomes information related to a person when
the user imposes meanings as mentioned below in Table 1.1.

Department of Computer Sciences and Applications,SJCET,Palai


Page 1
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

Data Meaning

38 Age of the person


12/01/1965 Date of birth of the person
ISBN 81-203-0000-0 Book number recently published by the person
│││ Number of awards equal to tally marks received by the person
21, 25, 28,30,35,38 Important ages of the person
Pascal Nickname of the person.

RELATION BETWEEN DATA AND Table


INFORMATION Information
_1

Information
Procedur _2
e to
Data Information
process
data _3

Fig: Information
1.1 _i

DATA TYPE

A data type is a term which refers to the kind of data that may appear in a
computation of data.
DataTable 1.2 illustrates the data and its corresponding
Data Type data type.

38 Numeric
(Integers)
12/01/1965 Date
ISBN 81-203-0000-0 Alphanumeric
│││ Graphics
21, 25, 28,30,35,38 Array of Integers.
Pascal Nick name

Table
1.2

Department of Computer Sciences and Applications,SJCET,Palai


Page 2
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

CONCEPT OF DATA STRUCTURES

A digital computer manipulates only primitive data, that is of 0’s and


1’s.Manipulation of primitive data does not require any extra user effort. But in real life
applications various kinds of data other than primitive data are involved, and manipulation of
these data requires the following tasks:

 Storage representation of user data: User data should be represented in such


a way that the computer should understand it.

 Retrieval of stored data: Data retrieved from the computer should be in such a
way that the user should understand it.

 Transformation of data: Various operations which is required to be performed


on user data so that it can be transformed from one form to another.

Basically a data structure constitute the fundamentals of computer science and implies
the Domain (The range of value that the data may have), Function (This is the set of
operations that can be legally be applied in the data object), Axioms (This refers to the set of
rules with which the different operations belonging to the function can actually be
implemented). Data Structures is the way of organizing data to do operations on a set of data.
The types of data include numeric data and Alpha numeric data. The data may be a single
value or a set of values, which is to be processed and organized .This organization of a
collection of data leads to structuring of data. Therefore data structure deals with structuring
of data.

A data structure D is a triplet, that is, D= (D, F, A) where ‘D’ is the set of data
objects, ‘F’ is a set of functions, and ‘A’ is a set of rules to implement the functions.

D = (0, ±1, ±2, ±3,…..)

F = (+,-,*, /, %)

A = (A set of binary arithmetic’s to perform addition, substarction, division,


multiplication,

and modulo operations.)

DEFINITION:

Data structure is the representation of logical relationship existing between


individual elements of data. In other words, data structure is the way of organizing the data
items which consider not only the item stored, but also relationship with individual units of
data. There are specific operations performed on the various types of data structures such as
list, stack, queue, tree, graph, hash table etc.

Abstract Data Type (ADT)

Department of Computer Sciences and Applications,SJCET,Palai


Page 3
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

The data structure with specific operations such as insertion, deletion, searching,
sorting, merging, traversing and display are called Abstract Data Type (ADT), which are user
defined data types.

E.g.: LIST →ADT

Item →Components

Operations →Insertion, deletion, search, sort and display.

Overview of data structures

Data Structure specifies,

Θ The Organization of data.

Θ Accessing methods.

Θ Degree of associativety.

Θ Processing alternatives of the Information.

The data structure affects both structural and functional aspects of a program.

i.e, ALGORITHM + DATA STRUCTURES = PROGRAM

The Algorithm is a step by step procedure to solve a problem and the data structure is the
way of organizing data, maintaining the logical relationship.

CLASSIFICATION OF DATA STRUCTURE

Data Structure is classified basically into two types as,

1. Primitive

2. Non primitive data structures.

Department of Computer Sciences and Applications,SJCET,Palai


Page 4
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

DATA
SRUCTURE
NON
PRIMITIVE
PRIMITIVE
DATA
DATA
STRUCTURES
SRUCTURE

INTEGE POINTE STRING LINEA NON


FLOAT
R R S R LINEAR

TRE SE
ARRA STAC LIS QUEU GRAP TABL
E T
Y K T E H EsPH

Fig: 1.2

Primitive Data Structures

These are the basic data structures which are directly operated by machine
instructions.

Eg: Integer, Floating point, Pointer values, Strings.

Non- Primitive Data Structures

These are most sophisticated data structures which are derived from Primitive data
structures .The non-primitive data structures emphasize on structuring of a group of
homogeneous or heterogeneous data items.

Eg: Array, List, files, Stack, Queues etc.

The common operations done on data structure are categorized into four types:

1. CREATE – The create operations result in retrieving memory for the program
elements which takes place either during compile time or run time.

2. DESTROY or DELETE – This option destroys the memory space allocated for a
specified Data structures.

3. SELECTION – This Option deals with accessing a particular data within the data
structure.

4. UPDATION – This modifies the data stored in the data structure.

The other options include,

a. Searching – The searching option finds the presence of a derived data item in a list of
data items.

Department of Computer Sciences and Applications,SJCET,Palai


Page 5
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

b. Sorting – is the process of arranging all data items in the data structure in a particular
order, either in the descending order or ascending order.

c. Merging – is a process of combining the data items of two different data list into a
single sorted list.

d. Traversing – refers to accessing or visiting through each element in a given list of


items.

Non-primitive data structures are

ARRAYS Array is defined as a set of finite number of homogeneous elements or


data items arranged in a linear order.

E.g.: int a [10]

Properties:

Θ Every individual elements of an array can be accessed by specifying the name of


array followed by index location.

Θ The index of the first element is at location 0, That is, a[0] till a[n].

Θ The elements are stored in consecutive memory locations.

Θ The number of elements stored in an array is the size of the array represented b y
( Upper bound – Lower bound ) + 1

Θ Arrays are read and written using ‘for’ loops.

For reading,

1. For (i=0;i<n;i++)

2. {

3. scanf (“%d”,&a[i]);

4. }

For Writing,

5. For (i=0; i<n; i++)

6. {

7. printf(“%d”,a[i]);

8. }

Department of Computer Sciences and Applications,SJCET,Palai


Page 6
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

LIST A list or linear linked list is defined as variable number of data items .In
dynamic representation of a list, every data item is represented in structures called ‘nodes’
where each node contains two fields. One field for storing ‘data ‘known as Information field
and other for storing address of the node called as Pointer field.

A B C D

Fig:
1.3

Based on the type of linkages between the nodes there exists various types of node as,

♦ Singly Linked list

♦ Doubly Linked list

♦ Circular linked list

♦ Circular Doubly Liked list.

STACKS It is an ordered collection of linear elements, like an array where all


insertions and deletions are done only at one end called TOP of the stack. Therefore, this
data structure is also called LIFO structure (Last In First Out). Insertion of an element into
the stack is called PUSH operation. Deletion of an element from the stack is referred as POP
operation.

POP
PUSH
TOP

Fig: 1.4

QUEUE A queue is a linear data structure where operations are done at both
ends of the structure. It is of the type FIFO (First In First Out).

Department of Computer Sciences and Applications,SJCET,Palai


Page 7
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

FRON REAR
T 45 57 80 90 23
77
Fig:
1.5

In the queue all insertions are done from the front end of the queue called FRONT,
and all deletions are done at other end referred as REAR end.

TREE A tree is defined as a finite set of data items, which is represented


using nodes. Tree is a non-linear type of data structure in which data items are arranged as
sorted in a sorted sequence. A tree is represented as a hierarchal relationship between
various elements in the set. The properties of a tree data structure are,

• The special data item represented at the top of the tree is called root.

• The remaining data items are partitioned into number of mutually exclusive or
disjoint subsets and each subset represents a tree structure called sub tree.
The left subset denotes left-sub tree and the right subset represents a right
sub-tree.

• The tree grows in length towards the bottom in the data structure.

Left sub Right Sub


B C tree
tree

D E F G

Fig: 1.6

GRAPH

It is a mathematical, non-linear data structure capable of representing many kinds of


physical structures, the graph is represented as G (V, E). Here, the graph G represents the set
of all vertices denoted by V and the set of all edges denoted by E. Graph is classified into
Directed, non-directed, Connected, non-Connected, simple and multi graphs.

Department of Computer Sciences and Applications,SJCET,Palai


Page 8
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

A B
G

D C
Fig:
1.7

ALGORITHM

Definition: An algorithm is a finite set of instructions to accomplish a particular task. All


algorithms should have the following properties,

1. Input

2. Output

3. Definiteness: Each instruction is clear and ambiguous.

4. Finiteness: Algorithm should terminate after a number of steps.

5. Effectiveness: Every instruction must be basic enough to be carried out.

An algorithm can be distinguished from a program. A program may not be finite that
is it may be non terminating. For e.g. operating system is a program that is waiting infinitely
in a loop.

ALGORITHM SPECIFICATION

Pseudocode conventions

We describe an algorithm using a natural language like English, and the resulting
instructions must be definite. Another possibility is Graphic representations called
flowcharts, but it works only if the algorithm is small and simple. Pseudocode that resembles
C also can be used.

Algorithm is defined as finite step by step list of well defined instructions for solving
a problem. The algorithmic notation refers to the format of describing an Algorithm. This
format of formal presentation of an Algorithm consists of two parts:

1) The first section tells the purpose of the Algorithm, identifies the variables which
occur in the Algorithm and list of Input data.

2) The second part of the Algorithm consists of list of steps that needs to be executed.

Department of Computer Sciences and Applications,SJCET,Palai


Page 9
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

The Various conventions used in Algorithmic notation are:

# Comments begin with // and continue until the end of line.

# Blocks are indicated with matching braces: {and}.That is, a compound statement or a .

collection of simple statements can be represented as a block Statements are delimited


by ‘;’

# Steps, Controls and Exit.

The steps of the Algorithm are executed one after the other beginning with step1
and executed downwards, ore control moves to the respective location depending on the
step number, mentioned .i.e., The control can be transferred to any step n of the algorithm
by the statement “Go to Step n” or such statements can be avoided by certain control
structures.

An algorithm can obtain self contained modules and Three types of logic or flow of
control called,

Θ Sequence Logic or Sequential flow

Θ Selection Logic or Conditional Flow.

Θ Integration Logic or Repetitive Flow.

Sequence Logic

In selection logic all the modules are executed in sequence .the sequence may be
referred explicitly by numbered steps or implicitly by the order in which modules are
written.

Algorithm

Module
A

Module B

Module C
Fig:
1.8

Selection Logic

The selection logic employees a number of conditions which lead to a selection of


one out of several alternative modules. The structures which implement this logic are

Department of Computer Sciences and Applications,SJCET,Palai


Page 10
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

called conditional structures or “If Structures”. The Statement ends with a structure
statement as below.

[End of If structure]

These conditional structures are of Three types:

1. Single Alternative : These structures are of the form :

If condition, then:

[Module A]

[End of If structure]

2. Double Alternative : It is of the form,

If condition, then:

[Module A]

Else :

[Module B]

[End of If structure]

3. Multiple Alternatives :

If condition (1), then:

[Module A1]

Else if condition (2), then:

[Module A2]

Else If Condition (M), then:

[Module Am]

Else :

[Module B]

Department of Computer Sciences and Applications,SJCET,Palai


Page 11
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

[End of If structure]

Iterative Logic

The two types of structures involving loop are “For Loop” and “While Loop”. Each
type begins with a “Repeat” statement and followed by a module, called the body of the
loop. The end of structure is by the statement, [End of Loop]

1) Repeat For k = R to S by T:

[Module]

[End of Loop]

The Repeat – For loop uses an index variable as K, to Control the loop. The ‘R’ is
called the initial values the end value or the test value, and T the increment.

2) The Repeat – while loop uses a condition to control the loop.

Repeat While condition:

[Module]

[End of Loop]

If several statements appear in the same step,

E.g. Set K = 1, LOC = 1 and MAX = DATA [1]

They are executed from left to right. The Algorithm is completed with the
statement “Exit”.

3) Comments: Each step can contain comment in brackets which indicates the main
purpose of the step. It usually appears at the beginning or at the end of the step.

# Variable names

Variable names will use capital letters as MAX, DATA etc. Single letter names of
variables are used as counters or subscripts in capital letters (K and N) or lowercase
letters (k and n).

# Assignment statement

The Assignment statement can use the formats of =, → ,: = etc.

<Variable>:= <expression>;

E.g.: MAX: = DATA [1], It assigns value of DATA [1] to MAX.

# Boolean values

Department of Computer Sciences and Applications,SJCET,Palai


Page 12
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

There exist two Boolean values TRUE and FALSE. To produce this the logical
operators and, or, and not and relational operators < ,<=,=,!= ,>= and > are used. Elements
of multidimensional array are accessed using [ and ] .The (i,j)th element of the array is
denoted as A[i,j],Array indices starts at zero.

# Input and Output

Data can be input and assigned to variables by means of a Read statement and output
using Write statement.

Read: Variable names,

And print statement as,

Write: Messages or variable names.

# Procedures

The procedure is used for an independent algorithmic module to solve a particular


problem.i.e, it is used to describe certain sub algorithms. They are completely
independently defined algorithmic module which is used to invoke and call. It receives
values, called arguments.

E.g.: NAME [PAR1, PAR2…………….PARK ]

# Every Algorithm is identified with a name and consists of heading and a body. The
heading is of the form

Algorithm Name (<parameter list>)

Where ‘Name’ is the name of the procedure and ‘parameter list’ is a listing of the
procedure parameters. The body consists of one or more simple or compound statements
within { and }.An algorithm may or may not return values. Simple variables are passed
by value and array or records are passed by reference.
Example1 (Algorithm 1.0): The following algorithm finds and returns the maximum of
n given numbers:

1. Algorithm Max(A,n)

2. // A is an array of size n.

3. {

4. Result := A[1];

5. for i:=2 to n do

6. if A[i] > Result then Result := A[i];

Department of Computer Sciences and Applications,SJCET,Palai


Page 13
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

7. return Result;

8. }

Recursive Algorithm

One of the efficient methods to make a program readable is to make it modular. That
is, divide the program in to a number of functions. The view of the function is that it is
invoked from one point and it is executed and returns the result to the invoking point.
Recursion is the method by which a function calls itself repeatedly. An Algorithm is said to
be recursive if it calls itself, or if the same algorithm is called in the body

Example (Algorithm 1.1): To find the factorial of a number

1. Int fact (n)

2. {

3. If (n==0) then return 1

4. Else

5. Return n*fact (n-1);

6. }

PERFORMANCE ANALYSIS AND MEASUREMENT

An Algorithm is defined as a sequence of steps which gives the method of solving a


given problem and is the logic of the program. Performance analysis means we are analyzing
the program before it is actually executed. In performance measurement, we are measuring
the actual time taken to run the program by executing it in a system. Performance analysis of
a program is done by analyzing the time and space requirements of a program because
efficiency of an Algorithm depends on two criteria’s:

 Run time

 Space Requirement

COMPLEXITY OF ALGORITHMS

The complexity analysis of an algorithm is the measure of efficiency or performance in


terms of runtime and space.
Performance analysis: The algorithm can be judged based on certain criteria, for e.g.
a) Does it do what we want it to do?
b) Does it do according to the original specifications of the task?
Department of Computer Sciences and Applications,SJCET,Palai
Page 14
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

c) Is there documentation that describes how to use it and how it works?

d) Are procedure created in such a way that they perform logical subfunctions?

e) Is the code readable?

Space complexity

A space complexity analysis analyzes, how much memory space does each program
occupies to run to completion. Actually we are not calculating the actual space required
for a program, we are just taking an estimate.
The space needed by each of the programs is the sum of the following components.

1. A fixed part that is independent of the characteristics (number and size of the inputs
and outputs, space for the program code, space for variables, constants etc.
2. A variable part that consists of the space needed for the variables that is dependent on
the particular problem instance being solved, the space needed by the referenced
variables and the recursion stack space.
Illustration: To illustrate how to find space complexity.
(Algorithm 1.3) Algorithm abc (a, b, c)

{
return a + b + b * c + (a + b- c)/ (a + b) +4.0;
}
The space needed by the algorithm is the sum of the following components:
1. A fixed part that is independent of the characteristics (e.g., number, size) of
inputs and outputs. This basically includes the instruction space, space for simple
variables and fixed-size component variables, space for constants etc.

2. A variable part that consists of the space needed by component variables


whose size is dependent on the particular problem instance being solved, space
needed by the reference variables, and recursion space etc.
The space requirement S(P), of a program P, thus is written as S(P)=c+Sp(Instance
characteristics).Where ‘c’ consists of fixed part or a constant and Sp is instance
characteristics. Since are taking only an estimate of the space complexity, we shall just
consider only the Sp. Thus, in the above Algorithm, the problem instance is characterized by
the specific values of a, b, and c.With assumption that one word is sufficient to store the
values of each of a,b,c,and the result ,therefore the space needed by abc is independent of the
instance characteristics ,Sp = 0.

Time complexity

Department of Computer Sciences and Applications,SJCET,Palai


Page 15
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

The time taken by a program, T (P), is the sum of compile time and run time or the
computer time it needs to run to completion. Compile time doesn’t depend on instance
characteristics. Because a compiled program can be run many times without recompilation.
So run-time only is considered to analyze time complexity. The time taken for a program
depends on the time taken to execute each statement of the program and also the complexity
of each of the statement is different. So time for a program depend on the number of
additions, subtractions, multiplications, assignments etc.

That is, tp(n) = caADD(n) + csSUB(n)+cmMUL(n)+cdDIV(n)+………where n


denotes the instance characteristics, and ca, cs, cm, cd, respectively ,denote the time needed for
an addition, subtractions, multiplications, assignments etc.
For e.g. A=b+c
D=a+b*c/r+s+f
The time taken for the second statement is greater than the first. To evaluate the
actual time by looking at the program is not possible, it is assumed that each meaningful step
of the program takes one unit of time to execute. So counting the number of steps in the
program is done to find the total time taken.
If space of any two algorithms is considered to be constant, then the complexity
depends on runtime. The runtime of an execution depends on the parameters such as,

ж Number of comparisons

ж Number of data exchanges.

ж Presence of recursive executions.

ж The Runtime is proportional to the size of the program.

ж The execution time increases as the Input data set increases in size.

Illustration: To illustrate how to find Time complexity. It is done by two methods as,
1. Step count method

2. Step per statement execution or frequency method


STEP COUNT METHOD
Here we introduce a variable ‘count’ in to the program. As each time a particular
meaningful statement is executed count increments by one.

For e.g.:( Algorithm 1.4)

1. Sum (int *a, int n)

Department of Computer Sciences and Applications,SJCET,Palai


Page 16
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

2. {
3. int i;
4. int a=0;
5. for(i= 0 to n-1)
6. {
7. a[i]=i;
8. }
9. }
After introducing the variable ‘count’ in to the program as,
(Algorithm 1.5)

1. Sum(int *a, int n)


2. {
3. int i;
4. int a=0;
a. count++; /* for the statement int a=0 */
5. for(i= 0 to n-1)
6. {
a. count++; /* for for statement*/
7. a[i]=i;
a. count ++; /* for a[i]=i */
8. }
a. count ++; /*for the last time of for statement */
9. return i;
a. count ++; /* for return */
10. }
After the executing the above program we get the value 2n+3 in the variable count.
We can see that number of steps in this program depends on the variable n.

Department of Computer Sciences and Applications,SJCET,Palai


Page 17
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

STEP PER STATEMENT EXECUTION OR FREQUENCY METHOD (STEP


TABLE METHOD)

It determines how the computing time increases as the magnitude of the input
increases. In such case the step count (that is s/e, step per statement execution) of an
algorithm can be computed as a function of the magnitude if the input. In the Algorithm
illustrated below Sum, the
time complexity is measured as a function of the number ‘n’ of element being added.

The complexity analysis defines three cases of step count based on runtime,

¡ Best case

¡ Worst case

¡ Average case

Best case
If the algorithm finds the element at the first search itself, it is referred as a best case
algorithm.

Statement s/e frequency total


time

Algorithm Sum(a,n) 0 - 0

{ 0 - 0

s :=0.0; 1 1 1

for := 1 to n do 1 n+1 n+1

s := s + a[i]; 1 n n

return s; 1 1 1

} 0 - 0

Total 2n
+3

Table
1.3

Worst case

Department of Computer Sciences and Applications,SJCET,Palai


Page 18
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

If the algorithm finds the element at the end of the search or if the searching of the
element fails, the algorithm is in the worst case or it requires maximum number of steps that
can be executed for the given parameters.

Average case

The analysis of average case behavior is complex than best case and worst case
analysis, and is taken by the probability of Input data. As the volume of Input data increases,
the average case algorithm behaves like worst case algoprithm.In an average case behavior;
the searched element is located in between a position of the first and last element.

ASYMPTOTIC NOTATION ( O , Ω , θ )

The Asymptotic notation introduces some terminology that enables to make meaningful
statements about the time and space complexities of an algorithm. The functions f and g are
nonnegative functions.

O – NOTATION (Rate of Growth)

The O-notation (pronounced as Big “Oh”) is used to measure the performance of an


algorithm which depends on the volume of Input data. The O-notation is used to define the
order of growth of an algorithm, as the input size increases, the performance varies.

The function f (n) =O (g (n)) (read as f of n is big oh of g of n) iff there exists two
positive constants c and n0 such that f (n) <=cg (n) for all n, where n>=n0.
Suppose we have a program having count 3n+2. We can write it as f (n) =3n+2. We
say that 3n+2=O (n), that is of the order of n, because f (n) <=4n or 3n + 2 <=
4n, for all n>=2.

Another e.g.
Suppose f (n) =10n2+4n+2. We say that f (n) =O (n2) sine 10n2+4n+2<=11n2 for n>=2 But
here we can’t say that f (n) =O (n) since 10 n 2+ 4n+2 never less than or equal to cn,That is ,
10n2+4n+2 != O (n) , But we are able to say that f (n) =O (n3) since f (n) can be less than or
equal to cn3, same as 10n2+4n+2<=10n4, for n>=2.

Normally suppose a program has step count equals 5, and then we say that it has an order O
(constant) or O (1).

If f (n) = 3n+5 then O (n) or O (n2) or O (n3) or ---


But we cannot express its time complexity as O (1).

If f (n) = 5n2+8n+2 then O (n2) or O (n3) or ---


But we cannot express its time complexity as O (n) or O (1).

If it is 6n3+3n+2 then O (n3) or O (n4) or ----


But we cannot express its time complexity as O (n2) or O (n) or O (1).

If it is n log n+6n+9 then O (n log n) (to the base 2)


If it is log n+6 then O (log n)
Department of Computer Sciences and Applications,SJCET,Palai
Page 19
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

The seven various O-notations used are:

• O (1) – We write O (1) to mean a computing time that is constant. Where the data
item is searched in the first search itself.

• O (n) – O(n) is called linear, where all the elements of the list are traversed and
searched and the element is located at the nth location.

• O (n2) – It is called quadratic, where all elements of the list are traversed for each
element. E.g: worst case of bubble sort.

• O (log n) –The O(log n) is sufficiently faster, for a large value of n, when the
searching is done by dividing a list of items into 2 half’s and each time a half is
traversed based on middle element.E.g: binary searching, binary tree traversal.

• O (nlogn) – The O (nlogn) is better than O(n2 ),but not good as O(n),when the list
is divided into 2-halfs and a half is traversed each time.E.g: Quick sort.

The other notations are O (n2), O (n3), O (2n) etc.

OMEGA NOTATION (Ω )

The function f (n) = Ω (g(n)) (read as “ f of n is omega of g of n” ) iff there exists two
positive constants c and n0 such that f(n) >= c * g(n) for all n, where n >= n0.
Suppose we have a program having count 3n+2. we can write it as f(n)=3n+2. And we say
that 3n+2=Ω (n), that is of the order of n, because f(n)>=3n or 3n+2>=3n, for all n>=1.
Normally suppose a program has step count equals 5, and then we say that it has an order
Ω (constant) or Ω (1).

If f(n) = 3n+5 then Ω (n) or Ω (1)


But we cannot express its time complexity as O(n2).

If f(n) = 5n2+8n+2 then Ω (n2) or Ω (n) or Ω (1)


But we cannot express its time complexity as O(n3) or O(n4).

If f(n) = 6n3+3n+2 then Ω (n3) or Ω (n2) or Ω (n) or Ω (1)


But we cannot express its time complexity as O(n4)

If f(n) = n log n+6n+9 then Ω (n log n) (to the base 2)


If f(n) = log n+6 then Ω (log n)

THETA NOTATION (θ )

The function, f(n) = θ (g(n)) (read as f of n is big oh of g of n) iff there exists two
positive constants c1, c2 and n0 such that c1.g(n) >= f(n) <= c2.g(n) for all n, where n >=
n0.
Department of Computer Sciences and Applications,SJCET,Palai
Page 20
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

Suppose we have a program having count 3n+2. we can write it as f(n)=3n+2. We say that
3n+2=θ (n), that is of the order of n, because c1. n >= 3n + 2 <= c2. n .after a particular
value of n.
But we cannot say that 3n + 2 = θ (1) or θ (n2) or θ (n 3).

This can be explained by the following graph.

Normally suppose a program has step count equals 5, then we say that it has an order θ
(constant) or θ (1).
If f(n) = 3n+5 then θ (n)
But we cannot express its time complexity as θ (n2) or θ (1).

If f(n) = 5n2+8n+2 then θ (n2).


But we cannot express its time complexity as θ (n3) or θ (n4) or θ (n) or θ (1).

If f(n) = 6n3+3n+2 then θ (n3)


But we cannot express its time complexity as O(n4) or θ (n2) or θ (n) or θ (1).

If f(n) = n log n+6n+9 then θ (n log n) (to the base 2)


If f(n) = log n+6 then θ (log n)

From this we may conclude that θ is the most precise notation.


For the binary search program we will get a time of the order log n. (to the base 2)

ARRAYS

Definition An array is defined as a finite, ordered collection of homogeneous data


elements .the number of elements in the array is size of the array.

Array is also defined to be consecutive set of memory locations.The kind of data


stored in the array is the data type of the array. Base of the array is the address of the memory
locations where the first element in the array is located. Index or subscript value refers to the
position of each element which is represented in square bracket followed by a array name.

45 57 80 90 23
A [6] 77
Fig:
1.9

The memory allocation for an array is calculated using the formula,

Address (A[i]) =M + (I - L) * W

Where M – Base address

I – Position whose memory address needs to be computed.

Department of Computer Sciences and Applications,SJCET,Palai


Page 21
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

L – Lower bound Address

W – Word length.

ALGORITHMIC NOTATION OF AN ARRAY

(Algorithm 1.6)

Structure ARRAY (value, index)

Declare CREATE ( ) → array

RETRIEVE (array, index) → value

STORE (array, index, value) → array

For all A ε array, i, j ε index, x ε value

RETRIEVE (CREATE, i) :: error

RETRIEVE (STORE (A, i, x), j) :: =

IF EQUAL ( i , j ) then x else RETRIEVE ( A, j)

End

End ARRAY.

Explanation:

The function CREATE produces a new, empty array. RETRIEVE takes as input an
array and an index, and either returns the appropriate value or an error. STORE is used to
enter new index-value pairs. RETRIEVE (STORE (A,I,x),j) is explained as ‘to retrieve the jth
item where x has already been store at index i in A, which is equilavaent to checking if I and
j are equal and if so ,x,or is searched for the jth value in the remaining array, A.

The various operations done on an array includes,

 Insertion

 Deletion

 Sorting

 Searching

Department of Computer Sciences and Applications,SJCET,Palai


Page 22
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

An Array is classified as

1) 1-Dimensional Array.

2) 2-dimensional Array.

3) Multi-Dimensional array.

REPRESENTATION OF ARRAYS

1-Dimensional Array
A 1-d array is an array where all elements are referenced using only one subscript or
index value. It is very important to see how arrays are represented in memory. Computer
memory can be considered as one-dimensional array of locations from 1 to m. So we are
going to see how the multi dimensional arrays are represented in memory. Suppose a one
dimensional array declared as int a[20];The index of the array starts from 0 and ends at 19.
The number of elements in the array is 20. So the number of elements in the array declared as
int a[i] is i+1.

Suppose we have the one dimensional array declared as int a[5]. Then the elements in
the array are a[0], a[1], a[2], a[3], a[4]. The address of the first location, a[0] is the base
address ‘a’. Then address of the location a[1] is a+1. the address of a[4] is a+4. So if we have
an element a[i], then the address of that element a[i] will be a+i. (Here we assume that one
element occupies one location or one byte in memory. Actually an integer element occupies
2 locations or 2 bytes in memory and a character element occupies 1 location).

Operations on a 1-Dimensional Array


Insertion into a 1-Dimensional Array(Algorithm 1.7)

Algorithm INSERT (LA, N, K, ITEM)

//LA is a Linear array with N elements and K is any Position with condition
K<=N.The Algorithm inserts an element into kth position of LA //

Set j:=N

Repeat While j>=K

Set LA [j+1] := LA[j]

Set j: = j-1

End While

Set LA [K]:=ITEM //Insert ITEM //

Set N: = N+1 //Reset value of N //


Department of Computer Sciences and Applications,SJCET,Palai
Page 23
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

Exit.

Deletion into a 1-Dimensional Array(Algorithm 1.8)

Algorithm DELETE (LA, N, K, ITEM)

//LA is a Linear array where N number of elements are present and K is the
index variable, such that K<=N.The Algorithm deletes Kth element from
LA //

Set ITEM: = LA [K]

Repeat for J=K to N-1

Set LA[j]:= LA [J+1} //move J+1 element upward//

End for

Set N=N-1 //Reset the number N of elements in LA //

Exit.

2-dimensional Array.
A 2-d array is a collection of homogeneous elements where elements are ordered into
a number of rows and columns. Consider the case of a 2 dimensional array declared as int
a[4][5]. We can represent this as a matrix consisting of 4 rows and 5 columns. We can say
that each row of this matrix consists of 5 elements.

The two dimensional array is represented in the Figure: 1.10 given below.

0 1 2 3
X X X X X
0
X X X X X

1 X X X X X

X X X X X
2

Fig: 1.10

Department of Computer Sciences and Applications,SJCET,Palai


Page 24

Fig:
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

The 2 dimensional array A[u1] [u2] may be interpreted as u1 rows, row0, row1, …,
row u1-1, each row consisting of u2 elements. Suppose @ is the address of A [0] [0], then
the address of A [i] [0] is @+i*u2, as there are i rows , each of size u2, preceding the first
element in the ith row. Knowing the address of A [i] [0], we can say that address of A [i] [j] is
then simply as @ + i * u2 + j.

E.g.: An m x n matrix where m number of rows, n number of Columns is present is a


representation 2-d array.

1 2 3
A 4 5 6
=
7 8 9 m =3,
Fig: n= 3
1.10

The total number of elements in the above matrix =m*n.Here, in the matrix A each element
is specified using 2 index variables i and j,That is, A [i][j]

A matrix can be represented in the memory using continuous memory locations .The 2
formats for storing a matrix in the memory are,

1) Row – major ordering

2) Column – major ordering

In the row – major ordering, the elements in a matrix, is sorted in a row by row manner. In a
column major ordering, the elements are stored as column by column order.

Example

The 2-d Matrix can be illustrated using the Example in the Figure 1.11, of sparse Matrix.

Sparse matrix is as type of matrix stored in 2-d array where the value of the majority of
elements is ‘zero’ or ‘null ‘.

E.g.:

0 2 0

A= 4 5 0

0 0
Fig: 0 m =3,
n= 3
1.12

Department of Computer Sciences and Applications,SJCET,Palai


Page 25
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

In a sparse matrix, since there are few non-zero entries using. Using such 2-d array in the
memory results in the disadvantages such as,

a) Wastage of memory space

b) Increases runtime in execution, and therefore decreases the efficiency.

In order to have an effective representation of a sparse matrix, it is stored in the memory


using an alternative format known as 3-Tuple method.

Here, every non-zero entry is stored in a 2-d array with reference to its roe value and
column value .ie,it is represented in the form as,(i,j,value).If the number non-zero entries
represented in a sparse matrix is equal to ‘t’,then the row index ‘i’ will range from 0 to ‘t’
and the column index variable will range from 1 to 3 .i.e,represented as A(0 : t,1 : 3 ),which
means ‘t’ non-zero items are stored into 3 various coloumns.This representation reduces
memory wastage and reduces runtime. The type of operations done on a 2-d array includes
searching,sorting,traversing,merging etc.Another one operation performed on a matrix is
computing the transpose martrix,where A( i, j,value) is stored as B( j, i, value).

The matrix in the Figure 1.12 represents a 6 x 6 matrix of the form sparse. In order for the
effective representation of this sparse matrix in the memory, it needs to be represented in the
3-tuple format. It requires 9 rows and 3 columns, As there are 8 non-zero elements, The first
row represents the number for the representation of the row order(m),column order(n) ,and
the rest 8 rows represent the nonzero value and its row index and column index. This matrix
is told to be 3-tuple format, because, the number of columns has the row number, column
number and the last column has the non-zero terms. That is, the entire 6 x 6, 36 entries in the
memory, which actually contains only 8 values, are reduced into the form with 9 columns
and 8 entries. The 3-tuple matrix is represented in the Figure: 1.13.

The Transpose of the above matrix A can be computed using the procedure
TRANSPOSE (A, B), where A is the original matrix in 3-tuple form and B is the Transpose
of A as represented in Figure: 1.14.

Department of Computer Sciences and Applications,SJCET,Palai


Page 26

15 0 0 22
0 11 3 0 0
0
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES 0 0 0 -6 0
0
Col 1 Col
0 02 Col 30 Col 4 Col
05 Col
06
0
Row
1 91 0 0 0
Row 0 0
2
Row 0 0 28 0 0
3 0
Row
4 Fig: 1.13
Row
5
Row
6

(Algorithm 1.9)

Procedure TRANSPOSE (A, B)

// A is a matrix represented in sparse form and B is said to be its transpose.

( m, n ,t) ← ( A(0,1) ,A(0,2) ,A(0,3) )

( B ( 0,1) , B( 0,2) , B( 0,3) ) ← ( n, m, t )

If t <= 0 then return //check for zero elements

q← 1 //q is the position of next term in B

for col←1 to n do //transpose by column

for p←1 to t do //for all non-zero terms do

if A(p ,2)= col //correct column

then [ (B(q,1) B(q,2) B(q,3) ) ← ( A(p,2) , A(p,1), A(p,3) ) //insert next term of B

q←q+1 ]

End

End.

End TRANSPOSE.
Department of Computer Sciences and Applications,SJCET,Palai
Page 27
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES
6 6 8

The transpose Matrix B of matrix A1is represented below. 1


15

1 4
22
1 2 3
1 6
A (0 -15

A (1 2` 2
11
A (2
2 3
A (3 3

A (4 3 4
-6
A (5
5 1
A (6 91
A (7
6 3
A (8 28
Fig:
1.14

Department of Computer Sciences and Applications,SJCET,Palai


Page 28
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

6 6 8

1 1
15
1 2 3
1 4
B (0 22

B (1 1 6
-15

2` 2
B (3 11

B (4 2 3
3
B (5
3 4
B (6 -6
B (7 5 1
91
A (8
Fig:
6 1.15 3
28
Three-dimensional Array
The 3-D array assumes 3 dimensions to refer to a value stored.ie, the number of rows,
number of columns, and number of pages. The element in aijk assumes ith row, jth column, kth
page as illustrated in figure:1.15. Consider a three dimensional array A [u1] [u2] [u3]. The
array is interpreted as u1 two dimensional arrays of dimension u2 x u3. to locate A[I] [j] [k],
we first obtain @ + I u2 u3 + j u3 + k as the address of A [I] [j] [k].
Generalizing this, the addressing formula for any element A [i1] [i2] ….[i n] in an n
dimensional array declared as A [u1] [u2] ….[un] may be easily obtained. If @ is the address
for A [i] [0] …[0] then @ + i1 u2 u3….un is the address for A [i1] [0] …[0]. The address for
A [i1] [i2] [0] ….[0] is then @ + i1 u2 u3…un + i2 u3 u4 … un.
Repeating this way, address for A [i1] [i2] … [in] is

Department of Computer Sciences and Applications,SJCET,Palai


Page 29
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

@ + i1 u2 u3 …un
+ i2 u3 u4 … un
+ i3 u4 u5 … un



+ in-1 un
+in

i
K

Fig:
1.16
J

ORDERED LISTS

Definition: An ordered list is told to be a collection of related set of elements or a set of


data objects also called linear list. Examples are the days of a week forms a ordered list as
(MONDAY,TUESDAY,WEDNESDAY,THURSDAY,FRIDAY,SATURDAY,SUNDAY)
etc or floors of a building as (basement,lobby,first,second,third) etc.The ordered list is either
empty or can be represented as (a1,a2,a3,………..an)where ai are atoms from the set S.There
are a variety of operations that can be performed on the list, they include,

a) To find the length of the list ,n

b) To read the list from left to right and from right to left.

c) To retrieve the ith element,1<=i<=n

d) To store a new value into the ith position,1<=i<=n

e) To insert a new value in the ith position,1<=i<=n+1,making the rest the elements
shifted to the next positions.

f) To delete the element in position i,1<=i<=n causing the elements numbered in i+1,
…n shifted to the previous locations as n-1.

The most common way to represent the ordered list is using an array, where the list
element ai is at location i of the array index. The ordered list can be sequentially mapped by

Department of Computer Sciences and Applications,SJCET,Palai


Page 30
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

storing the ai and ai+1 into consecutive locations of i and i+1 of the array. This makes the users
capable to retrieve or modify the values in a random order in constant amount of time.

Ordered list processing

The problem requiring to process the ordered list can be done using the 1-d arrays.
The classical example for the List processing technique is the manipulation of symbolic
polynomials. This refers to the 4 basic operations as Addition, subtraction, multiplication and
division that could be done on 2 polynomials. Polynomials are told to be symbolic as it
contains the list of coefficients and exponents in any given polynomial.

A polynomial is defined to be a sum of terms where each term is of the form axe ,here
‘x’ is a variable,’ a’ is the coefficient and ‘e’ is the exponent.

A general polynomial A(x) is of the form,

Anxn+an-1xn-1+…+a1x +a0 where an ≠0,the degree of A is n.To represent A(x) an ordered list
of coefficients using a one dimensional array of length n+2 is used.

Example: Consider two polynomials A(x) = 3x2 +2x + 4 and B(x) =x4 +10x3 +3x2 +1

3 2 3 1
A [X] 2 0 4

4 4 1 3 10 2
B[X] 3 0 Fig:1
1.17

In the array representation of the polynomial it uses 2m+1 locations, where m is the number
of terms in the polynomial and the first entry in the array represents the total number of
terms.

The resultant polynomial C contains 2(m+n)+1) locations.

7 4 1 3 10 2 6 1 2 0
5 Fig:
1.18

The addition of the 2 polynomials can be represented using the procedure PADD.
(Algorithm 1.10)

Procedure PADD (A, B, C)

// A (1:2m+1), B (1:2n+1), C (1:2(m+n) +1) //

Department of Computer Sciences and Applications,SJCET,Palai


Page 31
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

m ← A(1);

n ← B(1);

p ← q ← r ← 2;

While p<=2m and q<=2n do

Case // Compare
exponents //

: A (p) = B (q) : C (r+1) ← A (p+1) + B (q+1) //add coefficients //

if C (r+1) ≠ 0

then [C(r) ←A (p); r ← r + 2] //Store exponent //

p←p+2;q←q+2 //advance to next


terms //

: A (p) < B (q) : C (r+1) ← B (q+1); C(r) ←B(q) // Store new term //

q ← q + 2; r ← r + 2 //advance to next
terms //

: A (p) > B (q) : C (r+1) ← A (p+1) ;C(r) ←A(p) // Store new


term //

p ← p + 2; r ← r + 2 //advance to next
terms //

End

End

While p <= 2m do //Copy remaining terms of A//

C( r) ←A (p);

C(r+1) ←A (p+1)

p ←p+2; r← r+2;

End

While q<=2n do //Copy remaining terms of B//

C( r) ←B (q);

C(r+1) ←B (q+1)

Department of Computer Sciences and Applications,SJCET,Palai


Page 32
Module 1 MCA 202 ADMN 2009 - 10
DATA STRUCTURES

q ←q+2; r← r+2;

End

C (1) ← r/2 -1 //Number of terms in the sum //

End PADD

Explanation

The procedure has the Capital letter parameters which are array names. The 3
pointers (p, q, r) are used to designate a term in A, B or C.The basic iteration is governed by
a While loop and blocks of statements are grouped together using square brackets. Variables
‘m’ and ‘n’ refers to the nonzero terms in A and B respectively.

Department of Computer Sciences and Applications,SJCET,Palai


Page 33

Das könnte Ihnen auch gefallen