
1st Feb 2014:
- Data structures: conceptual and concrete ways to organize data for efficient storage and
efficient manipulation
- Use of these data structures in the design of efficient algorithms
Topics Covered:
Introduction to Data Structure
Abstract Data Type (ADT)
Introduction to Algorithm Analysis and Design
Why study Data Structures:
Any organization has a collection of records that can be searched, processed in any order, or modified.
o Data structures organize data:
Good choice: more efficient programs
Bad choice: poor program performance
The choice can make the difference between a program running in a
few seconds or in many days
o Characteristics of a problem's solution
efficient: if it solves the problem within its resource constraints
time
space
Cost: amount of resources a solution consumes
Costs & Benefits :
A data structure requires a certain amount of:
space for each data item it stores
time to perform a single basic operation
programming effort.

Selecting a data Structure :
Select a data structure as follows:
1. Analyze the problem to determine the resource constraints a solution must meet.
2. Determine the basic operations that must be supported. Quantify the resource constraints
for each operation.
3. Select the data structure that best meets these requirements.
Abstract Data Type :
o A logical view of the data objects together with specifications of the operations required to
create and manipulate them.
To describe an algorithm, use pseudo-code
To describe a data structure, use an ADT
o A data structure is the physical implementation of an ADT
Each ADT operation is implemented by one or more subroutines.
Data structures are used to organize data in main memory
An Abstract Data Type (ADT) is a simple or structured data type whose implementation details are
hidden.
An abstract data type is not itself part of a program, because a program written in a programming
language requires the definition of a data structure, not only the operations on the data structure.
An object-oriented language (OOL) such as C++ has a direct link to abstract data types:
it implements them as classes.
Data Type :
Set of values
Operations that can be performed on those values
Ex: short int
can take values (-32768 to 32767)
Operations are +, -, *, /

Data Type Classification:











ADT :
Both an interface and an implementation
Interface and implementation are independent
Interface defines
The type of the data stored
Operations that are performed on the data
parameters of each operation
Implementation defines
data organization
developing efficient algorithm for each operation



[Figure: Data Types classification tree. Recoverable labels: Primitive (int, float, short int, ...); Non-Primitive; Language Defined (arrays, strings, structure); Reference (pointers); User Defined (class, ADT).]
Basic Operations of ADT :
insert(S, x)
delete(S, x)
search(S, x)
findMin(S)
findMax(S)
findSuccessor(S, x)
findPredecessor(S, x)
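The operations listed above can be sketched as a small Python class. This is only one possible implementation, assuming the set S is kept as a sorted Python list; the class name SortedSet and the snake_case method names are hypothetical and not part of the notes.

```python
import bisect


class SortedSet:
    """Hypothetical dynamic-set ADT backed by a sorted Python list."""

    def __init__(self):
        self._items = []  # kept in ascending order, no duplicates

    def insert(self, x):
        i = bisect.bisect_left(self._items, x)
        if i == len(self._items) or self._items[i] != x:
            self._items.insert(i, x)

    def delete(self, x):
        i = bisect.bisect_left(self._items, x)
        if i < len(self._items) and self._items[i] == x:
            self._items.pop(i)

    def search(self, x):
        i = bisect.bisect_left(self._items, x)
        return i < len(self._items) and self._items[i] == x

    def find_min(self):
        return self._items[0] if self._items else None

    def find_max(self):
        return self._items[-1] if self._items else None

    def find_successor(self, x):
        # Smallest stored value strictly greater than x, or None.
        i = bisect.bisect_right(self._items, x)
        return self._items[i] if i < len(self._items) else None

    def find_predecessor(self, x):
        # Largest stored value strictly smaller than x, or None.
        i = bisect.bisect_left(self._items, x)
        return self._items[i - 1] if i > 0 else None
```

With a sorted list, search takes O(log n) comparisons but insert and delete shift elements and so cost O(n); a balanced search tree would be a different trade-off behind the same interface.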
Classification of ADT :
Linear
Arrays
Linked list
Circular list
Doubly linked list
Stack
Queue
Circular queue
Priority queue
Non-linear
Trees
Binary Trees and Types
Binary Search Trees and Variants
Threaded Binary Trees
Heaps
Graphs
Undirected
Directed
Hash Tables
ADT Summary :
Standard data collection organizations (Data Structures) with desired operations
Described by an interface
Many implementations are possible
Facilitate reuse and easy extensibility
Design issues are Time and Space Complexity
Problems, Algorithms and Programs
Programmers deal with:
problems,
algorithms and
computer programs.
Problem: a task to be performed.
Best thought of as inputs and matching outputs.
Problem definition should include constraints on the resources that may be
consumed by any acceptable solution.
Problems as mathematical functions
A function is a matching between inputs (the domain) and outputs (the range).
An input to a function may be single number, or a collection of information.
The values making up an input are called the parameters of the function.
A particular input must always result in the same output every time the function is
computed.
Algorithm: a method or a process followed to solve a problem.
A recipe: The algorithm gives us a recipe for solving the problem by performing a
series of steps, where each step is completely understood and can be implemented.
An algorithm takes the input to a problem (function) and transforms it to the output.
A mapping of input to output.
A problem can be solved by many algorithms.
For example, the problem of sorting can be solved by the
following algorithms:
Insertion sort
Bubble sort
Selection sort
Shellsort
Mergesort
An algorithm possesses the following properties:
It must be correct.
It must be composed of a series of concrete steps.
There can be no ambiguity as to which step will be performed next.
It must be composed of a finite number of steps.
It must terminate.
A computer program is an instance, or concrete representation, of an algorithm in some programming
language.
Algorithm Design Techniques
The design of algorithms is also an important focus.
Types of algorithms:
Greedy algorithms
Divide and Conquer
Dynamic programming
Randomized algorithms
Backtracking
Algorithm Analysis:
Predict the amount of resources required:
Memory: how much space is needed?
Computational time: how fast does the algorithm run?
FACT: running time grows with the size of the input
Input size (number of elements in the input)
Size of an array, polynomial degree, # of elements in a matrix, # of bits in the binary
representation of the input, vertices and edges in a graph
Def: Running time = the number of primitive operations (steps) executed before termination
Running time is expressed as T(n) for some function T on input size n.
Two approaches to obtaining running time:
Measuring under standard benchmark conditions.
Estimating the algorithm's performance
Estimation is based on:
The size of the input
The number of basic operations
The time to complete a basic operation does not depend on the value of its operands.
List ADT :
List is an ordered sequence of elements
List has the property length (count of elements)
The elements are arranged consecutively.
Can be implemented as static (array implementation) or dynamic (linked list implementation)
Array
Fundamental data structure
Homogeneous collection of values
store values sequentially in memory
associate INDEX with each value
use array name and index to quickly access the value.
efficient method for working with large collection of data.
An array can be
Single-dimensional
Multi-dimensional
Array Memory Layout:
The index in a one-dimensional array directly defines the relative positions of the element in
actual memory.
Two-dimensional array is stored in memory using row-major or column-major storage
Operations on Array:
The common operations on arrays are searching, insertion, deletion and traversal.
An array is more suitable when the number of deletions and insertions is small, but a lot of
searching and retrieval activities are expected.
Pros and cons:
Advantages
Simple and easy to use
Faster Access to elements (Constant time random access)
Disadvantages
Fixed size
Inefficient insertions and deletions
Ques: We have stored the two-dimensional array students in memory. The array is 100 × 4 (100 rows
and 4 columns). Show the address of the element students[5][3], assuming that the element
students[1][1] is stored in the memory location with address 1000 and each element occupies only one
memory location. The computer uses row-major storage.
Solution: We can use the following formula to find the location of an element, assuming each element
occupies one memory location:
y = x + cols × (i - 1) + (j - 1)
where x is the address of the first element and (i, j) are the 1-based row and column of the target.
If the first element occupies location 1000, the target element occupies location
1000 + 4 × (5 - 1) + (3 - 1) = 1018.
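The address calculation above can be checked with a few lines of Python. This is a minimal sketch; the function name row_major_address and the parameters first_index and element_size are assumptions added for illustration (the question itself fixes them to 1 and 1).

```python
def row_major_address(base, cols, i, j, first_index=1, element_size=1):
    """Address of element [i][j] in a row-major 2-D array.

    base is the address of the first element; first_index is the index
    of the first row and column (1 in the question above).
    """
    offset = cols * (i - first_index) + (j - first_index)
    return base + element_size * offset


# students is 100 x 4, students[1][1] is at address 1000.
print(row_major_address(base=1000, cols=4, i=5, j=3))  # 1018
```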
Linked Lists:
A linked list is a collection of data in which each element contains the location of the next
element.
Each element contains two parts: data and link. The name of the list is the same as the name of
the pointer variable that points to its first node.
Head: pointer to the first node
The last node points to NULL
Operations on Linked List :
Search
Insertion
Deletion
Traversal
Search operation:
Insertion:
Four cases can arise:
Inserting into an empty list.
Insertion at the beginning of the list.
Insertion at the end of the list.
Insertion in the middle of the list.
Deletion:
Two cases are:
deleting the first node
Deleting any other node.
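The search, insertion, deletion and traversal cases above can be sketched for a singly linked list as follows. This is an illustrative version, assuming the list is kept in ascending order so that all four insertion cases arise naturally; the names Node, LinkedList and insert_sorted are hypothetical.

```python
class Node:
    def __init__(self, data, next=None):
        self.data = data  # data part
        self.next = next  # link part (None plays the role of NULL)


class LinkedList:
    def __init__(self):
        self.head = None  # pointer to the first node

    def search(self, target):
        """Return the first node holding target, or None."""
        node = self.head
        while node is not None and node.data != target:
            node = node.next
        return node

    def insert_sorted(self, data):
        # Empty list, or insertion at the beginning.
        if self.head is None or data <= self.head.data:
            self.head = Node(data, self.head)
            return
        # Insertion in the middle or at the end: find the predecessor.
        prev = self.head
        while prev.next is not None and prev.next.data < data:
            prev = prev.next
        prev.next = Node(data, prev.next)

    def delete(self, target):
        """Remove the first node holding target; return True if found."""
        if self.head is None:
            return False
        if self.head.data == target:  # deleting the first node
            self.head = self.head.next
            return True
        prev = self.head  # deleting any other node
        while prev.next is not None and prev.next.data != target:
            prev = prev.next
        if prev.next is None:
            return False
        prev.next = prev.next.next
        return True

    def traverse(self):
        """Visit every node in order and collect its data."""
        out = []
        node = self.head
        while node is not None:
            out.append(node.data)
            node = node.next
        return out
```

Insertion and deletion only relink one or two pointers once the position is known, which is why no elements have to be shifted as in an array.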
Linked List advantages and disadvantages:

Advantages
Dynamic size
Efficient insertions and deletions
Disadvantages
Not so simple
Sequential access
Linked List Application:
It is a dynamic data structure in which the list can start with no nodes and then grow as new
nodes are needed
It is a suitable structure if a large number of insertions and deletions are needed, but searching a
linked list is slower than searching an array.
It is a very efficient data structure for sorted list that will go through many insertions and
deletions
Linked List Operations:
Comparisons of linked list and Array


Variations of linked list :
Singly linked list: Each node holds only one reference, to the next node; the list is reached through its head.
Doubly linked list: A linked list in which each node holds references to both the next and the previous node,
thus allowing traversal in both directions. Every node except the first refers to its previous node.
Circular linked list: A linked list whose last node has a reference to the first node.
Try:
How many pointers are contained as data members in the nodes of a circular, doubly linked list
of integers with five nodes?
If the address of A [1][1] and A[2][1] are 1000 and 1010 respectively and each element occupies
2 bytes then the array has been stored in _________ order.
The operation of processing each element in the list is known as _____________
8th Feb 2014:
Topics:
Arrays
Linked Lists
Stacks
Queues
Stacks:
A stack is a restricted linear list in which all additions and deletions are made at one end, the
top. (LIFO)
Operations on stack ADT:
No search
No adding in arbitrary positions
No sorting
No access to anything beyond the top element.
Stack --- stack(stackName)
Push ---push(stackName,dataItem)
Pop ---- pop(stackName,dataItem)
Empty--- empty(stackName)
Stack ADT implementation:
A stack ADT can be implemented using either an array or a linked list.
Stack array Implementation:
createStack(S): Define an array S of some fixed size N
    top ← -1
push(x, S): if top = N - 1 then error (stack is full)
    else top ← top + 1
         S[top] ← x
isStackEmpty(S): return (top < 0)
pop(S): if isStackEmpty(S) then error (stack is empty)
    else item ← S[top]
         top ← top - 1
         return(item)
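The array implementation above translates almost line for line into Python. This is a sketch under the assumption that a fixed-length Python list stands in for the array S; the class name ArrayStack and the choice of exception types are illustrative only.

```python
class ArrayStack:
    """Fixed-capacity stack stored in an array, as in the pseudocode above."""

    def __init__(self, capacity):
        self._data = [None] * capacity  # array S of fixed size N
        self._top = -1                  # index of the top element

    def is_empty(self):
        return self._top < 0

    def push(self, x):
        if self._top == len(self._data) - 1:
            raise OverflowError("stack is full")
        self._top += 1
        self._data[self._top] = x

    def pop(self):
        if self.is_empty():
            raise IndexError("pop from empty stack")
        item = self._data[self._top]
        self._data[self._top] = None  # drop the stale reference
        self._top -= 1
        return item
```

Both push and pop touch only the top position, so each runs in constant time.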
Application of stack
Expression Evaluation
Function calls
Memory Management (Run time Environment)
Backtracking
Parenthesis Matching
Expression Evaluation:
The three Notations of Expressions are:
Infix a+b
Postfix(RPN) ab+
Prefix(PN) +ab
Conversion from Infix to postfix:
1. Print operands as they arrive.
2. If the stack is empty or contains a left parenthesis on top, push the incoming
operator onto the stack.
3. If the incoming symbol is a left parenthesis, push it on the stack.
4. If the incoming symbol is a right parenthesis, pop the stack and print the
operators until you see a left parenthesis. Discard the pair of parenthesis.
5. If the incoming symbol has higher precedence than the top of the stack, push it
on the stack.
6. If the incoming symbol has equal precedence with the top of the stack, use
association. If the association is left to right, pop and print the top of the stack and
then push the incoming operator. If the association is right to left, push the
incoming operator.
7. If the incoming symbol has lower precedence than the symbol on the top of the
stack, pop the stack and print the top operator. Then test the incoming operator
against the new top of stack.
8. At the end of the expression, pop and print all operators on the stack. (No
parentheses should remain.)
Ex:
Infix arithmetic expression a + b * c - d.
Read: a      Output: a               opStack: empty
Read: +      Output: a               opStack: +
Read: b      Output: a b             opStack: +
Read: *      Output: a b             opStack: + *
Read: c      Output: a b c           opStack: + *
Read: -      Output: a b c           opStack: + *    (- has lower precedence than *)
Pop *        Output: a b c *         opStack: +      (- has equal precedence with +, left to right)
Pop +        Output: a b c * +       opStack: empty
Push -       Output: a b c * +       opStack: -
Read: d      Output: a b c * + d     opStack: -
End          Output: a b c * + d -   opStack: empty
Ex:
a/b^c-d              →  abc^/d-
a-b+c                →  ab-c+
a*(b+c)              →  abc+*
a * (b + c * d) + e  →  a b c d * + * e +
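The conversion rules can be sketched in Python as below. This is a simplified illustration, assuming single-character operands, well-formed input with balanced parentheses, and the usual precedence table with ^ as the only right-associative operator; the names PRECEDENCE, RIGHT_ASSOCIATIVE and infix_to_postfix are hypothetical.

```python
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2, "^": 3}
RIGHT_ASSOCIATIVE = {"^"}


def infix_to_postfix(expression):
    """Convert an infix string with one-character operands to postfix."""
    output = []
    stack = []  # operator stack
    for symbol in expression.replace(" ", ""):
        if symbol == "(":
            stack.append(symbol)
        elif symbol == ")":
            # Pop and print operators until the matching left parenthesis.
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()  # discard the "("
        elif symbol in PRECEDENCE:
            # Pop while the top has higher precedence, or equal precedence
            # with left-to-right association.
            while stack and stack[-1] != "(":
                top = stack[-1]
                if PRECEDENCE[top] > PRECEDENCE[symbol] or (
                    PRECEDENCE[top] == PRECEDENCE[symbol]
                    and symbol not in RIGHT_ASSOCIATIVE
                ):
                    output.append(stack.pop())
                else:
                    break
            stack.append(symbol)
        else:
            output.append(symbol)  # operands are printed as they arrive
    while stack:
        output.append(stack.pop())
    return "".join(output)


print(infix_to_postfix("a + b * c - d"))  # abc*+d-
```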
Queue ADT:
A queue is a linear list in which data can only be inserted at one end, called the rear, and deleted
from the other end, called the front.(FIFO).
Operations on Queue ADT:
Queue
Enqueue
Dequeue
Empty
Queue Implementation:
A queue ADT can be implemented using either an array or a linked list
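As an illustration of the array option, here is a sketch of a fixed-capacity queue that treats the array as circular. The notes do not prescribe this layout; the class name ArrayQueue, the front/size bookkeeping and the exception types are assumptions.

```python
class ArrayQueue:
    """Fixed-capacity FIFO queue stored in a circular array."""

    def __init__(self, capacity):
        self._data = [None] * capacity
        self._front = 0  # index of the element to dequeue next
        self._size = 0   # number of stored elements

    def is_empty(self):
        return self._size == 0

    def enqueue(self, x):
        if self._size == len(self._data):
            raise OverflowError("queue is full")
        rear = (self._front + self._size) % len(self._data)
        self._data[rear] = x  # insert at the rear
        self._size += 1

    def dequeue(self):
        if self.is_empty():
            raise IndexError("dequeue from empty queue")
        item = self._data[self._front]  # delete from the front
        self._data[self._front] = None
        self._front = (self._front + 1) % len(self._data)
        self._size -= 1
        return item
```

Wrapping the indices with the modulo operator lets slots freed at the front be reused without shifting elements.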
Application of Queue ADT:
For implementing any "natural" FIFO service, like telephone enquiries, reservation requests,
traffic flow, etc.
For implementing any "computational" FIFO service, for instance, to access some resources.
Examples: printer queues, disk queues, etc.
For searching in special data structures (breadth-first search in graphs and trees).
For handling scheduling of processes in a multitasking operating system.


22nd Feb 2014:
O-Notation : Intuitively: O(g(n)) = the set of functions with a smaller or same order of growth
as g(n)












Examples :
3n + 2 = O(n) ; 3n + 2 <= 4n for all n >= 2
3n + 3 = O(n) ; 3n + 3 <= 4n for all n >= 3
100n + 6 = O(n) ; 100n + 6 <= 101n for all n >= 6
10n^2 + 4n + 2 = O(n^2) ; 10n^2 + 4n + 2 <= 11n^2 for all n >= 5

Ω-notation : Intuitively: Ω(g(n)) = the set of functions with a larger or same order of growth as
g(n)














3n + 2 = Ω(n) ; 3n + 2 >= 3n for all n >= 1
3n + 3 = Ω(n) ; 3n + 3 >= 3n for all n >= 1
100n + 6 = Ω(n) ; 100n + 6 >= 100n for all n >= 1
3n + 3 = Θ(n) :
3n + 3 <= 6n for all n >= 1, c2 = 6
3n + 3 >= 3n for all n >= 1, c1 = 3
3n <= 3n + 3 <= 6n for all n >= 1, so 3n + 3 = Θ(n)
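The constants used in the examples above can be spot-checked numerically. The sketch below only tests a finite range of n, so it illustrates the inequalities rather than proving them; the helper names upper_bound_holds and lower_bound_holds and the cut-off n_max are assumptions.

```python
def upper_bound_holds(f, g, c, n0, n_max=10_000):
    """Check f(n) <= c * g(n) for every n0 <= n <= n_max."""
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))


def lower_bound_holds(f, g, c, n0, n_max=10_000):
    """Check f(n) >= c * g(n) for every n0 <= n <= n_max."""
    return all(f(n) >= c * g(n) for n in range(n0, n_max + 1))


# 3n + 2 <= 4n holds from n = 2 on, but not at n = 1.
print(upper_bound_holds(lambda n: 3 * n + 2, lambda n: n, c=4, n0=2))  # True
print(upper_bound_holds(lambda n: 3 * n + 2, lambda n: n, c=4, n0=1))  # False

# 3n <= 3n + 3 <= 6n for n >= 1: lower constant 3, upper constant 6.
print(lower_bound_holds(lambda n: 3 * n + 3, lambda n: n, c=3, n0=1))  # True
print(upper_bound_holds(lambda n: 3 * n + 3, lambda n: n, c=6, n0=1))  # True
```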


Sorting:
Iterative methods:
Insertion sort
Bubble sort
Selection sort

Divide and conquer
Merge sort
Quicksort

Counting sort
Radix sort
Bucket sort

Insertion Sort
Alg.: INSERTION-SORT(A)
for j ← 2 to n
    do key ← A[j]
       // Insert A[j] into the sorted sequence A[1 . . j - 1]
       i ← j - 1
       while i > 0 and A[i] > key
           do A[i + 1] ← A[i]
              i ← i - 1
       A[i + 1] ← key
Insertion sort sorts the elements in place
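For comparison, the same algorithm in Python. This is a sketch using 0-based indexing, so the outer loop starts at index 1 and the inner test is i >= 0; the function name insertion_sort is illustrative.

```python
def insertion_sort(a):
    """Sort the list a in place in ascending order."""
    for j in range(1, len(a)):
        key = a[j]
        # Insert a[j] into the sorted prefix a[0 .. j-1].
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]  # shift larger elements one slot to the right
            i -= 1
        a[i + 1] = key
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```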





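Analysis of Insertion Sort:
INSERTION-SORT(A)                                                  cost   times
for j ← 2 to n                                                     c1     n
    do key ← A[j]                                                  c2     n - 1
       // Insert A[j] into the sorted sequence A[1 . . j - 1]      0      n - 1
       i ← j - 1                                                   c4     n - 1
       while i > 0 and A[i] > key                                  c5     sum over j = 2..n of t_j
           do A[i + 1] ← A[i]                                      c6     sum over j = 2..n of (t_j - 1)
              i ← i - 1                                            c7     sum over j = 2..n of (t_j - 1)
       A[i + 1] ← key                                              c8     n - 1
t_j: the number of times the while loop test is executed for that value of j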



Best Case Analysis :

The array is already sorted
A[i] ≤ key the first time the while loop test is run (when i = j - 1)
t_j = 1
$T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 (n-1) + c_8 (n-1) = (c_1 + c_2 + c_4 + c_5 + c_8)\,n - (c_2 + c_4 + c_5 + c_8) = an + b = \Theta(n)$

Worst Case Analysis :
The array is in reverse sorted order
Always A[i] > key in while loop test
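Have to compare key with all elements to the left of the j-th position, i.e. compare
with j - 1 elements, so t_j = j
In general, the running time is
$T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 \sum_{j=2}^{n} t_j + c_6 \sum_{j=2}^{n} (t_j - 1) + c_7 \sum_{j=2}^{n} (t_j - 1) + c_8 (n-1)$
With t_j = j the sums are $\sum_{j=2}^{n} j = \frac{n(n+1)}{2} - 1$ and $\sum_{j=2}^{n} (j-1) = \frac{n(n-1)}{2}$, so T(n) is a quadratic function of n: $T(n) = \Theta(n^2)$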
Selection Sort
Alg.: SELECTION-SORT(A)
n ← length[A]
for j ← 1 to n - 1
    do smallest ← j
       for i ← j + 1 to n
           do if A[i] < A[smallest]
                  then smallest ← i
       exchange A[j] ↔ A[smallest]
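A Python version of the selection sort pseudocode, given as a sketch with 0-based indexing; the function name selection_sort is illustrative.

```python
def selection_sort(a):
    """Sort the list a in place in ascending order."""
    n = len(a)
    for j in range(n - 1):
        smallest = j
        # Find the index of the smallest element in a[j .. n-1].
        for i in range(j + 1, n):
            if a[i] < a[smallest]:
                smallest = i
        a[j], a[smallest] = a[smallest], a[j]  # exchange
    return a


print(selection_sort([8, 4, 6, 9, 2, 3, 1]))  # [1, 2, 3, 4, 6, 8, 9]
```

The inner loop always scans the whole unsorted suffix, so the number of comparisons is n(n - 1)/2 regardless of the input order.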
Divide and Conquer:
Divide the problem into a number of sub-problems
Similar sub-problems of smaller size
Conquer the sub-problems
Solve the sub-problems recursively
If the sub-problem size is small enough, solve the problems in a straightforward manner
Combine the solutions to the sub-problems
Obtain the solution for the original problem
Merge Sort Approach :
To sort an array A[p . . r]:
Divide
Divide the n-element sequence to be sorted into two subsequences of n/2
elements each
Conquer
Sort the subsequences recursively using merge sort
When the size of the sequences is 1 there is nothing more to do
Combine
Merge the two sorted subsequences

Merge Sort :
Alg.: MERGE-SORT(A, p, r)
if p < r                             // Check for base case
    then q ← ⌊(p + r)/2⌋             // Divide
         MERGE-SORT(A, p, q)         // Conquer
         MERGE-SORT(A, q + 1, r)     // Conquer
         MERGE(A, p, q, r)           // Combine
Initial call: MERGE-SORT(A, 1, n)
Merging:
Input: Array A and indices p, q, r such that p ≤ q < r
Subarrays A[p . . q] and A[q + 1 . . r] are sorted
Output: One single sorted subarray A[p . . r]
Idea for merging:
Two piles of sorted cards
Choose the smaller of the two top cards
Remove it and place it in the output pile
Repeat the process until one pile is empty
Take the remaining input pile and place it face-down onto the output pile
Merge Pseudo Code:
Alg.: MERGE(A, p, q, r)
1. Compute n1 = q - p + 1 and n2 = r - q
2. Copy the first n1 elements into L[1 . . n1 + 1] and the next n2 elements into R[1 . . n2 + 1]
3. L[n1 + 1] ← ∞; R[n2 + 1] ← ∞
4. i ← 1; j ← 1
5. for k ← p to r
6.     do if L[i] ≤ R[j]
7.            then A[k] ← L[i]
8.                 i ← i + 1
9.            else A[k] ← R[j]
10.                j ← j + 1
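The MERGE and MERGE-SORT procedures can be sketched in Python as follows. The sketch uses 0-based indices and math.inf as the sentinel, which assumes the elements are numbers; the function names and the default arguments of merge_sort are illustrative choices.

```python
import math


def merge(a, p, q, r):
    """Merge the sorted runs a[p..q] and a[q+1..r] (inclusive bounds)."""
    left = a[p:q + 1] + [math.inf]       # sentinel marks the end of the pile
    right = a[q + 1:r + 1] + [math.inf]
    i = j = 0
    for k in range(p, r + 1):
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1


def merge_sort(a, p=0, r=None):
    """Sort a[p..r] in place; by default the whole list."""
    if r is None:
        r = len(a) - 1
    if p < r:                     # check for base case
        q = (p + r) // 2          # divide
        merge_sort(a, p, q)       # conquer
        merge_sort(a, q + 1, r)   # conquer
        merge(a, p, q, r)         # combine
    return a


print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))  # [1, 2, 2, 3, 4, 5, 6, 7]
```

The sentinels remove the need to test whether either pile is empty inside the loop, at the cost of requiring a value that compares greater than every element.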
Running Time of Merge:
Initialization (copying into temporary arrays):
O(n1 + n2) = O(n)
Adding the elements to the final array (the last for loop):
n iterations, each taking constant time: O(n)
Total time for merge:
O(n)
Analysing Divide and Conquer:
The recurrence is based on the three steps of the paradigm:
T(n): running time on a problem of size n
Divide the problem into a subproblems, each of size n/b: takes D(n)
Conquer (solve) the subproblems: takes aT(n/b)
Combine the solutions: takes C(n)
$T(n) = \begin{cases} O(1) & \text{if } n \le c \\ a\,T(n/b) + D(n) + C(n) & \text{otherwise} \end{cases}$
Merge Sort Running Time:
Divide:
Compute q as the average of p and r: D(n) = O(1)
Conquer:
Recursively solve 2 subproblems, each of size n/2: 2T(n/2)
Combine:
MERGE on an n-element subarray takes O(n) time: C(n) = O(n)
$T(n) = \begin{cases} O(1) & \text{if } n = 1 \\ 2T(n/2) + O(n) & \text{if } n > 1 \end{cases}$

Solve the Recurrence:
$T(n) = \begin{cases} c & \text{if } n = 1 \\ 2T(n/2) + cn & \text{if } n > 1 \end{cases}$
Use the Master Theorem:

Compare $n^{\log_2 2} = n$ with f(n) = cn
Case 2: $T(n) = \Theta(n \lg n)$
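The result can be sanity-checked by evaluating the recurrence directly. For n a power of two, unrolling T(n) = 2T(n/2) + cn with T(1) = c gives T(n) = cn lg n + cn, which is Θ(n lg n). The sketch below compares the two for small powers of two; the function names and the restriction to powers of two are assumptions made for the check.

```python
def t_recursive(n, c=1):
    """Evaluate T(n) = 2T(n/2) + c*n with T(1) = c, for n a power of two."""
    if n == 1:
        return c
    return 2 * t_recursive(n // 2, c) + c * n


def t_closed_form(n, c=1):
    """c*n*lg(n) + c*n, for n a power of two."""
    lg_n = n.bit_length() - 1  # exact log2 for powers of two
    return c * n * lg_n + c * n


for k in range(11):
    n = 2 ** k
    print(n, t_recursive(n), t_closed_form(n))  # the last two columns agree
```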
