
Design and Analysis of Algorithms

Narayan Changder

Copyright 2011 by Narayan Changder

All rights reserved.

ISBN . . .

. . . Publications

Notes developed by Mr Narayan Changder


Contents
1 COMPLEXITY ANALYSIS
  1.1 Classification of algorithms
  1.2 Correctness and Analysis of Algorithms
  1.3 Algorithm analysis
  1.4 Goal of algorithm analysis
  1.5 The RAM Model of Computation
    1.5.1 Hypothetical RAM model is not always true
  1.6 Asymptotic analysis
    1.6.1 Best, Worst, and Average-Case Complexity
    1.6.2 Rates of growth
    1.6.3 Classification of Growth
  1.7 Importance of efficient algorithms
  1.8 Program performance measurement
  1.9 Data structures and algorithms

2 LINEAR LISTS
  2.1 Abstract data type (ADT)
  2.2 Linked Lists


Chapter 1 COMPLEXITY ANALYSIS


What is an algorithm? An algorithm is a procedure to accomplish a specific task. An algorithm is the idea behind any reasonable computer program: it takes any of the possible input instances and transforms it into the desired output. The study of algorithms concentrates on the high-level design of data structures and on methods for using them to solve problems. The subject is highly mathematical, but the mathematics can be compartmentalized, allowing a student to concentrate on the what rather than the why.

1.1

Classification of algorithms

There are various ways to classify algorithms, each with its own merits and demerits. Algorithms can be categorized by style and by application. Commonly used styles are divide and conquer (recursion), dynamic programming (bottom-up or memoization), and the greedy strategy (do the best thing locally and hope for the best). Common application categories include mathematics, geometry, graphs, string matching, sorting, and searching. Combinatorial algorithms form a larger category that includes any algorithm that must find the best result among a large space of possibilities; many combinatorial problems are NP-complete. Do not confuse dynamic programming with linear programming. The latter refers to a particular problem in linear algebra and numerical analysis with important applications in industry and operations research. It is not a style or technique, nor does it have anything to do with programs that run in linear time; it is a specific problem whose solvers can in turn be used to attack many combinatorial problems.
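To make the contrast between styles concrete, here is a short Python sketch (an added illustration, not part of the original notes) that computes Fibonacci numbers first by plain divide-and-conquer-style recursion and then by dynamic programming with memoization.

from functools import lru_cache

def fib_recursive(n):
    # Plain recursion: the same subproblems are recomputed many times,
    # so the running time grows exponentially in n.
    if n < 2:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    # Dynamic programming via memoization: each subproblem is solved
    # once and cached, so the running time is linear in n.
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

print(fib_memo(40))   # fast; fib_recursive(40) is noticeably slower

The two functions compute the same values; only the bookkeeping of intermediate results differs, which is exactly what distinguishes the two styles.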


1.2

Correctness and Analysis of Algorithms:

There are many ways to analyze the time and space requirements of an algorithm. We will describe the time requirements of an algorithm as a function of the input size, rather than as a list of measured times for particular inputs on a particular computer. Usually the time requirements are the main concern, so the space requirements become a secondary issue. One way to analyze the time complexity of an algorithm is worst-case analysis: here we imagine that we never get lucky, and calculate how long the program will take in the worst case. Average-case analysis may be more useful when the worst case does not show up very often; this method averages the time complexity of the algorithm over all possible inputs, weighted by the input distribution. Average-case analysis is not as common as worst-case analysis, and average-case complexity is often difficult to compute because it requires careful probabilistic analysis. Another method of measuring time complexity is called amortized analysis, which is motivated by the following scenario. Suppose there is an algorithm whose average-case complexity is linear time, but in practice it is run in conjunction with some other algorithms whose worst-case complexities are constant time. It turns out that when these algorithms are used together, the slow one is invoked so much less frequently than the fast ones that we can distribute its linear cost over the fast operations, so that the total amortized time over the use of all the algorithms is constant time per run. This kind of analysis comes up with the fancier data structures, such as Fibonacci heaps, which support a collection of operations; amortized analysis is the only way in which such a fancier data structure can be proved better than the standard binary heap.
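As a standard illustration of amortized analysis (a sketch I have added; it is not the exact scenario described above), consider a dynamic array that doubles its capacity whenever it fills up. An individual append can be expensive, but the total work over n appends is O(n), so the amortized cost per append is constant.

class DynamicArray:
    """Append-only array that doubles its capacity when full.

    A single append may copy every existing element (worst case O(n)),
    but over n appends the total number of copies is less than 2n,
    so the amortized cost per append is O(1).
    """

    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.data = [None] * self.capacity
        self.copies = 0   # bookkeeping to illustrate the amortized argument

    def append(self, value):
        if self.size == self.capacity:
            self.capacity *= 2
            new_data = [None] * self.capacity
            for i in range(self.size):    # the occasional expensive step
                new_data[i] = self.data[i]
                self.copies += 1
            self.data = new_data
        self.data[self.size] = value
        self.size += 1

arr = DynamicArray()
for i in range(1000):
    arr.append(i)
print(arr.copies)   # 1023, i.e. fewer than 2 copies per append on average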

1.3

Algorithm analysis

Analyzing an algorithm means predicting the resources that the algorithm requires. Occasionally, resources such as memory, communication bandwidth, or computer hardware are the primary concern, but most often it is computational time that we want to measure. Algorithms are the most important and durable part of computer science because they can be studied in a language- and machine-independent way; this means we need techniques that enable us to compare the efficiency of algorithms without implementing them. Our two most important tools for analyzing any algorithm are:

- The RAM model of computation.
- The asymptotic analysis of the worst-case, average-case, and best-case behaviour of the algorithm.


1.4

Goal of algorithm analysis

The practical goal of algorithm analysis is to predict the performance of different algorithms in order to make suitable design decisions. During the 2008 United States Presidential Campaign, candidate Barack Obama was asked to perform an impromptu analysis when he visited Google. Chief executive Eric Schmidt jokingly asked him for the most efficient way to sort a million 32-bit integers. Obama had apparently been tipped off, because he quickly replied, "I think the bubble sort would be the wrong way to go." This is true: bubble sort is conceptually simple but slow for large datasets. The answer Schmidt was probably looking for is radix sort.

So the goal of algorithm analysis is to make meaningful comparisons between algorithms, but there are some problems. The relative performance of two algorithms might depend on characteristics of the hardware, so one algorithm might be faster on Machine A and the other on Machine B; the general solution is to specify a machine model and analyze the number of steps, or operations, an algorithm requires under that model. Relative performance might also depend on the details of the dataset. For example, some sorting algorithms run faster if the data are already partially sorted, while other algorithms run slower in this case; a common way to avoid this problem is to analyze the worst-case scenario. It is also sometimes useful to analyze average-case performance, but that is usually harder, and it is sometimes unclear what set of cases to average over. Finally, relative performance depends on the size of the problem: a sorting algorithm that is fast for short lists might be slow for long ones. The usual solution is to express run time (or the number of operations) as a function of problem size and to compare the functions asymptotically as the problem size increases.

1.5

The RAM Model of Computation

Machine-independent algorithm design depends upon a hypothetical generic one-processor computer called the Random Access Machine, or RAM. In the RAM model, instructions are executed one after another, with no concurrent operations, and the only data types are integer and floating point. Under this model of computation we are confronted with a computer where:

- Each simple operation (+, *, -, =, if, call) takes exactly one time step.
- Loops and subroutines are not considered simple operations. Instead, they are the composition of many single-step operations. It makes no sense for "sort" to be a single-step operation, since sorting 1,000,000 items will certainly take much longer than sorting 10 items; the time it takes to run a program depends upon the number of loop iterations or the specific nature of the subprogram.
- Each memory access takes exactly one time step, and we have as much memory as we need. The RAM model does not attempt to model the memory hierarchy of contemporary computers: it takes no notice of whether an item is in cache or on disk (we do not model caches or virtual memory).
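As a rough illustration of counting steps under the RAM model (a sketch with arbitrary bookkeeping, not part of the original notes), the following Python fragment tallies the simple operations performed by a summation loop. The exact constants depend on what one chooses to count as a simple operation, but the total grows linearly with n.

def sum_with_step_count(values):
    steps = 0
    total = 0           # one assignment
    steps += 1
    for x in values:    # each iteration: one loop test/advance ...
        total += x      # ... plus one addition and one assignment
        steps += 3
    steps += 1          # final loop exit test
    return total, steps

for n in (10, 100, 1000):
    _, steps = sum_with_step_count(list(range(n)))
    print(n, steps)     # roughly 3n + 2 steps, i.e. linear in n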

1.5.1

Hypothetical RAM model is not always true

The RAM is a simple model that gives us a simple view of how computers perform. In the RAM model we measure run time by counting up the number of steps an algorithm takes on a given problem instance. Though the RAM is an excellent model for understanding how an algorithm will perform on a real computer, it has some drawbacks:

- Adding two numbers takes less time than multiplying two numbers on most processors, which violates the assumption that every simple operation takes exactly one time step.
- Compiler loop unrolling blurs the boundary between loops and simple operations.
- Memory access times differ greatly depending on whether data sits in cache or on disk, which violates the assumption that every memory access takes one time step.

And yet, despite these complaints, the RAM is an excellent model for understanding how an algorithm will perform on a real computer. Every model has a size range over which it is useful. Take, for example, the model that the Earth is flat. You might argue that this is a bad model, since, as we all know, the Earth is round. But when laying the foundation of a house, the flat-Earth model is sufficiently accurate that it can be reliably used, and it is much easier to work with a flat-Earth model than with a round-Earth one. The same is true of the RAM model of computation.

1.6

Asymptotic analysis

A programmer usually has a choice of data structures and algorithms to use. Choosing the best one for a particular job involves, among other factors, two important measures:

- Time complexity: how much time will the program take?
- Space complexity: how much storage will the program need?

It really makes little sense to classify an individual program as being efficient or inefficient. It makes more sense to compare two (correct) programs that perform the same task and ask which one of the two is more efficient, that is, which one performs the task more quickly. However, even here there are difficulties. The running time of a program is not well-defined: it can differ depending on the number and speed of the processors in the computer on which it is run, on details of the compiler used to translate the program from high-level language to machine language, and on the size of the problem the program has to solve.

The term efficiency can refer to efficient use of almost any resource, including time, computer memory, disk space, or network bandwidth. In this section, however, we will deal exclusively with time efficiency, and the major question that we want to ask about a program is: how long does it take to perform its task? When the run times of two programs are compared, it often happens that Program A solves small problems faster than Program B, while Program B solves large problems faster than Program A, so it is simply not the case that one program is faster than the other in all cases.

In spite of these difficulties, there is a field of computer science dedicated to analyzing the efficiency of programs, known as Analysis of Algorithms. One of its main techniques is asymptotic analysis. The term asymptotic here means, roughly, "the tendency in the long run". An asymptotic analysis of an algorithm's run time looks at how the run time depends on the size of the problem. The analysis is asymptotic because it only considers what happens to the run time as the size of the problem increases without limit; it is not concerned with what happens for problems of small size or, in fact, for problems of any fixed finite size. Showing that Algorithm A is asymptotically faster than Algorithm B does not necessarily mean that Algorithm A will run faster than Algorithm B for problems of size 10, or 1000, or even 1,000,000; it only means that if you keep increasing the problem size, you will eventually reach a point where Algorithm A is faster than Algorithm B. An asymptotic analysis is only a first approximation, but in practice it often gives important and useful information.
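As an illustration of asymptotic comparison (a sketch, not from the original notes; exact timings and the crossover point depend on the machine and interpreter), the following Python fragment times a quadratic insertion sort against Python's built-in O(n log n) sort. For small inputs the two are comparable; the asymptotically slower routine falls far behind as n grows.

import random
import timeit

def insertion_sort(a):
    # O(n^2) in the worst case, but competitive for very small lists.
    a = list(a)
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

for n in (100, 1000, 3000):
    data = [random.random() for _ in range(n)]
    t_ins = timeit.timeit(lambda: insertion_sort(data), number=1)
    t_std = timeit.timeit(lambda: sorted(data), number=1)  # O(n log n)
    print(n, round(t_ins, 4), round(t_std, 4))
# For small n the two are close; as n grows, the quadratic routine loses badly.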

1.6.1

Best, Worst, and Average-Case Complexity

One can count how many steps an algorithm takes on any given input instance by executing it under the RAM model of computation. However, to understand how good or bad an algorithm is in general, we must know how it behaves over all instances.

To understand the notions of best, worst, and average-case complexity, think about running an algorithm over all possible instances of data that can be fed to it. For example, in the problem of sorting, the set of possible input instances consists of all possible arrangements of n keys, over all possible values of n. We can represent each input instance as a point on a graph (shown in Figure 2.1), where the x-axis represents the size of the input problem (for sorting, the number of items to sort) and the y-axis denotes the number of steps taken by the algorithm on this instance. We can define three interesting functions over the plot of these points:

- The worst-case complexity of the algorithm is the function defined by the maximum number of steps taken on any instance of size n. It corresponds to the curve passing through the highest point in each column. This is the kind of analysis we will mostly focus on: we define T(n) to be the maximum time on any input of size n. Some inputs are better and some are worse, but by looking at the worst of them we can make a guarantee to the user, which is why we generally want upper bounds.
- The best-case complexity of the algorithm is the function defined by the minimum number of steps taken on any instance of size n, corresponding to the curve passing through the lowest point of each column. Best-case analysis is of limited use because the best case rarely happens in practice; interestingly, for the sorting problem the inputs that most commonly get sorted are ones that are already sorted, or at least almost sorted.
- The average-case complexity of the algorithm is the function defined by the average number of steps over all instances of size n. Here T(n) is the expected time over all inputs of size n: the time of each input weighted by the probability that this input occurs. How do we know the probability that a particular input occurs in a given situation? We have to make an assumption about the statistical distribution of inputs; otherwise, expected time does not mean anything. One of the most common assumptions is that all inputs are equally likely, which is called the uniform distribution. The average-case time is given by the following formula:
A(n) = \sum_{i=1}^{m} p_i \, t_i \qquad (1.1)
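In equation (1.1), p_i is the probability of the i-th input of size n and t_i is the number of steps the algorithm takes on that input. As a concrete check (an illustration added here, not part of the original notes): for sequential search of a key known to be in an array of n distinct elements, with every position equally likely (p_i = 1/n) and t_i = i comparisons when the key sits in position i, the average cost is (n + 1)/2. The sketch below verifies this numerically.

def sequential_search_comparisons(a, key):
    # Returns the number of comparisons a sequential search performs.
    comparisons = 0
    for x in a:
        comparisons += 1
        if x == key:
            break
    return comparisons

n = 100
a = list(range(n))
# Uniform distribution: every position is equally likely, p_i = 1/n.
average = sum(sequential_search_comparisons(a, key) for key in a) / n
print(average, (n + 1) / 2)   # both print 50.5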



1.6.2

Rates of growth

Analysis of algorithms is the branch of computer science that studies the performance of algorithms. In the analysis of algorithms, it is meaningless to say that an algorithm M, when presented with input x, runs in y seconds. This is because the actual time is not a function of the algorithm alone; it also depends on other factors, e.g. how and on what machine the algorithm is implemented, or in what language it is implemented. The most important factor is the rate of increase in the number of operations needed to solve a problem as the size of the problem increases without limit. This is referred to as the rate of growth of the algorithm. What happens with small sets of input data is not as interesting as what happens when the data set gets large.
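To make the idea of growth rates concrete, the following small Python table (illustrative only, not part of the original notes) prints a few common growth functions side by side; the exponential column dwarfs the polynomial ones long before n gets large.

import math

# Illustrative only: how common growth rates compare as n increases.
print(f"{'n':>8} {'log n':>8} {'n log n':>12} {'n^2':>12} {'2^n':>14}")
for n in (10, 20, 40, 80):
    print(f"{n:>8} {math.log2(n):>8.1f} {n * math.log2(n):>12.1f} "
          f"{n * n:>12} {2 ** n:>14}")
# The 2^n column explodes long before the polynomial columns do,
# which is why the rate of growth dominates for large problem sizes.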

1.6.3

Classification of Growth

Because the rate of growth of an algorithm is important, and we have seen that the rate of growth is dominated by the largest term in an equation, we discard the terms that grow more slowly. When we strip all of these things away, we are left with what we call the order of the function, or of the related algorithm. We can then group algorithms together based on their order. We group them into three categories: those that grow at least as fast as some function, those that grow at the same rate, and those that grow no faster.

- Big Oh: f(n) = O(g(n)) means that c·g(n) is an upper bound on f(n). Thus there exists some constant c such that f(n) ≤ c·g(n) for large enough n (i.e., for all n ≥ n₀, for some constant n₀).
- Big Omega: f(n) = Ω(g(n)) means that c·g(n) is a lower bound on f(n). Thus there exists some constant c such that f(n) ≥ c·g(n) for all n ≥ n₀.
- Big Theta: f(n) = Θ(g(n)) means that c₁·g(n) is an upper bound on f(n) and c₂·g(n) is a lower bound on f(n), for all n ≥ n₀. Thus there exist constants c₁ and c₂ such that f(n) ≤ c₁·g(n) and f(n) ≥ c₂·g(n). This means that g(n) provides a nice, tight bound on f(n). Formally, this class of functions is the place where Big Oh and Big Omega overlap: Θ(g(n)) = O(g(n)) ∩ Ω(g(n)).

Example 1.6.1 3n² − 100n + 6 = O(n²), because I choose c = 3 and 3n² > 3n² − 100n + 6 for all n ≥ 1;
Example 1.6.2 3n² − 100n + 6 = O(n³), because I choose c = 1 and n³ > 3n² − 100n + 6 when n > 3;
Example 1.6.3 3n² − 100n + 6 ≠ O(n), because for any constant c, I can choose n > c, and then c·n < 3n² − 100n + 6;
Example 1.6.4 3n² − 100n + 6 = Ω(n²), because I choose c = 2 and 2n² < 3n² − 100n + 6 when n > 100;
Example 1.6.5 3n² − 100n + 6 ≠ Ω(n³), because 3n² − 100n + 6 < n³ when n > 3;
Example 1.6.6 3n² − 100n + 6 = Θ(n²), because both O and Ω apply;
Example 1.6.7 3n² − 100n + 6 ≠ Θ(n³), because only O applies;
Example 1.6.8 3n² − 100n + 6 ≠ Θ(n), because only Ω applies.
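A quick numeric sanity check of Examples 1.6.1 and 1.6.4 (an added illustration; the constants c = 3 and c = 2 come from the examples above):

def f(n):
    return 3 * n * n - 100 * n + 6

# Example 1.6.1: f(n) <= 3 * n^2 for all n >= 1 (upper bound, c = 3).
# Example 1.6.4: f(n) >= 2 * n^2 for all n > 100 (lower bound, c = 2).
assert all(f(n) <= 3 * n * n for n in range(1, 10_000))
assert all(f(n) >= 2 * n * n for n in range(101, 10_000))
print("bounds hold on the tested range")

Such a check is of course not a proof, since it only covers finitely many n, but it is a useful way to catch a wrongly chosen constant.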

1.7

Importance of efficient algorithms:

Algorithms are the foundation of computer science: they tell the computer how to accomplish a task in the most efficient manner. An algorithm is particularly important in optimizing a computer program; the efficiency of the algorithm usually determines the efficiency of the program as a whole. The analysis of algorithms is the theoretical study of computer program performance and resource usage (resources such as communication bandwidth and memory, whether RAM or disk), with a particular focus on performance and on the correctness of programs. When analyzing a program in terms of efficiency, we want to look at questions such as "How long does it take for the program to run?" and "Is there another approach that will get the answer more quickly?" Efficiency will always be less important than correctness: if you don't care whether a program works correctly, you can make it run very quickly indeed, but no one will think it's much of an achievement! On the other hand, a program that gives a correct answer after ten thousand years isn't very useful either, so efficiency is often an important issue. If you're in an engineering situation, writing code, what is more important than performance?

Quite a lot, as it turns out. Besides raw speed, good software is judged on many qualities:

- Correctness: A program is written to provide the functions specified in its functional requirement specification, and it should meet all the specifications stated by the customer. A program is functionally correct if it behaves according to its functional specification; a good design correctly implements all the functionalities identified in that specification.
- Maintainability: Software should be easily amenable to change, maintenance should be easy for any kind of user, and the software should be restorable to a specified condition within a specified period of time. For example, antivirus software may include the ability to periodically receive virus definition updates in order to maintain its effectiveness.
- Modularity: The software comprises well-defined, independent components, which leads to better maintainability. The components can be implemented and tested in isolation before being integrated to form the desired system; this also allows division of work in a software development project.
- Extensibility: It should be easy to increase the functions the software performs.
- Reliability: Reliability measures the level of risk and the likelihood of potential application failures. It also measures the defects injected due to modifications made to the software (its "stability", as termed by ISO).
- Usability/Learnability: The amount of effort or time required to learn how to use the software should be small; this makes the software user-friendly even for non-technical users.
- Integrity: Just as medicines can have side effects, software may have side effects, i.e. it may affect the working of another application. Quality software should not have such side effects.
- Efficiency: This characteristic relates to the way software uses the available resources. The software should make effective use of storage space and execute commands within the desired timing requirements.
- Security: With the increase in security threats nowadays, this factor is gaining importance. The software should not have ill effects on data or hardware, and proper measures should be taken to keep data secure from external threats.
- Safety: The software should not be hazardous to the environment or to life.
- Flexibility: Changes in the software should be easy to make.
- Scalability: It should be easy to upgrade the software for more work (or for a larger number of users).
- Testability: Testing the software should be easy.
- Interoperability: Interoperability is the ability of software to exchange information with other applications and make use of that information transparently.
- Reusability: If we are able to use the software code, with some modifications, for a different purpose, we call the software reusable.
- Portability: The ability of software to perform the same functions across all environments and platforms demonstrates its portability.

1.8

Program performance measurement:

1.9

Data structures and algorithms:

Algorithms + Data Structures = Programs. An algorithm defines a sequence of steps and decisions that can be employed to solve a problem. Data structures describe a collection of values, often with names and information about the hierarchical relationship of those values: a database is a data structure, a shopping list is a data structure, and a pair of graph coordinates is a data structure. Data structures exist as collections of atomic units which are domain specific: data structures in programs have atomic units such as integers and floating point numbers, while the data structure of a recipe has atomic units such as cups and teaspoons.

Data structures are distinct from algorithms. Algorithms are almost exclusively procedural in nature; even functional languages must eventually be decomposed into procedural steps for deriving results. Data structures have no such procedural component but instead have hierarchy, contents, and values. The important relationship between algorithms and data structures is that algorithms, with few exceptions, can accomplish very little without collecting input, storing intermediate results, and delivering output.

Changing a data structure in a slow program can work the same way an organ transplant does in a sick patient. Important classes of abstract data types, such as containers, dictionaries, and priority queues, have many different but functionally equivalent data structures that implement them. Changing the data structure does not change the correctness of the program, since we presumably replace a correct implementation with a different correct implementation. However, the new implementation of the data type realizes different tradeoffs in the time to execute various operations, so the total performance can improve dramatically. Like a patient in need of a transplant, only one part might need to be replaced in order to fix the problem. But it is better to be born with a good heart than to have to wait for a replacement: the maximum benefit from good data structures results from designing your program around them in the first place.
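As a small illustration of this point (a sketch I have added, with hypothetical class names), here are two functionally equivalent implementations of a priority queue in Python. Swapping one for the other does not change the program's output, only its running time: the unsorted-list version extracts the minimum in O(n), while the binary-heap version does it in O(log n).

import heapq

class ListPriorityQueue:
    # Unsorted-list implementation: insert is O(1), extract_min is O(n).
    def __init__(self):
        self.items = []

    def insert(self, x):
        self.items.append(x)

    def extract_min(self):
        i = self.items.index(min(self.items))
        return self.items.pop(i)

class HeapPriorityQueue:
    # Binary-heap implementation: insert and extract_min are both O(log n).
    def __init__(self):
        self.items = []

    def insert(self, x):
        heapq.heappush(self.items, x)

    def extract_min(self):
        return heapq.heappop(self.items)

# Both implementations produce the same (correct) output ...
for PQ in (ListPriorityQueue, HeapPriorityQueue):
    q = PQ()
    for x in (5, 1, 4, 2, 3):
        q.insert(x)
    print([q.extract_min() for _ in range(5)])   # [1, 2, 3, 4, 5]
# ... but their running times diverge sharply as the number of items grows.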


Chapter 2 LINEAR LISTS:


In a linear list, traversal is done in a linear fashion. In this chapter we will study the different linear lists used in computer science. A linear list is a list in which each element has a unique successor. Types of lists:

- General: data can be inserted and deleted anywhere in the list.
  - Unordered (random).
  - Ordered: data are arranged according to a key.
- Restricted: data can be inserted or deleted only at the ends of the list.
  - LIFO (stack).
  - FIFO (queue).

Four basic operations are associated with linear lists: insertion, deletion, retrieval, and traversal. A minimal sketch of the two restricted forms is given below.
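The following Python sketch (illustrative only, not part of the original notes) shows the two restricted linear lists in action: a Python list used as a LIFO stack, and a collections.deque used as a FIFO queue.

from collections import deque

# LIFO: the last element inserted is the first one removed.
stack = []
for x in (1, 2, 3):
    stack.append(x)                              # insertion at one end ...
print([stack.pop() for _ in range(3)])           # [3, 2, 1] ... deletion at the same end

# FIFO: the first element inserted is the first one removed.
queue = deque()
for x in (1, 2, 3):
    queue.append(x)                              # insertion at the rear ...
print([queue.popleft() for _ in range(3)])       # [1, 2, 3] ... deletion at the front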

2.1

Abstract data type (ADT)

In computer science, an abstract data type (ADT) is a mathematical model for a certain class of data structures that have similar behavior. A data type can be considered abstract when it is defined in terms of the operations on it and its implementation is hidden. An intuitive explanation:

- Define an interface (in mathematical terms, a signature).
- Define known constants.
- Define functions in terms of the constants and composition with other functions.

You don't know how the ADT computes, but you know what it computes. We are all used to dealing with the primitive data types as abstract data types. It is quite likely that you don't know the details of how values of type double are represented as sequences of bits; the details are, in fact, rather complicated. However, you know how to work with double values by adding them, multiplying them, truncating them to values of type int, inputting and outputting them, and so on. To work with values of type double, you only need to know how to use these operations and what they do; you don't need to know how they are implemented.
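As a small added illustration (the class and method names are mine, not from the notes), here is a stack ADT in Python: client code uses only the operations push, pop, and is_empty, while the list that actually stores the items is a hidden implementation detail that could be replaced, say by a linked list, without changing any client code.

class Stack:
    """Stack ADT: defined by its operations, not by its representation."""

    def __init__(self):
        self._items = []              # hidden implementation detail

    def push(self, x):
        self._items.append(x)

    def pop(self):
        if self.is_empty():
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        return not self._items

s = Stack()
s.push(1)
s.push(2)
print(s.pop(), s.pop())   # 2 1 -- client code never touches the underlying list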

2.2

Linked Lists

1. A linked list is an ordered collection of data in which each element contains the location of the next element.
2. Each element (item), called a node, contains two parts: data and a link. The link is a pointer that identifies the next element in the list. Nodes are called self-referential structures: each instance of the structure contains a pointer to another instance of the same structural type (a minimal sketch of such nodes follows this list).
3. Unlike arrays, data can be easily inserted into and deleted from a linked list. But searching becomes sequential, because the elements are no longer physically sequenced.
4. The linked list ADT consists of the data structure and all the operations that manipulate the data.
5. A head node contains metadata about the list, such as a count, a head pointer to the first node, and a rear pointer to the last node.
6. A data node contains the data (whose type depends entirely on the application) and a pointer to another data structure of its own type.
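A minimal self-referential node and a sequential traversal might look like the following Python sketch (illustrative only; the names Node, insert_front, and traverse are mine):

class Node:
    # Self-referential structure: each node points to another Node (or None).
    def __init__(self, data, link=None):
        self.data = data
        self.link = link

def insert_front(head, data):
    # O(1): no shifting of elements, unlike insertion into an array.
    return Node(data, head)

def traverse(head):
    # Sequential access: we must follow the links one by one.
    out = []
    node = head
    while node is not None:
        out.append(node.data)
        node = node.link
    return out

head = None
for x in (3, 2, 1):
    head = insert_front(head, x)
print(traverse(head))   # [1, 2, 3]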

