
METAHEURISTICS APPLIED TO AUTOMATIC

SOFTWARE TESTING: A BRIEF OVERVIEW


TECHNICAL REPORT TR-EVOL-001
November 2007
DIRECTOR:
M.Sc. Carlos Jordan ESCUELA SUPERIOR POLITÉCNICA DEL LITORAL
ASSISTANTS:
Andrés Zúñiga
Duval Medina ESCUELA SUPERIOR POLITÉCNICA DEL LITORAL
COLLABORATORS:
Dr. Blas Galván UNIVERSIDAD DE LAS PALMAS DE GRAN CANARIA
Dr. Monique Sneck KATHOLIEKE UNIVERSITEIT LEUVEN
Facultad de Ingeniería en Electricidad y Computación
http://www.fiec.espol.edu.ec
ABSTRACT
This paper presents a brief overview of software testing techniques and of the state of
the art in applying metaheuristics to the automatic software test data generation
problem. We review approaches to the problem proposed over the last 15 years. We focus
on white-box (structural) testing, which can be represented as a search problem and
solved with metaheuristic techniques. A simple testing approach, which captures the main
ideas of the testing procedure that will be developed, is also presented.
KEYWORDS:
Automatic Software Testing, Test Data Generation, Evolutionary Computing, Metaheuristics.
1. INTRODUCTION.
Testing is a very important activity in the software development process. The principal objective of
software testing is to uncover errors and to assure that program behavior is correct according to specific
test requirements [1]. About 40% of total software development cost is spent on testing; this stage is very
expensive, not only economically but also in human resources, time and work intensity, and it contributes
nothing in terms of functionality. Ideally, software testing would guarantee the absence of faults in the
software, but in reality it only reveals the presence of faults and never guarantees their absence.
Testing activities support quality assurance by executing the software being studied to gather
information about the nature of that software. The software is executed with input data, or test cases,
and the output data is observed. The output data produced by the execution of the program with a
particular test case provides a specification of the actual program behavior.
To plan and execute tests, one must consider the software itself and the function it computes, the
inputs and how they can be combined, and the environment in which the software will eventually
operate. This difficult and time-consuming process requires technical sophistication and proper
planning. Testers must not only have good development skills, but also be knowledgeable in formal
languages, graph theory, and algorithms.
Because testing requires the execution of the software, it is often called dynamic analysis. A form of
verification that does not require execution of the software, such as model checking, is called static
analysis. As a form of verification, testing has several advantages over static-analysis techniques. One
advantage of testing is the relative ease with which many of the testing activities can be performed.
Test-case requirements can be generated from various forms of the software, such as its
implementation. Often, these test-case requirements can be generated automatically. Software can be
instrumented so that it reports information about the executions with the test cases. This information
can be used to measure how well the test cases satisfy the test-case requirements. Output from the
executions can be compared with expected results to identify those test cases on which the software
failed. A second advantage of testing is that the software being developed can be executed in its
expected environment. The results of these executions with the test cases provide confidence that the
software will perform as intended. A third advantage of testing is that much of the process can be
automated. With this automation, the test cases can be reused for testing as the software evolves.
To get a clearer view of some of software testing's inherent difficulties, software testing can be
approached in four phases:
Modeling the software environment
Selecting test scenarios
Running and evaluating test scenarios
Measuring test progress and performance
These four phases offer a basic structure for test development.
Most software testing techniques focus on the source code; some focus on the structure or
functionality of this code, and do not consider the environment of the software system.
A hierarchical decomposition of testing techniques is depicted in Fig. 1 [2].
Fig. 1. Hierarchical flowgraph of software testing techniques (taken from [2])
In this report, we focus on the Execution-Based Testing, especially on Program-based testing
considering the structure of the program under test. Another point of view of testing techniques is
presented by Harman [4]:
[Figure: tree of Analytical Quality Assurance Techniques. Node labels: Static Techniques, Informal
Techniques, Formal Techniques, Dynamic Techniques, Review, Inspection, Walkthrough, Static Analysis,
Model Checking, Mathematical Proof, Simulation, Test, Functional Testing, Structural Testing,
Statistical Testing, Mutation Testing, Evolutionary Testing, Slicing, Measurement, Debugging.]
Fig. 2. Classification of software quality assurance techniques (taken from [4])
A program-based testing approach relies upon the structure and attributes of a program's source code
PSC to create a test case. A specification-based testing technique simply uses statements about the
functional and/or non-functional requirements for PSC to create the desired test case T. A combined
testing technique creates a test case that is influenced by both program-based and specification-based
testing approaches. Moreover, the tests in T can be classified based upon whether they are white-box,
black-box, or grey-box test cases. Specification-based test cases are black-box tests that were created
without knowledge of PSC. White-box (or, alternatively, glass-box) test cases consider the entire
source code of P, while so-called grey-box tests only consider a portion of P's source code. Both
white-box and grey-box approaches to testing would be considered program-based or combined
techniques.
Each of these test techniques is accompanied by an adequate measure. Software Measurement is
defined as the process of empirical objective assignment of numbers to entities (software or process), in
order to characterize a specific attribute [4].
In the following sections, we describe some of these techniques.
2. CLASSIC SOFTWARE TESTING TECHNIQUES.
Software testing procedures can be developed using several testing strategies. We present two of
these approaches below.
2.1. FUNCTIONAL (BLACK BOX) TESTING.
Functional testing is also called specification-based or black-box testing. The idea is to use test cases
which are created by using software requirements without knowledge about the internal structure of the
software. Functional testing is based on program specifications and not on the internal structure of the
code.
Functional testing, or more precisely, functional test case design, attempts to answer the question
"What test cases shall I use to exercise my program?" considering only the specification of a program
and not its design or implementation structure.
The strength of black box testing is that tests can be derived early in the development cycle. This can
detect missing logic faults. The software is treated as a black box and its functionality is tested by
providing it with various combinations of input test data [3].
2.2. STRUCTURAL (WHITE BOX) TESTING.
White box testing, also called structural testing or glass box testing, is the process of deriving
tests from the internal structure of the software under test. There is usually a large number of different
paths through the program; e.g. 100 lines of C with two nested loops can have about $10^{14}$
different program paths. It is usually not possible to systematically test every possible program path.
Many forms of structural testing make reference to the control flow graph (CFG) of the program in
question. A control flow graph for a program $F$ is a directed graph $G = (N, E, s, e)$, where $N$ is a set of
nodes, $E$ is a set of edges (arcs), and $s$ and $e$ are respective unique entry and exit nodes to the graph.
Each node $n \in N$ is a statement in the program, with each edge $e = (n_i, n_j) \in E$ representing a
transfer of control from node $n_i$ to node $n_j$.
As an example of a control graph, we present the triangle classification program, depicted in Fig. 3.
The triangle classification program is a classical example used in several software testing papers. The
pseudo-code for this program is as follows:
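int triType(int a, int b, int c)
{
    int type = PLAIN;              /* node 1: default classification */
    if (a > b)                     /* node 1 */
        swap(a, b);                /* node 2 */
    if (a > c)                     /* node 3 */
        swap(a, c);                /* node 4 */
    if (b > c)                     /* node 5 */
        swap(b, c);                /* node 6: from here on a <= b <= c */
    if (a == b)                    /* node 7 */
    {
        if (b == c)                /* node 8 */
            type = EQUILATERAL;    /* node 9: all three sides equal */
        else
            type = ISOSCELES;      /* node 10: a == b but b != c */
    }
    else if (b == c)               /* node 11 */
        type = ISOSCELES;          /* node 12: b == c but a != b */
    return type;                   /* node 13 */
}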
Fig. 3. The triangle program: (a) code and (b) flow graph (taken from [8])
Assuming three positive integer lengths for the sides of a triangle, the program decides
whether the triangle is isosceles, scalene, equilateral, or invalid. Nodes corresponding to decision
statements (for example an if or a while statement) are referred to as branching nodes. In the triangle
example, branching nodes are nodes 1, 3, 5, 7, 8 and 11. Outgoing edges from these nodes are referred to as
branches. The condition determining whether a branch is taken is referred to as the branch predicate.
For the true branch from node 1, the branch predicate is a > b.
An input vector $I = (x_1, x_2, \ldots, x_k)$ is a vector of input variables to the program $F$. The domain of
an input variable $x_i$, $Dom(x_i)$, is the set of all values that $x_i$ can take on, for $i = 1, \ldots, k$. The domain
of the program $F$ is the cross product of the domains of the variables involved:
$$Dom(F) = Dom(x_1) \times Dom(x_2) \times \cdots \times Dom(x_k)$$
A program input $x$ is a single point in the $k$-dimensional input space, $x \in Dom(F)$. A path $P$
through a control flow graph is a sequence $P = \langle n_1, n_2, \ldots, n_m \rangle$ such that for all $i$, $1 \le i < m$,
$(n_i, n_{i+1}) \in E$. A path is said to be feasible if there exists a program input for which the path is
traversed; otherwise the path is said to be infeasible.
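The notions of input domain, path and feasibility can be illustrated with a small Python sketch. It re-expresses the control flow of the triangle listing (node numbers follow the listing), restricts each side to the assumed range 1..4, and checks feasibility by brute force. The function names and the restricted domain are illustrative assumptions, not part of the cited work.

```python
from itertools import product


def tri_path(a, b, c):
    """Return the node numbers visited by the triangle program for input (a, b, c)."""
    path = [1]                      # node 1: tests a > b
    if a > b:
        a, b = b, a
        path.append(2)
    path.append(3)                  # node 3: tests a > c
    if a > c:
        a, c = c, a
        path.append(4)
    path.append(5)                  # node 5: tests b > c
    if b > c:
        b, c = c, b
        path.append(6)
    path.append(7)                  # node 7: tests a == b
    if a == b:
        path.append(8)              # node 8: tests b == c
        path.append(9 if b == c else 10)
    else:
        path.append(11)             # node 11: tests b == c
        if b == c:
            path.append(12)
    path.append(13)                 # node 13: return
    return path


def is_feasible(target, domain):
    """Brute force: does some input in `domain` traverse exactly `target`?"""
    return any(tri_path(*x) == target for x in domain)


# Dom(F) = Dom(a) x Dom(b) x Dom(c), with each side restricted to 1..4 (assumption)
DOMAIN = list(product(range(1, 5), repeat=3))
```

An exhaustive search over a restricted domain can confirm that a path is feasible, but a negative answer only holds for that domain. Here the path 1, 2, 3, 5, 7, 8, 9, 13 is infeasible for every input, because taking node 2 forces a < b after the swap, so the test a == b at node 7 cannot succeed.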
As we said before, the test strategy must satisfy a testing criterion. Several criteria for white box
testing are presented by Sthamer [3]:
Statement Testing: Every statement in the software under test has to be executed at least once
during testing.
Branch testing: Branch coverage is a stronger criterion than statement coverage. It requires
every possible outcome of all decisions to be exercised at least once, i.e. each possible transfer
of control in the program be exercised. It includes statement coverage since every statement is
executed if every branch in a program is exercised once. However, some errors can only be
detected if the statements and branches are executed in a certain order, which leads to path
testing.
Path testing: In path testing every possible path in the software under test is executed; this
increases the probability of error detection and is a stronger method than both statement and
branch testing. A path through software can be described as the conjunction of predicates in
relation to the software's input variables. However, path testing is generally considered
impractical because a program with loop statements can have an infinite number of paths. A
path is said to be 'feasible', when there exists an input for which the path is traversed during
program execution, otherwise the path is unfeasible.
Coverage can be interpreted as a test criterion, and several coverage measures have been defined in testing
research. Because even a medium-sized program contains a great number of statements, branches and paths,
the coverage measure must be defined according to the testing strategy used in a specific
test. An example of a branch coverage measure is:
$$\text{Coverage} = \frac{\text{Number of covered branches}}{\text{Number of total branches}}$$
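As a minimal sketch of this measure, branches can be represented as (source node, target node) pairs. The function name and the set representation are assumptions made for illustration.

```python
def branch_coverage(covered, all_branches):
    """Fraction of the program's branches exercised by at least one test case."""
    all_branches = set(all_branches)
    if not all_branches:
        raise ValueError("program has no branches")
    # Ignore any recorded edge that is not a branch of the program
    return len(set(covered) & all_branches) / len(all_branches)


# Two decisions with two outcomes each; the tests exercised both outcomes of the first
example = branch_coverage({(1, 2), (1, 3)}, {(1, 2), (1, 3), (3, 4), (3, 5)})
```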

3. AUTOMATIC SOFTWARE TESTING.


As we said, the testing phase is the most expensive phase in the software development process. Finding
bugs in code is a very difficult task.
As an example, in a study by Tom McGibbon [7], the author compares a traditional
development approach with two formal methods, VDM and Z. For a program with 30,000 source lines
of code, the traditional methods would deliver software with 34 defects at an estimated life-cycle
cost of US $2.5 million. Using the Z method, the total cost would be reduced by US $2.2
million, but about 8 defects would still be left. Additional cost savings were achieved with VDM;
however, it resulted in 24 defects in the final product. In the cases presented in the report, a substantial
part of the cost is consumed by the testing phase.
Automating software testing procedures and increasing their efficiency can reduce total development
cost and time and significantly improve the quality of the final product. One of the most important
research topics in software testing automation is Automatic Test Data Generation.
Test data generation in software testing is the process of identifying a set of program input data which
satisfies a given testing criterion. Several research works have addressed this topic; in the 1980s, most of
the methods developed used symbolic evaluation to derive test data [10]. Symbolic evaluation
was based on solving the symbolic equations associated with the code predicates; these methods had
difficulty solving the resulting equations, which required extensive algebraic manipulation.
The first attempt to combine actual executions of the program with a search technique was
proposed by Miller and Spooner [17]; their method was originally designed for the generation of
floating-point test data. In this approach, the tester selects a path through the program, and each branch
statement is replaced with a path constraint; a function f is built from these constraints. The value of f
provides a real-valued measure of how close these constraints are to being satisfied [17].
Bogdan Korel [11] extended the ideas of Miller and Spooner and in 1990 proposed the application of
function minimization methods to solve the problem. Korel's approach is based on the definition of a
real-valued function associated with each predicate in the program; this function is positive when the
branch predicate is false and negative when the predicate is true [11]. The branch predicates have the
form $E_1 \; op \; E_2$, where $E_1$ and $E_2$ are arithmetic expressions and $op$ is one of the
relational operators $\{<, \le, >, \ge, =, \ne\}$. Korel defines the following branch functions associated with
each relational operator:
Branch predicate     Branch function          Relational operator
$E_1 > E_2$          $E_2 - E_1$              $>$
$E_1 \ge E_2$        $E_2 - E_1$              $\ge$
$E_1 < E_2$          $E_1 - E_2$              $<$
$E_1 \le E_2$        $E_1 - E_2$              $\le$
$E_1 = E_2$          $abs(E_1 - E_2)$         $=$
$E_1 \ne E_2$        $-abs(E_1 - E_2)$        $\ne$
Table 1. Branch functions defined by Korel's approach.
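The following Python sketch encodes branch functions of this kind so that each is negative (or zero for the non-strict and equality cases) exactly when its predicate holds, as the text describes. The dictionary names, the operator spellings and the zero threshold per operator are illustrative assumptions.

```python
import operator

# Branch function F(E1, E2) per relational operator: it shrinks as the predicate gets closer to true
BRANCH_FUNCTION = {
    ">":  lambda e1, e2: e2 - e1,
    ">=": lambda e1, e2: e2 - e1,
    "<":  lambda e1, e2: e1 - e2,
    "<=": lambda e1, e2: e1 - e2,
    "==": lambda e1, e2: abs(e1 - e2),
    "!=": lambda e1, e2: -abs(e1 - e2),
}

# How F is compared with zero to decide that the predicate holds (assumption)
THRESHOLD = {
    ">": operator.lt, ">=": operator.le,
    "<": operator.lt, "<=": operator.le,
    "==": operator.eq, "!=": operator.lt,
}


def branch_value(op, e1, e2):
    """Value of the branch function for predicate `e1 op e2`."""
    return BRANCH_FUNCTION[op](e1, e2)


def is_satisfied(op, e1, e2):
    """True when the branch function indicates that `e1 op e2` holds."""
    return THRESHOLD[op](branch_value(op, e1, e2), 0)
```

For example, `branch_value(">", 3, 10)` is 7: the predicate 3 > 10 is false, and the value indicates how far the operands are from satisfying it.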
The test data generation problem can be represented as a minimization problem as follows [11]. Let
$F_i(x)$ be the branch function of branch $(n_i, n_{i+1})$; the first subgoal is to find a value of $x$ which
guarantees the traversal of $P_i$ and causes $F_i(x)$ to be negative (or zero) at $n_i$, so that as a result
$(n_i, n_{i+1})$ will be successfully executed. The objective function $F_i(x)$ measures how close the
predicate is to being true. This can be written as an optimization problem:
$$x^{*} = \arg\min_{x} F_i(x) \quad \text{s.t. } P_i \text{ is traversed}$$
This problem can be solved using numerical optimization techniques for constrained problems. The
approach also handles arrays and pointers, which are difficult to handle with symbolic
representation. This is the first interpretation of the test data generation problem as an optimization
problem.
However, this approach has difficulties: it is based on direct search methods, it requires continuity of the
branch functions because it uses information about derivatives, and it is susceptible to being trapped in
local minima.
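The weakness of local search on branch functions can be seen in a generic sketch. The routine below is a plain unit-step descent over integer inputs, written for illustration only; it is not Korel's exact procedure, and both example objective functions are hypothetical.

```python
def local_search(f, start, max_iters=1000):
    """Unit-step descent: move one variable by +1 or -1 whenever that lowers f."""
    x = list(start)
    fx = f(x)
    for _ in range(max_iters):
        if fx <= 0:                 # branch function non-positive: predicate satisfied
            break
        improved = False
        for i in range(len(x)):
            for step in (1, -1):
                cand = list(x)
                cand[i] += step
                fc = f(cand)
                if fc < fx:         # accept the first improving neighbour
                    x, fx, improved = cand, fc, True
                    break
            if improved:
                break
        if not improved:            # no neighbour is better: local minimum or plateau
            break
    return x, fx


# Branch function for the hypothetical predicate x0 + x1 == 10: the search is guided to a solution
guided = local_search(lambda v: abs(v[0] + v[1] - 10), [0, 0])

# Flat objective (1 everywhere except at x0 == 100): the search has no guidance and stalls
stalled = local_search(lambda v: 0 if v[0] == 100 else 1, [0])
```

The second call stops immediately at its starting point with objective value 1, which is the kind of failure that motivates global search methods.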
The success of metaheuristic techniques in solving global optimization problems, especially problems
involving several variables and multiple local optima, attracted the attention of researchers in software
testing.
Since the 1990s, several research efforts have applied metaheuristic techniques to
software engineering problems. These applications are covered in the following sections.
4. METAHEURISTICS IN SOFTWARE ENGINEERING.
The term metaheuristics can be interpreted as a group of general-purpose algorithms based on iterative
methods that guide the search for the optimum using the concepts of exploration and exploitation of the
search space. They work on several points (a population of candidate solutions) and not only on a single
point in the search space.
Metaheuristics have several advantages over the classical direct search methods based on gradients: one
of the main advantages is that metaheuristics can work on problems with several variables and need no
knowledge of the form of the objective function or of its derivatives. All of the methods presented here are
used to carry out structural testing and use information about the program flow graph.
One of the first applications of metaheuristics was made by Pei et al. [10], who use a
binary-coded genetic algorithm to solve the test data generation problem in structural testing. Pei's
approach is based on pathwise test data generation, which consists of three steps: construction of the
program control flow graph, path selection, and test data generation with dynamic program execution.
The path selection is made manually.
They use the same branch functions defined by Korel and depicted in Table 1. As fitness function, they
define $F = F_1 + F_2 + \cdots + F_n$, where $F$ is the sum of the branch functions along the previously
selected path.
One year later, Roper et al. [12] presented another application of genetic algorithms to test data
generation. The report leaves several aspects of the method unexplained, and only
states that the fitness of a chromosome corresponds to the coverage it achieves of the program under
test [12], assigning high fitness to chromosomes that execute the true value of each branch.
In 1995, Sthamer [3] presented a doctoral thesis on test data generation using genetic algorithms,
comparing it with random testing. Sthamer considers three fitness functions based on the
predicates; one of them is defined by:
$$Fit = \frac{\alpha}{(|h - g| + \delta)^2}$$
where $h$ and $g$ are the expressions involved in the predicate, $\alpha$ is a constant, and $\delta$ is a small
quantity to prevent numeric overflow. This fitness definition assigns higher fitness to test sets located
near the boundary of the predicate's search space; this criterion helps to traverse all branches,
because the closer the test data are to the boundary, the more information is provided about the
correctness of the condition under test [3]. In each execution, the program concentrates on branches
not yet traversed and discards branches that have already been traversed, in order to reduce
computation time.
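A reciprocal fitness of this kind can be sketched in a few lines of Python. The absolute difference, the parameter names `alpha` and `delta`, and their default values are assumptions made for illustration rather than the settings used in [3].

```python
def reciprocal_fitness(h, g, alpha=1.0, delta=0.01):
    """Fitness grows as the predicate operands h and g approach each other."""
    # delta keeps the denominator away from zero when h == g
    return alpha / (abs(h - g) + delta) ** 2


# Test data closer to the predicate boundary h == g receive higher fitness
near = reciprocal_fitness(5, 6)
far = reciprocal_fitness(5, 9)
```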
The structure of Sthamer's testing tool is presented in Fig. 4. This thesis is one of the most complete works
on metaheuristic applications in test data generation because of its explanation of the method used
and its test results.
The metric used is the percentage of branches executed. The work concludes that
Gray code is better than binary code for representing the chromosomes, especially for numeric variables.
This is an important result because we use a real-valued representation for the chromosomes in our approach.
Another result is that the reciprocal fitness function (presented above) performs better than the other
fitness functions used. Sthamer uses test programs coded in Ada.
Pargas et al. [13] developed a tool named TGen, which solves the data generation problem using, in addition,
the control-dependence graph (defined in terms of the program's control flow graph and the
postdominance relation that exists among the nodes of the control flow graph). The predicates
executed by each test datum are recorded, and the highest fitness is assigned to solutions that cover the
greatest number of predicates. This is the first work that uses a while loop in the test program; the
while loop iterates until the selected target is satisfied or until the maximum number of attempts is reached.
Fig. 4. Sthamer's testing tool structure.
Joachim Wegener from DaimlerChrysler [14] gives the name Evolutionary Testing to the
application of metaheuristics to test data generation. Wegener defines the following characteristics of
evolutionary testing:
Test data generation uses metaheuristics.
The test objective has to be defined numerically and transformed into an optimization problem.
Fitness is based on the monitoring results of data or coverage criteria.
The search space is formed by the input domain.
In [15], Wegener presents some research applications of evolutionary testing at DaimlerChrysler to its
vehicle control software.
Mark Harman, currently Professor at King's College (UK), proposes the term Search Based Software
Engineering to refer to the application of metaheuristics to software engineering problems [16, 17, 18,
19]. He created a research network called SEMINAL (Software Engineering using Metaheuristic
INnovative ALgorithms); the aim of this network was to provide a new perspective on software
engineering problems, allowing them to be reformulated as search problems to which metaheuristics can be
applied.
Some of the problems analyzed by the SEMINAL network were:
Testing: structural testing, specification-based testing, testing to determine the worst-case
execution time.
Module clustering
Cost estimation
In recent years, several applications of metaheuristics to software testing have appeared. The research
group of Javier Tuya at the Universidad de Oviedo (Spain) has conducted research on applications
of metaheuristics in software testing. One of these is the work by Diaz et al. [5], which applies a
tabu search algorithm to structural testing. The goal of Diaz's work is to obtain the maximum
branch coverage for the program; he defines this metric as:
$$\text{Max}\% = 100 - \left(\frac{\text{Number of unfeasible branches}}{\text{Number of total branches}}\right) \times 100$$
A branch is unfeasible if there is no test that covers it. Several branch functions are defined, including
functions for the AND, OR and NOT operators. As in most of the approaches, they define a subgoal for every
predicate and take it as the objective node. For the classic triangle classifier program, they obtain
100% coverage, in contrast with random testing, which obtained 58%. They also conclude that the
effectiveness and efficiency of their algorithm do not depend strongly on the domain of the input variables.
Another work from Tuya's research group is presented in Blanco et al. [20]; in this paper, a scatter search
procedure is used to generate test cases. This approach was compared with the tabu search proposed by
Diaz: both tabu search and scatter search achieved 100% coverage, but scatter search required fewer
test cases and less time than tabu search. The branch functions were the same as those used in Diaz's work.
Another group working in this field is Enrique Alba's research group at the Universidad de
Málaga (Spain) [21, 22 and 23]. In these works the authors define several fitness functions, for both the
true and the false value of the branch predicates [21]:
Branch predicate   True fitness              False fitness
$a < b$            $a - b + 1$               $b - a$
$a \le b$          $a - b$                   $b - a + 1$
$a = b$            $(b - a)^2$               $(1 + (b - a)^2)^{-1}$
$a \ne b$          $(1 + (b - a)^2)^{-1}$    $(b - a)^2$
$a$                $(1 + a^2)^{-1}$          $a^2$
Table 2. Fitness functions for Alba's group approach used in [21].
This group proposes three approaches for testing: Evolution Strategies (ES), Particle Swarm
Optimization (PSO) and a real-coded Genetic Algorithm (GA); the results are summarized in
Chicano's thesis [23] and compared with random testing (RA). The fitness function used for this
comparison was:
$$\text{fitness}(x) = \frac{\pi}{2} - \arctan(\text{distance}(x)) + 0.1$$
where the function distance(x) is defined by the functions presented in Table 2 and used as the fitness
function in [21]. The test criterion used is branch coverage.
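The relation between the predicate distances of Table 2 and this fitness can be sketched as follows. The dictionary layout, the operator spellings and the function names are illustrative assumptions; the unary row of the table is omitted for brevity.

```python
import math

# (distance when the predicate should become true, distance when it should become false)
TABLE_2 = {
    "<":  (lambda a, b: a - b + 1,              lambda a, b: b - a),
    "<=": (lambda a, b: a - b,                  lambda a, b: b - a + 1),
    "==": (lambda a, b: (b - a) ** 2,           lambda a, b: 1 / (1 + (b - a) ** 2)),
    "!=": (lambda a, b: 1 / (1 + (b - a) ** 2), lambda a, b: (b - a) ** 2),
}


def distance(op, a, b, want_true=True):
    """Distance of operands (a, b) from making `a op b` true (or false)."""
    true_fn, false_fn = TABLE_2[op]
    return true_fn(a, b) if want_true else false_fn(a, b)


def fitness(dist):
    """Map a distance to a strictly positive fitness that decreases as the distance grows."""
    return math.pi / 2 - math.atan(dist) + 0.1


# A test datum closer to satisfying a == b receives a higher fitness
close = fitness(distance("==", 4, 5))
remote = fitness(distance("==", 4, 50))
```

Because the arctangent is bounded by $\pi/2$, the fitness stays above 0.1 for any finite non-negative distance, which keeps it usable with selection schemes that require positive values.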
The results of [23] show that ES and PSO have similar performance, with small differences for
some programs under test. ES and PSO perform better than GA and random testing.
The worst results were obtained for the select program (selection of the k-th element of an unordered list),
with PSO = 88.89%, ES = 83.33%, GA = 83.33% and Rand = 11.11% coverage. The best results were
obtained for the sa program (simulated annealing), with PSO = 100%, ES = 99.94%, GA = 96.72% and
Rand = 96.67% coverage. The PSO approach needs considerably fewer evaluations than the others to achieve
these levels of coverage. Chicano concludes that an approach with fewer evaluations is better than
another only if it also achieves higher coverage.
Chicano's thesis is also a very good reference because of its explanation of the methods and its appendix
describing the parameter adjustment procedure for the methods.
5. A SIMPLE TESTING APPROACH.
We now consider a simple method for testing, based on the one proposed by Harmen-Hinrich Sthamer in
his doctoral thesis [3], because it is a good starting point for the testing strategies, principally for the
If-Then statement. This is a first approach, which must be evaluated for acceptance or modification.
An If-then-else statement has the following form:
If Cond(n)
    S(n+1)
Else
    S(n+2)
End if
where Cond(n) represents the condition (node predicate) associated with node n, and S(n+k) represents
the statement that is executed, leading to node (n+k), depending on the result of condition Cond(n).
The flow graph of this structure has the following form:
[Figure: node N (If Cond(n)) with a "then" edge to node N+1 and an "Else" edge to node N+2.]
Fig. 5. Flowgraph for an If-Then-Else structure.
The flowgraph means that if Cond(n) is true, then S(n+1) is executed; otherwise, if Cond(n) is false,
S(n+2) is executed.
We define a test case $T(n)$ as a pair of variables $(x, y)$, and define an objective node $N$ for each
iteration, a condition $Cond(n): x \; op \; y$, where $op$ is the relational operator of the condition, and
statements $S(n+1): x + y$ and $S(n+2): x - y$. We initialize $T(n)$ with random numbers. The source code
must be instrumented as follows:

branch[0] = branch[0] + 1;
If x op y
    branch[1] = branch[1] + 1;
    x + y;
Else
    branch[2] = branch[2] + 1;
    x - y;
End if
Each time that condition Cond(n) is true, the counter branch[1] is incremented and x+y is executed,
and when Cond(n) is false, branch[2] is incremented and x-y is executed.
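A runnable Python sketch of this instrumentation is given below. The concrete condition `x <= y` is an assumption chosen only to make the example executable, and the counter list is passed explicitly instead of being a global.

```python
def instrumented(x, y, branch):
    """If-then-else under test with branch counters, mirroring the pseudo-code above."""
    branch[0] += 1                  # the decision node was reached
    if x <= y:                      # assumed condition Cond(n)
        branch[1] += 1              # true branch taken
        return x + y
    else:
        branch[2] += 1              # false branch taken
        return x - y


counters = [0, 0, 0]
results = [instrumented(1, 2, counters), instrumented(3, 1, counters)]
```

After these two calls the counters read `[2, 1, 1]`: the node was reached twice and each branch was taken once.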
A suite of test cases (called a population in evolutionary computing terms) is generated by means of a
random number generator. Each test case is evaluated in the program, and the counter branch[k] is
incremented if the branch k = (n, n+1) is traversed by the test case. In each iteration, an objective
node is defined; this objective node is the node where the test starts.
There are several possible statuses for the test result:
If a node n is traversed in only one direction of the condition, once all test cases were evaluated, we
say that node n is achieved.
If a node n is traversed in both directions of the condition, once all test cases were evaluated, we
say that node n is covered.
If a node n is not traversed, once all test cases were evaluated, we say that node n is not
evaluated.
Of course, the last status is obtained for nodes associated with branches not covered by a specific
test case.
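The three statuses can be derived from the true and false hit counts of a node. The sketch below also generates a small random suite; the predicate `x <= y`, the value range, the suite size and the seed are arbitrary assumptions.

```python
import random


def node_status(true_hits, false_hits):
    """Classify a decision node from how often each outcome was exercised."""
    if true_hits > 0 and false_hits > 0:
        return "covered"
    if true_hits > 0 or false_hits > 0:
        return "achieved"
    return "not evaluated"


def run_suite(predicate, suite):
    """Count how many test cases make the predicate true and how many make it false."""
    true_hits = sum(1 for x, y in suite if predicate(x, y))
    return true_hits, len(suite) - true_hits


rng = random.Random(2007)           # fixed seed so the run is repeatable
suite = [(rng.randint(-10, 10), rng.randint(-10, 10)) for _ in range(20)]
status = node_status(*run_suite(lambda x, y: x <= y, suite))
```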
At the end of the test procedure, the status structure containing the partial branch coverage results is
examined, and we determine the coverage for the test as the ratio of branches covered to
total program branches:
$$\text{Max}\% = \frac{\text{Number of branches covered}}{\text{Number of total branches}} \times 100$$
In a first stage, the test cases were generated as random numbers, and the testing was performed as
explained above.
A scheme of the approach can be depicted as follows:
[Figure: block diagram with four components: program under test, control unit, optimization module,
and suite of test cases (initial population).]
Fig. 6. Schema of the first approach proposed.
The structure of the program under test is extracted manually and is an input to the testing program. A
suite of test cases is randomly generated. The control unit is the main module of the testing procedure,
and it conducts the search procedure for each test case. Once a test case is presented, the control unit
evaluates the fitness function and sends this information to the optimization module, which contains the
evolutionary methods that search for the best test cases. Once the search procedure has finished, the
optimum test case is run against the program under test and the status is determined.
At the end of the test procedure, the control unit computes the metrics used to evaluate the performance
of the executed test.
The optimization module contains the evolutionary methods that we are developing to carry out the
search for optimal test data, considering a fitness function based on the distance between the
variables of a specific predicate in the program under test. The objective of applying such
metaheuristics is to find the boundary values which make a condition change its value from true to
false.
The optimization algorithm takes the information about the predicate distance defined for each condition,
as presented in the theses of Sthamer [3] or Chicano [23], and runs the optimization by maximizing the
fitness function, which is defined as a function of this distance. The details of the evolutionary methods
applied to the testing procedure will be presented in a separate report on these methods.
6. OUR FIRST APPROACH.
The first approach to the strategy that will be applied in our project is similar to the approach shown
above. We consider some ideas from the doctoral works of Chicano and Sthamer.
First, we manually extract the structure of the program under test to construct the
flowgraph. Next, we generate a random initial population of test cases $T(n)$. We consider each
node to be a partial objective to be satisfied by the generated data; that is, we divide the entire program
into partial objectives, and consider that each objective has an associated distance function that depends on
the relational operator, as well as a fitness or evaluation function defined in terms of this
distance. The distance and fitness functions are the ones proposed by Chicano and the University of
Málaga group, presented in Table 2.
For each partial objective (each node in the flowgraph), it is desirable that the objective be
covered by the generated suite of test data; when we work with randomly generated test data, we obtain
several objectives that are covered by the test data and some objectives that are only achieved by the same
test suite.
At this time, we have concluded the simulations with a random test data generator, and we are defining the
strategies for applying the genetic algorithm to the generation of test data using the distance
and fitness functions described above.
The first program under analysis is the triangle classifier, which classifies the type of a triangle
from the side lengths entered, or determines that they do not form a triangle. This is the principal
program tested in the cited references.
As preliminary results of applying the random generator, we achieve between 80% and 90%
coverage. These results are being detailed in the test report.
7. CONCLUSIONS.
The representation of software engineering problems as optimization problems is a research subject
that has been studied over the last 10 years, especially by UK researchers in the early 2000s
[3, 4, 12, 14, 15, 16, 17, 18, 19], and in the last three years by Spanish researchers [5, 20, 21, 22, 23].
Metaheuristic techniques have performed better than classical optimization techniques,
especially on problems that involve high-dimensional search spaces and non-continuous objective
functions, such as the objective functions associated with program predicates.
We have found several approaches to defining the fitness or evaluation function, but in general it is
constructed from the predicate relation; these functions have the particularity of being
positive-valued. The authors do not give any explanation for this, but we think it is due to the use of
very simple chromosome selection schemes such as roulette wheel.
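The link between positive fitness values and roulette wheel selection can be seen in a short sketch. The Python function below is a generic textbook version of the scheme, written for illustration; its name and signature are assumptions and do not come from any of the cited works.

```python
import random


def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness.

    The scheme interprets each fitness as the width of a slice of the
    wheel, so negative values have no meaning and are rejected.
    """
    if any(f < 0 for f in fitnesses):
        raise ValueError("roulette wheel selection needs non-negative fitness")
    total = sum(fitnesses)
    if total == 0:
        # Degenerate wheel: fall back to a uniform choice.
        return rng.choice(population)
    pick = rng.uniform(0, total)
    cumulative = 0.0
    for individual, f in zip(population, fitnesses):
        cumulative += f
        if pick < cumulative:
            return individual
    return population[-1]  # guard against floating-point rounding
```

Because the selection probability of an individual is its fitness divided by the sum of all fitness values, a fitness function that could take negative values would have to be shifted or rescaled before this scheme could be used.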
Two coverage approaches are presented in the literature: statement coverage and branch coverage.
Statement coverage, as used here, requires every atomic condition to take both the values true and
false (a requirement more commonly called condition coverage): if the statement takes both logical
values we say that the statement is covered, whereas if it takes only true or only false we say that
the statement is achieved. The branch coverage criterion requires that every branch in the program be
traversed by a specific test data set.
Another approach taken by the authors is to partition the global objective into several partial
objectives, each defined by a target statement that must take a given logical value; each partial
objective is treated as an optimization problem whose objective function is based on the predicate
condition.
The metric most commonly applied to measure the performance of test methods is percent coverage,
defined as the ratio of covered statements to total statements. Other metrics are also used, such as
the number of evaluations and a measure of how close the test data is to the predicate condition
boundaries. The metrics vary between researchers.
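The covered/achieved distinction and the percent coverage metric can be illustrated with a small Python sketch. The data layout (a dictionary from statement label to the set of logical values observed during testing) and the function names are assumptions made for this example only.

```python
def statement_status(observed):
    """Classify a statement from the set of logical values it took."""
    if observed == {True, False}:
        return "covered"      # took both logical values
    if observed:
        return "achieved"     # took only true or only false
    return "not reached"      # never evaluated by the test suite


def percent_coverage(observations):
    """Ratio of covered statements to total statements, in percent."""
    if not observations:
        return 0.0
    covered = sum(1 for values in observations.values()
                  if statement_status(values) == "covered")
    return 100.0 * covered / len(observations)


# Hypothetical record of a test run over four predicates.
example = {
    "a <= 0": {True, False},
    "a + b <= c": {True, False},
    "a == b": {False},
    "b == c": set(),
}
```

For the `example` record above, two of the four statements are covered, one is only achieved and one is never reached, so `percent_coverage(example)` evaluates to 50.0.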
Software testing with metaheuristics is a promising research field that is still in the research
phase, with several successful applications such as that at DaimlerChrysler, where it is used to test
vehicle control software. The latest research conducted in Spain directs its efforts toward the use
of Evolution Strategies over classical Genetic Algorithms.
As part of our project, we will propose the application of a technique to test data generation,
using a novel approach in evolutionary computing named Evolutionary Flexible Computing. This
approach will be compared with classical genetic algorithm and evolution strategy methods applied to
generic test programs.
Finally, we have presented a first approach to the testing strategy that we are developing. This
approach is subject to change, but it shows the principal ideas of the testing strategy under development.
8. REFERENCES.
[1]. Pressman, R. S., Software engineering: a practitioner's approach. McGraw-Hill, fifth edition,
2001. ISBN 0-07-365578-3.
[2]. Zhu, H., Hall, P. A. V., and May, J. H. R., Software unit test coverage and adequacy. ACM
Computing Surveys, 29(4):366-427, 1997.
[3]. Sthamer, H-H., The Automatic Generation of Software Test Data Using Genetic Algorithms,
Ph.D. Thesis, University of Glamorgan, United Kingdom, 1995.
[4]. Harman, M., Software Measurement and Testing, Class notes CS3SMT, King's College
London, England, Autumn Term, 2004.
[5]. Diaz, E., Tuya, J., Blanco, R., Dolado, J. J., A tabu search algorithm for structural software
testing, Computers and Operations Research, Vol. xx, No. xx, February 2007.
[6]. Kaner, C., Measurement Issues and Software Testing, in Testing Computer Software, 3rd
edition, XXXX
[7]. McGraw, G., Michael, C., Automatic Generation of Test-Cases for Software Testing, in
Proceedings of the 18th Annual Conference of the Cognitive Science Society, July 1996.
[8]. Edvardsson, J., A survey on Automatic Test Data Generation, in Proceedings of the Second
Conference on Computer Sciences and Engineering in Linköping, pp. 21-28, October 1999.
[9]. McGibbon, T., An Analysis of Two Formal Methods: VDM and Z, available at
http://www.dacs.dtic.mil, August 1997.
[10]. Pei, M., Goodman, E. D., Zongyi, G., Zhong, K., Automated Software Test Data Generation
Using A Genetic Algorithm, Technical Report GARAGE, Michigan State University, June
2004.
[11]. Korel, B., Automated Software Test Data Generation, IEEE Transactions on Software
Engineering, Vol. 16, No. 8, August 1990.
[12]. Roper, M., MacLean, I., Brooks, A., Miller, J., Wood, M., Genetic algorithms and the
automatic generation of test data, Research Report RR/95/195, Department of Computer and
Information Sciences, University of Strathclyde, UK, 1995.
[13]. Pargas, R. P., Harrold, M. J., Peck, R. R., Test-Data Generation Using Genetic Algorithms,
Journal of Software Testing, Verification and Reliability, 9(4), pp. 263-282, 1999.
[14]. Wegener, J., Overview on evolutionary testing, IEEE Seminal Workshop, Toronto, Canada,
May 2001.
[15]. Wegener, J., Overview on Evolutionary Testing at DaimlerChrysler, Seminal Meeting,
University of Glamorgan, UK, July 2000.
[16]. Harman, M., Search Based Software Engineering, Information and Software Technology,
Vol. 43, No. 14, pp. 833-839, December 2001.
[17]. McMinn, P., Search-based Software Test Data Generation: A survey, Software Testing
Verification and Reliability, Vol. 14(2), pp. 105-156, 2004.
[18]. Clarke, J., Harman, M., Hierons, R., Jones, B., Lumkin, M., Rees, K., Roper, M., Shepperd, M.,
The Application of Metaheuristic Search Techniques to Problems in Software Engineering,
Technical Report SEMINAL-TR-01-2000, Brunel University, United Kingdom, August 2000.
[19]. Clarke, J., Harman, M., Hierons, R., Jones, B., Lumkin, M., Rees, K., Roper, M., Shepperd, M.,
Reformulating software engineering as a search problem, IEEE Proceedings of Software
Engineering, Vol. 150(3), pp. 665-690, 2003.
[20]. Blanco, R., Diaz, E., Tuya, J., Generación automática de casos de prueba mediante búsqueda
dispersa, Revista Española de Innovación, Calidad e Ingeniería de Software, Vol. 2, No. 1,
2006.
[21]. Alba, E., Chicano, F., Software testing using evolutionary strategies, The 2nd Workshop on
Rapid Integration of Software Engineering Techniques (RISE 05), LNCS 3943, pp. 56-65,
Greece, September 2005.
[22]. Alba, E., Chicano, F., Janson, S., Testeo de software con dos técnicas metaheurísticas, in VX
Jornadas de Ingeniería de Software y Bases de Datos JISBD 2006, José Riquelme and Pere Botella
(eds.), Barcelona, 2006.
[23]. Chicano, J. F., Metaheurísticas e Ingeniería del Software, Doctoral Thesis, Departamento de
Lenguajes y Ciencias de la Computación, Universidad de Málaga, Spain, 2007.