Published in Proceedings of European Software Control and Metrics (ESCOM) Conference Berlin 26-28 May 1997

Using checklists to evaluate software product quality

- measurement concepts for qualitative determination of quality characteristics -

Teade Punter

Eindhoven University of Technology 513, 5600 MB, Eindhoven, The Netherlands


Discussions of software quality often focus on the process by which software is produced. The central
aim is then to improve the software process so that software is produced more effectively. Another way to deal
with software quality is to pay attention to the fit between the produced software and the users’ needs and
demands. This is often referred to as software product quality.
Although the quality of a software product is experienced through its use in practice, it is more effective
to discover at an earlier stage that the software does not meet the users’ expectations. Evaluating software
product quality provides a mechanism for detecting such problems earlier.

To evaluate software products, their quality -the expectations of the users- should be expressed as a
combination of quality -sub- characteristics, e.g. functionality and maintainability. It should be noted,
however, that software does not manifest quality characteristics directly. They can only be measured through
product objects, for instance by the number of statements or the presence of a user manual. Those product
objects contribute to quality characteristics. Hence, evaluating software product quality means measuring the
product objects that are related to the relevant quality characteristics. After this, the overall value of each
quality characteristic must be determined, so that it can be compared with the evaluation criteria.

Determining the value of product characteristics and comparing them requires measurement. However,
evaluation experience at KEMA Nederland B.V. shows that product objects which have to be
determined in a qualitative way, e.g. the presence of a manual, are also relevant to an evaluation.
To deal with qualitative issues, the concept of indicator should be applied to the measurement system of
This paper argues that the checklist technique is a suitable instrument for dealing with indicators, although
some refinements are needed. Concepts for these refinements are presented in this paper. They are based upon
experience gained at KEMA Nederland B.V. during the development of a service for evaluating software
product quality.

1. Evaluating according to ISO CD 14598

Evaluation is executed according to a specified procedure. Requirements for an evaluation are formulated by
ISO CD 14598 (1996). An evaluation should be objective, i.e. based upon observation, not opinion. It should
also be reproducible: evaluation of the same product to the same evaluation specification by different
evaluators should produce results that can be accepted as being identical.

ISO 14598 distinguishes four activities during the process of evaluation: analysis, specification, design and
execution. Below, these activities are translated into the context of KEMA’s evaluation service. Note that
analysis is not distinguished as a separate step: it is executed during specification. The evaluation process
can therefore be described as consisting of three activities:
• specification of evaluation - during specification a quality profile is determined, because different
software products have different relevant characteristics. The profile is based upon information about the
customer, the business and the software product itself. It presents the relevant quality characteristics
-ordered according to ISO 9126- and their associated levels of evaluation. The profile reflects the notion
of quality for a certain software product and makes quality tangible for both developers and users. A
description of how to determine a quality profile can be found in (Eisinga et al,
• design of evaluation - during design, acceptance criteria are formulated for the characteristics in the
profile, and the evaluation techniques are selected. The result of this second step is an evaluation plan
that is agreed between the customer and the evaluator.
• execution of evaluation - the evaluation is executed according to the evaluation plan. The evaluator uses
the selected techniques to assess the software product.

Evaluating according to ISO CD 14598 requires evaluation modules instead of techniques. An evaluation
module is a structured set of instructions and data [ISO 14598-6, 1996] ... that can be used to judge a
software product. It is more than a technique, because it justifies the approach and the underlying theory.
Using evaluation modules makes it possible to store experience about product objects gained from previous
evaluations. Another benefit is that evaluations can be executed efficiently, because a format is available.
Normally three types of evaluation modules are distinguished: static analysis, dynamic analysis by executing
the code, and inspection with checklists -derived from Bache and Bazzana, 1994. Reference tools -like SUMI-
are also an interesting type of evaluation module.

In figure 1 the three activities of the evaluation process are presented horizontally. Each of the activities has
its input and output information. Relations are depicted by arrows.

[Figure 1: diagram not reproduced. Inputs: business process, user/customer, software product, ISO 9126,
evaluation criteria, evaluation modules. Activities and their results: specification of evaluation (quality
profile), design of evaluation (evaluation plan), execution of evaluation (evaluation report). An arrow from
A to B means that A is input for B.]

Figure 1 Evaluation process

This paper focuses upon design of evaluation. Design deals with the process after having established the
relevant quality characteristics. The problem during this activity of evaluation is how to assess the
characteristics to determine their values.

2. Metrics and indicators

To determine their value, quality characteristics should be measured. Measuring requires an interpretation of
the quality characteristics: the meaning of the quality -sub- characteristics has to be translated into
measures. A measure is the number or category assigned to a quality characteristic to describe the software
product -derived from ISO 9126 (1991).
The conventional ordering of measures -direct, indirect, external and internal- is not sufficient. The
contribution of the product objects -which should be measured- to their quality characteristics can be
determined in a quantitative as well as a qualitative way.
Normally, measures are only applicable to quantitative determination. However, both kinds of determination
are relevant to evaluation. This requires a measurement concept which can deal with qualitative
determination. The definition of indicator in ISO 9126 (1991) -an indirect measure that can be used to
estimate or predict another measure- offers a point of departure, because it builds on an existing concept.
Expressing qualitative determination in measurement terms is necessary to compare the results of an
evaluation. Results of metrics and indicators should be comparable with each other in order to judge the
quality -sub- characteristic.
An example illustrates quantitative and qualitative determination of product objects during evaluation.
Statements of code and documentation are both product objects which can be elements of the subject of
evaluation. Related to these product objects are number of statements and readability of documentation,
which are both measures for the quality sub characteristic analysability.
Number of statements has a scale and a measurement procedure which can be applied to the code and results in
a number. Number of statements is recognized as a metric, which has a scale and method or procedure for
In contrast, readability of documentation depends upon facts like the presence of a table of contents or of
explanations in the text. Their availability is noted as yes or no, or perhaps as present to some extent.
Readability of documentation is an indicator. It is an -indirect- measure too, but it deals with objects of
the software product whose value has to be determined in a qualitative way. E.g. readability of
documentation cannot be measured by applying a simple specified scale and measurement procedure.
Figure 2 shows the relation between indicator and metric, their associated product objects and the quality
[Figure 2: diagram not reproduced. Columns: subject of evaluation (with product objects), measures, quality
sub characteristics. A product object such as a (user) manual is determined qualitatively, e.g. by readability
of documentation. A product object such as statements of code is determined quantitatively, e.g. by number of
statements. The measures follow from the interpretation of quality characteristics.]

Figure 2 Relation between product objects, measures and quality characteristics

3. Checklists as evaluation modules for dealing with items and indicators

In the previous section the concept of indicator was introduced. To handle the problem of a measurement
procedure for qualitative determination, the concept of item is necessary. An item is a question about a
product object, which is scored according to a counting rule.
• question. The sentence which describes what should be known about the indicator.
E.g. the sentence “Is a description of the data structure of the software product present?” is a
question of an item for the indicator readability of documentation.
• counting rule. The instruction by which the value of the question is obtained. A counting rule
contains the possible replies to the related question and their associated values.
An example of a counting rule is: “score a zero when the data structure does not exist or is preliminary
(only entities, no relationships explicated). Score a half when a draft data structure is present. Score a
one when an approved and reviewed data structure (by development team) is present”. This counting rule
belongs to the question mentioned above.

Question                                        Counting rule

Does the design documentation include           0:   table does not exist at all
a table of contents?                            1:   consistent table of contents

Is a complete description of the functions      0:   no description at all
of the software product available?              0.5: a description -not complete- exists
                                                1:   a complete description exists (approved and
                                                     reviewed by development team)

Is a data structure of the software product     0:   no or preliminary data structure (only entities,
available?                                           no relationships explicated)
                                                0.5: draft data structure available
                                                1:   approved and reviewed data structure (by
                                                     development team) available

Table 1 Some items related to indicator ‘readability of documentation’
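
As an illustration, the third item of table 1 could be represented as follows. This is only a minimal sketch: the class name Item, the method score and the reply labels are assumptions made for the example and are not part of the evaluation service described here.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """A checklist item: a question plus its counting rule (hypothetical structure)."""
    question: str
    # Counting rule: maps each permitted reply to its associated value.
    counting_rule: dict[str, float]

    def score(self, reply: str) -> float:
        """Return the value the counting rule assigns to the given reply."""
        if reply not in self.counting_rule:
            raise ValueError(f"reply {reply!r} is not covered by the counting rule")
        return self.counting_rule[reply]


# The reply labels below are shorthand chosen for this sketch.
data_structure_item = Item(
    question="Is a data structure of the software product available?",
    counting_rule={
        "none or preliminary": 0.0,
        "draft": 0.5,
        "approved and reviewed": 1.0,
    },
)

draft_score = data_structure_item.score("draft")
```

Rejecting replies that the counting rule does not list keeps the scoring reproducible: two evaluators cannot assign a value that the rule does not define.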

An indicator is defined as an aggregation of items, which are determined in a qualitative way and expressed
as a measure. Indicators can be recognized by their wording, which always contains the product object that
is measured -like manual for the indicator presence of manual. Indicators are always related to quality sub
This elaboration of the concept of indicator is derived from the ISO 9126 (1991) definition and its
application in the social sciences. In the social sciences an indicator is the result of translating a notion
as it is meant into a notion as it can be determined, e.g. measuring blood pressure and breathing to
establish the emotional tension of someone.
The outstanding instrument to deal with items -and indicators- is a checklist. E.g. the items given in
table 1 could be items for a checklist related to the quality sub characteristic ‘analysability’.

4. Requirements for designing checklists

Section 1 stated that evaluations must be objective and reproducible. When measuring software
product quality, three subjects should be addressed to provide objective and reproducible
1. determination of the indicators and measures. Suitable indicators and measures which determine a
quality characteristic are chosen. The extent to which measures contribute to indicators, and
indicators to a quality characteristic, is also established.
2. the procedure to measure indicators and direct measures. This consists of instructions that enable an
evaluator to produce reproducible measurements.
3. the judgement of measurement results. After the values of the indicators have been determined, the
degree of satisfaction about the characteristic of the product has to be established.

During every evaluation, attention should be paid to each of these subjects. Using evaluation modules -
like checklists- requires that these subjects are addressed in their design.
When designing checklists, the second subject is covered by counting rules for items and by a
rating procedure that summarizes the values of items into values for indicators and characteristics. An
example of a general procedure can be found in Punter (1997).
For design it is necessary to document issues concerning these subjects properly (ISO CD 14598, 1996).
Therefore the basic elements of which a checklist should consist are described (4.1). The underlying
theory of a checklist should also be described; this deals with subjects 1 and 3: determination and
judgement. To do so, a justification model should be used (4.2).
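
The third subject, the judgement of measurement results, can be illustrated with a small sketch. It assumes that acceptance criteria are expressed as minimum values on a scale from 0 to 1. That form, the function name judge and all numbers are assumptions for the example, since the text does not prescribe how criteria are written down.

```python
def judge(measured: dict[str, float], criteria: dict[str, float]) -> dict[str, str]:
    """Compare measured values with acceptance criteria, per quality sub characteristic."""
    report = {}
    for characteristic, minimum in criteria.items():
        if characteristic not in measured:
            report[characteristic] = "not evaluated"
        elif measured[characteristic] >= minimum:
            report[characteristic] = "satisfied"
        else:
            report[characteristic] = "not satisfied"
    return report


# Hypothetical acceptance criteria, as they might appear in an evaluation plan.
acceptance_criteria = {"analysability": 0.7, "changeability": 0.5, "testability": 0.6}
# Hypothetical values obtained during execution of the evaluation.
measured_values = {"analysability": 0.625, "changeability": 0.75}

report = judge(measured_values, acceptance_criteria)
```

Reporting "not evaluated" explicitly, instead of skipping a characteristic, makes it visible when the evaluation plan and the executed evaluation do not cover the same characteristics.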

4.1 Basic elements for a checklist

A checklist should consist of indicators which relate to one or more quality sub characteristics. Each
indicator is determined by items, each of which contains a question and a counting rule to determine the
value of the item.

Indicators have varying degrees of significance for judging quality characteristics. E.g. it could be stated
that the indicator modularity of documentation is more relevant for determining changeability than
simplicity of code. Indicators should therefore be weighted differently, according to how much
influence they exert. Indicators can be of low, medium or of high importance to a quality sub

Answering the question of an item according to its counting rule results in a value for that item. This
number eventually contributes to the value of a quality characteristic; therefore each item has a weight.
The weight is defined in the relationship between item and indicator.
The value of an item is assigned by applying the counting rule. The range of values in the
counting rule -e.g. zero, half and one- and the associated possible replies are therefore ordered in such a
way that a more positive answer to the question results in a higher value.
Items are a general concept, but each item is related to a specific software product or type of software
product. This means that evaluating different types of software products may require different items.
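
How weights could enter the rating procedure is sketched below. The text does not give a formula, so the weighted mean used here is an assumption, as are the numeric encoding of low, medium and high importance, the item weights and the second indicator value.

```python
def weighted_mean(pairs: list[tuple[float, float]]) -> float:
    """Weighted mean of (value, weight) pairs; the total weight must be positive."""
    total_weight = sum(weight for _, weight in pairs)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(value * weight for value, weight in pairs) / total_weight


# Assumed numeric encoding of the importance levels named in the text.
IMPORTANCE = {"low": 1, "medium": 2, "high": 3}

# Item values for the indicator 'readability of documentation',
# each paired with an assumed item weight.
readability_items = [(1.0, 1), (0.5, 2), (0.5, 1)]
readability = weighted_mean(readability_items)  # (1.0 + 1.0 + 0.5) / 4 = 0.625

# Indicator values for one quality sub characteristic; the second value
# (0.25) stands for some other, purely illustrative indicator.
indicator_values = [
    (readability, IMPORTANCE["high"]),
    (0.25, IMPORTANCE["medium"]),
]
sub_characteristic_value = weighted_mean(indicator_values)
```

A weighted mean keeps the result on the same scale as the item values, so that values of different indicators remain comparable. Other rating procedures are possible.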

Items can differ because a question is not applicable to every type of software product. E.g. one of
the items of the indicator ‘consistency between code and documentation’ -related to the quality sub
characteristic changeability- consists of the question: ‘is a cross-reference present in which relations
between modules and logical data stores are described?’ This question is applicable to a 3GL environment.
However, in a 4GL environment like MS/Access a more appropriate question would be: ‘is a cross-reference
present in which relations between forms and tables are described?’
Items can also differ because the counting rule of an item differs between types of software products.
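
One way to keep such variants together is to store the question per type of software product. The sketch below assumes that product types are identified by the labels "3GL" and "4GL"; both these labels and the function name question_for are hypothetical.

```python
# Question variants for one item of the indicator
# 'consistency between code and documentation', keyed by product type.
# The keys "3GL" and "4GL" are labels chosen for this sketch.
CROSS_REFERENCE_QUESTION = {
    "3GL": ("Is a cross-reference present in which relations between "
            "modules and logical data stores are described?"),
    "4GL": ("Is a cross-reference present in which relations between "
            "forms and tables are described?"),
}


def question_for(product_type: str) -> str:
    """Return the question variant for a product type, or fail loudly."""
    try:
        return CROSS_REFERENCE_QUESTION[product_type]
    except KeyError:
        raise LookupError(
            f"no question defined for product type {product_type!r}"
        ) from None


question_4gl = question_for("4GL")
```

Failing on an unknown product type, instead of falling back to a default question, prevents an evaluator from silently applying a question that does not fit the product.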

The basic elements of a checklist and their related concepts are depicted in the Entity-Relationship diagram
in figure 3.

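[Figure 3: Entity-Relationship diagram not reproduced. Entities: product object (part of the subject of
evaluation), item and indicator (parts of the checklist), quality sub characteristic. Relationships: an item
is applied to a product object; an item determines an indicator, with a weight of item; an indicator measures
a quality sub characteristic, with a weight of indicator.]

Figure 3 Basic elements and relationships of a checklist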
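
The entities and relationships of figure 3 could be written down as a small data model. This is a sketch under assumptions: all class and field names are hypothetical, the weights are invented, and the product objects assigned to the two example items are a guess based on the wording of table 1.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    question: str
    product_object: str  # the product object the item is applied to


@dataclass
class Indicator:
    name: str
    # Each entry pairs an item with the weight of the item-indicator relationship.
    weighted_items: list[tuple[Item, float]] = field(default_factory=list)


@dataclass
class Checklist:
    sub_characteristic: str
    # Each entry pairs an indicator with its weight for the sub characteristic.
    weighted_indicators: list[tuple[Indicator, float]] = field(default_factory=list)

    def product_objects(self) -> set[str]:
        """Product objects that must be available to apply this checklist."""
        return {
            item.product_object
            for indicator, _ in self.weighted_indicators
            for item, _ in indicator.weighted_items
        }


readability = Indicator(
    name="readability of documentation",
    weighted_items=[
        (Item("Does the design documentation include a table of contents?",
              "design documentation"), 1.0),
        (Item("Is a data structure of the software product available?",
              "data structure description"), 2.0),
    ],
)
checklist = Checklist("analysability", [(readability, 3.0)])
required_objects = checklist.product_objects()
```

Placing the weights on the relationships instead of on the entities follows figure 3: the same indicator can then carry a different weight for a different quality sub characteristic.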

4.2 Justification model of a checklist

Designing checklists according to the basic elements mentioned above cannot by itself guarantee objective
and reproducible evaluation. The choices made during design should also be documented. These choices
concern which indicators are selected for a quality sub characteristic and why, and how the measurement
results are interpreted into a judgement about the quality sub characteristic.
Choices made during the design of checklists are documented in justification models. These should be based
upon general theory about how to interpret quality sub characteristics. E.g. programming standards can
provide parts of such theory. However, a theory which covers the whole range of the interpretation of
a characteristic often does not exist. In that case the evaluators have to interpret the characteristic
themselves.

The basic elements as well as the justification model are requirements for checklists to provide objective
and reproducible evaluation.

5. Conclusion

Indicator and item are concepts for dealing with qualitative determination of software product quality.
Checklists are a technique to manage items during an evaluation. When checklists are used as evaluation
modules, they should be designed according to the basic elements presented in section 4.1. A checklist
should also contain a justification model in which the choices made during design are described.

Baarda and de Goede, Basics of methods and techniques (in Dutch), Alphen a/d Rijn, Stenfert Kroese, 1995.
Bache, R. and G. Bazzana, Software metrics for product assessment, London, McGraw-Hill, 1994.
Eisinga, P.J., J. Trienekens and M. van der Zwan, Determination of quality characteristics of software
products: concepts and case study experiences, in: Proceedings of the 1st World Software Quality Conference,
San Francisco.
Fenton, N., Software metrics - a rigorous approach, London, McGraw-Hill, 1991.
ISO 9126, Software Quality Characteristics and Metrics, 1991.
ISO CD 9126, Software Quality Characteristics and Metrics, 1996.
ISO CD 14598, Software Product Evaluation, 1996.
Kitchenham, B., S. Lawrence Pfleeger and N. Fenton, Towards a framework for Software Measurement Validation,
IEEE Transactions on Software Engineering, December 1995.
Punter, T., Requirement for evaluation checklists, in: Trienekens et al, Software Quality from a business
perspective, Deventer, Kluwer Bedrijfswetenschappen (to appear), 1997.