
The set of all possible input states is called the input space.

To find the reliability of software, we need to determine the output space produced by the software for a given input space.[1]
For reliability testing, data is gathered from various stages of development, such as the
design and operating stages. Testing is limited by constraints such as cost and time.
Statistical samples are obtained from the software products to test the reliability of the
software. Once sufficient data is gathered, statistical studies are performed. Time
constraints are handled by setting fixed dates or deadlines for the tests to be performed.
After this phase, the design of the software is frozen and the actual implementation phase
starts. Because of the restrictions on cost and time, the data is gathered carefully so that
each data point serves a purpose and achieves its expected precision.[2] To achieve
satisfactory results from reliability testing, one must take care of certain reliability
characteristics. For example, mean time to failure (MTTF)[3] is measured in terms of three
factors:
1. operating time,
2. number of on-off cycles, and
3. calendar time.
If the restriction is on operating time, or if the focus is on the first factor for improvement,
then one can apply compressed-time acceleration to reduce the testing time. If the focus is on
calendar time (i.e. if there are predefined deadlines), then intensified stress testing is used.[2]
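
As a minimal sketch of the compressed-time idea (our illustration, not from the source, assuming a simple linear acceleration model in which one hour of accelerated testing stands for a fixed number of hours of normal operation), MTTF can be estimated from accelerated test data as follows:

# Hypothetical sketch: linear compressed-time acceleration (our assumption).
def estimate_mttf(test_hours, failures, acceleration_factor=1.0):
    """Point estimate of MTTF in equivalent operating hours."""
    if failures == 0:
        raise ValueError("at least one failure is needed for a point estimate")
    equivalent_hours = test_hours * acceleration_factor  # test time scaled to field time
    return equivalent_hours / failures

# Example: 100 hours of testing at 10x compression with 4 observed failures.
print(estimate_mttf(test_hours=100, failures=4, acceleration_factor=10))  # 250.0

The names estimate_mttf and acceleration_factor are introduced here for illustration; real accelerated-testing models are usually more elaborate.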

Measurement
Software reliability is measured in terms of mean time between failures (MTBF).[4]
MTBF is the sum of the mean time to failure (MTTF) and the mean time to repair (MTTR). MTTF is
the average time between two consecutive failures and MTTR is the average time required to fix a
failure.[5] The reliability of good software is a number between 0 and 1. Reliability increases
as errors or bugs are removed from the program.[6]
For example, if MTBF = 1000 hours for a piece of software, then the software should work for
1000 hours of continuous operation.
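
A minimal sketch of these relationships (our illustration; the source gives the MTBF = 1000 hours example but not this code, and the 0-to-1 figure is computed here as steady-state availability, MTTF / MTBF, which is one conventional reading):

def mtbf(mttf_hours, mttr_hours):
    """Mean time between failures = time to next failure + time to repair."""
    return mttf_hours + mttr_hours

def availability(mttf_hours, mttr_hours):
    """Fraction of time the software is operational, between 0 and 1."""
    return mttf_hours / mtbf(mttf_hours, mttr_hours)

# Example: failures every 990 hours on average, 10 hours to repair each,
# giving MTBF = 1000 hours and availability 0.99.
print(mtbf(990, 10))          # 1000
print(availability(990, 10))  # 0.99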

Points for defining objectives
Some restrictions on defining objectives include:
1. The behaviour of the software should be defined for given conditions.
2. The objective should be feasible.
3. Time constraints should be provided.[7]

Design diversity
Design diversity techniques are specifically developed to tolerate design faults in
software arising from wrong specifications and incorrect coding. Two or more
variants of a piece of software, developed by different teams to a common specification,
are used. These variants are then used in a time- or space-redundant manner to
achieve fault tolerance. Popular techniques based on the design diversity
concept for fault tolerance in software are:
N-version programming: First introduced by Avizienis et al. [2] in 1977,
this concept is similar to the N-modular redundancy (NMR) approach in
hardware fault tolerance. In this technique, N (N >= 2) independently
generated, functionally equivalent programs called versions are executed in
parallel. A majority voting logic is used to compare the results produced by
all the versions and report one of the results, which is presumed correct. The
ability to tolerate faults here depends on how "independent" the different
versions of the program are. This technique has been applied to a number of
real-life systems such as railroad traffic control and flight control, even though
the overhead involved in generating different versions and implementing the
voting logic may be high.
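A hedged sketch of the voting step (our illustration; the versions below are hypothetical stand-ins for N independently developed implementations of one specification):

from collections import Counter

def n_version_execute(versions, x):
    """Run every version on the same input and return the majority result."""
    results = [version(x) for version in versions]
    winner, votes = Counter(results).most_common(1)[0]
    if votes <= len(versions) // 2:
        raise RuntimeError("no majority agreement among versions")
    return winner

# Three hypothetical versions of one specification (square an integer);
# version_c carries a seeded design fault that the majority vote masks.
version_a = lambda x: x * x
version_b = lambda x: x ** 2
version_c = lambda x: x * x + 1  # faulty variant

print(n_version_execute([version_a, version_b, version_c], 5))  # 25

Note that the vote can only mask a fault while the faulty versions remain a minority, which is exactly why the independence of the versions matters.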
Recovery block: Recovery blocks were first introduced by Horning et al.
[14]. This scheme is analogous to the cold-standby scheme for hardware
fault tolerance. Basically, in this approach, multiple functionally equivalent
variants of a piece of software are deployed in a time-redundant fashion.
An acceptance test is used to check the validity of the result produced by the
primary version. If the result from the primary version passes the acceptance
test, this result is reported and execution stops. If, on the other hand, the
result from the primary version fails the acceptance test, another version
from among the multiple versions is invoked and the result it produces is
checked by the acceptance test. The execution of the structure does not stop
until the acceptance test is passed by one of the multiple versions or until all
the versions have been exhausted. The significant differences of the recovery
block approach from N-version programming are that only one version is
executed at a time and that the acceptability of results is decided by a test rather
than by majority voting. The recovery block technique has been applied to
real-life systems and has been the basis for the distributed recovery block
structure for integrating hardware and software fault tolerance, and for the
extended distributed recovery block structure for command and control
applications. Modeling and analysis of recovery blocks are described by
Tomek et al. [28,29].
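A hedged sketch of the recovery block structure (our illustration; the variants and the acceptance test are hypothetical placeholders):

def recovery_block(variants, acceptance_test, x):
    """Try the primary, then alternates, until one passes the acceptance test."""
    for variant in variants:        # variants[0] is the primary version
        result = variant(x)
        if acceptance_test(x, result):
            return result           # first accepted result stops execution
    raise RuntimeError("all variants failed the acceptance test")

# Example: compute a square root; the primary variant is faulty, the
# alternate is correct, and the acceptance test checks that result**2 ~ x.
primary   = lambda x: x / 2      # faulty approximation
alternate = lambda x: x ** 0.5   # correct variant
accept    = lambda x, r: abs(r * r - x) < 1e-6

print(recovery_block([primary, alternate], accept, 9.0))  # 3.0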
N-self-checking programming: In N-self-checking programming, multiple
variants of a piece of software are used in a hot-standby fashion, as opposed to
the recovery block technique, in which the variants are used in cold-standby
mode. A self-checking software component is a variant with an acceptance
test, or a pair of variants with an associated comparison test [19]. Fault
tolerance is achieved by executing more than one self-checking component
in parallel. These components can also be used to tolerate one or more
hardware faults.
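A hedged sketch of the self-checking idea (our illustration; each component pairs a variant with its own acceptance test, and the parallel, hot-standby execution is simulated sequentially here for brevity):

def n_self_checking(components, x):
    """Each (variant, check) judges its own result; deliver the first accepted one."""
    for variant, check in components:   # in a real system these run in parallel
        result = variant(x)
        if check(x, result):
            return result
    raise RuntimeError("no self-checking component produced an acceptable result")

# Two hypothetical self-checking components for doubling a number;
# the first variant is faulty and is rejected by its own check.
bad  = (lambda x: x + 1, lambda x, r: r == 2 * x)
good = (lambda x: 2 * x, lambda x, r: r == 2 * x)

print(n_self_checking([bad, good], 5.0))  # 10.0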
The design diversity approach was developed mainly to deal with Bohrbugs. It
relies on the assumption of independence between the multiple variants of software.
However, as some studies have shown, this assumption may not always be valid.
Design diversity can also be used to treat Heisenbugs: since multiple versions
of the software are operating, it is not likely that all of them will experience the
same transient failure. One of the disadvantages of design diversity is the high cost
involved in developing multiple variants of software. However, as we shall see in
Section 3.3, there are other approaches that are more efficient and better suited
to dealing with Heisenbugs.
Fault tolerance

Complex safety-critical systems currently being designed and built are often
difficult multi-disciplinary undertakings. Part of these systems is often a
computer control system. In order to ensure that these systems perform as
specified, even under extreme conditions, it is important to have a fault-tolerant
computing system, in both hardware and software. Current methods for software
fault tolerance include recovery blocks, N-version programming, and
self-checking software. Through the rest of this discourse on software fault
tolerance, we will describe the nature of the software problem, discuss the
current methodologies for solving these problems, and conclude with some
thoughts on future research directions.
Introduction to fault tolerance
Software fault tolerance is the ability of software to detect and recover from a
fault that is happening or has already happened, in either the software or the
hardware of the system in which the software is running, in order to provide
service in accordance with the specification. Software fault tolerance is a
necessary component for constructing the next generation of highly available
and reliable computing systems, from embedded systems to data warehouse
systems. However, software fault tolerance is not a solution unto itself, and it
is important to realize that it is just one piece necessary to create the next
generation of systems.
