
UNIT 5

Reliability & Clock Synchronization:
Introduction, impact of faults, obtaining parameter values, reliability models for hardware redundancy, software error models, taking time into account, clock synchronization, nonfault-tolerant synchronization algorithms, fault tolerant synchronization in hardware.

Roll No: 15

Obtaining parameter values

Definition
The first step in developing a model is to decide what the input parameters should be.

A model should always be based on parameters that can either be accurately measured or estimated with confidence.

Real Time and Fault Tolerance

Obtaining Device-Failure Rates

There are two ways to obtain device-failure rates:
1. Collecting field data.
2. Life-cycle testing in the laboratory.

The former is more realistic, since it represents the failure rate when the devices are being used under their normal operating conditions.


Mathematical model classifications

Because devices fail rarely under normal conditions, laboratory life-cycle tests are accelerated by stressing the devices. The most common accelerant is temperature: the higher the temperature, the greater the failure rate. The dependence of the failure rate on temperature is given by the following (Arrhenius) equation:

R(T) = A e^{-E_a / (K T)}    (8.1)

where:
A : Represents a constant.
Ea : Represents the activation energy and depends largely on the logic family used.
K : Represents the Boltzmann constant (approximately 0.8617 x 10^-4 eV/K).
T : Represents the temperature (in kelvin).
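
As a rough illustration of equation (8.1), the sketch below assumes the Arrhenius form R(T) = A exp(-Ea / (K T)) and computes how much faster devices fail at a stress temperature than at the normal operating temperature. The function names and the numeric values (Ea = 0.7 eV, 328 K, 398 K) are assumptions chosen for illustration, not values from the slides.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def failure_rate(a: float, ea_ev: float, temp_k: float) -> float:
    """Failure rate R(T) = A * exp(-Ea / (K * T)), the assumed form of (8.1)."""
    return a * math.exp(-ea_ev / (BOLTZMANN_EV_PER_K * temp_k))


def acceleration_factor(ea_ev: float, t_use_k: float, t_stress_k: float) -> float:
    """Ratio R(T_stress) / R(T_use); the constant A cancels out of the ratio."""
    return failure_rate(1.0, ea_ev, t_stress_k) / failure_rate(1.0, ea_ev, t_use_k)


# Assumed values for illustration only.
af = acceleration_factor(ea_ev=0.7, t_use_k=328.0, t_stress_k=398.0)
print(f"acceleration factor: {af:.1f}")
```

Because only the ratio of two rates is needed, the constant A never has to be measured; the activation energy Ea is the parameter that must be known for the logic family under test.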


Measure Error Propagation Time

To measure how quickly an error can propagate, we use fault injection.

This is best done on a prototype. Special-purpose hardware is used to simulate a fault on a selected line.

The status of the related lines is monitored using logic analyzers to determine how far and how quickly the error propagates.

If a prototype is not available, a software simulation can be used instead.
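
A software simulation of fault injection could look roughly like the sketch below. It models the signal lines as a directed graph with a propagation delay on each connection, injects an error on one line, and reports the earliest time at which each downstream line is corrupted. The circuit, line names, and delays are hypothetical, and the sketch makes the worst-case assumption that an error always propagates (no masking).

```python
import heapq


def propagate_error(circuit, injected_line):
    """Return the earliest time (ns) at which an injected error reaches each line.

    circuit maps a line name to a list of (downstream_line, delay_ns) pairs.
    Worst-case assumption: an error on a line always corrupts its successors.
    """
    arrival = {injected_line: 0.0}
    queue = [(0.0, injected_line)]
    while queue:
        time_ns, line = heapq.heappop(queue)
        if time_ns > arrival.get(line, float("inf")):
            continue  # stale queue entry, a faster path was already found
        for successor, delay_ns in circuit.get(line, []):
            candidate = time_ns + delay_ns
            if candidate < arrival.get(successor, float("inf")):
                arrival[successor] = candidate
                heapq.heappush(queue, (candidate, successor))
    return arrival


# Hypothetical four-line circuit; names and delays are made up for illustration.
circuit = {
    "a": [("b", 2.0), ("c", 5.0)],
    "b": [("c", 1.0), ("d", 4.0)],
    "c": [("d", 1.0)],
}
arrival = propagate_error(circuit, "a")
print(arrival)
```

The set of keys in the result answers "how far" the error spreads, and the values answer "how quickly".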

Reliability models for hardware redundancy


Redundant system elements transition to the next higher state upon occurrence of any hardware failure. Hardware repairs transition the system element model to the next lower state. The system is a closed-form semi-Markov process that can be solved for the appropriate reliability measures using conventional methods.

Closed-form solutions for the reliability measures of interest for this type of model, under most common repair restrictions, types of standby, etc., are available in the technical literature referenced by this notebook.
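
To make the state-transition picture concrete, the sketch below solves a small model of this kind under a simplifying assumption: failure and repair times are exponentially distributed, so the model reduces to an ordinary birth-death Markov chain rather than a general semi-Markov process. State k means k elements have failed. The rates and the two-element example are assumptions for illustration.

```python
def steady_state(failure_rates, repair_rates):
    """Steady-state probabilities of a birth-death chain.

    failure_rates[k] is the rate from state k to state k + 1 (one more failure);
    repair_rates[k] is the rate from state k + 1 back to state k (one repair).
    """
    weights = [1.0]
    for lam, mu in zip(failure_rates, repair_rates):
        # Balance equation: p[k] * lam = p[k + 1] * mu
        weights.append(weights[-1] * lam / mu)
    total = sum(weights)
    return [w / total for w in weights]


# Assumed example: two active elements, each failing at 1e-3 per hour,
# and a single repair crew working at 0.1 repairs per hour.
lam, mu = 1e-3, 0.1
probs = steady_state([2 * lam, lam], [mu, mu])
availability = probs[0] + probs[1]  # up while at least one element works
print(f"availability: {availability:.6f}")
```

Different repair restrictions or standby types change only the two rate lists passed to `steady_state`.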

General Hardware Redundancy Model


The concept of three types of "coverage" is introduced as a part of the model.

Fault detection coverage (Cd) is the probability of detecting a fault given that a fault has occurred.

Fault isolation coverage (Ci) is the probability that a fault will be correctly isolated to the recoverable interface (the level at which redundancy is available) given that a fault has occurred and been detected.

Fault recovery coverage (Cr) is the probability that the redundant structure will recover system services given that a fault has occurred, been detected, and been correctly isolated.
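
Since each coverage is conditioned on the previous step having succeeded, the probability that a fault is detected, isolated, and recovered from is the product Cd x Ci x Cr. The short sketch below works through that arithmetic; the function name and the numeric values are assumptions for illustration.

```python
def overall_coverage(cd: float, ci: float, cr: float) -> float:
    """Probability that a fault is detected, then isolated, then recovered from."""
    for value in (cd, ci, cr):
        if not 0.0 <= value <= 1.0:
            raise ValueError("each coverage must be a probability in [0, 1]")
    return cd * ci * cr


# Assumed values for illustration only.
coverage = overall_coverage(cd=0.99, ci=0.95, cr=0.98)
uncovered = 1.0 - coverage  # fraction of faults the redundancy fails to handle
print(f"overall coverage: {coverage:.5f}, uncovered: {uncovered:.5f}")
```

Even with each individual coverage at 95% or better, the combined figure here drops to roughly 92%, which is why the three terms are tracked separately in the model.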

The model shown in the figure is a simplified model, since it does not separately consider the possible impact of transient failures.

Accounting for transient failures would amount to an uplift of the failure rate by some percentage.

The model also assumes that Cd is the same for both the primary element and the backup element. In practice, there may be different levels of fault detection coverage between primary and backup equipment due to a difference in test exposure intensity.

sombody@gmail.com

Hardware Reliability Model


Software error models


Definition
Software reliability is the probability that the software components operate without producing incorrect output for a specified time in a specified environment.
Software does not wear out, and it can continue to operate after a bad result.
Many software models contain:
Assumptions
Factors
A mathematical function

Types
Software reliability modeling can be divided into two categories:
1. Prediction modeling
2. Estimation modeling
Both modeling techniques are based on observing failure data and analyzing it with statistical inference.


Prediction Model:

A prediction model uses historical data. It analyzes previous data and observations, and it is usually made prior to the development and regular test phases. It can be applied from the concept phase onward and predicts reliability at some future time.

Estimation Model:

An estimation model uses current data from the current software development effort. It is not used in the conceptual development phases, and it can produce an estimate at any time.


Other Models in software:


Jelinski-Moranda
Goel-Okumoto (exponential)
Rayleigh
Titan Reliability Modeling Software (Predictive)
Weibull
Delayed s-Shaped
Inflexion s-Shaped
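
To show what the "mathematical function" part of such a model looks like, the sketch below evaluates two of the models listed above in their commonly cited forms: the Goel-Okumoto mean value function m(t) = a(1 - e^(-bt)) and the Jelinski-Moranda hazard rate phi(N - i + 1) before the i-th failure. The function names and parameter values are assumptions for illustration; in practice the parameters are fitted to observed failure data.

```python
import math


def goel_okumoto_expected_failures(a: float, b: float, t: float) -> float:
    """Expected cumulative failures by time t: m(t) = a * (1 - exp(-b * t)).

    a is the total expected number of failures, b the per-fault detection rate.
    """
    return a * (1.0 - math.exp(-b * t))


def jelinski_moranda_hazard(n_faults: int, phi: float, i: int) -> float:
    """Hazard rate before the i-th failure: phi * (N - i + 1).

    Each of the remaining faults contributes equally, and a fault is assumed
    to be removed perfectly after it causes a failure.
    """
    return phi * (n_faults - i + 1)


# Assumed parameters for illustration only.
expected = goel_okumoto_expected_failures(a=100.0, b=0.05, t=20.0)
first = jelinski_moranda_hazard(n_faults=100, phi=0.01, i=1)
eleventh = jelinski_moranda_hazard(n_faults=100, phi=0.01, i=11)
print(f"{expected:.2f} failures expected; hazard {first:.2f} -> {eleventh:.2f}")
```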


Taking time into account


Definition

Checkpoints are placed at intervals of T; the interval between two checkpoints constitutes a miniframe.

NMR (N-modular redundancy) is used before writing into a checkpoint.

The probability of writing faulty information into a checkpoint is very small and is ignored.

The probability that a checkpoint is corrupted is likewise ignored.

Three types of failures are considered here:
Permanent failure
Independent transient failure
Correlated transient failure


Clock Synchronization
Definition

Clock synchronization deals with understanding the temporal ordering of events produced by concurrent processes.

It is useful for synchronizing senders and receivers of messages, controlling joint activity, and serializing concurrent access to shared objects.

The goal is that multiple unrelated processes running on different machines should be in agreement with, and be able to make consistent decisions about, the ordering of events in a system.


Other aspects
For these kinds of events, we introduce the concept of a logical clock: one where the clock need not have any bearing on the time of day but rather is able to create event sequence numbers that can be used for comparing sets of events, such as messages.

Another aspect of clock synchronization deals with synchronizing time-of-day clocks among groups of machines. In this case, we want to ensure that all machines can report the same time, regardless of how imprecise their clocks may be or what the network latencies are between the machines.


Lamport's algorithm
Lamport's algorithm remedies the situation by forcing a resequencing of timestamps to ensure that the happens-before relationship is properly depicted for events related to sending and receiving messages. It works as follows:

Each process has a clock, which can be a simple counter that is incremented for each event.

The sending of a message is an event, and each message carries with it a timestamp taken from the sender's clock.

The arrival of a message at a process is also an event and is therefore timestamped by the receiving process.

The process clock is incremented prior to timestamping the event, as it would be for any other event. If the clock value is not greater than the timestamp in the received message, the system's clock is adjusted to (message's timestamp + 1). Otherwise nothing is done. The event is now timestamped.
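
The steps above can be sketched as a small class. This is a minimal illustration, not a full messaging layer: the class and method names are assumptions, and the receive rule also adjusts the clock when it equals the message timestamp, so that a receive event is always stamped strictly later than the matching send.

```python
class LamportClock:
    """Logical clock kept by one process as a simple event counter."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Record a local event and return its timestamp."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is an event; the returned timestamp travels with the message."""
        return self.tick()

    def receive(self, message_timestamp: int) -> int:
        """Receiving is an event: increment first, then jump ahead if still behind."""
        self.time += 1
        if self.time <= message_timestamp:
            self.time = message_timestamp + 1
        return self.time


p, q = LamportClock(), LamportClock()
for _ in range(3):
    p.tick()                      # three local events on process p
sent_at = p.send()                # p sends a message to q
received_at = q.receive(sent_at)  # q was behind, so its clock jumps forward
print(sent_at, received_at)
```

Here q had seen no events before the message arrived, so without the adjustment the receive would have been stamped earlier than the send.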

Impact of faults
Introduction

Loss of synchrony
Clocks are synchronized by exchanging timing messages and adjusting themselves appropriately.

There are two ways in which synchrony can be lost as a result of some clock becoming faulty:
1. When multiple nonoverlapping cliques are formed.
2. When clocks are driven too fast or too slow.

Synchrony can be lost when clocks are being run at their upper or lower frequency limits.


Fault tolerant synchronization in hardware.

To synchronize in hardware we can use phase-locked loops. The basic structure of a phase-locked loop is shown in the figure below.

[Figure: phase-locked loop with Comparator, Filter, and VCO blocks]

Figure explained
The objective is to align, as closely as possible, the output of the oscillator with an oscillatory input signal.

The comparator puts out a signal that is proportional to the difference between the phase of the input and that of the oscillator.

This is passed through a filter, and the resultant signal is used to modify the frequency of a voltage-controlled oscillator (VCO).
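
The comparator, filter, and oscillator loop can be imitated in a few lines of discrete-time simulation. The sketch below is a simplified numerical model, not a hardware design: phase is measured in cycles without wrap-around, the filter is assumed to be a proportional-plus-integral stage, and all names, gains, and frequencies are chosen for illustration.

```python
def simulate_pll(f_in, f_free, kp, ki, dt, steps):
    """Simulate a phase-locked loop and return (phase error, VCO frequency).

    f_in   : frequency of the input signal (Hz)
    f_free : free-running frequency of the VCO (Hz)
    kp, ki : proportional and integral gains of the assumed loop filter
    """
    phase_in = phase_out = integral = 0.0
    error = 0.0
    vco_freq = f_free
    for _ in range(steps):
        error = phase_in - phase_out                     # comparator output
        integral += error * dt                           # filter state
        vco_freq = f_free + kp * error + ki * integral   # filter drives the VCO
        phase_in += f_in * dt                            # input advances
        phase_out += vco_freq * dt                       # oscillator advances
    return error, vco_freq


# Assumed values: the VCO starts 0.5 Hz slower than the input it must track.
final_error, final_freq = simulate_pll(
    f_in=10.0, f_free=9.5, kp=20.0, ki=100.0, dt=1e-3, steps=5000
)
print(f"phase error: {final_error:.2e} cycles, VCO frequency: {final_freq:.6f} Hz")
```

With these gains the loop settles within a fraction of a second of simulated time, after which the oscillator runs at the input frequency with a negligible phase difference.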

Advantages

1. Very small clock skew can be attained.

2. No burden on the rest of the system.


THANK YOU
