H/W Faults
• Art of War: to win a battle, you need to
understand your enemy (and the
environment).
• For F/T: to combat computer faults, we
need to understand the nature of faults
chp04 1
Temporal Nature of H/W Faults (cont)
• Fault duration (temporal)
– Duration of causes and effects
– Categories
• Permanent faults
• Intermittent faults
• Transient faults
– Most common: intermittent and transient faults
Permanent Faults
• The cause remains indefinitely unless the h/w is repaired
– h/w permanently damaged due to:
• wear and tear
• design mistakes
• manufacturing defects, etc
• The capability (risk) of inducing an error is
always present
• Example: stuck-at fault in memory
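The stuck-at example can be sketched in a few lines of Python. This is only an illustrative model, not taken from the lecture: the class name `StuckAtMemory`, the 8-bit word width, and the choice of which bit is stuck are all assumptions. It shows that rewriting the word never clears the fault, and that the fault only produces an error when the stored value disagrees with the stuck bit.

```python
class StuckAtMemory:
    """Hypothetical 8-bit memory word with one bit permanently stuck at 0 or 1."""

    def __init__(self, stuck_bit, stuck_value):
        self.stuck_bit = stuck_bit      # position of the damaged bit (assumed)
        self.stuck_value = stuck_value  # value the damaged bit always reads as
        self._word = 0

    def write(self, value):
        # The write itself succeeds; the damage shows up on read.
        self._word = value & 0xFF

    def read(self):
        mask = 1 << self.stuck_bit
        if self.stuck_value:
            return self._word | mask        # stuck-at-1: bit is forced high
        return self._word & ~mask & 0xFF    # stuck-at-0: bit is forced low


cell = StuckAtMemory(stuck_bit=0, stuck_value=1)

cell.write(0b00000100)
first = cell.read()      # bit 0 reads back as 1, so the value is wrong

cell.write(0b00000100)
second = cell.read()     # rewriting does not help: the fault is permanent

cell.write(0b00000101)
latent = cell.read()     # stored bit 0 already equals 1, so no error is visible
```

In this sketch the fault is always present, but an error appears only for values whose bit 0 should be 0.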
Intermittent Faults
• The cause will not disappear
• environmental stress
• design deficiencies
• partially defective components due to aging
• h/w fatigue
• heat sensitivity
• voltage threshold
• The capability of inducing an error may not always
be present
• it depends on the fault's state: benign or active
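The benign/active distinction can be illustrated with a small Python sketch. It is an assumed model, not the lecture's: `HeatSensitiveGate`, the 85 °C threshold, and the temperature sequence are hypothetical. The underlying weakness (heat sensitivity) never goes away, but it only corrupts the output while the temperature is above the threshold.

```python
class HeatSensitiveGate:
    """Hypothetical AND gate whose output sticks at 0 while the chip is too hot."""

    def __init__(self, threshold_c=85.0):
        self.threshold_c = threshold_c  # assumed temperature limit
        self.temperature_c = 25.0

    @property
    def fault_active(self):
        # The defect is always there; it is active only above the threshold.
        return self.temperature_c > self.threshold_c

    def and_gate(self, a, b):
        if self.fault_active:
            return 0          # active state: output is forced to 0
        return a & b          # benign state: the gate behaves correctly


gate = HeatSensitiveGate()
results = []
for temp in (25.0, 90.0, 30.0, 95.0):
    gate.temperature_c = temp
    results.append((temp, gate.fault_active, gate.and_gate(1, 1)))
```

Each tuple in `results` records the temperature, whether the fault was active, and the gate output for inputs (1, 1). The fault recurs every time the condition returns, which is what separates it from a one-off disturbance.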
Transient Faults
• The cause exists for some period of time and
then disappears, even without repair
• temporary environmental, electrical, or mechanical
conditions
– power jitter
– ionisation due to cosmic rays or alpha particles
– electro-magnetic interference
– solar wind/flares
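A transient fault can be sketched as a one-off bit flip in a register. The `Register` class, the `upset` method, and the values used are hypothetical and chosen only for illustration. The point is that the hardware itself is undamaged, so simply rewriting the value restores correct operation.

```python
class Register:
    """Hypothetical 8-bit register used to illustrate a transient upset."""

    def __init__(self):
        self.value = 0

    def write(self, value):
        self.value = value & 0xFF

    def read(self):
        return self.value

    def upset(self, bit):
        # Models a single disturbance (e.g. a particle strike) flipping one bit.
        # The cause is gone as soon as this call returns.
        self.value ^= 1 << bit


reg = Register()
reg.write(0x3C)
reg.upset(bit=1)
corrupted = reg.read()   # the stored value now has bit 1 flipped

reg.write(0x3C)
recovered = reg.read()   # no repair was needed; a rewrite is enough
```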
Having understood the concept
and nature of computer faults, we
ask, “How can we implement a
fault-tolerance approach to
combat computer faults?” …next
slide.
Computing Techniques to
Implement the Fault-Tolerance
Approach
What are the techniques used to achieve
computer fault tolerance? Answer:
– redundancy technique (the core requirement)
– fault/error detection technique
– fault masking technique
– fault confinement/isolation technique
– fault/error diagnosis technique
– reconfiguration technique
– system recovery technique
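As a rough illustration of how redundancy, masking, and detection fit together, the sketch below uses triple modular redundancy with a majority voter. This scheme is chosen here as an example and is not named in the slides; the functions `healthy`, `faulty`, and `majority_vote` are hypothetical. Three replicas compute the same result, the voter masks a single wrong output, and the disagreeing replica is flagged as a suspect for later diagnosis.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of replicas."""
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise RuntimeError("no majority: the fault cannot be masked")
    return value


def healthy(x):
    return x * x


def faulty(x):
    # Hypothetical faulty replica: always off by one.
    return x * x + 1


replicas = [healthy, faulty, healthy]       # redundancy: three copies
outputs = [replica(7) for replica in replicas]

masked = majority_vote(outputs)             # masking: the wrong output is outvoted

# Detection/diagnosis: any replica that disagrees with the vote is a suspect.
suspects = [i for i, out in enumerate(outputs) if out != masked]
```

With two faulty replicas that agree with each other, the voter would return the wrong value, so this sketch only tolerates a single faulty replica.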
chp05_6_7 8
We first look at the core requirement,
Redundancy, in the next lecture.
What is redundancy?