The PFD and SFF figures can be assessed for a specific system configuration from the FMEA (failure modes and effects
Figure 2 A 2oo2 dual system provides High Availability, but Low Integrity
Architecture of safety systems | ABB technical datasheet
3. What does all this mean in practice?
The 800xA HI (high integrity) SIL3 controller from ABB is an evolution of the existing SIL2 controller that has been successfully marketed for the last 3 years. The SIL3 certified controller has the same physical structure as the SIL2 version but with upgraded firmware and software. In common with the SIL2 unit it is an example of a safety system designed from its conception specifically to meet the detailed requirements of the IEC61508 standard.

Figure 3 800xA High Integrity Certificate

The 800xA high integrity controller can be configured in various simplex or dual redundant architectures, but all possible combinations of processors and I/O meet exactly the same safety integrity criteria and all meet the requirements of SIL3. How this is achieved in the product design will be discussed later, but this means the requirements of availability (MTBF) can be completely separated from the requirements of safety integrity defined within the standard. Duplicating the safety controller and/or I/O modules increases the availability of that part of the system depending on the needs of the application, but in all cases the safety integrity metrics remain the same.

If we look at the simplex SIL3 controller, it addresses the four basic requirements of the standard in a very straightforward way:

− − The PFD is a measure of the probability of the system failing in a dangerous (undetected) manner. The 800xA SIL2 and SIL3 controllers have essentially the same hardware. The basic electronics is designed for the highest levels of reliability. It uses large scale integration, field proven components and world class production and testing methods. Based on empirical figures, the calculated PFD for basic system elements is shown in the table below. These are right at the top end of the requirement band for SIL 3 systems. If we analyse the actual hardware failures from the field returns (there are some 3200 modules in the field, many of them for 2 years), this figure could be improved still further. This figure is achieved by the fundamental design rather than by duplication and voting. (PFH in Table 2 is the probability of dangerous failure per hour.)
− − The systematic safety integrity of the 800xA HI is mainly achieved by an exhaustive design, development and testing program by the system designer, with all processes and design milestones carried out within a rigorous TUV certified functional safety management system (FSMS) and with every stage of the hardware and software development process scrutinised and approved by an independent certifying body such as TUV. One may argue that no matter how good the processes are, design or systematic failure cannot be 100% eliminated. This is where the “embedded diversity” of the 800xA HI (which is discussed later in the text) cuts in and provides an active continuous check for operational software faults.
− − The SFF figure and the HFT concept are the interesting parameters, and it is here that 800xA HI challenges the conventional architecture based analysis.
− − The fundamental design ensures that all detected faults are reported and either leave the controller operating in a degraded (but still safe) mode or initiate a safe action (shut down).

Variant             Including                                 SFF %    λdu        PFD (SIL2 / SIL3)    PFH (SIL2 / SIL3)
PM865 Single PU     Processor Module 1 x PM865                99.55%   5.74E-09   1.21E-5 / 8.04E-6    1.72E-10 / 1.15E-10
                    Termination Plate 1 x TP830
                    Supervisory Module 1 x SM810/SM811
                    Termination Plate 1 x TP855
PM865 Redundant PU  Processor Module 2 x PM865                99.55%   5.74E-09   1.21E-5 / 8.04E-6    1.72E-10 / 1.15E-10
                    Termination Plate 2 x TP830
                    CEX-Bus Interconnection Unit 2 x BC810
                    Termination Plate 2 x TP857
                    Supervisory Module 2 x SM810/SM811
                    Termination Plate 2 x TP855
I/O                 Digital Input Module 1ch DI880            99.98%   1.36E-10   9.52E-6              1.36E-10
                    Module Termination Unit (MTU) TU842/843

Table 2 shows the SFF, PFD and PFH for the 800xA HI components
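
As a rough illustration of how the PFD figures in Table 2 relate to the SIL bands, the sketch below maps an average PFD onto the low-demand bands of IEC 61508. The function name and the idea of reporting the numerically best band are assumptions made for this example; a PFD for a single element is only one contribution to the PFD of a complete safety function, and meeting a numeric band does not by itself confer a SIL.

```python
# Low-demand PFDavg bands from IEC 61508: (lower bound inclusive, upper bound exclusive).
SIL_BANDS_PFD = {
    4: (1e-5, 1e-4),
    3: (1e-4, 1e-3),
    2: (1e-3, 1e-2),
    1: (1e-2, 1e-1),
}


def sil_band_for_pfd(pfd_avg):
    """Return the highest SIL whose numeric PFD target is met, or 0 if none is.

    Hypothetical helper: a value below the SIL4 lower bound is still
    reported as 4, since it is better than the SIL4 target requires.
    """
    if pfd_avg < 0:
        raise ValueError("PFD must be non-negative")
    for sil in (4, 3, 2, 1):
        upper_bound = SIL_BANDS_PFD[sil][1]
        if pfd_avg < upper_bound:
            return sil
    return 0


# PFD values copied from Table 2 (SIL3 controller and I/O rows).
for label, pfd in [("PM865 SIL3", 8.04e-6), ("DI880 I/O", 9.52e-6)]:
    print(label, "meets the numeric PFD target of SIL", sil_band_for_pfd(pfd))
```

Taken on PFD alone, both elements fall inside or below the SIL4 band. The certified SIL also depends on the SFF, the HFT and the systematic integrity.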
4. A high SFF indicates a high integrity design
The safe failure fraction of a subsystem is calculated as:

SFF = (∑λs + ∑λdd) / (∑λs + ∑λd)

Where
∑λs is the total probability of safe failures;
∑λd is the total probability of dangerous failures; and
∑λdd is the total probability of dangerous failures detected by the diagnostic tests.

The three types of failure are clearly defined in the standard as follows:

Safe failure

− − The subsystem failed safe if it carries out the safety function without a demand from the process.

Dangerous failure

− − The subsystem failed to danger if it cannot carry out its safety function on demand.

Detected failure

− − A failure is detected if built-in diagnostics reveal the failure. For 800xA High Integrity, failures are revealed in a time between 50 ms and 1 s.

Also, failures can be revealed in three ways:

− − Through normal operation (usually resulting in a spurious trip)
− − Through periodic proof testing (could be as infrequent as every 8 years for 800xA HI)
− − Through built-in diagnostics.

The unique design of the 800xA HI diagnostics utilises a high degree of conventional active diagnostics (built-in testing) plus active discrepancy checking between the two diverse execution paths, giving the simplex controller an SFF of close to 100% (99.8% is the figure quoted). Also, by virtue of the diverse structure, the SIL3 product has an HFT of 1 for the simplex controller and the simplex I/O. From the table above it can be seen that 800xA HI effectively meets the PFD and SFF requirements for SIL4, despite only being certified to meet SIL3. The reason this has been achieved is that the SIL2 controller is classified as having an HFT of 0, but still meets the SIL3 requirements for PFD. However, the SIL3 controller, because of its embedded diverse technology, has an HFT of 1, which improves its systematic integrity as well as providing a level of fault tolerance.

It is often argued that increasing the SFF merely moves dangerous undetected failure modes into the detected category, which in turn means an increase in spurious trips.

For confidence in our safety system, the one thing we do not want is undetected dangerous failure modes. They increase the potential for long term undetected failures, and even in a conventional dual or triple system an undetected dangerous failure at minimum degrades the system by rendering one path inoperable on demand, and at worst, if the fault is common, could leave the whole system in a dangerous state. This is especially true for TMR, where a single undisclosed failure renders the 2 out of 3 voting algorithm, on which its integrity depends, unable to work.

The 800xA HI effectively achieves 100% diagnostic cover as there are no known dangerous failure modes, and can hence achieve SIL3 compliance without calling on the HFT card. HFT was included in the standard largely to enable legacy systems that relied heavily on redundancy and voting systems to meet the SIL level requirements.

However, the definition of HFT in the standard is very specific and it applies only to undetected faults. It is definitely not an indication that a product will continue to function after a fault has been detected, which is what most users expect from a fault tolerant system.

What about spurious trips? If a safety system has 100% diagnostic cover but is prone to component or software failure, then it will produce an unacceptable level of spurious trips.

In addition to the favourable PFD figure and the high SFF, the simplex 800xA HI controller and I/O have an inherently high level of reliability by virtue of the high levels of integration and low stress and dissipation electronics. This gives the simplex controller an MTBF approaching 20 years. (It is in the same region as the latest generation TMR system.)

The embedded diverse structure of the simplex controller further enhances the statistical MTBF (mean time between failures) by enabling the SIL3 controller to continue to function in a degraded (but certified) manner for a limited period after an I/O channel fault has been detected.

However, if system availability is of paramount importance, which is the case in many oil and gas and petrochemical applications, the 800xA HI may be configured in various dual redundant modes, as stated above. The important thing is that the simplex system and the dual redundant systems have exactly the same PFD, exactly the same SFF and both have an HFT of 1. They have exactly the same safety integrity: the only thing to change is the MTBF (availability), which can increase by more than 400 years over a similar simplex system.

Reliability, safety integrity and redundancy are terms that were very much confused in earlier generations of system. They are now much better defined, and separating reliability from safety integrity, and fault tolerance from HFT, should make comparisons of safety system performance much easier under the new standards.

As an aside, it is ironic that a triple system that claims high levels of diagnostic cover gains nothing by way of integrity from the triple architecture. The 2oo3 voter does not improve the safety integrity. Because the channels are all the same technology, it improves neither the systematic assessment nor the common mode issues. And because of the law of diminishing returns, it does not necessarily improve the availability over a similar dual redundant architecture.

5. Voting and diagnostics
Voting is the most common method used to detect discrepancies in processing results of redundant channels in multi-channel systems. Table 1, which is directly taken from the standard, indicates that voted results can be considered a mechanism to increase diagnostic coverage. However, the authors of the IEC61508 standards recognised that there are inherent weaknesses with voting systems when attempting to achieve high levels of integrity. If the voting mechanism becomes unavailable due to an undisclosed failure developing in one channel, the system’s integrity is compromised, and what is worse, no one knows. If a fault is detected from the vote, the system enters a degraded mode and may have its safety integrity capabilities reduced. More importantly, if the failure is not disclosed, the degraded state is not necessarily discovered until a demand on the system is made – when it may be too late.

Also, simple voting systems often suffer from single points of potential failure in the voting system itself.

Availability can only be effectively increased if the redundant system can continue to operate at the specified SIL in both a fully redundant and a degraded state. As stated, 800xA HI has exactly the same safety integrity in both simplex and dual redundant configurations.

The standard considers three types of system failure as follows:

− − Random hardware failures
− − Systematic - design, implementation or operational failures
− − Common mode failures

The probability of random hardware failures occurring can be assessed from the component reliability data provided by the manufacturer, and such failures are likely to affect only a single channel at a time in a multi-channel redundant system. However, systematic and common mode faults could affect all channels of a multi-channel voting system in exactly the same way. This could result in a complete failure of the system.

Consequently, voting systems with identical channels should be avoided if the effects of systematic and common mode issues are to be reduced. Of course, the majority of dual, triple and quad systems rely on voting between identical channels.

6. Diversity better than quantity.
Diverse voting systems have been around a long time. The safety systems used for nuclear power utilise voting between different systems, often utilising different technologies (relay, pneumatics, electronics etc), supplied by different companies and installed and commissioned by different teams. The probability of systematic or common mode failures affecting the integrity of the overall system is therefore greatly reduced.

The simplex 800xA HI controller and I/O units have embedded diverse parallel processing paths where active discrepancy checking between the paths complements the built-in active diagnostics.

Embedded hardware diversity in the controller hardware is achieved by the use of different processor boards for the controller (PM865) and supervision module (SM811). Diversity in software is achieved by the use of different operating system renditions, compilers, coding guidelines and different programmatic implementations between controller and supervision module. As a further measure against systematic and common mode problems, the controller and supervision module were developed and tested by different teams operating from two different countries, by people with different backgrounds and experiences. The I/O modules also use two signal paths with embedded diverse technology, one using FPGA technology and the other using MCPU.

800xA HI does not conform to the conventional 1oo2D architecture and cannot be described in such terms. If it is considered necessary to give it an architectural label, the safety architecture should be described as: – yes you guessed. “embedded, diverse technology”. This diverse technology is employed in a dual format when implemented in a single configuration and a quad format in a redundant configuration.

Figure 4 800xA High Integrity in Dual format with Single I/Os

The physical structure of 800xA HI is unique in enabling the I/O and controllers to be offered in redundant mode independently of each other, thus increasing the availability of the I/O and/or the controller independently. This means that for critical processes that can be maintained with the total loss of (say) one I/O channel (two faults), only the processors need duplication. In most processes only a small proportion of the I/O is so critical that it requires 100% availability; consequently, mixed redundant and non-redundant I/O systems can be configured with consequent cost saving.

7. Active voting or main – standby
Having separated the requirements for integrity from those of availability, it is much easier now to measure the effectiveness of the various designs.

Silicon electronics are inherently extremely reliable once the infant mortality stage has passed. Component selection and production burn-in testing ensure that the 800xA HI, even in simplex mode, achieves the highest levels of reliability. Empirical assessments (used in the formulation of the achieved SIL) fall right at the top of the SIL3 band, and field returns based on over 600 safety systems delivered, with over 50,000 I/O in the field in full operation, indicate that the actual figures achieved are an order of magnitude better than these.

With these levels of reliability achieved with the simplex product, one might wonder why a dual redundant offering is necessary at all. There are, however, many highly critical or unmanned processes where the cost of just one spurious trip in a 20 year period is far greater than the cost of adding a redundant system.

to cost you millions of dollars lost revenue in unscheduled down time, it is a small price to pay for peace of mind.

8. Forget the architecture - look at the certified data set
Whether the system is dual, triple, quad, 1oo2, 2oo3 or 2oo4 is no longer important. In fact, unless we know exactly what the architecture is designed to achieve, these terms can be at the least confusing, and in the last generation of systems the definitions of “integrity” and “availability” were definitely confused.

The important data that defines the integrity and availability of your safety system will be contained in the SIL Achievement report you should expect from your certified system integrator. This report will give you the following information:

−− Calculated PFD for your system configuration, supported by certified reliability data and calculations.
−− The safe failure fraction figure for your system, again supported by certified diagnostic cover data and calculations.
−− Certificates confirming the systematic integrity of the basic system, covering the development of all safety related subsystems and elements. See attached for 800xA HI.
−− Certificates covering the functional safety management system (FSMS) used by the system integrator, confirming the competence of the project team and the processes used.
−− A detailed SIL achievement report including the results of the functional safety assessment (FSA) carried out during the project and the audit reports carried out by the team.
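
The safe failure fraction reported for a system follows from the SFF formula in section 4, and together with the HFT it limits the SIL that can be claimed. The sketch below is a minimal illustration of that calculation, not ABB's method. The failure rates are hypothetical values chosen to give the 99.8% figure quoted earlier, the function names are invented for this example, and the limit table is the architectural-constraint table for complex (Type B) subsystems as commonly cited from IEC 61508-2, which should be checked against the standard itself.

```python
# Maximum claimable SIL for a Type B subsystem, indexed by HFT 0, 1, 2.
# 0 means "not allowed". Values as commonly cited from IEC 61508-2;
# treat them as an assumption and verify against the standard.
TYPE_B_LIMITS = [
    (0.99, (3, 4, 4)),  # SFF >= 99%
    (0.90, (2, 3, 4)),  # 90% <= SFF < 99%
    (0.60, (1, 2, 3)),  # 60% <= SFF < 90%
    (0.00, (0, 1, 2)),  # SFF < 60%
]


def safe_failure_fraction(lambda_s, lambda_dd, lambda_du):
    """SFF = (sum(lambda_s) + sum(lambda_dd)) / (sum(lambda_s) + sum(lambda_d)).

    lambda_d is the sum of detected (dd) and undetected (du) dangerous rates.
    """
    lambda_d = lambda_dd + lambda_du
    total = lambda_s + lambda_d
    if total <= 0:
        raise ValueError("total failure rate must be positive")
    return (lambda_s + lambda_dd) / total


def max_sil_type_b(sff, hft):
    """Look up the SIL limit implied by an SFF (0..1) and an HFT of 0, 1 or 2."""
    if hft not in (0, 1, 2):
        raise ValueError("this sketch only covers HFT 0, 1 and 2")
    for threshold, limits in TYPE_B_LIMITS:
        if sff >= threshold:
            return limits[hft]
    raise ValueError("SFF must be between 0 and 1")


# Hypothetical failure rates per hour, for illustration only.
sff = safe_failure_fraction(lambda_s=5e-7, lambda_dd=4.98e-7, lambda_du=2e-9)
print(f"SFF = {sff:.1%}, SIL limit at HFT 0 = {max_sil_type_b(sff, 0)}")
```

With an SFF above 99%, the table allows SIL3 even at an HFT of 0. This is the sense in which a very high SFF lets a design reach SIL3 without relying on hardware fault tolerance.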
ABB Oil & Gas
Suite 110
4411 6th Street SE
Calgary, AB T2G 4E8
name: Anne Roberts-Kraska
email: anne.k.roberts-kraska@ca.abb.com
phone: 403 225 5511
www.abb.com