
Future Architecture of Flight Control Systems

Kristina Ahlstrom & Jan Torin


Chalmers University of Technology

Authors' Current Address: K. Ahlström and J. Torin, Department of Computer Engineering, Chalmers University of Technology, 412 96 Gothenburg, Sweden.
Based on a presentation at DASC 2001.

ABSTRACT

The development of fault tolerant embedded control systems such as flight control systems (FCS) is currently highly specialized and time-consuming. We introduce a conceptual architecture for the next decade's control system where all control and logic are distributed to a number of computer nodes locally linked to actuators and connected via a communication network. In this way, we substantially reduce the life-cycle cost of embedded systems and attain scalable fault tolerance.

All fault tolerance is based on redundancy. Our philosophy is to cover permanent faults with hardware replication and handle all error processing caused by both permanent and transient faults with software techniques. With intelligent nodes and use of inherent redundancy we introduce a robust and simple fault tolerant system that utilizes minimum hardware and has bandwidth requirements of less than 300 kbits/s, which can be met with an electrical bus. The study is based on an FCS for JAS 39 Gripen, a multi-role combat aircraft that is statically unstable at subsonic speed.

INTRODUCTION

Traditionally, flight control systems (FCS) are implemented with a central dependable, fault tolerant computer [1] because the available airborne computers are complex, large, and expensive. These centralized complex systems are inflexible and require a great deal of hardware because different failure rates of FCS components cannot be balanced to the safety requirements.

The evolution of microelectronics will continue to have an extreme influence on the increase in computer performance, and thus, future sensors and actuators will be made intelligent [2]. More sensor and actuator elements are based on MicroElectronic Mechanical Systems, MEMS, technology that can be integrated either on the same silicon die or in the same package as the associated micro-controller. Hence, in the future, we will see different types of commercial-off-the-shelf components designed for fault tolerant applications such as fly- and drive-by-wire systems [3]. One important reason for adding intelligence to the nodes in a distributed system is to achieve fault detection with a minimum of hardware. Such nodes with embedded error detection are extremely valuable in the design of fault tolerant systems.

A distributed system is inherently suited to be designed with scalable fault tolerance. Distribution refers to distribution of computing power and control. Note the distinction in the use of the term distribution in [4], where the communication is distributed, not the control. The goal is to discover the benefits and disadvantages of distributing the control or computing power.

Fig. 1. Future architecture of FCSs

This paper presents a future flight control system concept (Figure 1) to be designed with the available technology of the next decade, highlighting flexibility, scalability, low weight, predictability, and testability, at less complexity and lower maintenance costs.

An early version of the architecture described was presented at DASC 2001 [5]. The proposed architecture has completely decentralised control without any central core, with minimum hardware for certain safety / reliability requirements and
minimum communication requirements. The results are based on knowledge of the flight control system characteristics of the JAS 39 Gripen, a multi-role supersonic combat aircraft with over 15,000 accumulated flight hours. Consequently, we address highly dependable real-time systems and are convinced that the architecture results are general and can be applied to other combat and commercial aircraft, as well as other embedded control systems, in cars and trains, etc. The architecture aims to tolerate permanent and transient physical faults. The architecture presented reuses the software of the JAS 39 Gripen's FCS and software design faults are, hence, not addressed in this paper.

GOALS AND OBJECTIVES

An FCS must be designed to continuously provide services despite internal failures. This is the goal of a fault tolerant system, and a distributed system is inherently assumed to have the following features:

- Scalability and flexibility. Modularity makes it possible to scale the system without system redesign, and distributed autonomous nodes make it easy to add or change functionality and to adapt new technologies.
- Testability. Inherent decomposition in functional modules due to decentralized control reduces communication between nodes and software complexity, and simplifies verification at module level. Functional modules also support graceful degradation.
- Reduced complexity and maintenance costs. Manageable complexity is gained through splitting a central control unit into a number of less complex computer nodes. Several nodes may be identical, thus achieving a more cost-effective system.
- Commercial-off-the-shelf, COTS. By using intelligent and autonomic nodes with a standardized interface, COTS can be used for hardware as well as already developed software.
- Fault tolerance. Distributed fault tolerance makes it possible to tailor the redundancy according to reliability requirements and to implement the fault tolerance at the most suitable level, in order to minimize the cost of redundancy. Furthermore, a distributed control system can be designed to be less sensitive to battle damage and it is a goal to achieve a system architecture not dependent on any hardcore.

Communication between the distributed nodes in an FCS is dominated by continuous signals periodically sampled and transmitted, well-suited for a time triggered protocol. The possibility for analyzing timing and dependability is best served by a broadcast bus topology. The protocol that suits safety critical, hard real-time applications, such as FCS, belongs without question to the Multi-master, Time-triggered family [6]. The predetermined time slots in a time-triggered communication make it easy to design nodes with a temporal firewall (by using double time conditions for sending) resulting in fail-silent bus interfaces. The fault behavior of the bus is thus simple: Byzantine faults cannot occur and fault tolerance on system level can be designed, which is a requirement for meeting safety demands.

Since time-triggered systems are pre-scheduled, there will be a system cycle that is repeated with a fixed period. This makes the system, by necessity, cyclic and thus simple to design as composable, easy to verify and self-testable.

TASKS AND CRITICAL FAULTS OF A FLIGHT CONTROL SYSTEM

A stable aircraft (the F-4 Phantom, JA-37 Viggen, and present commercial airliners) will, without assistance from the FCS, find its own stable flight path. The FCS is used by the pilot to control the flight of the aircraft in three-dimensional space. A statically unstable aircraft (JAS 39 Gripen, F-22, or EuroFighter) does not, by itself, find a stable flight path. Without artificial stabilization achieved by the FCS, the aircraft will depart from the intended flight path (i.e., the aircraft will enter into a set of uncontrollable maneuvers). The FCS of an unstable aircraft, therefore, has two tasks:

1. To stabilize the aircraft by constantly balancing it along its flight path.
2. To execute the pilot's commands (basically the same task as in an FCS in a stable aircraft).

To fulfill task 1, the FCS needs to measure the current flight path of the aircraft. This is done by use of a number of sensors, e.g., Accelerometers, Rate Gyros and Angle of Attack / Side slip sensors. To achieve task 2, the FCS needs only to use the input from the pilot, the control stick, and pedals. The FCS then mixes the outputs of the two tasks, stabilization and pilot command, into one set of commands and moves the control surfaces accordingly.

JAS 39 Gripen is based on an aerodynamic configuration statically unstable at subsonic speed and, hence, necessitates active stabilization. It has seven primary and three secondary control surfaces that are controlled by the FCS. The present FCS has three redundant control computers (the Flight Control Electronic Assembly in Figure 2).

Fig. 2. Flight control sensors and surfaces of JAS 39 Gripen

The effect of a failure in the JAS 39 Gripen aircraft is rated on a 4-degree scale. In this study, essentially critical failures are considered in the hardware reliability analysis. The following situations represent critical system failures:

1. Loss of pitch rate information
2. Loss of roll or pitch stick position
3. Erroneous air data value / information (loss of air data sensor is not critical, if detected)
4. Any primary control surface uncontrollable and not streamlining ¹
5. Two primary control surfaces streamlining (the left and right canard are treated as one surface)
6. Loss of communication network.

¹ Each primary control surface can operate in one of two modes, the normal mode and the streamlining mode. In normal mode, the surface is controlled by the FCS. In streamlining mode, the surface is free to follow the dynamic forces affecting it. In this mode, the ...

More or less all aircraft, combat or commercial, have redundancy in the use of primary control surfaces, i.e., it is possible to maneuver and land the aircraft by commanding a reduced number of primary control surfaces. In the JAS 39 Gripen, this is used in the design of control laws such that one primary control surface can fail as long as it fails in a safe way, i.e., the FCS must know when and which control surface fails and it shall fail by streamlining. (Most other embedded control systems for cars, and trains, etc. have similar fail-safe conditions that can be used to reduce hardware redundancy.)

SYSTEM ARCHITECTURE

The physical structure and the minimum number of nodes in an FCS are given by the necessary basic elements (cockpit, sensors, control surfaces, engine) connected for functionality. The functional layout of the JAS 39 Gripen includes 16 nodes. With distributed sensor and actuator nodes, there are several choices as to how to allocate the tasks of control laws and logic. A single computer today can handle all data processing in the flight control system and, in the future, a single chip will be able to do the same in a fraction of the time necessary today. Thus, it is not mandatory to allocate the tasks according to processing load balance. In a future FCS system based on distributed control, task allocation will be better optimized according to a minimum bandwidth criterion, which implements the cheapest and most robust communication system and reduces the risk of transferring erroneous data among the nodes.

The minimum bandwidth criterion in our study of Gripen represents location of all FCS computing locally at each actuator node. To attain inherent redundancy, we let each local control node compute not only its own local command, but also all control commands, giving seven replicas at nearly no additional cost. The FCS software is duplicated to all seven actuator nodes.

In addition, maintenance and development costs are substantially reduced when there are seven identical actuator nodes instead of seven specialized ones. Furthermore, the software developed for today's centralized FCS has been proven usable in this future distributed FCS, as well.

Hardware Redundancy

Extra hardware (nodes) is added to the system mainly as hardware replication in order to meet the safety requirements imposed by permanent faults. It is clear that a permanent fault needs hardware replication in order to be both error recovered and fault reconfigured, whereas the error detection can be handled by mechanisms implemented in software, hardware, or by coding. By minimizing hardware, the life-cycle cost will be reduced because all hardware adds to system failure rate. The error processing caused by all types of faults should mainly be implemented in software [7].

The safety requirements of loss of control are used in the
estimate considering only permanent faults (all transient faults are assumed to be recovered) and, second, a pessimistic estimate considering both permanent and transient faults (assuming that as soon as only one working replica of a critical node exists, the next fault, permanent or transient, will cause a catastrophic failure). The fault rate of any component is assumed to be constant with respect to time. The fault rate might be considered conservative for sensors and actuators taking into account the evolution of MEMS in the next decade.

From the optimistic and pessimistic redundancy calculations of simplex, duplex, and TMR/duplex systems, a mix configuration is selected. The criteria for selection are to fulfill the requirements of: 1) not including any single point of failure; and 2) a probability of critical failure not greater than 0.5·10^[?] per flight hour. Thus, because of criterion 1), the sensors and bus must be in duplex, whereas the seven actuator nodes, one by each primary control surface, are simplex. This is possible since the aircraft is still controllable and able to make safe landings with six of seven primary control surfaces. In our reliability calculations, an electrical bus is assumed.

The duplicated sensors, the duplicated bus, and the seven simplex actuators form the minimum hardware configuration that fulfills the two safety requirements. Altogether, the distributed FCS has 20 independent computer nodes attached to a time-triggered multi-master broadcast bus with Time Division Multiple Access (TDMA) communication. Interaction between tasks in different nodes is thus fixed in time and must be pre-scheduled. Each node broadcasts its messages to other nodes in each TDMA cycle. The two buses are synchronized and carry the same data.

The configuration meets the requirements with respect to permanent faults, although the requirements are not fully met in the pessimistic estimate including transient faults. Consequently, transient faults must be handled better than in the pessimistic view; there must be high likelihood that they will be recovered before another transient is introduced. Furthermore, design errors (mainly software bugs) either must be proven not to exist or must be proven to be tolerated with high coverage by the software.

FAULT HANDLING

The classification of a fault as permanent or transient is dependent on the fault diagnosis techniques used. In a basically cyclic system such as FCS, it is convenient to use the following definition: The first time an error occurs in a redundant system it is classified as having been caused by a transient fault and the only action taken is recovery. However, if the error also occurs in the next cycle, the diagnosis will be a permanent fault, and the defective replica will be declared faulty and a reconfiguration with recovery is performed.

Sensor Faults

In order for a fault to be tolerated and handled, it must first be detected. The distributed intelligent sensor nodes of a future FCS are assumed to meet a fault detection coverage demand of at least 99% for both transient and permanent faults. Moreover, the smart actuator nodes are assumed to improve the coverage of detecting and locating a faulty sensor value to at least 99.9% by using the information from all sensors in the system. A fault in a sensor node can result in either: 1) that the node's fault detection mechanism discovers the fault and reports it in its next broadcast message; or 2) that the fault is not detected and a faulty value is sent at the bus.

Bus Faults

No recovery action is taken in the case of bus faults. The physical architecture of the buses should be such that neither a single fault, transient or permanent, nor battle damage will have an effect on both buses. Thus, in the case of a fault affecting one of the buses, the duplication guarantees the system to continue without functional degradation. Fault / error detection is assured through message synchronization mechanisms and checksums at all messages. All nodes are aware of any faults affecting the buses.

The following faults may occur:

- Transient fault causing bit errors; detected through CRC checksum
- Longer intermittent fault destroying a whole message; detected by synchronization mechanisms
- Permanent fault: the bus is killed, and this is detected by the synchronization mechanisms.

Considering transient faults at the bus, it might be difficult to satisfy the assumption that only one bus is affected. Ideally no redundant signals should be routed close to each other. However, redundant signals must come together because both buses are attached to each node and the physical distance at these parts is limited. If the duplicated sensor values are sent at different timeslots, or if the synchronous buses are displaced in time, the actuator node still receives both values from a sensor in the case of a temporal fault that affects both buses.

Structure of an Actuator Node

Allocation of tasks under an optimizing criterion of minimizing the bus traffic load implies that adaptation of sensor signals, control law computation, adaptation of reference input signals to control surfaces, and loop closure of the servo actuator for the control surface should all be executed in the actuator nodes. The output from the control law computation (Figure 3) consists of all seven command words, one for each primary control surface. All seven actuator nodes perform exactly the same control law computation, but the actuator node at the left canard, for example, only gives the corresponding left canard servo command as input to the loop closure. In addition, the handling of faults / errors from sensors, buses, and actuators will also be executed in the actuator nodes.

The computer will use data from sources in other nodes of the system transferred via the FCS bus in order to vote on the position of the control object (surfaces) of the node.

Fig. 3. Structure of an actuator node (computer, servo, and control surface)
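
The replication described under Structure of an Actuator Node can be illustrated with a minimal Python sketch: every node receives the seven command words broadcast by each replica and votes only on the word for its own surface. The function name `select_own_command`, the use of surface indices 0 to 6, and the median ("mid-value") voter are assumptions made for illustration; the paper does not prescribe this particular voting algorithm at this point.

```python
from statistics import median
from typing import Optional, Sequence

N_SURFACES = 7  # seven primary control surfaces, one actuator node each (from the text)


def select_own_command(
    broadcasts: Sequence[Optional[Sequence[float]]], own_index: int
) -> float:
    """Vote on the command word for this node's own control surface.

    broadcasts[i] holds the seven command words computed by node i, or
    None if node i stayed silent in this cycle. The median voter is an
    assumption of this sketch, not the authors' stated algorithm.
    """
    candidates = [words[own_index] for words in broadcasts if words is not None]
    if not candidates:
        raise RuntimeError("no command words received in this cycle")
    return median(candidates)


# Hypothetical cycle: node 2 is silent and node 5 broadcasts distorted words.
correct = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
broadcasts = [list(correct) for _ in range(N_SURFACES)]
broadcasts[2] = None
broadcasts[5] = [99.0] * N_SURFACES

own_cmd = select_own_command(broadcasts, own_index=0)
```

With five agreeing replicas, one silent node, and one distorted broadcast, the voted word for surface 0 stays at the value the healthy replicas computed. A median was chosen here because it needs no knowledge of which replica is faulty; other voters would fit the same interface.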

The position of the surface is fed together with the position of the controlled object to the (servo) loop closure, which will be implemented as a software process in the computer, see Figure 3. Node error and health information would also be generated and transferred on the FCS bus.

At a transient fault, the node must recover from the error and continue the specified service within a 60 Hz cycle (it is assumed that transient faults only occur in the computer because the Actuator Functions mainly consist of analog or hydro / mechanical machinery). At a permanent fault in the node, ones that occur in the computer and in Actuator Functions, the power from the servo shall be zero such that the control surface streamlines in a damped mode.

Internally, the node shall detect faults and errors in the whole node. By continuously checking the control surface position, the computer can detect all faults in the Actuator Functions. Via an alive signal and an alive valve the computer can inactivate the servo such that the surface streamlines. At a permanent fault in the computer, the alive signal shall be inhibited, i.e., the computer shall stop. Consequently, the problem is to detect the faults and errors in the computer, stop at permanent faults, and recover in the case of a transient fault within a 60 Hz cycle.

With double execution, the transient faults in the computer will be detected with a coverage of 99%. The double execution will also detect a good deal of the permanent faults and, combined with self-test programs, it is estimated that the detection coverage of permanent faults also approaches 99%. A watchdog timer should be added to the computer.

The control laws are computed in all other actuator nodes; by exchanging data over the bus, the results can be voted upon. A total fault detection coverage of over 0.999 is reasonable.

Transient Fault Handling in Actuator Nodes

A transient fault must not abort the function of a node. For instance, it must not cause any primary control surface to streamline. On the other hand, a permanent fault in the actuator node, computer or servo, must streamline the control surface (the only acceptable reason for not streamlining must be mechanical locking).

With seven actuator replicas that broadcast their data, detection coverage of close to 100% is motivated and nearly all transient faults are detected and located. If a transient fault occurs in an actuator node that aborts or distorts the seven control commands, that node will either make a notification of a fault (be silent) or will broadcast faulty command words. Either way, the local voter in each computer node, including the faulty one, will mask the fault and deliver a correct command word to the loop closure. Consequently, there is no performance degradation resulting from any temporary fault.

Depending on whether or not the actuator nodes work according to replica determinism, the recovery from transient faults can be approached in several ways, mainly according to one of the following three basic principles:

1. Context from a non-faulty actuator node; the actuator nodes fulfill the requirements of replica determinism
2. Inherent context by double execution; the actuator nodes fulfill the requirements of replica determinism
3. No action for recovery; the actuator node will continue as before. The actuator might work according to eventual determinism. With eventual determinism, a disturbed context from a transient fault will be updated by subsequent sensor input and will eventually approach that of the group.

1) Context from a Non-Faulty Actuator Node

Hence, all seven units are identical and use the same input data for task execution; at all times the seven replicas are in exactly the same state. To recover from a transient fault in an actuator, this alternative is to collect accurate context from one of the other replicas. The context is rather large (in the order of
120 variables) and the context collection and update must be done within the TDMA round.

Consequently, alternative 1) has a large influence on the required bandwidth; hence, the full context from one actuator node must be sent at the buses in each 60 Hz cycle. This implies that the necessary time slots on the bus for accurate context must be scheduled at a later time than the actuator node slots. However, this recovery alternative covers all transient faults discussed, affecting sensors, buses or actuator nodes.

2) Inherent Context by Double Execution

In alternative 2) two contexts are inherently created by a double execution and, in the case of a transient fault in an actuator, only one of the contexts is affected. Thus, recovery is achieved by simply picking the non-faulty one for the next execution. The double execution should probably be done anyway in order to achieve 99% error detection coverage of transient faults. This alternative has no impact on bandwidth requirements or scheduling of buses.

There exists a small probability of inconsistency among the actuator nodes. Suppose that a transient disturbs both buses when, for example, air data sensor number 1 broadcasts its value and only a subset of the actuator nodes successfully receives the value from air data sensor number 1. In this way the actuator nodes could go ahead with the control law computation with slightly different sensor values. Hence, in order to cover all transient faults with recovery alternative 2), either the bus must support atomic broadcast or duplicated values ought to be sent at different timeslots.

3) No Recovery

One option with seven replicas is to choose not to recover, alternative 3), in which the system can lose quite a few computers without any degradation of performance. The full distribution of control, with seven replicas as compared to three replicas in a corresponding centralized system, inherently has a great deal of redundancy, enough not to have to recover. Note that, without recovery, there is no distinction between transient and permanent faults if the actuators fulfill replica determinism. With this approach, a transient fault might abort a node. This will not happen, however, if the nodes operate in accordance with eventual replica determinism with less rigid timing and context constraints, i.e., the states and data in all seven nodes do not have to be identical at all times as in 1) and 2). With alternative 3) the voter may use a different algorithm in the process of selecting correct command words.

The seven local command words sent to the voter are most likely not identical in this eventual deterministic approach, but a mid-value or mean value with the exclusion of two to three extreme values will also give correct command words with one or two transient faults. After a transient fault, the actuator node continues as before without any special recovery action and its command words will eventually be determined.

Permanent Fault Handling in Actuator Nodes

The FCS cannot recover from permanent faults during runtime; these must be handled on the ground in parked mode, which is, of course, the case for permanent faults in all parts of the aircraft. Hence, the discussion of permanent fault handling highlights detection and reconfiguration rather than detection and recovery. Once a permanent fault has appeared, then the infected area or node is lost, resulting in degradation of the system.

The actuator nodes can operate in one of two modes, the normal mode and the streamlining mode. As mentioned earlier, any control surface streamlining is not a critical situation. The aircraft can still be controlled and achieve safe landings.

The architecture supports autonomic nodes and, thus, the decision to enter the streamlining mode is made locally. The questions of which faults must cause the control surface to streamline and how information on the streamlining mode comes to everyone's notice must be answered. Furthermore, in the case of a control surface streamlining, the reconfiguration must be made in a safe and correct way. The correctly working actuator nodes must agree upon which is the faulty one, and they must change mode simultaneously within a time limit small enough not to jeopardize the stabilization of the aircraft, i.e., reconfigure synchronously.

The actuator is mechanically connected to the control surface and shall operate in its normal mode as long as the computer provides it with a regular alive signal (in the form of a pulse train). Should the alive signal pulse train be interrupted, the servo will enter streamlining mode.

Fault in the Servo

Any fault in the actuator or position sensors must lead to streamlining. The model monitor in the computer node detects faults affecting these functions.

Fault in the Computer Node

In addition, a permanent fault affecting the voter, the model monitor, or the loop closure must cause the control surface to streamline. However, it is not necessary for all permanent faults to bring the actuator node to streamlining mode. If a permanent fault only affects the control law computation and manifests as a defective command word, this node can get a correct command word from any of the remaining six actuator nodes, thus making streamlining unnecessary. In order not to streamline by an affected control law computation in the node that is caused by either a transient or permanent fault, the output command word from the voter is used as input to the model-monitoring block.

CONCLUSION

In this study, conceptual system architecture has been developed at both the functional and physical levels with minimum hardware by distributing control at the application level and by taking advantage of inherent redundancy. From a system point of view, the migration toward distributed control networks provides the means for reductions in the overall development and maintenance costs and helps to manage the increasing complexity that faces control systems used for
safety-critical applications. Finally, the flexibility advantages of distributed nodes with software implemented fault tolerance will soon emerge as a major option for the design of cost-efficient, highly dependable systems.

ACKNOWLEDGEMENTS

The national aerospace program in Sweden supported the work reported in this paper, Project NFFP 349.

We thank Lars Holmlund, Magnus Landberg, Krister Fersan, and Nils Ankarback at Saab AB Gripen for their contributions.

REFERENCES

[1] D. Briere and P. Traverse, June 1993, AIRBUS A320/A330/A340 Electrical Flight Controls - A Family of Fault-Tolerant Systems, Proc. IEEE Int'l Symp. on Fault-Tolerant Computing, pp. 616-623.
[2] J. Torin, 1999, The Evolution of Microelectronics and its Impact on Avionics, Space Technology, Vol. 19, No. 3-4, pp. 199-214.
[3] D. Powell, J. Arlat, L. Beus-Dukic, A. Bondavalli, P. Coppola, A. Fantechi, E. Jenn, C. Rabejac and A. Wellings, June 1999, GUARDS: A Generic Upgradeable Architecture for Real-Time Dependable Systems, IEEE Trans. on Parallel and Distributed Systems, Vol. 10, No. 6, pp. 580-597.
[4] K.O. Hall and P.D. Stigall, June 1992, Distributed Flight Control System Using Fiber Distributed Data Interface (FDDI), IEEE AES Systems Magazine, pp. 21-33.
[5] K. Ahlström and J. Torin, 2001, Future Architecture for Flight Control Systems, in Proceedings of the 20th Digital Avionics Systems Conference.
[6] H. Kopetz, November 1993, Should Responsive Systems be Event-Triggered or Time-Triggered?, IEICE Trans. on Information and Systems, Vol. E76-D, No. 11, pp. 1325-1332.
[7] M. Hiller, June 2000, Executable Assertions for Detecting Data Errors in Embedded Control Systems, Proc. IEEE Int'l Conf. on Dependable Systems and Networks, pp. 24-33.
