
An Investigation of the

Therac-25 Accidents

Nancy G. Leveson, University of Washington

Clark S. Turner, University of California, Irvine

Computers are increasingly being introduced into safety-critical systems and, as a
consequence, have been involved in accidents. Some of the most widely cited
software-related accidents in safety-critical systems involved a computerized radiation
therapy machine called the Therac-25. Between June 1985 and January 1987, six known
accidents involved massive overdoses by the Therac-25 - with resultant deaths and serious
injuries. They have been described as the worst series of radiation accidents in the
35-year history of medical accelerators.1
With information for this article taken from publicly available documents, we present a
detailed accident investigation of the factors involved in the overdoses and the attempts
by the users, manufacturers, and the US and Canadian governments to deal with them. Our
goal is to help others learn from this experience, not to criticize the equipment's
manufacturer or anyone else. The mistakes that were made are not unique to this
manufacturer but are, unfortunately, fairly common in other safety-critical systems. As
Frank Houston of the US Food and Drug Administration (FDA) said, "A significant amount of
software for life-critical systems comes from small firms, especially in the medical
device industry; firms that fit the profile of those resistant to or uninformed of the
principles of either system safety or software engineering."2
A thorough account of the Therac-25 medical electron accelerator accidents reveals
previously unknown details and suggests ways to reduce risk in the future.

Furthermore, these problems are not limited to the medical industry. It is still a common
belief that any good engineer can build software, regardless of whether he or she is
trained in state-of-the-art software-engineering procedures. Many companies building
safety-critical software are not using proper procedures from a software-engineering and
safety-engineering perspective.

Most accidents are system accidents; that is, they stem from complex interactions between
various components and activities. To attribute a single cause to an accident is usually a
serious mistake. In this article, we hope to demonstrate the complex nature of accidents
and the need to investigate all aspects of system development and operation to understand
what has happened and to prevent future accidents.

Despite what can be learned from such investigations, fears of potential liability

18 0018-9162/93/0700-0018$03.00 1993 IEEE COMPUTER


or loss of business make it difficult to find out the details behind serious engineering
mistakes. When the equipment is regulated by government agencies, some information may be
available. Occasionally, major accidents draw the attention of the US Congress or
President and result in formal accident investigations (for instance, the Rogers
commission investigation of the Challenger accident and the Kemeny commission
investigation of the Three Mile Island incident).

The Therac-25 accidents are the most serious computer-related accidents to date (at least
nonmilitary and admitted) and have even drawn the attention of the popular press. (Stories
about the Therac-25 have appeared in trade journals, newspapers, People Magazine, and on
television's 20/20 and MacNeil/Lehrer NewsHour.) Unfortunately, the previous accounts of
the Therac-25 problems have been oversimplified, with misleading omissions.

In an effort to remedy this, we have obtained information from a wide variety of sources,
including lawsuits and the US and Canadian government agencies responsible for regulating
such equipment. We have tried to be very careful to present only what we could document
from original sources, but there is no guarantee that the documentation itself is correct.
When possible, we looked for multiple confirming sources for the more important facts.

We have tried not to bias our description of the accidents, but it is difficult not to
filter unintentionally what is described. Also, we were unable to investigate firsthand or
get information about some aspects of the accidents that may be very relevant. For
example, detailed information about the manufacturer's software development, management,
and quality control was unavailable. We had to infer most information about these from
statements in correspondence or other sources.

As a result, our analysis of the accidents may omit some factors. But the facts available
support previous hypotheses about the proper development and use of software to control
dangerous processes and suggest hypotheses that need further evaluation. Following our
account of the accidents and the responses of the manufacturer, government agencies, and
users, we present what we believe are the most compelling lessons to be learned in the
context of software engineering, safety engineering, and government and user standards and
oversight.

Genesis of the Therac-25

Medical linear accelerators (linacs) accelerate electrons to create high-energy beams that
can destroy tumors with minimal impact on the surrounding healthy tissue. Relatively
shallow tissue is treated with the accelerated electrons; to reach deeper tissue, the
electron beam is converted into X-ray photons.

In the early 1970s, Atomic Energy of Canada Limited (AECL) and a French company called CGR
collaborated to build linear accelerators. (AECL is an arms-length entity, called a crown
corporation, of the Canadian government. Since the time of the incidents related in this
article, AECL Medical, a division of AECL, is in the process of being privatized and is
now called Theratronics International Limited. Currently, AECL's primary business is the
design and installation of nuclear reactors.) The products of AECL and CGR's cooperation
were (1) the Therac-6, a 6 million electron volt (MeV) accelerator capable of producing
X rays only and, later, (2) the Therac-20, a 20-MeV dual-mode (X rays or electrons)
accelerator. Both were versions of older CGR machines, the Neptune and Sagittaire,
respectively, which were augmented with computer control using a DEC PDP 11 minicomputer.

Software functionality was limited in both machines: The computer merely added convenience
to the existing hardware, which was capable of standing alone. Industry-standard hardware
safety features and interlocks in the underlying machines were retained. We know that some
old Therac-6 software routines were used in the Therac-20 and that CGR developed the
initial software.

The business relationship between AECL and CGR faltered after the Therac-20 effort. Citing
competitive pressures, the two companies did not renew their cooperative agreement when
scheduled in 1981. In the mid-1970s, AECL developed a radical new "double-pass" concept
for electron acceleration. A double-pass accelerator needs much less space to develop
comparable energy levels because it folds the long physical mechanism required to
accelerate the electrons, and it is more economic to produce (since it uses a magnetron
rather than a klystron as the energy source).

[Figure 1. Typical Therac-25 facility. Labeled components: Therac-25 unit, treatment
table, room emergency switch, TV camera, turntable position monitor, control console,
printer, TV monitor, display terminal, motion enable switch (footswitch), beam on/off
light, emergency switches.]

Using this double-pass concept, AECL designed the Therac-25, a dual-mode linear
accelerator that can deliver either photons at 25 MeV or electrons at various energy
levels (see Figure 1). Compared with the Therac-20, the Therac-25 is notably more compact,
more versatile, and arguably easier to use. The higher energy takes advantage of the
phenomenon of "depth dose": As
July 1993 19
the energy increases, the depth in the body at which maximum dose buildup occurs also
increases, sparing the tissue above the target area. Economic advantages also come into
play for the customer, since only one machine is required for both treatment modalities
(electrons and photons).

Several features of the Therac-25 are important in understanding the accidents. First,
like the Therac-6 and the Therac-20, the Therac-25 is controlled by a PDP 11. However,
AECL designed the Therac-25 to take advantage of computer control from the outset; AECL
did not build on a stand-alone machine. The Therac-6 and Therac-20 had been designed
around machines that already had histories of clinical use without computer control.

In addition, the Therac-25 software has more responsibility for maintaining safety than
the software in the previous machines. The Therac-20 has independent protective circuits
for monitoring electron-beam scanning, plus mechanical interlocks for policing the machine
and ensuring safe operation. The Therac-25 relies more on software for these functions.
AECL took advantage of the computer's abilities to control and monitor the hardware and
decided not to duplicate all the existing hardware safety mechanisms and interlocks. This
approach is becoming more common as companies decide that hardware interlocks and backups
are not worth the expense, or they put more faith (perhaps misplaced) on software than on
hardware reliability.

Finally, some software for the machines was interrelated or reused. In a letter to a
Therac-25 user, the AECL quality assurance manager said, "The same Therac-6 package was
used by the AECL software people when they started the Therac-25 software. The Therac-20
and Therac-25 software programs were done independently, starting from a common base."
Reuse of Therac-6 design features or modules may explain some of the problematic aspects
of the Therac-25 software (see the sidebar "Therac-25 software development and design").
The quality assurance manager was apparently unaware that some Therac-20 routines were
also used in the Therac-25; this was discovered after a bug related to one of the
Therac-25 accidents was found in the Therac-20 software.

AECL produced the first hardwired prototype of the Therac-25 in 1976, and the completely
computerized commercial version was available in late 1982. (The sidebars provide details
about the machine's design and controlling software, important in understanding the
accidents.)

In March 1983, AECL performed a safety analysis on the Therac-25. This analysis was in the
form of a fault tree

and apparently excluded the software. According to the final report, the analysis made
several assumptions:

(1) Programming errors have been reduced by extensive testing on a hardware simulator and
under field conditions on teletherapy units. Any residual software errors are not included
in the analysis.
(2) Program software does not degrade due to wear, fatigue, or reproduction process.
(3) Computer execution errors are caused by faulty hardware components and by "soft"
(random) errors induced by alpha particles and electromagnetic noise.

The fault tree resulting from this analysis does appear to include computer failure,
although apparently, judging from these assumptions, it considers only hardware failures.
For example, in one OR gate leading to the event of getting the wrong energy, a box
contains "Computer selects wrong energy" and a probability of 10^-11 is assigned to this
event. For "Computer selects wrong mode," a probability of 4 x 10^-9 is given. The report
provides no justification of either number.

Accident history

Eleven Therac-25s were installed: five in the US and six in Canada. Six accidents
involving massive overdoses to patients occurred between 1985 and 1987. The machine was
recalled in 1987 for extensive design changes, including hardware safeguards against
software errors.

Related problems were found in the Therac-20 software. These were not recognized until
after the Therac-25 accidents because the Therac-20 included hardware safety interlocks
and thus no injuries resulted.

In this section, we present a chronological account of the accidents and the responses
from the manufacturer, government regulatory agencies, and users.

Kennestone Regional Oncology Center, 1985. Details of this accident in Marietta, Georgia,
are sketchy since it was never carefully investigated. There was no admission that the
injury was caused by the Therac-25 until long after the occurrence, despite claims by the
patient that she had been injured during treatment, the obvious and severe radiation burns
the patient suffered, and the suspicions of the radiation physicist involved.

After undergoing a lumpectomy to remove a malignant breast tumor, a 61-year-old woman was
receiving follow-up radiation treatment to nearby lymph nodes on a Therac-25 at the
Kennestone facility in Marietta. The Therac-25 had been operating at Kennestone for about
six months; other Therac-25s had been operating, apparently without incident, since 1983.

On June 3, 1985, the patient was set up for a 10-MeV electron treatment to the clavicle
area. When the machine turned on, she felt a "tremendous force of heat . . . this red-hot
sensation." When the technician came in, the patient said, "You burned me." The technician
replied that that was not possible. Although there were no marks on the patient at the
time, the treatment area felt "warm to the touch."

It is unclear exactly when AECL learned about this incident. Tim Still, the Kennestone
physicist, said that he contacted AECL to ask if the Therac-25 could operate in electron
mode without scanning to spread the beam. Three days later, the engineers at AECL called
the physicist back to explain that improper scanning was not possible.

In an August 19, 1986, letter from AECL to the FDA, the AECL quality assurance manager
said, "In March of 1986, AECL received a lawsuit from the patient involved . . . This
incident was never reported to AECL prior to this date, although some rather odd questions
had been posed by Tim Still, the hospital physicist." The physicist at a hospital in
Tyler, Texas, where a later accident occurred, reported, "According to Tim Still, the
patient filed suit in October 1985 listing the hospital, manufacturer, and service
organization responsible for the machine. AECL was notified informally about the suit by
the hospital, and AECL received official notification of a lawsuit in November 1985."

Because of the lawsuit (filed on November 13, 1985), some AECL administrators must have
known about the Marietta accident - although no investigation occurred at this time.
Further comments by FDA investigators point to the lack of a mechanism in AECL to follow
up reports of suspected accidents. The lack of follow-up in this case appears to be
evidence of such a problem in the organization.

The patient went home, but shortly afterward she developed a reddening and swelling in the
center of the treatment area. Her pain had increased to the point that her shoulder
"froze" and she experienced spasms. She was admitted to West Paces Ferry Hospital in
Atlanta, but her oncologists continued to send her to Kennestone for Therac-25 treatments.
Clinical explanation was

Major event time line

1985
JUN  3rd: Marietta, Georgia, overdose.
     Later in the month, Tim Still calls AECL and asks if overdose by Therac-25 is
     possible.
JUL  26th: Hamilton, Ontario, Canada, overdose; AECL notified and determines microswitch
     failure was the cause.
SEP  AECL makes changes to microswitch and notifies users of increased safety.
     Independent consultant (for Hamilton Clinic) recommends potentiometer on turntable.
OCT  Georgia patient files suit against AECL and hospital.
NOV  8th: Letter from CRPB to AECL asking for additional hardware interlocks and software
     changes.
DEC  Yakima, Washington, clinic overdose.

1986
JAN  Attorney for Hamilton clinic requests that potentiometer be installed on turntable.
     31st: Letter to AECL from Yakima reporting overdose possibility.
FEB  24th: Letter from AECL to Yakima saying overdose was impossible and no other
     incidents had occurred.
MAR  21st: Tyler, Texas, overdose. AECL notified; claims overdose impossible and no other
     accidents had occurred previously. AECL suggests hospital might have an electrical
     problem.
APR  7th: Tyler machine put back in service after no electrical problem could be found.
     11th: Second Tyler overdose. AECL again notified. Software problem found.
     15th: AECL files accident report with FDA.
MAY  2nd: FDA declares Therac-25 defective. Asks for CAP and proper renotification of
     Therac-25 users.
JUN  13th: First version of CAP sent to FDA.
JUL  23rd: FDA responds and asks for more information.
AUG  First user group meeting.
SEP  26th: AECL sends FDA additional information.
OCT  30th: FDA requests more information.
NOV  12th: AECL submits revision of CAP.
DEC  Therac-20 users notified of a software bug.
     11th: FDA requests further changes to CAP.
     22nd: AECL submits second revision of CAP.

1987
JAN  17th: Second overdose at Yakima.
     26th: AECL sends FDA its revised test plan.
FEB  Hamilton clinic investigates first accident and concludes there was an overdose.
     3rd: AECL announces changes to Therac-25.
     10th: FDA sends notice of adverse findings to AECL declaring Therac-25 defective
     under US law and asking AECL to notify customers that it should not be used for
     routine therapy. Health Protection Branch of Canada does the same thing. This lasts
     until August 1987.
MAR  Second user group meeting.
     5th: AECL sends third revision of CAP to FDA.
APR  9th: FDA responds to CAP and asks for additional information.
MAY  1st: AECL sends fourth revision of CAP to FDA.
     26th: FDA approves CAP subject to final testing and safety analysis.
JUN  5th: AECL sends final test plan and draft safety analysis to FDA.
JUL  Third user group meeting.
     21st: Fifth (and final) revision of CAP sent to FDA.

1988
JAN  29th: Interim safety analysis report issued.
NOV  3rd: Final safety analysis report issued.

Therac-25 software development and design

We know that the software for the Therac-25 was developed by a single person, using PDP 11
assembly language, over a period of several years. The software "evolved" from the
Therac-6 software, which was started in 1972. According to a letter from AECL to the FDA,
the "program structure and certain subroutines were carried over to the Therac 25 around
1976."

Apparently, very little software documentation was produced during development. In a 1986
internal FDA memo, a reviewer lamented, "Unfortunately, the AECL response also seems to
point out an apparent lack of documentation on software specifications and a software test
plan."

The manufacturer said that the hardware and software were "tested and exercised separately
or together over many years." In his deposition for one of the lawsuits, the quality
assurance manager explained that testing was done in two parts. A "small amount" of
software testing was done on a simulator, but most testing was done as a system. It
appears that unit and software testing was minimal, with most effort directed at the
integrated system test. At a Therac-25 user group meeting, the same quality assurance
manager said that the Therac-25 software was tested for 2,700 hours. Under questioning by
the users, he clarified this as meaning "2,700 hours of use."

The programmer left AECL in 1986. In a lawsuit connected with one of the accidents, the
lawyers were unable to obtain information about the programmer from AECL. In the
depositions connected with that case, none of the AECL employees questioned could provide
any information about his educational background or experience. Although an attempt was
made to obtain a deposition from the programmer, the lawsuit was settled before this was
accomplished. We have been unable to learn anything about his background.

AECL claims proprietary rights to its software design. However, from voluminous
documentation regarding the accidents, the repairs, and the eventual design changes, we
can build a rough picture of it.

The software is responsible for monitoring the machine status, accepting input about the
treatment desired, and setting the machine up for this treatment. It turns the beam on in
response to an operator command (assuming that certain operational checks on the status of
the physical machine are satisfied) and also turns the beam off when treatment is
completed, when an operator commands it, or when a malfunction is detected. The operator
can print out hard-copy versions of the CRT display or machine setup parameters.

The treatment unit has an interlock system designed to remove power to the unit when there
is a hardware malfunction. The computer monitors this interlock system and provides
diagnostic messages. Depending on the fault, the computer either prevents a treatment from
being started or, if the treatment is in progress, creates a pause or a suspension of the
treatment.

The manufacturer describes the Therac-25 software as having a stand-alone, real-time
treatment operating system. The system is not built using a standard operating system or
executive. Rather, the real-time executive was written especially for the Therac-25 and
runs on a 32K PDP 11/23. A preemptive scheduler allocates cycles to the critical and
noncritical tasks.

The software, written in PDP 11 assembly language, has four major components: stored data,
a scheduler, a set of critical and noncritical tasks, and interrupt services. The stored
data includes calibration parameters for the accelerator setup as well as
patient-treatment data. The interrupt routines include

• a clock interrupt service routine,
• a scanning interrupt service routine,
• traps (for software overflow and computer-hardware-generated interrupts),
• power up (initiated at power up to initialize the system and pass control to the
  scheduler),
• treatment console screen interrupt handler,
• treatment console keyboard interrupt handler,
• service printer interrupt handler, and
• service keyboard interrupt handler.

The scheduler controls the sequences of all noninterrupt events and coordinates all
concurrent processes. Tasks are initiated every 0.1 second, with the critical tasks
executed first and the noncritical tasks executed in any remaining cycle time. Critical
tasks include the following:

• The treatment monitor (Treat) directs and monitors patient setup and treatment via eight
  operating phases. These are called as subroutines, depending on the value of the Tphase
  control variable. Following the execution of a particular subroutine, Treat reschedules
  itself. Treat interacts with the keyboard processing task, which handles operator
  console communication. The prescription data is cross-checked and verified by other
  tasks (for example, the keyboard processor and the parameter setup sensor) that inform
  the treatment task of the verification status via shared variables.
• The servo task controls gun emission, dose rate (pulse-repetition frequency), symmetry
  (beam steering), and machine motions. The servo task also sets up the machine parameters
  and monitors the beam-tilt-error and the flatness-error interlocks.
• The housekeeper task takes care of system-status interlocks and limit checks, and puts
  appropriate messages on the CRT display. It decodes some information and checks the
  setup verification.

Noncritical tasks include

• Check sum processor (scheduled to run periodically).
• Treatment console keyboard processor (scheduled to run only if it is called by other
  tasks or by keyboard interrupts). This task acts as the interface between the software
  and the operator.
• Treatment console screen processor (run periodically). This task lays out appropriate
  record formats for either displays or hard copies.
• Service keyboard processor (run on demand). This task arbitrates non-treatment-related
  communication between the therapy system and the operator.
• Snapshot (run periodically by the scheduler). Snapshot captures preselected parameter
  values and is called by the treatment task at the end of a treatment.
• Hand-control processor (run periodically).
• Calibration processor. This task is responsible for a package of tasks that let the
  operator examine and change system setup parameters and interlock limits.

It is clear from the AECL documentation on the modifications that the software allows
concurrent access to shared memory, that there is no real synchronization aside from data
stored in shared variables, and that the "test" and "set" for such variables are not
indivisible operations. Race conditions resulting from this implementation of multitasking
played an important part in the accidents.
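
The sidebar "Therac-25 software development and design" notes that the "test" and "set" on
shared variables were not indivisible. As a minimal sketch of why that matters, the Python
below (not the Therac-25 code, which was PDP 11 assembly and is not public) replays one
unlucky task switch between the test and the set. All names here (nonatomic_claim,
run_interleaved, run_atomic, the "busy" flag, task_a, task_b) are hypothetical and chosen
only for illustration. The generator's yield stands in for the point at which a scheduler
could switch tasks; the lock-based version shows an indivisible test-and-set for contrast.

```python
import threading


def nonatomic_claim(shared, name, entered):
    """Claim a shared resource with 'test' and 'set' as two separate steps.

    The yield marks the window in which a task switch may occur.
    """
    free = not shared["busy"]      # step 1: test the shared variable
    yield                          # another task may run here
    if free:                       # decision uses the possibly stale test result
        shared["busy"] = True      # step 2: set the shared variable
        entered.append(name)


def run_interleaved():
    """Replay one specific schedule: both tasks test before either one sets."""
    shared = {"busy": False}
    entered = []
    task_a = nonatomic_claim(shared, "task_a", entered)
    task_b = nonatomic_claim(shared, "task_b", entered)
    next(task_a)                   # task_a tests and sees the resource free
    next(task_b)                   # task_b tests and also sees it free
    for task in (task_a, task_b):
        next(task, None)           # each task resumes and sets the flag
    return entered


def run_atomic():
    """Same two claims, but test-and-set is one indivisible lock operation."""
    lock = threading.Lock()
    entered = []
    for name in ("task_a", "task_b"):
        if lock.acquire(blocking=False):   # atomic: succeeds for one caller only
            entered.append(name)
    return entered


if __name__ == "__main__":
    print("non-atomic:", run_interleaved())
    print("atomic:    ", run_atomic())
```

Under the replayed schedule both tasks believe they own the resource, whereas the
indivisible version admits only the first. The sketch shows the general hazard only; it
makes no claim about which variables or tasks were involved in the actual accidents.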
sought for the reddening of the skin, which at first her oncologist attributed to her
disease or to normal treatment reaction.

About two weeks later, the physicist at Kennestone noticed that the patient had a matching
reddening on her back as though a burn had gone through her body, and the swollen area had
begun to slough off layers of skin. Her shoulder was immobile, and she was apparently in
great pain. It was obvious that she had a radiation burn, but the hospital and her doctors
could provide no satisfactory explanation. Shortly afterward, she initiated a lawsuit
against the hospital and AECL regarding her injury.

The Kennestone physicist later estimated that she received one or two doses of radiation
in the 15,000- to 20,000-rad (radiation absorbed dose) range. He does not believe her
injury could have been caused by less than 8,000 rads. Typical single therapeutic doses
are in the 200-rad range. Doses of 1,000 rads can be fatal if delivered to the whole body;
in fact, the accepted figure for whole-body radiation that will cause death in 50 percent
of the cases is 500 rads. The consequences of an overdose to a smaller part of the body
depend on the tissue's radiosensitivity. The director of radiation oncology at the
Kennestone facility explained their confusion about the accident as due to the fact that
they had never seen an overtreatment of that magnitude before.

Eventually, the patient's breast had to be removed because of the radiation burns. She
completely lost the use of her shoulder and her arm, and was in constant pain. She had
suffered a serious radiation burn, but the manufacturer and operators of the machine
refused to believe that it could have been caused by the Therac-25. The treatment
prescription printout feature was disabled at the time of the accident, so there was no
hard copy of the treatment data. The lawsuit was eventually settled out of court.

From what we can determine, the accident was not reported to the FDA until after the later
Tyler accidents in 1986 (described in later sections). The reporting regulations for
medical device incidents at that time applied only to equipment manufacturers and
importers, not users. The regulations required that manufacturers and importers report
deaths, serious injuries, or malfunctions that could result in those consequences.
Health-care professionals and institutions were not required to report incidents to
manufacturers. (The law was amended in 1990 to require health-care facilities to report
incidents to the manufacturer and the FDA.) The comptroller general of the US Government
Accounting Office, in testimony before Congress on November 6, 1989, expressed great
concern about the viability of the incident-reporting regulations in preventing or
spotting medical-device problems. According to a GAO study, the FDA knows of less than 1
percent of deaths, serious injuries, or equipment malfunctions that occur in hospitals.3

At this point, the other Therac-25 users were unaware that anything untoward had occurred
and did not learn about any problems with the machine until after subsequent accidents.
Even then, most of their information came through personal communication among themselves.

Ontario Cancer Foundation, 1985. The second in this series of accidents occurred at this
Hamilton, Ontario, Canada, clinic about seven weeks after the Kennestone patient was
overdosed. At that time, the Therac-25 at the Hamilton clinic had been in use for more
than six months. On July 26, 1985, a 40-year-old patient came to the clinic for her 24th
Therac-25 treatment for carcinoma of the cervix. The operator activated the machine, but
the Therac shut down after five seconds with an "H-tilt" error message. The Therac's
dosimetry system display read "no dose" and indicated a "treatment pause."

Since the machine did not suspend and the control display indicated no dose was delivered
to the patient, the operator went ahead with a second attempt at treatment by pressing the
"P" key (the proceed command), expecting the machine to deliver the proper dose this time.
This was standard operating procedure and, as described in the sidebar "The operator
interface" on p. 24, Therac-25 operators had become accustomed to frequent malfunctions
that had no untoward consequences for the patient. Again, the machine shut down in the
same manner. The operator repeated this process four times after the original attempt -
the display showing "no dose" delivered to the patient each time. After the fifth pause,
the machine went into treatment suspend, and a hospital service technician was called. The
technician found nothing wrong with the machine. This also was not an unusual scenario,
according to a Therac-25 operator.

After the treatment, the patient complained of a burning sensation, described as an
"electric tingling shock" to the treatment area in her hip. Six other patients were
treated later that day without incident. The patient came back for further treatment on
July 29 and complained of burning, hip pain, and excessive swelling in the region of
treatment. The machine was taken out of service, as radiation overexposure was suspected.
The patient was hospitalized for the condition on July 30. AECL was informed of the
apparent radiation injury and sent a service engineer to investigate. The FDA, the
then-Canadian Radiation Protection Bureau (CRPB), and the users were informed that there
was a problem, although the users claim that they were never informed that a patient
injury had occurred. (On April 1, 1986, the CRPB and the Bureau of Medical Devices were
merged to form the Bureau of Radiation and Medical Devices or BRMD.) Users were told that
they should visually confirm the turntable alignment until further notice (which occurred
three months later).

The patient died on November 3, 1985, of an extremely virulent cancer. An autopsy revealed
the cause of death as the cancer, but it was noted that had she not died, a total hip
replacement would have been necessary as a result of the radiation overexposure. An AECL
technician later estimated the patient had received between 13,000 and 17,000 rads.

Manufacturer response. AECL could not reproduce the malfunction that had occurred, but
suspected a transient failure in the microswitch used to determine turntable position.
During the investigation of the accident, AECL hardwired the error conditions they assumed
were necessary for the malfunction and, as a result, found some design weaknesses and
potential mechanical problems involving the turntable positioning.

The computer senses and controls turntable position by reading a 3-bit signal about the
status of three microswitches in the turntable switch assembly (see the sidebar "Turntable
positioning" on p. 25). Essentially, AECL determined that a 1-bit error in the mi-

The operator interface

In the main text, we describe changes made as a result of an FDA recall, and here we describe the operator interface of the software version used during the accidents.

The Therac-25 operator controls the machine with a DEC VT100 terminal. In the general case, the operator positions the patient on the treatment table, manually sets the treatment field sizes and gantry rotation, and attaches accessories to the machine. Leaving the treatment room, the operator returns to the VT100 console to enter the patient identification, treatment prescription (including mode, energy level, dose, dose rate, and time), field sizing, gantry rotation, and accessory data. The system then compares the manually set values with those entered at the console. If they match, a "verified" message is displayed and treatment is permitted. If they do not match, treatment is not allowed to proceed until the mismatch is corrected. Figure A shows the screen layout.

When the system was first built, operators complained that it took too long to enter the treatment plan. In response, the manufacturer modified the software before the first unit was installed so that, instead of reentering the data at the keyboard, operators could use a carriage return to merely copy the treatment site data.1 A quick series of carriage returns would thus complete data entry. This interface modification was to figure in several accidents.

The Therac-25 could shut down in two ways after it detected an error condition. One was a treatment suspend, which required a complete machine reset to restart. The other, not so serious, was a treatment pause, which required only a single-key command to restart the machine. If a treatment pause occurred, the operator could press the "P" key to "proceed" and resume treatment quickly and conveniently. The previous treatment parameters remained in effect, and no reset was required. This convenient and simple feature could be invoked a maximum of five times before the machine automatically suspended treatment and required the operator to perform a system reset.

Error messages provided to the operator were cryptic, and some merely consisted of the word "malfunction" followed by a number from 1 to 64 denoting an analog/digital channel number. According to an FDA memorandum written after one accident:

    The operator's manual supplied with the machine does not explain nor even address the malfunction codes. The [Maintenance] Manual lists the various malfunction numbers but gives no explanation. The materials provided give no indication that these malfunctions could place a patient at risk.

    The program does not advise the operator if a situation exists wherein the ion chambers used to monitor the patient are saturated, thus are beyond the measurement limits of the instrument. This software package does not appear to contain a safety system to prevent parameters being entered and intermixed that would result in excessive radiation being delivered to the patient under treatment.

An operator involved in an overdose accident testified that she had become insensitive to machine malfunctions. Malfunction messages were commonplace - most did not involve patient safety. Service technicians would fix the problems or the hospital physicist would realign the machine and make it operable again. She said, "It was not out of the ordinary for something to stop the machine... It would often give a low dose rate in which you would turn the machine back on... They would give messages of low dose rate, V-tilt, H-tilt, and other things; I can't remember all the reasons it would stop, but there [were] a lot of them." The operator further testified that during instruction she had been taught that there were "so many safety mechanisms" that she understood it was virtually impossible to overdose a patient.

A radiation therapist at another clinic reported that an average of 40 dose-rate malfunctions, attributed to underdoses, occurred on some days.

Reference
1. E. Miller, "The Therac-25 Experience," Proc. Conf. State Radiation Control Program Directors, 1987.
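The pause and suspend behavior described in this sidebar can be illustrated with a small state machine. This is only a sketch under stated assumptions: the class and method names are hypothetical, and the rule that the fifth pause escalates to a suspend is one reading of the text (it matches the account of the fifth pause leading to a treatment suspend), not AECL's documented logic.

```python
from enum import Enum


class State(Enum):
    READY = "ready"
    PAUSED = "treatment pause"
    SUSPENDED = "treatment suspend"


class Console:
    """Toy model of the pause/suspend behavior (hypothetical names)."""

    def __init__(self, pause_limit=5):
        # pause_limit=5 follows the sidebar; how the real software counted
        # pauses is not documented there, so this rule is an assumption.
        self.pause_limit = pause_limit
        self.pause_count = 0
        self.state = State.READY

    def minor_malfunction(self):
        """Low-priority error: pause, or suspend once the limit is reached."""
        if self.state is State.SUSPENDED:
            return self.state
        self.pause_count += 1
        if self.pause_count >= self.pause_limit:
            self.state = State.SUSPENDED
        else:
            self.state = State.PAUSED
        return self.state

    def proceed(self):
        """Operator presses "P": allowed only from a treatment pause."""
        if self.state is not State.PAUSED:
            raise RuntimeError("proceed is only valid after a treatment pause")
        self.state = State.READY

    def reset(self):
        """Full machine reset: the only way out of a treatment suspend."""
        self.pause_count = 0
        self.state = State.READY


def run_until_suspend(console):
    """Press "P" after every pause; return how many times it was pressed."""
    presses = 0
    while console.minor_malfunction() is State.PAUSED:
        console.proceed()
        presses += 1
    return presses
```

The sketch makes the design issue visible: nothing in the loop requires the operator to learn why the machine paused before resuming, so a one-key retry becomes the routine response.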

PATIENT NAME: TEST A
TREATMENT MODE: FIX        BEAM TYPE: X        ENERGY (KeV): 25

                             ACTUAL     PRESCRIBED
UNIT RATE/MINUTE             0          200
MONITOR UNITS                50 50      200
TIME (MIN)                   0.27       1.00

GANTRY ROTATION (DEG)        0.0        0          VERIFIED
COLLIMATOR ROTATION (DEG)    359.2      359        VERIFIED
COLLIMATOR X (CM)            14.2       14.3       VERIFIED
COLLIMATOR Y (CM)            27.2       27.3       VERIFIED
WEDGE NUMBER                 1                     VERIFIED
ACCESSORY NUMBER             0          0          VERIFIED

DATE: 84-OCT-26      SYSTEM: BEAM READY     OP.MODE: TREAT AUTO
TIME: 12:55. 8       TREAT: TREAT PAUSE     X-RAY 173777
OPR ID: T25V02-R03   REASON: OPERATOR       COMMAND:

Figure A. Operator interface screen layout.
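The comparison between manually set values and console entries, as described in the sidebar, can be sketched as follows. The actual and prescribed numbers are taken from Figure A; the tolerance values and all function names are assumptions made for illustration, since the source does not say how close two values had to be to count as "verified."

```python
# Manually set ("actual") and console-entered ("prescribed") values from Figure A.
ACTUAL = {
    "gantry_rotation_deg": 0.0,
    "collimator_rotation_deg": 359.2,
    "collimator_x_cm": 14.2,
    "collimator_y_cm": 27.2,
}
PRESCRIBED = {
    "gantry_rotation_deg": 0,
    "collimator_rotation_deg": 359,
    "collimator_x_cm": 14.3,
    "collimator_y_cm": 27.3,
}
# Hypothetical tolerances: the real verification limits are not given in the source.
TOLERANCE = {
    "gantry_rotation_deg": 0.5,
    "collimator_rotation_deg": 0.5,
    "collimator_x_cm": 0.2,
    "collimator_y_cm": 0.2,
}


def mismatches(actual, prescribed, tolerance):
    """Return the names of parameters whose set value is outside tolerance."""
    return [
        name
        for name in prescribed
        if abs(actual[name] - prescribed[name]) > tolerance[name]
    ]


def treatment_permitted(actual, prescribed, tolerance):
    """Treatment may proceed only when every parameter is verified."""
    return not mismatches(actual, prescribed, tolerance)
```

With the Figure A values every parameter falls inside the assumed tolerances, so `treatment_permitted(ACTUAL, PRESCRIBED, TOLERANCE)` is true; changing `collimator_x_cm` to 20.0 would block treatment until the mismatch is corrected.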

24 COMPUTER
croswitch codes (which could be caused by a single open-circuit fault on the switch lines) could produce an ambiguous position message for the computer. The problem was exacerbated by the design of the mechanism that extends a plunger to lock the turntable when it is in one of the three cardinal positions: The plunger could be extended when the turntable was way out of position, thus giving a second false position indication. AECL devised a method to indi-

Turntable positioning

The Therac-25 turntable design is important in understanding the accidents. The upper turntable (see Figure B) is a rotating table, as the name implies. The turntable rotates accessory equipment into the beam path to produce two therapeutic modes: electron mode and photon mode. A third position (called the field-light position) involves no beam at all; it facilitates correct positioning of the patient.

Proper operation of the Therac-25 is heavily dependent on the turntable position; the accessories appropriate to each mode are physically attached to the turntable. The turntable position is monitored by three microswitches corresponding to the three cardinal turntable positions: electron beam, X ray, and field light. These microswitches are attached to the turntable and are engaged by hardware stops at the appropriate positions. The position of the turntable, sent to the computer as a 3-bit binary signal, is based on which of the three microswitches are depressed by the hardware stops.

The raw, highly concentrated accelerator beam is dangerous to living tissue. In electron therapy, the computer controls the beam energy (from 5 to 25 MeV) and current while scanning magnets spread the beam to a safe, therapeutic concentration. These scanning magnets are mounted on the turntable and moved into proper position by the computer. Similarly, an ion chamber to measure electrons is mounted on the turntable and also moved into position by the computer. In addition, operator-mounted electron trimmers can be used to shape the beam if necessary.

For X-ray therapy, only one energy level is available: 25 MeV. Much greater electron-beam current is required for photon mode (some 100 times greater than that for electron therapy)1 to produce comparable output. Such a high dose-rate capability is required because a "beam flattener" is used to produce a uniform treatment field. This flattener, which resembles an inverted ice-cream cone, is a very efficient attenuator. To get a reasonable treatment dose rate out, a very high input dose rate is required. If the machine produces a photon beam with the beam flattener not in position, a high output dose rate results. This is the basic hazard of dual-mode machines: If the turntable is in the wrong position, the beam flattener will not be in place.

In the Therac-25, the computer is responsible for positioning the turntable (and for checking turntable position) so that a target, flattening filter, and X-ray ion chamber are directly in the beam path. With the target in the beam path, electron bombardment produces X rays. The X-ray beam is shaped by the flattening filter and measured by the X-ray ion chamber.

No accelerator beam is expected in the field-light position. A stainless steel mirror is placed in the beam path and a light simulates the beam. This lets the operator see precisely where the beam will strike the patient and make necessary adjustments before treatment starts. There is no ion chamber in place at this turntable position, since no beam is expected.

Traditionally, electromechanical interlocks have been used on these types of equipment to ensure safety - in this case, to ensure that the turntable and attached equipment are in the correct position when treatment is started. In the Therac-25, software checks were substituted for many traditional hardware interlocks.

Reference
1. J.A. Rawlinson, "Report on the Therac-25," OCTRF/OCI Physicists Meeting, Kingston, Ont., Canada, May 7, 1987.

Figure B. Upper turntable assembly. [Labeled parts: turntable switch assembly, X-ray mode target, switch actuators, electron mode scan magnet.]

July 1993 25
cate turntable position that tolerated a 1-bit error: The code would still unambiguously reveal correct position with any one microswitch failure.

In addition, AECL altered the software so that the computer checked for "in transit" status of the switches to keep further track of the switch operation and the turntable position, and to give additional assurance that the switches were working and the turntable was moving.

As a result of these improvements, AECL claimed in its report and correspondence with hospitals that "analysis of the hazard rate of the new solution indicates an improvement over the old system by at least five orders of magnitude." A claim that safety had been improved by five orders of magnitude seems exaggerated, especially given that in its final incident report to the FDA, AECL concluded that it "cannot be firm on the exact cause of the accident but can only suspect..." This underscores the company's inability to determine the cause of the accident with any certainty. The AECL quality assurance manager testified that AECL could not reproduce the switch malfunction and that testing of the microswitch was "inconclusive." The similarity of the errant behavior and the injuries to patients in this accident and a later one in Yakima, Washington (attributed to software error), provides good reason to believe that the Hamilton overdose was probably related to software error rather than to a microswitch failure.

Government and user response. The Hamilton accident resulted in a voluntary recall by AECL, and the FDA termed it a Class II recall. Class II means "a situation in which the use of, or exposure to, a violative product may cause temporary or medically reversible adverse health consequences or where the probability of serious adverse health consequences is remote." Four users in the US were advised by a letter from AECL on August 1, 1985, to visually check the ionization chamber to make sure it was in its correct position in the collimator opening before any treatment and to discontinue treatment if they got an H-tilt message with an incorrect dose indicated. The letter did not mention that a patient injury was involved. The FDA audited AECL's subsequent modifications. After the modifications, the users were told that they could return to normal operating procedures.

As a result of the Hamilton accident, the head of advanced X-ray systems in the CRPB, Gordon Symonds, wrote a report that analyzed the design and performance characteristics of the Therac-25 with respect to radiation safety. Besides citing the flawed microswitch, the report faulted both hardware and software components of the Therac's design. It concluded with a list of four modifications to the Therac-25 necessary for minimum compliance with Canada's Radiation Emitting Devices (RED) Act. The RED law, enacted in 1971, gives government officials power to ensure the safety of radiation-emitting devices.

The modifications recommended in the Symonds report included redesigning the microswitch and changing the way the computer handled malfunction conditions. In particular, treatment was to be terminated in the event of a dose-rate malfunction, giving a treatment "suspend." This would have removed the option to proceed simply by pressing the "P" key. The report also made recommendations regarding collimator test procedures and message and command formats. A November 8, 1985 letter signed by Ernest Letourneau, M.D., director of the CRPB, asked that AECL make changes to the Therac-25 based on the Symonds report "to be in compliance with the RED Act."

Although, as noted above, AECL did make the microswitch changes, it did not comply with the directive to change the malfunction pause behavior into treatment suspends, instead reducing the maximum number of retries from five to three. According to Symonds, the deficiencies outlined in the CRPB letter of November 8 were still pending when subsequent accidents five months later changed the priorities. If these later accidents had not occurred, AECL would have been compelled to comply with the requirements in the letter.

Immediately after the Hamilton accident, the Ontario Cancer Foundation hired an independent consultant to investigate. He concluded in a September 1985 report that an independent system (besides the computer) was needed to verify turntable position and suggested the use of a potentiometer. The CRPB wrote a letter to AECL in November 1985 requesting that AECL install such an independent upper collimator positioning interlock on the Therac-25. Also, in January 1986, AECL received a letter from the attorney representing the Hamilton clinic. The letter said there had been continuing problems with the turntable, including four incidents at Hamilton, and requested the installation of an independent system (potentiometer) to verify turntable position. AECL did not comply: No independent interlock was installed on the Therac-25s at this time.

Yakima Valley Memorial Hospital, 1985. As with the Kennestone overdose, machine malfunction in this accident in Yakima, Washington, was not acknowledged until after later accidents were understood.

The Therac-25 at Yakima had been modified in September 1985 in response to the overdose at Hamilton. During December 1985, a woman came in for treatment with the Therac-25. She developed erythema (excessive reddening of the skin) in a parallel striped pattern at one port site (her right hip) after one of the treatments. Despite this, she continued to be treated by the Therac-25 because the cause of her reaction was not determined to be abnormal until January or February of 1986. On January 6, 1986, her treatments were completed.

The staff monitored the skin reaction closely and attempted to find possible causes. The open slots in the blocking trays in the Therac-25 could have produced such a striped pattern, but by the time the skin reaction had been determined to be abnormal, the blocking trays had been discarded. The blocking arrangement and tray striping orientation could not be reproduced. A reaction to chemotherapy was ruled out because that should have produced reactions at the other ports and would not have produced stripes. When it was discovered that the woman slept with a heating pad, a possible explanation was offered on the basis of the parallel wires that deliver the heat in such pads. The staff x-rayed the heating pad and discovered that the wire pattern did not correspond to the erythema pattern on the patient's hip.

The hospital staff sent a letter to AECL on January 31, and they also spoke on the phone with the AECL technical support supervisor. On February 24, 1986, the AECL technical sup-

26 COMPUTER
port supervisor sent a written response to the director of radiation therapy at Yakima saying, "After careful consideration, we are of the opinion that this damage could not have been produced by any malfunction of the Therac-25 or by any operator error." The letter goes on to support this opinion by listing two pages of technical reasons why an overdose by the Therac-25 was impossible, along with the additional argument that there have "apparently been no other instances of similar damage to this or other patients." The letter ends, "In closing, I wish to advise that this matter has been brought to the attention of our Hazards Committee, as is normal practice."

The hospital staff eventually ascribed the skin/tissue problem to "cause unknown." In a report written on this first Yakima incident after another Yakima overdose a year later (described in a later section), the medical physicist involved wrote

    At that time, we did not believe that [the patient] was overdosed because the manufacturer had installed additional hardware and software safety devices to the accelerator.

    In a letter from the manufacturer dated 16-Sep-85, it is stated that "Analysis of the hazard rate resulting from these modifications indicates an improvement of at least five orders of magnitude"! With such an improvement in safety (10,000,000 percent) we did not believe that there could have been any accelerator malfunction. These modifications to the accelerator were completed on 5,6-Sep-85.

Even with fairly sophisticated physics support, the hospital staff, as users, did not have the ability to investigate the possibility of machine malfunction further. They were not aware of any other incidents, and, in fact, were told that there had been none, so there was no reason for them to pursue the matter. However, it seems that the fact that three similar incidents had occurred with this equipment should have triggered some suspicion and investigation by the manufacturer and the appropriate government agencies. This assumes, of course, that these incidents were all reported and known by AECL and by the government regulators. If they were not, then it is appropriate to ask why they were not and how this could be remedied in the future.

About a year later (in February 1987), after the second Yakima overdose led the hospital staff to suspect that the first injury had been due to a Therac-25 fault, the staff investigated and found that this patient had a chronic skin ulcer, tissue necrosis (death) under the skin, and was in constant pain. This was surgically repaired, skin grafts were made, and the symptoms relieved. The patient is alive today, with minor disability and some scarring related to the overdose. The hospital staff concluded that the dose accidentally delivered to this patient must have been much lower than in the second accident, as the reaction was significantly less intense and necrosis did not develop until six to eight months after exposure. Some other factors related to the place on the body where the overdose occurred also kept her from having more significant problems as a result of the exposure.

East Texas Cancer Center, March 1986. More is known about the Tyler, Texas, accidents than the others because of the diligence of the Tyler hospital physicist, Fritz Hager, without whose efforts the understanding of the software problems might have been delayed even further.

The Therac-25 was at the East Texas Cancer Center (ETCC) for two years before the first serious accident occurred; during that time, more than 500 patients had been treated. On March 21, 1986, a male patient came into ETCC for his ninth treatment on the Therac-25, one of a series prescribed as follow-up to the removal of a tumor from his back.

The patient's treatment was to be a 22-MeV electron-beam treatment of 180 rads over a 10 x 17-cm field on the upper back and a little to the left of his spine, or a total of 6,000 rads over a period of 6 1/2 weeks. He was taken into the treatment room and placed face down on the treatment table. The operator then left the treatment room, closed the door, and sat at the control terminal.

The operator had held this job for some time, and her typing efficiency had increased with experience. She could quickly enter prescription data and change it conveniently with the Therac's editing features. She entered the patient's prescription data quickly, then noticed that for mode she had typed "x" (for X ray) when she had intended "e" (for electron). This was a common mistake since most treatments involved X rays, and she had become accustomed to typing this. The mistake was easy to fix; she merely used the cursor up key to edit the mode entry.

Since the other parameters she had entered were correct, she hit the return key several times and left their values unchanged. She reached the bottom of the screen where a message indicated that the parameters had been "verified" and the terminal displayed "beam ready," as expected. She hit the one-key command B (for "beam on") to begin the treatment. After a moment, the machine shut down and the console displayed the message "Malfunction 54." The machine also displayed a "treatment pause," indicating a problem of low priority (see the operator interface sidebar). The sheet on the side of the machine explained that this malfunction was a "dose input 2" error. The ETCC did not have any other information available in its instruction manual or other Therac-25 documentation to explain the meaning of Malfunction 54. An AECL technician later testified that "dose input 2" meant that a dose had been delivered that was either too high or too low.

The machine showed a substantial underdose on its dose monitor display: 6 monitor units delivered, whereas the operator had requested 202 monitor units. The operator was accustomed to the quirks of the machine, which would frequently stop or delay treatment. In the past, the only consequences had been inconvenience. She immediately took the normal action when the machine merely paused, which was to hit the "P" key to proceed with the treatment. The machine promptly shut down with the same "Malfunction 54" error and the same underdose shown by the display terminal.

The operator was isolated from the patient, since the machine apparatus was inside a shielded room of its own. The only way the operator could be alerted to patient difficulty was through audio and video monitors. On this day, the video display was unplugged and the audio monitor was broken.

After the first attempt to treat him, the patient said that he felt like he had received an electric shock or that someone had poured hot coffee on his back: He felt a thump and heat and heard a buzzing sound from the equipment. Since this was his ninth treatment, he knew that this was not normal. He began to

July 1993 27
get up from the treatment table to go for help. It was at this moment that the operator hit the "P" key to proceed with the treatment. The patient said that he felt like his arm was being shocked by electricity and that his hand was leaving his body. He went to the treatment room door and pounded on it. The operator was shocked and immediately opened the door for him. He appeared shaken and upset.

The patient was immediately examined by a physician, who observed intense erythema over the treatment area, but suspected nothing more serious than electric shock. The patient was discharged with instructions to return if he suffered any further reactions. The hospital physicist was called in, and he found the machine calibration within specifications. The meaning of the malfunction message was not understood. The machine was then used to treat patients for the rest of the day.

In actuality, but unknown to anyone at that time, the patient had received a massive overdose, concentrated in the center of the treatment area. After-the-fact simulations of the accident revealed possible doses of 16,500 to 25,000 rads in less than 1 second over an area of about 1 cm.

During the weeks following the accident, the patient continued to have pain in his neck and shoulder. He lost the function of his left arm and had periodic bouts of nausea and vomiting. He was eventually hospitalized for radiation-induced myelitis of the cervical cord causing paralysis of his left arm and both legs, left vocal cord paralysis (which left him unable to speak), neurogenic bowel and bladder, and paralysis of the left diaphragm. He also had a lesion on his left lung and recurrent herpes simplex skin infections. He died from complications of the overdose five months after the accident.

User and manufacturer response. The Therac-25 was shut down for testing the day after this accident. One local AECL engineer and one from the home office in Canada came to ETCC to investigate. They spent a day running the machine through tests but could not reproduce a Malfunction 54. The AECL home office engineer reportedly explained that it was not possible for the Therac-25 to overdose a patient. The ETCC physicist claims that he asked AECL at this time if there were any other reports of radiation overexposure and that the AECL personnel (including the quality assurance manager) told him that AECL knew of no accidents involving radiation overexposure by the Therac-25. This seems odd since AECL was surely at least aware of the Hamilton accident that had occurred seven months before and the Yakima accident, and, even by its own account, AECL learned of the Georgia lawsuit about this time (the suit had been filed four months earlier). The AECL engineers then suggested that an electrical problem might have caused this accident.

The electric shock theory was checked out thoroughly by an independent engineering firm. The final report indicated that there was no electrical grounding problem in the machine, and it did not appear capable of giving a patient an electrical shock. The ETCC physicist checked the calibration of the Therac-25 and found it to be satisfactory. The center put the machine back into service on April 7, 1986, convinced that it was performing properly.

East Texas Cancer Center, April 1986. Three weeks after the first ETCC accident, on Friday, April 11, 1986, another male patient was scheduled to receive an electron treatment at ETCC for a skin cancer on the side of his face. The prescription was for 10 MeV to an area of approximately 7 x 10 cm. The same technician who had treated the first Tyler accident victim prepared this patient for treatment. Much of what follows is from the deposition of the Tyler Therac-25 operator.

As with her former patient, she entered the prescription data and then noticed an error in the mode. Again she used the cursor up key to change the mode from X ray to electron. After she finished editing, she pressed the return key several times to place the cursor on the bottom of the screen. She saw the "beam ready" message displayed and turned the beam on.

Within a few seconds the machine shut down, making a loud noise audible via the (now working) intercom. The display showed Malfunction 54 again. The operator rushed into the treatment room, hearing her patient moaning for help. The patient began to remove the tape that had held his head in position and said something was wrong. She asked him what he felt, and he replied "fire" on the side of his face. She immediately went to the hospital physicist and told him that another patient appeared to have been burned. Asked by the physicist to describe what he had experienced, the patient explained that something had hit him on the side of the face, he saw a flash of light, and he heard a sizzling sound reminiscent of frying eggs. He was very agitated and asked, "What happened to me, what happened to me?"

This patient died from the overdose on May 1, 1986, three weeks after the accident. He had disorientation that progressed to coma, fever to 104 degrees Fahrenheit, and neurological damage. Autopsy showed an acute high-dose radiation injury to the right temporal lobe of the brain and the brain stem.

User and manufacturer response. After this second Tyler accident, the ETCC physicist immediately took the machine out of service and called AECL to alert the company to this second apparent overexposure. The Tyler physicist then began his own careful investigation. He worked with the operator, who remembered exactly what she had done on this occasion. After a great deal of effort, they were eventually able to elicit the Malfunction 54 message. They determined that data-entry speed during editing was the key factor in producing the error condition: If the prescription data was edited at a fast pace (as is natural for someone who has repeated the procedure a large number of times), the overdose occurred.

It took some practice before the physicist could repeat the procedure rapidly enough to elicit the Malfunction 54 message at will. Once he could do this, he set about measuring the actual dose delivered under the error condition. He took a measurement of about 804 rads but realized that the ion chamber had become saturated. After making adjustments to extend his measurement ability, he determined that the dose was somewhere over 4,000 rads.

The next day, an engineer from AECL called and said that he could not reproduce the error. After the ETCC physicist explained that the procedure had to be performed quite rapidly, AECL could finally produce a similar malfunction on its own machine. AECL then set up its own set of measurements to test the dosage delivered. Two days after the accident, AECL said they had measured the dosage (at the center of the field) to be 25,000 rads. An AECL engineer ex-

28 COMPUTER
plained that the frying sound heard by unable to reproduce the error on his The software problem. A lesson to be
the patient was the ion chambers being machine, but two months later he found learned from the Therac-25 story is that
saturated. the link. focusing on particular software bugs is
In fact, it is not possible to determine The Therac-20 at the University of not the way to make a safe system. Vir­
the exact dose each of the accident vic­ Chicago is used to teach students in a tually all complex software can be made
tims received; the total dose delivered radiation therapy school conducted by to behave in an unexpected fashion un­
during the malfunction conditions was the center. The center's physicist, Frank der certain conditions. The basic mis­
found to vary enormously when differ­ Borger, noticed that whenever a new takes here involved poor software-en­
ent clinics simulated the faults. The num­ class of students started using the Ther­ gineering practices and building a
ber of pulses delivered in the 0.3 second ac-20, fuses and breakers on the ma­ machine that relies on the software for
that elapsed before interlock shutoff chine tripped, shutting down the unit. safe operation. Furthermore, the par­
varied because the software adjusted These failures, which had been occur­ ticular coding error is not as important
the start-up pulse-repetition frequency ring ever since the center had acquired as the general unsafe design of the soft­
to very different values on different the machine, might appear three times a ware overall. Examining the part of the
machines. Therefore, there is still some week while new students operated the code blamed for the Tyler accidents is
uncertainty as to the doses actually re­ machine and then disappear for months. instructive, however, in showing the
ceived in the accidents. ' Borger determined that new students overall software design flaws. The fol­
In one lawsuit that resulted from the make lots of different types of mistakes lowing explanation of the problem is
Tyler accidents, the AECL quality con­ and use "creative methods of editing" from the description AECL provided
trol manager testified that a "cursor parameters on the console. Through for the FDA, although we have tried to
up" problem had been found in the experimentation, he found that certain clarify it somewhat. The description
service mode at the Kennestone clinic editing sequences correlated with blown leaves some unanswered questions, but
and one other clinic in February or March fuses and determined that the same com­ it is the best we can do with the informa­
1985 and also in the summer of 1985. puter bug (as in the Therac-25 soft­ tion we have.
Both times, AECL thought that the soft­ ware) was responsible. The physicist As described in the sidebar on Ther­
ware problems had been fixed. There is notified the FDA, which notified Ther­ ac-25 software development and design,
no way to determine whether there is ac-20 users.' the treatment monitor task (Treat) con­
any relationship between these prob­ The software error is just a nuisance trols the various phases of treatment by
lems and the Tyler accidents. on the Therac-20 because this machine executing its eight subroutines (see Fig­
has independent hardware protective ure 2). The treatment phase indicator
Related Therac-20 problems. After the circuits for monitoring the electron­ variable (Tphase) is used to determine
Tyler accidents, Therac-20 users (who beam scanning. The protective circuits which subroutine should be executed.
had heard informally about the Tyler do not allow the beam to turn on, so Following the execution of a particular
accidents from Therac-25 users) con­ there is no danger of radiation exposure subroutine, Treat reschedules itself.
ducted informal investigations to deter­ to a patient. While the Therac-20 relies One of Treat's subroutines, called
mine whether the same problem could on mechanical interlocks for monitor­ Datent (data entry), communicates with
occur with their machines. As noted ing the machine, the Therac-25 relies the keyboard handler task (a task that
earlier, the software for the Therac-25 largely on software. runs concurrently with Treat) via a
and Therac-20 both " evolved" from the
Therac-6 software. Additional functions
had to be added because the Therac-20 When Tphase is " 1 " (Datent):
(and Therac-25) operates in both X-ray If data enby complete, set Tphase to "3"
and electron mode, while the Therac-6
has only X-ray mode. The CGR em­
ployees modified the software for the
Therac-20 to handle the dual modes.
When the Therac-25 development
began. AECL engineers adapted the
software from the Therac-6, but they
also borrowed software routines from
the Therac-20 to handle electron mode.
The agreements hetween AECL and
CGR gave both companies the right to
tap technology used in joint products
for their other products.
After the second Tyler accident, a Mode/energy offset variable
physicist at the University of Chicago
Joint Center for Radiation Therapy ......... - --- ... ... ",
heard about the Therac-25 software :' Calibratio�·\.. ..... "
'- tables "
problem and decided to find out wheth­ .......-..
. - - .. �.,.'
er the same thing could happen with the
Therac-20. At first, the physicist was Figure 2. Tasks and subroutines in the code blamed for the Tyler accidents.

July 1 993 29
shared variable (Data-entry completion flag) to determine whether the prescription data has been entered. The keyboard handler recognizes the completion of data entry and changes the Data-entry completion variable to denote this. Once the Data-entry completion variable is set, the Datent subroutine detects the variable's change in status and changes the value of Tphase from 1 (Data Entry) to 3 (Set-Up Test). In this case, the Datent subroutine exits back to the Treat subroutine, which will reschedule itself and begin execution of the Set-Up Test subroutine. If the Data-entry completion variable has not been set, Datent leaves the value of Tphase unchanged and exits back to Treat's main line. Treat will then reschedule itself, essentially rescheduling the Datent subroutine.

The command line at the lower right corner of the screen is the cursor's normal position when the operator has completed all necessary changes to the prescription. Prescription editing is signified by cursor movement off the command line. As the program was originally designed, the Data-entry completion variable by itself is not sufficient since it does not ensure that the cursor is located on the command line. Under the right circumstances, the data-entry phase can be exited before all edit changes are made on the screen.

The keyboard handler parses the mode and energy level specified by the operator and places an encoded result in another shared variable, the 2-byte mode/energy offset (MEOS) variable. The low-order byte of this variable is used by another task (Hand) to set the collimator/turntable to the proper position for the selected mode/energy. The high-order byte of the MEOS variable is used by Datent to set several operating parameters.

Initially, the data-entry process forces the operator to enter the mode and energy, except when the operator selects the photon mode, in which case the energy defaults to 25 MeV. The operator can later edit the mode and energy separately. If the keyboard handler sets the data-entry completion variable before the operator changes the data in MEOS, Datent will not detect the changes in MEOS since it has already exited and will not be reentered again. The upper collimator, on the other hand, is set to the position dictated by the low-order byte of MEOS by another concurrently running task (Hand) and can therefore be inconsistent with the parameters set in accordance with the information in the high-order byte of MEOS. The software appears to include no checks to detect such an incompatibility.

The first thing that Datent does when it is entered is to check whether the mode/energy has been set in MEOS. If so, it uses the high-order byte to index into a table of preset operating parameters and places them in the digital-to-analog output table. The contents of this output table are transferred to the digital-analog converter during the next clock cycle. Once the parameters are all set, Datent calls the subroutine Magnet, which sets the bending magnets. Figure 3 is a simplified pseudocode description of relevant parts of the software.

    Datent:
        if mode/energy specified then
        begin
            calculate table index
            repeat
                fetch parameter
                output parameter
                point to next parameter
            until all parameters set
            call Magnet
            if mode/energy changed then return
        end
        if data entry is complete then set Tphase to 3
        if data entry is not complete then
            if reset command entered then set Tphase to 0
        return

    Magnet:
        Set bending magnet flag
        repeat
            Set next magnet
            Call Ptime
            if mode/energy has changed, then exit
        until all magnets are set
        return

    Ptime:
        repeat
            if bending magnet flag is set then
                if editing taking place then
                    if mode/energy has changed then exit
        until hysteresis delay has expired
        Clear bending magnet flag
        return

Figure 3. Datent, Magnet, and Ptime subroutines.

Setting the bending magnets takes about 8 seconds. Magnet calls a subroutine called Ptime to introduce a time delay. Since several magnets need to be set, Ptime is entered and exited several times. A flag to indicate that bending magnets are being set is initialized upon entry to the Magnet subroutine and cleared at the end of Ptime. Furthermore, Ptime checks a shared variable, set by the keyboard handler, that indicates the presence of any editing requests. If there are edits, then Ptime clears the bending magnet variable and exits to Magnet, which then exits to Datent. But the edit change variable is checked by Ptime only if the bending magnet flag is set. Since Ptime clears it during its first execution, any edits performed during each succeeding pass through Ptime will not be recognized. Thus, an edit change of the mode or energy, although reflected on the operator's screen and the mode/energy offset variable, will not be sensed by Datent so it can index the appropriate calibration tables for the machine parameters.

Recall that the Tyler error occurred when the operator made an entry indicating the mode/energy, went to the command line, then moved the cursor up to change the mode/energy, and returned to the command line all within 8 seconds. Since the magnet setting takes about 8 seconds and Magnet does not recognize edits after the first execution of Ptime, the editing had been completed by the return to Datent, which never detected that it had occurred. Part of the problem was fixed after the accident by clearing the bending-magnet variable at the end of Magnet (after all the magnets have been set) instead of at the end of Ptime.

But this was not the only problem.
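The way Ptime's early clearing of the bending magnet flag hides later edits can be illustrated with a small simulation. This is only a sketch under stated assumptions, not AECL's code: the function name `magnet_detects_edit`, its parameter names, and the choice of four magnets are hypothetical, and each loop iteration stands in for one complete pass through Ptime.

```python
from typing import Optional


def magnet_detects_edit(n_magnets: int,
                        edit_on_pass: Optional[int],
                        clear_flag_in_ptime: bool) -> bool:
    """Return True if an operator edit is noticed while magnets are being set.

    Hypothetical model of the Magnet/Ptime logic in Figure 3.
    edit_on_pass is the 0-based Ptime pass during which the operator
    edits the mode/energy, or None if no edit occurs.
    clear_flag_in_ptime=True models the original code (flag cleared at
    the end of Ptime); False models the partial fix (flag left set until
    Magnet has finished).
    """
    bending_magnet_flag = True           # set on entry to Magnet
    for ptime_pass in range(n_magnets):  # one Ptime call per magnet
        editing = (edit_on_pass == ptime_pass)
        # Ptime consults the edit variable only while the flag is set.
        if bending_magnet_flag and editing:
            return True                  # exit to Magnet, then to Datent
        if clear_flag_in_ptime:
            bending_magnet_flag = False  # original: cleared at end of Ptime
    return False                         # all magnets set, no edit noticed


if __name__ == "__main__":
    for label, original in (("original", True), ("partial fix", False)):
        seen = [magnet_detects_edit(4, p, original) for p in range(4)]
        print(label, seen)
```

Under these assumptions, the original variant notices an edit only if it falls in the first pass through Ptime, while the variant that defers clearing the flag notices an edit in any pass.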

Upon exit from the Magnet subroutine, the data-entry subroutine (Datent) checks the data-entry completion variable. If it indicates that data entry is complete, Datent sets Tphase to 3 and Datent is not entered again. If it is not set, Datent leaves Tphase unchanged, which means it will eventually be rescheduled. But the data-entry completion variable only indicates that the cursor has been down to the command line, not that it is still there. A potential race condition is set up. To fix this, AECL introduced another shared variable controlled by the keyboard handler task that indicates the cursor is not positioned on the command line. If this variable is set, then prescription entry is still in progress and the value of Tphase is left unchanged.

Government and user response. The FDA does not approve each new medical device on the market: All medical devices go through a classification process that determines the level of FDA approval necessary. Medical accelerators follow a procedure called pre-market notification before commercial distribution. In this process, the firm must establish that the product is substantially equivalent in safety and effectiveness to a product already on the market. If that cannot be done to the FDA's satisfaction, a pre-market approval is required. For the Therac-25, the FDA required only a pre-market notification.

The agency is basically reactive to problems and requires manufacturers to report serious ones. Once a problem is identified in a radiation-emitting product, the FDA must approve the manufacturer's corrective action plan (CAP).

The first reports of the Tyler accidents came to the FDA from the state of Texas health department, and this triggered FDA action. The FDA investigation was well under way when AECL produced a medical device report to discuss the details of the radiation overexposures at Tyler. The FDA declared the Therac-25 defective under the Radiation Control for Health and Safety Act and ordered the firm to notify all purchasers, investigate the problem, determine a solution, and submit a corrective action plan for FDA approval.

The final CAP consisted of more than 20 changes to the system hardware and software, plus modifications to the system documentation and manuals. Some of these changes were unrelated to the specific accidents, but were improvements to the general machine safety. The full implementation of the CAP, including an extensive safety analysis, was not complete until more than two years after the Tyler accidents.

AECL made its accident report to the FDA on April 15, 1986. On that same date, AECL sent a letter to each Therac user recommending a temporary "fix" to the machine that would allow continued clinical use. The letter (shown in its complete form) read as follows:

SUBJECT: CHANGE IN OPERATING PROCEDURES FOR THE THERAC 25 LINEAR ACCELERATOR

Effective immediately, and until further notice, the key used for moving the cursor back through the prescription sequence (i.e., cursor "UP" inscribed with an upward pointing arrow) must not be used for editing or any other purpose.

To avoid accidental use of this key, the key cap must be removed and the switch contacts fixed in the open position with electrical tape or other insulating material. For assistance with the latter you should contact your local AECL service representative.

Disabling this key means that if any prescription data entered is incorrect then [an] "R" reset command must be used and the whole prescription reentered.

For those users of the Multiport option, it also means that editing of dose rate, dose, and time will not be possible between ports.

On May 2, 1986, the FDA declared the Therac defective, demanded a CAP, and required renotification of all the Therac customers. In the letter from the FDA to AECL, the director of compliance, Center for Devices and Radiological Health, wrote

We have reviewed Mr. Downs' April 15 letter to purchasers and have concluded that it does not satisfy the requirements for notification to purchasers of a defect in an electronic product. Specifically, it does not describe the defect nor the hazards associated with it. The letter does not provide any reason for disabling the cursor key and the tone is not commensurate with the urgency for doing so. In fact, the letter implies the inconvenience to operators outweighs the need to disable the key. We request that you immediately renotify purchasers.

AECL promptly made a new notice to users and also requested an extension to produce a CAP. The FDA granted this request.

About this time, the Therac-25 users created a user group and held their first meeting at the annual conference of the American Association of Physicists in Medicine. At the meeting, users discussed the Tyler accident and heard an AECL representative present the company's plans for responding to it. AECL promised to send a letter to all users detailing the CAP.

Several users described additional hardware safety features that they had added to their own machines to provide additional protection. An interlock (that checked gun current values), which the Vancouver clinic had previously added to its Therac-25, was labeled as redundant by AECL. The users disagreed. There were further discussions of poor design and other problems that caused 10- to 30-percent underdosing in both modes.

The meeting notes said

. . . there was a general complaint by all users present about the lack of information propagation. The users were not happy about receiving incomplete information. The AECL representative countered by stating that AECL does not wish to spread rumors and that AECL has no policy to "keep things quiet." The consensus among the users was that an improvement was necessary.

After the first user group meeting, there were two user group newsletters. The first, dated fall 1986, contained letters from Still, the Kennestone physicist, who complained about what he considered to be eight major problems he had experienced with the Therac-25. These problems included poor screen-refresh subroutines that left trash and erroneous information on the operator console, and some tape-loading problems upon start-up, which he discovered involved the use of "phantom tables" to trigger the interlock system in the event of a load failure instead of using a check sum. He asked the question, "Is programming safety relying too much on the software interlock routines?" The second user group newsletter, in December 1986, further discussed the implications of the "phantom table" parameterization.

AECL produced the first CAP on June 13, 1986. It contained six items:

(1) Fix the software to eliminate the specific behavior leading to the Tyler problem.

(2) Modify the software sample-and-hold circuits to detect one pulse above a nonadjustable threshold. The software
sample-and-hold circuit monitors the magnitude of each pulse from the ion chambers in the beam. Previously, three consecutive high readings were required to shut off the high-voltage circuits, which resulted in a shutdown time of 300 ms. The software modification results in a reading after each pulse, and a shutdown after a single high reading.

(3) Make Malfunctions 1 through 64 result in treatment suspend rather than pause.

(4) Add a new circuit, which only administrative staff can reset, to shut down the modulator if the sample-and-hold circuits detect a high pulse. This is functionally equivalent to the circuit described in item 2. However, a new circuit board is added that monitors the five sample-and-hold circuits. The new circuit detects ion-chamber signals above a fixed threshold and inhibits the trigger to the modulator after detecting a high pulse. This shuts down the beam independently of the software.

(5) Modify the software to limit editing keys to cursor up, backspace, and return.

(6) Modify the manuals to reflect the changes.

FDA internal memos describe their immediate concerns regarding the CAP. One memo suggests adding an independent circuit that "detects and shuts down the system when inappropriate outputs are detected," warnings about when ion chambers are saturated, and understandable system error messages. Another memo questions "whether all possible hardware options have been investigated by the manufacturer to prevent any future inadvertent high exposure."

On July 23 the FDA officially responded to AECL's CAP submission. They conceptually agreed with the plan's direction but complained about the lack of specific information necessary to evaluate the plan, especially with regard to the software. The FDA requested a detailed description of the software-development procedures and documentation, along with a revised CAP to include revised requirements documents, a detailed description of corrective changes, analysis of the interactions of the modified software with the system, and detailed descriptions of the revised edit modes, the changes made to the software setup table, and the software interlock interactions. The FDA also made a very detailed request for a documented test plan.

AECL responded on September 26 with several documents describing the software and its modifications but no test plan. They explained how the Therac-25 software evolved from the Therac-6 software and stated that "no single test plan and report exists for the software since both hardware and software were tested and exercised separately and together over many years." AECL concluded that the current CAP improved "machine safety by many orders of magnitude and virtually eliminates the possibility of lethal doses as delivered in the Tyler incident."

An FDA internal memo dated October 20 commented on these AECL submissions, raising several concerns:

Unfortunately, the AECL response also seems to point out an apparent lack of documentation on software specifications and a software test plan.

. . . concerns include the question of previous knowledge of problems by AECL, the apparent paucity of software QA [quality assurance] at the manufacturing facility, and possible warnings and information dissemination to others of the generic type problems.

. . . As mentioned in my first review, there is some confusion on whether the manufacturer should have been aware of the software problems prior to the [accidental radiation overdoses] in Texas. AECL had received official notification of a lawsuit in November 1985 from a patient claiming accidental over-exposure from a Therac-25 in Marietta, Georgia. . . . If knowledge of these software deficiencies were known beforehand, what would be the FDA's posture in this case?

. . . The materials submitted by the manufacturer have not been in sufficient detail and clarity to ensure an adequate software QA program currently exists. For example, a response has not been provided with respect to the software part of the CAP to the CDRH [FDA Center for Devices and Radiological Health] request for documentation on the revised requirements and specifications for the new software. In addition, an analysis has not been provided, as requested, on the interaction with other portions of the software to demonstrate the corrected software does not adversely affect other software functions.

The July 23 letter from the CDRH requested a documented test plan including several specific pieces of information identified in the letter. This request has been ignored up to this point by the manufacturer. Considering the ramifications of the current software problem, changes in software QA attitudes are needed at AECL.

On October 30, the FDA responded to AECL's additional submissions, complaining about the lack of a detailed description of the accident and of sufficient detail in flow diagrams. Many specific questions addressed the vagueness of the AECL response and made it clear that additional CAP work must precede approval.

AECL, in response, created CAP Revision 1 on November 12. This CAP contained 12 new items under "software modifications," all (except for one cosmetic change) designed to eliminate potentially unsafe behavior. The submission also contained other relevant documents including a test plan.

The FDA responded to CAP Revision 1 on December 11. The FDA explained that the software modifications appeared to correct the specific deficiencies discovered as a result of the Tyler accidents. They agreed that the major items listed in CAP Revision 1 would improve the Therac's operation. However, the FDA required AECL to attend to several further system problems before CAP approval. AECL had proposed to retain treatment pause for some dose-rate and beam-tilt malfunctions. Since these are dosimetry system problems, the FDA considered them safety interlocks and believed treatment must be suspended for these malfunctions.

AECL also planned to retain the malfunction codes, but the FDA required better warnings for the operators. Furthermore, AECL had not planned on any quality assurance testing to ensure exact copying of software, but the FDA insisted on it. The FDA further requested assurances that rigorous testing would become a standard part of AECL's software-modification procedures:

We also expressed our concern that you did not intend to perform the protocol to future modifications to software. We
believe that the rigorous testing must be performed each time a modification is made in order to ensure the modification does not adversely affect the safety of the system.

AECL was also asked to draw up an installation test plan to ensure both hardware and software changes perform as designed when installed.

AECL submitted CAP Revision 2 and supporting documentation on December 22, 1986. They changed the CAP to have dose malfunctions suspend treatment and included a plan for meaningful error messages and highlighted dose error messages. They also expanded diagrams of software modifications and expanded the test plan to cover hardware and software.

On January 26, 1987, AECL sent the FDA their "Component and Installation Test Plan" and explained that their delays were due to the investigation of a new accident on January 17 at Yakima.

Yakima Valley Memorial Hospital, 1987. On Saturday, January 17, 1987, the second patient of the day was to be treated at the Yakima Valley Memorial Hospital for a carcinoma. This patient was to receive two film-verification exposures of 4 and 3 rads, plus a 79-rad photon treatment (for a total exposure of 86 rads).

Film was placed under the patient and 4 rads was administered with the collimator jaws opened to 22 x 18 cm. After the machine paused, the collimator jaws opened to 35 x 35 cm automatically, and the second exposure of 3 rads was administered. The machine paused again.

The operator entered the treatment room to remove the film and verify the patient's precise position. He used the hand control in the treatment room to rotate the turntable to the field-light position, a feature that let him check the machine's alignment with respect to the patient's body to verify proper beam position. The operator then either pressed the set button on the hand control or left the room and typed a set command at the console to return the turntable to the proper position for treatment; there is some confusion as to exactly what transpired. When he left the room, he forgot to remove the film from underneath the patient. The console displayed "beam ready," and the operator hit the "B" key to turn the beam on.

The beam came on but the console displayed no dose or dose rate. After 5 or 6 seconds, the unit shut down with a pause and displayed a message. The message "may have disappeared quickly"; the operator was unclear on this point. However, since the machine merely paused, he was able to push the "P" key to proceed with treatment.

The machine paused again, this time displaying "flatness" on the reason line. The operator heard the patient say something over the intercom, but couldn't understand him. He went into the room to speak with the patient, who reported "feeling a burning sensation" in the chest. The console displayed only the total dose of the two film exposures (7 rads) and nothing more.

Later in the day, the patient developed a skin burn over the entire treatment area. Four days later, the redness took on the striped pattern matching the slots in the blocking tray. The striped pattern was similar to the burn a year earlier at this hospital that had been attributed to "cause unknown."

AECL began an investigation, and users were told to confirm the turntable position visually before turning on the beam. All tests run by the AECL engineers indicated that the machine was working perfectly. From the information gathered to that point, it was suspected that the electron beam had come on when the turntable was in the field-light position. But the investigators could not reproduce the fault condition that produced the overdose.

On the following Thursday, AECL sent an engineer from Ottawa to investigate. The hospital physicist had, in the meantime, run some tests with film. He placed a film in the Therac's beam and ran two exposures of X-ray parameters with the turntable in field-light position. The film appeared to match the film that was left (by mistake) under the patient during the accident.

After a week of checking the hardware, AECL determined that the "incorrect machine operation was probably not caused by hardware alone." After checking the software, AECL discovered a flaw (described in the next section) that could explain the erroneous behavior. The coding problems explaining this accident differ from those associated with the Tyler accidents.

AECL's preliminary dose measurements indicated that the dose delivered under these conditions - that is, when the turntable was in the field-light position - was on the order of 4,000 to 5,000 rads. After two attempts, the patient could have received 8,000 to 10,000 instead of the 86 rads prescribed. AECL again called users on January 26 (nine days after the accident) and gave them detailed instructions on how to avoid this problem. In an FDA internal report on the accident, an AECL quality assurance manager investigating the problem is quoted as saying that the software and hardware changes to be retrofitted following the Tyler accident nine months earlier (but which had not yet been installed) would have prevented the Yakima accident.

The patient died in April from complications related to the overdose. He had been suffering from a terminal form of cancer prior to the radiation overdose, but survivors initiated lawsuits alleging that he died sooner than he would have and endured unnecessary pain and suffering due to the overdose. The suit was settled out of court.

The Yakima software problem. The software problem for the second Yakima accident is fairly well established and different from that implicated in the Tyler accidents. There is no way to determine what particular software design errors were related to the Kennestone, Hamilton, and first Yakima accidents. Given the unsafe programming practices exhibited in the code, it is possible that unknown race conditions or errors could have been responsible. There is speculation, however, that the Hamilton accident was the same as this second Yakima overdose. In a report of a conference call on January 26, 1987, between the AECL quality assurance manager and Ed Miller of the FDA discussing the Yakima accident, Miller notes

This situation probably occurred in the Hamilton, Ontario, accident a couple of years ago. It was not discovered at that time and the cause was attributed to intermittent interlock failure. The subsequent recall of the multiple microswitch logic network did not really solve the problem.

The second Yakima accident was again attributed to a type of race condition in the software - this one allowed the device to be activated in an error setting (a "failure" of a software interlock). The Tyler accidents were related to problems in the data-entry routines that allowed the code to proceed to Set-Up Test before the full prescription had been entered and acted upon. The Yakima accident involves problems encountered later in the logic after the treatment monitor Treat reaches Set-Up Test.

The Therac-25's field-light feature permits very precise positioning of the patient for treatment. The operator can control the Therac-25 right at the treatment site using a small hand control offering certain limited functions for patient setup, including setting gantry, collimator, and table motions.

Normally, the operator enters all the prescription data at the console (outside the treatment room) before the final setup of all machine parameters is completed in the treatment room. This gives rise to an "unverified" condition at the console. The operator then completes the patient setup in the treatment room, and all relevant parameters now "verify." The console displays the message "Press set button" while the turntable is in the field-light position. The operator now presses the set button on the hand control or types "set" at the

is entered and verified by the Datent routine, the control variable Tphase is changed so that the Set-Up Test routine is entered (see Figure 4). Every pass through the Set-Up Test routine increments the upper collimator position check, a shared variable called Class3. If Class3 is nonzero, there is an inconsistency and treatment should not proceed. A zero value for Class3 indicates that the relevant parameters are consistent with treatment, and the beam is not inhibited.

After setting the Class3 variable, Set-Up Test next checks for any malfunctions in the system by checking another shared variable (set by a routine that actually handles the interlock checking) called F$mal to see if it has a nonzero value. A nonzero value in F$mal indicates that the machine is not ready for treatment, and the Set-Up Test subroutine is rescheduled. When F$mal is zero (indicating that everything is ready for treatment), the Set-Up Test subroutine sets the Tphase variable equal to 2, which results in next scheduling the Set-Up Done subroutine, and the treatment is allowed to continue.

Figure 4. Yakima software flaw. (Recoverable diagram labels: Hkeper; Chkcol: "If upper collimator position inconsistent with treatment then set bit 9 of F$mal"; F$mal; Class3.)

position check is performed by a subroutine of Hkeper called Lmtchk (analog/digital limit checking). Lmtchk first checks the Class3 variable. If Class3 contains a nonzero value, Lmtchk calls the Check Collimator (Chkcol) subroutine. If Class3 contains zero, Chkcol is bypassed and the upper collimator position check is not performed. The Chkcol subroutine sets or resets bit 9 of the F$mal shared variable, depending on the position of the upper collimator (which in turn is checked by the Set-Up Test subroutine of Datent so it can decide whether to reschedule itself or proceed to Set-Up Done).

During machine setup, Set-Up Test will be executed several hundred times since it reschedules itself waiting for other events to occur. In the code, the Class3 variable is incremented by one in each pass through Set-Up Test. Since the Class3 variable is 1 byte, it can only contain a maximum value of 255 decimal. Thus, on every 256th pass through the Set-Up Test code, the variable overflows and has a zero value. That means that on every 256th pass through Set-Up Test, the upper collimator will not be checked and an upper collimator fault will not be detected.

The overexposure occurred when the operator hit the "set" button at the precise moment that Class3 rolled over to zero. Thus Chkcol was not executed, and F$mal was not set to indicate the upper collimator was still in field-light position. The software turned on the full 25 MeV without the target in place and without scanning. A highly concentrated electron beam resulted, which was scattered and deflected by the stainless steel mirror that was in the path.

AECL described the technical "fix" implemented for this software flaw as simple: The program is changed so that the Class3 variable is set to some fixed nonzero value each time through Set-Up Test instead of being incremented.

Manufacturer, government, and user response. On February 3, 1987, after interaction with the FDA and others, including the user group, AECL announced to its customers

• a new software release to correct both the Tyler and Yakima soft­
console. That should set the collimator rhe actual interlock checking is per­ ware p rohle ms,
to the proper position [or treatment. formed by a concurrent Housekeeper • a hardware single -pulse shutdown
In the software. after the prescription task (I lkeper). The upper collimator circuit.

• a turntable potentiometer to independently monitor turntable position, and
• a hardware turntable interlock circuit.

The second item, a hardware single-pulse shutdown circuit, essentially acts as a hardware interlock to prevent overdosing by detecting an unsafe level of radiation and halting beam output after one pulse of high energy and current. This provides an independent safety mechanism to protect against a wide range of potential hardware failures and software errors. The turntable potentiometer was the safety device recommended by several groups, including the CRPB, after the Hamilton accident.

After the second Yakima accident, the FDA became concerned that the use of the Therac-25 during the CAP process, even with AECL's interim operating instructions, involved too much risk to patients. The FDA concluded that the accidents had demonstrated that the software alone cannot be relied upon to assure safe operation of the machine. In a February 18, 1987 internal FDA memorandum, the director of the Division of Radiological Products wrote the following:

    It is impossible for CDRH to find all potential failure modes and conditions of the software. AECL has indicated the "simple software fix" will correct the turntable position problem displayed at Yakima. We have not yet had the opportunity to evaluate that modification. Even if it does, based upon past history, I am not convinced that there are not other software glitches that could result in serious injury.

    For example, we are aware that AECL issued a user's bulletin January 21 reminding users of the proper procedure to follow if editing of prescription parameter is desired after entering the "B" (beam on) code but before the CR [carriage return] is pressed. It seems that the normal edit keys (down arrow, right arrow, or line feed) will be interpreted as a CR and initiate exposure. One must use either the backspace or left arrow key to edit.

    We are also aware that if the dose entered into the prescription tables is below some preset value, the system will default to a phantom table value unbeknownst to the operator. This problem is supposedly being addressed in proposed interim revision 7A, although we are unaware of the details.

    We are in the position of saying that the proposed CAP can reasonably be expected to correct the deficiencies for which they were developed (Tyler). We cannot say that we are [reasonably] confident about the safety of the entire system to prevent or minimize exposure from other fault conditions.

On February 6, 1987, Miller of the FDA called Pavel Dvorak of Canada's Health and Welfare to advise him that the FDA would recommend all Therac-25s be shut down until permanent modifications could be made. According to Miller's notes on the phone call, Dvorak agreed and indicated that they would coordinate their actions with the FDA.

On February 10, 1987, the FDA gave a Notice of Adverse Findings to AECL declaring the Therac-25 to be defective under US law. In part, the letter to AECL reads:

    In January 1987, CDRH was advised of another accidental radiation occurrence in Yakima, which was attributed to a second software defect related to the "Set" command. In addition, the CDRH has become aware of at least two other software features that provide potential for unnecessary or inadvertent patient exposure. One of these is related to the method of editing the prescription after the "B" command is entered and the other is the calling of phantom tables when low doses are prescribed.

    Further review of the circumstances surrounding the accidental radiation occurrences and the potential for other such incidents has led us to conclude that in addition to the items in your proposed corrective action plan, hardware interlocking of the turntable to insure its proper position prior to beam activation appears to be necessary to enhance system safety and to correct the Therac-25 defect. Therefore, the corrective action plan as currently proposed is insufficient and must be amended to include turntable interlocking and corrections for the three software problems mentioned above.

    Without these corrections, CDRH has concluded that the consequences of the defects represents a significant potential risk of serious injury even if the Therac-25 is operated in accordance with your interim operating instructions. CDRH, therefore, requests that AECL immediately notify all purchasers and recommend that use of the device on patients for routine therapy be discontinued until such time that an amended corrective action plan approved by CDRH is fully completed. You may also advise purchasers that if the need for an individual patient treatment outweighs the potential risk, then extreme caution and strict adherence to operating safety procedures must be exercised.

At the same time, the Health Protection Branch of the Canadian government instructed AECL to recommend to all users in Canada that they discontinue the operation of the Therac-25 until "the company can complete an exhaustive analysis of the design and operation of the safety systems employed for patient and operator protection." AECL was told that the letter to the users should include information on how the users can operate the equipment safely in the event that they must continue with patient treatment. If AECL could not provide information that would guarantee safe operation of the equipment, AECL was requested to inform the users that they cannot operate the equipment safely. AECL complied by letters dated February 20, 1987, to Therac-25 purchasers. This recommendation to discontinue use of the Therac-25 was to last until August 1987.

On March 5, 1987, AECL issued CAP Revision 3, which was a CAP for both the Tyler and Yakima accidents. It contained a few additions to the Revision 2 modifications, notably

• changes to the software to eliminate the behavior leading to the latest Yakima accident,
• four additional software functional modifications to improve safety, and
• a turntable position interlock in the software.

In their response on April 9, the FDA noted that in the appendix under "turntable position interlock circuit" the descriptions were wrong. AECL had indicated "high" signals where "low" signals were called for and vice versa. The FDA also questioned the reliability of the turntable potentiometer design and asked whether the backspace key could still act as a carriage return in the edit mode. They requested a detailed description of the software portion of the single-pulse shutdown and a block diagram to demonstrate the PRF (pulse repetition frequency) generator, modulator, and associated interlocks.

AECL responded on April 13 with an update on the Therac CAP status and a schedule of the nine action items pressed by the users at a user group meeting in March. This unique and highly productive meeting provided an unusual opportunity to involve the users in the CAP evaluation process. It brought together all concerned parties in one place so that they could decide on and approve a course of action as quickly as possible. The attendees included representatives from the manufacturer (AECL); all users, including their tech-

Safety analysis of the Therac-25

The Therac-25 safety analysis included (1) failure mode and effect analysis, (2) fault-tree analysis, and (3) software examination.

Failure mode and effect analysis. An FMEA describes the associated system response to all failure modes of the individual system components, considered one by one. When software was involved, AECL made no assessment of the "how and why" of software faults and took any combination of software faults as a single event. The latter means that if the software was the initiating event, then no credit was given for the software mitigating the effects. This seems like a reasonable and conservative approach to handling software faults.

Fault-tree analysis. An FMEA identifies single failures leading to Class I hazards. To identify multiple failures and quantify the results, AECL used fault-tree analysis. An FTA starts with a postulated hazard - for example, two of the top events for the Therac-25 are high dose per pulse and illegal gantry motion. The immediate causes for the event are then generated in an AND/OR tree format, using a basic understanding of the machine operation to determine the causes. The tree generation continues until all branches end in "basic events." Operationally, a basic event is sometimes defined as an event that can be quantified (for example, a resistor fails open).

AECL used a "generic failure rate" of 10^-4 per hour for software events. The company justified this number as based on the historical performance of the Therac-25 software. The final report on the safety analysis said that many fault trees for the Therac-25 have a computer malfunction as a causative event, and the outcome of quantification is therefore dependent on the failure rate chosen for software.

Leaving aside the general question of whether such failure rates are meaningful or measurable for software in general, it seems rather difficult to justify a single figure of this sort for every type of software error or software behavior. It would be equivalent to assigning the same failure rate to every type of failure of a car, no matter what particular failure is considered.

The authors of the safety study did note that despite the uncertainty that software introduces into quantification, fault-tree analysis provides valuable information in showing single and multiple failure paths and the relative importance of different failure mechanisms. This is certainly true.

Software examination. Because of the difficulty of quantifying software behavior, AECL contracted for a detailed code inspection to "obtain more information on which to base decisions." The software functions selected for examination were those related to the Class I software hazards identified in the FMEA: electron-beam scanning, energy selection, beam shutoff, and dose calibration.

The outside consultant who performed the inspection included a detailed examination of each function's implementation, a search for coding errors, and a qualitative assessment of its reliability. The consultant recommended program changes to correct shortcomings, improve reliability, or improve the software package in a general sense. The final safety report gives no information about whether any particular methodology or tools were used in the software inspection or whether someone just read the code looking for errors.

Conclusions of the safety analysis. The final report summarizes the conclusions of the safety analysis:

    The conclusions of the analysis call for 10 changes to Therac-25 hardware; the most significant of these are interlocks to back up software control of both electron scanning and beam energy selection.

    Although it is not considered necessary or advisable to rewrite the entire Therac-25 software package, considerable effort is being expended to update it. The changes recommended have several distinct objectives: improve the protection it provides against hardware failures; provide additional reliability via cross-checking; and provide a more maintainable source package. Two or three software releases are anticipated before these changes are completed.

    The implementation of these improvements including design and testing for both hardware and software is well under way. All hardware modifications should be completed and installed by mid 1989, with final software updates extending into late 1989 or early 1990.

The recommended hardware changes appear to add protection against software errors, to add extra protection against hardware failures, or to increase safety margins. The software conclusions included the following:

    The software code for Beam Shut-Off, Symmetry Control, and Dose Calibration was found to be straight-forward and no execution path could be found which would cause them to perform incorrectly. A few improvements are being incorporated, but no additional hardware interlocks are required.

    Inspection of the Scanning and Energy Selection functions, which are under software control, showed no improper execution paths; however, software inspection was unable to provide a high level of confidence in their reliability. This was due to the complex nature of the code, the extensive use of variables, and the time limitations of the inspection process. Due to these factors and the possible clinical consequences of a malfunction, computer-independent interlocks are being retrofitted for these two cases.

Given the complex nature of this software design and the basic multitasking design, it is difficult to understand how any part of the code could be labeled "straightforward" or how confidence could be achieved that "no execution paths" exist for particular types of software behavior. However, it does appear that a conservative approach - including computer-independent interlocks - was taken in most cases. Furthermore, few examples of such safety analyses of software exist in the literature. One such software analysis was performed in 1989 on the shutdown software of a nuclear power plant, which was written by a different division of AECL.1 Much still needs to be learned about how to perform a software-safety analysis.

Reference

1. W.C. Bowman et al., "An Application of Fault Tree Analysis to Safety-Critical Software at Ontario Hydro," Conf. Probabilistic Safety Assessment and Management, 1991.
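
The AND/OR quantification that the sidebar describes can be sketched in a few lines of Python. This is only an illustration, not AECL's fault tree: the tree shape, the event names, and every probability below are hypothetical, and basic events are assumed to be independent (an assumption that is itself questionable for software). The sketch shows the sidebar's point that the quantified result moves directly with whatever figure is chosen for the software event.

```python
from math import prod


def p_and(*probs):
    """AND gate: every input event must occur (inputs assumed independent)."""
    return prod(probs)


def p_or(*probs):
    """OR gate: at least one input event occurs (inputs assumed independent)."""
    return 1.0 - prod(1.0 - p for p in probs)


def overdose_probability(p_software, p_hardware, p_interlock_fails):
    """Hypothetical top event: (software fault OR hardware fault) AND interlock fails."""
    return p_and(p_or(p_software, p_hardware), p_interlock_fails)


# Illustrative per-hour probabilities only; none are taken from the safety report.
baseline = overdose_probability(p_software=1e-4, p_hardware=1e-6, p_interlock_fails=1e-3)
pessimistic = overdose_probability(p_software=1e-2, p_hardware=1e-6, p_interlock_fails=1e-3)

print(f"baseline={baseline:.3e} pessimistic={pessimistic:.3e}")
```

Raising the assumed software figure by two orders of magnitude raises the top-event estimate by roughly the same factor, so the output of such a tree is only as credible as the single number assigned to "software."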

nical and legal staffs; the US FDA; the Canadian BRMD; the Canadian Atomic Energy Control Board; the Province of Ontario; and the Radiation Regulations Committee of the Canadian Association of Physicists.

According to Symonds of the BRMD, this meeting was very important to the resolution of the problems since the regulators, users, and the manufacturer arrived at a consensus in one day.

At this second users meeting, the participants carefully reviewed all the six known major Therac-25 accidents and discussed the elements of the CAP along with possible additional modifications. They came up with a prioritized list of modifications that they wanted included in the CAP and expressed concerns about the lack of independent software evaluation and the lack of a hard-copy audit trail to assist in diagnosing faults.

The AECL representative, who was the quality assurance manager, responded that tests had been done on the CAP changes, but that the tests were not documented, and independent evaluation of the software "might not be possible." He claimed that two outside experts had reviewed the software, but he could not provide their names. In response to user requests for a hard-copy audit trail and access to source code, he explained that memory limitations would not permit including an audit option, and source code would not be made available to users.

On May 1, AECL issued CAP Revision 4 as a result of the FDA comments and users meeting input. The FDA response on May 26 approved the CAP subject to submission of the final test plan results and an independent safety analysis, distribution of the draft revised manual to customers, and completion of the CAP by June 30, 1987. The FDA concluded by rating this a Class I recall: a recall in which there is a reasonable probability that the use of or exposure to a violative product will cause serious adverse health consequences or death.'

AECL sent more supporting documentation to the FDA on June 5, 1987, including the CAP test plan, a draft operator's manual, and the draft of the new safety analysis (described in the sidebar "Safety analysis of the Therac-25"). The safety analysis revealed four potentially hazardous subsystems that were not covered by CAP Revision 4:

(1) electron-beam scanning,
(2) electron-energy selection,
(3) beam shutoff, and
(4) calibration and/or steering.

AECL planned a fifth revision of the CAP to include the testing and safety analysis results.

Referring to the test plan at this, the final stage of the CAP process, an FDA reviewer said

    Amazingly, the test data presented to show that the software changes to handle the edit problems in the Therac-25 are appropriate prove the exact opposite result. A review of the data table in the test results indicates that the final beam type and energy (edit change) [have] no effect on the initial beam type and energy. I can only assume that either the fix is not right or the data was entered incorrectly. The manufacturer should be admonished for this error. Where is the QC [quality control] review for the test program? AECL must: (1) clarify this situation, (2) change the test protocol to prevent this type of error from occurring, and (3) set up appropriate QC control on data review.

A further FDA memo said the AECL quality assurance manager

    ... could not give an explanation and will check into the circumstances. He subsequently called back and verified that the technician completed the form incorrectly. Correct operation was witnessed by himself and others. They will repeat and send us the correct data sheet.

At the American Association of Physicists in Medicine meeting in July 1987, a third user group meeting was held. The AECL representative gave the status of CAP Revision 5. He explained that the FDA had given verbal approval and he expected full implementation by the end of August 1987. He reviewed and commented on the prioritized concerns of the last meeting. AECL had included in the CAP three of the user-requested hardware changes. Changes to tape-load error messages and check sums on the load data would wait until after the CAP was done.

Two user-requested hardware modifications had not been included in the CAP. One of these, a push-button energy and selection mode switch, AECL would work on after completing the CAP, the quality assurance manager said. The other, a fixed ion chamber with dose/pulse monitoring, was being installed at Yakima, had already been installed by Halifax on their own, and would be an option for other clinics. Software documentation was described as a lower priority task that needed definition and would not be available to the FDA in any form for more than a year.

On July 6, 1987, AECL sent a letter to all users to inform them of the FDA's verbal approval of the CAP and delineated how AECL would proceed. On July 21, 1987, AECL issued the fifth and final CAP revision. The major features of the final CAP are as follows:

• All interruptions related to the dosimetry system will go to a treatment suspend, not a treatment pause. Operators will not be allowed to restart the machine without reentering all parameters.
• A software single-pulse shutdown will be added.
• An independent hardware single-pulse shutdown will be added.
• Monitoring logic for turntable position will be improved to ensure that the turntable is in one of the three legal positions.
• A potentiometer will be added to the turntable. It will provide a visible signal of position that operators will use to monitor exact turntable location.
• Interlocking with the 270-degree bending magnet will be added to ensure that the target and beam flattener are in position if the X-ray mode is selected.
• Beam on will be prevented if the turntable is in the field-light or an intermediate position.
• Cryptic malfunction messages will be replaced with meaningful messages and highlighted dose-rate messages.
• Editing keys will be limited to cursor up, backspace, and return. All other keys will be inoperative.
• A motion-enable foot switch will be added, which the operator must hold closed during movement of certain parts of the machine to prevent unwanted motions when the operator is not in control (a type of "dead man's switch").
• Twenty-three other changes to the software to improve its operation and reliability, including disabling of unused keys, changing the operation of the set and reset commands, preventing copying of the control program on site, changing the way various detected hardware faults are handled, eliminating errors in the software that were detected during the review process, adding several additional software interlocks, disallowing

changing to the service mode while a treatment is in progress, and adding meaningful error messages.
• The known software problems associated with the Tyler and Yakima accidents will be fixed.
• The manuals will be fixed to reflect the changes.

In a 1987 paper, Miller, director of the Division of Standards Enforcement, CDRH, wrote about the lessons learned from the Therac-25 experiences.' The first was the importance of safe versus "user-friendly" operator interfaces - in other words, making the machine as easy as possible to use may conflict with safety goals. The second is the importance of providing fail-safe designs:

    The second lesson is that for complex interrupt-driven software, timing is of critical importance. In both of these situations, operator action within very narrow time-frame windows was necessary for the accidents to occur. It is unlikely that software testing will discover all possible errors that involve operator intervention at precise time frames during software operation. These machines, for example, have been exercised for thousands of hours in the factory and in the hospitals without accident. Therefore, one must provide for prevention of catastrophic results of failures when they do occur.

    I, for one, will not be surprised if other software errors appear with this or other equipment in the future.

Miller concluded the paper with

    FDA has performed extensive review of the Therac-25 software and hardware safety systems. We cannot say with absolute certainty that all software problems that might result in improper dose have been found and eliminated. However, we are confident that the hardware and software safety features recently added will prevent future catastrophic consequences of failure.

Lessons learned

Often, it takes an accident to alert people to the dangers involved in technology. A medical physicist wrote about the Therac-25 accidents:

    In the past decade or two, the medical accelerator "industry" has become perhaps a little complacent about safety. We have assumed that the manufacturers have all kinds of safety design experience since they've been in the business a long time. We know that there are many safety codes, guides, and regulations to guide them and we have been reassured by the hitherto excellent record of these machines. Except for a few incidents in the 1960s (e.g., at Hammersmith, Hamburg) the use of medical accelerators has been remarkably free of serious radiation accidents until now. Perhaps, though, we have been spoiled by this success.1

Accidents are seldom simple - they usually involve a complex web of interacting events with multiple contributing technical, human, and organizational factors. One of the serious mistakes that led to the multiple Therac-25 accidents was the tendency to believe that the cause of an accident had been determined (for example, a microswitch failure in the Hamilton accident) without adequate evidence to come to this conclusion and without looking at all possible contributing factors. Another mistake was the assumption that fixing a particular error (eliminating the current software bug) would prevent future accidents. There is always another software bug.

Accidents are often blamed on a single cause like human error. But virtually all factors involved in accidents can be labeled human error, except perhaps for hardware wear-out failures. Even such hardware failures could be attributed to human error (for example, the designer's failure to provide adequate redundancy or the failure of operational personnel to properly maintain or replace parts). Concluding that an accident was the result of human error is not very helpful or meaningful.

It is nearly as useless to ascribe the cause of an accident to a computer error or a software error. Certainly software was involved in the Therac-25 accidents, but it was only one contributing factor. If we assign software error as the cause of the Therac-25 accidents, we are forced to conclude that the only way to prevent such accidents in the future is to build perfect software that will never behave in an unexpected or undesired way under any circumstances (which is clearly impossible) or not to use software at all in these types of systems. Both conclusions are overly pessimistic.

We must approach the problem of accidents in complex systems from a system-engineering point of view and consider all possible contributing factors. For the Therac-25 accidents, contributing factors included

• management inadequacies and lack of procedures for following through on all reported incidents,
• overconfidence in the software and removal of hardware interlocks (making the software into a single point of failure that could lead to an accident),
• presumably less-than-acceptable software-engineering practices, and
• unrealistic risk assessments along with overconfidence in the results of these assessments.

The exact same accident may not happen a second time, but if we examine and try to ameliorate the contributing factors to the accidents we have had, we may be able to prevent different accidents in the future. In the following sections, we present what we feel are important lessons learned from the Therac-25. You may draw different or additional conclusions.

System engineering. A common mistake in engineering, in this case and many others, is to put too much confidence in software. Nonsoftware professionals seem to feel that software will not or cannot fail; this attitude leads to complacency and overreliance on computerized functions. Although software is not subject to random wear-out failures like hardware, software design errors are much harder to find and eliminate. Furthermore, hardware failure modes are generally much more limited, so building protection against them is usually easier. A lesson to be learned from the Therac-25 accidents is not to remove standard hardware interlocks when adding computer control.

Hardware backups, interlocks, and other safety devices are currently being replaced by software in many different types of systems, including commercial aircraft, nuclear power plants, and weapon systems. Where the hardware interlocks are still used, they are often controlled by software. Designing any

dangerous system in such a way that one failure can lead to an accident violates basic system-engineering principles. In this respect, software needs to be treated as a single component. Software should not be assigned sole responsibility for safety, and systems should not be designed such that a single software error or software-engineering error can be catastrophic.

A related tendency among engineers is to ignore software. The first safety analysis on the Therac-25 did not include software (although nearly full responsibility for safety rested on the software). When problems started occurring, investigators assumed that hardware was the cause and focused only on the hardware. Investigation of software's possible contribution to an accident should not be the last avenue explored after all other possible explanations are eliminated.

In fact, a software error can always be attributed to a transient hardware failure, since software (in these types of process-control systems) reads and issues commands to actuators. Without a thorough investigation (and without on-line monitoring or audit trails that save internal state information), it is not possible to determine whether the sensor provided the wrong information, the software provided an incorrect command, or the actuator had a transient failure and did the wrong thing on its own. In the Hamilton accident, a transient microswitch failure was assumed

incident-analysis procedures that they apply whenever they find any hint of a problem that might lead to an accident. The first phone call by Still should have led to an extensive investigation of the events at Kennestone. Certainly, learning about the first lawsuit should have triggered an immediate response. Although hazard logging and tracking is required in the standards for safety-critical military projects, it is less common in nonmilitary projects. Every company building hazardous equipment should have hazard logging and tracking as well as incident reporting and analysis as parts of its quality control procedures. Such follow-up and tracking will not only help prevent accidents, but will easily pay for themselves in reduced insurance rates and reasonable settlement of lawsuits when they do occur.

Finally, overreliance on the numerical output of safety analyses is unwise. The arguments over whether very low probabilities are meaningful with respect to safety are too extensive to summarize here. But, at the least, a healthy skepticism is in order. The claim that safety had been increased five orders of magnitude as a result of the microswitch fix after the Hamilton accident seems hard to justify. Perhaps it was based on the probability of failure of the microswitch (typically 10^-5) ANDed with the other interlocks. The problem with all such analyses is that they exclude aspects of the problem (in this case,

Software engineering. The Therac-25 accidents were fairly unique in having software coding errors involved - most computer-related accidents have not involved coding errors but rather errors in the software requirements such as omissions and mishandled environmental conditions and system states. Although using good basic software-engineering practices will not prevent all software errors, it is certainly required as a minimum. Some companies introducing software into their systems for the first time do not take software engineering as seriously as they should. Basic software-engineering principles that apparently were violated with the Therac-25 include:

• Documentation should not be an afterthought.
• Software quality assurance practices and standards should be established.
• Designs should be kept simple.
• Ways to get information about errors - for example, software audit trails - should be designed into the software from the beginning.
• The software should be subjected to extensive testing and formal analysis at the module and software level; system testing alone is not adequate.

In addition, special safety-analysis and design procedures must be incorporated into safety-critical software projects.
to be the cause, even t hough the engi­ software) that are difficult to quantify Safety must be built into software. and.
neers were unable to reproduce the fail­ but which may have a larger impact on in addition. safety must be assured at
ure or find anything wrong with the safety than the quantifiable factors that the sy'item level despite software er­
microswitch. are included. rors.'I I" The Therac-20 contained the
Patient reactions were the only real Although management and regulato­ same software error implicated in the
indications of the seriousness of the prob­ ry agencies often press engineers to Tyler deaths, but the machine included
lems with the Therac-25. There were no obtain such numbers, engineers should hardware interlocks that mitigated its
independent checks that thc software insist that any risk assessment n umbers consequences. Protection against soft­
was operating correctly (including soft­ used are in fact meaningful and that ware error, can also be built into t h e
ware checks). Such verification cannot statistics of this sort are treated with software itself.
be assigned to operators without pro­ caution. In our enthusiasm to provide Furthermore. important lessons about
viding them with some means of detect­ measurements. we should not attempt software reuse can be found here. A
ing errors. The Therac-25 software "lied" to measure the unmeasurable. William naivc assumption is often made that
to the operators. and the machine itself Ruckelshaus, two-time head of the US reusing software or using commercial
could not detect that a massive over­ Environmental Protection Agency, cau­ off-the-shelf software increases safety
dose had occurred. The Therac-25 ion tioned t h at '"risk assessment data can be because the software has been exer­
chambers could not handle the high like the captured spy: if you torture it cised extensively. Re using software
density of ionization from the unscanned long enough. it wil l tcll you anything mod ules does not guarantee safety in
electron beam at high-beam current; you want to know. "7 E.A. Ryder of the the new system to which they are trans­
they thus became satnrated and gave an British Health and Safety Executive has ferred and sometimes leads to awkward
indication of a low dosage. Engineers writtcn that the numbers game in risk and dangerous designs. Safety is a qual­
need to design for the worst case . assessment "should only he played in ity ofthe system in which the software is
Every company building safety-criti­ private hetween consenting adults. as it used: it is not a quality of thc software
cal systems should have audit trails and is too easy to be misinterpreted."x itself. Rewriting the entire software to

get a clean and simple design may be safer in many cases.

Taking a couple of programming courses or programming a home computer does not qualify anyone to produce safety-critical software. Although certification of software engineers is not yet required, more events like those associated with the Therac-25 will make such certification inevitable. There is activity in Britain to specify required courses for those working on critical software. Any engineer is not automatically qualified to be a software engineer - an extensive program of study and experience is required. Safety-critical software engineering requires training and experience in addition to that required for noncritical software.

Although the user interface of the Therac-25 has attracted a lot of attention, it was really a side issue in the accidents. Certainly, it could have been improved, like many other aspects of this software. Either software engineers need better training in interface design, or more input is needed from human factors engineers. There also needs to be greater recognition of potential conflicts between user-friendly interfaces and safety. One goal of interface design is to make the interface as easy as possible for the operator to use. But in the Therac-25, some design features (for example, not requiring the operator to reenter patient prescriptions after mistakes) and later changes (allowing a carriage return to indicate that information has been entered correctly) enhanced usability at the expense of safety.

Finally, not only must safety be considered in the initial design of the software and its operator interface, but the reasons for design decisions should be recorded so that decisions are not inadvertently undone in future modifications.

User and government oversight and standards. Once the FDA got involved in the Therac-25, their response was impressive, especially considering how little experience they had with similar problems in computerized medical devices. Since the Therac-25 events, the FDA has moved to improve the reporting system and to augment their procedures and guidelines to include software. The problem of deciding when to forbid the use of medical devices that are also saving lives has no simple answer and involves ethical and political issues that cannot be answered by science or engineering alone. However, at the least, better procedures are certainly required for reporting problems to the FDA and to users.

The issues involved in regulation of risky technology are complex. Overly strict standards can inhibit progress, require techniques behind the state of the art, and transfer responsibility from the manufacturer to the government. The fixing of responsibility requires a delicate balance. Someone must represent the public's needs, which may be subsumed by a company's desire for profits. On the other hand, standards can have the undesirable effect of limiting the safety efforts and investment of companies that feel their legal and moral responsibilities are fulfilled if they follow the standards.

Some of the most effective standards and efforts for safety come from users. Manufacturers have more incentive to satisfy customers than to satisfy government agencies. The American Association of Physicists in Medicine established a task group to work on problems associated with computers in radiation therapy in 1979, long before the Therac-25 problems began. The accidents intensified these efforts, and the association is developing user-written standards. A report by J.A. Rawlinson of the Ontario Cancer Institute attempted to define the physicist's role in assuring adequate safety in medical accelerators:

    We could continue our traditional role, which has been to provide input to the manufacturer on safety issues but to leave the major safety design decisions to the manufacturer. We can provide this input through a number of mechanisms . . . These include participation in standards organizations such as the IEC [International Electrotechnical Commission], in professional association groups . . . and in accelerator user groups such as the Therac-25 user group. It includes also making use of the Problem Reporting Program for Radiation Therapy Devices and it includes consultation in the drafting of the government safety regulations. Each of these if pursued vigorously will go a long way to improving safety. It is debatable however whether these actions would be sufficient to prevent a future series of accidents.

    Perhaps what is needed in addition is a mechanism by which the safety of any new model of accelerator is assessed independently of the manufacturer. This task could be done by the individual physicist at the time of acceptance of a new machine. Indeed many users already test at least the operation of safety interlocks during commissioning. Few however have the time or resources to conduct a comprehensive assessment of safety design.

    A more effective approach might be to require that prior to the use of a new type of accelerator in a particular jurisdiction, an independent safety analysis is made by a panel (including but not limited to medical physicists). Such a panel could be established within or without a regulatory framework.1

It is clear that users need to be involved. It was users who found the problems with the Therac-25 and forced AECL to respond. The process of fixing the Therac-25 was user driven - the manufacturer was slow to respond. The Therac-25 user group meetings were, according to participants, important to the resolution of the problems. But if users are to be involved, then they must be provided with information and the ability to perform this function. Manufacturers need to understand that the adversarial approach and the attempt to keep government agencies and users in the dark about problems will not be to their benefit in the long run.

The US Air Force has one of the most extensive programs to inform users. Contractors who build space systems for the Air Force must provide an Accident Risk Assessment Report (AFAR) to system users and operators that describes the hazardous subsystems and operations associated with that system and its interfaces. The AFAR also comprehensively identifies and evaluates the system's accident risks; provides a means of substantiating compliance with safety requirements; summarizes all system-safety analyses and testing performed on each system and subsystem; and identifies design and operating limits to be imposed on system components to preclude or minimize accidents that could cause injury or damage.

An interesting requirement in the Air Force AFAR is a record of all safety-related failures or accidents associated with system acceptance, test, and checkout, along with an assessment of the impact on flight and ground safety and action taken to prevent recurrence. The AFAR also must address failures, accidents, or incidents from previous missions of this system or other systems using similar hardware. All corrective action taken to prevent recurrence must be documented. The accident and correction history must be updated throughout the life of the system. If any design or operating parameters change after government approval, the AFAR must be updated to include all changes affecting safety.

Unfortunately, the Air Force program is not practical for commercial systems. However, government agencies might require manufacturers to provide similar information to users. If required for everyone, competitive pressures to withhold information might be lessened. Manufacturers might find that providing such information actually increases customer loyalty and confidence. An emphasis on safety can be turned into a competitive advantage.

Most previous accounts of the Therac-25 accidents blamed them on a software error and stopped there. This is not very useful and, in fact, can be misleading and dangerous: If we are to prevent such accidents in the future, we must dig deeper. Most accidents involving complex technology are caused by a combination of organizational, managerial, technical, and, sometimes, sociological or political factors. Preventing accidents requires paying attention to all the root causes, not just the precipitating event in a particular circumstance.

Accidents are unlikely to occur in exactly the same way again. If we patch only the symptoms and ignore the deeper underlying causes or we fix only the specific cause of one accident, we are unlikely to prevent or mitigate future accidents. The series of accidents involving the Therac-25 is a good example of exactly this problem: Fixing each individual software flaw as it was found did not solve the device's safety problems. Virtually all complex software will behave in an unexpected or undesired fashion under some conditions - there will always be another bug. Instead, accidents must be understood with respect to the complex factors involved. In addition, changes need to be made to eliminate or reduce the underlying causes and contributing factors that increase the likelihood of accidents or loss resulting from them.

Although these accidents occurred in software controlling medical devices, the lessons apply to all types of systems where computers control dangerous devices. In our experience, the same types of mistakes are being made in nonmedical systems. We must learn from our mistakes so we do not repeat them.

Acknowledgments

Ed Miller of the FDA was especially helpful, both in providing information to be included in this article and in reviewing and commenting on the final version. Gordon Symonds of the Canadian Government Health Protection Branch also reviewed and commented on a draft of the article. Finally, the referees, several of whom were apparently intimately involved in some of the accidents, were also very helpful in providing additional information about the accidents.

References

The information in this article was gathered from official FDA documents and internal memos, lawsuit depositions, letters, and various other sources that are not publicly available. Computer does not provide references to documents that are unavailable to the public.

1. J.A. Rawlinson, "Report on the Therac-25," OCTRF/OCI Physicists Meeting, Kingston, Ont., Canada, May 7, 1987.
2. F. Houston, "What Do the Simple Folk Do?: Software Safety in the Cottage Industry," IEEE Computers in Medicine Conf., 1985.
3. C.A. Bowsher, "Medical Devices: The Public Health at Risk," US Gov't Accounting Office Report GAO/T-PEMD-90-2, 046987/139922, 1990.
4. M. Kivel, ed., Radiological Health Bulletin, Vol. XX, No. 8, US Federal Food and Drug Administration, Dec. 1986.
5. Medical Device Recalls, Examination of Selected Cases, GAO/PEMD-90-6, 1989.
6. E. Miller, "The Therac-25 Experience," Proc. Conf. State Radiation Control Program Directors, 1987.
7. W.D. Ruckelshaus, "Risk in a Free Society," Risk Analysis, Vol. 4, No. 3, 1984, pp. 157-162.
8. E.A. Ryder, "The Control of Major Hazards: The Advisory Committee's Third and Final Report," Transcript of Conf. European Major Hazards, Oyez Scientific and Technical Services and Authors, London, 1984.
9. N.G. Leveson, "Software Safety: Why, What, and How," ACM Computing Surveys, Vol. 18, No. 2, June 1986, pp. 25-69.
10. N.G. Leveson, "Software Safety in Embedded Computer Systems," Comm. ACM, Feb. 1991, pp. 34-46.

Nancy G. Leveson is Boeing professor of Computer Science and Engineering at the University of Washington. Previously, she was a professor in the Information and Computer Science Department at the University of California, Irvine. Her research interests are software safety and reliability, including software hazard analysis, requirements specification and analysis, design for safety, and verification of safety. She consults worldwide for industry and government on safety-critical systems.

Leveson received a BA in mathematics, an MS in operations research, and a PhD in computer science, all from the University of California at Los Angeles. She is the editor in chief of IEEE Transactions on Software Engineering and a member of the board of directors of the Computing Research Association.

Clark S. Turner is seeking his PhD in the Information and Computer Science Department at the University of California, Irvine, studying under Nancy Leveson. He is also an attorney admitted to practice in California, New York, and Massachusetts. His interests include risk analysis of safety-critical software systems and legal liability issues involving unsafe software systems.

Turner received a BS in mathematics from King's College in Pennsylvania, an MA in mathematics from Pennsylvania State University, a JD from the University of Maine, and an MS in computer science from the University of California, Irvine.

Readers can contact Leveson at the Department of Computer Science and Engineering, FR-35, University of Washington, Seattle, WA 98195, e-mail leveson@cs.washington.edu; or Turner at the Information and Computer Science Department, University of California, Irvine, Irvine, CA 92717, e-mail turner@ics.uci.edu.
