
Speech recognition: Evaluation, implementation, and use


Speech recognition can markedly reduce turnaround time. To encourage radiologists to
use this technology, it is critical to recognize and address some of its shortcomings.

David L. Weiss, MD

Dr. Weiss is the Clinical Section Head for Imaging Informatics at Geisinger Health System, Danville, PA. He is also a member of the editorial board of this journal.

Currently, there are a handful of ways to create a radiology report. For decades, the standard has simply been transcription, coupled in more recent years with digital dictation. Another option, structured reporting, is used by many radiologists in mammography but is not widespread in general radiology. This article will focus on speech recognition, which is being adopted by more and more radiology departments, in both academic medical centers and private practice.

One of the purported advantages of speech recognition is cost savings. In reality, a speech recognition system may, at least in part, shift costs rather than save them. This is because instead of paying a transcriptionist, a radiologist spends time editing and typing. However, without question, improved turnaround time is an advantage of speech recognition, as has been documented in a number of studies.1,2

One of the disadvantages of speech recognition is a time penalty. A majority of radiologists report spending more time creating and finalizing reports using speech recognition as compared with conventional dictation, with a resulting decrease in overall radiologist productivity.3,4 However, the most serious problem with speech recognition is its potential to distract the radiologist from viewing images. If the radiologist’s eyes are on the dictation screen rather than on images, the risk of error increases.5

Acceptance of speech recognition by radiologists has been complicated by a misalignment of incentives. Radiologists only indirectly benefit from the advantages of speech recognition. If cost savings help the department to stay under budget, radiologists might receive a bonus, but it will be slow in coming and will be shared by many others. Similarly, improved turnaround time will help to achieve departmental goals, but diagnostic accuracy and productivity are more important to most radiologists.

On the other hand, the disadvantages of speech recognition fall directly on radiologists, as they suffer the potential for a time penalty, productivity decrease, and distraction from image viewing. Administrators and information technology staff must pay attention to this misalignment of incentives and find a solution to it if they want radiologists to adopt speech recognition.

Case studies

The following case studies illustrate both the success and, in some ways, the failure of speech recognition.

Community hospital

The first case study involves Chestnut Hill Hospital, a small community hospital in Philadelphia, PA, with a general radiology practice of 4 to 5 radiologists reading 100,000 examinations annually. In 1998, we installed speech recognition, shortly after hardware and software advances made it practical for radiology use. The next year we installed a picture archiving and communications system (PACS) and, in early 2000, we integrated the 2 systems.

All of the radiologists agreed to implement speech recognition. We discontinued using a transcriptionist after about a week. What followed was 4 weeks of difficulty. First, we ordered the wrong microphones for data entry (ones without a bar coder). We weren’t proficient in using the product as it was designed. Macro techniques were still in their infancy. The navigation controllers we use today were not available. And, initially, there was no integration between the PACS and speech recognition.

Nonetheless, report turnaround time decreased from approximately 72 hours to 20 to 24 hours immediately after implementation of speech recognition (Figure 1). There was no further significant reduction in turnaround time during the first 6 months after installation of the PACS. However, after we streamlined workflow to make the best use of

24 ■ SUPPLEMENT TO APPLIED RADIOLOGY ©


www.appliedradiology.com December 2008
SPEECH RECOGNITION

the PACS, we did see a major drop in turnaround time to an average of <4 hours for all studies.

A major change in workflow involved the radiologists’ work schedules. We were accustomed to leaving the hospital at 5 pm each day, when the film library stopped distributing hardcopy studies to be read. Several months after installation of PACS we realized that, since PACS produced images hour after hour, we could reconfigure our work schedule. All but one radiologist began to leave the hospital at 4 pm, while the on-call radiologist stayed until 7 pm. This new schedule consisted of the same total number of work hours, but resulted in much better turnaround time.

FIGURE 1. This graph illustrates average report turnaround time at Chestnut Hill Hospital. Turnaround time decreased from approximately 72 hours to 20 to 24 hours immediately after the implementation of speech recognition [SR]. There was no significant reduction in turnaround time during the first 6 months after the installation of a picture archiving and communications system (PACS). However, after streamlining of workflow (mature PACS), average turnaround time dropped to <4 hours.

Large medical center

At Geisinger Medical Center, we have digital dictation and use structured reporting for mammography. We installed speech recognition in the third quarter of 2004, and radiologists were encouraged, rather than given a mandate, to use it. Therefore, at each workstation we still have both a speech recognition microphone and software, and a conventional digital dictation system.

An interesting pattern developed. After 6 months, half of our radiologists were using speech recognition 80% to 90% of the time, and the other half were using it ≤30% of the time (Figure 2). This was a bit surprising. It is more common to see a bell-shaped usage curve, with most of the radiologists accepting speech recognition, a handful really embracing it, and a handful really struggling with it.

What happened at Geisinger? We did get buy-in from the radiologists initially, but we may have had unrealistic expectations of accuracy that led to frustration with the product after implementation. A training session with a trainer who knows the software inside and out (and software that has become highly accurate in recognizing that trainer’s speech) is far different from a new user’s initial experience. In addition to problems with the speech engine, we also had some disruption attributable to the PACS and the radiology information system (RIS).

FIGURE 2. This graph depicts radiologist use of speech recognition at Geisinger Medical Center 6 months after installation. Approximately half of the radiologists used speech recognition 80% to 90% of the time, and the other half used it ≤30% of the time.

Table 1. Speech recognition implementation strategies
    Provide users with incentives
    Set realistic expectations
    Plan for a drop in productivity
    Provide training and ongoing support
    Set a deadline for transition
    Consider using a hybrid model
        Dictate, edit, and sign
        Transcriptionist correction
    Maximize efficiency


FIGURE 3. (A through D) These screens illustrate modification of a macro for a CT scan of the abdomen and pelvis to insert information on an
abdominal aortic aneurysm. The radiologist is able to do all of these text edits verbally.

These were not the real reasons speech recognition was not widely used at Geisinger, however. The main problem was that we did not communicate a consistent message that the department would be adopting this technology. We also had no defined endpoint for eliminating digital dictation and have continued to support 2 separate reporting systems.

Ensuring success

Table 1 outlines some of the steps that can be taken to ensure adoption of speech recognition. First, provide meaningful incentives to users. The incentives could include bonuses, extra time off, or a reduction in productivity requirements for radiologists who use speech recognition. Don’t allow use of the system to be voluntary.

Have realistic expectations and plan for a drop in productivity. Most studies show at least a 10% reduction in radiologist productivity after implementation of speech recognition. There are some exceptions, however. For example, at the community hospital profiled earlier, my colleagues and I all felt strongly that we were more efficient when using the integrated PACS-speech recognition product.

The University of Pittsburgh showed at least a time-neutral effect after installation of speech recognition using a hybrid transcriptionist-radiologist editing process and some modifications in workflow.6 Massachusetts General Hospital (MGH) recently did a productivity study of PACS and speech recognition in collaboration with New York University. They were able to show that at MGH the use of PACS had a positive impact on radiologist productivity, while speech recognition had no statistically significant effect on productivity.7

Plan for training and ongoing support. Even experienced users will have problems now and then. When new radiologists join the department or locum tenens radiologists arrive, they will need support on the system.

Set a deadline for the removal of conventional transcription and transition to speech recognition.

Consider using a hybrid model for report editing. One way to use speech recognition is to have radiologists dictate, edit, correct, and sign their


own reports. Another way is to have radiologists dictate reports, and then use “back-end” transcription for editing. In this process a transcriptionist listens to an audio file while viewing the text and making corrections. Although this approach relieves radiologists of clerical work, it reduces cost savings and results in variable turnaround times. Consider combining these 2 approaches, so that radiologists who are proficient with speech recognition can work independently and those who are struggling or in a time bind can send reports to back-end transcription.

Christiana Care Health System in Wilmington, DE, provides an example of a hybrid model for report editing. All radiologists use speech recognition but are given the choice of self-editing or back-end transcription. Approximately half of the radiologists choose to use self-editing 80% to 90% of the time, and about half of the radiologists send their reports to back-end transcription.8

Take steps to maximize efficiency. With speech recognition, efficiency is determined by accuracy, navigation, integration, and macro use. To maximize accuracy, it is best to use a headset microphone. This will improve accuracy by standardizing the distance and the position of the microphone in relation to the mouth. With a hand-held microphone, approximately half of the errors can be attributed to not holding the microphone in the correct position.

Make corrections properly, either using the correction mode or, in the case of some speech recognition systems, the vocabulary editor. This is a way of training the system to learn each user’s specific speech patterns.

Pay close attention to ambient noise. It may be helpful to have the walls of the reading room covered in acoustic paneling and to have acoustic tile or carpeting installed on the floor. If ambient noise remains a problem, consider putting in a fan or some device that generates “white noise.”

Speech recognition requires not just dictation but navigation through the text and various screens of the speech recognition system, as well as simultaneous navigation through the PACS. It is important to consider how to make this process seamless through use of a mouse, a microphone with programmable buttons, or some other navigation aid.

Integration is another critical factor in efficiency. It is fairly easy to use a stand-alone speech recognition system. It is much more difficult to achieve interoperability between the speech recognition system and the PACS or RIS. The interface with the RIS needs to be bidirectional, enabling the accession number to pass from the RIS to the speech recognition system, and in the other direction, for text and any other information to pass back to the RIS.

Even simple integration algorithms can eliminate unnecessary tasks. Without them, it is necessary to first open a case in the PACS, then open dictation in speech recognition, and then add demographic information. Integration enables distillation of the workflow into simply viewing images, dictating, and signing the report. The case is closed automatically, the next case opens automatically, and the radiologist views the images, dictates, and signs the report.

Macros and templates are essentially canned reports that the radiologist can pull up anytime and modify. With speech recognition, the use of macros magnifies the time savings. Not only does it save time in dictation, it saves time in proofreading.

Figure 3 shows the modification of a macro for a CT scan of the abdomen and pelvis. The macro states: “Aorta is normal in size.” To modify that text, the user verbally selects the sentence and substitutes: “There is an aneurysm of the lower abdominal aorta measuring 4.8 centimeter in diameter.” The user also changes the impression from: “No significant abnormality in CT scan of abdomen and pelvis” to “Abdominal aortic aneurysm. No other significant abnormality….”

All of the radiologist’s edits are done verbally, without the eyes ever leaving the images. Not only is this method much faster than dictating an entire report, the radiologist has only 2 phrases to proofread.

Conclusion

Effective use of speech recognition can yield major improvements in report turnaround time. This technology is not always well accepted by radiologists, however, in part because it can reduce productivity, at least initially.

To encourage radiologists to adopt speech recognition, it is essential to offer meaningful incentives. It is also helpful to identify departmental champions who can generate excitement about the new technology, to set a firm date for discontinuing conventional transcription, and to maximize efficiency by improving accuracy, streamlining navigation, integrating speech recognition with the PACS and RIS, and taking advantage of macro functionality.

REFERENCES

1. Mehta A, Dreyer KJ, Schweitzer A, et al. Voice recognition--An emerging necessity within radiology: Experiences of the Massachusetts General Hospital. J Digit Imaging. 1998;11(4 Suppl 2):20-23.
2. Ramaswamy MR, Chaljub G, Esch O, et al. Continuous speech recognition in MR imaging reporting: Advantages, disadvantages, and impact. AJR Am J Roentgenol. 2000;174:617-622.
3. Gale B, Safriel Y, Lukban A, et al. Radiology report production times: Voice recognition vs. transcription. Radiol Manage. 2001;23:18-22. Comment in: Radiol Manage. 2001;23:23-25; discussion 26-27.
4. Hayt DB, Alexander S. The pros and cons of implementing PACS and speech recognition systems. J Digit Imaging. 2001;14:149-157.
5. Atkins MS, Moise A, Rohling R. An application of eyegaze tracking for designing radiologists’ workstations: Insights for comparative visual search tasks. ACM Trans Appl Percep. 2006;3:136-151.
6. Crane K, Branstetter BF, Chang PJ. Does radiologist efficiency have to suffer with speech recognition? Presented at the 91st Scientific Assembly and Annual Meeting of the Radiological Society of North America. Chicago, IL; November 27-December 2, 2005.
7. Halpern EF, Sack D, Kirpekar N, et al. Impact of medical imaging informatics on radiologist productivity. Presented at the 93rd Scientific Assembly and Annual Meeting of the Radiological Society of North America. Chicago, IL; November 25-30, 2007.
8. Stillman P, Garrett RE, Cooper SA. Quality improvement project to decrease inpatient radiology turnaround time: Experience at Christiana Care Health System. Prescrip Excell Health Care. 2008;1(2):9-11.

For a roundtable discussion of this article, visit http://www.appliedradiology.com/informatics.
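
Supplementary sketch. The article's discussion of integration describes a bidirectional interface: the accession number passes from the RIS to the speech recognition system, the signed text passes back, and the next case opens automatically. A minimal Python sketch of that loop might look like the following. Everything in it is hypothetical: the `Study` and `RIS` classes, the `read_worklist` function, and the sample accession numbers are invented for illustration and do not represent any vendor's interface or the systems used at the hospitals discussed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Study:
    """One examination on the worklist (hypothetical structure)."""
    accession_number: str
    exam: str
    report_text: Optional[str] = None
    signed: bool = False


class RIS:
    """Hypothetical stand-in for a radiology information system."""

    def __init__(self, studies: List[Study]) -> None:
        # Key each study by accession number, the identifier shared by both systems.
        self._studies: Dict[str, Study] = {s.accession_number: s for s in studies}

    def next_unsigned(self) -> Optional[Study]:
        # RIS -> speech recognition: hand over the next case, accession number included.
        for study in self._studies.values():
            if not study.signed:
                return study
        return None

    def store_report(self, accession_number: str, text: str) -> None:
        # Speech recognition -> RIS: the signed report text flows back.
        study = self._studies[accession_number]
        study.report_text = text
        study.signed = True


def read_worklist(ris: RIS, dictate: Callable[[Study], str]) -> int:
    """Integrated loop: each case opens and closes automatically, so the
    radiologist's only manual steps are viewing, dictating, and signing."""
    signed = 0
    study = ris.next_unsigned()
    while study is not None:
        text = dictate(study)  # stands in for viewing the images and dictating
        ris.store_report(study.accession_number, text)  # sign and close the case
        signed += 1
        study = ris.next_unsigned()  # next case opens without manual lookup
    return signed


# Demo with invented accession numbers and a canned "dictation".
demo_ris = RIS([
    Study("ACC-001", "CT abdomen and pelvis"),
    Study("ACC-002", "Chest radiograph"),
])
signed_count = read_worklist(demo_ris, lambda s: f"{s.exam}: no significant abnormality.")
print(signed_count)
```

In this sketch the accession number is the only key shared by the two sides, which is why the article stresses that it must pass reliably in both directions. A real interface would typically rely on the vendor's own messaging rather than in-memory objects.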
