Speech Sound Disorders in Children:

Low-Cost Speech Analysis Software as a Clinical Biofeedback Tool


Richard McGuire, Saint Louis University
Tracey Lorang, University of South Dakota, Scottish Rite Children's Clinic
Julie Hoffmann, Saint Louis University

With the regular availability of computers containing integrated high-quality
digital sound capabilities, computer programmers have launched collaborative
initiatives to develop software that exploits the digital sound capabilities
of personal computers. One area of this collaborative software development is related to
the study of linguistics, more specifically, the analysis of speech. Although some of this
development has led to commercial speech analysis products, much of it remains a free or
low-cost alternative to expensive analysis systems. In the past, acoustic analysis of
speech involved sophisticated and expensive data acquisition hardware and software that
prohibited most clinicians from considering it as a therapy tool. Well, as Bob
Dylan once said, "The times they are a-changin'."
With most clinicians having access to computers and the availability of low- or
no-cost acoustic analysis software, there is little reason why the acquisition, analysis, and
interpretation of acoustic aspects of speech should not be routinely incorporated into the
clinical skill set of practicing clinicians. With adequate education, modeling, and support,
the application of speech analysis to clinical practice can lead to effective clinical
management.
There are a number of low-cost or no-cost software packages for the acoustic
analysis of speech available from a variety of commercial (see footnote 1) and
non-commercial sources for both the Windows and Macintosh platforms. There is also a wide
range of acquisition and analysis options available across these software programs,
resulting in varied levels of usability, utility, and effectiveness for teaching, research,
and clinical purposes. The intent of this handout is to provide an overview of five of the
most useful and popular acoustic analysis software packages that are either free or
available for a cost of less than

1
Computerized Speech Lab (CSL) - http://www.kayelemetrics.com
Dr. Speech 4 - http://www.drspeech.com/List_New.html
Computerized Speech Research Environment (CSRE) - http://www.avaaz.com/researchresources/csre.htm
McGuire, Lorang, & Hoffmann 2 of 8
ASHA Convention 2006

$50. This commentary is based on the personal experiences and preferences of the
authors and is not intended to be a systematic evaluation or comparison of the computer
programs presented. Further, the focus of this review is limited to analysis programs
available for a Windows platform.

LOW-COST SPEECH ANALYSIS SOFTWARE

Speech Filing System


The Speech Filing System (SFS) is a free software suite that is available from
University College London at http://www.phon.ucl.ac.uk/resource/software.html. In its
entirety, SFS is a powerful tool that is updated and improved on a regular basis. This
software enables users to perform a variety of involved speech signal processing,
synthesis, and recognition activities. This complex suite comprises over a
dozen individual programs that are useful for speech analysis and clinical biofeedback.
Some of these programs can be integrated with each other to provide extremely powerful
tools. The use of the full SFS software suite can be challenging for those with little
computing experience, although the suite's basic speech capture and analysis program,
Waveforms Annotations Spectrograms and Pitch (WASP), is manageable for
the novice user. Given the clinical speech analysis focus of this paper, only the WASP
program of the SFS suite is presented here.
WASP is a simple and easy-to-use program for recording, displaying, and
analyzing speech on personal computers. With WASP one can record and replay speech
signals, save them and reload them from disk, edit annotations, and display spectrograms
and fundamental frequency tracks. It takes up very little disk space and is ideal for
the technologically challenged clinician to ease into computer-based acoustic analysis.
WaveSurfer
WaveSurfer is a free software program for sound (speech) visualization and
manipulation developed at the Centre for Speech Technology (CTT) at KTH (the Royal
Institute of Technology) in Stockholm, Sweden. WaveSurfer is available from
http://www.speech.kth.se/wavesurfer/. This software can be used to visualize and analyze
sound in several ways, including waveform, spectrogram, and pitch-tracking
displays. Additionally, several supplementary displays are available, such as a
spectrum window and a zoomed waveform window, which are useful for more detailed
inspection and adjustment of the speech signal.
WaveSurfer has a simple and logical user interface that provides analysis options
in an intuitive manner. In its basic configuration, it is embraced by novice users, yet it
is suitable for more advanced users through the addition of program plug-ins. Like WASP, it
is an effective program for clinicians to employ computer-based speech analysis in their
clinical practice.
SIL Speech Tools
Speech Tools is a low-cost ($19.95) suite of speech analysis software that has
been developed by SIL International (formerly known as the Summer Institute of
Linguistics) and is available from http://www.sil.org/computing/speechtools/. Although
all of the individual programs in Speech Tools (Speech Analyzer, Speech Manager, and
IPA Help) may be useful in acoustic phonetic instruction, only Speech Analyzer is
presented here.
Like WASP and WaveSurfer, Speech Analyzer is relatively easy to use, yet it has
more display, playback, and analysis functions. More specifically, Speech Analyzer
enables users to vary playback speed and to view speech input as a waveform, pitch plot,
spectrogram (grey scale and color), spectrum, and various F1 vs. F2 displays. The pitch
tracker can easily be constrained to a particular frequency range, which is a desirable
feature because pitch ranges vary with age and gender. Clinicians tend to
embrace this analysis program as it gives them a wider range of analysis options while
remaining easy to use.
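To illustrate why constraining the pitch tracker to a frequency range matters, the following is a minimal sketch of a generic autocorrelation pitch estimator. It is not the algorithm used by Speech Analyzer or any other program in this handout; the function name, the default search range of 75 to 500 Hz, and the synthetic 200 Hz test tone are assumptions chosen only for illustration.

```python
import math


def estimate_pitch(samples, sample_rate, f_min=75.0, f_max=500.0):
    """Estimate the fundamental frequency (Hz) of one frame by autocorrelation.

    Only lags whose period corresponds to a frequency between f_min and
    f_max are searched, so the range acts as the tracker's constraint.
    The default range is an illustrative assumption, not a clinical norm.
    """
    lag_min = int(sample_rate / f_max)   # shortest period considered
    lag_max = int(sample_rate / f_min)   # longest period considered
    if lag_max >= len(samples):
        raise ValueError("frame too short for the requested f_min")
    best_lag, best_score = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


# Hypothetical test signal: 0.1 s of a pure 200 Hz tone sampled at 8000 Hz.
rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / rate) for n in range(800)]
print(estimate_pitch(tone, rate, 75.0, 500.0))  # range includes 200 Hz
print(estimate_pitch(tone, rate, 75.0, 150.0))  # range excludes 200 Hz
```

With a search range that includes the true pitch, the estimate lands at about 200 Hz. With a range that excludes it, the same tone is reported about an octave too low, which is the kind of error a well-chosen range helps avoid.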
PRAAT
PRAAT (Dutch for "talk") is a very popular free computer-based phonetic science
environment that has been developed at the Institute of Phonetic Sciences at the
University of Amsterdam and is available from http://www.fon.hum.uva.nl/praat/. This
program is widely used and supported, with PRAAT developers and enthusiasts
contributing extra program routines to this ever-growing phonetic science
program/environment. Although PRAAT is used for complex kinds of analysis by
advanced-level phonetic science researchers who are adept at computer programming
(scripting), its relatively basic functions, such as waveforms, spectrograms, and pitch
tracking, can be used by individuals with less technical expertise.
The main speech analysis features of PRAAT include waveform, spectral
(including FFTs and spectrograms), formant, intensity, pitch, and voice (including jitter,
shimmer, and additive noise) analyses. The mastery of these features gives clinicians
speech analysis competencies that can easily be transferred from initial learning
activities and experiences to sophisticated clinical and research applications using the
same software program. If clinicians embrace PRAAT, they can continue to grow in their
mastery of this program and always have free access to the most current full version of
this effective speech analysis tool.
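For readers unfamiliar with the voice measures named above, the sketch below shows the commonly used "local" definitions of jitter and shimmer: the mean absolute difference between consecutive cycles divided by the mean value, applied to cycle durations for jitter and to cycle peak amplitudes for shimmer. It is an illustration only and does not reproduce how PRAAT detects cycles or which cycles it excludes; the function name and the sample values are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive values, divided by
    the mean of all values. Applied to cycle durations this gives local
    jitter; applied to cycle peak amplitudes it gives local shimmer."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


# Hypothetical measurements from four consecutive glottal cycles.
periods = [0.0100, 0.0102, 0.0099, 0.0101]   # cycle durations in seconds
amplitudes = [0.80, 0.78, 0.82, 0.79]        # peak amplitudes (arbitrary units)

print(f"jitter (local):  {100 * local_perturbation(periods):.2f} %")
print(f"shimmer (local): {100 * local_perturbation(amplitudes):.2f} %")
```

A perfectly regular voice would give 0 % on both measures; larger values indicate more cycle-to-cycle variation in timing or loudness.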
Although PRAAT's basic speech analysis functions are ideal for clinical
application, the complexity of this program can be daunting to many clinicians. The
learning curve of PRAAT is steeper than that of the other analysis programs previously
mentioned, but the investment of time and effort to master its features enables
long-term use of this program.
Other free or low-cost speech analysis software programs are available, e.g.,
PCquirer (a demo version is available from http://sciconrd.com/multi.htm),
Sound Software Spectrograph (a free spectrogram display program available from
http://www.sonicspot.com/soundsoftwarespectrograph/soundsoftwarespectrograph.html),
WinSpec32 (a feature-limited shareware version available from
http://www.sonicspot.com/winspec/winspec.html), Spectrogram (a demo available from
http://www.visualizationsoftware.com/gram/gramdl.html), and Wave Tools (a shareware
version available from http://www.sonicspot.com/wavetools/wavetools.html). However, their
cost, limited features, utility, and/or the authors' lack of familiarity with them prevent
them from being included in this handout.
Although these programs have been effectively employed in university
instruction, research, and clinical management of speech, the quality of capture,
resolution of displays and analyses, and quality of measurements vary as a function of the
software programs and computers (sound cards and microphones) employed. More
specifically, the utility of any of these programs for sophisticated analysis and
measurement is directly related to the quality of microphone and sound card used to
capture the speech signal.
Clinicians must also understand the need to be critical in the application of
computer-based speech analysis in their clinical practice and research. They must learn
to consider routinely the limitations of the specific hardware and software they are using
as well as the reliability and validity of their speech analyses and measurements by always
relating computer-based feedback to their own professional observations and judgments.

MICROPHONES AND SOUND CARDS

Microphones
There are basically two kinds of microphones, dynamic and condenser. Dynamic
microphones act like speakers in reverse, generating a small amount of electricity
when the diaphragm of the microphone moves back and forth under the pressure of the
sound waves hitting it. Condenser microphones are powered by electricity and are more
sensitive than dynamic microphones. Additionally, they use a lightweight diaphragm that
is better at picking up nuances of sound.
Another consideration related to microphones is their pickup pattern. This refers
to the relative sensitivity of a microphone as a function of the direction of the sound it
is sensing. Two popular pickup patterns are omnidirectional and unidirectional.
Omnidirectional microphones pick up sound equally well from all directions, while
unidirectional microphones mostly pick up sound from one direction. Generally,
unidirectional microphones are preferred for recording speech because the microphone can be
aimed at the sound source (the client) and therefore senses less environmental (room)
noise. You can learn more about microphones at www.homerecording.com/mics.html.
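The difference between the two pickup patterns can be made concrete with the standard first-order microphone pattern, in which relative sensitivity at angle θ is a + (1 - a)·cos θ. The sketch below assumes a = 1 for an omnidirectional microphone and a = 0.5 for a cardioid, which is one common kind of unidirectional microphone; real microphones only approximate these idealized curves, and the function name is hypothetical.

```python
import math


def relative_sensitivity(angle_deg, a):
    """Idealized first-order pickup pattern: |a + (1 - a) * cos(angle)|.

    a = 1.0 models an omnidirectional microphone and a = 0.5 a cardioid
    (a common unidirectional pattern). An angle of 0 degrees means the
    sound arrives from directly in front of the microphone.
    """
    return abs(a + (1.0 - a) * math.cos(math.radians(angle_deg)))


# Compare sensitivity to sound from the front, the side, and the rear.
for angle in (0, 90, 180):
    omni = relative_sensitivity(angle, 1.0)
    cardioid = relative_sensitivity(angle, 0.5)
    print(f"{angle:3d} deg   omni = {omni:.2f}   cardioid = {cardioid:.2f}")
```

In this idealized model the omnidirectional microphone responds equally at every angle, while the cardioid responds fully to the front, at half strength to the side, and hardly at all to the rear, which is why aiming it at the client reduces room noise.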
Although you can buy an inexpensive microphone to meet your basic audio capture
needs, you are encouraged to consider a professional quality microphone if you
need/desire a high quality capture. Although professional microphones can cost several
hundred dollars, a good professional-quality microphone for recording speech (such as
the Shure SM58 - www.shure.com) can be purchased for $70 to $100. For about the same
amount you could also purchase a USB microphone, which may be the best choice for
high-quality recording of speech (discussed further below). You can also purchase a
computer microphone for less than $10 (see footnote 2) at a host of local retailers
(e.g., Target, Wal-Mart, Best Buy, CompUSA) that may adequately meet several of your basic
recording needs.
Sound Cards
The sound card is an interface enabling the user to connect various audio devices
to the computer (e.g., speakers, microphone, tape recorder). Although almost all
computers come with a sound card, not all sound cards are created equal. On most
Windows-based computers, the sound card will have three 1/8-inch jacks: the green one is
for speaker (audio) output, the pink (or red) one is for microphone (audio) input, and the
blue one usually is a line (audio) input (these color codes are common but are not standard
on all computers). The difference between the microphone and line inputs relates to how the
sound card and computer shape the audio signal. The sound card properties can be
adjusted in the computer control panel. A basic tutorial for adjusting your sound card
can be found at http://k-12.pisd.edu/multimedia/audio/windows.htm.
The microphone input converts the audio input into a digital signal and also
amplifies the input to make it more usable to the computer. The line input does not
manipulate the input much beyond converting it to a digital signal. Since the
specific way the signal is shaped by the sound card is not readily known, it is generally
advisable to use the line input when possible. However, when using the line input, it is
likely that you will need to amplify the microphone signal before feeding it into the
computer. This will require a small pre-amplifier and add $50 to $100 to
the cost of your system.
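A short calculation shows why a pre-amplifier is usually needed with the line input. Voltage gain in decibels is 20·log10(Vout / Vin). The sketch below assumes a microphone output of about 3 mV, which is a hypothetical figure chosen only for illustration, and a nominal consumer line level of about 0.316 V (-10 dBV); actual levels depend on the microphone, the talker, and the sound card.

```python
import math


def gain_db(v_in, v_out):
    """Voltage gain in decibels: 20 * log10(v_out / v_in)."""
    return 20.0 * math.log10(v_out / v_in)


mic_volts = 0.003    # assumed microphone output, about 3 mV (hypothetical)
line_volts = 0.316   # nominal consumer line level, about -10 dBV

# Gain the pre-amplifier would have to supply under these assumptions.
print(f"required gain: {gain_db(mic_volts, line_volts):.1f} dB")
```

Under these assumptions the signal must be boosted by roughly a factor of one hundred in voltage before it reaches the line input, which is the job of the pre-amplifier.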
An alternative to using the microphone or the line inputs on your computer is to
use a quality USB microphone. A USB microphone plugs into one of the USB ports on
your computer and bypasses the soundcard input entirely. A good quality USB
2
Although you can buy an inexpensive microphone to meet your basic audio capture needs, you should pay
attention to its specifications. Specifically, make sure that the microphone has a frequency response
from at least 80 Hz to 16,000 Hz to capture all aspects of the speech signal.
microphone (which is less common and a bit harder to find) costs about the same as a
professional-quality microphone (about $100). A good USB microphone is the Samson
C01U. For information related to this microphone see:
http://www.samsontech.com/products/productpage.cfm?prodID=1878&brandID=2 or
http://www.zzounds.com/prodsearch?form=prodsearch&cat=3807&cat2=3582.

COMPUTER-BASED DIGITAL (TAPE) RECORDER

Audacity
Bundled with Windows XP is the application Sound Recorder. This program is
a functional digital audio recorder that can be used to record, save, and edit sound,
essentially turning any computer into a handy, high-quality tape recorder. However, the
Sound Recorder program limits your recording time to 60 seconds.
Audacity is a powerful and widely used freeware program that is designed to
capture and manipulate sound in a variety of ways. It does not restrict the recording time
and can be very useful for capturing a complete therapy session. Additionally, Audacity
enables you to quickly and easily edit and save your recording. This program is available
for free download from http://audacity.sourceforge.net/download/. Documentation and
tutorials related to using Audacity are available from http://audacity.sourceforge.net/help/.
Note: If you download and use Audacity, be aware that the Save command stores the recording
as an Audacity Project, which is not compatible with most other programs. Instead, you
should use the Export function in the File menu. WAV is the most
compatible file type on Windows-based computers, and you will probably want to save
all your recordings as WAV files.
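Once a recording has been exported as a WAV file, it can be useful to confirm how it was captured before analyzing it. The sketch below uses Python's standard wave module to report the basic properties of a file; the function name and the file name session.wav are hypothetical. It also reports half the sampling rate, which is the highest frequency the recording can represent, so capturing speech energy up to 16,000 Hz requires a sampling rate of at least 32,000 Hz. Note that the wave module reads ordinary PCM files only, so this sketch assumes the recording was exported in that format.

```python
import wave


def describe_wav(path):
    """Return basic capture properties of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "sample_rate_hz": rate,
            "duration_s": wav.getnframes() / rate,
            # Highest frequency the file can represent (half the sample rate).
            "max_frequency_hz": rate / 2,
        }


if __name__ == "__main__":
    # "session.wav" is a hypothetical file name used only for illustration.
    for name, value in describe_wav("session.wav").items():
        print(f"{name}: {value}")
```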

RESOURCES

Cochran, P. (2005). Clinical Computing Competency for Speech-Language
Pathologists. Baltimore, MD: Paul H. Brookes Publishing Co., Inc.
Michi, K. & Yamashita, Y. (1993). Role of visual feedback treatment for defective /s/
sounds in patients with cleft palate. Journal of Speech & Hearing Research, 36 (2),
277-285.
Ohde, R.N. & Sharf, D.J. (1992). Phonetic analysis of normal and abnormal speech.
New York: Merrill.
Parker, M., Cunningham, S., Enderby, P., Hawley, M. & Green, P. (2006). Automatic
speech recognition and training for severely dysarthric users of assistive technology:
The STARDUST project. Clinical Linguistics & Phonetics, 20 (2/3), 149-156.
Pickett, J.M. (1999). The acoustics of speech communication: Fundamentals, speech
perception theory, and technology. Boston: Allyn and Bacon.
Ruscello, D.M. (1995). Visual feedback in treatment of residual phonological disorders.
Journal of Communication Disorders, 28 (4), 279-302.
Sassi, F.C. & de Andrade, C.R. (2004). Acoustic analyses of speech naturalness: A
comparison between two therapeutic approaches. Pró-Fono, 16 (1), 31-38.

If you have any questions or need assistance downloading, installing, or using any of
the programs listed above, feel free to contact Rick McGuire at
RichardMcGuirephd@yahoo.com.
