
SPEECH TRAINER KIT USING LARYNGEAL VIBRATIONS

Gopikrishnan Potti (potti.gopikrishnan@gmail.com), Jennifer V (jennifergafur@gmail.com),
Rekha M (rekha.m.mec@gmail.com), Wazeem Basheer K (wazeembasheerk@gmail.com)

VIII Semester, Electronics and Biomedical Engineering, Model Engineering College, Cochin

ABSTRACT

Development of hardware and software tools for speech training of the hearing impaired, on a multilingual basis using laryngeal vibrations, is proposed. After identification of user requirements in close co-operation with user groups for the three languages, a software package for transforming speech signals into images was developed and tested. These images include a display of various parameters such as frequency, amplitude and pitch, and constitute real-time visual feedback for the hearing impaired.

1. INTRODUCTION

Speech is one of the primary ways we communicate with those around us. It is an effective way to monitor normal growth and development, as well as to identify potential problems. Speech disorders, or speech impediments, are a type of communication disorder in which normal speech is disrupted. Lack of auditory feedback in the hearing impaired results in failure to produce intelligible speech. Speech training, in such cases, can be assisted by other forms of feedback. At present, many software packages are available which help in training deaf speakers to produce speech vocalizations. The visual feedback systems are usually video-game-like: the user has to hit the right target and receives an indication of successful or correct hits, which maintains the motivation and interest of even young children in repeating vocalization exercises.

Our objective is to focus on the effective conversion of laryngeal vibrations into speech output, so as to rehabilitate people suffering from speech pathologies.
pathologies. The system involves close and consists of two main modules, signal
constant interaction with user groups made acquisition module and signal processing
up of motivated hearing-impaired persons, module.
who will intervene in the iterative process of The laryngeal vibrations from
system development and testing. The system suitable positions are picked up using three
has been developed for three languages in electret condenser microphones which are
parallel (Malayalam, Hindi, English) on a fixed to a neck band worn around the neck
comparative basis. This act as a voice of the patient. In order to obtain a composite
analysis tool for helping the speech signal from the three microphones a mixer
therapists and physicians in speech circuit is used. After suitable amplification
diagnosis. It is implemented on a low-cost, the output of the mixer circuit is fed to the
PC based system with software on CD-ROM sound card of the PC through the MIC-IN
using a generic framework and a user using a stereo pin connector. Figure 1 shows
friendly interface. the block diagram of the signal acquisition
module.
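The acquisition stage described above can be mimicked in software: the hardware mixer essentially sums (averages) the three microphone channels into one composite signal before it reaches the sound card. A minimal sketch with synthetic signals and an assumed 8 kHz sampling rate (the paper does not state one):

```python
import numpy as np

FS = 8000  # assumed sampling rate in Hz; the paper does not specify one
t = np.arange(FS) / FS  # one second of samples

# Three synthetic "microphone" channels standing in for the electret
# condenser microphones on the neck band (amplitudes/phases are arbitrary).
mic1 = 0.5 * np.sin(2 * np.pi * 120 * t)
mic2 = 0.4 * np.sin(2 * np.pi * 120 * t + 0.3)
mic3 = 0.3 * np.sin(2 * np.pi * 120 * t + 0.6)

def mix(*channels):
    """Average the channels into one composite signal, emulating the
    analogue mixer that feeds the PC sound card's MIC-IN."""
    return np.stack(channels).mean(axis=0)

composite = mix(mic1, mic2, mic3)
print(composite.shape)  # (8000,)
```

Averaging, rather than straight summing, keeps the composite within the input range of the sound card; in hardware the same role is played by the amplifier's gain setting.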

2. IMPLEMENTATION

Our implementation of the trainer kit is based on speech processing techniques, namely feature extraction and identification using Mel Frequency Cepstral Coefficients (MFCC). Clusters are created for every sample and stored in a database using K-means clustering. The features extracted from the trainee's real-time input are compared with these clusters by calculating the Euclidean distance, and by analysing the distances the system outputs suitable feedback on whether the trainee's utterance was correct. The product thus implements hardware and software tools for speech training on a multilingual basis.

Figure 1: Block diagram of the signal acquisition module (three MICs feeding a MIXER, whose output drives the PC SOUND CARD)

The signal processing module consists of speech feature extraction and clustering. The trainee's utterances are analysed and features are extracted; these are compared with features extracted in the same way from a normal speaker's utterance of the same sound. The decision of the test, i.e. whether the trainee's utterance is correct or not, is conveyed through the visual display. Thus the technique may be useful for improving the effectiveness of speech training for hearing-impaired children by providing visual feedback that improves their articulation.

The MFCC algorithm is used for feature extraction and the K-means algorithm for clustering. The features extracted from the real-time input are compared with the stored clusters by computing the Euclidean distance, and the testing process is iterated until the Euclidean distance becomes minimum. This module also provides visual feedback for the trainee, with information about the goodness of each utterance, and a graph showing the level of improvement for each utterance.

Figure 2: Block diagram of the signal processing module (Training signal → Feature Extraction using MFCC → Clustering (K-means); Real-time input → Feature Extraction using MFCC → Measure Euclidean Distance → Display Result → Visual Feedback)

3. RESULTS

The package was tested with speech signals for consistency and validity of the system. Figure 1 shows the signal acquisition module.

Figure 1. Signal acquisition module

Figure 2 shows the MIC input of vowel ‘a’ and its corresponding MFCC plot.

Figure 2. MIC input of vowel ‘a’ and corresponding MFCC plot

The GUI provides options for the trainer to select the language, create the database, retrieve data from the database and analyse the improvement graph of the patient; it also provides the visual feedback to the trainee. Figure 3 shows the GUI of the main menu and Figure 4 shows the visual feedback.
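The processing chain described above (MFCC feature extraction, K-means clustering of reference utterances, and Euclidean-distance scoring of the trainee's input) can be sketched end to end. This is an illustrative NumPy reimplementation under simplifying assumptions (synthetic signals, an 8 kHz sampling rate, non-overlapping frames, no pre-emphasis or liftering); it is not the authors' MATLAB code, and all names and parameter values here are ours:

```python
import numpy as np

FS = 8000      # assumed sampling rate in Hz (not specified in the paper)
N_FFT = 256    # frame length; one FFT per frame
N_MELS = 20    # size of the mel filterbank (illustrative)
N_CEPS = 12    # number of cepstral coefficients kept

def mel_filterbank(n_mels, n_fft, fs):
    """Triangular filters spaced evenly on the mel scale."""
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    imel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    pts = imel(np.linspace(mel(0.0), mel(fs / 2.0), n_mels + 2))
    bins = np.floor((n_fft + 1) * pts / fs).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        lo, mid, hi = bins[i - 1], bins[i], bins[i + 1]
        for k in range(lo, mid):
            fb[i - 1, k] = (k - lo) / max(mid - lo, 1)
        for k in range(mid, hi):
            fb[i - 1, k] = (hi - k) / max(hi - mid, 1)
    return fb

FB = mel_filterbank(N_MELS, N_FFT, FS)

def mfcc(frame):
    """MFCC of one frame: window -> |FFT| -> mel energies -> log -> DCT-II."""
    spec = np.abs(np.fft.rfft(frame * np.hamming(len(frame)), N_FFT))
    logmel = np.log(FB @ spec + 1e-10)
    n = np.arange(N_MELS)
    basis = np.cos(np.pi / N_MELS * (n + 0.5)[None, :] * np.arange(N_CEPS)[:, None])
    return basis @ logmel

def kmeans(X, k, iters=50, seed=0):
    """Plain K-means: assign points to the nearest centroid, recompute means."""
    rng = np.random.default_rng(seed)
    cent = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.linalg.norm(X[:, None, :] - cent[None, :, :], axis=2).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                cent[j] = X[labels == j].mean(axis=0)
    return cent

# Toy "normal speaker" reference: a harmonic, vowel-like signal.
t = np.arange(FS) / FS
reference = np.sin(2 * np.pi * 130 * t) + 0.3 * np.sin(2 * np.pi * 260 * t)
frames = reference[: 31 * N_FFT].reshape(31, N_FFT)
feats = np.array([mfcc(f) for f in frames])
centroids = kmeans(feats, k=4)  # the stored "database" for this sound

def distance_to_clusters(frame):
    """Euclidean distance from the frame's MFCC vector to the nearest cluster."""
    return np.linalg.norm(centroids - mfcc(frame), axis=1).min()

good = distance_to_clusters(frames[0])   # a frame of the matching utterance
noise = distance_to_clusters(np.random.default_rng(1).standard_normal(N_FFT))
print(good < noise)  # True: the correct utterance lies closer to the clusters
```

In the real system, the reference frames would come from a normal speaker's recording of the target sound in the selected language, and the resulting distance would drive the visual feedback shown to the trainee.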
Figure 3. GUI of main menu

Figure 4. Visual feedback for the user

4. CONCLUSION

Our aim of implementing a speech trainer on the MATLAB platform turned out successful. The system was able to display the visual feedback in real time, and the feasibility of the project was successfully tested. Conventionally, trainees are trained until their pronunciation matches that of a normal speaker, which requires a trainer on a one-to-one basis; our device automates the whole process. The feedback for improvement is provided in the form of visual images which are easily perceptible.

5. REFERENCES

[1] Fletcher SG: Dynamic orometrics: A computer-based means of learning about and developing speech by deaf children. Am Ann Deaf 128:525-534, 1983.

[2] Levitt H, Pickett JM, Houde RA: Sensory Aids for the Hearing Impaired. New York: IEEE Press, 1980.

[3] IBM France: IBM France Scientific Centre, Paris, France, 1984.

[4] Kewley-Port D, Watson CS, Cromer PA: The Indiana Speech Training Aid (ISTRA): A microcomputer-based aid using speaker-dependent speech recognition. Presented at American Speech-Hearing-Language Foundation Computer Conference, Houston, 1987.

[5] Bernstein LE, Goldstein MH, Mahshie JJ: Journal of Rehabilitation Research and Development, Vol. 25, No. 4, 1988.

[6] Mahshie JJ: A computerized approach to assessing and modifying the voice of the deaf. In Proceedings of the 1985 International Congress on Education of the Deaf (in press).
[7] Murata N, Yamada Y, Sugimoto T, Hirosawa K, Shibata S, Yamashita S: Speech training aid for people with impaired speaking ability. New York: IEEE Press, 1986.

[8] Nickerson R, Stevens KN: Teaching speech to the deaf: Can a computer help? IEEE Trans Audio and Electroacoustics AU-21(5):445-455, 1973.

[9] Pandey PC, Shah MS: Symposium on Frontiers of Research on Speech and Music, IIT Kanpur, Feb 2003.

[10] Povel D-J, Wansink M: A computer-controlled vowel corrector for the hearing impaired. J Speech Hear Res 29:99-105, 1986.

[11] Quatieri TF: Discrete-Time Speech Signal Processing: Principles and Practice.
