Speech Recognition Seminar

SPEECH RECOGNITION
07-Feb-2013
Seminar By: Suraj Vitthal Gaikwad
Guided By: Prof. S. R. Lahane
Outline
Introduction
Speech Recognition Process
Types Of Speech Recognition Systems
Algorithms
Applications
Advantages & Disadvantages
Future Scope
Conclusion
Introduction
Speech recognition is the process by which a computer (or any other type of machine) identifies spoken words. Basically, it means talking to your computer and having it correctly understand what you are saying. It is an alternative to traditional methods of interacting with a computer.
Speech Recognition Process

Signal Processing: Convert the audio wave into a sequence of feature vectors
Speech Recognition: Decode the sequence of feature vectors into a sequence of words
Semantic Interpretation: Determine the meaning of the recognized words
Dialog Management: Correct the errors and help get the task done
Response Generation: What words to use so as to maximize user understanding
Speech Synthesis (Text to Speech): Generate synthetic speech from a marked-up word string
[Figure: Typical Speech Recognition Process]
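The signal processing stage (audio wave to feature vectors) can be illustrated with a minimal sketch. It is an illustration only: the two features used here (log energy and zero-crossing rate) are simple stand-ins chosen for brevity, the frame length and hop size are assumed values (25 ms and 10 ms at an assumed 16 kHz sampling rate), and all function names are hypothetical. Practical recognizers generally use richer spectral features.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping frames of frame_len samples."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames

def frame_features(frame):
    """Return a 2-element feature vector: [log energy, zero-crossing rate]."""
    energy = sum(x * x for x in frame) / len(frame)
    log_energy = math.log(energy + 1e-10)  # small constant avoids log(0)
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    zcr = crossings / (len(frame) - 1)
    return [log_energy, zcr]

def extract_features(samples, frame_len=400, hop=160):
    """Convert an audio wave (list of samples) into a sequence of feature vectors."""
    return [frame_features(f) for f in frame_signal(samples, frame_len, hop)]

# Usage: a synthetic 440 Hz tone, 0.1 s long, at an assumed 16 kHz sampling rate
sample_rate = 16000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1600)]
features = extract_features(tone)
print(len(features), "feature vectors, first:", features[0])
```

Each frame yields one short vector, so the recognizer downstream works on a sequence of vectors instead of raw samples.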
Types of Speech Recognition

Isolated Words: Single utterance at a time
Connected Words: Separate utterances together with a minimal pause between them
Continuous Speech: Rehearsed speech or dictation
Spontaneous Speech: Natural speech
Algorithms
Dynamic Time Warping: An algorithm for measuring similarity between two sequences which may vary in time or speed.
Hidden Markov Models
Neural Networks
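Dynamic time warping can be sketched in a few lines. The version below is a minimal illustration, assuming one-dimensional numeric sequences and absolute difference as the local cost; the function name and the sample sequences are hypothetical. A recognizer would normally compare sequences of feature vectors with a vector distance instead.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local cost of matching a[i-1] with b[j-1]
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

# Usage: the same shape spoken "slowly" aligns perfectly; a different shape does not
template = [1, 2, 3, 4, 3, 2, 1]
slow = [1, 1, 2, 2, 3, 3, 4, 4, 3, 3, 2, 2, 1, 1]
other = [4, 3, 2, 1, 2, 3, 4]
print(dtw_distance(template, slow))
print(dtw_distance(template, other))
```

Because the alignment may repeat elements of either sequence, the stretched copy has zero distance to the template even though the two differ in length.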
Hidden Markov Model

In an HMM, the state is not directly visible, but the output, which depends on the state, is visible. Each state has a probability distribution over the possible output tokens. Therefore the sequence of tokens generated by an HMM gives some information about the sequence of states.
x: states
y: possible observations
a: state transition probabilities
b: output probabilities
[Figure: HMM Example]
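To make the idea of recovering hidden states from visible tokens concrete, here is a small sketch of Viterbi decoding, one common way to find the most likely state sequence. It reuses the names a (state transition probabilities) and b (output probabilities) from the notation above. The two states, three tokens, and every probability value are made-up toy numbers, not taken from the slides, and the function name is hypothetical.

```python
def viterbi(observations, states, start_p, a, b):
    """Return (probability, path) of the most likely hidden state sequence."""
    # best[t][s] = (probability of the best path ending in state s at time t,
    #               previous state on that path)
    best = [{s: (start_p[s] * b[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            prob, source = max(
                (prev[r][0] * a[r][s] * b[s][obs], r) for r in states
            )
            column[s] = (prob, source)
        best.append(column)
    # Backtrack from the most probable final state
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return prob, path

# Toy model (all numbers are illustrative assumptions)
states = ["S1", "S2"]
start_p = {"S1": 0.6, "S2": 0.4}
a = {"S1": {"S1": 0.7, "S2": 0.3}, "S2": {"S1": 0.4, "S2": 0.6}}
b = {
    "S1": {"o1": 0.5, "o2": 0.4, "o3": 0.1},
    "S2": {"o1": 0.1, "o2": 0.3, "o3": 0.6},
}
prob, path = viterbi(["o1", "o2", "o3"], states, start_p, a, b)
print(path, prob)
```

For this toy model the decoded path is S1, S1, S2: the token o3 is far more likely under S2, so the best explanation switches state at the last step.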
Neural Network
A neural network consists of an interconnected group of artificial neurons, and it processes information using a connectionist approach to computation. An NN is typically defined by three types of parameters:
The interconnection pattern between different layers of neurons
The learning process for updating the weights of the interconnections
The activation function that converts a neuron's weighted input to its output activation
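The three kinds of parameters can be seen side by side in a small sketch. It assumes fully connected layers (the interconnection pattern), a sigmoid (the activation function), and plain gradient descent on squared error for a single neuron (the learning process). These are illustrative choices, not the only ones, and the function names, learning rate, and epoch count are hypothetical.

```python
import math

def sigmoid(z):
    """Activation function: maps a neuron's weighted input to an output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(layers, inputs):
    """Interconnection pattern: each layer is a list of (weights, bias) neurons,
    and every neuron is connected to all outputs of the previous layer."""
    activations = inputs
    for layer in layers:
        activations = [
            sigmoid(sum(w * x for w, x in zip(weights, activations)) + bias)
            for weights, bias in layer
        ]
    return activations

def train_neuron(samples, epochs=2000, rate=0.5):
    """Learning process: gradient descent on squared error for one sigmoid neuron."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            out = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
            delta = (target - out) * out * (1.0 - out)  # error times sigmoid slope
            weights = [w + rate * delta * x for w, x in zip(weights, inputs)]
            bias += rate * delta
    return weights, bias

# Usage: teach a single neuron the logical OR of two inputs
or_samples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
neuron = train_neuron(or_samples)
network = [[neuron]]  # one layer containing one neuron
for inputs, target in or_samples:
    print(inputs, round(forward(network, inputs)[0], 2), "target", target)
```

Speech recognizers use much larger networks, but the same three ingredients apply: how neurons are wired, how weights are updated, and which activation function is used.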
Speech Recognition Software

Open source: Julius
Macintosh: Dragon Dictate
Mobile Devices/Smartphone: Google Now, Siri, Micromax AISHA (Artificial Intelligence Speech Handset Assistant), S Voice, Iris (Intelligent Rival Imitator of Siri)
Windows: Dragon NaturallySpeaking, Windows Speech Recognition
Applications
Games and Edutainment
Data Entry
Document Editing
Speaker Identification/Verification
Automation at Call Centers
Medical/Disabilities
Fighter Aircraft
Advantages
Increases Productivity
Can help with menial computer tasks
Can help people with disabilities
Cost Effective
Diminishes Spelling Mistakes
Disadvantages
Inaccuracy & Slowness
Vocal Strain
Adaptability
Out-of-Vocabulary (OOV) Words
Spontaneous Speech
Accent, Dialect and Mixed Language
Future Scope
Achieving efficient speaker-independent word recognition
Speech recognition systems (SRS) may gain the ability to distinguish nuances of speech and meanings of words
Stand-alone speech recognition systems
Wearable speech recognition systems
Talking with all devices
Conclusion
Within five years, speech recognition technology will become so pervasive in our daily lives that service environments lacking this technology will be considered inferior. Speech recognition will revolutionize the way people interact with smart devices and will, ultimately, differentiate the upcoming technologies.
ANY QUESTIONS??