
Speech Recognition System

By: Kshirsagar V. P., Kothimbire P. P., Kasle A. S., Kulkarni N. N., Rankhamb T. U.
Guide: Kendre S. V.

INDEX
1. Introduction.
2. Relevance.
2.1 Definition.
2.2 Major steps of speech recognition system.
2.3 Dictation grammar vs rule grammar.
3. Related work.
4. Proposed work.
4.1 Our main goal.
4.2 Project overview schematics.
4.3 Components of typical speech recognition system.
5. Requirements.
5.1 Software requirements.
5.2 Hardware requirements.
6. Applications.

INTRODUCTION
Our project goal is to develop software that converts speech into text. This software is mainly intended for people with disabilities. Speech recognition software is also useful for increasing working speed on the computer, i.e. text can be entered faster than by typing.

RELEVANCE
Speech recognition, also known as automatic speech recognition or computer speech recognition, converts spoken words to machine-readable input. The term "voice recognition" may also be used to refer to speech recognition, but can more precisely refer to speaker recognition, which attempts to identify the person speaking, as opposed to what is being said.

MAJOR STEPS OF TYPICAL SPEECH RECOGNIZER


1. Grammar design: recognition grammars define the words that may be spoken by the user and the patterns in which they may be spoken.

2. Signal processing: analyze the spectrum (frequency) characteristics of the incoming audio.

3. Phoneme recognition: compare the spectrum patterns to the patterns of the phonemes of the language being recognized.

4. Word recognition: compare the sequence of likely phonemes against the words and patterns of words specified by the active grammars.

5. Result generation: provide the application with information about the words the recognizer has detected in the incoming audio.
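As a rough illustration of the word recognition step, the sketch below compares a sequence of likely phonemes against a small lexicon and picks the closest word. The lexicon entries, the phoneme symbols, and the use of edit distance as the comparison are all assumptions made for this example; real recognizers typically use probabilistic models rather than a plain distance, and this is not the method of any particular engine.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class WordMatchSketch {
    // Hypothetical mini-lexicon: words of the active grammar with
    // illustrative phoneme spellings (not taken from a real dictionary).
    static final Map<String, List<String>> LEXICON = new LinkedHashMap<>();
    static {
        LEXICON.put("call", Arrays.asList("K", "AO", "L"));
        LEXICON.put("home", Arrays.asList("HH", "OW", "M"));
        LEXICON.put("stop", Arrays.asList("S", "T", "AA", "P"));
    }

    // Levenshtein distance computed over phoneme symbols instead of characters.
    static int editDistance(List<String> a, List<String> b) {
        int[][] d = new int[a.size() + 1][b.size() + 1];
        for (int i = 0; i <= a.size(); i++) d[i][0] = i;
        for (int j = 0; j <= b.size(); j++) d[0][j] = j;
        for (int i = 1; i <= a.size(); i++) {
            for (int j = 1; j <= b.size(); j++) {
                int cost = a.get(i - 1).equals(b.get(j - 1)) ? 0 : 1;
                int insertOrDelete = Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1);
                d[i][j] = Math.min(insertOrDelete, d[i - 1][j - 1] + cost);
            }
        }
        return d[a.size()][b.size()];
    }

    // Returns the lexicon word whose phonemes are closest to what was heard.
    static String bestWord(List<String> heard) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (Map.Entry<String, List<String>> entry : LEXICON.entrySet()) {
            int distance = editDistance(heard, entry.getValue());
            if (distance < bestDistance) {
                bestDistance = distance;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // One phoneme is misrecognized ("N" instead of "M").
        System.out.println(bestWord(Arrays.asList("HH", "OW", "N"))); // prints "home"
    }
}
```

Restricting the candidates to the words of the active grammar is what keeps this comparison small; a larger lexicon means more candidates and more chances of a wrong match.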

Dictation Grammar VS Rule Grammar


1. RULE GRAMMAR: In a rule-based speech recognition system, an application provides the recognizer with rules that define what the user is expected to say.

2. DICTATION GRAMMAR: Dictation grammars impose fewer restrictions on what can be said, bringing them closer to the ideal of free-form speech input. The cost of this greater freedom is that they require more substantial computing resources, require higher-quality audio input, and tend to make more errors. A dictation grammar is typically larger and more complex than a rule-based grammar.
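To make the idea of a rule grammar concrete, here is a minimal sketch of an application-supplied rule and a check of whether an utterance fits it. The rule itself (an action word followed by an object word), the class name, and the method names are hypothetical and chosen only for illustration; this is plain Java and does not use the Java Speech API or any real grammar format.

```java
import java.util.Arrays;
import java.util.List;

class RuleGrammarSketch {
    // Hypothetical rule: <command> = (open | close) (file | window)
    // Each inner list holds the words allowed at that position.
    static final List<List<String>> COMMAND_RULE = Arrays.asList(
            Arrays.asList("open", "close"),
            Arrays.asList("file", "window"));

    // Returns true only if every word of the utterance is allowed
    // at its position and the utterance has exactly the expected length.
    static boolean accepts(String utterance) {
        String[] words = utterance.trim().toLowerCase().split("\\s+");
        if (words.length != COMMAND_RULE.size()) {
            return false;
        }
        for (int i = 0; i < words.length; i++) {
            if (!COMMAND_RULE.get(i).contains(words[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(accepts("open file"));     // prints "true"
        System.out.println(accepts("delete window")); // prints "false"
    }
}
```

This rule admits only four phrases, which is why rule grammars are cheap to recognize. A dictation grammar has no such fixed list to check against, which is consistent with the higher resource needs and error rates described above.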

Related work

Popular speech recognition conferences held each year include ICASSP. Conferences in the field of Natural Language Processing, such as ACL (Association for Computational Linguistics), NAACL, EMNLP, and HLT, are beginning to include papers on speech processing.

Important journals include the IEEE Transactions on Speech and Audio Processing. Books like "Fundamentals of Speech Recognition" by Lawrence Rabiner can be useful to acquire basic knowledge but may not be fully up to date (1993).

Another good source can be "Statistical Methods for Speech Recognition" by Frederick Jelinek, which is a more up-to-date book (1998). Even more up to date is "Computer Speech" by Manfred R. Schroeder (2004).

The largest speech recognition-related project ongoing as of 2007 is the GALE project, which involves both speech recognition and translation components.

PROPOSED WORK
Our main goal is to create free speech recognition software that can replace the current non-free speech recognition software. In order to achieve this goal, we decided to use dictation grammar. The use of the software's engine allowed us to create more powerful and more accurate free speech recognition software.

Project overview (schematics)

[Figure: schematic with the labels User, Voice Input, Speech API, Java code, Function calls, Text output]

The Java Speech API defines a standard, easy-to-use, cross-platform software interface to state-of-the-art speech technology.

Two core speech technologies are supported through the Java Speech API:
I. speech recognition
II. speech synthesis

Speech recognition provides computers with the ability to listen to spoken language and to determine what has been said. In other words, it processes audio input containing speech by converting it to text.

The Java Speech API was developed through an open development process. As a specification for a rapidly evolving technology, Sun will support and enhance the Java Speech API.

Components of typical speech recognition system

[Figure: block diagram with the labels Training data; Acoustic models, Lexical models, Language models; Input; Representation; Modelling/classification; Search; Recognised word]

HOW SR WORKS

IBM AUDIO VIDEO SPEECH RECOGNITION SYSTEM

Applications of speech recognition


Speech recognition applications include voice dialing (e.g., "Call home"), call routing (e.g., "I would like to make a collect call"), domestic appliance control, content-based spoken audio search (e.g., find a podcast where particular words were spoken), simple data entry (e.g., entering a credit card number), preparation of structured documents (e.g., a radiology report), speech-to-text processing (e.g., word processors or email), and use in aircraft cockpits (usually termed Direct Voice Input).

5. Requirements
5.1 Software Requirements:
Front End - Java
Operating System - Windows XP

5.2 Hardware Requirements:
Processor - Intel Pentium IV
Clock Speed - 700 MHz
Main Memory - 256 MB
Cache Memory - 256 KB
CD ROM - 52X
Monitor - SVGA
Printer - Laser printer
Microphone, Headphone.


Thank You.
