
DON BOSCO INSTITUTE OF TECHNOLOGY

INFORMATION SCIENCE AND ENGINEERING

SPEECH RECOGNITION AND TEXT SUMMARIZATION USING TEXTRANK ALGORITHM

Batch No
SL No USN Name Email Id Contact No
1 1DB16IS038 PUSHPAHASA N S npushpahasa@gmail.com 8861844185
2 1DB16IS042 SAHANA S MATHAD sahana1098@gmail.com 9986182362
3 1DB16IS053 VARALAKSHMI B varalakshmisuvarna1998@gmail.com 9964619586

Guide Name: Mrs. Asha K H


Designation: Assistant Professor

Dept of ISE,DBIT, 2019 1


Contents:
• Introduction
• Problem Statement
• Objective
• Methodology
• Existing System
• Proposed System
• References



INTRODUCTION:
Speech Recognition:

• Speech recognition is the process of converting spoken audio into the corresponding text.
• It works by recording a sample of a person’s speech and digitizing it to create a unique voice print, or
template.
• Each spoken word is broken up into discrete segments, each comprising several tones.
• Advances in the statistical modelling of speech have led to widespread applications in the field of speech
recognition.



Text Summarization:

• Text summarization is an application of NLP that shortens long pieces of text. The intention is to create a
coherent and fluent summary containing only the main points outlined in the document.
• There are two main types of techniques used for text summarization: NLP-based techniques and deep
learning-based techniques.
• Automatic Text Summarization is one of the most challenging and interesting problems in the field of Natural
Language Processing (NLP).
• The main objective of a text summarization system is to identify the most important information from the
given text and present it to the end users.



Problem Statement
• The project describes a summarization system that will be developed to summarize news or lectures
delivered orally.
• The system generates text summaries from input audio using three independent components: an automatic
speech recognizer, a syntactic analyzer, and a summarizer.
• The automatic speech recognizer can be any API that converts speech to text. The text obtained from the
speech recognition API is then summarized using the TextRank algorithm.
• Text summarization plays an important role: it gives the end user a brief idea of what was said, so they
need not read the complete transcript of the lecture or news.
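
The three-component design described above can be sketched as a simple chain of functions. This is only a minimal illustration: the function names, the stand-in recognizer, and the file name `lecture.wav` are assumptions made for the example, not part of the proposed system, and no real speech API is called.

```python
from typing import Callable, List


def summarize_speech(
    audio_path: str,
    recognize: Callable[[str], str],
    analyze: Callable[[str], List[str]],
    summarize: Callable[[List[str]], str],
) -> str:
    """Chain the three components: audio -> transcript -> sentences -> summary."""
    transcript = recognize(audio_path)
    sentences = analyze(transcript)
    return summarize(sentences)


# Hypothetical stand-ins so the sketch runs without a real speech API.
def fake_recognizer(audio_path: str) -> str:
    return "Graphs model relations. TextRank ranks sentences. Lunch is at noon."


def split_sentences(text: str) -> List[str]:
    return [s.strip() + "." for s in text.split(".") if s.strip()]


def first_two(sentences: List[str]) -> str:
    # Placeholder summarizer: keeps the first two sentences only.
    return " ".join(sentences[:2])


summary = summarize_speech("lecture.wav", fake_recognizer, split_sentences, first_two)
print(summary)  # Graphs model relations. TextRank ranks sentences.
```

Because each component is passed in as a function, the recognizer or the summarizer can be swapped without touching the other two, which is one way to keep the components independent.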



Objective
• The goal of getting a machine to understand fluently spoken speech and respond in a natural voice has been
driving speech research for more than 50 years.
• The main objective of a text summarization system is to identify the most important information from the
given text and present it to the end users.
• In this project, the text converted from professors' lectures is given as input to the system, and an
extractive summary is produced by identifying text features and scoring the sentences accordingly.
• The text is first pre-processed to tokenize the sentences and perform stemming.
• The project recognizes the professors' audio lectures using a speech-to-text conversion model and then uses
a text summarization algorithm to summarize the key points of each lecture.
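
The pre-processing step mentioned above (sentence tokenization followed by stemming) might look roughly like the sketch below. The suffix list is a deliberately crude assumption for illustration; it is not the Porter stemmer or any other established stemming algorithm, and a real system would likely use an NLP library instead.

```python
import re

# Crude, illustrative suffix list; an assumption, not the Porter rules.
SUFFIXES = ("ing", "ed", "es", "s")


def split_sentences(text):
    """Split on '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def crude_stem(word):
    """Strip one common suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Return one list of lower-cased, stemmed tokens per sentence."""
    return [
        [crude_stem(w) for w in re.findall(r"[a-z]+", s.lower())]
        for s in split_sentences(text)
    ]


tokens = preprocess("The professor explained ranking. Students asked questions!")
print(tokens)
# [['the', 'professor', 'explain', 'rank'], ['student', 'ask', 'question']]
```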



METHODOLOGY
• Speech must be converted from physical sound to an electrical signal with a microphone and then to digital
data with an analog-to-digital converter.
• Once digitized, several models can be used to transcribe the audio to text.
• The intention of text summarization is to create a coherent and fluent summary having only the main points
outlined in the document.
• Text summarization reduces reading time, accelerates the process of researching for information, and
increases the amount of information that can fit in an area.
• TextRank is a general-purpose, graph-based ranking algorithm for NLP. It is an extractive and unsupervised
text summarization technique.
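
The digitization step in the first bullet (microphone signal to digital data) can be illustrated with a small sampling and quantization sketch. The 440 Hz test tone, the 16 kHz sample rate, and the 16-bit depth are assumptions chosen for the example; the slides do not specify them.

```python
import math


def sample_and_quantize(signal, duration_s, sample_rate_hz, bits=16):
    """Sample a continuous signal (function of time in seconds, range [-1, 1])
    and round each sample to a signed integer of the given bit depth."""
    max_level = 2 ** (bits - 1) - 1  # 32767 for 16-bit audio
    n_samples = int(duration_s * sample_rate_hz)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz
        value = max(-1.0, min(1.0, signal(t)))  # clip to the valid range
        samples.append(round(value * max_level))
    return samples


def tone(t):
    """Stand-in for the microphone's electrical signal: a 440 Hz sine wave."""
    return math.sin(2 * math.pi * 440 * t)


pcm = sample_and_quantize(tone, duration_s=0.01, sample_rate_hz=16000)
print(len(pcm), pcm[:4])
```

A higher sample rate captures higher frequencies, and a larger bit depth reduces the rounding error of each sample.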
Three main approaches to speech recognition:
1) Acoustic-phonetic approach
2) Pattern recognition approach
3) Artificial intelligence approach



1) Acoustic-phonetic approach: it assumes that speech consists of a finite set of phonetic units and applies
spectral analysis to the speech signal to extract features for segmentation and labeling. In this way the
valid word is recognized.
2) Pattern recognition approach: the speech recognizer is first trained, and unknown speech is then compared
directly with the learned patterns.
3) Artificial intelligence approach: a hybrid approach that combines the ideas of the acoustic-phonetic and
pattern recognition approaches. It is used to solve complex tasks.
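
As one possible illustration of the pattern recognition approach, the sketch below compares an unknown feature sequence against stored templates and picks the closest one. Dynamic time warping (DTW) is used here as the comparison measure; that choice, the one-dimensional features, and the two templates are assumptions for the example, since the slides do not name a specific matching technique.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def recognize(unknown, templates):
    """Return the label of the learned template closest to the unknown input."""
    return min(templates, key=lambda word: dtw_distance(unknown, templates[word]))


# Hypothetical learned patterns (one feature value per frame).
templates = {"yes": [1, 3, 4, 3, 1], "no": [5, 5, 2, 1, 1]}
unknown = [1, 1, 3, 4, 4, 3, 1]  # a slower utterance of the "yes" pattern

print(recognize(unknown, templates))  # yes
```

DTW tolerates differences in speaking rate, which is why the stretched input still matches the shorter template with zero distance.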



Basic Working of a Speech Recognition System

Block diagram stages: Speech Signal -> Pre-Processing -> Feature Extraction -> Language Modeling -> Decoder
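
To make the feature extraction stage more concrete, the sketch below splits a sampled signal into frames and computes two simple per-frame features. Short-time energy and zero-crossing rate are stand-ins chosen for brevity; the slides do not say which features the system uses, and practical recognizers typically rely on richer spectral features.

```python
def frame_signal(samples, frame_len, hop):
    """Split samples into frames; a trailing partial frame is dropped."""
    return [
        samples[i:i + frame_len]
        for i in range(0, len(samples) - frame_len + 1, hop)
    ]


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)


# Toy signal: silence, a burst of oscillation, silence.
samples = [0, 0, 0, 0, 3, -3, 3, -3, 0, 0, 0, 0]
frames = frame_signal(samples, frame_len=4, hop=4)
features = [(short_time_energy(f), zero_crossing_rate(f)) for f in frames]
print(features)  # [(0.0, 0.0), (9.0, 1.0), (0.0, 0.0)]
```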



TextRank Algorithm: Flow Diagram of the Algorithm

The TextRank algorithm proceeds as follows:

1) Concatenate all the text contained in the articles.
2) Split the text into individual sentences.
3) Find a vector representation (word embeddings) for each sentence.
4) Calculate the similarities between the sentence vectors and store them in a matrix.
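
The steps listed above can be sketched in plain Python as follows. Two assumptions should be noted. First, bag-of-words counts stand in for the word embeddings named in the slide, so that the example needs no pretrained model. Second, the final `rank` step (a PageRank-style iteration over the similarity matrix, followed by picking the top-scoring sentences) is not listed on the slide; it is included because TextRank ranks graph nodes this way, and the damping factor and iteration count are illustrative defaults.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def vectorize(sentence):
    """Bag-of-words counts; a stand-in for word embeddings."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(sentences):
    vecs = [vectorize(s) for s in sentences]
    n = len(vecs)
    return [
        [0.0 if i == j else cosine(vecs[i], vecs[j]) for j in range(n)]
        for i in range(n)
    ]


def rank(matrix, damping=0.85, iterations=50):
    """PageRank-style power iteration over the weighted similarity graph."""
    n = len(matrix)
    scores = [1.0 / n] * n
    out_weight = [sum(row) for row in matrix]
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                matrix[j][i] / out_weight[j] * scores[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(articles, top_k=2):
    text = " ".join(articles)              # step 1: concatenate
    sentences = split_sentences(text)      # step 2: split into sentences
    if not sentences:
        return []
    scores = rank(similarity_matrix(sentences))  # steps 3-4, then ranking
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(best[:top_k])]  # keep original order


articles = [
    "TextRank ranks sentences in a graph. The graph links similar sentences.",
    "Ranking sentences uses the graph. Lunch was pasta.",
]
print(summarize(articles, top_k=2))
```

Sentences that share no words with the rest of the text receive only the baseline score, so off-topic sentences tend to be left out of the summary.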



Proposed System:



Existing System:



References:

1. S. K. Gaikwad et al., “A review on speech recognition technique”, International Journal of Computer
Applications, vol. 10, no. 3, November 2010.
2. D. Mayank, R. K. Aggarwal, “Implementing a speech recognition system interface for Indian languages”,
Proc. of the IJCNLP-08 Workshop on NLP, Hyderabad, India, January 2008, pp. 105-112.
3. K. Brady, M. Brandstein et al., “An evaluation of audio-visual person recognition on the XM2VTS corpus
using the Lausanne protocol”, MIT Lincoln Laboratory, 244 Wood St., Lexington MA.
4. Dazhi Yang and Allan N. Zhang, “Performing literature review using text mining, Part III: Summarizing
articles using Text Rank”, Singapore Institute of Manufacturing Technology.
5. Ali Toofanzadeh Mozhdehi, Mohamad Abdolahi and Shohreh Rad Rahimi, “Overview of extractive text
summarization”.
6. Sonya Rapinta Manalu, Willy, “Stop Words in Review Summarization Using Text Rank”, School of Computer
Science.
7. K. Kuldeep, R. K. Aggarwal, “A Hindi speech recognition system for connected words using HTK”, Int. J.
Computational Systems Engineering, vol. 1, no. 1, Haryana, India, 2012.



ANY SUGGESTIONS OR QUESTIONS TO IMPROVE OUR PROJECT PROPOSAL?



THANK YOU

