Main Project

DON BOSCO INSTITUTE OF TECHNOLOGY
INFORMATION SCIENCE AND ENGINEERING
SPEECH RECOGNITION AND TEXT SUMMARIZATION USING TEXTRANK ALGORITHM
Batch No
SL No USN Name Email Id Contact No
1 1DB16IS038 PUSHPAHASA N S npushpahasa@gmail.com 8861844185
2 1DB16IS042 SAHANA S MATHAD sahana1098@gmail.com 9986182362
3 1DB16IS053 VARALAKSHMI B varalakshmisuvarna1998@gmail.com 9964619586
Guide Name: Mrs. Asha K H

Designation: Assistant Professor
Dept of ISE,DBIT, 2019 1

Contents:
• Introduction
• Problem Statement
• Objective
• Methodology
• Existing System
• Proposed System
• References

INTRODUCTION:
Speech Recognition:
• Speech recognition is the process of converting spoken sound into the corresponding text.
• It works by recording a sample of a person's speech and digitizing it to create a unique voice print or template.
• Each spoken word is broken up into discrete segments, each comprising several tones.
• Advances in the statistical modelling of speech have led to widespread applications of speech recognition.

Text Summarization:
• Text summarization is an application of NLP that shortens long pieces of text. The intention is to create a
coherent and fluent summary containing only the main points outlined in the document.
• There are two main types of techniques used for text summarization: NLP-based techniques and deep
learning-based techniques.
• Automatic Text Summarization is one of the most challenging and interesting problems in the field of Natural
Language Processing (NLP).
• The main objective of a text summarization system is to identify the most important information from the
given text and present it to the end users.

Problem Statement
• The project describes a summarization system to be developed for summarizing news or lectures
delivered orally.
• The system generates text summaries from input audio using three independent components: an automatic
speech recognizer, a syntactic analyzer, and a summarizer.
• The automatic speech recognizer can be any API that converts speech to text. The text obtained from the
speech recognition API is then summarized using the TextRank algorithm.
• Text summarization gives the end user a brief idea of what was said, so that they do not have to read the
complete speech-to-text transcript of the lecture or news.
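The three-component design above can be sketched as a simple pipeline. This is only an illustrative sketch: the slides do not name a concrete speech API, so every function name here is hypothetical, the recognizer is a stand-in that decodes bytes, and the "syntactic analyzer" is assumed to do nothing more than sentence splitting.

```python
from typing import Callable, List

# Hypothetical stage signatures; the slides do not name a concrete API.
Recognizer = Callable[[bytes], str]
Analyzer = Callable[[str], List[str]]
Summarizer = Callable[[List[str]], str]


def summarize_audio(audio: bytes, recognize: Recognizer,
                    analyze: Analyzer, summarize: Summarizer) -> str:
    """Run the three independent stages in order."""
    transcript = recognize(audio)      # speech -> text
    sentences = analyze(transcript)    # text -> list of sentences
    return summarize(sentences)        # sentences -> summary


# Stand-in stages so the sketch runs without a real speech API.
def fake_recognize(audio: bytes) -> str:
    return audio.decode("utf-8")


def split_sentences(text: str) -> List[str]:
    return [s.strip() for s in text.split(".") if s.strip()]


def first_sentence(sentences: List[str]) -> str:
    return sentences[0] + "." if sentences else ""


result = summarize_audio(b"TextRank ranks sentences. It is unsupervised.",
                         fake_recognize, split_sentences, first_sentence)
print(result)
```

Because each stage is passed in as a function, any one of them could be swapped (for example, a real speech-to-text API for the recognizer) without touching the other two.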

Objective
• The goal of getting a machine to understand fluently spoken speech and respond in a natural voice has been
driving speech research for more than 50 years.
• The main objective of a text summarization system is to identify the most important information from the
given text and present it to the end users.
• In this project, the text converted from the professor's lecture is given as input to the system, and an
extractive summary is produced by identifying text features and scoring the sentences accordingly.
• The text is first pre-processed to tokenize the sentences and perform stemming.
• The professor's audio lecture is recognized using a speech-to-text conversion model, and a text
summarization algorithm is then used to summarize the key points of the lecture.
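The pre-processing step mentioned above (sentence tokenization followed by stemming) might look roughly like the sketch below. The suffix-stripping stemmer is a deliberately naive stand-in, assumed here for illustration only; a real system would more likely use an established stemmer such as Porter's.

```python
import re
from typing import List

# Naive suffix list; an assumed stand-in for a real stemmer such as Porter's.
SUFFIXES = ("ing", "ed", "es", "s")


def split_sentences(text: str) -> List[str]:
    """Split on '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def naive_stem(word: str) -> str:
    """Strip one common suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text: str) -> List[List[str]]:
    """Return one list of stemmed, lower-cased tokens per sentence."""
    result = []
    for sentence in split_sentences(text):
        words = re.findall(r"[a-z]+", sentence.lower())
        result.append([naive_stem(w) for w in words])
    return result


tokens = preprocess("The professor explained ranking. Students asked questions!")
print(tokens)
```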

METHODOLOGY
• Speech must be converted from physical sound to an electrical signal with a microphone and then to digital
data with an analog-to-digital converter.
• Once digitized, several models can be used to transcribe the audio to text.
• The intention of text summarization is to create a coherent and fluent summary having only the main points
outlined in the document.
• Text summarization reduces reading time, accelerates the process of researching for information, and
increases the amount of information that can fit in an area.
• TextRank is a general-purpose, graph-based ranking algorithm for NLP. It is an extractive and unsupervised
text summarization technique.
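The digitization step in the first bullet above (analog signal to digital data) can be illustrated by sampling and quantizing a pure tone. This is a minimal sketch under assumed parameters: an 8 kHz sample rate and 8-bit signed samples are chosen only for illustration, and the sine wave stands in for the microphone signal.

```python
import math
from typing import List


def sample_and_quantize(freq_hz: float, n_samples: int,
                        sample_rate: int = 8000, bits: int = 8) -> List[int]:
    """Sample a sine tone and round each sample to a signed integer code."""
    max_code = 2 ** (bits - 1) - 1           # 127 for 8-bit signed samples
    samples = []
    for n in range(n_samples):
        t = n / sample_rate                   # time of the n-th sample
        amplitude = math.sin(2 * math.pi * freq_hz * t)   # value in [-1, 1]
        samples.append(round(amplitude * max_code))        # quantization
    return samples


# One full period of a 1 kHz tone at 8 kHz is exactly eight samples.
digital = sample_and_quantize(freq_hz=1000, n_samples=8)
print(digital)
```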
Three main approaches to speech recognition:
1) Acoustic-phonetic approach
2) Pattern recognition approach
3) Artificial intelligence approach

1) Acoustic-phonetic approach: speech is assumed to consist of a finite set of phonetic units. Spectral
analysis of the speech signal extracts features for segmentation and labeling, from which a valid word is
recognized.
2) Pattern recognition approach: the speech recognizer is first trained, and the unknown speech is then
compared directly with the learned patterns.
3) Artificial intelligence approach: a hybrid approach that combines ideas from the acoustic-phonetic
and pattern recognition approaches. It is used to solve complex tasks.
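As an illustration of the pattern recognition approach, the sketch below compares an unknown feature sequence against stored templates and picks the closest one. Dynamic time warping (DTW) is used as the comparison measure; this is an assumption made for the example, since the slides do not say which measure is intended, and the one-dimensional "features" and the words "yes" and "no" are made-up toy data.

```python
from typing import Dict, List


def dtw_distance(a: List[float], b: List[float]) -> float:
    """Dynamic time warping cost between two 1-D feature sequences."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed alignments.
            cost[i][j] = step + min(cost[i - 1][j],
                                    cost[i][j - 1],
                                    cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


def recognize(unknown: List[float], templates: Dict[str, List[float]]) -> str:
    """Return the label of the template closest to the unknown sequence."""
    return min(templates, key=lambda w: dtw_distance(unknown, templates[w]))


# Toy "learned patterns"; real templates would be feature vectors per frame.
templates = {"yes": [1.0, 3.0, 4.0, 3.0, 1.0],
             "no": [4.0, 2.0, 1.0, 1.0, 2.0]}
unknown = [1.0, 1.0, 3.0, 4.0, 4.0, 3.0, 1.0]   # a slower "yes"
word = recognize(unknown, templates)
print(word)
```

DTW is chosen here because it tolerates differences in speaking rate: the unknown sequence is a time-stretched copy of the "yes" template, so its warping cost to that template is zero.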

Basic working of a Speech Recognition System
[Block diagram: Speech Signal, Pre-Processing, Feature Extraction, Language Modeling, Decoder]
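The feature extraction block of the diagram can be sketched as follows: the digitized signal is cut into short frames and a few numbers are computed per frame. Short-time energy and zero-crossing count are used here only as simple, assumed stand-ins; practical recognizers commonly use richer features such as MFCCs, and the frame length and hop size below are arbitrary.

```python
from typing import List, Tuple


def frame_signal(samples: List[float], frame_len: int, hop: int) -> List[List[float]]:
    """Cut the signal into frames (they overlap when hop < frame_len)."""
    return [samples[start:start + frame_len]
            for start in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame: List[float]) -> float:
    """Sum of squared samples in the frame."""
    return sum(x * x for x in frame)


def zero_crossings(frame: List[float]) -> int:
    """Number of sign changes between consecutive samples."""
    return sum(1 for prev, cur in zip(frame, frame[1:]) if prev * cur < 0)


def extract_features(samples: List[float], frame_len: int = 4,
                     hop: int = 4) -> List[Tuple[float, int]]:
    """One (energy, zero-crossing count) pair per frame."""
    return [(short_time_energy(f), zero_crossings(f))
            for f in frame_signal(samples, frame_len, hop)]


# Four samples of silence followed by four samples of a rapid oscillation.
signal = [0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 1.0, -1.0]
features = extract_features(signal)
print(features)
```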

TextRank Algorithm: Flow Diagram of the Algorithm
The TextRank algorithm proceeds as follows:

1) Concatenate all the text contained in the articles.
2) Split the text into individual sentences.
3) Find a vector representation (word embeddings) for each sentence.
4) Calculate the similarities between the sentence vectors and store them in a matrix.
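The steps above can be sketched end to end in a few lines. Two things in this sketch are assumptions rather than part of the slides: bag-of-words counts stand in for word embeddings so that the example needs no external model, and the final stages (treating the similarity matrix as a weighted graph, ranking sentences with a PageRank-style iteration, and keeping the top-ranked ones) follow the usual TextRank formulation, which the slide text does not spell out. The damping factor of 0.85 and the iteration count are conventional defaults, not values from the project.

```python
import math
import re
from collections import Counter
from typing import List


def sentence_vector(sentence: str) -> Counter:
    """Bag-of-words counts; an assumed stand-in for word embeddings."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def textrank(sentences: List[str], damping: float = 0.85,
             iterations: int = 50) -> List[float]:
    n = len(sentences)
    if n == 0:
        return []
    vectors = [sentence_vector(s) for s in sentences]
    # Similarity matrix with a zero diagonal (no self-loops).
    sim = [[cosine(vectors[i], vectors[j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Each sentence receives score from its neighbours, weighted
        # by edge similarity (PageRank on a weighted graph).
        scores = [(1 - damping) / n + damping * sum(
                      sim[j][i] / out_weight[j] * scores[j]
                      for j in range(n) if out_weight[j] > 0)
                  for i in range(n)]
    return scores


def summarize(text: str, k: int = 1) -> List[str]:
    """Return the k top-ranked sentences in their original order."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


text = ("Speech recognition converts speech to text. "
        "TextRank ranks sentences in the text. "
        "The weather was sunny.")
summary = summarize(text, k=1)
print(summary)
```

In this toy input the middle sentence shares a word with each of the other two, so it is the best-connected node in the graph and is selected as the one-sentence summary.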

Proposed System:

Existing System:

References:
1. S. K. Gaikwad et al., "A review on speech recognition technique", International Journal of Computer
Applications, vol. 10, no. 3, November 2010.
2. D. Mayank and R. K. Aggarwal, "Implementing a speech recognition system interface for Indian languages",
Proc. of the JCNLP-08 Workshop on NLP, Hyderabad, India, January 2008, pp. 105-112.
3. K. Brady, M. Brandstein et al., "An evaluation of audio-visual person recognition on the XM2VTS corpus
using the Lausanne protocol", MIT Lincoln Laboratory, 244 Wood St., Lexington, MA.
4. Dazhi Yang and Allan N. Zhang, "Performing literature review using text mining, Part III: Summarizing
articles using TextRank", Singapore Institute of Manufacturing Technology.
5. Ali Toofanzadeh Mozhdehi, Mohamad Abdolahi and Shohreh Rad Rahimi, "Overview of extractive text
summarization".
6. Sonya Rapinta Manalu and Willy, "Stop words in review summarization using TextRank", School of
Computer Science.
7. K. Kuldeep and R. K. Aggarwal, "A Hindi speech recognition system for connected words using HTK", Int. J.
Computational Systems Engineering, vol. 1, no. 1, Haryana, India, 2012.

ANY SUGGESTIONS OR QUESTIONS TO IMPROVE OUR
PROJECT PROPOSAL?

THANK YOU
