
SPEECH ENHANCEMENT USING COMBINATION OF DIGITAL AUDIO EFFECTS WITH KALMAN FILTER

CHAPTER 1
INTRODUCTION TO MATLAB

1.0 INTRODUCTION:
MATLAB is a high-performance language for technical computing. It integrates computation,
visualization, and programming in an easy-to-use environment where problems and solutions are
expressed in familiar mathematical notation. Typical uses include:

• Math and computation
• Algorithm development
• Data acquisition
• Modeling, simulation, and prototyping
• Data analysis, exploration, and visualization
• Scientific and engineering graphics
• Application development, including graphical user interface building

MATLAB is an interactive system whose basic data element is an array that does not require
dimensioning. This allows you to solve many technical computing problems, especially those with
matrix and vector formulations, in a fraction of the time it would take to write a program in a scalar,
noninteractive language such as C or FORTRAN.

The name MATLAB stands for matrix laboratory. MATLAB was originally written to provide
easy access to matrix software developed by the LINPACK and EISPACK projects. Today, MATLAB
engines incorporate the LAPACK and BLAS libraries, embedding the state of the art in software for
matrix computation.

MATLAB has evolved over a period of years with input from many users. In university
environments, it is the standard instructional tool for introductory and advanced courses in
mathematics, engineering, and science. In industry, MATLAB is the tool of choice for high-
productivity research, development, and analysis.

MATLAB features a family of add-on application-specific solutions called toolboxes. Very
important to most users of MATLAB, toolboxes allow you to learn and apply specialized technology.
Toolboxes are comprehensive collections of MATLAB functions (M-files) that extend the MATLAB
environment to solve particular classes of problems. Areas in which toolboxes are available include
signal processing, control systems, neural networks, fuzzy logic, wavelets, simulation, and many
others.

MINI PROJECT DEPT OF ECE, AITS-HYD

1.1 THE MATLAB SYSTEM:

The MATLAB system consists of five main parts:

• Development Environment:

This is the set of tools and facilities that help you use MATLAB functions and files. Many of
these tools are graphical user interfaces. It includes the MATLAB desktop and command window, a
command history, an editor and debugger, and browsers for viewing help, the workspace, files, and the
search path.

• The MATLAB Mathematical Function Library:

This is a vast collection of computational algorithms ranging from elementary functions, like
sum, sine, cosine, and complex arithmetic, to more sophisticated functions like matrix inverse, matrix
eigenvalues, Bessel functions, and fast Fourier transforms.

• The MATLAB Language:

This is a high-level matrix/array language with control flow statements, functions, data
structures, input/output, and object-oriented programming features. It allows both “programming in the
small” to rapidly create quick and dirty throw-away programs, and “programming in the large” to
create large and complex application programs.

• Graphics:

MATLAB has extensive facilities for displaying vectors and matrices as graphs, as well as
annotating and printing these graphs. It includes high-level functions for two-dimensional and
three-dimensional data visualization, image processing, animation, and presentation graphics. It also
includes low-level functions that allow you to fully customize the appearance of graphics as well as to
build complete graphical user interfaces for your MATLAB applications.

• The MATLAB Application Program Interface (API):

This is a library that allows you to write C and FORTRAN programs that interact with MATLAB. It
includes facilities for calling routines from MATLAB (dynamic linking), calling MATLAB as a
computational engine, and for reading and writing MAT-files.

MATLAB offers various toolboxes for implementing recognition techniques; this project uses the
Image Processing Toolbox.

1.2 GRAPHICAL USER INTERFACE (GUI):

MATLAB’s Graphical User Interface Development Environment (GUIDE) provides a rich set
of tools for incorporating graphical user interfaces (GUIs) in M-functions. Using GUIDE, the
processes of laying out a GUI (i.e., its buttons, pop-up menus, etc.) and programming the operation of
the GUI are divided conveniently into two easily managed and relatively independent tasks. The
resulting graphical M-function is composed of two identically named (ignoring extensions) files:

• A file with extension .fig, called a FIG-file, that contains a complete graphical description
of all the function’s GUI objects or elements and their spatial arrangement. A FIG-file
contains binary data that does not need to be parsed when the associated GUI-based
M-function is executed.
• A file with extension .m, called a GUI M-file, which contains the code that controls the
GUI operation. This file includes functions that are called when the GUI is launched and
exited, and callback functions that are executed when a user interacts with GUI objects,
for example, when a button is pushed.

To launch GUIDE from the MATLAB command window, type

guide filename

where filename is the name of an existing FIG-file on the current path. If filename is omitted,
GUIDE opens a new (i.e., blank) window.


A graphical user interface (GUI) is a graphical display in one or more windows containing
controls, called components, that enable a user to perform interactive tasks. The user of the GUI does
not have to create a script or type commands at the command line to accomplish the tasks. Unlike
coding programs to accomplish tasks, the user of a GUI need not understand the details of how the
tasks are performed.

GUI components can include menus, toolbars, push buttons, radio buttons, list boxes, and
sliders, to name a few. GUIs created using MATLAB tools can also perform any type of
computation, read and write data files, communicate with other GUIs, and display data as tables or as
plots.


1.3 SOFTWARE DESCRIPTION:

Getting Started

If you are new to MATLAB, you should start by reading Manipulating Matrices. The most
important things to learn are how to enter matrices, how to use the : (colon) operator, and how to
invoke functions. After you master the basics, you should read the rest of the sections below and run
the demos.

At the heart of MATLAB is a new language you must learn before you can fully exploit its
power. You can learn the basics of MATLAB quickly, and mastery comes shortly after. You will be
rewarded with high-productivity, high-creativity computing power that will change the way you work.

Introduction - describes the components of the MATLAB system.

Development Environment - introduces the MATLAB development environment, including
information about tools and the MATLAB desktop.

Manipulating Matrices - introduces how to use MATLAB to generate matrices and perform
mathematical operations on matrices.

Graphics - introduces MATLAB graphic capabilities, including information about plotting data,
annotating graphs, and working with images.

Programming with MATLAB - describes how to use the MATLAB language to create scripts
and functions, and manipulate data structures, such as cell arrays and multidimensional arrays.


CHAPTER 2
INTRODUCTION TO TITLE

Speech plays a vital role in daily communication and in human-machine interfacing. The
production and perception of speech have therefore been an active research area for decades.
However, the quality and intelligibility of speech are significantly degraded by background
noise, which impairs our ability to understand other speakers and causes errors in
human-machine interfacing. In a real-time environment, hardly any signal escapes noise. This
becomes a serious problem when a message has to be delivered from one place to another: the
message signal must be cleaned up or enhanced without giving up any of its intelligibility
(its content, not just its clarity). Since speech is the mode of communication everywhere,
speech enhancement is required whenever the signal comes into contact with a real-time
environment.

Modeling the human speech production process helps in enhancing speech. However, speech is a
highly non-stationary signal, which makes this process difficult to model. Although speech is
highly non-stationary, it is stationary over very short periods of time. Based on this fact,
classical speech enhancement techniques model short segments of speech, but these short-time
models do not capture the effects of the noise, because noise has long-term characteristics.
Such long-term characteristics are naturally taken care of in the autoregressive (AR)
approach, in which the speech signal is modeled as a whole rather than on a short-time basis.
The AR model is also known to represent unvoiced speech well. It is less appropriate for
voiced speech, which is often quite periodic in nature. This motivated us to look into speech
models that describe both voiced and unvoiced speech satisfactorily and allow the long-term
characteristics of noise to be exploited.

Speech enhancement is an area of speech processing where the goal is to improve the
intelligibility and/or pleasantness of a speech signal. The most common approach in speech
enhancement is noise removal, where, by estimating the noise characteristics, we can cancel
the noise components and retain only the clean speech signal. The basic problem with this
approach is that if we remove those parts of the signal that resemble noise, we are also
bound to remove those parts of the speech signal that resemble noise. In other words, speech
enhancement procedures often inadvertently corrupt the speech signal when attempting to
remove noise. Algorithms must therefore compromise between effectiveness of noise removal and
level of distortion in the speech signal.

Current speech processing algorithms can roughly be divided into three domains: spectral
subtraction, sub-space analysis and filtering algorithms. Spectral subtraction algorithms
operate in the spectral domain by removing, from each spectral band, the amount of energy
that corresponds to the noise contribution. While spectral subtraction is effective in
estimating the spectral magnitude of the speech signal, the phase of the original signal is
not retained, which produces a clearly audible distortion known as “ringing”. Sub-space
analysis operates in the autocorrelation domain, where the speech and noise components can be
assumed to be orthogonal, so that their contributions can be readily separated.
Unfortunately, finding the orthogonal components is computationally expensive. Moreover, the
orthogonality assumption is difficult to motivate. Finally, filtering algorithms are
time-domain methods that attempt either to remove the noise component (Wiener filtering) or
to estimate the noise and speech components by a filtering approach (Kalman filtering).

Speech enhancement was initially attempted using the Kalman filter alone, but the results did
not meet the requirement. We therefore divided the entire signal into short segments called
windows, using different windowing techniques such as rectangular windowing and Hamming
windowing. We iterated the process a few times, updating the autoregressive filter
coefficients after every repetition. Even though the process takes a long time even for a
short speech signal, the output can be compared with the input for similarity.
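
The windowing step described above can be illustrated with a minimal Python sketch. It is not the project's MATLAB code: the function names hamming and split_frames, and the frame length and hop size in the usage line, are assumptions chosen for illustration. The Hamming window uses the common 0.54/0.46 coefficients, and passing no window corresponds to rectangular windowing.

```python
import math


def hamming(length):
    """Return a Hamming window with the common 0.54 / 0.46 coefficients."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (length - 1))
            for i in range(length)]


def split_frames(signal, frame_len, hop, window=None):
    """Cut a signal into frames of frame_len samples, advancing by hop samples.

    window=None leaves each frame untouched (rectangular windowing);
    otherwise each frame is multiplied sample by sample with the window.
    Samples at the end that do not fill a whole frame are dropped.
    """
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        if window is not None:
            frame = [s * w for s, w in zip(frame, window)]
        frames.append(frame)
    return frames


# Hypothetical usage: 50 % overlapping Hamming-windowed frames of 4 samples.
frames = split_frames([float(i) for i in range(10)], 4, 2, hamming(4))
```

A tapered window such as Hamming reduces the discontinuities at frame edges, at the cost of needing overlapping frames so that every sample is well represented.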

The primary objective of many speech enhancement algorithms is to improve the perceptual
quality of the speech signal extracted from noisy speech. Noise estimation is the major
component of speech enhancement techniques, because better noise estimation gives
higher-quality speech extraction. Removing noise from noisy speech remains challenging,
because the spectral properties of non-stationary noise are very difficult to estimate and
predict. Noise estimation also needs care: if the noise power exceeds the speech power, speech
content may be removed because it is treated as noise. Because speech processing is widely
used in applications such as teleconferencing systems, speech-recognition-based security
devices, biomedical signal processing, hearing aids, ATMs and computers, speech enhancement
is an active research area in signal processing. It remains challenging because in most cases
only the noisy speech is available. Over the years, researchers have developed many efficient
algorithms to improve noisy speech, yet the problem still poses a challenge because the
characteristics of the noise signal vary dramatically over time and from application to
application.

Over the last ten years, researchers have proposed many filtering-based speech enhancement
techniques, such as spectral subtraction, Wiener filtering and Kalman filtering. Spectral
subtraction is used to enhance speech degraded by additive stationary background noise, but it
suffers from musical noise and does not remove noise during silence periods. In
Wiener-filter-based speech enhancement, the original speech signal is recovered by minimizing
the mean square error (MSE) between the clean speech and the estimated signal. Spectral
subtraction and Wiener-filter-based algorithms require the characteristics of clean speech,
but in real-time use clean speech may not be available in all cases.

From the literature, we found several other techniques for enhancing speech: recovering speech
from the noisy signal using the harmonic structure of the speech signal, adopting a sinusoidal
model, and the MMSE estimator introduced by Ephraim. The use of the Kalman filter for speech
enhancement, with the speech signal parameters estimated from clean speech before it is
corrupted by white noise, was proposed first, and was later extended to random and colored
noise. In these methods a trade-off must be maintained between SNR and intelligibility.

Later, many modifications were made to the Kalman filter to improve it, but they did not meet
expectations and also increased complexity. In this paper, a new adaptive Kalman-filter-based
method with lower complexity and better performance is proposed. It combines the Kalman filter
with a nonlinear digital filter called a digital expander to recover the speech signal from
noisy speech. The additive noise is modeled as an AR process based on linear prediction
coefficient (LPC) estimation in the Kalman filtering algorithm. In addition to coefficient
estimation, this paper addresses the problem of de-noising random and colored noise. We
assume that the colored noise is also an autoregressive process, so we estimate its AR
coefficients and variance by linear prediction estimation in the same way.


In this paper, to overcome the problem stated above, a new adaptive Kalman-filter-based
method is proposed. It uses a digital audio effect called a digital expander as a
preprocessing step to recover the speech signal from a sequence (frame) of noisy speech
signals, and the additive noise is modeled as an AR process. The estimation of the
time-varying autoregressive (AR) speech model parameters is based on linear prediction
coefficient (LPC) estimation. In addition to coefficient estimation, this paper addresses the
problem of de-noising colored noise. We assume that the noise is also an autoregressive
process, so we estimate its AR coefficients and variances by LPC in the same way. In this
paper the content is organized as follows.


CHAPTER 3
EXISTING WORK

The Wiener filter was proposed by Norbert Wiener during the 1940s and published in 1949.
Its purpose is to reduce the amount of noise in a signal. This is done by comparing the
received signal with an estimate of the desired noiseless signal. The Wiener filter is not an
adaptive filter, as it assumes the input to be stationary. The aim of the process is to achieve
the minimum mean square error. In signal processing, the Wiener filter is a filter used to
produce an estimate of a desired or target random process by linear time-invariant
filtering of an observed noisy process, assuming known stationary signal and noise spectra, and
additive noise. The Wiener filter is similar to spectral subtraction in the way it is derived,
and it attempts to minimize the mean-square error in the frequency domain. A noisy signal
s(n) can be expressed as

\[ s(n) = x(n) + y(n) \]

Here x(n) is the clean speech signal and y(n) is the additive noise signal. In the frequency
domain this equation becomes

\[ S(f) = X(f) + Y(f) \]

where X(f) is the signal spectrum and Y(f) is the noise spectrum. The Wiener filter is
written as

\[ H(f) = \frac{P_{XX}(f)}{P_{XX}(f) + P_{YY}(f)} \]

where P_{XX}(f) is the signal power spectrum and P_{YY}(f) is the noise power spectrum.
Dividing the numerator and denominator of this equation by P_{YY}(f) and letting

\[ \mathrm{SNR}(f) = \frac{P_{XX}(f)}{P_{YY}(f)} \]

gives

\[ H(f) = \frac{\mathrm{SNR}(f)}{1 + \mathrm{SNR}(f)} \]

This equation gives an important insight into how noise reduction systems work: they use
a function of the estimated SNR to change the spectral amplitudes of signals corrupted by
noise. For good speech quality, the SNR still needs to be improved.
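
As a small illustration of the gain rule above, the following Python sketch computes the Wiener gain SNR/(1 + SNR) for each frequency band and applies it to a noisy spectrum. The function names wiener_gain and apply_wiener are hypothetical, and the per-band signal and noise powers are assumed to be known or estimated elsewhere, which in practice is the hard part.

```python
def wiener_gain(signal_power, noise_power):
    """Wiener gain H = SNR / (1 + SNR) for a single frequency band."""
    if noise_power <= 0.0:
        return 1.0  # assumption: a noise-free band is passed unchanged
    snr = signal_power / noise_power
    return snr / (1.0 + snr)


def apply_wiener(noisy_spectrum, signal_power, noise_power):
    """Scale each band of a noisy spectrum by its Wiener gain."""
    return [wiener_gain(ps, pn) * s
            for s, ps, pn in zip(noisy_spectrum, signal_power, noise_power)]


# Hypothetical usage: a band with SNR = 1 is halved, a band with no
# speech power (SNR = 0) is suppressed completely.
enhanced = apply_wiener([2.0, 2.0], [1.0, 0.0], [1.0, 1.0])
```

Bands with high SNR pass almost unchanged, while bands dominated by noise are attenuated toward zero.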

CHAPTER 4
PROPOSED WORK

4.1 SPEECH:

Speech is the process associated with the production and perception of the sounds
used in spoken language. A large number of disciplines study speech and speech
sounds, including acoustics, psychology, speech pathology, linguistics, cognitive science
and computer science. Spoken language is used to communicate information from a speaker
to a listener. Speech production and perception are both important components of the
speech chain.

4.2 SPEECH PERCEPTION:

Speech perception refers to the processes by which humans interpret and understand the
sounds used in language. The study of speech perception is closely linked to phonetics
and phonology. Speech perception research seeks to understand how humans recognize
speech sounds and use this information to understand spoken language. Speech research
has applications in building computer systems that can recognize speech, as well as in
improving recognition for hearing-impaired listeners. Many biological and psychological
factors can affect speech, including disorders of the lungs or vocal cords and respiratory
conditions, among others.

4.3 SPEECH COMMUNICATIONS:


Speech is the primary form of human communication. For that reason, there is a strong
trend to expand and improve telecommunications. Nowadays, people use communication
devices (telephones, mobile phones, the internet) almost as a basic necessity, and
customers demand wide coverage and high quality. However, background noise is an
important handicap. Combined with other distortions, it can seriously damage the
service quality. In addition to this human-human interaction, there is also human-machine
interaction based on a graphical user interface. Even today, however, computers lack
human abilities such as speaking, listening, understanding and learning.

We live in a noisy world! In all applications (telecommunications, hands-free
communications, recording, human-machine interfaces, etc.) that require at least one
microphone, the signal of interest is usually contaminated by background noise and
reverberation. As a result, the microphone signal has to be “cleaned” with digital signal
processing tools before it is played out, transmitted, or stored. Speech processing is the
study of speech signals and of the methods used to process them. The signals are
usually processed in a digital representation, so speech processing can be seen as the
intersection of digital signal processing and natural language processing. Speech
processing can be divided into the following categories:
• Speech recognition, which deals with analysis of the linguistic content of a speech signal.
• Speaker recognition, where the aim is to recognize the identity of the speaker.
• Enhancement of speech signals (the area of this project).
• Speech coding, a specialized form of data compression, which is important in
telecommunications.
• Voice analysis for medical purposes, such as analysis of vocal loading and dysfunction
of the vocal cords.
• Speech synthesis: the artificial synthesis of speech, which usually means
computer-generated speech.

Speech processing has many applications. One example is a telephone ticket sales system
in which, without the need for an operator, a customer can buy tickets with different
characteristics and options thanks to word recognition.

Figure 1: Speech processing


Figure 1 is a representation of speech that ensures that the information content can be
easily extracted by human listeners or computers.


CHAPTER 5
SPEECH PROCESSING

5.1 SPEECH ENHANCEMENT:

What is speech enhancement? Enhancement means an improvement in the value or quality of
something. Applied to speech, it means improving the intelligibility and/or quality of a
degraded speech signal by using signal processing tools. Speech enhancement refers not only
to noise reduction but also to dereverberation and the separation of independent signals.
The field is fundamental to research in applications of digital signal processing, and it is
also of great interest to industry, which is always looking for new solutions that are both
effective and practical. It is a very difficult problem for two reasons. First, the nature
and characteristics of the noise signals can change dramatically over time and between
applications, so it is difficult to find algorithms that really work in different practical
environments. Second, the performance measure can be defined differently for each
application. Two criteria are often used to measure performance: quality and
intelligibility. It is very hard to satisfy both at the same time.

Speech enhancement is an area of speech processing where the goal is to improve the
intelligibility and/or pleasantness of a speech signal. The most common approach in
speech enhancement is noise removal, where, by estimating the noise characteristics, we can
cancel the noise components and retain only the clean speech signal. The basic problem with
this approach is that if we remove those parts of the signal that resemble noise, we are also
bound to remove those parts of the speech signal that resemble noise. In other words,
speech enhancement procedures often inadvertently corrupt the speech signal when
attempting to remove noise. Algorithms must therefore compromise between effectiveness of
noise removal and level of distortion in the speech signal. Current speech processing
algorithms can roughly be divided into three domains, spectral subtraction, sub-space analysis
and filtering algorithms:
• Spectral subtraction algorithms operate in the spectral domain by removing, from each
spectral band, the amount of energy that corresponds to the noise contribution.
• Sub-space analysis operates in the autocorrelation domain, where the speech and noise
components can be assumed to be orthogonal, so that their contributions can be readily
separated. Unfortunately, finding the orthogonal components is computationally
expensive. Moreover, the orthogonality assumption is difficult to motivate.
• Filtering algorithms are time-domain methods that attempt either to remove the noise
component (Wiener filtering) or to estimate the noise and speech components by a filtering
approach (Kalman filtering).

An important speech enhancement algorithm belongs to the group of parametric methods, in
which the speech signal is modeled as an autoregressive process embedded in Gaussian noise.
Speech enhancement algorithms belonging to this category consist of two steps:
• Estimation of the AR coefficients and noise variances.
• Application of Kalman filtering, using the estimated parameters, to estimate the clean
speech from a sample of the noisy signal.
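
The first of the two steps above, estimating the AR coefficients and the driving-noise variance, is commonly done by linear prediction with the Levinson-Durbin recursion. The Python sketch below is an illustration under stated assumptions, not the project's MATLAB code: the function names autocorr and levinson_durbin are hypothetical, a biased autocorrelation estimate is assumed, and the coefficients follow the convention x(n) ≈ a[0]·x(n-1) + a[1]·x(n-2) + ...

```python
def autocorr(x, max_lag):
    """Biased autocorrelation estimate r[0..max_lag] of a sequence."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) / n
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for an AR model of the given order.

    r is the autocorrelation sequence r[0..order].
    Returns (a, err): a[i] multiplies x(n-1-i) in the predictor and
    err is the prediction-error (driving-noise) variance.
    """
    a = []
    err = r[0]
    for m in range(order):
        # Reflection coefficient for model order m + 1.
        acc = r[m + 1] - sum(a[i] * r[m - i] for i in range(m))
        k = acc / err
        # Update the lower-order coefficients and append the new one.
        a = [a[i] - k * a[m - 1 - i] for i in range(m)] + [k]
        err *= (1.0 - k * k)
    return a, err


# Hypothetical usage: the exact autocorrelation of an AR(1) process with
# coefficient 0.5 yields that coefficient and a zero second coefficient.
coeffs, variance = levinson_durbin([1.0, 0.5, 0.25], 2)
```

On real data, the autocorrelation would be computed per frame with autocorr, and the returned coefficients and variance would feed the Kalman filter in the second step.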

5.2 SPEECH MODELING:

Speech modeling studies how humans produce the voice. Nowadays many devices “speak” to us,
and their voice should be as similar as possible to a real human voice. For that reason,
much research aims to find a good model of speech production, such as the one in Figure 2.

Figure 2: Voice production model


In this model, we first decide whether the sound that we want to produce is voiced or
unvoiced. The excitation signal is then passed through a filter that imitates the effect of
the shape formed by the pharyngeal cavity (throat), the vocal cavity and the nasal cavity.
Finally, the radiation model reproduces the effect of the radiation impedance that the air
presents to the speech as it leaves the mouth.
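
The three stages of this production model can be sketched in Python as follows. This is an illustrative simplification under several assumptions that are not taken from the report: the voiced excitation is an impulse train, the unvoiced excitation is white Gaussian noise, the cavities are represented by an all-pole filter, and the radiation effect is approximated by a first-order difference. All function names and parameter values are hypothetical.

```python
import random


def excitation(n, voiced, pitch_period=80, seed=0):
    """Voiced: impulse train with the given period. Unvoiced: white noise."""
    if voiced:
        return [1.0 if i % pitch_period == 0 else 0.0 for i in range(n)]
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def all_pole_filter(x, a):
    """Vocal tract stand-in: y(n) = x(n) + sum_i a[i] * y(n-1-i)."""
    y = []
    for n, xn in enumerate(x):
        feedback = sum(a[i] * y[n - 1 - i]
                       for i in range(len(a)) if n - 1 - i >= 0)
        y.append(xn + feedback)
    return y


def radiation(x, c=0.97):
    """Radiation stand-in: first-order difference y(n) = x(n) - c * x(n-1)."""
    return [x[n] - (c * x[n - 1] if n > 0 else 0.0) for n in range(len(x))]


def synthesize(n, voiced, a):
    """Chain excitation, all-pole filter and radiation model."""
    return radiation(all_pole_filter(excitation(n, voiced), a))


# Hypothetical usage: 200 samples of a voiced sound with a stable
# second-order filter (coefficients chosen only for illustration).
voiced_sound = synthesize(200, True, [1.5, -0.8])
```

The all-pole filter here is the same AR structure that the enhancement method relies on, which is why AR modeling fits speech production naturally.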


CHAPTER 6
FILTERING METHODS

6.1 KALMAN FILTERING:

The Kalman filter originated as a recursive solution to the linear filtering problem for
discrete data. The research was set in the wide context of state-space models, where the core
is estimation through recursive least squares. Since then, owing to the development of digital
computation, the Kalman filter has been researched and applied widely, particularly in
autonomous and assisted navigation, missile tracking and economics. The study of the Kalman
filter builds on the Wiener filter. Kalman filtering is one of the most effective speech
enhancement techniques; the speech signal is usually modeled as an autoregressive (AR)
process and represented in the state-space domain. A Kalman filter is an estimation and
updating process. Using the notation of Chapter 3, the speech signal and the additive noise
signal are denoted x(n) and y(n) respectively and expressed as p-th order autoregressive (AR)
models as follows:

\[ x(n) = \sum_{i=1}^{p} a_i \, x(n-i) + u(n) \]

\[ y(n) = \sum_{i=1}^{p} b_i \, y(n-i) + w(n) \]

The noisy speech can be expressed as

\[ s(n) = x(n) + y(n) \]

where x(n) is the n-th sample of the speech signal, y(n) is the n-th sample of the additive
noise, s(n) is the n-th sample of the noisy speech, a_i and b_i are the AR model parameters,
and u(n) and w(n) are the white driving noises of the two models. The AR-modeled speech
signal can be expressed in state-space form, written here in general form, as

\[ X(n) = F \, X(n-1) + G \, W(n) \]

\[ s(n) = H \, X(n) + v(n) \]

where X(n) is the state vector, W(n) is the driving noise vector with covariance Q, F is the
transition matrix built from the AR coefficients, G is the input matrix, H is the observation
vector, and v(n) is the measurement noise with variance R. The state vector has dimension d+1
and the driving noise vector has dimension q. From these equations, noise suppression can be
done by calculating the error covariance and the Kalman gain in two steps:

Estimation: state vector propagation, parameter covariance matrix propagation

Updating: compute Kalman gain, state vector update, parameter covariance matrix update

The coefficients in the above equations are updated every time frame by using the following
discrete Kalman filter update equations:

\[ \hat{X}(n|n-1) = F \, \hat{X}(n-1|n-1) \]

\[ P(n|n-1) = F \, P(n-1|n-1) \, F^{T} + G \, Q \, G^{T} \]

\[ K(n) = P(n|n-1) \, H^{T} \left[ H \, P(n|n-1) \, H^{T} + R \right]^{-1} \]

\[ \hat{X}(n|n) = \hat{X}(n|n-1) + K(n) \left[ s(n) - H \, \hat{X}(n|n-1) \right] \]

\[ P(n|n) = \left[ I - K(n) \, H \right] P(n|n-1) \]

These parameters are updated for each iteration.
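
The estimation and updating steps can be sketched in Python for the simplest case. This is a minimal illustration under stated assumptions, not the method of this report: the speech is an AR(p) process with known coefficients, the additive noise is white (not colored) with known variance, and there is no digital expander stage. The function names kalman_ar_denoise and demo and all numeric values are hypothetical. The state holds the last p speech samples, so the transition matrix is the companion matrix of the AR coefficients and the observation picks the first state element.

```python
import random


def kalman_ar_denoise(noisy, a, q_var, r_var):
    """Kalman filter for AR(p) speech observed in white noise.

    a      -- AR coefficients, x(n) = sum_i a[i] * x(n-1-i) + u(n)
    q_var  -- variance of the driving noise u(n)
    r_var  -- variance of the additive white measurement noise
    Returns the filtered estimate of x(n) for every sample.
    """
    p = len(a)
    x = [0.0] * p                                   # state estimate
    P = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    estimates = []
    for y in noisy:
        # Estimation: propagate the state through the companion matrix F.
        xp = [sum(a[i] * x[i] for i in range(p))] + x[:p - 1]
        # Estimation: propagate the covariance, Pp = F P F^T + Q.
        FP = [[sum(a[k] * P[k][j] for k in range(p)) for j in range(p)]]
        FP += [P[i][:] for i in range(p - 1)]
        Pp = [[sum(row[k] * a[k] for k in range(p))] + row[:p - 1]
              for row in FP]
        Pp[0][0] += q_var
        # Updating: Kalman gain for the observation H = [1, 0, ..., 0].
        s = Pp[0][0] + r_var
        K = [Pp[i][0] / s for i in range(p)]
        # Updating: correct the state and the covariance.
        innovation = y - xp[0]
        x = [xp[i] + K[i] * innovation for i in range(p)]
        P = [[Pp[i][j] - K[i] * Pp[0][j] for j in range(p)]
             for i in range(p)]
        estimates.append(x[0])
    return estimates


def demo(n=2000, seed=1):
    """Filter a synthetic AR(2) signal; return (noisy MSE, filtered MSE)."""
    rng = random.Random(seed)
    a = [1.5, -0.8]
    clean = [0.0, 0.0]
    for _ in range(n):
        clean.append(a[0] * clean[-1] + a[1] * clean[-2] + rng.gauss(0.0, 1.0))
    clean = clean[2:]
    noisy = [c + rng.gauss(0.0, 2.0) for c in clean]
    est = kalman_ar_denoise(noisy, a, 1.0, 4.0)
    mse_noisy = sum((v - c) ** 2 for v, c in zip(noisy, clean)) / n
    mse_est = sum((e - c) ** 2 for e, c in zip(est, clean)) / n
    return mse_noisy, mse_est
```

Calling demo() compares the mean square error before and after filtering on synthetic data. In a real system, the coefficients and variances would be re-estimated for every frame by linear prediction rather than assumed known.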

6.2 WIENER FILTER:

This filter is the precursor of the Kalman filter. The goal of the Wiener filter is to remove
the noise from a corrupted signal. In general, two processes affect the signal that we want
to measure:
• First, every device introduces an error in the output when a signal is measured. If our
original signal is x_k and the response of the device is h_k, the signal at the output is

\[ y_k = \sum_{j} h_j \, x_{k-j} \]

• Second, noise n_k is added to the output signal by the measurement process:

\[ c_k = y_k + n_k \]

If there is no noise and the response is known, the solution is easy to find in the frequency
domain, where X(f), Y(f) and H(f) are the Fourier transforms of x_k, y_k and h_k:

\[ X(f) = \frac{Y(f)}{H(f)} \]

When noise is present, we should instead find the optimal Wiener filter. This kind of filter
was proposed by Norbert Wiener. It is based on a statistical approach to reducing the amount
of noise in the corrupted signal. Normally, filters are designed for a specific frequency
response, but for a Wiener filter we first need knowledge of the spectral properties of the
original signal and the noise, and we then look for the LTI filter whose output is as close
as possible to the original signal. Wiener filters are characterized by the following
concepts:
• Assumption: signal and (additive) noise are stationary linear stochastic processes with
known spectral characteristics or known autocorrelation and cross-correlation.
• Requirement: the filter must be physically realizable, i.e. causal (this requirement can be
dropped, resulting in a non-causal solution).
• Performance criterion: minimum mean-square error.

6.3 KALMAN FILTER:

The Kalman filter is a mathematical procedure that operates through a prediction and
correction mechanism. In essence, the algorithm predicts a new state from its previous
estimate and adds a correction term proportional to the prediction error. In this way, the
error is statistically minimized.
A Kalman filter is simply an optimal recursive data processing algorithm. The meaning
of the word optimal depends on the criterion chosen for evaluation. The Kalman filter is
called optimal in the sense that it incorporates all the information provided to it. It
processes all the available measurements, regardless of their precision, to estimate the
current value of the variables of interest, using:
• Knowledge of the system and the measurement devices.
• A statistical description of the system noise, the measurement errors and the uncertainty
in the dynamic models.
• Any available information about the initial conditions of the variables under study.

A Kalman filter combines all these data with knowledge of the system dynamics
to generate the best estimate of the variables of interest. We say that this is a
data processing algorithm because it is just a computer program running on a central processor.
The complete estimation procedure is as follows:

• The model is formulated in state space and, for a given initial set of parameters, the
model prediction errors are generated by the filter. These are used recursively to evaluate
the likelihood function until it is maximized.

CHAPTER 7
ALGORITHM

7.1 THE DISCRETE ALGORITHM OF KALMAN FILTER:

The Kalman filter consists of a set of mathematical equations that give an optimal
recursive solution by the least-squares method. The goal of this solution is to
calculate a linear, unbiased, minimum-variance estimator of the state at time t, based on the
information available at t-1, and to update these estimates with the additional
information available at t (Clar et al. 1998). The filter is developed assuming that the system
can be described by a stochastic linear model in which the errors associated with both the
system and the additional information incorporated into it have a normal
distribution with zero mean and a given variance. The solution is optimal in that the
filter combines all the observed information and the previous knowledge about the
system behavior to produce a state estimate whose error is statistically minimized. The
term recursive means that the filter recalculates the solution each time a new observation or
measurement is added to the system. The Kalman filter is the main algorithm for estimating
dynamic systems represented in state-space form. In this representation the system is described
by a set of variables called state variables. The state contains all the information about the
system at a certain point in time. This information must permit the inference of the past system
behavior, with the goal of predicting its future behavior. What makes the filter so
interesting is its ability to estimate the state of the system in the past, present and future,
even when the precise nature of the system is unknown. In practice, the individual state
variables of a dynamic system cannot be determined exactly by direct measurement. Instead, they
are measured through stochastic processes that carry some uncertainty.

7.2 THE ALGORITHM:

The Kalman filter estimates the process using a form of feedback control: it estimates
the state of the process at some moment in time and then obtains feedback from the observed data.
The equations used to derive the Kalman filter can be separated
into two groups.
The two groups are the equations that update the time, or prediction equations, and the
equations that incorporate the observed data, or update equations. The first group of
equations projects the state forward to moment n, taking as reference the state at moment n-1,
and performs the intermediate update of the state covariance matrix. The second group of
equations takes care of the feedback: it incorporates new information into the previous
estimate to obtain an improved estimate of the state. The equations that update the
time can be seen as prediction equations, while the equations that incorporate new information
can be seen as correction equations. Indeed, the final estimation algorithm can be described
as a prediction-correction algorithm. In this way, the Kalman
filter works through a projection and correction mechanism: it predicts the new state and its
uncertainty and then corrects the projection with the new measurement. This cycle is shown in
the following figure.

Figure 3: The Kalman filter cycle


The first step is to generate a forecast of the state forward in time, taking into
account all the information available at that moment, and the second step is to generate
an improved forecast of the state, so that the error is statistically minimized. The
equations for the state prediction are as follows:
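
In the standard form of the discrete Kalman filter, the two prediction equations can be written as shown here, using the symbols A and Rw from the text. The hatted state estimate and the error covariance P are assumed notation and may differ from the symbols used in the original figure.

```latex
\begin{align}
\hat{x}_{n|n-1} &= A\,\hat{x}_{n-1|n-1} \\
P_{n|n-1} &= A\,P_{n-1|n-1}\,A^{T} + R_w
\end{align}
```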

Notice how the equations project the state and covariance estimates forward from
moment n-1 to n. These two formulas give an estimate of xn and its covariance
before the actual sample is available. The first Kalman equation predicts the
next state from the previous state. The second Kalman equation gives the covariance matrix
of the predicted estimation error. The matrix A relates the state at the previous moment n-1
to the state at the current moment n; this matrix may change from one moment to the
next. Rw represents the covariance of the random perturbation of the process (the process
noise). The equations for the state correction are as follows:
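
In the standard form of the discrete Kalman filter, the three correction equations are usually written as shown here, where the predicted state and its error covariance carry the subscript n|n-1. The observation matrix C, the measurement noise covariance R_v and the gain symbol K_n are assumptions; the text itself refers to the gain as Re,n.

```latex
\begin{align}
K_n &= P_{n|n-1}\,C^{T}\left(C\,P_{n|n-1}\,C^{T} + R_v\right)^{-1} \\
\hat{x}_{n|n} &= \hat{x}_{n|n-1} + K_n\left(y_n - C\,\hat{x}_{n|n-1}\right) \\
P_{n|n} &= \left(I - K_n C\right)P_{n|n-1}
\end{align}
```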

These are used once the actual sample yn is available. For that reason, they are also called
update equations. The first task in correcting the state projection is the
calculation of the Kalman gain, Re,n. This gain is chosen so that it
minimizes the error covariance of the new state estimate. The next step is to measure
the process to obtain yn and generate a new state estimate that incorporates the new
observation. The final step is to compute a new estimate of the error covariance using the
last equation. After each pair of time and measurement updates, the process is repeated,
taking the new state estimate and error covariance as the starting point. This recursive
nature is one of the best-known characteristics of the Kalman filter. The next figure shows
the complete operation of the filter, combining the previous figure and the five Kalman
equations.
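
The prediction-correction cycle described above can be sketched for a scalar state as follows. This is a minimal illustration rather than the project's MATLAB code: the function name `kalman_1d`, the parameter names (`a`, `c`, `r_w`, `r_v`, `x0`, `p0`) and their default values are assumptions, with `a` and `r_w` playing the roles of the matrix A and the covariance Rw in the text.

```python
def kalman_1d(observations, a=1.0, c=1.0, r_w=1e-3, r_v=0.1, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns the filtered state estimate for each sample.

    a   : state transition coefficient (assumed known)
    c   : observation coefficient (assumed known)
    r_w : process noise variance (assumed value)
    r_v : measurement noise variance (assumed value)
    """
    x_est, p_est = x0, p0
    estimates = []
    for y in observations:
        # Prediction (time update): project state and error covariance forward.
        x_pred = a * x_est
        p_pred = a * p_est * a + r_w
        # Correction (measurement update): gain, then state, then covariance.
        gain = p_pred * c / (c * p_pred * c + r_v)
        x_est = x_pred + gain * (y - c * x_pred)
        p_est = (1.0 - gain * c) * p_pred
        estimates.append(x_est)
    return estimates
```

Each pass through the loop uses the five equations once: two for prediction and three for correction. The estimate and covariance from one pass are the starting point of the next, which is the recursive behavior the text describes.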

The main aim of this work is speech enhancement using the Kalman filter. Initially, we
took a clean audio input signal, which is combined with different noise files to
produce the corresponding noisy outputs. We also took babble noise, at an SNR of 10 dB, and
calculated its LPC coefficients. We then added the babble noise to the clean speech at an SNR
of 10 dB. The result is the noisy speech that is given to the Kalman filter as the observed
data. Because speech is stationary only over short intervals, we took short frames of speech by
windowing. In this work we ran the algorithm with two windowing techniques, rectangular and
Hamming. We took each frame length to be 240 samples. The segmented noisy speech is
saved as a matrix in which each row holds the samples of one window of 240 samples, and the
algorithm loops over the rows, taking one window at a time. For each window we calculate the
LPC coefficients of the noisy speech signal and the Kalman gain, which is used to update
the next state. Looping is needed because past samples influence
future samples. Finally, after the iterative process, the SNR of the Kalman filter output is
calculated and compared with that of other techniques. The figure below shows the mechanism of
the Kalman filter in speech enhancement.
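
The framing step described above can be sketched as follows. The helper names `hamming` and `frame_signal` are hypothetical, and two details are assumptions the text does not settle: the frames do not overlap, and a trailing partial frame is dropped.

```python
import math


def hamming(length):
    """Hamming window: w(k) = 0.54 - 0.46*cos(2*pi*k/(length-1))."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * k / (length - 1))
            for k in range(length)]


def frame_signal(signal, frame_len=240, window="hamming"):
    """Split a signal into windowed frames, one row per frame.

    Assumptions: frames do not overlap and a trailing partial frame is dropped.
    """
    if window == "hamming":
        w = hamming(frame_len)
    elif window == "rectangular":
        w = [1.0] * frame_len
    else:
        raise ValueError("window must be 'hamming' or 'rectangular'")
    n_frames = len(signal) // frame_len
    return [[signal[i * frame_len + k] * w[k] for k in range(frame_len)]
            for i in range(n_frames)]
```

The returned list of rows corresponds to the matrix of windows mentioned in the text, so the per-frame LPC analysis and Kalman filtering can loop over it one row at a time.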

Figure 4: Mechanism of Kalman filters in speech enhancement


The use of the Kalman filter for speech enhancement in the form presented here follows earlier
work. This method, however, is best suited to the reduction of white noise, which complies with
the Kalman assumptions. In deriving the Kalman equations it is normally assumed that the
measurement noise (the additive noise present in the observation vector) is uncorrelated and
has a normal distribution. This assumption gives the noise its white character. There are,
however, different methods developed to fit the Kalman approach to colored noise. It is assumed
that the speech signal is stationary during each frame, that is, the AR model of the speech
remains the same across the segment. To fit the one-dimensional speech signal to the state-space
model of the Kalman filter, we introduce the state vector as:

X(n) = [x(n-p+1), x(n-p+2), ..., x(n)]^T

where x(n) is the speech signal at time n. The speech signal is contaminated by additive white
noise N(n):

y(n) = x(n) + N(n)

The speech signal can be modeled as an AR process of order p:

x(n) = a1 x(n-1) + a2 x(n-2) + ... + ap x(n-p) + u(n)

where the ai are the AR (LP) coefficients and u(n) is the prediction error, which is assumed to
have a normal distribution, u(n) ~ N(0, Q). Substituting these equations gives the state equation:

X(n) = A X(n-1) + G u(n)

where A is the p×p transition matrix built from the AR coefficients. G has length p (the LP
order), and the observation equation is:

y(n) = H X(n) + N(n)

where H = G^T.

The Kalman filter, like other recursive methods, uses the whole history of the series,
but with one advantage: it estimates a stochastic path for the coefficients instead of
a deterministic one. In this way it solves the possible estimation cut when structural
changes happen. The Kalman filter uses the least-squares method to recursively generate a
state estimator at moment k that is linear, unbiased and of minimum variance. The filter
is in line with the Gauss-Markov theorem, which gives the Kalman filter its
enormous power to solve a wide range of problems in statistical inference. The filter is
distinguished by its ability to estimate the state of a model in the past, present and future,
even when the exact nature of the modeled system is unknown. Among the disadvantages of the
filter is the need to know the initial conditions of the mean
and variance of the state vector in order to start the recursive algorithm. There is no general
consensus on how to determine the initial conditions. When the filter is developed for
autoregressive models, the results are conditioned on the past information of the variable under
study. In this sense, the forecast of the series over time reflects the inertia that the system
currently has, and such forecasts are efficient only in the short term.

7.3 ADAPTIVE KALMAN FILTERING:

One earlier approach estimates the noise and driving-process variances using the properties of
the innovation sequence obtained after a preliminary Kalman filtering with an initial
gain. Here the signal is modeled as an AR process, and a Kalman-filter-based method is
obtained by reformulating and adapting the approach proposed for control applications
by Carew and Belanger. This method avoids the explicit estimation of the noise and driving-
process variances by estimating the optimal Kalman gain directly. After a preliminary Kalman
filtering with an initial sub-optimal gain, an iterative procedure is derived to estimate the
optimal Kalman gain using the properties of the innovation sequence. The performance of
this algorithm is compared with that of alternative speech enhancement algorithms based
on Kalman filtering. A distinct advantage of the proposed algorithm is that a voice
activity detector (VAD) is not required. Another advantage over the similarly structured
algorithm presented earlier is its lower computational load:
a filtering step is not required in the estimation of the optimal Kalman gain.

7.4 NOISY SPEECH MODEL AND KALMAN FILTERING:

The speech signal s(n) is modeled as a pth-order AR process

where s(n) is the nth sample of the speech signal, y(n) is the nth sample of the
observation, and ai(n) is the ith AR parameter. This system can be represented by the
following state-space model:

Where:
1. The sequences u(n) and v(n) are uncorrelated Gaussian white noise processes with zero means
and variances σu² and σv².

x(n) is the p×1 state vector.

2. F is the p×p transition matrix.

3. G and H are, respectively, the p×1 input vector and the 1×p observation row vector,
which are defined as follows:
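
One common way to build F, G and H for an AR model is the companion form sketched here. It may not match the original definitions exactly. The ordering of the state vector, the position of the AR coefficients in the last row of F, the sign convention of the AR model and the helper names `ar_state_space` and `step` are all assumptions made for illustration.

```python
def ar_state_space(a):
    """Return (F, G, H) for s(n) = sum_i a[i-1]*s(n-i) + u(n).

    Assumed state ordering: x(n) = [s(n-p+1), ..., s(n-1), s(n)]^T.
    """
    p = len(a)
    F = [[0.0] * p for _ in range(p)]
    for row in range(p - 1):
        F[row][row + 1] = 1.0                      # shift older samples up one slot
    F[p - 1] = [a[p - 1 - col] for col in range(p)]  # last row holds a_p ... a_1
    G = [0.0] * (p - 1) + [1.0]                    # driving noise enters the newest sample
    H = [0.0] * (p - 1) + [1.0]                    # observation picks out s(n)
    return F, G, H


def step(F, G, x, u):
    """One noise-driven state transition: x(n) = F x(n-1) + G u(n)."""
    p = len(x)
    return [sum(F[r][c] * x[c] for c in range(p)) + G[r] * u for r in range(p)]
```

With this layout the last element of the state vector is always the current speech sample, so multiplying the state by H returns s(n).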

The standard Kalman filter provides the updating equations for the state-vector estimator:

x̂(n|n-1) = F x̂(n-1|n-1)
e(n) = y(n) - H x̂(n|n-1)
x̂(n|n) = x̂(n|n-1) + K(n) e(n)

where x̂(n|n-1) is the minimum mean-square estimate of the state vector x(n) given the past
observations y(1), ..., y(n-1), x̂(n|n) is the filtered estimate of the state vector x(n),
e(n) is the innovation sequence and K(n) is the Kalman gain. The estimated speech signal can be
retrieved from the state-vector estimator:

ŝ(n) = H x̂(n|n)

The noise variances σu² and σv² are needed to compute the Kalman gain K(n). However,
the transition matrix and the Kalman gain are unknown and hence must be estimated.

7.5 VAD ALGORITHM:

The decision about the presence of voice activity is the sensitive part of the whole spectral
subtraction algorithm, because the noise power estimate can be significantly degraded by
errors in voice activity detection. VAD accuracy dramatically affects the level of noise
suppression and the amount of speech distortion that occurs. Many different techniques have been
applied to VAD. In early VAD algorithms, short-time energy, zero-crossing
rate and linear prediction coefficients were among the common features used in the detection
process. Energy-based VADs are frequently used because of their low computational cost.
They work on the principle that the energy of the signal is compared with a threshold
that depends on the noise level. Speech is detected when the energy estimate lies above the
threshold. A dynamical energy-based VAD is used in the proposed enhanced spectral
subtraction method. In classical energy-based algorithms, the detector cannot track the threshold
value accurately, especially when the speech signal is mostly voice-active and the noise level
changes considerably before the next noise level re-calibration instant. The dynamical VAD
was proposed to make this classification more accurate than other energy-
based techniques.
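
The energy-threshold principle described above can be sketched as follows. This is a simple illustration and not the dynamical VAD the report uses. The function names, the threshold factor `k`, the smoothing constant `alpha` and the use of the first few frames as a noise-only reference are all assumptions.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def energy_vad(frames, init_noise_frames=5, k=3.0, alpha=0.9):
    """Return one boolean per frame: True where speech is detected.

    The noise energy is initialized from the first frames (assumed to be
    speech-free) and is tracked only during frames classified as noise.
    """
    n_init = min(init_noise_frames, len(frames))
    noise = sum(frame_energy(f) for f in frames[:n_init]) / max(n_init, 1)
    noise = max(noise, 1e-12)              # floor to avoid a zero threshold
    decisions = []
    for f in frames:
        energy = frame_energy(f)
        is_speech = energy > k * noise     # threshold depends on the noise level
        if not is_speech:
            # Slowly adapt the noise estimate during non-speech frames.
            noise = max(alpha * noise + (1.0 - alpha) * energy, 1e-12)
        decisions.append(is_speech)
    return decisions
```

Because the threshold follows the running noise estimate, it can adapt to slow changes in the background level, but only while frames are being classified as noise. This is the limitation of classical energy-based detectors that the text mentions.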

Figure 5: VAD algorithm

The spectral subtraction algorithm is historically one of the first algorithms proposed
for noise reduction, and is perhaps one of the most popular algorithms. It is based on a
simple principle. Assuming additive noise, one can obtain an estimate of the clean signal
spectrum by subtracting an estimate of the noise spectrum from the noisy speech spectrum.
The enhanced signal is obtained by computing the inverse discrete Fourier transform of the
estimated signal spectrum using the phase of the noisy signal. The algorithm is
computationally simple as it only involves a single forward and inverse Fourier transform.

CHAPTER 8
DIGITAL AUDIO EFFECTS

Digital audio effects can be classified as basic filtering, time-varying filters, delays,
modulators, nonlinear processing and special effects. Nonlinear digital filters are
characterized by creating, intentionally or unintentionally, harmonic and inharmonic
frequency components that are not present in the original signal. In dynamics processing, the
signal envelope is controlled, with harmonic distortion kept to a minimum, using compressors or
limiters. A digital expander, like a limiter, is a dynamics processor; here it is used to
minimize the distortion in the speech. The expander operates at low signal levels to boost the
dynamics of the signal, and it is useful for creating a more lively sound characteristic. The
level of the signal x(n) is determined from the input with variable attack and release times.
The logarithm of this level is compared with the threshold value. If the level is above the
threshold, the difference is multiplied by the negative slope of the limiter, LS, and the
antilogarithm of the result is taken. The control factor f(n) obtained is then smoothed with a
first-order low-pass filter. If the level lies below the threshold, the control factor is set to
f(n) = 1. The delayed input x(n-1) is multiplied by the smoothed control factor to give the
output. The figure below shows a block diagram of the digital expander. The logarithm of the
signal level is taken and multiplied by 0.5. The value obtained is compared with two thresholds
in order to determine the operating range of the static curve. If a threshold is crossed, the
resulting difference is multiplied by the corresponding slope and the antilogarithm of the
result is taken. A first-order low-pass filter subsequently provides the attack and release times.
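
The gain computation this chapter describes can be sketched as follows, covering the single-threshold path only: level detection with attack and release, comparison with a threshold in the log domain, multiplication by the slope, antilogarithm, first-order smoothing and a one-sample input delay. The function name `dynamics_gain` and all parameter values are illustrative assumptions, and the two-threshold static curve mentioned at the end of the paragraph is not implemented.

```python
import math


def dynamics_gain(x, threshold_db=-20.0, slope=1.0,
                  attack=0.3, release=0.01, smooth=0.2):
    """Apply a threshold/slope gain computer to the samples in x.

    threshold_db, slope, attack, release and smooth are assumed values.
    """
    peak = 0.0      # running level estimate of |x(n)|
    g = 1.0         # smoothed control factor
    prev = 0.0      # one-sample delay line for the input
    out = []
    for sample in x:
        mag = abs(sample)
        # Level detector: fast coefficient when rising, slow when falling.
        coeff = attack if mag > peak else release
        peak = (1.0 - coeff) * peak + coeff * mag
        level_db = 20.0 * math.log10(max(peak, 1e-10))
        if level_db > threshold_db:
            # Difference to threshold times negative slope, then antilogarithm.
            f = 10.0 ** (-slope * (level_db - threshold_db) / 20.0)
        else:
            f = 1.0
        g = (1.0 - smooth) * g + smooth * f   # first-order low-pass smoothing
        out.append(prev * g)                  # delayed input times control factor
        prev = sample
    return out
```

With `slope=1.0` any level above the threshold is pulled back to the threshold, while signals below the threshold pass unchanged apart from the one-sample delay.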

Figure 6: Block diagram of the digital expander

CHAPTER 9
ADVANTAGES AND DISADVANTAGES

9.1 ADVANTAGES:

• It avoids the influence of possible structural changes on the result. The recursive
estimation starts from an initial sample and updates the estimates by adding one
new observation at a time until the end of the data. This implies that the most recent
coefficient estimates are affected by the distant history; in the presence of structural
changes the data series can be cut. This cut can be corrected through sequential
estimation, but at the price of a larger standard error. The Kalman filter, like other
recursive methods, uses the whole history of the series, but with one advantage: it
estimates a stochastic path for the coefficients instead of a deterministic one. In this
way it solves the possible estimation cut when structural changes happen.
• The Kalman filter uses the least-squares method to recursively generate a state
estimator at moment k that is linear, unbiased and of minimum variance. The filter is
in line with the Gauss-Markov theorem, which gives the Kalman filter its
enormous power to solve a wide range of problems in statistical inference.
• The filter is distinguished by its ability to estimate the state of a model in the past,
present and future, even when the exact nature of the modeled system is unknown. The
dynamic modeling of a system is one of the key features that distinguish the
Kalman method.

9.2 DISADVANTAGES:

• Among the disadvantages of the filter is the need to know the initial
conditions of the mean and variance of the state vector in order to start the recursive
algorithm. There is no general consensus on how to determine the initial conditions.
• The development of the Kalman filter, as found in the original paper, presupposes
a broad knowledge of probability theory, specifically of the
Gaussian condition for the random variables, which can be a limitation for its study
and application.
• When it is developed for autoregressive models, the results are conditioned on the past
information of the variable under study. In this sense, the forecast of the series over
time reflects the inertia that the system currently has, and such forecasts are efficient
only in the short term.

CHAPTER 10
SPECTRAL SUBTRACTION METHOD

10.1 THE PRINCIPLE OF SPECTRAL SUBTRACTION:

Spectral subtraction is based on the principle that the enhanced speech can be
obtained by subtracting the estimated spectral components of the noise from the
spectrum of the noisy input signal. Assuming that the noise w(n) is additive to the speech
signal x(n), the noisy speech y(n) can be written as

y(n) = x(n) + w(n)

where n is the time index. The objective of speech enhancement is to find the
enhanced speech x̂(n) from the given y(n), under the assumption that w(n) is uncorrelated with
x(n). The time-domain signals can be transformed to the frequency domain as

Yk = Xk + Wk

where Yk, Xk and Wk denote the short-time DFT magnitudes of y(n), x(n) and
w(n), respectively, raised to a power a (a = 1 corresponds to magnitude spectral subtraction,
a = 2 corresponds to power spectral subtraction). If an estimate Ŵk of the noise spectrum can
be obtained, then an approximation X̂k of the speech can be obtained from

X̂k = Yk - Ŵk

The noise spectrum cannot be calculated precisely, but it can be estimated during
periods when no speech is present in the input signal. Most single-channel spectral
subtraction methods use a voice activity detector (VAD) to determine when there is silence
in order to obtain an accurate noise estimate. The noise is assumed to be short-term stationary,
so that noise from silent frames can be used to remove noise from speech frames. The figure
below shows a block diagram of the spectral subtraction method. The harshness of the subtraction
can be varied by applying a scaling factor α. Values of α higher than 1
give a higher SNR in the denoised signal, but values that are too high may distort
the perceived speech quality. The subtraction with the scaling factor α can be
expressed as:

X̂k = Yk - α Ŵk

After subtraction, the spectral magnitude is not guaranteed to be positive. The negative
components can be removed by half-wave rectification (setting
the negative portions to zero) or by full-wave rectification (taking the absolute value).

Figure 7: Spectral subtraction algorithm block-diagram

An inverse Fourier transform, using the phase components taken directly from the Fourier
transform unit, followed by overlap-add, is then performed to reconstruct the speech estimate
in the time domain.
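
The per-frame processing described in this section can be sketched as follows: transform the frame, subtract the scaled noise estimate, apply half-wave rectification, and rebuild the signal with the noisy phase. The function names are hypothetical, a plain O(N²) DFT is used to keep the example self-contained, and windowing and overlap-add between frames are left out.

```python
import cmath
import math


def dft(x):
    """Discrete Fourier transform (direct O(N^2) form, for illustration)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def spectral_subtract(frame, noise_mag, a=2.0, alpha=1.0):
    """Spectral subtraction on one frame.

    noise_mag : estimated noise magnitude per DFT bin (same length as frame)
    a         : 1.0 for magnitude subtraction, 2.0 for power subtraction
    alpha     : scaling factor that controls the harshness of the subtraction
    """
    noisy = dft(frame)
    cleaned = []
    for y_k, w_k in zip(noisy, noise_mag):
        diff = abs(y_k) ** a - alpha * (w_k ** a)
        mag = max(diff, 0.0) ** (1.0 / a)          # half-wave rectification
        cleaned.append(cmath.rect(mag, cmath.phase(y_k)))  # reuse the noisy phase
    return idft(cleaned)
```

Replacing `max(diff, 0.0)` with `abs(diff)` would give the full-wave rectification variant mentioned above.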

10.2 MUSICAL NOISE:

Although the spectral subtraction method provides an improvement in terms of noise
attenuation, it often produces a new, randomly fluctuating type of noise, referred to as
musical noise because of its narrow-band spectrum and tone-like
characteristics. This phenomenon can be explained by noise estimation errors that lead
to false peaks in the processed spectrum. When the enhanced signal is reconstructed in
the time domain, these peaks result in short sinusoids whose frequencies vary from frame
to frame. Musical noise, although very different from the original noise, can sometimes be
very disturbing. A poorly designed spectral subtraction that causes musical noise can
sometimes result in a signal of lower perceived quality and lower information
content than the original noisy signal. Much current research is
focused on ways to combat the problem of musical noise. It is almost impossible to
minimize musical noise without affecting the speech, and hence there is a tradeoff
between the amount of noise reduction and speech distortion. For this reason several
perceptually based approaches have been introduced in which, instead of completely eliminating
the musical noise (and introducing distortion), the noise is masked by taking advantage of
the masking properties of the auditory system.

CHAPTER 11
IMPLEMENTATION AND EXPERIMENTAL RESULTS

The figure below shows the block diagram for the combination of a digital audio effect with the
modified adaptive Kalman filter based speech enhancement method. The MATLAB code
was developed from it. Noisy speech is generated from clean speech taken from the
NOIZEUS database, to which random values (random noise or colored noise) are added.
It is then passed through a digital expander, whose output is shown in the figures that
follow. The expanding factor of the digital expander is set to 0.5. The signal is then applied
to the iterative modified Kalman filter to suppress the noise. We set the AR
model order of the Kalman filter to P = 20. These 20 AR coefficients are updated for every time
frame of 25 ms duration, which is cut out by a Hamming window and analyzed using linear
prediction (LPC) analysis. The additive measurement noise is assumed to be stationary within
each short frame. The LPC estimation order is taken as 13 for both the noisy speech
and the noise signals. The number of iterations is set to 7. Real noisy signals
(NOIZEUS database) at 0 dB, 5 dB, 10 dB and 15 dB are considered for the performance analysis,
with a Hamming window.
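
The per-frame linear prediction analysis can be sketched with the autocorrelation method and the Levinson-Durbin recursion. This is an illustrative stand-in for the LPC routine used in the MATLAB code. The function names `autocorr` and `lpc` are hypothetical, and the predictor sign convention s(n) ≈ Σ a_i s(n-i) is an assumption.

```python
def autocorr(frame, max_lag):
    """Autocorrelation r(0..max_lag) of one frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t - lag] for t in range(lag, n))
            for lag in range(max_lag + 1)]


def lpc(frame, order):
    """Levinson-Durbin recursion.

    Returns (a, err) where s(n) is predicted as sum_i a[i-1]*s(n-i)
    and err is the final prediction error energy.
    """
    r = autocorr(frame, order)
    if r[0] == 0.0:
        return [0.0] * order, 0.0            # silent frame: nothing to model
    a = []
    err = r[0]
    for m in range(1, order + 1):
        if err <= 1e-12 * r[0]:
            a = a + [0.0] * (order - len(a))  # signal already fully predicted
            break
        acc = r[m] - sum(a[i] * r[m - 1 - i] for i in range(m - 1))
        k = acc / err                          # reflection coefficient
        a = [a[i] - k * a[m - 2 - i] for i in range(m - 1)] + [k]
        err *= (1.0 - k * k)
    return a, err
```

Calling `lpc(frame, 20)` on each Hamming-windowed frame would give the 20 AR coefficients that the text says are updated every frame.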

We observed and tabulated the results of the basic spectral subtraction, Wiener
filter and Kalman filter methods and compared them with the digital audio effect based Kalman
filtering method. Compared with all these methods, the proposed algorithm gives a greater
improvement in SNR as well as in intelligibility. The corresponding waveforms are shown below.
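
The SNR figures used for this comparison can be computed as sketched here. The helper names are hypothetical. The sketch assumes that the clean reference signal is available and time-aligned with the processed signal, and that the residual noise is the sample-by-sample difference between the two.

```python
import math


def power(x):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def snr_db(clean, processed):
    """SNR in dB, treating (processed - clean) as the residual noise."""
    residual = [p - c for p, c in zip(processed, clean)]
    noise_power = power(residual)
    if noise_power == 0.0:
        return float("inf")
    return 10.0 * math.log10(power(clean) / noise_power)


def mix_at_snr(clean, noise, target_db):
    """Scale the noise so that clean + noise has the requested input SNR."""
    gain = math.sqrt(power(clean) / (power(noise) * 10.0 ** (target_db / 10.0)))
    return [c + gain * n for c, n in zip(clean, noise)]
```

For example, `mix_at_snr(clean, noise, 5.0)` builds a 5 dB test input, and the improvement of a method is then `snr_db(clean, enhanced) - snr_db(clean, noisy)`.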

Experimental results show that the proposed technique is more effective for speech
enhancement than the conventional Kalman filter. The results and waveforms for the iterative
Kalman filter and the proposed method are shown below.

Figure 8: Block diagram for combination of Digital Audio Effect with modified adaptive Kalman filter

The figures that follow show the output waveform of the iterative adaptive Kalman filter, the
output of the proposed method (combination of digital audio effects with Kalman filter), a
comparison between the clean speech and the speech recovered using the proposed method, and the
output spectrograms of the Kalman and proposed methods. The table gives the corresponding SNR
values.

Figure 9: Input speech signal, 5 dB.

Figure 10: Noise signal.

Figure 11: Noisy speech vs. digital expander output.

Figure 12: Noisy speech signal.

Figure 13: Combination of Digital Audio Effect with Kalman Filter based Speech
Enhancement recovered output

Figure 14: Signal obtained by spectral subtraction at 5dB.

Figure 15: Output speech signal, which closely matches the input speech signal. This
is the required clean speech.

CHAPTER 12
CONCLUSION

In this paper a speech enhancement method based on an improved spectral subtraction (SS)
algorithm was introduced. For effective noise reduction with minimal distortion, the proposed
algorithm takes into account perceptual aspects of the human ear. The experimental results
show that the proposed method reduces background noise effectively in comparison with the
commonly used SS algorithm. The proposed method yields a greater improvement in SNR and
a considerable improvement in perceptual speech quality in comparison with the conventional
spectral subtraction method.
In the present study, an improved method for speech enhancement that combines
digital audio effect techniques with an improved adaptive Kalman filter technique is
proposed. We discussed the drawbacks of basic speech enhancement methods such as
spectral subtraction and the Wiener filter. Although other Kalman
filter based speech enhancement methods give better results than a
conventional Kalman filter, they involve more complexity, which makes processing more
time-consuming. We proposed a method that overcomes the disadvantages of earlier methods in
terms of performance and speed. All these methods were simulated in MATLAB, and the input and
output SNR values of the respective methods were compared. The performance of the proposed
method was analyzed at different input SNR levels. It is observed that the proposed method gives
better output SNR values and that its performance is superior for both stationary
and non-stationary signals.

CHAPTER 13
BIBLIOGRAPHY

13.1 REFERENCES:

[1] M. Berouti, R. Schwartz, and J. Makhoul, "Enhancement of speech corrupted by acoustic
noise," in Proceedings of the International Conference on Acoustics, Speech and Signal
Processing, 1979, pp. 208-211.
[2] T. Esch and P. Vary, "Efficient musical noise suppression for speech enhancement system,"
IEEE International Conference on Acoustics, Speech and Signal Processing, Taipei, Taiwan,
2009, pp. 4409-4412.
[3] E. Dong and X. Pu, "Speech denoising based on perceptual weighting filter," 9th International
Conference on Signal Processing, Leipzig, Germany, 2008, pp. 705-708.
[4] V. Prasad, R. Sangwan et al., "Comparison of voice activity detection algorithms for
VoIP," Proc. of the Seventh International Symposium on Computers and Communications,
Taormina, Italy, 2002, pp. 530-532.
[5] K. Sakhnov, E. Verteletskaya, and B. Šimák, "Dynamical Energy-Based Speech/Silence
Detector for Speech Enhancement Applications," in World Congress of Engineering 2009
Proceedings, Hong Kong, 2009, pp. 801-806.
[6] S. Ogata and T. Shimamura, "Reinforced spectral subtraction method to enhance speech
signal," Proceedings of IEEE International Conference on Electrical and Electronic
Technology, 2001, vol. 1, pp. 242-245.
[7] P. Pollák, "Speech signals database creation for speech recognition and speech
enhancement applications," associate professor inaugural dissertation, CTU in Prague, FEE,
Prague, 2002.
[8] J. Benesty, J. Chen, and S. Makino, Eds., "Speech Enhancement," Springer, Berlin, 2005.
[9] S. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE
Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-27, no. 2, pp. 113-120, 1979.
[10] T. V. Sreenivas and P. Kirnapure, "Codebook constrained Wiener filtering for speech
enhancement," IEEE Trans. Speech and Audio Processing, Sep. 1996.
