
Noise Suppression Techniques for Speech Enhancement Using Adaptive Filtering
Derek Shiell
03/09/2006
ECE 463: Project Presentation
Professor Michael Honig

Overview

Objective/Problem Description
Applications
Overview of Noise Reduction Methods
System Description
Filter analysis
  Linear methods
    Wiener approximation
    KLT preprocessing
    Signal subspace embedding
  Kalman filter based methods
  Non-linear methods
Current results
Future work
Implementation/practical considerations
Conclusions

Objective/Problem Description

The goal of my project was to research noise reduction techniques for the front-end processing of an automatic speech recognition system, using a single microphone and without an independent noise recording or clean reference signal.

Applications

Cell phone speech enhancement
Automatic speech recognition
Speaker identification
Biomedical signal processing

Image sources:
(1) http://images.businessweek.com/mz/04/45/techbuy/images/razr_phone.jpg
(2) http://www.nanopac.com/images/smnsbox.jpg
(3) http://ldt.stanford.edu/~sgilutz/Shulis_Portfolio/fall/hci/images/sensory.jpg

Overview of Speech
Enhancement

Microphone Array Processing

Utilizing multiple microphones, blind source separation (BSS) techniques such as independent component analysis (ICA) may be used to distinguish one speaker from other directional or diffuse noises.

Active echo/noise cancellation (ANC)

In this case, the echo or noise is estimated and regenerated with opposite phase to destructively interfere with the original echo or noise.

Blind noise suppression

In this case, there is a single speech signal corrupted by noise, no separate noise recording with which to make noise estimates, and no source signal to reference.

System Descriptions
ANC: Active noise cancellation with single microphone/speaker [4]

BSS/ICA: BSS based on frequency-domain ICA [6]

Blind Noise Reduction: Blind noise reduction schematic [1]

Filter Analysis

(1)

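Linear MMSE (Wiener approximation)

Noisy observation (speech plus additive noise):

$$ y_k = s_k + n_k $$

MMSE cost function, with estimate $\hat{s}_k = g(\mathbf{y}_k)$ and error $s_k - \hat{s}_k$:

$$ \min_g \; E\left[ \left( s_k - g(\mathbf{y}_k) \right)^2 \right] $$

Substituting $s_k = y_k - n_k$ and expanding:

$$ \min_g \; E\left[ \left( y_k - g(\mathbf{y}_k) \right)^2 \right] + 2E\left[ n_k \, g(\mathbf{y}_k) \right] - 2E\left[ y_k n_k \right] + E\left[ n_k^2 \right] $$

The last two terms do not depend on g, so the problem reduces to (frame length N):

$$ \min_g \; \frac{1}{N} \sum_{k=1}^{N} \left( y_k - g(\mathbf{y}_k) \right)^2 + 2E\left[ n_k \, g(\mathbf{y}_k) \right] $$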

Filter Analysis

(2)

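Linear Estimation (continued)

Signal is estimated from a linear filtering of the corrupted signal:

$$ \hat{s}_k = \mathbf{w}^T \mathbf{y}_k $$

Minimizing the MMSE cost function with respect to w gives:

$$ \mathbf{w} = R_y^{-1} \left( \mathbf{r}_y - \mathbf{r}_n \right) $$

Here $R_y$ is the autocorrelation matrix of the noisy signal, and $\mathbf{r}_y$ and $\mathbf{r}_n$ are the correlation vectors of $y_k$ and $n_k$ with the input vector $\mathbf{y}_k$. This is an approximation to the Wiener solution in which the cross-correlation vector p is estimated by $(\mathbf{r}_y - \mathbf{r}_n)$, similar to spectral subtraction.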
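
The closed-form weights above can be illustrated with a small NumPy sketch. It is only an illustration under stated assumptions, not the implementation from [1]: the helper names `delay_embed` and `wiener_approx_weights` are hypothetical, the noise autocorrelation is assumed to be known in advance, and the noise is assumed uncorrelated with the speech so that r_n can be taken from the noise autocorrelation alone.

```python
import numpy as np

def delay_embed(x, order):
    """Return rows [x[k], x[k-1], ..., x[k-order+1]] for every valid k."""
    windows = np.lib.stride_tricks.sliding_window_view(x, order)
    return windows[:, ::-1]

def wiener_approx_weights(y, noise_autocorr, order):
    """Estimate w = R_y^{-1} (r_y - r_n) from one frame of noisy samples.

    noise_autocorr[j] is assumed to hold E[n_k n_{k-j}] (an assumption of
    this sketch; in practice it must be estimated, e.g. from speech pauses).
    """
    Y = delay_embed(np.asarray(y, dtype=float), order)  # rows are y_k vectors
    target = Y[:, 0]                                     # current sample y_k
    R_y = Y.T @ Y / len(Y)                               # autocorrelation matrix
    r_y = Y.T @ target / len(Y)                          # E[y_k * y_k vector]
    r_n = np.asarray(noise_autocorr, dtype=float)[:order]
    return np.linalg.solve(R_y, r_y - r_n)
```

The enhanced samples would then be `delay_embed(y, order) @ w`. For white noise of variance sigma^2, `noise_autocorr` is simply `[sigma**2, 0, 0, ...]`.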

Filter Analysis

(3)

Linear estimation with Karhunen-Loève Transform (KLT)

Preprocessing the signal using the KLT (or PCA) separates the signal into its directions of greatest variance. Using the transform, the signal can be mapped into a lower-dimensional space, which helps decorrelate the signal from noise. For a changing signal this requires that U be adaptively updated.

Define U, the KLT transform, from the eigenvectors of $R_y$, the autocorrelation matrix of the noisy signal (taken here as the rows of U):

$$ U = \mathrm{eig}(R_y) $$

Using this transformation we can define the transformed $\mathbf{y}_k$ as:

$$ \tilde{\mathbf{y}}_k = U \mathbf{y}_k $$

The resulting closed-form solution for the weight vector is:

$$ \mathbf{w} = \left( U R_y U^T \right)^{-1} U \left( \mathbf{r}_y - \mathbf{r}_n \right) $$
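
A minimal sketch of the KLT-preprocessed estimator follows, assuming the convention that the rows of U are eigenvectors of R_y (so the transformed input is U y_k) and that R_y, r_y and r_n have already been estimated. The function names `klt_basis` and `klt_wiener_weights` and the `rank` parameter are hypothetical, and the basis is computed once rather than adaptively updated.

```python
import numpy as np

def klt_basis(R_y, rank):
    """Rows are the `rank` eigenvectors of R_y with the largest eigenvalues."""
    eigvals, eigvecs = np.linalg.eigh(R_y)   # eigenvalues in ascending order
    leading = eigvecs[:, ::-1][:, :rank]     # reorder to descending, keep `rank`
    return leading.T

def klt_wiener_weights(R_y, r_y, r_n, rank):
    """Solve for w in the transformed domain: (U R_y U^T) w = U (r_y - r_n)."""
    U = klt_basis(R_y, rank)
    R_tilde = U @ R_y @ U.T                  # diagonal up to rounding error
    w = np.linalg.solve(R_tilde, U @ (r_y - r_n))
    return U, w
```

The estimate for an input vector `y_k` would be `w @ (U @ y_k)`. With `rank` equal to the full filter order the result matches the untransformed solution; choosing a smaller `rank` discards the low-variance directions.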

Filter Analysis

(4)

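Signal subspace embedding

This method allows for a matrix of gain factors, W, rather than simply a weight vector, w (MIMO), so that a simultaneous block estimate of $\mathbf{s}_k$ can be made. In addition, the matrix Q can be chosen either as I or to taper the tap weights by some factor(s) such that $s_k$ is emphasized more in the minimization phase.

MMSE cost function:

$$ \min_{W} \; E\left[ \left( \mathbf{s}_k - W \tilde{\mathbf{y}}_k \right)^T Q \left( \mathbf{s}_k - W \tilde{\mathbf{y}}_k \right) \right] $$

Update equations for the filter matrix and transform basis can be found iteratively:

$$ W_{i+1} = W_i + \frac{\mu}{N} \sum_{k=1}^{N} \left[ Q \mathbf{e}_k \mathbf{y}_k^T U_i^T - Q R_n U_i^T \right] $$

$$ U_{i+1} = U_i + \frac{\mu}{N} \sum_{k=1}^{N} \left[ W_i^T Q \mathbf{e}_k \mathbf{y}_k^T - W_i^T Q R_n \right] $$

where $\mathbf{e}_k = \mathbf{y}_k - W_i U_i \mathbf{y}_k$ is the error measured against the noisy block, $R_n$ is the noise autocorrelation matrix, and $\mu$ is the step size.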

Filter Analysis

(5)

Kalman Filtering Approaches

Kalman filters are widely used in speech enhancement, and much theoretical work has been done analyzing them. The Kalman filter is the minimum mean-square estimator of the state of a linear dynamical system and can be used to derive many types of RLS filters. The extended Kalman filter handles nonlinear models through a linearization process.

Kalman filters have the advantages that they:

are more robust (stationarity is not assumed)
require only the previous estimate for the next estimation (versus all past values, for instance)
are computationally efficient

Standard linear state-space model for the Kalman filter:

$$ \mathbf{x}(n+1) = F(n+1, n) \, \mathbf{x}(n) + \mathbf{v}_1(n) $$

$$ \mathbf{y}(n) = C(n) \, \mathbf{x}(n) + \mathbf{v}_2(n) $$
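
The recursion implied by this state-space model can be sketched as below. This is a generic textbook Kalman filter rather than a speech-specific design: F and C are assumed time-invariant, the covariances `Q1` and `Q2` of v1 and v2 are assumed known, and the function name `kalman_filter` is hypothetical.

```python
import numpy as np

def kalman_filter(ys, F, C, Q1, Q2, x0, P0):
    """Filtered state estimates for x(n+1) = F x(n) + v1, y(n) = C x(n) + v2.

    Q1 and Q2 are the (assumed known) covariances of v1 and v2.
    x0 and P0 are the prior state mean and covariance before the first sample.
    """
    x, P = x0.astype(float), P0.astype(float)
    identity = np.eye(len(x))
    estimates = []
    for y in ys:
        # Measurement update: correct the prediction with the new observation.
        S = C @ P @ C.T + Q2                 # innovation covariance
        K = P @ C.T @ np.linalg.inv(S)       # Kalman gain
        x = x + K @ (y - C @ x)
        P = (identity - K @ C) @ P
        estimates.append(x.copy())
        # Time update: propagate the estimate to the next sample.
        x = F @ x
        P = F @ P @ F.T + Q1
    return np.array(estimates)
```

Each step uses only the previous estimate and its covariance, which is the property listed above. In a speech setting the state is often built from an autoregressive model of the speech signal, in which case F would hold the AR coefficients; that choice is left open here.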

Filter Analysis

(6)

Nonlinear filtering
Many nonlinear filtering methods exist to suppress noise in
noisy speech. Examples include filters based on neural networks
or phase space reconstruction. In general, they are very complex
to analyze, but do not require estimation of noise or speech
spectra and are not characterized by musical tone artifacts.

Feed forward neural network (1)

Phase space reconstruction for different speech phonemes [9]

(1) http://research.yale.edu/ysm/images/78.2/articles-neural-network.jpg

Typical Results
Figure: Segmental SNR results (left) and SNR results (below) for various linear and nonlinear noise reduction methods [8]. Panels: Noisy Speech Signal (white noise), Wiener Filtered, Ephraim Filtered.

Comparison of segmental SNR performance for different noise sources:

1) White noise (SNR 6.08 dB)
2) Pink noise (SNR 4.34 dB)
3) Factory noise (SNR 5.16 dB)
4) F16 noise (SNR 4.61 dB)

a) Linear estimation b) linear estimation with KLT preprocessing c) signal subspace embedding d) weighted signal subspace embedding e) NN with KLT f) linear with clean target g) nonlinear with clean target h) standard spectral subtraction method (3 dB segmental SNR ~ 5 dB SNR) [1]
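
For readers comparing the two metrics, the sketch below shows one common way to compute SNR and segmental SNR when a clean reference is available. The frame length and the clamping range of [-10, 35] dB are assumptions of this sketch (a frequently used convention), not values taken from [1] or [8], and the function names are hypothetical.

```python
import numpy as np

def snr_db(clean, estimate, eps=1e-12):
    """Global SNR in dB: signal energy over error energy for the whole signal."""
    error = clean - estimate
    return 10.0 * np.log10((np.sum(clean ** 2) + eps) / (np.sum(error ** 2) + eps))

def segmental_snr_db(clean, estimate, frame_len=256, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [lo, hi].

    Trailing samples that do not fill a whole frame are ignored.
    """
    n_frames = len(clean) // frame_len
    values = []
    for i in range(n_frames):
        seg = slice(i * frame_len, (i + 1) * frame_len)
        values.append(np.clip(snr_db(clean[seg], estimate[seg]), lo, hi))
    return float(np.mean(values))
```

Because segmental SNR averages dB values frame by frame, quiet frames weigh as much as loud ones, which is why the two metrics give different numbers for the same output.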

Future Work

Perform ASR after noise reduction filtering

AVICAR database
  Data collected in a car environment
  Time-varying SNR
  No independent noise recording (detecting speech is difficult)

Experiments
  KLT preprocessing + linear estimation (Wiener)
  Ephraim filter (ML short time spectral amplitude estimator)
  Nonlinear methods

Implementation/
Practical Considerations

Real-time processing
  Applications require computationally efficient algorithms to be feasible.

Determining noise sample
  With a single microphone, detecting speech in order to estimate noise statistics is difficult.
  Possible remedies: use visual information to detect speech, or use nonlinear noise reduction methods.

Conclusions

Noise suppression methods have become increasingly important due to the proliferation of mobile devices, ASR systems, and biometrics/bioinformatics

Speech enhancement is a very broad field
  Array processing for source separation, noise cancellation

Interested in blind noise reduction
  Linear, Linear + KLT preprocessing, Signal subspace embedding
  Kalman filter based methods, Non-linear methods

Using state-of-the-art noise reduction methods, typical SNR improvements are ~5 dB

Proposed experiments to test ASR improvement

References
1. Eric A. Wan and Rudolph van der Merwe, Noise-Regularized Adaptive Filtering for Speech Enhancement, Proc. Eurospeech, pp. 2643-2646, 1999.
2. Ki Yong Lee, Byung-Gook Lee, Iickho Song, and Souguil Ann, Robust Estimation of AR Parameters and its Application for Speech Enhancement, Proc. IEEE ICASSP, pp. 309-312, 1992.
3. Phil S. Whitehead, David V. Anderson, and Mark A. Clements, Adaptive, Acoustic Noise Suppression for Speech Enhancement, Proc. IEEE ICME, pp. 565-568, 2003.
4. A. V. Oppenheim, E. Weinstein, K. C. Zangi, M. Feder, and D. Gauger, Single Sensor Active Noise Cancellation Based on the EM Algorithm, Proc. IEEE ICASSP, pp. 277-280, 1992.
5. T. Rutkowski, A. Cichocki, and A. K. Barros, Speech Enhancement Using Adaptive Filters and Independent Component Analysis Approach, Proc. AISAT, 2000.
6. H. Saruwatari, K. Sawai, A. Lee, K. Shikano, A. Kaminuma, and M. Sakata, Speech Enhancement and Recognition in Car Environment Using Blind Source Separation and Subband Elimination Processing, Proc. ICA, pp. 367-372, 2003.
7. Simon Haykin, Adaptive Filter Theory, Prentice-Hall Inc., Upper Saddle River, NJ, pp. 466-501, 2002.
8. M. T. Johnson, A. C. Lindgren, R. J. Povinelli, and X. Yuan, Performance of Nonlinear Speech Enhancement using Phase Space Reconstruction, Proc. IEEE ICASSP, pp. 872-875, 2003.
9. Andrew C. Lindgren, Speech Recognition Using Features Extracted from Phase Space Reconstructions, Thesis, Marquette University, Milwaukee WI, May 2003.
