I. INTRODUCTION
Speech enhancement aims to improve the performance of
speech communication systems in noisy environments by
suppressing the noise and improving the perceptual quality
and intelligibility of the speech signal. It is an important
research field within speech signal processing, with
applications in areas such as voice communication and
automatic speech recognition.
Traditional speech de-noising techniques are predominantly
based on Wiener filtering [1], spectral subtraction [2] and
subspace filtering [3], which have attracted significant
interest and investigation because they are easy to design
and implement. Although these linear methods can reduce the
noise and improve the signal-to-noise ratio (SNR), they are
not very effective when signals contain sharp features or
impulses of short duration. To overcome these limitations, a
nonlinear approach that uses the Wavelet Transform (WT) to
reduce the noise has been proposed. It is based on the
assumption that signal magnitudes dominate the magnitudes of
the noise in a wavelet representation, so a threshold is set
to shrink the wavelet coefficients of the noise, and an
inverse wavelet transform is then applied to the remaining
coefficients to reconstruct the original speech.
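The shrinkage procedure described above (transform, threshold, inverse transform) can be illustrated with a minimal sketch. It assumes an orthonormal Haar wavelet and a fixed, hand-picked threshold purely for illustration; this passage does not name the wavelet basis, and the function names `haar_forward`, `haar_inverse` and `denoise` are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal, levels):
    """Multi-level orthonormal Haar DWT.

    Returns (approximation, details), where details[0] holds the finest scale.
    The signal length must be divisible by 2**levels.
    """
    if levels < 1 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = list(signal)
    details = []
    for _ in range(levels):
        half = len(approx) // 2
        sums = [(approx[2 * i] + approx[2 * i + 1]) / SQRT2 for i in range(half)]
        diffs = [(approx[2 * i] - approx[2 * i + 1]) / SQRT2 for i in range(half)]
        details.append(diffs)
        approx = sums
    return approx, details


def haar_inverse(approx, details):
    """Invert haar_forward, starting from the coarsest detail scale."""
    current = list(approx)
    for diffs in reversed(details):
        rebuilt = []
        for s, d in zip(current, diffs):
            rebuilt.append((s + d) / SQRT2)
            rebuilt.append((s - d) / SQRT2)
        current = rebuilt
    return current


def denoise(signal, levels, threshold):
    """Zero every detail coefficient whose magnitude is below the threshold,
    then reconstruct the signal from what remains."""
    approx, details = haar_forward(signal, levels)
    kept = [[c if abs(c) >= threshold else 0.0 for c in d] for d in details]
    return haar_inverse(approx, kept)
```

For example, `denoise([1.1, 0.9, 1.1, 0.9, 5.1, 4.9, 5.1, 4.9], levels=1, threshold=0.2)` zeroes the small detail coefficients produced by the alternating noise and returns values close to `[1, 1, 1, 1, 5, 5, 5, 5]`.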
Donoho [4] introduced the signal de-noising technique
using the wavelet transform in 1995. However, it has two
limitations: 1) because it uses a universal threshold, it
cannot track changes in the SNR and is therefore ineffective
for non-stationary noise suppression; 2) when the SNR is low
and the spectra of speech and noise overlap, the unvoiced
signal is often lost after de-noising because of its
similarity to the noise signal. In short, Donoho's
method cannot strike a balance between protecting the
978-0-7695-4356-7/11 $26.00 2011 IEEE
DOI 10.1109/CMSP.2011.150
\[ T = \sigma \sqrt{2 \ln N} \tag{2} \]
for discrete wavelet transform and for wavelet packet
transform case:
\[ T = \sigma \sqrt{2 \ln (N \log_2 N)} \tag{3} \]
where $N$ is the length of the noisy signal. $\sigma$ is an
estimated value of noise standard deviation and is given
by:
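\[ \sigma = \mathrm{Median}(|w_{j,k}|) / 0.6745 \tag{4} \]
Hard thresholding:
\[ \hat{w}_{j,k} = \begin{cases} w_{j,k}, & |w_{j,k}| \ge T \\ 0, & |w_{j,k}| < T \end{cases} \tag{5} \]
Soft thresholding:
\[ \hat{w}_{j,k} = \begin{cases} \mathrm{sgn}(w_{j,k})\,(|w_{j,k}| - T), & |w_{j,k}| \ge T \\ 0, & |w_{j,k}| < T \end{cases} \tag{6} \]
\[ |X(j\omega)|\, e^{j\theta_x(\omega)} = |S(j\omega)|\, e^{j\theta_s(\omega)} + |D(j\omega)|\, e^{j\theta_d(\omega)} \tag{7} \]
\[ |X(j\omega)|\, e^{j\theta_x(\omega)} = |S(j\omega)|\, e^{j\theta_x(\omega)} + |D(j\omega)|\, e^{j\theta_x(\omega)} \tag{8} \]
\[ |\hat{S}(j\omega)| = |Y(j\omega)| - |\hat{D}(j\omega)| \tag{9} \]
\[ \hat{s}(n) = \mathrm{IFFT}\left[ |\hat{S}(j\omega)|\, e^{j\theta_x(\omega)} \right] \tag{10} \]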
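The universal threshold and the hard and soft thresholding rules can be sketched in a few lines. This is a minimal sketch, assuming the noise standard deviation is estimated as the median absolute detail coefficient divided by 0.6745; the function names and the `wavelet_packet` flag are hypothetical and chosen only for this illustration.

```python
import math
import statistics


def noise_sigma(detail_coeffs):
    """Median-based noise estimate: median(|w|) / 0.6745 (assumed estimator)."""
    return statistics.median([abs(w) for w in detail_coeffs]) / 0.6745


def universal_threshold(sigma, n, wavelet_packet=False):
    """sigma * sqrt(2 ln N) for the DWT case,
    sigma * sqrt(2 ln(N log2 N)) for the wavelet packet case."""
    if n < 2:
        raise ValueError("n must be at least 2")
    arg = n * math.log2(n) if wavelet_packet else n
    return sigma * math.sqrt(2.0 * math.log(arg))


def hard_threshold(w, t):
    """Keep the coefficient unchanged if |w| >= t, otherwise set it to zero."""
    return w if abs(w) >= t else 0.0


def soft_threshold(w, t):
    """Shrink the magnitude by t if |w| >= t, otherwise set it to zero."""
    return math.copysign(abs(w) - t, w) if abs(w) >= t else 0.0
```

Hard thresholding leaves surviving coefficients untouched but is discontinuous at the threshold, whereas soft thresholding is continuous but biases every surviving coefficient towards zero by `t`.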
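\[ H(\omega) = H[\mathrm{SNR}_{post}(\omega)] = \begin{cases} \left[ 1 - \alpha \dfrac{|\hat{D}(\omega)|^2}{|Y(\omega)|^2} \right]^{1/2}, & \text{if } \dfrac{|\hat{D}(\omega)|^2}{|Y(\omega)|^2} < \dfrac{1}{\alpha + \beta} \\[2ex] \left[ \beta \dfrac{|\hat{D}(\omega)|^2}{|Y(\omega)|^2} \right]^{1/2}, & \text{otherwise} \end{cases} \tag{11} \]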
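A spectral subtraction gain that depends on the ratio of estimated noise power to noisy-signal power, with an over-subtraction factor and a spectral floor, can be sketched as follows. This is a minimal sketch under stated assumptions: the parameter names `alpha` and `beta`, their default values, and the power-spectrum form of the rule are illustrative choices, not values taken from the paper. The gain is applied to complex spectral bins so that the noisy phase is kept unchanged.

```python
import math


def subtraction_gain(noisy_power, noise_power, alpha=4.0, beta=0.01):
    """Gain for one frequency bin.

    alpha: over-subtraction factor (assumed name and value).
    beta:  spectral floor factor (assumed name and value).
    """
    if noisy_power <= 0.0:
        return 0.0
    ratio = noise_power / noisy_power
    if ratio < 1.0 / (alpha + beta):
        # Subtract the scaled noise power; the argument stays positive here.
        return math.sqrt(1.0 - alpha * ratio)
    # Otherwise fall back to a small fraction of the noise power (the floor).
    return math.sqrt(beta * ratio)


def enhance_spectrum(noisy_bins, noise_magnitudes, alpha=4.0, beta=0.01):
    """Scale each complex noisy bin by its gain; the phase is not modified."""
    enhanced = []
    for y, d in zip(noisy_bins, noise_magnitudes):
        gain = subtraction_gain(abs(y) ** 2, d ** 2, alpha, beta)
        enhanced.append(gain * y)
    return enhanced
```

Because the gain is a non-negative real number, multiplying a complex bin by it changes only the magnitude, which corresponds to reusing the noisy phase when the time-domain signal is rebuilt with an inverse FFT.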
and
speech
(0
distortion.
Spectral
flooring
the
< W j ,k < ,
(12)
algorithm, where
signal and
where
W j ,k and W j ,k . When W j ,k ,
difference between
\[ \sigma_j = \mathrm{Median}(|W_{j,k}|) / 0.6745 \]
where
that when W j ,k
\[ T_j = \sigma_j \sqrt{2 \ln N}, \quad j = 1, 2, \ldots, n \]
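A scale-dependent threshold of this form can be computed directly from the detail coefficients of each decomposition level. The following is a minimal sketch, assuming the coefficients are supplied as one list per scale and that N is the length of the noisy signal; the function name `scale_thresholds` and the argument layout are hypothetical.

```python
import math
import statistics


def scale_thresholds(details_by_scale, n):
    """Return one threshold per scale j.

    For each scale, sigma_j = median(|W_jk|) / 0.6745 and
    T_j = sigma_j * sqrt(2 ln N), with N passed in as n.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    factor = math.sqrt(2.0 * math.log(n))
    thresholds = []
    for coeffs in details_by_scale:
        sigma_j = statistics.median([abs(w) for w in coeffs]) / 0.6745
        thresholds.append(sigma_j * factor)
    return thresholds
```

Unlike a single universal threshold, each scale gets its own noise estimate here, so a scale with larger noise-like coefficients receives a proportionally larger threshold.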
W j , k
W sgn (W )(1 ) ,
W j ,k
j ,k
j ,k
W j ,k (15)
= 0,
2
W j ,k +
,
1 + exp(2W j ,k / ) others
(14)
j scale.
\[ \mathrm{MSE} = \frac{1}{M} \sum_{i=1}^{M} \left( f(i) - \hat{f}(i) \right)^2 \tag{16} \]
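The mean squared error between the clean and the enhanced signal can be computed as in the sketch below. The output SNR formula is an assumption: this passage does not define it, so the common definition (clean-signal energy over error energy, in dB) is used for illustration, and both function names are hypothetical.

```python
import math


def mse(clean, estimate):
    """Mean squared error over M samples."""
    if len(clean) != len(estimate) or not clean:
        raise ValueError("signals must be non-empty and of equal length")
    return sum((f - g) ** 2 for f, g in zip(clean, estimate)) / len(clean)


def snr_db(clean, estimate):
    """Output SNR in dB: 10 log10(signal energy / error energy).

    Assumed definition, not taken from the paper. Undefined when the
    estimate equals the clean signal exactly (zero error energy).
    """
    if len(clean) != len(estimate) or not clean:
        raise ValueError("signals must be non-empty and of equal length")
    signal_energy = sum(f * f for f in clean)
    error_energy = sum((f - g) ** 2 for f, g in zip(clean, estimate))
    return 10.0 * math.log10(signal_energy / error_energy)
```

A lower MSE and a higher output SNR both indicate that the enhanced signal is closer to the clean reference.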
SNR Input/dB    Dynamic Threshold          Universal Threshold
                SNR Output/dB   MSE        SNR Output/dB   MSE
-5              12.70           0.5301     13.51           0.5088
                19.27           0.3808     20.71           0.3550
                23.77           0.3046     24.70           0.2908
10              25.84           0.2747     26.32           0.2682
ACKNOWLEDGMENT
REFERENCES
[1]
SNR Input/dB    SNR Output/dB
                Hard     Soft     Semi-soft   New
-5               9.52    11.24    12.68       13.51
                18.90    20.44    20.71       21.23
                24.86    24.70    25.38       26.13
10              27.52    26.32    28.82       29.41
TABLE III.
SNR Input/dB    MSE
                Hard     Soft     Semi-soft   New
-5              0.6211   0.5702   0.5231      0.5088
                0.3886   0.3600   0.3550      0.3465
                0.2886   0.2908   0.2819      0.2708
10              0.2526   0.2682   0.2453      0.2367
From the two tables we can see that when the input SNR
is low, the soft thresholding algorithm performs better
than hard thresholding. However, as the input SNR
increases, this tendency is gradually reversed. The
semi-soft thresholding algorithm performs better than both
the hard and soft thresholding algorithms to some extent.
The new thresholding algorithm proposed in this paper
performs best under all the input SNR conditions tested.
V. CONCLUSION
In this paper, an adaptive speech enhancement
algorithm based on the classification of unvoiced and
voiced signals is proposed. By applying a dynamic threshold
to the different wavelet analysis scales and using an
adaptive thresholding function to shrink the wavelet
coefficients of the noise, the proposed algorithm has
achieved a much better