IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 18, NO. 2, FEBRUARY 2010
I. INTRODUCTION
THE enhancement of single-channel speech degraded by additive noise has been extensively studied in the past and remains a challenging problem because only the noisy speech is available. Techniques have been proposed in the literature to exploit the harmonic structure of voiced speech for enhancing the speech quality [1]–[12]. In the work of [1] and [2], voiced speech is modeled as harmonic components plus noise-like components, and enhancement is performed by estimating the harmonic components while reducing the additive noise in the noise-like components. [3] extends the
Manuscript received June 24, 2008; revised July 02, 2009. First published
July 31, 2009; current version published November 20, 2009. The associate editor coordinating the review of this manuscript and approving it for publication
was Prof. Yariv Ephraim.
W. Jin is with Qualcomm, San Diego, CA 92121 USA (e-mail: wjin@qualcomm.com).
X. Liu and M. S. Scordilis are with the Department of Electrical and Computer Engineering, University of Miami, Coral Gables, FL 33146-0640 USA
(e-mail: x.liu6@umiami.edu; m.scordilis@miami.edu).
L. Han is with the Department of Electrical and Computer Engineering, North
Carolina State University, Raleigh, NC 27695 USA (e-mail: lhan2@ncsu.edu).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TASL.2009.2028916
JIN et al.: SPEECH ENHANCEMENT USING HARMONIC EMPHASIS AND ADAPTIVE COMB FILTERING
$$Y(k) = X(k) + N(k) \qquad (2)$$

where $Y(k)$, $X(k)$, and $N(k)$ are the Fourier transforms of the noisy speech, clean speech, and noise, respectively. Our enhancement task is to find a spectral-domain linear estimator $\mathbf{H}$ such that $\hat{X} = \mathbf{H}Y$ produces a good approximation to the clean speech spectrum. Ideally, the enhanced signal spectrum $\hat{X}$ should be identical to the clean speech spectrum $X$. Many enhancement methods, e.g., [21], [25], have been proposed in the literature to minimize some error norm between the estimated and clean speech spectra. However, in practical systems there always exist residual distortions in the enhanced speech. Moreover, retaining a comfortable level of residual noise in the enhanced speech will actually improve the perceived quality in many situations. For example, in a telephone application, keeping low-level, natural-sounding background noise provides the far-end user with a feeling of the near-end atmosphere and avoids the impression of an interrupted transmission [24]. As stated in the previous section, complete removal of the noise is neither feasible nor desirable. Therefore, we design our linear estimator $\mathbf{H}$ in such a way that the enhanced speech spectrum approaches

$$X_\lambda = X + \mathbf{\Lambda}^{1/2} N \qquad (3)$$

where $\mathbf{\Lambda}$ is a $K \times K$ diagonal matrix with real-valued diagonal elements $\lambda(k)$, and $k$ is the frequency index. The parameters $\lambda(k)$ admit a certain level of noise to appear at each frequency band in the enhanced speech. The values of $\lambda(k)$ are bounded by $0 \le \lambda(k) \le 1$. Because $\lambda(k)$ varies with frequency and controls the level of residual noise at each frequency band $k$, we refer to $\lambda(k)$ as the frequency-dependent noise-flooring parameter (FDNFP) in this paper. With (3) as our target of approximation, the estimation error is

$$\boldsymbol{\epsilon} = \hat{X} - X_\lambda = (\mathbf{H} - \mathbf{I})X + (\mathbf{H} - \mathbf{\Lambda}^{1/2})N = \boldsymbol{\epsilon}_x + \boldsymbol{\epsilon}_n \qquad (4)$$

where $\boldsymbol{\epsilon}_x = (\mathbf{H} - \mathbf{I})X$ and $\boldsymbol{\epsilon}_n = (\mathbf{H} - \mathbf{\Lambda}^{1/2})N$ represent the speech distortion and residual noise, respectively. Let

$$\bar{\epsilon}_x^2 = \mathrm{tr}\,E\!\left[\boldsymbol{\epsilon}_x \boldsymbol{\epsilon}_x^H\right] \qquad (5)$$

be the energy of speech distortion, where $E[\cdot]$ denotes the expectation and $\mathrm{tr}$ is the matrix trace. Similarly, let

$$\bar{\epsilon}_{n_k}^2 = E\!\left[\left|\mathbf{w}_k^T \boldsymbol{\epsilon}_n\right|^2\right] \qquad (6)$$

denote the energy of residual noise in the $k$th frequency band, where $\mathbf{w}_k$ is the $k$th spectral component selector, i.e., the unit vector whose $k$th element is one and whose remaining elements are zero. The estimator is designed to minimize the speech distortion

$$\min_{\mathbf{H}} \bar{\epsilon}_x^2 \qquad (7)$$

subject to

$$\bar{\epsilon}_{n_k}^2 \le \alpha_k, \quad k = 0, \ldots, K-1 \qquad (8)$$

where $\alpha_k$ is the threshold used to suppress noise at the $k$th spectral component. The estimator that satisfies (7) and (8) can be found by following an optimization procedure similar to that used in [21]. Specifically, $\mathbf{H}$ is a stationary feasible point if it satisfies the gradient equation of the Lagrangian

$$L(\mathbf{H}, \mu_0, \ldots, \mu_{K-1}) = \bar{\epsilon}_x^2 + \sum_{k=0}^{K-1} \mu_k \left(\bar{\epsilon}_{n_k}^2 - \alpha_k\right) \qquad (9)$$

and

$$\mu_k \left(\bar{\epsilon}_{n_k}^2 - \alpha_k\right) = 0, \quad \mu_k \ge 0 \qquad (10)$$

where $\mu_k$ is the Lagrange multiplier for the $k$th spectral component. Assume $\mathbf{H}$ is real and symmetric. From $\nabla_{\mathbf{H}} L = 0$, we obtain

$$(\mathbf{H} - \mathbf{I})\,\mathbf{F}\mathbf{R}_x\mathbf{F}^H + \mathbf{M}\,(\mathbf{H} - \mathbf{\Lambda}^{1/2})\,\mathbf{F}\mathbf{R}_n\mathbf{F}^H = \mathbf{0} \qquad (11)$$

where $\mathbf{F}$ is the unitary Fourier transform matrix, $\mathbf{R}_x$ and $\mathbf{R}_n$ are the time-domain autocorrelation matrices of speech and noise, respectively, and $\mathbf{M} = \mathrm{diag}(\mu_0, \ldots, \mu_{K-1})$ is a $K \times K$ diagonal matrix of the Lagrange multipliers. The optimal estimator can be obtained by solving the matrix equation (11). Now let us assume $\mathbf{H}$ is also diagonal. The simplification follows from the fact that the matrices $\mathbf{F}\mathbf{R}_x\mathbf{F}^H$ and $\mathbf{F}\mathbf{R}_n\mathbf{F}^H$ are asymptotically diagonal provided the matrices $\mathbf{R}_x$ and $\mathbf{R}_n$ are Toeplitz [26].
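The asymptotic diagonalization that justifies a diagonal estimator can be checked numerically. The following sketch (not part of the original derivation) builds a hypothetical Toeplitz autocorrelation matrix with an assumed AR(1)-style sequence $r(m)=0.9^{|m|}$ and measures how much energy the DFT-transformed matrix leaves off its diagonal:

```python
import numpy as np

# Illustrate the property from [26]: F R F^H is approximately diagonal when
# R is Toeplitz and F is the unitary DFT matrix. The AR(1) autocorrelation
# r(m) = 0.9**|m| is an assumption for this demonstration only.
K = 256
r = 0.9 ** np.abs(np.arange(K))
R = np.array([[r[abs(i - j)] for j in range(K)] for i in range(K)])

F = np.fft.fft(np.eye(K)) / np.sqrt(K)   # unitary DFT matrix
C = F @ R @ F.conj().T                   # frequency-domain covariance

diag_energy = np.sum(np.abs(np.diag(C)) ** 2)
total_energy = np.sum(np.abs(C) ** 2)    # equals ||R||_F^2 (unitary invariance)
off_ratio = 1.0 - diag_energy / total_energy
print(round(off_ratio, 3))               # small: most energy is on the diagonal
```

As the frame length $K$ grows, the off-diagonal fraction shrinks, which is what permits replacing the matrix equation (11) by independent per-bin scalar equations.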
Authorized licensed use limited to: PONDICHERRY ENGG COLLEGE. Downloaded on July 05,2010 at 15:29:06 UTC from IEEE Xplore. Restrictions apply.
Solving for the Lagrange multipliers, we obtain (15), where $\xi(k)$ is the signal-to-noise ratio (SNR) for the $k$th spectral component. Substituting (15) into (12), we obtain the final solution (16), which satisfies both the Lagrangian gradient equation (9) and the conditions (10).

We can reduce the number of variables in (16) by setting the threshold $\alpha_k$ to be a proportion of the noise power spectrum $S_n(k)$. Let $\alpha_k = \beta S_n(k)$, where $\beta$ is the proportionality factor that specifies the amount of attenuation of the noise power. Then (16) can be rewritten as (17). Obviously, we now have the flexibility of balancing between the two design parameters $\lambda(k)$ and $\beta$ in (17). The term containing $\lambda(k)$ is introduced because our enhancement target is the noise-admitting spectrum $X_\lambda$ in (3). The value of $\lambda(k)$ should be small in order to maintain a low level of residual noise in the enhanced speech. On the other hand, the term governed by $\beta$ dominates the value of the suppression gain and can be interpreted as a conventional noise suppression function. In fact, if we let $\lambda(k) = 0$ for all $k$, then $\mathbf{\Lambda} = \mathbf{0}$ and the second term $\mathbf{\Lambda}^{1/2}N$ on the right-hand side of (3) becomes zero. This means the enhanced speech spectrum $\hat{X}$ will approach the clean speech spectrum $X$. Several choices for the design of the suppression function have been proposed in [21]. Actually, with $\lambda(k) = 0$ and an appropriate choice of $\beta$, (17) reduces to the classical Wiener filter.
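The interplay of the two design parameters can be illustrated with a toy per-bin gain. This is a hedged sketch only, not the paper's exact (17): a conventional Wiener suppression term is combined with a noise floor, and the choice of $\sqrt{\lambda}$ as a lower bound on the gain is an assumption made for illustration:

```python
import numpy as np

# Illustrative sketch in the spirit of (17): a Wiener-type suppression
# function with a noise floor. sqrt(lam) acting as a lower bound on the
# gain is an assumed simplification, not the paper's exact formula.
def flooring_gain(xi, lam):
    """xi: per-bin a priori SNR; lam: FDNFP value in [0, 1]."""
    wiener = xi / (xi + 1.0)                 # conventional suppression term
    return np.maximum(wiener, np.sqrt(lam))  # admit a floor of residual noise

xi = np.array([0.01, 0.1, 1.0, 10.0])
g_floor = flooring_gain(xi, lam=0.04)   # floor of about 0.2 on the gain
g_wiener = flooring_gain(xi, lam=0.0)   # lam = 0 recovers the Wiener gain
print(g_floor, g_wiener)
```

With `lam = 0` the floor vanishes and the gain is the classical Wiener gain; with a small positive `lam`, low-SNR bins retain a controlled amount of residual noise instead of being driven to zero.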
Fig. 2. Waveform of clean speech and its LP residual: (a) clean speech; (b) LP residual of clean speech.
With noise added at an SNR of 5 dB, the noisy LP residual is shown in Fig. 3(b). The final periodicity-enhanced residual of (19) is plotted in Fig. 3(c), where noise suppression is clearly evident. In Fig. 4, the magnitude spectra of the clean, noisy, and periodicity-enhanced LP residuals are plotted in Fig. 4(a)–(c), respectively, and the FDNFP is shown in Fig. 4(d).
B. Adaptive Comb Filtering
From (17) we can see that it is also beneficial to incorporate harmonic structure into the FDNFP for voiced speech. The reason is that the conventional suppression term dominates the gain while the values of the FDNFP are usually small; in this way, the level of residual noise can be more effectively suppressed. Further improvement of perceptual quality can be achieved by imposing a harmonic envelope on the FDNFP for voiced speech. However, unlike the suppression thresholds, which are relatively flat over the entire frequency range, the FDNFP should closely follow the spectral tilt as well as the formant peaks and valleys of the speech. Therefore, we implement the FDNFP as an adaptive comb filter by utilizing the spectral peak-picking algorithm proposed in [29].
The peak-picking method in [29] was proposed as part of a concatenative speech synthesis algorithm that uses the Harmonic plus Noise Model (HNM). Here it is used as a means to determine the frequency locations of the comb peaks. Because the spectral peaks were picked from the spectrum of clean speech in [29], some modifications and postprocessing steps are introduced in this paper for more reliable performance on the spectrum of noisy speech. Specifically, the harmonic test is modified so that a candidate peak frequency is declared voiced if either condition (20) or the level condition (21) (expressed in dB) holds and, in addition, conditions (22) and (23) are satisfied; otherwise, it is declared unvoiced. The notation in (20), (21), and (22) is the same as defined in [29]: the frequency under test is the location of the largest peak within the current search range, the competing frequencies are those of the other peaks within the same range, and $\hat{f}_0$ is the initial fundamental frequency estimate obtained using an enhanced SIFT method [28]. The test compares the amplitude at the peak under test with the amplitudes at the competing peaks, as well as their cumulative amplitudes. The cumulative amplitude is defined as the non-normalized sum of the amplitudes of all of the samples from the previous valley to the following valley of the peak, and the mean value of the cumulative amplitudes of the competing peaks is used for comparison. The peak under test is also compared against the nearest harmonic of $\hat{f}_0$. Having classified the frequency as voiced or unvoiced, the next interval is searched for its largest peak and the same harmonic test is applied. The process is continued throughout the speech bandwidth. The measurements of (20), (21), and (22) were originally introduced in [29].
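The interval-by-interval search can be sketched as follows. This is a simplified stand-in for the full test (20)-(23): each band centered on a harmonic of the pitch estimate is searched for its largest peak, and crude amplitude and proximity conditions (both assumptions, not the paper's exact criteria) decide acceptance:

```python
import numpy as np

# Simplified sketch of the harmonic peak search: in each band centered on a
# multiple of the initial pitch estimate f0, take the largest local maximum
# and accept it if it stands out from the band and lies near the nearest
# harmonic. The 2x-median amplitude rule and the 0.15*f0 proximity rule are
# illustrative stand-ins for conditions (20)-(23) of the modified test.
def pick_harmonic_peaks(spectrum, freqs, f0, tol=0.15):
    peaks = []
    n_harm = int(freqs[-1] // f0)
    for h in range(1, n_harm + 1):
        band = (freqs >= (h - 0.5) * f0) & (freqs < (h + 0.5) * f0)
        idx = np.flatnonzero(band)
        if idx.size == 0:
            continue
        j = idx[np.argmax(spectrum[idx])]
        fp, amp = freqs[j], spectrum[j]
        nearest = round(fp / f0) * f0          # nearest harmonic frequency
        if amp > 2.0 * np.median(spectrum[idx]) and abs(fp - nearest) <= tol * f0:
            peaks.append(float(fp))
    return peaks

# Synthetic magnitude spectrum with harmonics of 100 Hz over a flat floor
freqs = np.arange(0.0, 1000.0, 1.0)
spec = np.full_like(freqs, 0.1)
for h in (100, 200, 300, 400):
    spec[h] = 1.0
print(pick_harmonic_peaks(spec, freqs, f0=100.0))  # [100.0, 200.0, 300.0, 400.0]
```

Bands that contain no prominent peak (here, above 400 Hz) are rejected by the amplitude condition, mimicking how the harmonic test declares such frequencies unvoiced.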
In this paper, we have added the tonality measure in (23) to the harmonic test. The advantage of the tonality test is that it effectively removes the spurious peaks caused by white noise. The quantity SFM in (23) denotes the spectral flatness measure as defined in [30]

$$\mathrm{SFM} = 10 \log_{10} \frac{G_m}{A_m} \qquad (24)$$

where $G_m$ and $A_m$ denote the geometric mean and the arithmetic mean of the power spectrum in the range under test, respectively. We used a reference value of $-50$ dB in our implementation; in other words, an SFM of $-50$ dB indicates that the signal is entirely tonelike.
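The spectral flatness measure of (24) is straightforward to compute; a minimal sketch with two synthetic spectra shows the two extremes the tonality test distinguishes:

```python
import numpy as np

# Spectral flatness measure (24): ratio of geometric to arithmetic mean of
# the power spectrum, in dB. 0 dB means perfectly flat (noise-like), while
# strongly negative values indicate a tonelike signal.
def sfm_db(power_spectrum):
    p = np.asarray(power_spectrum, dtype=float)
    geo = np.exp(np.mean(np.log(p)))   # geometric mean via log domain
    arith = np.mean(p)                 # arithmetic mean
    return 10.0 * np.log10(geo / arith)

flat = np.ones(512)                    # white-noise-like spectrum
tonal = np.full(512, 1e-8)
tonal[64] = 1.0                        # a single dominant spectral line
print(sfm_db(flat))                    # 0.0 dB for the flat spectrum
print(sfm_db(tonal))                   # strongly negative: tonelike
```

A spectrum dominated by a single line scores far below the tonelike reference, so spurious flat-spectrum peaks introduced by white noise fail the test of (23).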
Even though the peak-picking method is modified as above, some real harmonic peaks are rejected and some spurious peaks are accepted because of the distortion effects of additive noise. Moreover, the harmonics of clean speech in the spectral valleys are often submerged by the noise spectrum, and consequently these harmonic peaks cannot be picked by the peak tracking method. To overcome these problems, the following postprocessing steps are performed on the peaks picked by the modified algorithm.
1) Interpolation of a single harmonic peak. A local peak is declared a harmonic peak if both of the following conditions are true: its frequency is within 15% of the nearest harmonic frequency, and there are at least three peaks before and two peaks after it.
2) Rejection of isolated peaks. A harmonic peak is rejected if its distance to the nearest neighboring peaks is either less than a lower threshold or greater than an upper threshold.
3) Recovery of multiple submerged intermediate peaks. Let $m$ and $n$ be positive integers with $m < n$. Multiple harmonic peaks are interpolated based on the following tests: there are no peaks picked in the frequency range between the $m$th and $n$th harmonics, and there are at least three good harmonic peaks in the range below that gap and at least another three harmonics in the range above it. If both of the above conditions are true, then harmonics are interpolated in the gap. Assume the last harmonic below the gap is located at frequency $f_l$ and the first harmonic above the gap has a frequency of $f_u$; then the interpolated harmonics have frequencies $f_l + \hat{f}_0, f_l + 2\hat{f}_0, \ldots$, up to $f_u$.
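Step 3 above amounts to filling a gap with harmonics spaced by the pitch estimate. A minimal sketch, in which the boundary frequencies and the pitch value are hypothetical placeholders:

```python
# Sketch of postprocessing step 3: fill the gap between the last picked
# harmonic below it (f_last) and the first picked harmonic above it
# (f_first) with peaks spaced by the pitch estimate f0. The stopping margin
# of half a pitch period is an assumption to avoid colliding with f_first.
def interpolate_gap(f_last, f_first, f0):
    filled = []
    f = f_last + f0
    while f < f_first - 0.5 * f0:
        filled.append(f)
        f += f0
    return filled

# Example loosely mirroring Fig. 6: no peaks picked between 800 and 1700 Hz;
# an assumed pitch of 180 Hz yields four interpolated peaks.
print(interpolate_gap(800.0, 1700.0, 180.0))  # [980.0, 1160.0, 1340.0, 1520.0]
```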
Fig. 5 illustrates steps 1 and 2 of the postprocessing. In Fig. 5, the spectra of the clean and noisy speech are depicted with a solid line and a dotted line, respectively. The modified peak-picking method is applied to the spectrum of the noisy speech. The peaks picked by the modified peak-picking method are marked by crosses, and the final peaks after postprocessing are marked by circles. As shown in Fig. 5, the harmonic peak near 900 Hz is interpolated by step 1, and the spurious peaks above 1600 Hz are rejected according to step 2.
Fig. 6 depicts step 3 of the postprocessing. As can be seen from Fig. 6, the spectrum of the noisy speech is relatively flat in the range 800–1700 Hz because of the effects of additive white Gaussian noise, and the harmonics of the clean speech are submerged by the noise spectrum. Since the conditions of step 3 are satisfied, four peaks are interpolated in the range 800–1700 Hz. It should be noted that the spurious peak near 800 Hz is already eliminated by step 2 before the step 3 interpolation.
After finding as many additional frequency locations of harmonic peaks as possible, we are ready to design the gain factor in (17) as an adaptive comb filter. In the first step, an initial comb filter is implemented in the frequency domain as in (25): the filter response takes the value $g_p$ at each peak frequency $f_p$ determined by the modified peak-picking method and postprocessing, decays over the surrounding band, and takes the value $g_s$ otherwise. A width parameter controls the width of the comb teeth [10] and is set to 2 in our implementation. The quantity $g_p$ specifies the filter gain at the peak frequency $f_p$. Notice that in (25) the comb structures are only implemented within the vicinity of one fundamental frequency (pitch) range centered at each peak frequency $f_p$. The value of $g_s$ determines the filter response outside this frequency range. Since there are many design choices for the gain factor, the designs of $g_p$ and $g_s$ are also flexible. In this paper, we implemented $g_p$ and $g_s$ as Wiener-type gains, given in (26) and (27).
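The construction of such a comb response can be sketched as follows. The triangular taper and the symbols `g_p`, `g_s` are assumptions for illustration; the paper's exact parameterization of (25) is not reproduced:

```python
import numpy as np

# Illustrative comb-filter response in the spirit of (25): gain g_p at each
# picked peak frequency, tapering over a window one pitch period wide, and
# a smaller gain g_s elsewhere. The triangular taper shape is an assumed
# simplification of the actual comb-tooth shape.
def comb_response(freqs, peak_freqs, f0, g_p=1.0, g_s=0.2):
    h = np.full(len(freqs), g_s)
    for fp in peak_freqs:
        in_range = np.abs(freqs - fp) <= 0.5 * f0
        # taper from g_p at the peak down to g_s at the window edges
        taper = 1.0 - np.abs(freqs - fp) / (0.5 * f0)
        h[in_range] = g_s + (g_p - g_s) * taper[in_range]
    return h

freqs = np.arange(0.0, 1000.0, 1.0)
h = comb_response(freqs, peak_freqs=[100.0, 200.0, 300.0], f0=100.0)
print(h[100], h[150])   # peak gain vs. valley gain
```

Outside the pitch-wide windows around the picked peaks, the response stays at the floor `g_s`, which is what suppresses the inter-harmonic noise.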
Fig. 7. Adaptive comb filter for the noisy speech spectrum in Fig. 6.
In (26) and (27), $\hat{S}_x(f)$ is the estimated power spectrum of the clean speech and $S_y(f)$ is the power spectrum of the noisy speech; $S_y(f)$ can be computed directly from the noisy speech. Accurate estimation of the clean speech spectrum is crucial to the performance of the proposed harmonic enhancement method. We have used the classical spectral subtraction

$$\hat{S}_x(f) = \max\!\left\{S_y(f) - \hat{S}_n(f),\ \epsilon\right\} \qquad (28)$$

where $\epsilon$ is a zero-flooring parameter and $\hat{S}_n(f)$ is the estimated spectrum of the noise. Here $k$ is simply the index of frequency $f$; in the following text, $f$ and $k$ are used interchangeably. To obtain the estimated noise spectrum $\hat{S}_n(f)$, the minimum statistics tracking method in [31] is implemented. Eventually, the gain factor in (17) is obtained from the comb-filter response and the Wiener-type gains as in (29).
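The spectral-subtraction step can be sketched in a few lines. The constant floor value is an assumed placeholder for the zero-flooring parameter; the paper's exact (28) may differ in detail:

```python
import numpy as np

# Classical spectral subtraction with a zero-flooring parameter, in the
# spirit of (28): subtract the estimated noise power from the noisy power
# spectrum and floor the result at a small constant eps (assumed value).
def spectral_subtraction(S_y, S_n_hat, eps=1e-3):
    return np.maximum(S_y - S_n_hat, eps)

S_y = np.array([4.0, 1.0, 0.5, 2.0])   # noisy power spectrum (toy values)
S_n = np.array([1.0, 1.0, 1.0, 1.0])   # estimated noise power spectrum
S_x = spectral_subtraction(S_y, S_n)
print(S_x)
```

Bins where the noise estimate exceeds the noisy power (here the third bin) are clamped to the floor rather than going negative, which keeps the estimated clean-speech spectrum usable in the Wiener-type gains.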
Fig. 10. Waveform and spectrogram of clean and noisy speech (female speech, multitalker babble noise, SNR = 5 dB): (a) clean speech, (b) spectrogram of clean speech, (c) noisy speech, and (d) spectrogram of noisy speech.
Fig. 11. Waveform and spectrogram of JND [24] and spectral subtraction [13] enhanced speech (female speech, multitalker babble noise, SNR = 5 dB): (a) JND-enhanced speech, (b) spectrogram of JND-enhanced speech, (c) spectral subtraction enhanced speech, and (d) spectrogram of spectral subtraction enhanced speech.
Fig. 12. Waveform and spectrogram of subspace [23] and proposed harmonic enhanced speech (female speech, multitalker babble noise, SNR = 5 dB): (a) subspace-enhanced speech, (b) spectrogram of subspace-enhanced speech, (c) harmonic enhanced speech, and (d) spectrogram of harmonic enhanced speech.
Fig. 13. Average PESQ scores and MBSD measures of 60 sentences of JND enhanced speech (dotted line), subspace enhanced speech (solid line), spectral
subtraction enhanced speech (dash-dot line), and harmonic enhanced speech (dashed line). The noise is white Gaussian at SNR of 0, 5, 10, 15, and 20 dB.
Fig. 14. Average PESQ scores and MBSD measures of 60 sentences of JND enhanced speech (dotted line), spectral subtraction enhanced speech (dash-dotted
line), subspace enhanced speech (solid line), and harmonic enhanced speech (dashed line). The noise is multitalker babble noise at SNR of 0, 5, 10, 15, and 20 dB.
The SNR was set at 0, 5, 10, 15, and 20 dB. In both Figs. 13 and 14, the objective measures (PESQ and MBSD) of the JND-enhanced speech are marked by dotted lines and diamonds. The measurements of spectral subtraction are illustrated by dash-dotted lines and circles. The measurements of the subspace enhancements for white and colored noise are shown by solid lines and plus signs. The performance of the proposed HE method is plotted with dashed lines and asterisks.
From Figs. 13 and 14 we can see that the proposed harmonic enhancement method outperforms the JND approach of [24] and the subspace approach of [21], [23] at all SNR conditions for both the white Gaussian noise and babble noise cases. The performance improvement of the proposed HE method is more pronounced at low SNR. In the case of the spectral subtraction method, the PESQ scores suggest that it performs comparably to the proposed harmonic enhancement method in the white noise case for SNRs greater than 5 dB, and at an SNR of 20 dB for babble noise. However, one should be aware that PESQ is generally deaf to the musical noise introduced by spectral subtraction, an observation also confirmed by subjective listening tests. In the case of the average MBSD scores, the proposed method out-
TABLE I
COMPOSITE MEASUREMENT COMPARISONS OF 60 SENTENCES OF JND ENHANCED SPEECH, SPECTRAL SUBTRACTION
ENHANCED SPEECH, SUBSPACE ENHANCED SPEECH, AND HARMONIC ENHANCED SPEECH
[15] J. H. Chen and A. Gersho, "Adaptive postfiltering for quality enhancement of coded speech," IEEE Trans. Speech Audio Process., vol. 3, no. 1, pp. 59–71, Jan. 1995.
[16] Y. Ephraim, "A minimum mean square error approach for speech enhancement," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 1990, vol. 2, pp. 829–832.
[17] Y. Ephraim and D. Malah, "Speech enhancement using a minimum mean-square error log-spectral amplitude estimator," IEEE Trans. Acoust., Speech, Signal Process., vol. 33, no. 2, pp. 443–445, Apr. 1985.
[18] K. K. Paliwal and A. Basu, "A speech enhancement method based on Kalman filtering," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 1987, pp. 177–180.
[19] V. Grancharov, J. H. Plasberg, J. Samuelsson, and W. B. Kleijn, "Speech enhancement using a masking threshold constrained Kalman filter and its heuristic implementations," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 1, pp. 19–32, Jan. 2006.
[20] M. Gabrea, "Adaptive Kalman filtering-based speech enhancement algorithm," in Proc. Canadian Conf. Elect. Comput. Eng., Fredericton, NB, Canada, 2001, vol. 1, pp. 521–526.
[21] Y. Ephraim and H. L. Van Trees, "A signal subspace approach for speech enhancement," IEEE Trans. Speech Audio Process., vol. 3, pp. 251–266, Jul. 1995.
[22] U. Mittal and N. Phamdo, "Signal/noise KLT based approach for enhancing speech degraded by colored noise," IEEE Trans. Speech Audio Process., vol. 8, no. 2, pp. 159–167, Mar. 2000.
[23] H. Lev-Ari and Y. Ephraim, "Extension of the signal subspace speech enhancement approach to colored noise," IEEE Signal Process. Lett., vol. 10, no. 4, pp. 104–106, Apr. 2003.
[24] S. Gustafsson, P. Jax, and P. Vary, "A novel psychoacoustically motivated audio enhancement algorithm preserving background noise characteristics," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 1998, pp. 397–400.
[25] Y. Hu and P. C. Loizou, "Incorporating a psychoacoustical model in frequency domain speech enhancement," IEEE Signal Process. Lett., vol. 11, no. 2, pp. 270–273, Feb. 2004.
[26] R. Gray, "On the asymptotic eigenvalue distribution of Toeplitz matrices," IEEE Trans. Inf. Theory, vol. IT-18, no. 6, pp. 725–730, Nov. 1972.
[27] J. D. Markel, "The SIFT algorithm for fundamental frequency estimation," IEEE Trans. Audio Electroacoust., vol. AU-20, no. 5, pp. 367–377, Dec. 1972.
[28] P. Veprek and M. S. Scordilis, "Analysis, enhancement and evaluation of five pitch determination techniques," Speech Commun., vol. 37, pp. 249–270, Jul. 2002.
[29] Y. Stylianou, "Applying the harmonic plus noise model in concatenative speech synthesis," IEEE Trans. Speech Audio Process., vol. 9, no. 1, pp. 21–29, Jan. 2001.
[30] J. D. Johnston, "Transform coding of audio signals using perceptual noise criteria," IEEE J. Sel. Areas Commun., vol. 6, no. 2, pp. 314–323, Feb. 1988.
[31] R. Martin, "Noise power spectral density estimation based on optimal smoothing and minimum statistics," IEEE Trans. Speech Audio Process., vol. 9, no. 5, pp. 504–512, Jul. 2001.
[32] D. H. Johnson and P. N. Shami, "The signal processing information base," IEEE Signal Process. Mag., vol. 10, no. 4, pp. 36–42, Oct. 1993 [Online]. Available: http://www.spib.rice.edu/spib/select_noise.html
[33] Information Technology – Coding of Audio-Visual Objects – Part 3: Audio, ISO/IEC 14496-3:2005, 2005.
[34] Perceptual Evaluation of Speech Quality (PESQ): An Objective Method for End-to-End Speech Quality Assessment of Narrowband Telephone Networks and Speech Codecs, ITU-T Rec. P.862, Feb. 2001 [Online]. Available: http://www.itu.int/rec/T-REC-P.862-200102-I/en, accessed Aug. 15, 2008.
[35] W. Yang, M. Benbouchta, and R. Yantorno, "Performance of the modified bark spectral distortion as an objective speech quality measure," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 1998, pp. 541–544.
[36] J. G. Beerends, A. P. Hekstra, A. W. Rix, and M. P. Hollier, "Perceptual evaluation of speech quality (PESQ): The new ITU standard for end-to-end speech quality assessment, Part II: Psychoacoustic model," J. Audio Eng. Soc., vol. 50, no. 10, pp. 765–778, Oct. 2002.
[37] S. Wang, A. Sekey, and A. Gersho, "An objective measure for predicting subjective quality of speech coders," IEEE J. Sel. Areas Commun., vol. 10, no. 5, pp. 819–828, Jun. 1992.
Wen Jin received the M.S. and Ph.D. degrees in electrical and computer engineering from the University of Miami, Coral Gables, FL, in 2001 and 2006, respectively.
His research interests include the general area of
audio and speech processing, especially in the area of
audio and speech coding, and single-channel speech
enhancement. He is now with Qualcomm, Inc.
Michael S. Scordilis (SM'03) received the B.E. degree in communication engineering from the Royal
Melbourne Institute of Technology, Melbourne, Australia, in 1984, and the M.S. degree in electrical engineering and the Ph.D. degree in engineering from
Clemson University, Clemson, SC, in 1986 and 1990,
respectively.
From 1990 to 1995, he was University Lecturer at
the University of Melbourne, Melbourne, Australia.
He has held visiting Senior Researcher positions at
Bell Communications Research (Bellcore), Morristown, NJ, Sun Microsystems Labs, Chelmsford, MA, and the University of
Patras, Patras, Greece. He is now Research Associate Professor of Electrical
and Computer Engineering at the University of Miami, Coral Gables, FL. His
current research interests include signal processing for speech, audio, signal
recovery and enhancement, psychoacoustics, language processing, and multimedia signal processing. He is an active industry consultant in the areas of audio
and speech analysis, recognition and compression, and multimedia services, and
holds patents in those areas. He has published over 60 papers in major journals
and conferences.
Dr. Scordilis received the 2003 Eliahu I. Jury Award for Excellence in Research of the College of Engineering, University of Miami. He is a member of
the Technical Chamber of Greece.
Lu Han received the M.S. degree in electrical engineering from the Harbin Institute of Technology, Harbin, China, in 2007.
In August 2007, she joined the Digital Audio and Speech Processing Lab at the University of Miami, Coral Gables, FL, as a Research Assistant working on speech enhancement. In August 2008, she transferred to the Department of Electrical and Computer Engineering, North Carolina State University, Raleigh. Her current research interests include image processing and computer vision.