This is a special issue published in volume 2007 of “EURASIP Journal on Audio, Speech, and Music Processing.” All articles are open access articles distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Editor-in-Chief
Douglas O’Shaughnessy, University of Quebec, Canada
Associate Editors
Jont B. Allen, USA
Xavier Amatriain, USA
Gérard Bailly, France
Martin Bouchard, Canada
Douglas S. Brungart, USA
Geoffrey Chan, Canada
Dan Chazan, Israel
Mark Clements, USA
C. d’Alessandro, France
Roger Dannenberg, USA
Li Deng, USA
Thomas Eriksson, Sweden
Horacio Franco, USA
Qian-Jie Fu, USA
Jim Glass, USA
Steven Greenberg, USA
R. Capobianco Guido, Brazil
R. Heusdens, The Netherlands
James Kates, USA
Tatsuya Kawahara, Japan
Yves Laprie, France
Lin-Shan Lee, Taiwan
Dominic Massaro, USA
Ben Milner, USA
Climent Nadeu, Spain
Elmar Nöth, Germany
Hiroshi Okuno, Japan
Joe Picone, USA
Gerhard Rigoll, Germany
Mark Sandler, UK
Thippur V. Sreenivas, India
Yannis Stylianou, Greece
Stephen Voran, USA
Deliang Wang, USA
Contents
Adaptive Partial-Update and Sparse System Identification, Kutluyıl Doğançay and Patrick A. Naylor
Volume 2007, Article ID 12046, 2 pages
Set-Membership Proportionate Affine Projection Algorithms, Stefan Werner, José A. Apolinário, Jr.,
and Paulo S. R. Diniz
Volume 2007, Article ID 34242, 10 pages
Wavelet-Based MPNLMS Adaptive Algorithm for Network Echo Cancellation, Hongyang Deng and
Miloš Doroslovački
Volume 2007, Article ID 96101, 5 pages
A Low Delay and Fast Converging Improved Proportionate Algorithm for Sparse System Identification,
Andy W. H. Khong, Patrick A. Naylor, and Jacob Benesty
Volume 2007, Article ID 84376, 8 pages
Step Size Bound of the Sequential Partial Update LMS Algorithm with Periodic Input Signals,
Pedro Ramos, Roberto Torrubia, Ana López, Ana Salinas, and Enrique Masgrau
Volume 2007, Article ID 10231, 15 pages
Detection-Guided Fast Affine Projection Channel Estimator for Speech Applications, Yan Wu Jennifer,
John Homer, Geert Rombouts, and Marc Moonen
Volume 2007, Article ID 71495, 13 pages
Efficient Multichannel NLMS Implementation for Acoustic Echo Cancellation, Fredric Lindstrom,
Christian Schüldt, and Ingvar Claesson
Volume 2007, Article ID 78439, 6 pages
Editorial
Adaptive Partial-Update and Sparse System Identification
System identification is an important task in many application areas including, for example, telecommunications, control engineering, sensing, and acoustics. It would be widely accepted that the science for identification of stationary and dynamic systems is mature. However, several new applications have recently become of heightened interest for which system identification needs to be performed on high-order moving average systems that are either sparse in the time domain or need to be estimated using sparse computation due to complexity constraints. In this special issue, we have brought together a collection of articles on recent work in this field giving specific consideration to (a) algorithms for identification of sparse systems and (b) algorithms that exploit sparseness in the coefficient update domain. The distinction between these two types of sparseness is important, as we hope will become clear to the reader in the main body of the special issue.

A driving force behind the development of algorithms for sparse system identification in telecommunications has been echo cancellation in packet-switched telephone networks. The increasing popularity of packet-switched telephony has led to a need for the integration of older analog systems with, for example, IP or ATM networks. Network gateways enable the interconnection of such networks and provide echo cancellation. In such systems, the hybrid echo response is delayed by an unknown bulk delay due to propagation through the network. The overall effect is, therefore, that an “active” region associated with the true hybrid echo response occurs with an unknown delay within an overall response window that has to be sufficiently long to accommodate the worst-case bulk delay. In the context of network echo cancellation, the direct application of well-known algorithms, such as normalized least-mean-square (NLMS), to sparse system identification gives unsatisfactory performance when the echo response is sparse. This is because the adaptive algorithm has to operate on a long filter and the coefficient noise for near-zero-valued coefficients in the inactive regions is relatively large. To address this problem, the concept of proportionate updating was introduced.

An important consideration for adaptive filters is the computational complexity that increases with the number of coefficients to be updated per sampling period. A straightforward approach to complexity reduction is to update only a small number of filter coefficients at every iteration. This approach is termed partial-update adaptive filtering. Two key questions arise in the context of partial updating. Firstly, consideration must be given as to how to choose which coefficients to update. Secondly, the performance and complexity of the partial-update approach must be compared with the standard full-update algorithms in order to assess the cost-to-benefit ratio for the partial-update schemes. Usually, a compromise has to be made between affordable complexity and desired convergence speed.

We have grouped the papers in this special issue into four areas. The first area is sparse system identification and comprises three papers. In “Set-membership proportionate affine projection algorithms,” Stefan Werner et al. develop affine projection algorithms with proportionate update and set-membership filtering. Proportionate updates facilitate fast convergence for sparse systems, and set-membership filtering reduces the update complexity. The second paper in this area is “Wavelet-based MPNLMS adaptive algorithm for network echo cancellation” by H. Deng and M. Doroslovački, which develops a wavelet-domain μ-law proportionate NLMS algorithm for identification and cancelling of sparse telephone network echoes. This work exploits the whitening and good time-frequency localisation properties of the wavelet transform to speed up the convergence for coloured input signals and to retain sparseness of the echo response in the wavelet transform domain. In “A low delay and fast converging improved proportionate algorithm for sparse system identification,” Andy W. H. Khong et al. propose a multidelay filter (MDF) implementation for improved proportionate NLMS for sparse system identification, inheriting the beneficial properties of both; namely, fast convergence and computational efficiency coupled with low bulk delay. As the authors show, the MDF implementation is nontrivial and requires time-domain coefficient updating.

The second area of papers is partial-update active noise control. In the first paper in this area, “Analysis of transient and steady-state behavior of a multichannel filtered-x partial-error affine projection algorithm,” A. Carini and S. L. Sicuranza apply partial-error complexity reduction to the filtered-x affine projection algorithm for multichannel active noise control, and provide a comprehensive analysis of the transient and steady-state behaviour of the adaptive algorithm drawing on energy conservation. In “Step size bound of the sequential partial update LMS algorithm with periodic input signals,” Pedro Ramos et al. show that for periodic input signals the sequential partial-update LMS and filtered-x LMS algorithms can achieve the same convergence performance as their full-update counterparts by increasing the step size appropriately. This essentially avoids any convergence penalty associated with sequential updating.

The third area focuses on general partial-update algorithms. In the first paper in this area, “Detection guided fast affine projection channel estimator for speech applications,” Yan Wu Jennifer et al. consider detection-guided identification of active taps in a long acoustic echo channel in order to shorten the actual channel and integrate it into the fast affine projection algorithm to attain faster convergence. The proposed algorithm is well suited for highly correlated input signals such as speech signals. In “Efficient multichannel NLMS implementation for acoustic echo cancellation,” Fredric Lindstrom et al. propose a multichannel acoustic echo cancellation algorithm based on normalized least-mean-square with partial updates favouring filters with the largest misadjustment.

The final area is devoted to blind source separation. In “Time domain convolutive blind source separation employing selective-tap adaptive algorithms,” Q. Pan and T. Aboulnasr propose time-domain convolutive blind source separation algorithms employing M-max and exclusive maximum selective-tap techniques. The resulting algorithms have reduced complexity and improved convergence performance thanks to partial updating and reduced interchannel coherence. In the final paper, “Underdetermined blind audio source separation using modal decomposition,” Abdeljalil Aïssa-El-Bey et al. present a novel blind source separation algorithm for audio signals using modal decomposition. In addition to instantaneous mixing, the authors consider convolutive mixing and exploit the sparseness of audio signals to identify the channel responses before applying modal decomposition.

In summary, we can say that sparseness in the context of adaptive filtering presents both challenges and opportunities. Standard adaptive algorithms suffer a degradation in performance when the system to be identified is sparse. This has created the need for new algorithms for sparse adaptive filtering, a challenge that has been well met to date for the particular applications addressed. When sparseness exists, or can be safely assumed, in input signals, this can be exploited to achieve both computational savings in partial-update schemes and, in certain specific cases, performance improvements. There remain several open research questions in this context and we look forward to an ongoing research effort in the scientific community and opportunities for algorithm deployment in real-time applications.

ACKNOWLEDGMENTS

This special issue has arisen as a result of the high levels of interest shown at a special session on this topic at EUSIPCO 2005 in Antalya, Turkey. It has been a great privilege to act as guest editors for this special issue and we extend our grateful thanks to all the authors and the publisher.

Kutluyıl Doğançay
Patrick A. Naylor
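The convergence gap between NLMS and proportionate updating on a sparse system, which motivates this special issue, can be illustrated with a short simulation. The sketch below identifies a synthetic sparse echo path with plain NLMS and with an IPNLMS-style proportionate gain; the function name `identify`, the filter length, the number of active taps, the step size `mu`, and the mixing constant `kappa` are all illustrative choices, not parameters taken from any of the papers in this issue.

```python
import numpy as np

def identify(h, n_iters, proportionate, mu=0.5, kappa=0.5, seed=1):
    """Identify the FIR system h with NLMS or an IPNLMS-style proportionate NLMS.

    Returns the normalized misalignment ||w - h|| / ||h|| at every iteration."""
    rng = np.random.default_rng(seed)       # same seed => same input for both runs
    N = len(h)
    w = np.zeros(N)
    x_buf = np.zeros(N)
    mis = []
    for _ in range(n_iters):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = rng.standard_normal()    # white input samples
        e = h @ x_buf - w @ x_buf           # noise-free a priori error
        if proportionate:
            # blend a uniform gain with one proportional to |w_i| (IPNLMS-style)
            g = (1 - kappa) / N + kappa * np.abs(w) / (np.sum(np.abs(w)) + 1e-8)
        else:
            g = np.full(N, 1.0 / N)         # uniform gain: reduces to standard NLMS
        w = w + mu * e * g * x_buf / (x_buf @ (g * x_buf) + 1e-8)
        mis.append(np.linalg.norm(w - h) / np.linalg.norm(h))
    return mis

rng = np.random.default_rng(0)
N = 64
h = np.zeros(N)
h[rng.choice(N, size=8, replace=False)] = rng.standard_normal(8)  # sparse: 8 of 64 taps active

mis_nlms = identify(h, 300, proportionate=False)
mis_prop = identify(h, 300, proportionate=True)
print(mis_nlms[-1], mis_prop[-1])
```

On a sparse response the proportionate variant concentrates its adaptation energy on the few active taps, so its misalignment falls well below that of NLMS over the same number of iterations, which is the behaviour the editorial describes.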
Hindawi Publishing Corporation
EURASIP Journal on Audio, Speech, and Music Processing
Volume 2007, Article ID 34242, 10 pages
doi:10.1155/2007/34242
Research Article
Set-Membership Proportionate Affine Projection Algorithms
Proportionate adaptive filters can improve the convergence speed for the identification of sparse systems as compared to their
conventional counterparts. In this paper, the idea of proportionate adaptation is combined with the framework of set-membership
filtering (SMF) in an attempt to derive novel computationally efficient algorithms. The resulting algorithms attain faster convergence for both sparse and dispersive channels while decreasing the average computational complexity, owing to
the data-discerning feature of the SMF approach. In addition, we propose a rule that allows us to automatically adjust the number
of past data pairs employed in the update. This leads to a set-membership proportionate affine projection algorithm (SM-PAPA)
having a variable data-reuse factor allowing a significant reduction in the overall complexity when compared with a fixed data-
reuse factor. Reduced-complexity implementations of the proposed algorithms are also considered that reduce the dimensions of
the matrix inversions involved in the update. Simulations show good results in terms of reduced number of updates, speed of
convergence, and final mean-squared error.
Copyright © 2007 Stefan Werner et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
error is upper bounded by a predetermined threshold.¹ Set-membership adaptive filters (SMAF) feature data-selective (sparse in time) updating, and a time-varying data-dependent step size that provides fast convergence as well as low steady-state error. SMAFs with low computational complexity per update are the set-membership NLMS (SM-NLMS) [15], the set-membership binormalized data-reusing (SM-BNDRLMS) [16], and the set-membership affine projection (SM-AP) [17] algorithms. In the following, we combine the frameworks of proportionate adaptation and SMF. A set-membership proportionate NLMS (SM-PNLMS) algorithm is proposed as a viable alternative to the SM-NLMS algorithm [15] for operation in sparse scenarios. Following the ideas of the IPNLMS algorithm, an efficient weight-scaling assignment is proposed that utilizes the information provided by the data-dependent step size. Thereafter, we propose a more general algorithm, the set-membership proportionate affine projection algorithm (SM-PAPA), that generalizes the ideas of the SM-PNLMS to reuse constraint sets from a fixed number of past input and desired signal pairs in the same way as the SM-AP algorithm [17]. The resulting algorithm can be seen as a set-membership version of the PAP algorithm [13, 14] with an optimized step size. As with the PAP algorithm, the faster convergence of the SM-PAPA algorithm may come at the expense of a slight increase in the computational complexity per update that is directly linked to the number of reuses employed, or data-reuse factor. To lower the overall complexity, we propose to use a time-varying data-reuse factor. The introduction of the variable data-reuse factor results in an algorithm that close to convergence takes the form of the simple SM-PNLMS algorithm. Thereafter, we consider an efficient implementation of the new SM-PAPA algorithm that reduces the dimensions of the matrices involved in the update.

The paper is organized as follows. Section 2 reviews the concept of SMF while the SM-PNLMS algorithm is proposed in Section 3. Section 4 derives the general SM-PAPA algorithm where both cases of fixed and time-varying data-reuse factor are studied. Section 5 provides the details of an SM-PAPA implementation using reduced matrix dimensions. In Section 6, the performances of the proposed algorithms are evaluated through simulations, which are followed by conclusions.

2. SET-MEMBERSHIP FILTERING

This section reviews the basic concepts of set-membership filtering (SMF). For a more detailed introduction to the concept of SMF, the reader is referred to [18]. Set-membership filtering is a framework applicable to filtering problems that are linear in parameters.² A specification on the filter parameters w ∈ C^N is achieved by constraining the magnitude of the output estimation error, e(k) = d(k) − w^H x(k), to be smaller than a deterministic threshold γ, where x(k) ∈ C^N and d(k) ∈ C denote the input vector and the desired output signal, respectively. As a result of the bounded error constraint, there will exist a set of filters rather than a single estimate.

Let S denote the set of all possible input-desired data pairs (x, d) of interest. Let Θ denote the set of all possible vectors w that result in an output error bounded by γ whenever (x, d) ∈ S. The set Θ, referred to as the feasibility set, is given by

Θ = ∩_{(x,d)∈S} { w ∈ C^N : |d − w^H x| ≤ γ }.  (1)

Adaptive SMF algorithms seek solutions that belong to the exact membership set ψ(k) constructed by input-signal and desired-signal pairs,

ψ(k) = ∩_{i=1}^{k} H(i),  (2)

where H(k) is referred to as the constraint set containing all vectors w for which the associated output error at time instant k is upper bounded in magnitude by γ:

H(k) = { w ∈ C^N : |d(k) − w^H x(k)| ≤ γ }.  (3)

It can be seen that the feasibility set Θ is a subset of the exact membership set ψ(k) at any given time instant. The feasibility set is also the limiting set of the exact membership set, that is, the two sets will be equal if the training signal traverses all signal pairs belonging to S. The idea of set-membership adaptive filters (SMAF) is to find adaptively an estimate that belongs to the feasibility set or to one of its members. Since ψ(k) in (2) is not easily computed, one approach is to apply one of the many optimal bounding ellipsoid (OBE) algorithms [18, 20–24], which try to approximate the exact membership set ψ(k) by tightly outer bounding it with ellipsoids. Adaptive approaches leading to algorithms with low peak complexity, O(N), compute a point estimate through projections using information provided by past constraint sets [15–17, 25–27]. In this paper, we are interested in algorithms derived from the latter approach.

3. THE SET-MEMBERSHIP PROPORTIONATE NLMS ALGORITHM

In this section, the idea of proportionate adaptation is applied to SMF in order to derive a data-selective algorithm, the set-membership proportionate normalized LMS (SM-PNLMS), suitable for sparse environments.

3.1. Algorithm derivation

The SM-PNLMS algorithm uses the information provided by the constraint set H(k) and the coefficient updating to solve the optimization problem employing the criterion

w(k + 1) = arg min_w ‖w − w(k)‖²_{G⁻¹(k)}  subject to: w ∈ H(k),  (4)

¹ For other reduced-complexity solutions, see, for example, [11], where the concept of partial updating is applied.
² This includes nonlinear problems like Volterra modeling; see, for example, [19].
where the norm employed is defined as ‖b‖²_A = b^H A b. Matrix G(k) is here chosen as a diagonal weighting matrix of the form

G(k) = diag( g₁(k), . . . , g_N(k) ).  (5)

The element values of G(k) will be discussed in Section 3.2. The optimization criterion in (4) states that if the previous estimate already belongs to the constraint set, w(k) ∈ H(k), it is a feasible solution and no update is needed. However, if w(k) ∉ H(k), an update is required. Following the principle of minimal disturbance, a feasible update is made such that w(k + 1) lies on the nearest boundary of H(k). In this case the updating equation is given by

w(k + 1) = w(k) + α(k) e*(k) G(k) x(k) / ( x^H(k) G(k) x(k) ),  (6)

where

α(k) = 1 − γ / |e(k)|  if |e(k)| > γ,  α(k) = 0  otherwise,  (7)

is a time-varying data-dependent step size, and e(k) is the a priori error given by

e(k) = d(k) − w^H(k) x(k).  (8)

For the proportionate algorithms considered in this paper, matrix G(k) will be diagonal. However, for other choices of G(k), it is possible to identify from (6) different types of SMAF available in the literature. For example, choosing G(k) = I gives the SM-NLMS algorithm [15], setting G(k) equal to a weighted covariance matrix will result in the BEACON recursions [28], and choosing G(k) such that it extracts the P ≤ N elements in x(k) of largest magnitude gives a partial-updating SMF [26]. Next we consider the weighting matrix used with the SM-PNLMS algorithm. The weighting elements, as also shown in the SM-PAPA listing in Section 4, are chosen as

gᵢ(k) = (1 − κα(k))/N + κα(k) |wᵢ(k)| / ‖w(k)‖₁,  i = 1, . . . , N,  (9)

where κ ∈ [0, 1] and ‖w(k)‖₁ = Σᵢ |wᵢ(k)| denotes the l₁ norm [2, 4]. The constant κ is included to increase the robustness to estimation errors in w(k), and from the simulations provided in Section 6, κ = 0.5 shows good performance for both sparse and dispersive systems. For κ = 1, the algorithm will converge faster but will be more sensitive to the sparseness assumption. The IPNLMS algorithm uses a similar strategy; see [4] for details. The updating expressions in (9) and (6) resemble those of the IPNLMS algorithm except for the time-varying step size α(k). From (9) we can observe the following: (1) during initial adaptation (i.e., during the transient) the solution is far from the steady-state solution, and consequently α(k) is large, and more weight will be placed at the stronger components of the adaptive filter impulse response; (2) as the error decreases, α(k) gets smaller, all the coefficients become equally important, and the algorithm behaves as the SM-NLMS algorithm.

4. THE SET-MEMBERSHIP PROPORTIONATE AFFINE-PROJECTION ALGORITHM

In this section, we extend the results from the previous section to derive an algorithm that utilizes the L(k) most recent constraint sets {H(i)}, i = k − L(k) + 1, . . . , k. The algorithm derivation will treat the most general case where L(k) is allowed to vary from one updating instant to another, that is, the case of a variable data-reuse factor. Thereafter, we provide algorithm implementations for the case of a fixed number of data reuses (i.e., L(k) = L), and the case of L(k) ≤ Lmax (i.e., L(k) is upper bounded but allowed to vary). The proposed algorithm, SM-PAPA, includes the SM-AP algorithm [17, 29] as a special case and is particularly useful whenever the input signal is highly correlated. As with the SM-PNLMS algorithm, the main idea is to allocate different weights to the filter coefficients using a weighting matrix G(k).
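The update (6)-(8) is easy to exercise in isolation. The sketch below implements one SM-PNLMS iteration, borrowing the proportionate gain formula from the SM-PAPA listing, and then runs it over a short noisy identification experiment to show the data-selective behaviour. The function name `sm_pnlms_update`, the test system, the noise level, and the threshold choice γ = √5·σ_n are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sm_pnlms_update(w, x, d, gamma, kappa=0.5):
    """One SM-PNLMS iteration following (6)-(8); the proportionate gains g_i(k)
    follow the formula shown in the SM-PAPA listing, with kappa = 0.5 as the
    value the authors report as a good compromise."""
    e = d - w @ x                           # a priori error, eq. (8)
    if abs(e) <= gamma:
        return w, False                     # w(k) already lies in H(k): no update
    alpha = 1.0 - gamma / abs(e)            # data-dependent step size, eq. (7)
    N = len(w)
    g = (1.0 - kappa * alpha) / N + kappa * alpha * np.abs(w) / (np.sum(np.abs(w)) + 1e-12)
    w_new = w + alpha * e * g * x / (x @ (g * x) + 1e-12)   # eq. (6), diagonal G(k)
    return w_new, True

# Single-step check: minimal disturbance places w(k+1) on the nearest boundary
# of H(k), so the a posteriori error magnitude equals gamma.
w0 = np.zeros(8); w0[2], w0[5] = 0.7, -0.3  # sparse coefficient estimate
x0 = np.random.default_rng(1).standard_normal(8)
d0 = w0 @ x0 + 0.5                          # forces |e(k)| = 0.5 > gamma
w1, updated = sm_pnlms_update(w0, x0, d0, gamma=0.1)
boundary_error = abs(d0 - w1 @ x0)          # approximately 0.1 = gamma

# Short identification run: count how often the bounded-error test triggers an update.
rng = np.random.default_rng(0)
N = 32
h = rng.standard_normal(N) * np.exp(-0.2 * np.arange(N))   # unknown system (decaying taps)
noise_std = 0.01
gamma = np.sqrt(5) * noise_std              # threshold as a multiple of sigma_n (illustrative)
w = np.zeros(N)
x_buf = np.zeros(N)
updates = 0
n_iters = 2000
for _ in range(n_iters):
    x_buf = np.roll(x_buf, 1)
    x_buf[0] = rng.standard_normal()        # white input
    d = h @ x_buf + noise_std * rng.standard_normal()
    w, did_update = sm_pnlms_update(w, x_buf, d, gamma)
    updates += did_update

update_fraction = updates / n_iters
misalignment = np.linalg.norm(w - h) / np.linalg.norm(h)
print(boundary_error, update_fraction, misalignment)
```

Only a fraction of the iterations trigger a coefficient update; once the filter has converged, the error stays inside the bound most of the time, which is the source of the reduced average complexity that the SMF framework offers.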
in ψ_{L(k)}(k), and matrix X(k) ∈ C^{N×L(k)} contains the corresponding input vectors, that is,

p(k) = [ p₁(k) p₂(k) · · · p_{L(k)}(k) ]^T,
d(k) = [ d(k) d(k − 1) · · · d(k − L(k) + 1) ]^T,  (12)
X(k) = [ x(k) x(k − 1) · · · x(k − L(k) + 1) ].

Applying the method of Lagrange multipliers for solving the minimization problem of (11), the update equation of the most general SM-PAPA version is obtained as

w(k + 1) = w(k) + G(k)X(k)[ X^H(k)G(k)X(k) ]⁻¹ [ e*(k) − p*(k) ]  if |e(k)| > γ,
w(k + 1) = w(k)  otherwise,  (13)

where e(k) = d(k) − X^T(k)w*(k). The recursion above requires that matrix X^H(k)X(k), needed for solving the vector of Lagrange multipliers, is nonsingular. To avoid problems, a regularization factor can be included in the inverse (common in conventional AP algorithms), that is, [X^H(k)X(k) + δI]⁻¹ with δ ≪ 1. The choice of pᵢ(k) can fit each problem at hand.

SM-PAPA

for each k
{
    e(k) = d(k) − w^H(k)x(k)
    if |e(k)| > γ
    {
        α(k) = 1 − γ/|e(k)|
        gᵢ(k) = (1 − κα(k))/N + κα(k)|wᵢ(k)| / Σ_{i=1}^{N} |wᵢ(k)|,  i = 1, . . . , N
        G(k) = diag( g₁(k) · · · g_N(k) )
        X(k) = [ x(k) U(k) ]
        φ(k) = x(k) − U(k)[ U^H(k)G(k)U(k) ]⁻¹ U^H(k)G(k)x(k)
        w(k + 1) = w(k) + α(k)e*(k) G(k)φ(k) / ( φ^H(k)G(k)φ(k) )
    }
    else
    {
        w(k + 1) = w(k)
    }
}
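The listing above can be transcribed almost line for line. The real-valued sketch below adds a small regularization δI to the matrix inverse, as suggested for the AP family, and checks two properties implied by the derivation: the a posteriori error on the current data pair lands on the boundary |e| = γ, while the outputs on the reused past input vectors are unchanged, because the update direction G(k)φ(k) satisfies u^T G(k)φ(k) = 0 for every column u of U(k). The function name and problem sizes are illustrative.

```python
import numpy as np

def sm_papa_update(w, X, d_vec, gamma, kappa=0.5, delta=1e-12):
    """One SM-PAPA iteration, transcribed from the listing (real-valued case).

    X is the N x L(k) matrix [x(k) x(k-1) ... x(k-L(k)+1)], d_vec holds the
    matching desired samples; delta is a small regularization, cf. [X^H X + delta I]^-1."""
    x, U = X[:, 0], X[:, 1:]
    e = d_vec[0] - w @ x
    if abs(e) <= gamma:
        return w                                       # w(k) already in H(k): no update
    alpha = 1.0 - gamma / abs(e)
    N = len(w)
    g = (1.0 - kappa * alpha) / N + kappa * alpha * np.abs(w) / (np.sum(np.abs(w)) + 1e-12)
    G = np.diag(g)
    M = U.T @ G @ U + delta * np.eye(U.shape[1])
    phi = x - U @ np.linalg.solve(M, U.T @ G @ x)      # phi(k): G-weighted projection residual
    return w + alpha * e * (G @ phi) / (phi @ G @ phi)

rng = np.random.default_rng(0)
N, L = 16, 3
w = 0.1 * rng.standard_normal(N)
X = rng.standard_normal((N, L))
d_vec = rng.standard_normal(L)
d_vec[0] = w @ X[:, 0] + 1.0        # force |e(k)| = 1 > gamma
gamma = 0.05
w_new = sm_papa_update(w, X, d_vec, gamma)
# After the update, the a posteriori error on the current pair sits on the
# boundary |e| = gamma; the errors on the reused past pairs are untouched.
print(abs(d_vec[0] - w_new @ X[:, 0]))
```

These two checks follow from φ^H(k)G(k)x(k) = φ^H(k)G(k)φ(k) and φ^H(k)G(k)U(k) = 0, and make a convenient sanity test for any implementation of the listing.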
4.3. SM-PAPA with variable data reuse

Table 1: Quantization levels for Lmax = 5.

Figure 1: The amplitude of two impulse responses used in the simulations: (a) sparse microwave channel (see Footnote 4), (b) dispersive channel.

Figure 3: Learning curves in a dispersive system for the SM-PNLMS, the SM-PAPA (L = 2), the SM-NLMS, the NLMS, the IPNLMS, and the PAP (L = 2) algorithms. SNR = 40 dB, γ = √2 σ_n, and μ = 0.4.

Figure 4: Learning curves in a sparse system for the SM-PAPA (L = 2 to 5) and the SM-REDPAPA (Lmax = 2 to 5) based on a uniformly quantized α₁(k). SNR = 40 dB, γ = √2 σ_n.
7. CONCLUSIONS

This paper presented novel set-membership filtering (SMF) algorithms suitable for applications in sparse environments. The set-membership proportionate NLMS (SM-PNLMS) algorithm and the set-membership proportionate affine projection algorithm (SM-PAPA) were proposed as viable alternatives to the SM-NLMS and SM-AP algorithms. The algorithms benefit from the reduced average computational complexity of the SMF strategy and fast convergence for sparse scenarios resulting from proportionate updating. Simulations were presented for both sparse and dispersive impulse responses. It was verified that not only can the proposed SMF algorithms further reduce the computational complexity when compared with their conventional counterparts, the IPNLMS and PAP algorithms, but they also present faster convergence to the same level of MSE when compared with the SM-NLMS and the SM-AP algorithms. The weight assignment of the proposed algorithms utilizes the information provided by a time-varying step size typical of SMF algorithms and is robust to the assumption of a sparse impulse response. In order to reduce the overall complexity of the SM-PAPA, we proposed to employ a variable data-reuse factor. The introduction of a variable data-reuse factor allows a significant reduction in the overall complexity as compared to a fixed data-reuse factor. Simulations showed that the proposed algorithm could outperform the SM-PAPA with a fixed number of data reuses in terms of computational complexity and final mean-squared error.

Table 2: Distribution of the variable data-reuse factor L(k) used in the SM-PAPA for the case when α₁(k) is uniformly quantized.

Lmax | L(k) = 1 | L(k) = 2 | L(k) = 3 | L(k) = 4 | L(k) = 5
1    | 100%     | —        | —        | —        | —
2    | 54.10%   | 45.90%   | —        | —        | —
3    | 36.55%   | 45.80%   | 17.65%   | —        | —
4    | 28.80%   | 36.90%   | 26.55%   | 7.75%    | —
5    | 23.95%   | 29.95%   | 28.45%   | 13.50%   | 4.15%

Table 3: Distribution of the variable data-reuse factor L(k) used in the SM-PAPA for the case when α₁(k) is quantized according to (24), β = 2.

Lmax | L(k) = 1 | L(k) = 2 | L(k) = 3 | L(k) = 4 | L(k) = 5
1    | 100%     | —        | —        | —        | —
2    | 37.90%   | 62.90%   | —        | —        | —
3    | 28.90%   | 35.45%   | 35.65%   | —        | —
4    | 28.86%   | 21.37%   | 33.51%   | 18.26%   | —
5    | 25.71%   | 15.03%   | 23.53%   | 25.82%   | 9.91%

APPENDIX

The inverse in (26) can be partitioned as

[ X^H(k)G(k)X(k) ]⁻¹ = ( [X(k) U(k)]^H G(k) [X(k) U(k)] )⁻¹ = [ A  B^H ; B  C ],  (A.1)

where

A = [ Φ^H(k)G(k)Φ(k) ]⁻¹,
B = −[ U^H(k)G(k)U(k) ]⁻¹ U^H(k)G(k)X(k) A,  (A.2)

with Φ(k) defined as in (29). Therefore,

X(k)[ X^H(k)G(k)X(k) ]⁻¹ λ*(k) = [ X(k) U(k) ] [ A ; B ] λ*(k)
= ( X(k) − U(k)[ U^H(k)G(k)U(k) ]⁻¹ U^H(k)G(k)X(k) ) [ Φ^H(k)G(k)Φ(k) ]⁻¹ λ*(k)
= Φ(k)[ Φ^H(k)G(k)Φ(k) ]⁻¹ λ*(k).  (A.3)

ACKNOWLEDGMENTS

The authors would like to thank CAPES, CNPq, FAPERJ (Brazil), and the Academy of Finland, Smart and Novel Radios (SMARAD) Center of Excellence (Finland), for partially supporting this work.

REFERENCES

[1] R. K. Martin, W. A. Sethares, R. C. Williamson, and C. R. Johnson Jr., “Exploiting sparsity in adaptive filters,” IEEE Transactions on Signal Processing, vol. 50, no. 8, pp. 1883–1894, 2002.
[2] D. L. Duttweiler, “Proportionate normalized least-mean-squares adaptation in echo cancelers,” IEEE Transactions on Speech and Audio Processing, vol. 8, no. 5, pp. 508–518, 2000.
[3] S. L. Gay, “An efficient, fast converging adaptive filter for network echo cancellation,” in Proceedings of the 32nd Asilomar Conference on Signals, Systems & Computers, vol. 1, pp. 394–398, Pacific Grove, Calif, USA, November 1998.
[4] J. Benesty and S. L. Gay, “An improved PNLMS algorithm,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’02), vol. 2, pp. 1881–1884, Orlando, Fla, USA, May 2002.
[5] B. D. Rao and B. Song, “Adaptive filtering algorithms for promoting sparsity,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’03), vol. 6, pp. 361–364, Hong Kong, April 2003.
[6] A. W. H. Khong, J. Benesty, and P. A. Naylor, “An improved proportionate multi-delay block adaptive filter for packet-switched network echo cancellation,” in Proceedings of the 13th European Signal Processing Conference (EUSIPCO ’05), Antalya, Turkey, September 2005.
[7] K. Doğançay and P. Naylor, “Recent advances in partial update and sparse adaptive filters,” in Proceedings of the 13th European Signal Processing Conference (EUSIPCO ’05), Antalya, Turkey, September 2005.
[8] A. Deshpande and S. L. Grant, “A new multi-algorithm approach to sparse system adaptation,” in Proceedings of the 13th European Signal Processing Conference (EUSIPCO ’05), Antalya, Turkey, September 2005.
[9] S. Werner, J. A. Apolinário Jr., P. S. R. Diniz, and T. I. Laakso, “A set-membership approach to normalized proportionate adaptation algorithms,” in Proceedings of the 13th European Signal Processing Conference (EUSIPCO ’05), Antalya, Turkey, September 2005.
[10] H. Deng and M. Doroslovački, “Proportionate adaptive algorithms for network echo cancellation,” IEEE Transactions on Signal Processing, vol. 54, no. 5, pp. 1794–1803, 2006.
[11] O. Tanrıkulu and K. Doğançay, “Selective-partial-update normalized least-mean-square algorithm for network echo cancellation,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’02), vol. 2, pp. 1889–1892, Orlando, Fla, USA, May 2002.
[12] J. Kivinen and M. K. Warmuth, “Exponentiated gradient versus gradient descent for linear predictors,” Information and Computation, vol. 132, no. 1, pp. 1–63, 1997.
[13] J. Benesty, T. Gänsler, D. Morgan, M. Sondhi, and S. Gay, Eds., Advances in Network and Acoustic Echo Cancellation, Springer, Boston, Mass, USA, 2001.
[14] O. Hoshuyama, R. A. Goubran, and A. Sugiyama, “A generalized proportionate variable step-size algorithm for fast changing acoustic environments,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’04), vol. 4, pp. 161–164, Montreal, Quebec, Canada, May 2004.
[15] S. Gollamudi, S. Nagaraj, S. Kapoor, and Y.-F. Huang, “Set-membership filtering and a set-membership normalized LMS algorithm with an adaptive step size,” IEEE Signal Processing Letters, vol. 5, no. 5, pp. 111–114, 1998.
[16] P. S. R. Diniz and S. Werner, “Set-membership binormalized data-reusing LMS algorithms,” IEEE Transactions on Signal Processing, vol. 51, no. 1, pp. 124–134, 2003.
[17] S. Werner and P. S. R. Diniz, “Set-membership affine projection algorithm,” IEEE Signal Processing Letters, vol. 8, no. 8, pp. 231–235, 2001.
[18] S. Gollamudi, S. Kapoor, S. Nagaraj, and Y.-F. Huang, “Set-membership adaptive equalization and an updator-shared implementation for multiple channel communications systems,” IEEE Transactions on Signal Processing, vol. 46, no. 9, pp. 2372–2385, 1998.
[19] A. V. Malipatil, Y.-F. Huang, S. Andra, and K. Bennett, “Kernelized set-membership approach to nonlinear adaptive filtering,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’05), vol. 4, pp. 149–152, Philadelphia, Pa, USA, March 2005.
[20] E. Fogel and Y.-F. Huang, “On the value of information in system identification—bounded noise case,” Automatica, vol. 18, no. 2, pp. 229–238, 1982.
[21] S. Dasgupta and Y.-F. Huang, “Asymptotically convergent modified recursive least-squares with data-dependent updating and forgetting factor for systems with bounded noise,” IEEE Transactions on Information Theory, vol. 33, no. 3, pp. 383–392, 1987.
[22] J. R. Deller Jr., M. Nayeri, and M. S. Liu, “Unifying the landmark developments in optimal bounding ellipsoid identification,” International Journal of Adaptive Control and Signal Processing, vol. 8, no. 1, pp. 43–60, 1994.
[23] D. Joachim and J. R. Deller Jr., “Multiweight optimization in optimal bounding ellipsoid algorithms,” IEEE Transactions on Signal Processing, vol. 54, no. 2, pp. 679–690, 2006.
[24] S. Gollamudi, S. Nagaraj, and Y.-F. Huang, “Blind equalization with a deterministic constant modulus cost—a set-membership filtering approach,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’00), vol. 5, pp. 2765–2768, Istanbul, Turkey, June 2000.
[25] P. S. R. Diniz and S. Werner, “Set-membership binormalized data-reusing algorithms,” in Proceedings of the IFAC Symposium on System Identification (SYSID ’00), vol. 3, pp. 869–874, Santa Barbara, Calif, USA, June 2000.
[26] S. Werner, M. L. R. de Campos, and P. S. R. Diniz, “Partial-update NLMS algorithms with data-selective updating,” IEEE Transactions on Signal Processing, vol. 52, no. 4, pp. 938–949, 2004.
[27] S. Werner, J. A. Apolinário Jr., M. L. R. de Campos, and P. S. R. Diniz, “Low-complexity constrained affine-projection algorithms,” IEEE Transactions on Signal Processing, vol. 53, no. 12, pp. 4545–4555, 2005.
[28] S. Nagaraj, S. Gollamudi, S. Kapoor, and Y.-F. Huang, “BEACON: an adaptive set-membership filtering technique with sparse updates,” IEEE Transactions on Signal Processing, vol. 47, no. 11, pp. 2928–2941, 1999.
[29] S. Werner, P. S. R. Diniz, and J. E. W. Moreira, “Set-membership affine projection algorithm with variable data-reuse factor,” in Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS ’06), pp. 261–264, Island of Kos, Greece, May 2006.
[30] M. Rupp, “A family of adaptive filter algorithms with decorrelating properties,” IEEE Transactions on Signal Processing, vol. 46, no. 3, pp. 771–775, 1998.
Hindawi Publishing Corporation
EURASIP Journal on Audio, Speech, and Music Processing
Volume 2007, Article ID 96101, 5 pages
doi:10.1155/2007/96101
Research Article
Wavelet-Based MPNLMS Adaptive Algorithm for
Network Echo Cancellation
The μ-law proportionate normalized least mean square (MPNLMS) algorithm has recently been proposed to solve the slow convergence of the proportionate normalized least mean square (PNLMS) algorithm after its initial fast-converging period. For colored input, however, it may still converge slowly when the eigenvalue spread of the input signal's autocorrelation matrix is large. In this paper, we use the wavelet transform to whiten the input signal. Due to the good time-frequency localization property of the wavelet transform, an impulse response that is sparse in the time domain is also sparse in the wavelet domain. By applying the MPNLMS technique in the wavelet domain, fast convergence for colored input is observed. Furthermore, we show that some nonsparse impulse responses may become sparse in the wavelet domain, which further motivates the wavelet-based MPNLMS algorithm. Advantages of this approach are documented.
Copyright © 2007 H. Deng and M. Doroslovački. This is an open access article distributed under the Creative Commons
Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
advantageous usage of the MPNLMS algorithm by using the wavelet transform to cases when the input signal is colored or when the impulse response to be identified is nonsparse.

2.1. Colored input case

The optimal step-size control factors are derived under the assumption that the input is white. If the input is colored, which is often the case in network echo cancellation, the convergence time of each coefficient also depends on the eigenvalues of the input signal's autocorrelation matrix. Since, in general, we do not know the statistical characteristics of the input signal, it is impossible to derive the optimal step-size control factors without introducing additional computational complexity into the adaptive algorithm. Furthermore, a large eigenvalue spread of the input signal's autocorrelation matrix slows down the overall convergence, as follows from standard LMS performance analysis [7].

[Figure 1: Network echo path impulse response (coefficient amplitude versus time in ms).]
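The effect of eigenvalue spread can be made concrete with a short numerical check. The sketch below is illustrative and not part of the original paper; it assumes an AR(1) input model with pole 0.9, mirroring the lowpass-filtered white noise used in the simulations later in this article.

```python
import numpy as np

# Autocorrelation of x(n) = 0.9 x(n-1) + u(n), u white with unit variance,
# i.e. white noise through a one-pole lowpass filter: r(k) = 0.9**k / (1 - 0.81)
L = 32
a = 0.9
r = a ** np.arange(L) / (1.0 - a**2)
R = np.array([[r[abs(i - j)] for j in range(L)] for i in range(L)])  # Toeplitz matrix

eigs = np.linalg.eigvalsh(R)
spread = eigs.max() / eigs.min()
# For white input the spread is 1; here it is roughly two orders of magnitude
# larger (bounded above by ((1 + 0.9)/(1 - 0.9))**2 = 361), so the slowest LMS
# mode converges correspondingly more slowly.
```

The spread grows toward the ratio of the maximum to the minimum of the input power spectrum as the filter length increases, which is why a whitening transform helps most for long filters.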
One solution to the slow convergence of LMS for colored input is the so-called transform-domain LMS [7]. By using a unitary transform such as the discrete Fourier transform (DFT) or the discrete cosine transform (DCT), we can make the input signal's autocorrelation matrix nearly diagonal. We can further normalize the transformed input vector by the estimated power of each input tap to bring the autocorrelation matrix close to the identity matrix, thus decreasing the eigenvalue spread and improving the overall convergence.

However, working in the transform domain has another effect: the adaptive filter now estimates the transform coefficients of the original impulse response [8]. The number of active coefficients to be identified can differ from the number of active coefficients in the original impulse response; in some cases it can be much smaller, and in others much larger.

[Figure 2: DWT of the impulse response in Figure 1 (coefficient amplitude versus tap index).]

The MPNLMS algorithm works well only for sparse impulse responses. If the impulse response is not sparse, that is, if most coefficients are active, the MPNLMS algorithm's performance degrades greatly. It is well known that a system that is sparse in the time domain is nonsparse in the frequency domain. For example, if a system has only one active coefficient in the time domain (very sparse), all of its coefficients are active in the frequency domain. Therefore, the DFT and DCT transform a sparse impulse response into a nonsparse one, and we cannot apply the MPNLMS algorithm.

The discrete wavelet transform (DWT) has gained a lot of attention in signal processing in recent years. Due to its good time-frequency localization property, it can transform a time-domain sparse system into a sparse wavelet-domain system [8]. Let us consider the network echo path illustrated in Figure 1; this is a sparse impulse response. From Figure 2, we see that it is sparse in the wavelet domain as well. Here, we have used the 9-level Haar wavelet transform on 512 data points. The DWT also has a band-partitioning property similar to that of the DFT or DCT, which whitens the input signal. Therefore, we can apply the MPNLMS algorithm directly to the transformed input to achieve fast convergence for colored input.

The proposed wavelet MPNLMS (WMPNLMS) algorithm is listed in Algorithm 1, where x(k) is the input signal vector in the time domain, L is the number of adaptive filter coefficients, T represents the DWT, x_T(k) is the input signal vector in the wavelet domain, x_{T,i}(k) is the ith component of x_T(k), w_T(k) is the adaptive filter coefficient vector in the wavelet domain, w_{T,l}(k) is the lth component of w_T(k), y(k) is the output of the adaptive filter, d(k) is the reference signal, e(k) is the error signal driving the adaptation, σ²_{x_T,i}(k) is the estimated average power of the ith input tap in the wavelet domain, α is the forgetting factor with typical value 0.95, β is the step-size parameter, and δ_p and ρ are small positive numbers used to prevent zero or extremely small adaptive
filter coefficients from stalling. The parameter ε defines the neighborhood boundary of the optimal adaptive filter coefficients. The instant when all adaptive filter coefficients have crossed the boundary defines the convergence time of the adaptive filter. The definition of the matrix T can be found in [9, 10]. Computationally efficient algorithms exist for the calculation of x_T(k) due to the convolution-downsampling structure of the DWT; the extreme case of computational simplicity corresponds to the use of Haar wavelets [11]. The average power of the ith input tap in the wavelet domain is estimated recursively using an exponentially decaying time window of unit area. There are alternative ways to do the estimation; a common theme in all of them is to find the proper balance between the influence of the old input values and the current input values. The balance depends on whether the input is nonstationary or stationary. Note that the multiplication with D⁻¹(k + 1) assigns a different normalization factor to every adaptive coefficient. This is not the case in the ordinary NLMS algorithm, where the normalization factor is common to all coefficients. In the WMPNLMS algorithm, the normalization aims to decrease the eigenvalue spread of the autocorrelation matrix of the transformed input vector.

[Algorithm 1 (excerpt): x(k) = [x(k), x(k − 1), …, x(k − L + 1)]^T; x_T(k) = Tx(k).]

[Figure 3: Learning curves for wavelet- and nonwavelet-based proportionate algorithms.]

Now, we are going to use a 512-tap wavelet-based adaptive filter (covering 64 ms at a sampling frequency of 8 kHz) to identify the network echo path illustrated in Figure 1. The input signal is generated by passing white Gaussian noise with zero mean and unit variance through a lowpass filter with one pole at 0.9. We also add white Gaussian noise to the output of the echo path to control the steady-state output error of the adaptive filter. The WMPNLMS algorithm uses δ_p = 0.01 and ρ = 0.01; β is chosen to provide the same steady-state error as the MPNLMS and SPNLMS algorithms. From Figure 3, we can see that the proposed WMPNLMS algorithm has a noticeable improvement over the time-domain MPNLMS algorithm. Note that SPNLMS stands for the segmented PNLMS [5]; this is the MPNLMS algorithm in which the logarithm function is approximated by linear segments.

2.2. Nonsparse impulse response case

In some networks, nonsparse impulse responses can appear. Figure 4 shows an echo path impulse response of a digital subscriber line (DSL) system. We can see that it is not sparse in the time domain; it has a very short fast-changing segment and a very long slowly decreasing tail [11]. If we apply the MPNLMS algorithm to this type of impulse response, we cannot expect to improve the convergence speed. But if we transform the impulse response into the wavelet domain by using the 9-level Haar wavelet transform, it turns into a sparse impulse response, as shown in Figure 5. Now the WMPNLMS can speed up the convergence.

To evaluate the performance of the WMPNLMS algorithm identifying the DSL echo path shown in Figure 4, we use an adaptive filter with 512 taps. The input signal is white. As previously, we use δ_p = 0.01, ρ = 0.01, and a β that provides the same steady-state error as the NLMS, MPNLMS, and SPNLMS algorithms. Figure 6 shows learning curves for identifying the DSL echo path. We can see that the NLMS algorithm and the wavelet-based NLMS algorithm have nearly the same performance, because the input signal is white. The MPNLMS algorithm has marginal improvement in this case because the impulse response of the DSL echo path is not very sparse. But the WMPNLMS algorithm has much faster
convergence due to the sparseness of the impulse response in the wavelet domain and the algorithm's proportionate adaptation mechanism. The wavelet-based NLMS algorithm also identifies a sparse impulse response, but does not speed up the convergence by using the proportionate adaptation mechanism. Compared to the computational and memory requirements listed in [5, Table IV] for the MPNLMS algorithm, the WMPNLMS algorithm, in the case of Haar wavelets with M levels of decomposition, requires M + 2L more multiplications, L − 1 more divisions, 2M + L − 1 more additions/subtractions, and 2L − 1 more memory elements.

[Figure 4: DSL echo path impulse response (amplitude versus samples).]

[Figure 5: Wavelet domain coefficients for the DSL echo path impulse response in Figure 4.]

Simulation parameters. Input signal: white Gaussian noise. Echo path impulse response: Figure 4. Near-end noise: −60 dBm white Gaussian noise. Input signal power: −10 dBm. Echo return loss: 14 dB. Step-size parameter: 0.3 (NLMS, MPNLMS, SPNLMS).

[Figure 6: Learning curves for identifying the DSL network echo path (misalignment in dB versus iteration number).]

3. CONCLUSION

REFERENCES

[1] D. L. Duttweiler, "Proportionate normalized least-mean-squares adaptation in echo cancelers," IEEE Transactions on Speech and Audio Processing, vol. 8, no. 5, pp. 508–518, 2000.
[2] S. L. Gay, "An efficient, fast converging adaptive filter for network echo cancellation," in Proceedings of the 32nd Asilomar Conference on Signals, Systems & Computers (ACSSC '98), vol. 1, pp. 394–398, Pacific Grove, Calif, USA, November 1998.
[3] H. Deng and M. Doroslovački, "Modified PNLMS adaptive algorithm for sparse echo path estimation," in Proceedings of the Conference on Information Sciences and Systems, pp. 1072–1077, Princeton, NJ, USA, March 2004.
[4] H. Deng and M. Doroslovački, "Improving convergence of the PNLMS algorithm for sparse impulse response identification," IEEE Signal Processing Letters, vol. 12, no. 3, pp. 181–184, 2005.
[5] H. Deng and M. Doroslovački, "Proportionate adaptive algorithms for network echo cancellation," IEEE Transactions on Signal Processing, vol. 54, no. 5, pp. 1794–1803, 2006.
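The wavelet-domain proportionate update described in this article can be sketched in a few lines. The following toy example is illustrative only, not the authors' reference implementation: it uses a single-level Haar transform on 8 taps instead of the paper's 9-level transform on 512, a simplified μ-law gain rule, and hypothetical variable names. It shows the three ingredients working together: the orthonormal transform T, the recursive per-tap power estimate, and the normalized proportionate step-size control factors.

```python
import numpy as np

rng = np.random.default_rng(0)
L = 8                                     # toy filter length (power of two)

# Orthonormal single-level Haar analysis matrix T (the paper uses a
# 9-level transform on 512 taps; one level is enough to illustrate)
T = np.zeros((L, L))
for i in range(L // 2):
    T[i, 2 * i] = T[i, 2 * i + 1] = 1.0 / np.sqrt(2.0)   # averaging rows
    T[L // 2 + i, 2 * i] = 1.0 / np.sqrt(2.0)            # differencing rows
    T[L // 2 + i, 2 * i + 1] = -1.0 / np.sqrt(2.0)

h = np.zeros(L)
h[2] = 1.0                                # sparse unknown echo path
w_T = np.zeros(L)                         # adaptive coefficients, wavelet domain
p = np.full(L, 1e-2)                      # per-tap power estimates
alpha, beta, delta_p, rho = 0.95, 0.05, 0.01, 0.01

x = np.zeros(L)
for k in range(5000):
    x = np.roll(x, 1)
    x[0] = rng.standard_normal()          # white input for this toy run
    x_T = T @ x                           # transformed input vector
    p = alpha * p + (1 - alpha) * x_T**2  # recursive power estimate
    e = h @ x - w_T @ x_T                 # error against reference d(k) = h^T x(k)
    F = np.log1p(1000.0 * np.abs(w_T))    # mu-law proportionality (simplified)
    g = np.maximum(rho * F.max(), F) + delta_p
    g = g / g.sum()                       # normalized step-size control factors
    w_T += beta * g * e * x_T / (p + 1e-6)  # proportionate, normalized update

misalignment = np.linalg.norm(T.T @ w_T - h) / np.linalg.norm(h)
```

Because T is orthonormal, the time-domain estimate is simply T^T w_T, and the misalignment after adaptation should be small for this noiseless toy setup.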
Research Article
A Low Delay and Fast Converging Improved Proportionate
Algorithm for Sparse System Identification
A sparse system identification algorithm for network echo cancellation is presented. This new approach exploits both the fast convergence of the improved proportionate normalized least mean square (IPNLMS) algorithm and the efficient implementation of the multidelay adaptive filtering (MDF) algorithm, inheriting the beneficial properties of both. The proposed IPMDF algorithm is evaluated using impulse responses with various degrees of sparseness. Simulation results are also presented for both speech and white Gaussian noise input sequences. It is shown that the IPMDF algorithm outperforms the MDF and IPNLMS algorithms for both sparse and dispersive echo path impulse responses. The computational complexity of the proposed algorithm is also discussed.
Copyright © 2007 Andy W. H. Khong et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited.
1. INTRODUCTION

Research on network echo cancellation is increasingly important with the advent of voice over internet protocol (VoIP). In such systems, where traditional telephony equipment is connected to the packet-switched network, the echo path impulse response, which is typically of length 64–128 milliseconds, exhibits an "active" region in the range of 8–12 milliseconds duration; consequently, the impulse response is dominated by regions where magnitudes are close to zero, making the impulse response sparse. The "inactive" region is due to the presence of bulk delay caused by network propagation, encoding, and jitter buffer delays [1]. Other applications for sparse system identification include wavelet identification using marine seismic signals [2] and geophysical seismic applications [3, 4].

Classical adaptive algorithms with a uniform step-size across all filter coefficients, such as the normalized least mean square (NLMS) algorithm, have slow convergence in sparse network echo cancellation applications. One of the first algorithms to exploit the sparse nature of network impulse responses is the proportionate normalized least mean square (PNLMS) algorithm [5], where each filter coefficient is updated with an independent step-size that is proportional to the estimated filter coefficient. Subsequent improved versions such as the IPNLMS [6] and IIPNLMS [7] algorithms were proposed, which achieve improved convergence by introducing a controlled mixture of proportionate (PNLMS) and nonproportionate (NLMS) adaptation. Consequently, these algorithms perform better than PNLMS for sparse and, in some cases, for dispersive impulse responses. To reduce the computational complexity of PNLMS, the sparse partial update NLMS (SPNLMS) algorithm was proposed [8] where, similar to the selective partial update NLMS (SPUNLMS) algorithm [9], only the taps corresponding to the M largest absolute values of the product of input signal and filter coefficients are selected for adaptation. An optimal step-size for PNLMS has been derived in [10]; employing an approximate μ-law function, the proposed segment PNLMS (SPNLMS) outperforms the PNLMS algorithm.

In recent years, frequency-domain adaptive algorithms have become popular due to their efficient implementation. These algorithms incorporate block updating strategies whereby the fast Fourier transform (FFT) algorithm [11] is used together with the overlap-save method [12, 13]. One of the main drawbacks of these approaches is the delay introduced between the input and output, which can be equivalent to the length of the adaptive filter. Consequently, for long impulse responses, this delay can be considerable, since the number of filter coefficients can be several thousand [14]. To
[Figure 1: Schematic diagram of an echo canceller.]

where v(n) and w(n) are defined as the near-end speech signal and ambient noise, respectively. For simplicity, we will temporarily ignore the effects of double-talk and ambient noise, that is, v(n) = w(n) = 0, in the description of the algorithms.¹

2.1. The PNLMS and IPNLMS algorithms

where δ_NLMS = σ_x² is the variance of the input signal [6]. It can be seen that for ρ ≥ 1, PNLMS is equivalent to NLMS.

¹ An earlier version of this work was presented at the EUSIPCO 2005 special session on sparse and partial update adaptive filters [17].
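The per-coefficient step-size control that distinguishes the proportionate family from NLMS can be sketched as follows. This is a hedged illustration using the widely cited PNLMS and IPNLMS gain formulas; the function names and default parameter values are ours, not from this paper.

```python
import numpy as np

def pnlms_gains(h_hat, rho=0.01, delta_p=0.01):
    """PNLMS step-size control: gains proportional to coefficient magnitude,
    floored so that inactive taps keep adapting."""
    gamma = np.maximum(rho * np.maximum(delta_p, np.abs(h_hat).max()),
                       np.abs(h_hat))
    return gamma / gamma.sum()

def ipnlms_gains(h_hat, alpha=0.0):
    """IPNLMS: controlled mixture of nonproportionate (uniform) and
    proportionate adaptation; alpha -> -1 approaches NLMS behavior."""
    L = len(h_hat)
    return ((1 - alpha) / (2 * L)
            + (1 + alpha) * np.abs(h_hat) / (2 * np.abs(h_hat).sum() + 1e-10))

h_hat = np.zeros(8)
h_hat[1] = 1.0                    # a very sparse coefficient estimate
g_p = pnlms_gains(h_hat)
g_ip = ipnlms_gains(h_hat, alpha=0.0)
# The active tap receives most of the update energy in both cases,
# but IPNLMS always reserves a uniform share for every tap.
```

Both gain vectors sum to one, so the total adaptation energy matches NLMS; only its distribution across taps changes.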
where k = 0, 1, …, K − 1 is the block index while j = 0, 1, …, N − 1 is the tap index within each kth block. The IPMDF algorithm update equation is then given by

ĥ_k(m) = ĥ_k(m − 1) + Lμ Q_k(m) G^{10} D*(m − k) [S_IPMDF(m) + δ_IPMDF]^{−1} e(m),  (28)

We consider the computational complexity of the proposed IPMDF algorithm. We note that although the IPMDF algorithm is updated in the time domain, the error e(m) is generated using frequency-domain coefficients and hence five FFT blocks are required. Since a 2N-point FFT requires 2N log₂ N real multiplications, the number of multiplications required per output sample for each algorithm is
It can be seen that the complexity of IPMDF is only modestly higher than that of MDF. However, as we will see in Section 4, the performance of IPMDF far exceeds that of MDF for both

[Figure 4: Relative convergence η (dB) of IPMDF, MDF, IPNLMS, and NLMS using WGN input, with μ = 0.15 for NLMS and IPNLMS. SNR = 30 dB.]

[Figure 6: Relative convergence of IPMDF, MDF, and IPNLMS using speech input with echo path change at 3 seconds.]
[Figure 5: Relative convergence of IPMDF, MDF, IPNLMS, and NLMS using WGN input with echo path change at 3 s. SNR = 30 dB.]

We compare the tracking performance of the algorithms as shown in Figure 5 using a WGN input sequence. In this simulation, an echo path change, comprising an additional 12-sample delay, was introduced after 3 seconds. As before, the frame size for the IPMDF and MDF algorithms is N = 64, while for IPNLMS and NLMS, μ_IPNLMS = μ_NLMS = 0.15 is used. We see that IPMDF achieves the highest initial rate of convergence. When compared with MDF, the IPMDF algorithm has a higher tracking capability following the echo path change at 3 seconds. Compared with the IPNLMS algorithm, a delay is introduced by block processing of the input data for both the MDF and IPMDF algorithms. As a result, IPNLMS achieves a better tracking capability than the MDF algorithm. The tracking capability of NLMS is slower compared to IPNLMS and IPMDF due to its relatively slow convergence rate. Although delay exists for the IPMDF algorithm, the reduction in delay due to the multidelay structure allows the IPMDF algorithm to

4.2. Synthetic impulse responses with various degrees of sparseness

We illustrate the robustness of IPMDF to impulse response sparseness. Impulse responses with various degrees of sparseness are generated synthetically using an L × 1 exponentially decaying window [18], which is defined as

u = [p^T | 1, e^{−1/ψ}, e^{−2/ψ}, …, e^{−(L_u − 1)/ψ}]^T,  (33)

where the L_p × 1 vector p models the bulk delay and is a zero-mean WGN sequence with variance σ_p², and L_u = L − L_p is the length of the decaying window, while ψ ∈ Z⁺ is the decay constant. Defining an L_u × 1 vector b as a zero-mean WGN sequence with variance σ_b², the L × 1 synthetic impulse response can then be expressed as

h = [ I_{L_p × L_p}  0_{L_p × L_u} ; 0_{L_u × L_p}  B ] u,  B = diag{b}.  (34)

The sparseness of an impulse response can be quantified using the sparseness measure [18, 19]

ξ(h) = (L / (L − √L)) (1 − ‖h‖₁ / (√L ‖h‖₂)).  (35)

It has been shown in [18] that ξ(h) decreases with ψ. Figure 7 shows an illustrative example set of impulse responses generated using (34) with σ_p² = 1.055 × 10⁻⁴, σ_b² = 0.9146,
[Figure 7: Impulse responses (amplitude versus samples, 0–512) controlled using (a) ψ = 10, (b) ψ = 50, (c) ψ = 150, and (d) ψ = 300, giving sparseness measures (a) ξ = 0.8767, (b) ξ = 0.6735, (c) ξ = 0.4216, and (d) ξ = 0.3063.]
using white Gaussian noise input sequences for impulse responses generated using 0.3 ≤ ξ ≤ 0.9 as controlled by ψ. As before, w(n) is added to achieve an SNR of 30 dB. Figure 8 shows the variation in the time to reach η(m) = −20 dB normalized misalignment with sparseness measure ξ controlled using the exponential window parameter ψ. Due to the proportionate control of step-sizes, a significant increase in the rate of convergence for IPNLMS and IPMDF can be seen as the sparseness of the impulse responses increases for high ξ. For all cases of sparseness, the IPMDF algorithm exhibits the highest rate of convergence compared to IPNLMS and MDF, hence demonstrating the robustness of IPMDF to the sparse nature of the unknown system.

[Figure 8: Time to reach −20 dB (T₂₀) normalized misalignment for (a) IPNLMS, (b) MDF, and (c) IPMDF algorithms, with sparseness measure ξ controlled using exponential decay factor ψ.]
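The sparseness construction and measure of (33)-(35) can be checked numerically. The sketch below is ours (function names and the bulk-delay length L_p = 64 are assumptions); it builds a low-power WGN bulk-delay part followed by a WGN tail shaped by the exponentially decaying window, and confirms that ξ decreases as ψ grows.

```python
import numpy as np

rng = np.random.default_rng(1)

def synthetic_ir(L=512, L_p=64, psi=10, var_p=1.055e-4, var_b=0.9146):
    """Synthetic impulse response following (33)-(34): a low-power WGN
    bulk-delay part p, then a WGN sequence b shaped by an exponentially
    decaying window. L_p = 64 is an assumed bulk-delay length."""
    L_u = L - L_p
    p = rng.normal(0.0, np.sqrt(var_p), L_p)
    window = np.exp(-np.arange(L_u) / psi)          # [1, e^{-1/psi}, ...]
    b = rng.normal(0.0, np.sqrt(var_b), L_u)
    return np.concatenate([p, b * window])

def sparseness(h):
    """Sparseness measure xi(h) of (35): near 0 for dispersive responses,
    approaching 1 for very sparse ones."""
    L = len(h)
    l1, l2 = np.linalg.norm(h, 1), np.linalg.norm(h, 2)
    return (L / (L - np.sqrt(L))) * (1.0 - l1 / (np.sqrt(L) * l2))

xi_sparse = sparseness(synthetic_ir(psi=10))        # fast decay: sparse
xi_dispersive = sparseness(synthetic_ir(psi=300))   # slow decay: dispersive
```

The values obtained this way fall in the same range as those quoted for Figure 7 (roughly 0.88 for ψ = 10 down to 0.31 for ψ = 300), though the exact numbers depend on the random draw.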
Research Article
Analysis of Transient and Steady-State Behavior
of a Multichannel Filtered-x Partial-Error Affine
Projection Algorithm
The paper provides an analysis of the transient and steady-state behavior of a filtered-x partial-error affine projection algorithm suitable for multichannel active noise control. The analysis relies on energy conservation arguments; it neither applies the independence theory nor imposes any restriction on the signal distributions. The paper shows that, in the presence of stationary input signals, the partial-error filtered-x affine projection algorithm converges to a cyclostationary process, that is, the mean value of the coefficient vector, the mean-square error, and the mean-square deviation tend to periodic functions of the sample time.
Copyright © 2007 A. Carini and G. L. Sicuranza. This is an open access article distributed under the Creative Commons
Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
the partial update strategies, a simple yet effective approach is provided by the partial-error (PE) technique, which was first applied in [7] for reducing the complexity of linear multichannel controllers equipped with the filtered-x LMS algorithm. The PE technique consists in sequentially using, at each iteration, only one of the K error sensor signals in place of their combination, and it can reduce the adaptation complexity by a factor of K. In [9], the PE technique was applied, together with other methods, for reducing the computational load of multichannel active noise controllers equipped with filtered-x affine projection (AP) algorithms.

When dealing with novel adaptive filters, it is important to assess their performance not only through extensive simulations but also with theoretical analysis results. In the literature, very few results deal with the analysis of filtered-x, affine projection, or partial-update algorithms. The convergence analysis results for these algorithms are often based on the independence theory (IT), and they constrain the probability distribution of the input signal to be Gaussian or spherically invariant [10]. The IT hypothesis assumes statistical independence of time-lagged input data vectors. As it is too strong for filtered-x LMS [11] and AP algorithms [12], different approaches have been studied in the literature in order to overcome this hypothesis. In [11], an analysis of the mean weight behavior of the filtered-x LMS algorithm, based only on neglecting the correlation between coefficient and signal vectors, is presented; moreover, the analysis of [11] does not impose any restriction on the signal distributions. Another analysis approach that avoids IT is applied in [12] for the mean-square performance analysis of AP algorithms; it relies on energy conservation arguments, and no restriction is imposed on the signal distributions. In [4], we applied and adapted the approach of [12] for analyzing the convergence behavior of multichannel FX-AP algorithms. In this paper, we extend the analysis approach of [4] and study the transient and steady-state behavior of a filtered-x partial-error affine projection (FX-PE-AP) algorithm. The paper shows that the FX-PE-AP algorithm, in the presence of stationary input signals, converges to a cyclostationary process, that is, the mean value of the coefficient vector, the mean-square error, and the mean-square deviation tend to periodic functions of the sample time. We also show that the FX-PE-AP algorithm can reduce the adaptation complexity by a factor of K with respect to the approximate FX-AP algorithm introduced in [4], but it also reduces the convergence speed by the same factor.

The paper is organized as follows. Section 2 reviews the multichannel feedforward active noise controller structure and introduces the FX-PE-AP algorithm. Section 3 discusses the asymptotic solution of the FX-PE-AP algorithm and compares it with that of FX-AP algorithms and with the minimum-mean-square solution of the ANC problem. Section 4 presents the analysis of the transient and steady-state behavior of the FX-PE-AP algorithm. Section 5 provides some experimental results. Conclusions follow in Section 6.

Throughout this paper, small boldface letters are used to denote vectors and bold capital letters are used to denote matrices, for example, x and X; all vectors are column vectors; the boldface symbol I indicates an identity matrix of appropriate dimensions; the symbol ∗ denotes linear convolution; diag{···} is a block-diagonal matrix of its entries; E[·] denotes mathematical expectation; ‖·‖²_Σ is the weighted Euclidean norm, for example, ‖w‖²_Σ = w^T Σ w with Σ a symmetric positive definite matrix; vec{·} indicates the vector operator and vec⁻¹{·} the inverse vector operator that returns a square matrix from an input vector of appropriate dimensions; ⊗ denotes the Kronecker product; a%b is the remainder of the division of a by b; and |a| is the absolute value of a.

2. THE PARTIAL-ERROR FILTERED-x AP ALGORITHM

The schematic description of a multichannel feedforward active noise controller (ANC) is provided in Figure 1. I reference sensors collect the corresponding input signals from the noise sources, and K error sensors collect the error signals at the interference locations. The signals coming from these sensors are used by the controller in order to adaptively estimate J output signals, which feed J actuators. The corresponding block diagram is reported in Figure 2. The propagation of the original noise up to the region to be silenced is described by the transfer functions p_{k,i}(z) representing the primary paths. The secondary noise signals propagate through secondary paths, which are characterized by the transfer functions s_{k,j}(z). We assume there is no feedback between loudspeakers and reference sensors. The primary source signals filtered by the impulse responses of the secondary-path model, with transfer functions ŝ_{k,j}(z), are used for the adaptive filter update, and for this reason the adaptation algorithm is called filtered-x. Figure 2 also illustrates the delay-compensation scheme [13] that is used throughout the paper. To compensate for the propagation delay introduced by the secondary paths, the output of the primary paths d(n) is estimated with d̂(n) by subtracting the output of the secondary-path model from the error sensor signals, and the error signal e(n) between d̂(n) and the output of the adaptive filter is used for the adaptation of the filter w(n). A copy of this filter is used for the estimation of the actuators' output.

Preliminary and independent evaluations of the secondary-path transfer functions are needed. For generality, the theoretical results we present assume imperfect modelling of the secondary paths (we allow ŝ_{k,j}(z) ≠ s_{k,j}(z) for any choice of j and k), but all the results hold also for perfect modelling (i.e., for ŝ_{k,j}(z) = s_{k,j}(z)). Indeed, the experimental results of Section 5 refer to ANC systems with perfect modelling of the secondary paths. When necessary, we will highlight in the paper the different behavior of the system under perfect and imperfect estimation of the secondary paths.

Very mild assumptions are posed in this paper on the adaptive controller. Indeed, we assume that any input i of the controller is connected to any output j through a filter whose output depends linearly on the filter coefficients, that is, we assume that the jth actuator output is given by the following
A. Carini and G. L. Sicuranza 3
.
. Error
.
Primary microphones
Noise paths
source .
.. e1 (n)
e2 (n)
Reference .
..
microphones Secondary eK (n)
paths
x1 (n) x2 (n) xi (n)
Adaptive
controller
I primary
signals x(n) Primary paths d(n)
pk,i (z)
+
Secondary paths +
J secondary +
sk, j (z)
signals y(n) K error
sensor
signals e(n)
vector equation:

y_j(n) = Σ_{i=1}^{I} x_i^T(n) w_{j,i}(n),  (1)

where w_{j,i}(n) is the coefficient vector of the filter that connects the input i to the output j of the adaptive controller, and x_i(n) is the ith primary source input signal vector. In particular, x_i(n) is here expressed as a vector function of the signal samples x_i(n) whose general form is given by

x_i(n) = [f₁(x_i(n)), f₂(x_i(n)), …, f_N(x_i(n))]^T,  (2)

where f_i[·], for any i = 1, …, N, is a time-invariant functional of its argument. Equations (1) and (2) include linear filters, truncated Volterra filters of any order p [14], radial basis function networks [15], filters based on functional expansions [16], and other nonlinear filter structures. In Section 5 we provide experimental results for linear filters, where the vector x_i(n) reduces to

x_i(n) = [x_i(n), x_i(n − 1), …, x_i(n − N + 1)]^T,  (3)

and for filters based on a piecewise linear functional expansion, with the vector x_i(n) given by

x_i(n) = [x_i(n), x_i(n − 1), …, x_i(n − N + 1), x_i(n) − a, …, x_i(n − N + 1) − a]^T,  (4)

where a is an appropriate constant.
To introduce the FX-PE-AP algorithm analyzed in subsequent sections, we make use of quantities defined in Table 1. Our objective is to estimate the coefficient vector w_o = [w_1^T, w_2^T, ..., w_J^T]^T that minimizes the cost function

  J_o = E[ Σ_{k=1}^{K} ( d_k(n) + Σ_{j=1}^{J} s_{k,j}(n) ∗ [w_j^T x(n)] )² ].   (5)

Several adaptive filters have been proposed in the literature to estimate the filter w_o. In [4], we have analyzed the convergence properties of the approximate FX-AP algorithm with adaptation rule given by

  w(n + 1) = w(n) − μ Σ_{k=1}^{K} Û_k(n) R̂_k^{−1}(n) ê_k(n),   (6)

where

  R̂_k(n) = Û_k^T(n) Û_k(n) + δI.   (7)

In this paper, we consider the FX-PE-AP algorithm characterized by the adaptation rule of

  w(n + 1) = w(n) − μ Û_{n%K}(n) R̂_{n%K}^{−1}(n) ê_{n%K}(n),   (8)

where n%K is the remainder of the division of n by K. The adaptation rule in (8) has been obtained by applying the PE methodology to the approximate FX-AP algorithm of (6). At each iteration, only one of the K error sensor signals is used for the controller adaptation. The error sensor signal employed for the adaptation is chosen with a round-robin strategy. Thus, compared with (6), the FX-PE-AP adaptation in (8) reduces the computational load by a factor K.

The exact value of the estimated residual error ê_k(n) is given by

  ê_k(n) = d_k(n) + Σ_{j=1}^{J} [s_{k,j}(n) − ŝ_{k,j}(n)] ∗ [w_j^T(n) x(n)] + Σ_{j=1}^{J} w_j^T(n) û_{k,j}(n).   (9)

In order to analyze the FX-PE-AP algorithm, we introduce in (9) the approximation

  Σ_{j=1}^{J} [s_{k,j}(n) − ŝ_{k,j}(n)] ∗ [w_j^T(n) x(n)] ≅ Σ_{j=1}^{J} w_j^T(n) ([s_{k,j}(n) − ŝ_{k,j}(n)] ∗ x(n)),   (10)

which allows us to simplify (9) and to obtain

  ê_k(n) = d_k(n) + Σ_{j=1}^{J} w_j^T(n) u_{k,j}(n).   (11)

Note that the expression in (11) is correct when we perfectly estimate the secondary paths or when w(n) is constant, that is, when we work with small step-size values. On the contrary, the expression in (11) is only an approximation for large step sizes and in the presence of secondary path estimation errors, but it allows an insightful analysis of the effects of these estimation errors.

By introducing the result of (11) in (8), we obtain the following equation:

  w(n + 1) = w(n) − μ Û_{n%K}(n) R̂_{n%K}^{−1}(n) [d_{n%K}(n) + U_{n%K}^T(n) w(n)],   (12)

which can also be written in the compact form of

  w(n + 1) = V_{n%K}(n) w(n) − v_{n%K}(n),   (13)

with

  V_k(n) = I − μ Û_k(n) R̂_k^{−1}(n) U_k^T(n),
  v_k(n) = μ Û_k(n) R̂_k^{−1}(n) d_k(n).   (14)

By iterating (13) K times, from n = mK + i until n = mK + i + K − 1, with m ∈ N and 0 ≤ i < K, we obtain the expression of (15), which will be used for the algorithm analysis,

  w(mK + i + K) = M_i(mK + i) w(mK + i) − m_i(mK + i),   (15)

where

  M_i(n) = V_{(i+K−1)%K}(n + K − 1) V_{(i+K−2)%K}(n + K − 2) ··· V_{i%K}(n),   (16)

  m_i(n) = V_{(i+K−1)%K}(n + K − 1) ··· V_{(i+1)%K}(n + 1) v_{i%K}(n)
         + V_{(i+K−1)%K}(n + K − 1) ··· V_{(i+2)%K}(n + 2) v_{(i+1)%K}(n + 1)
         + ··· + v_{(i+K−1)%K}(n + K − 1).   (17)

3. THE ASYMPTOTIC SOLUTION

For i ranging from 0 to K − 1, (15) provides a set of K independent equations that can be separately studied. The system matrix M_i(n) and the excitation vector m_i(n) have different statistical properties for different indexes i. For every i, the recursion in (15) converges to a different asymptotic coefficient vector and it provides different values of the steady-state mean-square error and of the mean-square deviation. If the input signals are stationary and if the recursion in (15) is convergent for every i, it can be shown that the algorithm converges to a cyclostationary process of periodicity K.

For every index i, the coefficient vector w(mK + i) tends for m → +∞ to an asymptotic vector w_{∞,i}, which depends on the statistical properties of the input signals. In fact, by taking the expectation of (15) and considering the fixed point of this equation, it can be easily deduced that

  w_{∞,i} = [E[M_i(n)] − I]^{−1} E[m_i(n)].   (18)
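The round-robin selection in the FX-PE-AP update (8) can be sketched as follows. The data layout (a list of per-sensor filtered-input matrices and residual error vectors) and all names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def fx_pe_ap_update(w, U_hat, e, n, K, mu=0.1, delta=1e-3):
    """One FX-PE-AP iteration following (8): at time n only the error
    sensor k = n % K (round-robin) contributes to the adaptation.
    U_hat[k] is the filtered-input matrix U_hat_k(n) (M x L), e[k] the
    residual error vector e_k(n) (length L); shapes are illustrative."""
    k = n % K
    U = U_hat[k]
    # eq. (7): regularized correlation matrix of the filtered input
    R = U.T @ U + delta * np.eye(U.shape[1])
    return w - mu * U @ np.linalg.solve(R, e[k])
```

Because only one of the K sensors is touched per iteration, the per-sample cost of the update is roughly 1/K of that of the full FX-AP rule (6), which is the point of the PE methodology.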
Since the matrices E[M_i(n)] and E[m_i(n)] vary with i, so do the asymptotic coefficient vectors w_{∞,i}. Thus, the vector w(n) for n → +∞ tends to the periodic sequence formed by the repetition of the K vectors w_{∞,i} with i = 0, 1, ..., K − 1. The asymptotic sequence varies with the step-size μ and with the estimation errors ŝ_{k,j}(z) − s_{k,j}(z) of the secondary paths. As we already observed for FX-AP algorithms [4], the asymptotic solution in (18) differs from the minimum-mean-square (MMS) solution of the active noise control problem, which is given by (19) [17]:

  w_o = −R_uu^{−1} R_ud,   (19)
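Numerically, the MMS solution of (19) is a single linear solve; a minimal sketch (function name is ours, and solving the system is preferred to forming the inverse explicitly):

```python
import numpy as np

def mms_solution(R_uu, R_ud):
    """Minimum-mean-square solution of eq. (19): w_o = -R_uu^{-1} R_ud."""
    return -np.linalg.solve(R_uu, R_ud)
```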
where R_uu and R_ud are defined, respectively, in (20). Moreover, w_{∞,i} for every i also differs from the asymptotic solution w_∞ of the adaptation rule in (6), which is given by [4]

  w_∞ = −E[ Σ_{k=1}^{K} Û_k(n) R̂_k^{−1}(n) U_k^T(n) ]^{−1} E[ Σ_{k=1}^{K} Û_k(n) R̂_k^{−1}(n) d_k(n) ].   (21)

Nevertheless, when μ tends to 0, the vectors w_{∞,i} tend to the same asymptotic solution w_∞ of (6). In fact, it can be verified that the expression in (18), when μ tends to 0, converges to the following expression:

  w_{∞,i} = −E[ Σ_{k=1}^{K} Û_{(i+K−k)%K}(n + K − k) R̂_{(i+K−k)%K}^{−1}(n + K − k) U_{(i+K−k)%K}^T(n + K − k) ]^{−1}
            × E[ Σ_{k=1}^{K} Û_{(i+K−k)%K}(n + K − k) R̂_{(i+K−k)%K}^{−1}(n + K − k) d_{(i+K−k)%K}(n + K − k) ],   (22)

which in the hypothesis of stationary input signals is equal to the expression in (21).

4. TRANSIENT ANALYSIS AND STEADY-STATE ANALYSIS

The transient analysis aims to study the time evolution of the expectation of the weighted Euclidean norm of the coefficient vector, E[‖w(n)‖²_Σ] = E[w^T(n) Σ w(n)], for some choices of the symmetric positive definite matrix Σ [12]. Moreover, the limit for n → +∞ of the same quantity, again for some appropriate choices of the matrix Σ, is needed for the steady-state analysis. For simplicity, in what follows we assume to work with stationary input signals and, according to (15), we separately analyze the evolution of E[‖w(mK + i)‖²_Σ] for the different indexes i.

4.1. Energy conservation relation

By computing the Σ-weighted norm of both sides of (15), we obtain

  ‖w(mK + i + K)‖²_Σ = ‖w(mK + i)‖²_{Σ̄_i(mK+i)} − 2 w^T(mK + i) q_{Σ,i}(mK + i) + m_i^T(mK + i) Σ m_i(mK + i),   (23)

where we have introduced the quantities Σ̄_i(n) and q_{Σ,i}(n), which are defined, respectively, in

  Σ̄_i(n) = M_i^T(n) Σ M_i(n),
  q_{Σ,i}(n) = M_i^T(n) Σ m_i(n).   (24)

Equation (23) provides an energy conservation relation, which is the basis of our analysis. The relation of (23) has the same role of the energy conservation relation employed in [12]. No approximation has been used for deriving the expression of (23).

4.2. Transient analysis

We are now interested in studying the time evolution of E[‖w(mK + i)‖²_Σ], where Σ is a symmetric and positive definite square matrix. For this purpose, we follow the approach of [12, 18, 19].

In the analysis of filtered-x and AP algorithms, it is common to assume w(n) to be uncorrelated with some functions of the filtered input signal [11, 12]. This assumption provides good results and is weaker than the hypothesis of the independence theory, which requires the statistical independence of time-lagged input data vectors. Therefore, in what follows, we introduce the following approximation.

(A1) For every i with 0 ≤ i < K and for m ∈ N, we assume w(mK + i) to be uncorrelated with M_i(mK + i) and with q_{Σ,i}(mK + i).

In the appendix, we prove the following theorem that describes the transient behavior of the FX-PE-AP algorithm.

Theorem 1. Under the assumption (A1), the transient behavior of the FX-PE-AP algorithm with updating rule given by (15) is described by the state recursions

  E[w(mK + i + K)] = M̄_i E[w(mK + i)] − m̄_i,
  W_i(mK + i + K) = G_i W_i(mK + i) + y_i(mK + i),   (25)

with M̄_i = E[M_i(n)] and m̄_i = E[m_i(n)].
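The fixed point of the mean recursion in (25) is exactly the asymptotic vector of (18), and this is easy to verify numerically. The sketch below uses illustrative stable moment values (not taken from the paper's experiments):

```python
import numpy as np

def asymptotic_vector(E_M, E_m):
    """Fixed point of the mean recursion of (25), i.e. eq. (18):
    w_inf = (E[M_i] - I)^{-1} E[m_i]."""
    return np.linalg.solve(E_M - np.eye(E_M.shape[0]), E_m)

def iterate_mean(E_M, E_m, w0, steps):
    """Iterate the mean recursion E[w(mK+i+K)] = E[M_i] E[w(mK+i)] - E[m_i]."""
    w = w0
    for _ in range(steps):
        w = E_M @ w - E_m
    return w
```

When the spectral radius of E[M_i] is below one, iterating the mean recursion from any start converges to the same fixed point, which is the stability picture behind the cyclostationary limit described above.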
In Theorem 1, F_i denotes the M²×M² matrix E[M_i^T(n) ⊗ M_i^T(n)], Q_i the M×M² matrix E[m_i^T(n) ⊗ M_i^T(n)], and g_i the M²×1 vector vec{E[m_i(n) m_i^T(n)]}; the p_{j,i} are the coefficients of the characteristic polynomial of F_i, that is, p_i(x) = x^{M²} + p_{M²−1,i} x^{M²−1} + ··· + p_{1,i} x + p_{0,i} = det(xI − F_i); and σ = vec{Σ}.

By exploiting the hypothesis in (A2), the MSE can be expressed as

  MSE_i = S_d + 2 R_ud^T w_{∞,i} + lim_{m→+∞} E[w^T(mK + i) R_uu w(mK + i)].   (29)
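Given the moment quantities F_i, Q_i, and g_i above, the steady-state MSE formula (33) obtained below from (29) is a direct computation. A sketch with illustrative shapes and names (not the paper's code; column-stacking vec is assumed):

```python
import numpy as np

def steady_state_mse(Sd, R_ud, w_inf, g, Q, F, R_uu):
    """Evaluate eq. (33):
    MSE_i = Sd + 2 R_ud^T w_inf + (g^T - 2 w_inf^T Q) (I - F)^{-1} vec{R_uu}.
    Q is M x M^2, F is M^2 x M^2, g has length M^2."""
    sigma = np.linalg.solve(np.eye(F.shape[0]) - F, R_uu.reshape(-1, order="F"))
    return Sd + 2.0 * R_ud @ w_inf + (g - 2.0 * Q.T @ w_inf) @ sigma
```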
Table 2: First eight coefficients of the MMS solution (wo ) and of the asymptotic solutions of FX-PE-AP (w∞,0 , w∞,1 ) and of FX-AP algorithm
(w∞ ) with the linear controller.
To estimate the MSE, we have to choose σ such that (I − F_i)σ = vec{R_uu}, that is, σ = (I − F_i)^{−1} vec{R_uu}. Therefore, the MSE can be evaluated as in

  MSE_i = S_d + 2 R_ud^T w_{∞,i} + [g_i^T − 2 w_{∞,i}^T Q_i] (I − F_i)^{−1} vec{R_uu}.   (33)

To estimate the MSD, we have to choose σ such that (I − F_i)σ = vec{I}, that is, σ = (I − F_i)^{−1} vec{I}. Thus, the MSD can be evaluated as in

  MSD_i = [g_i^T − 2 w_{∞,i}^T Q_i] (I − F_i)^{−1} vec{I} − ‖w_{∞,i}‖².   (34)

5. EXPERIMENTAL RESULTS

In this section, we provide a few experimental results that compare theoretically predicted values with values obtained from simulations.

We first considered a multichannel active noise controller with I = 1, J = 2, K = 2. The transfer functions of the primary paths are given by

  p_{1,1}(z) = 1.0z^{−2} − 0.3z^{−3} + 0.2z^{−4},
  p_{2,1}(z) = 1.0z^{−2} − 0.2z^{−3} + 0.1z^{−4},   (35)

and the transfer functions of the secondary paths are

  s_{1,1}(z) = 2.0z^{−1} − 0.5z^{−2} + 0.1z^{−3},
  s_{1,2}(z) = 2.0z^{−1} − 0.3z^{−2} − 0.1z^{−3},
  s_{2,1}(z) = 1.0z^{−1} − 0.7z^{−2} − 0.2z^{−3},
  s_{2,2}(z) = 1.0z^{−1} − 0.2z^{−2} + 0.2z^{−3}.   (36)

For simplicity, we provide results only for a perfect estimate of the secondary paths, that is, we consider ŝ_{i,j}(z) = s_{i,j}(z). The input signal is the normalized logistic noise, which has been generated by scaling the signal ξ(n) obtained from the logistic recursion ξ(n + 1) = λξ(n)(1 − ξ(n)), with λ = 4 and ξ(0) = 0.9, and by adding a white Gaussian noise to get a 30 dB signal-to-noise ratio. It has been proven for single-channel active noise controllers that, in the presence of a nonminimum-phase secondary path, the controller acts as a predictor of the reference signal and that a nonlinear controller can better estimate a non-Gaussian noise process [15, 21]. In the case of our multichannel active noise controller, the exact solution of the multichannel ANC problem requires the inversion of the 2×2 matrix S formed with the transfer functions s_{k,j}. The inverse matrix S^{−1} is formed by IIR transfer functions whose poles are given by the roots of the determinant of S. It is easy to verify that in our example there is a root outside the unit circle. Thus, also in our case the controller acts as a predictor of the input signal, and a nonlinear controller can better estimate the logistic noise. Therefore, in what follows, we provide results for (1) the two-channel linear controller with memory length N = 8 and (2) the two-channel nonlinear controller with memory length N = 4, whose input data vector is given in (4), with the constant a set to 1. Note that although the two controllers have different memory lengths, they have the same total number of coefficients, that is, M = 16. In all the experiments, a zero-mean white Gaussian noise, uncorrelated between the microphones, has been added to the error microphone signals d_k(n) to get a 40 dB signal-to-noise ratio, and the parameter δ was set to 0.001.

Tables 2 and 3 provide with three-digit precision the first eight coefficients of the MMS solution, w_o, and of the asymptotic solutions of the FX-PE-AP algorithm at even samples, w_{∞,0}, and odd samples, w_{∞,1}, and of the approximate FX-AP algorithm of (6), w_∞, for μ = 1.0 and for the AP orders L = 1, 2, and 3. Table 2 refers to the linear controller and Table 3 to the nonlinear controller, respectively. From Tables 2 and 3, it is evident that the asymptotic vector varies with the AP order and that the asymptotic solutions w_{∞,0}, w_{∞,1}, and w_∞ are different. However, we must point out that their difference reduces with the step-size, and for smaller step-sizes it can hardly be appreciated.

Figure 3 diagrams the steady-state MSE, estimated with (33) or obtained from simulations with time averages over ten million samples, versus the step-size μ and for the AP orders L = 1, 2, and 3. Similarly, Figure 4 diagrams the steady-state MSD, estimated with (34) or obtained from simulations with time averages over ten million samples. From Figures 3 and 4, we see that the expressions in (33) and in (34) provide accurate estimates of the steady-state MSE and of the steady-state
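The logistic reference signal used in the experiments can be generated as follows. The normalization step (zero mean, unit variance) is an assumption, since the exact scaling rule is not spelled out in the text; the SNR handling follows the stated 30 dB figure.

```python
import numpy as np

def logistic_noise(length, snr_db=30.0, lam=4.0, xi0=0.9, seed=0):
    """Normalized logistic noise: iterate xi(n+1) = lam*xi(n)*(1 - xi(n))
    with xi(0) = 0.9, scale the sequence, and add white Gaussian noise
    at the requested SNR (30 dB in the paper)."""
    xi = np.empty(length)
    xi[0] = xi0
    for n in range(length - 1):
        xi[n + 1] = lam * xi[n] * (1.0 - xi[n])
    x = (xi - xi.mean()) / xi.std()  # scaling step; exact rule assumed
    noise_power = x.var() / 10.0 ** (snr_db / 10.0)
    rng = np.random.default_rng(seed)
    return x + rng.normal(0.0, np.sqrt(noise_power), length)
```

With λ = 4 the logistic map is chaotic and its samples are strongly non-Gaussian, which is why the nonlinear controller has an advantage on this reference.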
Table 3: First eight coefficients of the MMS solution (wo ) and of the asymptotic solutions of FX-PE-AP (w∞,0 , w∞,1 ) and of FX-AP algorithm
(w∞ ) with the nonlinear controller.
MSD, respectively, when L = 2 and L = 3. The estimation errors can be both positive and negative depending on the AP order, the step-size, and the odd or even sample times. On the contrary, for the AP order L = 1, the estimations are inaccurate. The large estimation errors for L = 1 are due to the bad conditioning of the matrices M_i − I, which leads to a poor estimate of the asymptotic solution. For larger AP orders, the data reuse property of the AP algorithm leads to better conditioned matrices M_i. Indeed, Table 4 compares the condition number, that is, the ratio between the magnitudes of the largest and the smallest eigenvalues, of the matrix M_i − I of the nonlinear controller at even time indexes for the AP orders L = 1, 2, and 3 and for different values of the step-size.

Figures 5 and 6 diagram the ensemble averages, estimated over 100 runs of the FX-PE-AP and the FX-AP algorithms with step-size equal to 0.032, of the mean value of the residual power of the error computed on 100 successive samples for the nonlinear and the linear controllers, respectively. In the figures, the asymptotic values (dashed lines) of the residual power of the errors are also shown. From Figures 5 and 6, it is evident that the nonlinear controller outperforms the linear one in terms of residual error. Nevertheless, it must be observed that the nonlinear controller reaches the steady-state condition in a slightly longer time than the linear controller. This behavior could also be predicted from the maximum eigenvalues of the matrices M_i and F_i, which are reported in Table 5. Since the step-size μ assumes a small value (μ = 0.032), in the table we have the same maximum eigenvalue for M_0 and M_1 and for F_0 and F_1. Moreover, as already observed for the filtered-x PE LMS algorithm [2], from Figures 5 and 6 it is apparent that for this step-size the FX-PE-AP algorithm has a convergence speed that is half (i.e., 1/K) that of the approximate FX-AP algorithm. In fact, the diagrams on the left and on the right of the figures can be overlapped, but the time scale of the FX-PE-AP algorithm is double that of the FX-AP algorithm.

The same observation also applies when a larger number of microphones is considered. For example, Figures 7 and 8 plot the ensemble averages, estimated over 100 runs of the FX-PE-AP and the FX-AP algorithms with step-size equal to 0.032, of the mean value of the residual power of the error computed on 100 successive samples for the nonlinear controller with I = 1, J = 2, K = 3, and with I = 1, J = 2, K = 4, respectively. In the case I = 1, J = 2, K = 3, the transfer functions of the primary paths, p_{1,1}(z) and p_{2,1}(z), and of the secondary paths, s_{1,1}(z), s_{1,2}(z), s_{2,1}(z), and s_{2,2}(z), are given by (35)-(36), while the other primary and secondary paths are given by

  p_{3,1}(z) = 1.0z^{−2} − 0.3z^{−3} + 0.1z^{−4},
  s_{3,1}(z) = 1.6z^{−1} − 0.6z^{−2} + 0.1z^{−3},   (37)
  s_{3,2}(z) = 1.6z^{−1} − 0.2z^{−2} − 0.1z^{−3}.

In the case I = 1, J = 2, K = 4, the transfer functions of the primary paths, p_{1,1}(z), p_{2,1}(z), and p_{3,1}(z), and of the secondary paths, s_{1,1}(z), s_{1,2}(z), s_{2,1}(z), s_{2,2}(z), s_{3,1}(z), and s_{3,2}(z), are given by (35)–(37), and the other primary and secondary paths are given by

  p_{4,1}(z) = 1.0z^{−2} − 0.2z^{−3} + 0.2z^{−4},
  s_{4,1}(z) = 1.3z^{−1} − 0.5z^{−2} − 0.2z^{−3},   (38)
  s_{4,2}(z) = 1.3z^{−1} − 0.4z^{−2} + 0.2z^{−3}.

All the other experimental conditions are the same as in the case I = 1, J = 2, K = 2. Figures 7 and 8 confirm again that for μ = 0.032 the FX-PE-AP algorithm has a convergence speed that is reduced by a factor K with respect to the approximate FX-AP algorithm. Nevertheless, we must point out that for larger values of the step-size, the reduction of the convergence speed of the FX-PE-AP algorithm can be even larger than a factor K.

We have also performed the same simulations by reducing the SNR at the error microphones to 30, 20, and 10 dB, and we have obtained similar convergence behaviors. The main difference, apart from the increase in the residual error, is that the lower the SNR at the error microphones, the smaller the improvement in convergence speed obtained by increasing the affine projection order.
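The conditioning measure reported in Table 4 (ratio between the magnitudes of the largest and smallest eigenvalues) can be computed as below; the function name is ours.

```python
import numpy as np

def condition_number(A):
    """Eigenvalue-magnitude ratio used for the matrices M_i - I in Table 4."""
    eig = np.abs(np.linalg.eigvals(A))
    return eig.max() / eig.min()
```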
Figure 3: Theoretical (- -) and simulation values (–) of steady-state MSE versus step-size of the FX-PE-AP algorithm (a) at even samples
with a nonlinear controller, (b) at odd samples with a nonlinear controller, (c) at even samples with a linear controller, (d) at odd samples
with a linear controller, for L = 1, 2, and 3.
Figure 4: Theoretical (- -) and simulation values (–) of steady-state MSD versus step-size of the FX-PE-AP algorithm (a) at even samples
with a nonlinear controller, (b) at odd samples with a nonlinear controller, (c) at even samples with a linear controller, (d) at odd samples
with a linear controller, for L = 1, 2, and 3.
[Plots: residual power of the error versus time, curves for L = 1, 2, 3; not reproduced.]
Figure 5: Evolution of residual power of the error of (a) the FX-PE-AP algorithm and (b) FX-AP algorithm with a nonlinear controller and
I = 1, J = 2, K = 2. The dashed lines diagram the asymptotic values of the residual power.
[Plots: residual power of the error versus time; not reproduced.]
Figure 6: Evolution of residual power of the error of (a) the FX-PE-AP algorithm and (b) FX-AP algorithm with a linear controller and
I = 1, J = 2, K = 2. The dashed lines diagram the asymptotic values of the residual power.
[Plots: residual power of the error versus time; not reproduced.]
Figure 7: Evolution of residual power of the error of (a) the FX-PE-AP algorithm and (b) FX-AP algorithm with a nonlinear controller and
I = 1, J = 2, K = 3. The dashed lines diagram the asymptotic values of the residual power.
[Plots: residual power of the error versus time, curves for L = 1, 2, 3; not reproduced.]
Figure 8: Evolution of residual power of the error of (a) the FX-PE-AP algorithm and (b) FX-AP algorithm with a nonlinear controller and
I = 1, J = 2, K = 4. The dashed lines diagram the asymptotic values of the residual power.
Table 4: Condition number of the matrix M_0 − I for different step-sizes and for the AP orders L = 1, 2, and 3 with the nonlinear controller.

  L       μ = 1.0   μ = 0.25   μ = 0.0625
  L = 1   33379     36299      36965
  L = 2   6428      9711       10575
  L = 3   2004      3290       3623

Table 5: Maximum eigenvalues of the matrices M_i and F_i for the AP orders L = 1, 2, and 3 with the linear and the nonlinear controllers.

  Controllers                 L = 1      L = 2      L = 3
  Nonlinear   λmax(M_i)      0.999999   0.999996   0.999987
              λmax(F_i)      0.999998   0.999992   0.999974
  Linear      λmax(M_i)      0.999991   0.999972   0.999957
              λmax(F_i)      0.999981   0.999944   0.999914

6. CONCLUSION

In this paper, we have provided an analysis of the transient and the steady-state behavior of the FX-PE-AP algorithm. We have shown that, in the presence of stationary input signals, the algorithm converges to a cyclostationary process, that is, the asymptotic value of the coefficient vector, the mean-square error, and the mean-square deviation tend to periodic functions of the sample time. We have shown that the asymptotic coefficient vector of the FX-PE-AP algorithm differs from the minimum-mean-square solution of the ANC problem and from the asymptotic solution of the AP algorithm from which the FX-PE-AP algorithm was derived. We have proved that the transient behavior of the algorithm can be studied by the cascade of two linear systems. By studying [...] we have compared the FX-PE-AP with the approximate FX-AP algorithm introduced in [4]. Compared with the approximate FX-AP algorithm, the FX-PE-AP algorithm is capable of reducing the adaptation complexity by a factor K. Nevertheless, the convergence speed of the algorithm is also reduced by the same factor.

APPENDIX

PROOF OF THEOREM 1

If we apply the expectation operator to both sides of (23), and if we take into account the hypothesis in (A1), we can derive the result of

  E[‖w(mK + i + K)‖²_Σ] = E[‖w(mK + i)‖²_{Σ̄_i}] − 2 E[w^T(mK + i)] E[q_{Σ,i}(mK + i)] + E[m_i^T(mK + i) Σ m_i(mK + i)],   (A.1)

where

  Σ̄_i = E[M_i^T(n) Σ M_i(n)].   (A.2)

Moreover, under the same hypothesis (A1), the evolution of the mean of the coefficient vector from (15) is described by

  E[w(mK + i + K)] = E[M_i(mK + i)] E[w(mK + i)] − E[m_i(mK + i)].   (A.3)

We manipulate (A.1), (A.2), and (A.3) by taking advantage of the properties of the vector operator vec{·} and of the Kronecker product, ⊗. We introduce the vectors σ = vec{Σ} and σ̄ = vec{Σ̄}. Since for any matrices A, B, and C it is [...]
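The vec/Kronecker identity underlying the appendix manipulations, vec{AXB} = (B^T ⊗ A) vec{X} for column-stacking vec, can be checked numerically; the snippet below is an illustration, not part of the paper.

```python
import numpy as np

def vec(X):
    """Column-stacking vec operator, as used throughout the appendix."""
    return X.reshape(-1, order="F")

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3))
X = rng.normal(size=(3, 4))
B = rng.normal(size=(4, 2))
# The identity behind (A.8): vec{A X B} = (B^T kron A) vec{X}.
lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)
assert np.allclose(lhs, rhs)
```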
where F_i is the M²×M² matrix defined by

  F_i = E[M_i^T(n) ⊗ M_i^T(n)].   (A.6)

The product E[w^T(n)] E[q_{Σ,i}(n)] can be evaluated as in

  E[w^T(n)] E[q_{Σ,i}(n)] = Tr(E[w^T(n)] E[q_{Σ,i}(n)]) = E[w^T(n)] vec{E[q_{Σ,i}(n)]},   (A.7)

with

  vec{E[q_{Σ,i}(n)]} = vec{E[M_i^T(n) Σ m_i(n)]} = E[m_i^T(n) ⊗ M_i^T(n)] σ = Q_i σ,   (A.8)

where the M×M² matrix Q_i is given by

  Q_i = E[m_i^T(n) ⊗ M_i^T(n)].   (A.9)

Moreover, the last term of (A.1) can be computed as in

  Tr(E[m_i^T(n) Σ m_i(n)]) = g_i^T σ,   (A.10)

where

  g_i = vec{E[m_i(n) m_i^T(n)]}.   (A.11)

Accordingly, introducing σ and σ̄ instead of Σ and Σ̄, and using the results of (A.5), (A.7), (A.8), and (A.10), the recursion in (A.1) can be rewritten as follows:

  E[‖w(mK + i + K)‖²_{vec⁻¹{σ}}] = E[‖w(mK + i)‖²_{vec⁻¹{F_i σ}}] − 2 E[w^T(mK + i)] Q_i σ + g_i^T σ.   (A.12)

The recursion in (A.12) shows that, in order to evaluate E[‖w(mK + i + K)‖²_{vec⁻¹{σ}}], we need E[‖w(mK + i)‖²_{vec⁻¹{F_i σ}}]. This quantity can be inferred from (A.12) by replacing σ with F_i σ, obtaining the following relation:

  E[‖w(mK + i + K)‖²_{vec⁻¹{F_i σ}}] = E[‖w(mK + i)‖²_{vec⁻¹{F_i² σ}}] − 2 E[w^T(mK + i)] Q_i F_i σ + g_i^T F_i σ.   (A.13)

This procedure is repeated until we obtain the following expression [12, 18, 19]:

  E[‖w(mK + i + K)‖²_{vec⁻¹{F_i^{M²−1} σ}}] = E[‖w(mK + i)‖²_{vec⁻¹{F_i^{M²} σ}}] − 2 E[w^T(mK + i)] Q_i F_i^{M²−1} σ + g_i^T F_i^{M²−1} σ.   (A.14)

By the Cayley-Hamilton theorem, p_i(F_i) = 0. The characteristic polynomial p_i(x) is an order-M² polynomial that can be written as in

  p_i(x) = x^{M²} + p_{M²−1,i} x^{M²−1} + ··· + p_{0,i},   (A.15)

where we indicate with {p_{j,i}} the coefficients of the polynomial. Since p_i(F_i) = 0, we deduce that [12, 18, 19]

  E[‖w(n)‖²_{vec⁻¹{F_i^{M²} σ}}] = −Σ_{j=0}^{M²−1} p_{j,i} E[‖w(n)‖²_{vec⁻¹{F_i^j σ}}].   (A.16)

The results of (A.3), (A.12)–(A.14), and (A.16) prove Theorem 1, which describes the transient behavior of the FX-PE-AP algorithm.

ACKNOWLEDGMENT

This work was supported by MIUR under Grant PRIN 2004092314.

REFERENCES

[1] P. A. Nelson and S. J. Elliott, Active Control of Sound, Academic Press, London, UK, 1995.
[2] S. C. Douglas, "Fast implementations of the filtered-X LMS and LMS algorithms for multichannel active noise control," IEEE Transactions on Speech and Audio Processing, vol. 7, no. 4, pp. 454–465, 1999.
[3] M. Bouchard, "Multichannel affine and fast affine projection algorithms for active noise control and acoustic equalization systems," IEEE Transactions on Speech and Audio Processing, vol. 11, no. 1, pp. 54–60, 2003.
[4] A. Carini and G. L. Sicuranza, "Transient and steady-state analysis of filtered-x affine projection algorithms," IEEE Transactions on Signal Processing, vol. 54, no. 2, pp. 665–678, 2006.
[5] Y. Neuvo, C.-Y. Dong, and S. K. Mitra, "Interpolated finite impulse response filters," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 32, no. 3, pp. 563–570, 1984.
[6] S. Werner and P. S. R. Diniz, "Set-membership affine projection algorithm," IEEE Signal Processing Letters, vol. 8, no. 8, pp. 231–235, 2001.
[7] S. C. Douglas, "Adaptive filters employing partial updates," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 44, no. 3, pp. 209–216, 1997.
[8] K. Doğançay and O. Tanrikulu, "Adaptive filtering algorithms with selective partial updates," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 48, no. 8, pp. 762–769, 2001.
[9] G. L. Sicuranza and A. Carini, "Nonlinear multichannel active noise control using partial updates," in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), vol. 3, pp. 109–112, Philadelphia, Pa, USA, March 2005.
Research Article
Step Size Bound of the Sequential Partial Update LMS
Algorithm with Periodic Input Signals
Pedro Ramos,1 Roberto Torrubia,2 Ana López,1 Ana Salinas,1 and Enrique Masgrau2
1 Communication Technologies Group, Aragón Institute for Engineering Research (I3A), EUPT, University of Zaragoza,
Ciudad Escolar s/n, 44003 Teruel, Spain
2 Communication Technologies Group, Aragón Institute for Engineering Research (I3A), CPS Ada Byron, University of Zaragoza,
This paper derives an upper bound for the step size of the sequential partial update (PU) LMS adaptive algorithm when the input
signal is a periodic reference consisting of several harmonics. The maximum step size is expressed in terms of the gain in step size of
the PU algorithm, defined as the ratio between the upper bounds that ensure convergence in the following two cases: firstly, when
only a subset of the weights of the filter is updated during every iteration; and secondly, when the whole filter is updated at every
cycle. Thus, this gain in step size determines the factor by which the step-size parameter can be increased in order to compensate for the inherently slower convergence rate of the sequential PU adaptive algorithm. The theoretical analysis of the strategy developed
in this paper excludes the use of certain frequencies corresponding to notches that appear in the gain in step size. This strategy
has been successfully applied in the active control of periodic disturbances consisting of several harmonics, so as to reduce the
computational complexity of the control system without either slowing down the convergence rate or increasing the residual error.
Simulated and experimental results confirm the expected behavior.
Copyright © 2007 Pedro Ramos et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Figure 1: Single-channel active noise control system using the filtered-x adaptive algorithm. (a) Physical arrangement of the electroacoustic elements. (b) Equivalent block diagram.

... is obtained by filtering the reference signal through the estimate of the secondary path.

1.2. Partial update LMS algorithm

The LMS algorithm and its filtered-x version have been widely used in control applications because of their simple implementation and good performance. However, the adaptive FIR filter may eventually require a large number of coefficients to meet the requirements imposed by the addressed problem. For instance, in the ANC system described in Figure 1(b), the task associated with the adaptive filter—in order to minimize the error signal—is to accurately model the primary path and inversely model the secondary path. Previous research in the field has shown that if the active canceller has to deal with an acoustic disturbance consisting of closely spaced frequency harmonics, a long adaptive filter is necessary [5]. Thus, an improvement in performance is achieved at the expense of increasing the computational load of the control strategy. Because of limitations in the computational efficiency and memory capacity of low-cost DSP boards, a large number of coefficients may even impair the practical implementation of the LMS or more complex adaptive algorithms.

As an alternative to the reduction of the number of coefficients, one may choose to update only a portion of the filter coefficient vector at each sample time. Partial update (PU) adaptive algorithms have been proposed to reduce the large computational complexity associated with long adaptive filters. As far as the drawbacks of PU algorithms are concerned, it should be noted that their convergence speed is reduced approximately in proportion to the filter length divided by the number of coefficients updated per iteration, that is, the decimation factor N. Therefore, the tradeoff between convergence performance and complexity is clearly established: the larger the saving in computational costs, the slower the convergence rate.

Two well-known adaptive algorithms carry out the partial updating process of the filter vector employing decimated versions of the error or the regressor signals [6]. These algorithms are, respectively, the periodic LMS and the sequential LMS. This work focuses on the latter.

The sequential LMS algorithm with decimation factor N updates a subset of size L/N of the L filter coefficients per iteration, according to (1),

  w_l(n + 1) = { w_l(n) + μ x(n − l + 1) e(n),  if (n − l + 1) mod N = 0,
               { w_l(n),                        otherwise,
  (1)

for 1 ≤ l ≤ L, where w_l(n) represents the lth weight of the filter, μ is the step size of the adaptive algorithm, x(n) is the regressor signal, and e(n) is the error signal.
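One iteration of the sequential partial-update rule (1) can be sketched as below. The buffer convention (x_buf[l-1] holding x(n − l + 1)) and the function name are our own illustrative choices.

```python
import numpy as np

def sequential_lms_step(w, x_buf, e_n, n, mu, N):
    """One iteration of the sequential PU LMS of (1): only the coefficients
    l with (n - l + 1) mod N == 0 are updated at time n.
    w: length-L weight vector; x_buf[l-1] holds x(n - l + 1)."""
    L = len(w)
    w = w.copy()
    for l in range(1, L + 1):
        if (n - l + 1) % N == 0:
            w[l - 1] += mu * x_buf[l - 1] * e_n
    return w
```

Each coefficient is thus refreshed once every N samples, which is the source of both the factor-N complexity saving and the roughly N-fold slower convergence discussed above.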
Tables 1 and 2 show, respectively, the computational complexity of the LMS and the sequential LMS algorithms in terms of the average number of operations required per cycle, when used in the context of a filtered-x implementation of a single-channel ANC system. The length of the adaptive filter is L, the length of the offline estimate of the secondary path is L_s, and the decimation factor is N.

The criterion for the selection of the coefficients to be updated can be modified and, as a result, different PU adaptive algorithms have been proposed [7–10]. The variations of the cited PU LMS algorithms speed up their convergence rate at the expense of increasing the number of operations per cycle. These extra operations include the "intelligence" required to optimize the selection of the coefficients to be updated at every instant.

In this paper, we try to go a step further, showing that in applications based on the sequential LMS algorithm, where the regressor signal is periodic, the inclusion of a new parameter—called gain in step size—in the traditional tradeoff proves that one can achieve a significant reduction in the computational costs without degrading the performance of the algorithm. The proposed strategy—the filtered-x sequential least mean-square algorithm with gain in step size (Gμ-FxSLMS)—has been successfully applied in our laboratory in the context of active control of periodic noise [5].

1.3. Assumptions in the convergence analysis

Before focusing on the sequential PU LMS strategy and the derivation of the gain in step size, it is necessary to remark on two assumptions about the upcoming analysis: the independence theory and the slow convergence condition.

The traditional approach to convergence analyses of LMS—and FxLMS—algorithms is based on stochastic inputs instead of deterministic signals such as a combination of multiple sinusoids. Those stochastic analyses assume independence between the reference—or regressor—signal and the coefficients of the filter vector. In spite of the fact that this independence assumption is not satisfied or is, at least, questionable when the reference signal is deterministic, some researchers have previously used the independence assumption with a deterministic reference. For instance, Kuo et al. [11] [...]

[...] adaptive algorithm [13, Chapter 3], it is necessary to assume slow convergence—i.e., that the control filter is changing slowly—and to count on an exact estimate of the secondary path in order to commute the order of the adaptive filter and the secondary path [2]. In so doing, the output of the adaptive filter carries through directly to the error signal, and the traditional LMS algorithm analysis can be applied by using as regressor signal the result of the filtering of the reference signal through the secondary path transfer function. It could be argued that this condition compromises the determination of an upper bound on the step size of the adaptive algorithm, but actually, slow convergence is guaranteed because the convergence factor is affected by a much more restrictive condition with a periodic reference than with a white noise reference. It has been proved that with a sinusoidal reference the upper bound of the step size is inversely proportional to the product of the length of the filter and the delay in the secondary path, whereas with a white reference signal the bound depends inversely on the sum of these parameters instead of their product [12, 14]. Simulations with a white noise reference signal suggest that a realistic upper bound on the step size is given by [15, Chapter 3]

  μ_max = 2 / (P_x (L + Δ)),   (2)

where P_x is the power of the regressor signal, L is the length of the adaptive filter, and Δ is the delay introduced by the secondary path.

Bjarnason [12] analyzed FxLMS convergence with a sinusoidal reference, but employed the habitual assumptions made with stochastic signals, that is, the independence theory. The stability condition derived by Bjarnason yields

  μ_max = (2 / (P_x L)) sin(π / (2(2Δ + 1))).   (3)

In case of a large delay Δ, (3) simplifies to

  μ_max = π / (P_x L (2Δ + 1)),   Δ ≫ π/4.   (4)

Vicente and Masgrau [14] obtained an upper bound for
the FxLMS step size that ensures convergence when the ref-
assumed the independence theory, the slow convergence con-
erence signal is deterministic (extended to any combination
dition, and the exact offline estimate of the secondary path
of multiple sinusoids). In the derivation of that result, there
to state that the maximum step size of the FxLMS algorithm
is no need of any of the usual approximations, such as in-
is inversely bounded by the maximum eigenvalue of the au-
dependence between reference and weights or slow conver-
tocorrelation matrix of the filtered reference, when the ref-
gence. The maximum step size for a sinusoidal reference is
erence was considered to be the sum of multiple sinusoids.
given by
Bjarnason [12] used as well the independence theory to carry
out a FxLMS analysis extended to a sinusoidal input. Accord- 2
ing to Bjarnason, this approach is justified by the fact that ex- μmax = . (5)
Px L(2Δ + 1)
¼
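The three bounds above differ mainly in whether L and Δ enter as a sum or as a product. A small numerical sketch makes the gap visible; the function names and the values P_x = 1, L = 256, Δ = 40 are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def bound_white(Px, L, delta):
    """Empirical bound (2) for a white noise reference: sum of L and delta."""
    return 2.0 / (Px * (L + delta))

def bound_bjarnason(Px, L, delta):
    """Bjarnason's stability condition (3) for a sinusoidal reference."""
    return (2.0 / (Px * L)) * np.sin(np.pi / (2.0 * (2.0 * delta + 1.0)))

def bound_vicente_masgrau(Px, L, delta):
    """Vicente and Masgrau's bound (5): product of L and (2*delta + 1)."""
    return 2.0 / (Px * L * (2.0 * delta + 1.0))

Px, L, delta = 1.0, 256, 40   # illustrative values only
print(bound_white(Px, L, delta))            # sum in the denominator
print(bound_bjarnason(Px, L, delta))        # close to the simplified form (4)
print(bound_vicente_masgrau(Px, L, delta))  # product: a far smaller bound
```

For a large delay, (3) and (4) agree closely, since sin(x) ≈ x for a small argument.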
[Figure 2 depicts the L filter coefficients w1, ..., wL in blocks of N: coefficient w_{j+kN} (j = 1, ..., N; k = 0, ..., L/N − 1) is updated during the jth iteration with the regressor sample x′(n − kN), and again every N iterations thereafter as new samples x′(n + N), x′(n + 2N), ... are acquired.]

Figure 2: Summary of the sequential PU algorithm, showing the coefficients to be updated at each iteration and the related samples of the regressor signal used in each update, x′(n) being the value of the regressor signal at the current instant.
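The schedule of Figure 2 can be sketched in a few lines. This is an illustrative sketch under assumed conventions (0-based indexing, invented function name), not the authors' implementation:

```python
import numpy as np

def sequential_lms_update(w, x_hist, e, mu, n, N):
    """One iteration of the sequential partial-update LMS of Figure 2.

    w      : length-L weight vector (modified in place)
    x_hist : the last L regressor samples, x_hist[k] = x'(n - k)
    e      : current error sample
    mu     : step size
    n      : iteration index (0-based)
    N      : decimation factor; only L/N coefficients are touched
    """
    L = len(w)
    idx = np.arange(n % N, L, N)      # coefficient subset, cycled over N iterations
    w[idx] += mu * e * x_hist[idx]    # each weight uses its matching delayed sample
    return idx

# Toy illustration: L = 8, N = 4 -> two coefficients per iteration, and after
# N consecutive iterations every coefficient has been updated exactly once.
L, N = 8, 4
w = np.zeros(L)
touched = [sequential_lms_update(w, np.ones(L), 1.0, 0.1, n, N) for n in range(N)]
print(touched)
```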
These bounds are best read as a practical guide rather than as a strict theoretical limit. In other words, we look for a useful guide on determining the maximum step size; as we will see in this paper, the derived bounds and the theoretically predicted behavior correspond not only to simulations but also to experimental results obtained in the laboratory from practical implementations of ANC systems on DSP boards.

To sum up, the independence theory and slow convergence are assumed in order to derive a bound for a filtered-x sequential PU LMS algorithm with deterministic periodic inputs. Although such assumptions might initially be questionable, previous research and the results achieved confirm that these strategies can be applied to the attenuation of periodic disturbances in the context of ANC, achieving the same performance as the full update FxLMS in terms of convergence rate and misadjustment, but with lower computational complexity.

As far as the applicability of the proposed idea is concerned, the contribution of this paper to the design of the step size parameter applies not only to the filtered-x sequential LMS algorithm but also to basic sequential LMS strategies. In other words, the derivation and analysis of the gain in step size could have been carried out without consideration of a secondary path. The reason for studying the specific case that includes the filtered-x stage is the unquestionable existence of an extended problem: the need to attenuate periodic disturbances by means of ANC systems implementing filtered-x algorithms on low-cost DSP-based boards, where the reduction of the number of operations required per cycle is a factor of great importance.

2. EIGENVALUE ANALYSIS OF PERIODIC NOISE: THE GAIN IN STEP SIZE

2.1. Overview

Many convergence analyses of the LMS algorithm try to derive exact bounds on the step size that guarantee mean and mean-square convergence based on the independence assumption [16, Chapter 6]. Analyses based on such assumption have been extended to sequential PU algorithms [6] to yield the following result: the bounds on the step size for the sequential LMS algorithm are the same as those for the LMS algorithm and, as a consequence, a larger step size cannot be used to compensate for its inherently slower convergence rate. However, this result is only valid for independent identically distributed (i.i.d.) zero-mean Gaussian input signals.

To obtain a valid analysis in the case of periodic signals at the input of the adaptive filter, we will focus on the updating process of the coefficients when the L-length filter is adapted by the sequential LMS algorithm with decimation factor N. This algorithm updates just L/N coefficients per iteration according to (1). For ease in analyzing the PU strategy, it is assumed throughout the paper that L/N is an integer.

Figure 1(b) shows the block diagram of a filtered-x ANC system, where the secondary path S(z) is placed after the digital filter W(z) controlled by an adaptive algorithm. As has been previously stated, under the assumption of slow convergence and considering an accurate offline estimate of the secondary path, the order of W(z) and S(z) can be commuted and the resulting equivalent diagram simplified. Thus, standard LMS algorithm techniques can be applied to the filtered-x version of the sequential LMS algorithm in order to determine the convergence of the mean weights and the maximum value of the step size [13, Chapter 3]. The simplified analysis is based on taking the filtered reference as the regressor signal of the adaptive filter; this signal is denoted x′(n) in Figure 1(b).

Figure 2 summarizes the sequential PU algorithm given by (1), indicating the coefficients to be updated at each iteration and the related samples of the regressor signal. In the scheme of Figure 2, the following update is considered to be carried out during the first iteration. The current value of the regressor signal is x′(n). According to (1) and Figure 2, this value is used to update the first N coefficients of the filter during the following N iterations. Generally, at each iteration of a full update adaptive algorithm, a new sample of the regressor signal has to be taken as the latest and newest value of
Pedro Ramos et al. 5
the filtered reference signal. However, according to Figure 2, the sequential LMS algorithm uses only every Nth element of the regressor signal. Thus, it is not worth computing a new sample of the filtered reference at every algorithm iteration; it is enough to obtain the value of a new sample at just one out of N iterations.

The L-length filter can be considered as formed by N subfilters of L/N coefficients each. These subfilters are obtained by uniformly sampling by N the weights of the original vector; the coefficients of the first subfilter are encircled in Figure 2. Hence, the whole updating process can be understood as the N-cyclical updating schedule of N subfilters of length L/N. Coefficients occupying the same relative position in every subfilter are updated with the same sample of the regressor signal, and this regressor signal is renewed only at one in every N iterations. That is, after N iterations the least recent value is shifted out of the valid range and a new value is acquired and subsequently used to update the first coefficient of each subfilter.

To sum up, during N consecutive instants, N subfilters of length L/N are updated with the same regressor signal, which is an N-decimated version of the filtered reference signal. Therefore, the overall convergence can be analyzed on the basis of the joint convergence of N subfilters:

(i) each of length L/N,
(ii) updated by an N-decimated regressor signal.

2.2. Spectral norm of autocorrelation matrices: the triangle inequality

The autocorrelation matrix R of a periodic signal consisting of several harmonics is Hermitian and Toeplitz. The spectral norm of a matrix A is defined as the square root of the largest eigenvalue of the matrix product A^H A, where A^H is the Hermitian transpose of A, that is, [17, Appendix E]

\|A\|_s = \left[ \lambda_{\max}\left( A^H A \right) \right]^{1/2}.    (6)

The spectral norm of a matrix satisfies, among other norm conditions, the triangle inequality given by

\|A + B\|_s \le \|A\|_s + \|B\|_s.    (7)

For a Hermitian positive semidefinite matrix, the spectral norm equals its largest eigenvalue,

\|A\|_s = \lambda_{\max}(A).    (8)

Combining (7) and (8), the largest eigenvalue of a sum of such matrices is bounded by the sum of their largest eigenvalues,

\lambda_{\max}(A + B) \le \lambda_{\max}(A) + \lambda_{\max}(B).    (9)

At this point, a convergence analysis is carried out in order to derive a bound on the step size of the filtered-x sequential PU LMS algorithm when the regressor vector is a periodic signal consisting of multiple sinusoids.

It is known that the LMS adaptive algorithm converges in mean to the solution if the step size satisfies [16, Chapter 6]

0 < \mu < \frac{2}{\lambda_{\max}},    (10)

where λmax is the largest eigenvalue of the input autocorrelation matrix

R = E\left[ x'(n)\, x'^T(n) \right],    (11)

x′(n) being the regressor signal of the adaptive algorithm.

As has been previously stated, under the assumptions considered in Section 1.3, in the case of an ANC system based on the FxLMS, the traditional LMS algorithm analysis can be used considering that the regressor vector corresponds to the reference signal filtered by an estimate of the secondary path. The proposed analysis is based on the ratio between the largest eigenvalues of the autocorrelation matrix of the regressor signal in two different situations: firstly, when the adaptive algorithm is the full update LMS and, secondly, when the updating strategy is based on the sequential LMS algorithm with a decimation factor N > 1. The sequential LMS with N = 1 corresponds to the LMS algorithm.

Let the regressor vector x′(n) be formed by a periodic signal consisting of K harmonics of the fundamental frequency f0,

x'(n) = \sum_{k=1}^{K} C_k \cos\left( 2\pi k f_0 n + \phi_k \right).    (12)

The autocorrelation matrix of the whole signal can be expressed as the sum of K simpler matrices, each being the autocorrelation matrix of a single tone [11],

R = \sum_{k=1}^{K} C_k^2 R_k,    (13)

where R_k is the L × L symmetric Toeplitz matrix

R_k(i, j) = \frac{1}{2} \cos\left( 2\pi k f_0 (i - j) \right), \quad i, j = 1, \ldots, L.    (14)

According to (9), the largest eigenvalue of a sum of matrices is bounded by the sum of the largest eigenvalues of each of
its components. Therefore, the largest eigenvalue of R can be expressed as

\lambda^{N=1}_{tot,\max} \le \sum_{k=1}^{K} C_k^2\, \lambda^{N=1}_{k,\max}(k f_0) = \sum_{k=1}^{K} C_k^2 \max_{\pm} \frac{1}{4}\left( L \pm \frac{\sin(2\pi k f_0 L)}{\sin(2\pi k f_0)} \right).    (16)

At the end of Section 2.1, two key differences were derived in the case of the sequential LMS algorithm: the convergence condition of the whole filter can be translated into the parallel convergence of N subfilters of length L/N adapted by an N-decimated regressor signal. Considering both changes, the largest eigenvalue of each simple matrix R_k can be expressed as

\lambda^{N>1}_{k,\max}(k f_0) = \max_{\pm} \frac{1}{4}\left( \frac{L}{N} \pm \frac{\sin((L/N)\, 2\pi k N f_0)}{\sin(2\pi k N f_0)} \right),    (17)

and, considering the triangle inequality (9), we have

\lambda^{N>1}_{tot,\max} \le \sum_{k=1}^{K} C_k^2\, \lambda^{N>1}_{k,\max}(k f_0) = \sum_{k=1}^{K} C_k^2 \max_{\pm} \frac{1}{4}\left( \frac{L}{N} \pm \frac{\sin((L/N)\, 2\pi k N f_0)}{\sin(2\pi k N f_0)} \right).    (18)

Defining the gain in step size Gμ as the ratio between the bounds on the step sizes in both cases, we obtain the factor by which the step size parameter can be multiplied when the adaptive algorithm uses PU,

G_\mu(K, f_0, L, N) = \frac{\mu^{N>1}_{\max}}{\mu^{N=1}_{\max}} = \frac{2 / \lambda^{N>1}_{tot,\max}}{2 / \lambda^{N=1}_{tot,\max}} = \frac{\sum_{k=1}^{K} C_k^2 \max_{\pm} (1/4)\left( L \pm \sin(2\pi k f_0 L)/\sin(2\pi k f_0) \right)}{\sum_{k=1}^{K} C_k^2 \max_{\pm} (1/4)\left( L/N \pm \sin((L/N)\, 2\pi k N f_0)/\sin(2\pi k N f_0) \right)}.    (19)

In order to more easily visualize the dependence of the gain in step size on the length of the filter L and on the decimation factor N, let a single tone of normalized frequency f0 be the regressor signal,

x'(n) = \cos\left( 2\pi f_0 n + \phi \right).    (20)

Now the gain in step size, that is, the ratio between the bounds on the step size when N > 1 and N = 1, is given by

G_\mu(1, f_0, L, N) = \frac{\mu^{N>1}_{\max}}{\mu^{N=1}_{\max}} = \frac{\max_{\pm} (1/4)\left( L \pm \sin(2\pi f_0 L)/\sin(2\pi f_0) \right)}{\max_{\pm} (1/4)\left( L/N \pm \sin((L/N)\, 2\pi N f_0)/\sin(2\pi N f_0) \right)}.    (21)

Figures 3 and 4 show the gain in step size expressed by (21) for different decimation factors N and different lengths L of the adaptive filter.

Basically, the analytical expressions and the figures show that the step size can be multiplied by N as long as certain frequencies, at which a notch in the gain in step size appears, are avoided. The location of these critical frequencies, as well as the number and width of the notches, will be analyzed as a function of the sampling frequency Fs, the length of the adaptive filter L, and the decimation factor N. According to (19) and (21), with an increasing decimation factor N the step size can be multiplied by N and, as a result of that affordable compensation, the convergence of the PU sequential algorithm is as fast as that of the full update FxLMS algorithm, as long as the undesired disturbance is free of components located at the notches of the gain in step size.

Figure 3 shows that the total number of equidistant notches appearing in the gain in step size is N − 1. In fact, the notches appear at the frequencies given by

f_{k,\mathrm{notch}} = k\, \frac{F_s}{2N}, \quad k = 1, \ldots, N - 1.    (22)

It is important that the undesired sinusoidal noise not lie at the mentioned notches, because the gain in step size is smaller there, with the subsequent reduction in convergence rate. As far as the width of the notches is concerned, Figure 4 (where the decimation factor is N = 2) shows that the smaller the length of the filter, the wider the main notch of the gain in step size. In fact, if L/N is an integer, the width between the first zeros of the main notch can be expressed as

\mathrm{width} = \frac{F_s}{L}.    (23)

Simulations and practical experiments confirm that at these problematic frequencies the gain in step size cannot be applied at its maximum value N.

If it were not possible to avoid the presence of some harmonic at a frequency where there is a notch in the gain, the proposed strategy could be combined with the filtered-error least mean-square (FeLMS) algorithm [13, Chapter 3]. The FeLMS algorithm is based on a shaping filter C(z) placed in the error path and in the filtered reference path. The transfer function C(z) is the inverse of the desired shape of the residual noise; therefore, C(z) must be designed as a comb filter with notches at the problematic frequencies. As a result, the harmonics at those frequencies would not be canceled. Nevertheless, if a noise component were to fall in a notch, using a smaller step size could be preferable to using the FeLMS, considering that it is typically more important to cancel all noise disturbance frequencies than to obtain the fastest possible convergence rate.

3. NOISE ON THE WEIGHT VECTOR SOLUTION AND EXCESS MEAN-SQUARE ERROR

The aim of this section is to prove that the full-strength gain in step size Gμ = N can be applied in the context of ANC systems controlled by the filtered-x sequential LMS algorithm without an additional increase in mean-square error caused by the noise on the weight vector solution. We begin with an analysis of the trace of the autocorrelation matrix of an N-decimated signal xN(n), which is included to provide mathematical support for subsequent parts. The second part of the section revises the analysis performed by Widrow and Stearns [16] of the steady-state effects of gradient noise on the weight vector solution. Let the input vector of the adaptive filter be

x(n) = \left[ x(n), x(n-1), \ldots, x(n-L+1) \right]^T.    (24)

The expectation of the outer product of the vector x(n) with itself determines the L × L autocorrelation matrix R of the input signal.

Figure 3: Gain in step size for a single tone and different decimation factors N = 1, 2, 4, 8 (panels (a)–(d), all with L = 256).

Figure 4: Gain in step size for a single tone and different filter lengths L = 8, 32, 128 with decimation factor N = 2.
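The single-tone gain of (21) can be checked numerically against the eigenvalues of the corresponding autocorrelation matrices. The following sketch (function names and the chosen values L = 256, N = 4 are illustrative assumptions) reproduces the qualitative shape of the curves in Figures 3 and 4: a gain close to N away from the notches of (22), collapsing at a notch:

```python
import numpy as np

def lam_max(L, f0):
    """Largest eigenvalue of the L x L autocorrelation matrix of cos(2*pi*f0*n)."""
    m = np.arange(L)
    R = 0.5 * np.cos(2 * np.pi * f0 * (m[:, None] - m[None, :]))
    return np.linalg.eigvalsh(R)[-1]   # eigvalsh returns ascending eigenvalues

def gain_in_step_size(L, N, f0):
    """Single-tone gain of (21): each decimated subfilter of length L/N
    sees a tone of normalized frequency N*f0."""
    return lam_max(L, f0) / lam_max(L // N, N * f0)

L, N = 256, 4
print(gain_in_step_size(L, N, 0.05))         # away from the notches: close to N
print(gain_in_step_size(L, N, 1 / (2 * N)))  # at the first notch (k = 1): reduced
```

The collapse at normalized frequencies k/(2N) mirrors the notches of (22) once frequencies are denormalized by Fs.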
The N-decimated signal xN(n) can be described with the help of selection matrices obtained from the L × L identity matrix. Let I^{(N)}_k, k = 1, ..., N, denote the diagonal matrix that keeps the entries at positions k, k + N, k + 2N, ... and zeroes the rest,

I^{(N)}_k = \mathrm{diag}\big( \underbrace{0, \ldots, 0}_{k-1},\, 1,\, \underbrace{0, \ldots, 0}_{N-1},\, 1,\, 0, \ldots \big).    (27)

As a result of (26), the autocorrelation matrix RN of the new signal xN(n) presents nonnull elements only on its main diagonal and on the diagonals separated from it by kN positions, k being any integer. Thus,

R_N = E\left[ x_N(n)\, x_N^T(n) \right], \qquad \left( R_N \right)_{ij} = \begin{cases} r_{xx}(i-j)/N, & (i-j) \text{ a multiple of } N, \\ 0, & \text{otherwise}. \end{cases}    (28)

The relation between the traces of R and RN is given by

\mathrm{trace}\left( R_N \right) = \sum_{i=1}^{L} \frac{r_{xx}(0)}{N} = \frac{L\, r_{xx}(0)}{N} = \frac{\mathrm{trace}(R)}{N}.    (32)

3.2. Effects of the gradient noise on the LMS algorithm

Let the vector w(n) represent the weights of the adaptive filter, which are updated according to the LMS algorithm as follows:

w(n+1) = w(n) - \frac{\mu}{2} \hat{\nabla}(n) = w(n) + \mu\, e(n)\, x(n),    (33)

where μ is the step size, \hat{\nabla}(n) is the gradient estimate at the nth iteration, e(n) is the error at the previous iteration, and x(n) is the vector of input samples, also called the regressor signal.

We define v(n) as the deviation of the weight vector from its optimum value,

v(n) = w(n) - w_{\mathrm{opt}},    (34)

and v′(n) as the rotation of v(n) by means of the eigenvector matrix Q,

v'(n) = Q^{-1} v(n) = Q^{-1}\left( w(n) - w_{\mathrm{opt}} \right).    (35)

In order to give a measure of the difference between the actual and the optimal performance of an adaptive algorithm, two parameters can be taken into account: the excess mean-square error and the misadjustment. The excess mean-square error ξexcess is the average mean-square error less the minimum mean-square error, that is,

\xi_{\mathrm{excess}} = E\left[ \xi(n) \right] - \xi_{\min}.    (36)
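The trace relation (32) and the structure of R_N can be verified numerically. This small check builds R_N from the selection matrices of (27); the tone frequency and sizes are arbitrary choices, not values from the paper:

```python
import numpy as np

L, N, f0 = 12, 3, 0.05
m = np.arange(L)
# Autocorrelation matrix of a unit-amplitude tone: r_xx(tau) = cos(2*pi*f0*tau)/2
R = 0.5 * np.cos(2 * np.pi * f0 * (m[:, None] - m[None, :]))

def I_sel(k):
    """Selection matrix of (27): ones at positions k, k+N, k+2N, ... (0-based)."""
    d = np.zeros(L)
    d[k::N] = 1.0
    return np.diag(d)

# R_N as the average of the masked autocorrelation matrices, matching (28)
R_N = sum(I_sel(k) @ R @ I_sel(k) for k in range(N)) / N

print(np.trace(R_N), np.trace(R) / N)  # equal, as stated in (32)
print(R_N[0, 1], R_N[0, N])            # off-pattern entries are exactly zero
```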
The misadjustment M is defined as the excess mean-square error divided by the minimum mean-square error,

M = \frac{\xi_{\mathrm{excess}}}{\xi_{\min}} = \frac{E\left[ \xi(n) \right] - \xi_{\min}}{\xi_{\min}}.    (37)

Random weight variations around the optimum value of the filter cause an increase in mean-square error; the average of these increases is the excess mean-square error. Widrow and Stearns [16, Chapters 5 and 6] analyzed the steady-state effects of gradient noise on the weight vector solution of the LMS algorithm by means of the definition of a vector of noise n(n) in the gradient estimate at the nth iteration. It is assumed that the LMS process has converged to a steady-state weight vector solution near its optimum and that the true gradient ∇(n) is close to zero. Thus, we write

n(n) = \hat{\nabla}(n) - \nabla(n) = \hat{\nabla}(n) = -2\, e(n)\, x(n).    (38)

The weight vector covariance in the principal axis coordinate system, that is, in primed coordinates, is related to the covariance of the noise as follows [16, Chapter 6]:

\mathrm{cov}\left[ v'(n) \right] = \frac{\mu}{8}\left( \Lambda - \frac{\mu}{2}\Lambda^2 \right)^{-1} \mathrm{cov}\left[ n'(n) \right]
= \frac{\mu}{8}\left( \Lambda - \frac{\mu}{2}\Lambda^2 \right)^{-1} \mathrm{cov}\left[ Q^{-1} n(n) \right]
= \frac{\mu}{8}\left( \Lambda - \frac{\mu}{2}\Lambda^2 \right)^{-1} Q^{-1} E\left[ n(n)\, n^T(n) \right] Q.    (39)

In practical situations, (μ/2)Λ tends to be negligible with respect to I, so that (39) simplifies to

\mathrm{cov}\left[ v'(n) \right] \approx \frac{\mu}{8}\, \Lambda^{-1} Q^{-1} E\left[ n(n)\, n^T(n) \right] Q.    (40)

From (38), it can be shown that the covariance of the gradient estimation noise of the LMS algorithm at the minimum point is related to the input autocorrelation matrix according to

\mathrm{cov}\left[ n(n) \right] = E\left[ n(n)\, n^T(n) \right] = 4\, E\left[ e^2(n) \right] R.    (41)

In (41), the error and the input vector are considered statistically independent because at the minimum point of the error surface both signals are orthogonal.

To sum up, (40) and (41) indicate that the measure of how close the LMS algorithm is to optimality in the mean-square error sense depends on the product of the step size and the autocorrelation matrix of the regressor signal x(n).

3.3. Effects of gradient noise on the filtered-x sequential LMS algorithm

At this point, the goal is to carry out an analysis of the effect of gradient noise on the weight vector solution for the case of the Gμ-FxSLMS algorithm, in a similar manner as in the previous section.

The weights of the adaptive filter when the Gμ-FxSLMS algorithm is used are updated according to the recursion

w(n+1) = w(n) + G_\mu\, \mu\, e(n)\, I^{(N)}_{1 + n \bmod N}\, x'(n),    (42)

where I^{(N)}_{1 + n \bmod N} is obtained from the identity matrix as expressed in (27). The gradient estimation noise of the filtered-x sequential LMS algorithm at the minimum point, where the true gradient is zero, is given by

n(n) = \hat{\nabla}(n) = -2\, e(n)\, I^{(N)}_{1 + n \bmod N}\, x'(n).    (43)

Considering PU, only L/N terms out of the L-length noise vector are nonzero at each iteration, giving a smaller noise contribution in comparison with the LMS algorithm, which updates the whole filter.

The weight vector covariance in the principal axis coordinate system, that is, in primed coordinates, is related to the covariance of the noise as follows:

\mathrm{cov}\left[ v'(n) \right] = \frac{G_\mu \mu}{8}\left( \Lambda - \frac{G_\mu \mu}{2}\Lambda^2 \right)^{-1} \mathrm{cov}\left[ n'(n) \right]
= \frac{G_\mu \mu}{8}\left( \Lambda - \frac{G_\mu \mu}{2}\Lambda^2 \right)^{-1} \mathrm{cov}\left[ Q^{-1} n(n) \right]
= \frac{G_\mu \mu}{8}\left( \Lambda - \frac{G_\mu \mu}{2}\Lambda^2 \right)^{-1} Q^{-1} E\left[ n(n)\, n^T(n) \right] Q.    (44)

Assuming that (Gμμ/2)Λ is considerably less than I, (44) simplifies to

\mathrm{cov}\left[ v'(n) \right] \approx \frac{G_\mu \mu}{8}\, \Lambda^{-1} Q^{-1} E\left[ n(n)\, n^T(n) \right] Q.    (45)

The covariance of the gradient estimation noise when the sequential PU is used can be expressed as

\mathrm{cov}\left[ n(n) \right] = E\left[ n(n)\, n^T(n) \right]
= 4\, E\left[ e^2(n)\, I^{(N)}_{1+n \bmod N}\, x'(n)\, x'^T(n)\, I^{(N)}_{1+n \bmod N} \right]
= 4\, E\left[ e^2(n) \right] E\left[ I^{(N)}_{1+n \bmod N}\, x'(n)\, x'^T(n)\, I^{(N)}_{1+n \bmod N} \right]
= 4\, E\left[ e^2(n) \right] \frac{1}{N}\sum_{i=1}^{N} I^{(N)}_i\, R\, I^{(N)}_i
= 4\, E\left[ e^2(n) \right] R_N.    (46)

In (46), statistical independence of the error and the input vector has been assumed at the minimum point of the error surface, where both signals are orthogonal.

According to (32), the comparison of (40) and (45), carried out in terms of the trace of the autocorrelation matrices, confirms that the contribution of the gradient estimation noise is N times weaker for the sequential LMS algorithm than for the LMS. This reduction compensates the eventual increase in the covariance of the weight vector in the principal axis coordinate system expressed in (45) when the maximum gain in step size Gμ = N is applied in the context of the Gμ-FxSLMS algorithm.
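The claim that the gain Gμ = N restores the convergence rate of the full update algorithm can be illustrated with a small simulation. The sketch below omits the secondary path (the paper notes the derivation also applies to basic sequential LMS strategies) and identifies an assumed FIR system with a single-tone regressor; every name and parameter value is an illustrative choice, not taken from the paper:

```python
import numpy as np

L, N, f0 = 16, 4, 0.05    # f0 stays away from the notches k/(2N)
mu = 0.01                  # stable step size for the full-update algorithm
rng = np.random.default_rng(1)
h = rng.standard_normal(L)                        # unknown FIR system
xsig = np.cos(2 * np.pi * f0 * np.arange(10000 + L))

def final_error(sequential, gain, iters=10000):
    """Mean absolute error over the last 500 iterations of (sequential) LMS."""
    w = np.zeros(L)
    errs = []
    for n in range(iters):
        xv = xsig[n : n + L][::-1]       # regressor vector [x(n), x(n-1), ...]
        e = h @ xv - w @ xv              # desired output minus filter output
        if sequential:
            idx = np.arange(n % N, L, N) # only L/N coefficients per iteration
            w[idx] += gain * mu * e * xv[idx]
        else:
            w += gain * mu * e * xv
        errs.append(abs(e))
    return float(np.mean(errs[-500:]))

print(final_error(sequential=False, gain=1))  # full update LMS
print(final_error(sequential=True, gain=N))   # sequential LMS with gain N
```

Both runs drive the error to a comparably small residual; with gain = 1 the sequential version would need roughly N times as many iterations to do so.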
Figure 5: Transfer function magnitude of (a) primary path P(z), (b) secondary path S(z), and (c) offline estimate of the secondary path used in the simulated model; (d) power spectral density of the periodic disturbance consisting of two tones of 62.5 Hz and 187.5 Hz in additive white Gaussian noise.
Figure 6: Gain in step size over the frequency band of interest, from 0 to 400 Hz, for different values of the decimation factor N (N = 1, 2, 8, 32, 64, 80).
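The criticality of N = 64 follows directly from (22). Assuming the sampling frequency of this example is Fs = 8000 samples/s (a value inferred from the 62.5 Hz notch, not stated in this passage), a short check flags which decimation factors place a notch on a disturbance tone:

```python
Fs = 8000.0            # assumed sampling rate of the simulation example
tones = (62.5, 187.5)  # components of the periodic disturbance

def notch_freqs(N, Fs):
    """Notch locations of (22): f_k = k * Fs / (2 N), k = 1, ..., N - 1."""
    return [k * Fs / (2 * N) for k in range(1, N)]

critical = {N: any(f in tones for f in notch_freqs(N, Fs))
            for N in (1, 2, 8, 32, 64, 80)}
print(critical)        # only N = 64 places a notch on a disturbance tone
```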
and the adaptive process starts. The value μ = 0.0001 is near the maximum stable step size when a decimation factor N = 1 is chosen.

The performance of the Gμ-FxSLMS algorithm was tested for different values of the decimation factor N. Figure 6 shows the gain in step size over the frequency band of interest for different values of the parameter N. The gain in step size at the frequencies 62.5 Hz and 187.5 Hz is marked with two circles on the curves. The exact location of the notches is given by (22). On the basis of the position of the notches in the gain in step size and the spectral distribution of the undesired noise, the decimation factor N = 64 is expected to be critical because, according to Figure 6, the full-strength gain Gμ = N = 64 cannot be applied at the frequencies 62.5 Hz and 187.5 Hz; both frequencies correspond exactly to the sinusoidal components of the periodic disturbance. Apart from the case N = 64, the gain in step size is free of notches at both of these frequencies.

Convergence curves for different values of the decimation factor N are shown in Figure 7. The numbers that appear over the figures correspond to the mean-square error computed over the last 5000 iterations. The residual error is expressed in logarithmic scale as the ratio of the mean-square error to a signal of unitary power. As expected, the convergence rate and the residual error are the same in all cases except when N = 64. For this value, the active noise control system diverges. In order to make the system converge when N = 64, it is necessary to decrease the gain in step size to a maximum value of 32, with a subsequent reduction in convergence rate.

The second example compares the theoretical gain in step size with the increase obtained by MATLAB simulation. The model of this example corresponds, as in the previous example, to the 1 × 1 × 1 arrangement described in Figure 1. In this example, the reference is a single sinusoidal signal whose frequency is varied in 20 Hz steps from 40 to 1560 Hz. The sampling frequency of the model is 3200 samples/s. The primary and secondary paths, P(z) and S(z), are pure delays of 300 and 40 samples, respectively. The output of the primary path is mixed with additive white Gaussian noise providing a signal-to-noise ratio of 27 dB. It is assumed that the secondary path has been exactly estimated. In order to provide very accurate results, the increase in step size between every two consecutive simulations searching for the bound is less than 1/5000 of the final value of the step size that ensures convergence. The
Figure 7: Evolution of the instantaneous error power in an ANC system using the Gμ-FxSLMS algorithm for different values of the decimation factor N (N = 1, 2, 8, 32, 64, 80). In all cases, the gain in step size was set to the maximum value Gμ = N.
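For the second example, (22) predicts the notch locations directly; the three frequencies computed below match the notches observed in Figure 8:

```python
Fs, N = 3200, 4   # sampling frequency and decimation factor of the second example
notches = [k * Fs / (2 * N) for k in range(1, N)]
print(notches)    # [400.0, 800.0, 1200.0]
```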
decimation factor N of this example was set to 4. Figure 8 compares the predicted gain in step size with the achieved results. As expected, the experimental gain in step size is 4, apart from the notches that appear at 400, 800, and 1200 Hz.

4.2. Practical implementation

The Gμ-FxSLMS algorithm was implemented in a 1 × 2 × 2 active noise control system aimed at attenuating engine noise at the front seats of a Nissan Vanette. Figure 9 shows the physical arrangement of the electroacoustic elements. The adaptive algorithm was developed on a hardware platform based on the DSP TMS320C6701 from Texas Instruments [18].

The length of the adaptive filter (L) for the Gμ-FxSLMS algorithm was set to 256 or 512 coefficients (depending on the spectral characteristics of the undesired noise and the degree of attenuation desired), the length of the estimate of the secondary path (Ls) was set to 200 coefficients, and the decimation factor and the gain in step size were N = Gμ = 8. The sampling frequency was Fs = 8000 samples/s. From the selected parameters one can derive, according to (22), that the first notch in the gain in step size is located at 500 Hz. The system effectively cancels the main harmonics of the engine noise. Considering that the loudspeakers have a low cutoff frequency of 60 Hz, the controller cannot attenuate the components below this frequency. Besides, the ANC system has more difficulty attenuating closely spaced frequency harmonics (see Figure 10(a)). This problem can be avoided by increasing the number of coefficients of the adaptive filter, for instance, from L = 256 to 512 coefficients (see Figure 10(b)).

In order to carry out a performance comparison of the Gμ-FxSLMS algorithm with increasing values of the decimation factor N, and subsequently of the gain in step size Gμ, it is essential to repeat the experiment with the same undesired disturbance. To avoid inconsistencies in level and frequency, instead of starting the engine we previously recorded a signal consisting of several harmonics (100, 150, 200, and 250 Hz). An omnidirectional source (Brüel & Kjaer Omnipower 4296) placed inside the van is fed with this signal; therefore, the comparison could be made under the same conditions. The ratio, in logarithmic scale, of the mean-square error to a signal of unitary power that appears over the graphics was calculated by averaging the last
[Figure 9 labels: engine noise, reference microphone, secondary sources, and error microphone.]

Figure 9: Arrangement of the electroacoustic elements inside the van.

Figure 10: Power spectral density of the undesired noise (dotted) and of the residual error (solid) for the real cancelation of engine noise at the driver location. The decimation factor is N = 8 and the length of the adaptive filter is (a) L = 256 and (b) L = 512.
iterations shown. In this case, the length of the adaptive filter was set to 256 coefficients, the length of the estimate of the secondary path (Ls) was set to 200 coefficients, and the decimation factor and the gain in step size were set to N = Gμ = 1, 2, 4, and 8. The sampling frequency was Fs = 8000 samples/s, and the first notch in the gain in step size appeared at 500 Hz, well above the spectral location of the undesired disturbance. The experimental results shown in Figure 11 indicate that applying the full-strength gain in step size when the decimation factor is 2, 4, or 8 reduces the computational cost without degrading in any sense the performance of the system with respect to the full update algorithm.

Taking into account that the 2-channel ANC system implementing the Gμ-FxSLMS algorithm inside the van ignored cross terms, the expressions given by Tables 1 and 2 show that approximately 32%, 48%, and 56% of the high-level multiplications can be saved when the decimation factor N is set to 2, 4, and 8, respectively.

Although reductions in the number of operations are an indication of the computational efficiency of an algorithm, such reductions may not directly translate into a more efficient real-time DSP-based implementation on a hardware platform. To accurately gauge such issues, one must consider the freedoms and constraints that a platform imposes in the
14 EURASIP Journal on Audio, Speech, and Music Processing
Error power
0.04
fits of this strategy when it is applied in an active noise con-
40.18 dB trol system to attenuate periodic noise.
0
0 0.5 1 1.5 2 2.5
Time (s) ACKNOWLEDGMENT
(b) N = 2 This work was partially supported by CICYT of Spanish Gov-
ernment under Grant TIN2005-08660-C04-01.
Error power
0.04
REFERENCES
41.7 dB
0 [1] P. Lueg, “Process of silencing sound oscillations,” U.S. Patent
0 0.5 1 1.5 2 2.5 no. 2.043.416, 1936.
Time (s) [2] D. R. Morgan, “Analysis of multiple correlation cancellation
loops with a filter in the auxiliary path,” IEEE Transactions on
(c) N = 4
Acoustics, Speech, and Signal Processing, vol. 28, no. 4, pp. 454–
467, 1980.
[3] B. Widrow, D. Shur, and S. Shaffer, “On adaptive inverse con-
Error power
0.04
trol,” in Proceedings of the15th Asilomar Conference on Circuits,
40.27 dB Systems, and Computers, pp. 185–195, Pacific Grove, Calif,
0 USA, November 1981.
0 0.5 1 1.5 2 2.5 [4] J. C. Burgess, “Active adaptive sound control in a duct: a com-
Time (s) puter simulation,” Journal of the Acoustical Society of America,
(d) N = 8 vol. 70, no. 3, pp. 715–726, 1981.
[5] P. Ramos, R. Torrubia, A. López, A. Salinas, and E. Masgrau,
“Computationally efficient implementation of an active noise
Figure 11: Error convergence of the real implementation of the Gμ -
control system based on partial updates,” in Proceedings of the
FxSLMS algorithm with increasing value of the decimation factor
International Symposium on Active Control of Sound and Vibra-
N. The system deals with a previously recorded signal consisting of
tion (ACTIVE ’04), Williamsburg, Va, USA, September 2004,
harmonics at 100, 150, 200, and 250 Hz.
paper 003.
[6] S. C. Douglas, “Adaptive filters employing partial updates,”
IEEE Transactions on Circuits and Systems II: Analog and Digi-
real implementation, such as parallel operations, addressing tal Signal Processing, vol. 44, no. 3, pp. 209–216, 1997.
modes, registers available, or number of arithmetic units. In [7] T. Aboulnasr and K. Mayyas, “Selective coefficient update of
our case, the control strategy and the assembler code was de- gradient-based adaptive algorithms,” in Proceedings of IEEE In-
veloped trying to take full advantage of these aspects [5]. ternational Conference on Acoustics, Speech and Signal Process-
ing (ICASSP ’97), vol. 3, pp. 1929–1932, Munich, Germany,
April 1997.
5. CONCLUSIONS
[8] K. Doǧançay and O. Tanrikulu, “Adaptive filtering algorithms
This work presents a contribution to the selection of the step with selective partial updates,” IEEE Transactions on Circuits
size used in the sequential partial update LMS and FxLMS and Systems II: Analog and Digital Signal Processing, vol. 48,
no. 8, pp. 762–769, 2001.
adaptive algorithms. The deterministic periodic input signal
case is studied and it is verified that under certain conditions [9] J. Sanubari, “Fast convergence LMS adaptive filters employ-
ing fuzzy partial updates,” in Proceedings of IEEE Conference
the stability range of the step size is increased compared to
on Convergent Technologies for Asia-Pacific Region (TENCON
the full update LMS and FxLMS. ’03), vol. 4, pp. 1334–1337, Bangalore, India, October 2003.
The algorithm proposed here—filtered-x sequential LMS [10] P. A. Naylor, J. Cui, and M. Brookes, “Adaptive algorithms for
with gain in step size (Gμ -FxSLMS)—is based on sequential sparse echo cancellation,” Signal Processing, vol. 86, no. 6, pp.
PU of the coefficients of a filter and on a controlled increase 1182–1192, 2006.
in the step size of the adaptive algorithm. It can be used in ac- [11] S. M. Kuo, M. Tahernezhadi, and W. Hao, “Convergence anal-
tive noise control systems focused on the attenuation of pe- ysis of narrow-band active noise control system,” IEEE Trans-
riodic disturbances to reduce the computational costs of the actions on Circuits and Systems II: Analog and Digital Signal
control system. It is theoretically and experimentally proved Processing, vol. 46, no. 2, pp. 220–223, 1999.
Research Article
Detection-Guided Fast Affine Projection Channel Estimator for
Speech Applications
In various adaptive estimation applications, such as acoustic echo cancellation within teleconferencing systems, the input signal is highly correlated speech. This, in general, leads to extremely slow convergence of the NLMS adaptive FIR estimator. As a result, for such applications, the affine projection algorithm (APA) or its low-complexity version, the fast affine projection (FAP) algorithm, is commonly employed instead of the NLMS algorithm. In such applications, the signal propagation channel may have a relatively low-dimensional impulse response structure; that is, the number m of active or significant taps within the (discrete-time modelled) channel impulse response is much less than the overall tap length n of the channel impulse response. For such cases, we investigate the inclusion of an active-parameter detection-guided concept within the fast affine projection FIR channel estimator. Simulation results indicate that the proposed detection-guided fast affine projection channel estimator has improved convergence speed and better steady-state performance than the standard fast affine projection channel estimator, especially in the important case of highly correlated speech input signals.
Copyright © 2007 Yan Wu Jennifer et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited.
2. SYSTEM DESCRIPTION

2.1. Adaptive estimator

We consider the adaptive FIR channel estimation system of Figure 1. The following assumptions are made:

(1) all the signals are sampled: at sample instant k, u(k) is the signal input to the unknown channel and the channel estimator; additive noise v(k) occurs within the unknown channel;
(2) the unknown channel is linear and is adequately modelled by a discrete-time FIR filter Θ = [θ0, θ1, . . . , θn]T with a maximum delay of n sample intervals;
(3) the additive noise signal is zero mean and uncorrelated with the input signal;
(4) the FIR-modelled unknown channel Θ[z−1] is sparsely active:

    Θ[z−1] = θt1 z−t1 + θt2 z−t2 + · · · + θtm z−tm,    (1)

where m ≪ n and 0 ≤ t1 < t2 < · · · < tm ≤ n.

At sample instant k, an active tap is defined as a tap corresponding to one of the m indices {ta}, a = 1, . . . , m, of (1). Each of the remaining taps is defined as an inactive tap.

The observed output from the unknown channel is

    y(k) = ΘT U(k) + v(k),    (2)

where U(k) = [u(k), u(k − 1), . . . , u(k − n)]T.

The standard NLMS channel estimate is updated as

    θ̂(k + 1) = θ̂(k) + μ U(k) [y(k) − ŷ(k)] / [U(k)T U(k) + δ],    (3)

where ŷ(k) = θ̂T(k)U(k) and where δ is a small positive regularization constant.

Note: the standard initial channel estimate θ̂(0) is the all-zero vector.

For stable 1st-order mean behavior, the step size μ should satisfy 0 < μ ≤ 2. In practice, however, to attain higher-order stable behavior, the step size is chosen to satisfy 0 < μ ≪ 2.

For the standard discrete NLMS adaptive FIR estimator, every coefficient θ̂i(k), i = 0, 1, . . . , n, is adapted at each sample interval. However, this approach leads to slow convergence rates when the required FIR filter tap length n is "large" [6]. In [6–8], it is shown that if only the active or significant channel taps are NLMS estimated, then the convergence rate of the NLMS estimator may be greatly enhanced, particularly when m ≪ n.

2.2. Affine projection algorithm

The affine projection algorithm (APA) is considered as a generalisation of the normalized least-mean-square (NLMS) algorithm [2]. Alternatively, the APA can be viewed as an in-between solution to the NLMS and RLS algorithms in terms of computational complexity and convergence rate [10]. The NLMS algorithm updates the estimator taps/weights on the basis of a single input vector, which can be viewed as a one-dimensional affine projection [11]. In the APA, the projections are made in multiple dimensions. The convergence rate of the estimator's tap weight vector greatly increases with an increase in the projection dimension. This is due to the built-in decorrelation properties of the APA.

To describe the affine projection algorithm (APA) [1], the following notations are defined:
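A minimal simulation of the sparse channel model (1)-(2), together with the full-update NLMS baseline discussed above, might look as follows. Tap positions, the noise level, and the step size here are illustrative assumptions, not values from the paper.

```python
import numpy as np

def sparse_channel_output(u, active, n, v_std=0.01, rng=None):
    """Sparsely active FIR channel of (1)-(2): only the taps listed in
    `active` (index -> value) are nonzero; v(k) is white measurement noise."""
    rng = rng or np.random.default_rng(0)
    theta = np.zeros(n)
    for t, val in active.items():
        theta[t] = val
    y = np.convolve(u, theta)[:len(u)] + v_std * rng.standard_normal(len(u))
    return theta, y

def nlms_estimate(u, y, n, mu=0.5, delta=1e-3):
    """Standard full-update NLMS: every coefficient adapted each sample."""
    theta_hat = np.zeros(n)
    for k in range(n - 1, len(u)):
        U = u[k - n + 1:k + 1][::-1]      # regressor [u(k), ..., u(k-n+1)]
        e = y[k] - theta_hat @ U
        theta_hat = theta_hat + mu * e * U / (U @ U + delta)
    return theta_hat
```

With a white input this baseline converges well; the slow convergence the paper targets appears once u(k) is highly correlated, as in the simulations later in the article.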
(a) N: affine projection order;
(b) n + 1: length of the adaptive channel estimator;
(c) U(k): excitation signal matrix of size (n + 1) × N, U(k) = [U(k), U(k − 1), . . . , U(k − (N − 1))], where U(k) = [u(k), u(k − 1), . . . , u(k − n)]T;
(d) U(k)T U(k): covariance matrix;
(e) Θ: the channel FIR tap weight vector, where Θ = [θ0, θ1, . . . , θn]T;
(f) θ̂(k): the adaptive estimator FIR tap weight vector at sample instant k, where θ̂(k) = [θ̂0(k), θ̂1(k), . . . , θ̂n(k)]T;
(g) θ̂(0): initial channel estimate, the all-zero vector;
(h) e(k): the channel estimation signal error vector of length N;
(i) ε(k): N-length normalized residual estimation error vector;
(j) y(k): system output;
(k) v(k): the additive system noise;
(l) δ: regularization parameter;
(m) μ: step size parameter.

The affine projection algorithm can be described by the following equations (see Figure 1).

The system output y(k) involves the channel impulse response to the excitation/input and the additive system noise v(k) and is given by (2).

The channel estimation signal error vector e(k) is calculated as

    e(k) = Y(k) − U(k)T θ̂(k − 1),    (4)

where Y(k) = [y(k), y(k − 1), . . . , y(k − N + 1)]T.

The normalized residual channel estimation error vector ε(k) is calculated in the following way:

    ε(k) = [U(k)T U(k) + δI]−1 e(k),    (5)

where I is the N × N identity matrix.

The APA channel estimate is updated in the following way:

    θ̂(k + 1) = θ̂(k) + μ U(k) ε(k).    (6)

A regularization term δ times the identity matrix is added to the covariance matrix within (5) to prevent the instability problem of creating a singular matrix inverse when [U(k)T U(k)] has eigenvalues close to zero. A well-behaved inverse will be provided if δ is large enough.

From the above equations, it is obvious that the relations (4), (5), (6) reduce to the standard NLMS algorithm if N = 1. Hence, the affine projection algorithm (APA) is a generalization of the NLMS algorithm.

2.3. Fast affine projection algorithm

The complexity of the APA is about 2(n + 1)N + 7N2, which is generally much larger than the complexity of the NLMS algorithm, 2(n + 1). Motivated by this, a fast version of the APA was derived in [2]. Here, instead of calculating the error vector from the whole covariance matrix, the FAP only calculates the first element of the N-element error vector, where an approximation is made for the second to the last components of the error vector e(k) as (1 − μ) times the previously computed error [12, 13]:

    e(k + 1) = [ e(k + 1) ; (1 − μ) ē(k) ],    (7)

where the (N − 1)-length ē(k) consists of the N − 1 upper elements of the vector e(k).

Note: (7) is an exact formula for the APA if and only if δ = 0.

The second complexity reduction is achieved by only adding a weighted version of the last column of U(k) to update the tap weight vector. Hence there are just (n + 1) multiplications as opposed to N × (n + 1) multiplications for the APA update of (6). Here, an alternate tap weight vector θ̂1(k) is introduced (the subscript 1 denotes the new calculation method):

    θ̂1(k + 1) = θ̂1(k) − μ U(k − N + 2) EN−1(k + 1),    (8)

where

    EN−1(k + 1) = Σj=0..N−1 εj(k − N + 2 + j) = εN−1(k + 1) + εN−2(k) + · · · + ε0(k − N + 2)    (9)

is the (N − 1)th element in the vector

    E(k + 1) = [ ε0(k + 1) ; ε1(k + 1) + ε0(k) ; . . . ; εN−1(k + 1) + εN−2(k) + · · · + ε0(k − N + 2) ].    (10)

Alternatively, E(k + 1) can be written as

    E(k + 1) = [ 0 ; Ē(k) ] + ε(k + 1),    (11)

where Ē(k) is an (N − 1)-length vector consisting of the uppermost N − 1 elements of E(k) and ε(k + 1) = [εN−1(k + 1), εN−2(k + 1), . . . , ε0(k + 1)]T as calculated via (5).

Hence, it can be shown that the relationship between the new update method and the old update method of the APA can be viewed as

    θ̂(k) = θ̂1(k) + μ Ū(k) Ē(k),    (12)

where Ū(k) consists of the N − 1 leftmost columns of U(k).
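The APA recursion (4)-(6) can be transcribed almost directly into NumPy. The sketch below performs a single update step; the helper name and parameter defaults are assumptions, and for N = 1 the update collapses to regularized NLMS, as noted above.

```python
import numpy as np

def apa_update(theta_hat, u_hist, y_hist, N, mu=0.5, delta=1e-3):
    """One affine projection update following (4)-(6).  u_hist holds the
    input samples up to time k (newest last); y_hist holds the N most
    recent outputs (newest last)."""
    n1 = len(theta_hat)                          # n + 1 taps
    # Excitation matrix U(k) with columns U(k), U(k-1), ..., U(k-N+1)
    U = np.column_stack([u_hist[len(u_hist) - n1 - j:len(u_hist) - j][::-1]
                         for j in range(N)])
    Y = y_hist[::-1]                             # [y(k), ..., y(k-N+1)]
    e = Y - U.T @ theta_hat                      # (4)
    eps = np.linalg.solve(U.T @ U + delta * np.eye(N), e)   # (5)
    return theta_hat + mu * U @ eps              # (6)
```

Run in a loop over streaming data, the regularized N × N solve is what the FAP recursions of the next subsection are designed to avoid recomputing in full.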
A new efficient method to calculate e(k) using θ̂1(k) rather than θ̂(k) is also derived:

    r̃xx(k + 1) = r̃xx(k) + u(k + 1) α(k + 1) − u(k − n) α(k − n),    (13)

where

    α(k + 1) = [u(k), u(k − 1), . . . , u(k − N + 2)]T,    (14)

    e1(k + 1) = y(k + 1) − U(k + 1)T θ̂1(k),    (15)

    e(k + 1) = e1(k + 1) − μ r̃xx(k + 1)T Ē(k).    (16)

(Further details can be found in [2].)

The following is a summary of the FAP algorithm:

(1) r̃xx(k + 1) = r̃xx(k) + u(k + 1) α(k + 1) − u(k − n) α(k − n),
(2) e1(k + 1) = y(k + 1) − U(k + 1)T θ̂1(k),
(3) e(k + 1) = e1(k + 1) − μ r̃xx(k + 1)T Ē(k),
(4) e(k + 1) = [ e(k + 1) ; (1 − μ) ē(k) ],
(5) ε(k + 1) = [U(k + 1)T U(k + 1) + δI]−1 e(k + 1),
(6) E(k + 1) = [ 0 ; Ē(k) ] + ε(k + 1).

However, the original least-squares-based detection criterion suffers from tap coupling problems when colored or correlated input signals are applied. In particular, the input correlation causes Xj(k) to depend not only on θj but also on the neighboring taps.

The following three modifications to the above activity detection criterion were proposed in [7, 8] for providing enhanced performance for applications involving nonwhite input signals.

Modification 1. Replace Xj(k) by

    Xj(k) = [ Σi=1..k (y(i) − ŷ(i) + θ̂j(i) u(i − j)) u(i − j) ]2 / Σi=1..k u2(i − j).    (19)

The additional term −ŷ(i) + θ̂j(i)u(i − j) in the numerator of Xj(k) is used to reduce the coupling between the neighboring taps [7, 8].

Modification 2. Replace T(k) by

    T(k) = (2 log(k) / k) Σi=1..k (y(i) − ŷ(i))2.    (20)
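The sliding-window correlation update in step (1) of the summary, that is, (13)-(14), can be sanity-checked against a from-scratch computation. The window length n + 1 and the lag range 1, ..., N − 1 are inferred from (13)-(14), and the function names below are hypothetical.

```python
import numpy as np

def sliding_corr_direct(u, k, n, N):
    """r_xx(k)[j] = sum over i = k-n .. k of u(i) u(i-1-j), for lags
    j = 0 .. N-2, computed from scratch over the n+1 sample window."""
    return np.array([sum(u[i] * u[i - 1 - j] for i in range(k - n, k + 1))
                     for j in range(N - 1)])

def sliding_corr_recursive(u, n, N, k_max):
    """The same quantity maintained with the O(N) update (13):
    r_xx(k+1) = r_xx(k) + u(k+1) alpha(k+1) - u(k-n) alpha(k-n)."""
    def alpha(k):                     # alpha(k) = [u(k-1), ..., u(k-N+1)], (14)
        return u[k - N + 1:k][::-1]
    k0 = n + N                        # first index with enough history
    r = sliding_corr_direct(u, k0, n, N)
    for k in range(k0, k_max):
        r = r + u[k + 1] * alpha(k + 1) - u[k - n] * alpha(k - n)
    return r
```

The recursion replaces an O(nN) window sum per sample with an O(N) correction, which is one of the two FAP cost reductions.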
Modification 3. Replace Xj(k) and T(k) by the exponentially windowed quantities

    Xwj(k) = [ Σi=1..k Wk(i) (y(i) − ŷ(i) + θ̂j(i) u(i − j)) u(i − j) ]2 / Σi=1..k Wk(i) u2(i − j),    (23)

    Tw(k) = (2 log Lw(k) / Lw(k)) Σi=1..k Wk(i) (y(i) − ŷ(i))2,    (24)

    Lw(k) = Σi=1..k Wk(i),    (25)

where Wk(i) is the exponential decay operator:

    Wk(i) = (1 − γ)k−i,  0 < γ ≪ 1;    (26)

(2) update the NLMS weight for each detected active tap index ta:

    θ̂ta(k + 1) = θ̂ta(k) + [ μ / ( Σta u2(k − ta) + ε ) ] u(k − ta) e(k),    (27)

where Σta denotes summation over all detected active-parameter indices;

(3) reset the NLMS weight to zero for each identified inactive tap index.

Note that (23)–(25) can be implemented in the following recursive form:

    Nj(k) = (1 − γ) Nj(k − 1) + (y(k) − ŷ(k) + θ̂j(k) u(k − j)) u(k − j),
    Dj(k) = (1 − γ) Dj(k − 1) + u2(k − j),
    q(k) = (1 − γ) q(k − 1) + (y(k) − ŷ(k))2,    (28)
    Lw(k) = (1 − γ) Lw(k − 1) + 1,

    Xwj(k) = Nj2(k) / Dj(k),
    Tw(k) = 2 q(k) log Lw(k) / Lw(k).    (29)

Note, as suggested in [8], that a threshold scaling constant η may be introduced on the right-hand side of (24) or (29). If η > 1, the system may avoid the incorrect detection of "nonactive" taps. This, however, may come with an initial delay in detecting the smallest of the active taps, leading to an initial additional error increase. If η < 1, it may improve the detectability of "weak" active taps. However, it has the risk of incorrectly including inactive taps within the active tap set, resulting in reduced convergence rates.

3.3. Proposed detection-guided FAP FIR channel estimator

The enhanced detection-guided FAP estimation is derived as follows.

The tap index j is detected as being a member of the active parameter set {ta}, a = 1, . . . , m, at sample instant k if

    XWj(k) > TW(k),    (30)

where

    XWj(k) = [ Σi=1..k Wk(i) (e1(i) + θ̂1j(i) u(i − j)) u(i − j) ]2 / Σi=1..k Wk(i) u2(i − j),    (31)

    Tw(k) = (2 log Lw(k) / Lw(k)) Σi=1..k Wk(i) e12(i),    (32)

    Lw(k) = Σi=1..k Wk(i),    (33)

and where Wk(i) is the exponential decay operator

    Wk(i) = (1 − γ)k−i,  0 < γ ≪ 1,    (34)

θ̂1j(i) is the jth element of θ̂1(i) as defined in (8), (11), and e1(i) is as defined in (15).

We propose to apply this active detection criterion to the fast affine projection algorithm. This involves creating an (n + 1) × (n + 1) diagonal activity matrix B(k), where the jth diagonal element Bj(k) = 1 if the jth tap index is detected as being active at sample instant k, and Bj(k) = 0 otherwise. This matrix is then applied within the FAP algorithm as follows.

Replace (5) with

    εd(k) = [ (B(k)U(k))T B(k)U(k) + δI ]−1 e(k).    (35)

Replace (11) with

    Ed(k) = [ 0 ; Ēd(k − 1) ] + εd(k).    (36)

Replace (8) with

    θ̂d(k) = B(k) θ̂d(k − 1) − μ B(k) U(k − N + 1) Ed,N−1(k),    (37)

where

    Ed,N−1(k) = Σj=0..N−1 εd,j(k − N + 1 + j)    (38)

and εd,j(k) is the jth element of εd(k).

As with the detection-guided NLMS algorithm, a threshold scaling constant η may be introduced on the right-hand side of (32) based on different conditions. The effectiveness of this scaling constant is considered in the simulations.

3.4. Computational complexity

The proposed system requires 4(n + 1) + 4 MPSI to perform the detection tasks required in the recursive equivalent of (30)–(33). By including the sparse diagonal matrix B(k) in (37), the system only needs m multiplications rather than (n + 1) multiplications for (15) and (8). Thus, the proposed detection-guided FAP channel estimator requires 2m + 7N2 + 5N + 4(n + 1) + 4 MPSI, while the complexity of the FAP is 2(n + 1) + 7N2 + 5N MPSI. Hence, for sufficiently long, low-dimensional active channels (n ≫ m ≥ 1, n ≫ N), the computational cost of the proposed detection-guided FAP channel estimator is essentially twice that of the FAP and of the standard NLMS estimators.
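The recursive detection statistics (28)-(29) and the resulting activity decisions that populate the diagonal of B(k) can be sketched as follows. The class interface is an illustrative assumption, and a converged tap estimate is supplied externally rather than produced by a running FAP estimator.

```python
import numpy as np

class ActivityDetector:
    """Recursive form (28)-(29) of the windowed activity test (30):
    tap j is flagged active when X_j(k) > eta * T(k)."""
    def __init__(self, n_taps, gamma=0.01, eta=1.0):
        self.g, self.eta = gamma, eta
        self.Nj = np.zeros(n_taps)
        self.Dj = np.full(n_taps, 1e-12)   # guard against division by zero
        self.q = 0.0
        self.Lw = 0.0

    def update(self, u_lags, resid, theta_hat):
        """u_lags[j] = u(k - j); resid = y(k) - y_hat(k)."""
        s = (resid + theta_hat * u_lags) * u_lags   # numerator term of (28)
        self.Nj = (1 - self.g) * self.Nj + s
        self.Dj = (1 - self.g) * self.Dj + u_lags ** 2
        self.q = (1 - self.g) * self.q + resid ** 2
        self.Lw = (1 - self.g) * self.Lw + 1.0
        X = self.Nj ** 2 / self.Dj                  # (29), first line
        T = 2.0 * self.q * np.log(self.Lw) / self.Lw
        return X > self.eta * T                     # diagonal of B(k)
```

During the first few samples the threshold is still warming up, so in practice one would ignore the mask for an initial transient before using it to gate the tap updates.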
Figure 2: Channel impulse response showing sparse structure: (a) is derived from the measured impulse response shown in (b) via the technique of the appendix. (Both panels plot amplitude versus tap index, 0–300.)

Simulation 2. Highly correlated input signal u(k) described by the model u(k) = w(k)/[1 − 0.9z−1], where w(k) is a discrete white Gaussian process with zero mean and unit variance.

1 The complexity is calculated based on the discussion in Section 3.4. The computational complexity of the active-parameter detection-guided FAP channel estimator with N = 10 is 1980 MPSI, which is slightly lower than the complexity of the standard FAP with N = 14 of 2044 MPSI.
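The Simulation 2 input is a first-order autoregressive process and can be generated with a one-line recursion (sample count and seed below are arbitrary):

```python
import numpy as np

def ar1_input(n_samples, pole=0.9, rng=None):
    """Simulation 2 input model: u(k) = w(k)/(1 - 0.9 z^-1), i.e.
    u(k) = 0.9 u(k-1) + w(k) with w(k) zero-mean unit-variance white."""
    rng = rng or np.random.default_rng(0)
    w = rng.standard_normal(n_samples)
    u = np.empty(n_samples)
    u[0] = w[0]
    for k in range(1, n_samples):
        u[k] = pole * u[k - 1] + w[k]
    return u
```

Its lag-1 correlation coefficient is 0.9 and its variance is 1/(1 − 0.9²) ≈ 5.26; this strong coloration is what slows plain NLMS and motivates the APA/FAP estimators compared in the results.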
Simulation 3. Tenth-order AR-modelled speech input signal.

Simulation 4. Tenth-order AR-modelled speech input signal under noisy conditions, that is, with a higher noise variance of 0.05.

In all four simulations, two detection-guided scaling constants were employed: η = 1 (i.e., no scaling) and η = 4.

5. RESULTS AND ANALYSIS

Simulation 1 (lowly correlated input signal case). The results of the simulations for channel estimators (a) to (g) with μ = 0.005 are shown in Figure 3.

(a) Channel estimators (b) to (f) show faster convergence than the standard NLMS channel estimator (a).
(b) The detection-guided NLMS estimator (b) provides a faster convergence rate than the APA channel estimator (c) with N = 10 and the FAP channel estimator (d) with N = 10. It is clear that the APA channel estimator (c) with N = 10 and the FAP channel estimator (d) with N = 10 still have not reached steady state at the 20000-sample mark.
(c) The detection-guided FAP channel estimators with N = 10, (e) and (f), show a better convergence rate than channel estimators (b), (c), and (d).
(d) Detection-guided FAP estimator (e) and detection-guided FAP estimator with threshold scaling constant η = 4 (f) can both detect all the active taps and have almost the same performance.
(e) With almost the same computational cost, detection-guided FAP estimator (e) significantly outperforms the standard FAP estimator with N = 14 in terms of convergence rate.

Simulation 2 (highly correlated input signal case). The results of the simulations for channel estimators (a) to (g) with μ = 0.005 are shown in Figure 4.

(a) The active-parameter detection-guided NLMS channel estimator (b) does not provide suitably improved convergence speed over the standard NLMS channel estimator (a). This is due to the incorrect detection of many of the inactive taps with the highly correlated input signals.
(b) The APA channel estimator with N = 10 (c) and the FAP channel estimator with N = 10 (d) show significantly improved convergence over (a) and (b). This is due to the autocorrelation matrix inverse [U(k)T U(k) + δI]−1 in (5) essentially prewhitening the highly colored input signal.
(c) The detection-guided FAP channel estimators with N = 10, (e) and (f), show better convergence rates than the standard APA channel estimator with N = 10 (c) and the standard FAP channel estimator with N = 10 (d). In addition, the detection-guided FAP estimators (e), (f) appear to provide better steady-state error performance.
(d) The detection-guided FAP channel estimator (e) without threshold scaling detects extra "nonactive" taps. In the simulation, it detects 32 active taps, which are 21 in excess of the true number. This leads to a slower convergence rate. In comparison, the detection-guided FAP channel estimator (f) with threshold scaling η = 4 shows the ability to detect the correct number of active taps; however, this comes with a relative initial error increase.
(e) The detection-guided FAP channel estimator (e) with N = 10 provides noticeably better performance than the standard FAP channel estimator (d) with N = 14 in terms of both the convergence rate and the steady-state error.

Simulation 3 (highly correlated speech input signal case). The results of the simulations for channel estimators (a) to (g) with μ = 0.005 are shown in Figure 5. The trends shown here are similar to those of Simulations 1 and 2, although here the convergence rate and steady-state benefits provided by detection guiding are further accentuated.

(a) When the speech input signal is applied, the active-parameter detection-guided NLMS channel estimator (b) suffers from very slow convergence, similar to that of the standard NLMS channel estimator (a). This is due to the incorrect detection of many of the inactive taps.
(b) The detection-guided FAP channel estimators (e) and (f) significantly outperform channel estimators (c) and (d) in terms of convergence speed. The results also indicate that the newly proposed detection-guided FAP estimators may have better steady-state error performance than the standard APA and FAP estimators.
(c) For detection-guided FAP estimator (e) and detection-guided FAP estimator with threshold scaling constant η = 4 (f), the trends are similar to those observed for Simulation 2: estimator (e) detects 23 extra active taps, resulting in a reduced convergence rate, and there is an initial error increase occurring in estimator (f).
(d) Again, with the same computational cost, the detection-guided FAP channel estimator (e) with N = 10 shows a faster convergence rate and reduced steady-state error relative to the standard FAP channel estimator (d) with N = 14.

Simulation 4 (highly correlated speech input signal case with higher noise variance). The results of the simulations for channel estimators (a) to (g) with μ = 0.005 are shown in Figure 6, which confirm the similar good performance of our newly proposed channel estimator under noisy conditions. The detection-guided FAP estimator with threshold scaling constant η = 4 (f) performs noticeably better than the detection-guided FAP estimator without threshold scaling (e) due to its ability to detect the correct number of active taps.
(Figures 3, 4, and 5, panels (a)–(g): channel estimation error versus sample time (×10^4); plots not reproduced here.)

Figure 6: Comparison of convergence rates for speech input signal under noisy conditions.
6. CONCLUSION

For many adaptive estimation applications, such as acoustic echo cancellation within teleconferencing systems, the input signal is speech or highly correlated. In such applications, the standard NLMS channel estimator suffers from extremely slow convergence. To remove this weakness, the affine projection algorithm (APA) or the related computationally efficient fast affine projection (FAP) algorithm is commonly employed instead of the NLMS algorithm. Since the signal propagation channels in such applications sometimes have low-dimensional or sparsely active impulse responses, we considered the incorporation of active-parameter detection with the FAP channel estimator. This newly proposed detection-guided FAP channel estimator is characterized by improved convergence speed and perhaps also better steady-state error performance as compared to the standard FAP estimator. Similarly good performance is also achieved under noisy conditions. Additionally, simulations confirm these advantages of the proposed channel estimator under essentially the same computational cost. These features make this newly proposed channel estimator a good candidate for adaptive estimation speech applications such as the acoustic echo cancellation problem.

APPENDICES

A. SPARSE CHANNEL IMPULSE RESPONSE ESTIMATION: REMOVING MEASUREMENT NOISE EFFECTS

In this appendix, a procedure for removing the measurement noise effect from the estimated time domain channel impulse response is presented. This procedure may be viewed as an offline scheme for active-tap detection of sparse channels and assumes that the true impulse response has a sufficiently large number of zero taps. Its applicability is restricted to channels which have a sparse structure.

In general, the presence of measurement noise or disturbance causes the tap coefficient estimate of each of the zero taps of the sparse channel to be nonzero. If we assume the estimate was obtained with a white input, then the discussion of Section 3 (more details can be found in [15]) suggests that asymptotically (at least for LS, LMS estimates) the zero-tap estimates have a zero-mean i.i.d. Gaussian distribution:

    θ̂i ∼ N(0, σ2), i.i.d., where θi = 0.    (A.1)

Under the validity of (A.1), we use the following result from the work of Donoho cited in [15] to develop a procedure for removing the effects of the noise or, equivalently, for determining which taps are zero.

B. RESULT

Let {θ̂i} ∼ N(0, σ2), i.i.d. Define the event AM = {sup i≤M |zi| ≤ σ √(2 log M)}. Then Prob(AM) → 1 as M → ∞.

A priori knowledge of the indices i of the zero taps is required in order to use the threshold σ √(2 log M) to determine which taps are zero. By applying the following iterative procedure, this requirement is avoided for sparse channels.

Algorithm 1. (1) Initially, include the indices of all n tap estimates {θ̂i} in the set S of zero taps and set M = n.
(2) Determine the rms value σS of the estimates of the taps in set S.
(3) Determine the indices i of those taps for which the estimated coefficients satisfy

    |θ̂i| ≤ σS √(2 log M).    (B.1)

(4) Repeat steps (2) and (3) a given number of times or, alternatively, until the difference in σS from one iteration to the next has decreased to a given value.

ACKNOWLEDGMENT

The authors would like to acknowledge CSIRO Radiophysics, Sydney, for providing the measurement data of the simulation channel.

REFERENCES

[1] K. Ozeki and T. Umeda, "An adaptive filtering algorithm using an orthogonal projection to an affine subspace and its properties," Electronics & Communications in Japan, vol. 67, no. 5, pp. 19–27, 1984.
[2] S. L. Gay and S. Tavathia, "The fast affine projection algorithm," in Proceedings of the 20th International Conference on Acoustics, Speech, and Signal Processing (ICASSP '95), vol. 5, pp. 3023–3026, Detroit, Mich, USA, May 1995.
[3] J. R. Casar-Corredera and J. Alcazar-Fernandez, "An acoustic echo canceller for teleconference systems," in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '86), vol. 11, pp. 1317–1320, Tokyo, Japan, April 1986.
[4] A. Gilloire and J. Zurcher, "Achieving the control of the acoustic echo in audio terminals," in Proceedings of European Signal Processing Conference (EUSIPCO '88), pp. 491–494, Grenoble, France, September 1988.
[5] S. Makino and S. Shimada, "Echo control in telecommunications," Journal of the Acoustic Society of Japan, vol. 11, no. 6, pp. 309–316, 1990.
[6] J. Homer, I. Mareels, R. R. Bitmead, B. Wahlberg, and A. Gustafsson, "LMS estimation via structural detection," IEEE Transactions on Signal Processing, vol. 46, no. 10, pp. 2651–2663, 1998.
[7] J. Homer, "Detection guided NLMS estimation of sparsely parametrized channels," IEEE Transactions on Circuits and Systems II, vol. 47, no. 12, pp. 1437–1442, 2000.
[8] J. Homer, I. Mareels, and C. Hoang, "Enhanced detection-guided NLMS estimation of sparse FIR-modeled signal channels," IEEE Transactions on Circuits and Systems I, vol. 53, no. 8, pp. 1783–1791, 2006.
[9] S. Haykin, Adaptive Filter Theory, Prentice Hall Information and System Science Series, Prentice-Hall, Upper Saddle River, NJ, USA, 3rd edition, 1996.
[10] M. Bouchard, "Multichannel affine and fast affine projection algorithms for active noise control and acoustic equalization systems," IEEE Transactions on Speech and Audio Processing, vol. 11, no. 1, pp. 54–60, 2003.
Yan Wu Jennifer et al. 13
Research Article
Efficient Multichannel NLMS Implementation for
Acoustic Echo Cancellation
An acoustic echo cancellation structure with a single loudspeaker and multiple microphones is, from a system identification perspective, generally modelled as a single-input multiple-output system. Such a system thus implies a specific echo-path model (adaptive filter) for every loudspeaker-to-microphone path. Due to the large filter dimensionality required to model rooms with typical reverberation times, the adaptation process can be computationally demanding. This paper presents a selective updating normalized least mean square (NLMS)-based method which reduces complexity to nearly half in practical situations, while showing superior convergence speed compared to conventional complexity reduction schemes. Moreover, the method concentrates the filter adaptation on the filter that is most misadjusted, which is a typically desired feature.
Copyright © 2007 Fredric Lindstrom et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited.
1. INTRODUCTION

Acoustic echo cancellation (AEC) [1, 2] is used in teleconferencing equipment in order to provide high quality full-duplex communication. The core of an AEC solution is an adaptive filter which estimates the impulse response of the loudspeaker enclosure microphone (LEM) system. Typical adaptive algorithms for the filter update procedure in the AEC are the least mean square, normalized least mean square (LMS, NLMS) [3], affine projection (AP), and recursive least squares (RLS) algorithms [4]. Of these, the NLMS-based algorithms are popular in industrial implementations, thanks to their low complexity and finite precision robustness.

Multimicrophone solutions are frequent in teleconferencing equipment targeted for larger conference rooms. This paper considers a system consisting of one loudspeaker and three microphones. The base unit of the system contains the loudspeaker and one microphone, and it is connected to two auxiliary expansion microphones, as shown in Figure 1. Such a multimicrophone system constitutes a single-input multiple-output (SIMO) multichannel system with several system impulse responses to be identified, Figure 2. Thus, the signal processing task can be quite computationally demanding. Several methods for computational complexity reduction of the LMS/NLMS algorithms have been proposed and analyzed, for example, [5–14]. In this paper a related low complexity algorithm for use in a multimicrophone system is proposed.

2. COMPLEXITY REDUCTION METHODS

The LEM system can be modelled as a time invariant linear system, h(k) = [h_0(k), ..., h_{N−1}(k)]^T, where N − 1 is the order of the finite impulse response (FIR) model [11] and k is the sample index. Thus, the desired (acoustic echo) signal d(k) is given by d(k) = h(k)^T x(k), where x(k) = [x(k), ..., x(k − N + 1)]^T and x(k) is the input (loudspeaker) signal. The measured (microphone) signal y(k) is obtained as y(k) = d(k) + n(k), where n(k) is near-end noise. Assuming an adaptive filter ĥ(k) of length N is used, that is, ĥ(k) = [ĥ_0(k), ..., ĥ_{N−1}(k)]^T, the NLMS algorithm is given by

e(k) = y(k) − d̂(k) = y(k) − x(k)^T ĥ(k),  (1)

β(k) = μ / (‖x(k)‖² + ε),
ĥ(k + 1) = ĥ(k) + β(k) e(k) x(k).  (2)
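The recursion in (1)-(2) can be transcribed directly. The following minimal single-channel sketch is illustrative only: the step size mu, the regularization eps, and the short toy echo path h are assumed values, not taken from the paper.

```python
# Minimal single-channel NLMS sketch of equations (1)-(2).
def nlms_step(h_hat, x_vec, y, mu=0.5, eps=1e-6):
    """One NLMS iteration: returns the updated filter and the error e(k)."""
    d_hat = sum(hi * xi for hi, xi in zip(h_hat, x_vec))   # x(k)^T h_hat(k)
    e = y - d_hat                                          # equation (1)
    beta = mu / (sum(xi * xi for xi in x_vec) + eps)       # equation (2)
    return [hi + beta * e * xi for hi, xi in zip(h_hat, x_vec)], e

# Identify a short toy "echo path" h from noise-free input/output samples.
h = [0.5, -0.3, 0.1]
x = [1.0, -0.4, 0.9, 0.2, -0.7, 1.1, 0.3, -0.5, 0.8, 0.6] * 30
h_hat = [0.0, 0.0, 0.0]
for k in range(len(h), len(x)):
    x_vec = [x[k - i] for i in range(len(h))]              # [x(k), ..., x(k-N+1)]
    y = sum(hi * xi for hi, xi in zip(h, x_vec))           # noise-free d(k)
    h_hat, e = nlms_step(h_hat, x_vec, y)
```

With a persistently exciting input and 0 < mu < 2, the estimate ĥ converges toward h; here the noise-free toy setup converges essentially exactly.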
For clarity of presentation, the sample index is omitted, that is, l_max = l_max(k) and m_max = m_max(k).

The filter corresponding to the row index m_max, that is, the filter ĥ_{m_max}(k), is then updated with

ĥ_{m_max}(k + 1) = ĥ_{m_max}(k) + μ e_max(k) x(k − l_max + 1) / (‖x(k − l_max + 1)‖² + ε).  (7)

This filter update of filter ĥ_{m_max}(k) will make the error elements E(l, m_max, k), l = 1, ..., L obsolete, since these are errors generated by ĥ_{m_max}(k) prior to the update. Consequently, to avoid future erroneous updates, these elements should be set to 0, that is, set

E(l, m_max, k) = 0 for l = 1, ..., L.  (8)

An advantage over periodic NLMS is that the proposed structure does not limit the update to be based on the current input vector x(k), but allows updating based on previous input vectors as well, since the errors not yet used for an update are stored in E(k). Further, largest output-error updating will concentrate the updates on the corresponding filter. This is normally a desired feature in an acoustic echo cancellation environment with multiple microphones. For example, consider the setup in Figure 1 with all adaptive filters fairly converged. If one of the microphones is then dislocated, this results in an echo-path change for the corresponding adaptive filter. Naturally, it is desired to concentrate all updates on this filter.

4. ANALYSIS

In the previously described scenario, where several input vectors are available but only one of them can be used for adaptive filter updating (due to complexity issues), it might seem intuitive to update with the input vector corresponding to the largest output error magnitude. In this section, it is shown analytically that, under certain assumptions, choosing the largest error maximizes the reduction in mean-square deviation.

The error deviation vector for the mth filter is defined as v_m(k) = h_m(k) − ĥ_m(k), and the mean-square deviation as D_m(k) = E{‖v_m(k)‖²}, where E{·} denotes expectation [4]. Assume that no near-end sound is present, n(k) = 0, and no regularization is used, ε = 0, and that the errors available for updating filter m are e_m(k − l_m) with l_m = 0, ..., L_m and L_m < L, that is, the available errors in matrix E(k) that correspond to filter m. Updating filter m using error e_m(k − l_m) gives

‖v_m(k + 1)‖² = ‖v_m(k) − β(k) e_m(k − l_m) x(k − l_m)‖²  (9)

and by using

e_m(k − l_m) = x(k − l_m)^T v_m(k) = v_m(k)^T x(k − l_m)  (10)

in (9), the following is obtained:

‖v_m(k + 1)‖² = v_m(k)^T v_m(k) − (2μ − μ²) e_m²(k − l_m) / ‖x(k − l_m)‖².  (11)

Thus, the difference in mean-square deviation from one sample to the next is given by

D_m(k + 1) − D_m(k) = −(2μ − μ²) E{e_m²(k − l_m) / ‖x(k − l_m)‖²},  (12)

which corresponds to a reduction under the assumption that 0 < μ < 2.

Further, assuming small fluctuations in the input energy ‖x(k)‖² from one iteration to the next, that is, assuming

‖x(k)‖² = ‖x(k − 1)‖² = · · · = ‖x(k − L_m + 1)‖²,  (13)

gives [4]

D_m(k + 1) − D_m(k) = −(2μ − μ²) E{e_m²(k − l_m)} / E{‖x(k)‖²}.  (14)

The total reduction r(k) in deviation, considering all M filters, is thus

r(k) = Σ_{m=1}^{M} [D_m(k + 1) − D_m(k)].  (15)

Only one filter is updated at each time instant. Assume error E(l, m, k) is chosen for the update. Then r(k) is given by

r(k) = −(2μ − μ²) E{E²(l, m, k)} / E{‖x(k)‖²}.  (16)

From (16), it can be seen that the reduction is maximized if the largest-magnitude error e_max(k) is chosen for the update, that is, as done in the proposed algorithm.

The proposed algorithm can be seen as a version of the periodic NLMS. Analyses of convergence, stability, and robustness for this branch of (N)LMS algorithms are provided in, for example, [5, 15].

5. COMPLEXITY AND IMPLEMENTATION

The algorithm proposed in this paper is aimed at implementation in a general digital signal processor (DSP), typically allowing multiply, add, and accumulate arithmetic operations to be performed in parallel with memory reads and/or writes (e.g., [16]). In such a processor, the filtering operation can be achieved in N instructions and the NLMS update will require 2N instructions. Both the filtering and the update require two memory reads, one addition, and one multiplication per coefficient, which can be performed by the DSP in one instruction. However, the result from the filter update is not accumulated but needs to be written back to memory. Hence the need for two instructions per coefficient for the update operation.
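The selection rule analyzed above can be sketched as follows: scan the stored errors, update only the filter holding the largest-magnitude error, then invalidate that filter's stored errors as in (8). The flat error/input stores and all helper names below are illustrative stand-ins, not taken from the paper:

```python
# Sketch: update only the filter with the largest-magnitude stored error.
def select_and_update(filters, error_store, input_store, mu=0.5, eps=1e-6):
    """error_store[m][l] holds e_m(k - l); input_store[l] the matching input vector."""
    # find (filter index m, lag l) of the largest-magnitude stored error
    m, l = max(
        ((mi, li) for mi in range(len(error_store))
         for li in range(len(error_store[mi]))),
        key=lambda ml: abs(error_store[ml[0]][ml[1]]),
    )
    e = error_store[m][l]
    x_vec = input_store[l]                        # input that produced the chosen error
    beta = mu / (sum(xi * xi for xi in x_vec) + eps)
    filters[m] = [hi + beta * e * xi for hi, xi in zip(filters[m], x_vec)]
    error_store[m] = [0.0] * len(error_store[m])  # equation (8): invalidate old errors
    return m

filters = [[0.0, 0.0], [0.0, 0.0]]
error_store = [[0.1, -0.2], [0.05, 0.9]]          # filter 2 is most misadjusted
input_store = [[1.0, 0.5], [0.3, -0.4]]
updated = select_and_update(filters, error_store, input_store)
```

Here the second filter holds the largest error (0.9), so only that filter is updated and its stale errors are zeroed, mirroring the behavior that concentrates adaptation on the most misadjusted filter.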
(M + 2) / (3M),  (19)

which for M = 3 gives a complexity reduction of nearly a half (5/9). For higher values of M, the reduction is even larger. Further reduction in complexity can also be achieved if updates are performed, say, every other or every third sample.

6. SIMULATIONS

The performance of the proposed method was evaluated through simulations with speech as input signal. Three impulse responses (h_1, h_2, and h_3), shown in Figure 3, all of length N = 1800, were measured with three microphones, according to the constellation in Figure 1, in a normal office. The acoustic coupling between the loudspeaker and the closest microphone, AC1, was manually normalized to 0 dB, and the couplings between the loudspeaker and the second and third microphones, AC2 and AC3, were then estimated to −6 dB and −7 dB, respectively. Thus, 10 log_10(‖h_2‖²/‖h_1‖²) = −6 dB and 10 log_10(‖h_3‖²/‖h_1‖²) = −7 dB.

Output signals y_1(k), y_2(k), and y_3(k) were obtained by filtering the input signal x(k) with the three obtained impulse responses and adding noise,

y_1(k) = x(k)^T h_1 + n_1(k),
y_2(k) = x(k)^T h_2 + n_2(k),  (20)
y_3(k) = x(k)^T h_3 + n_3(k).

The noise sources n_1(k), n_2(k), and n_3(k) were independent, but had the same characteristics (bandlimited flat spectrum). The echo-to-noise ratio was approximately 40 dB for microphone 1, and 34 dB and 33 dB for microphones 2 and 3, respectively.

In the simulations, four low-complexity methods of similar complexity were compared: the periodic (N)LMS [5]; the random NLMS (similar to SPU-LMS [10]), which selects the filter to be updated in a stochastic manner (with all filters having equal probability of an update); M-Max NLMS [6]; and the proposed NLMS. The performance of the full-update NLMS is also shown for comparison. The periodic NLMS, random NLMS, and the proposed method limit the updates to one whole filter at each time interval, while M-Max NLMS instead updates all filters but only does so for a subset (1/3 in this case) of all coefficients. However, since M-Max NLMS requires sorting of the input vectors, the complexity for this method is somewhat larger (2 log_2 N + 2 comparisons and (N − 1)/2 memory transfers [9]). Zero initial coefficients were used for all filters and methods. The result is presented in Figure 4, where the normalized filter mismatch is calculated as

10 log_10(‖h_m − ĥ_m(k)‖² / ‖h_m‖²),  m = 1, 2, 3,  (21)

for the three individual filters and solutions. Of the four variants with similar complexity, the proposed method is clearly superior to the conventional periodic
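The mismatch measure (21) used in the figures is straightforward to transcribe; the following small sketch is illustrative, with plain lists standing in for the true and estimated responses:

```python
import math

# Normalized filter mismatch in dB, a direct transcription of (21).
def mismatch_db(h, h_hat):
    num = sum((a - b) ** 2 for a, b in zip(h, h_hat))
    den = sum(a * a for a in h)
    return 10.0 * math.log10(num / den)

# Zero-initialized filters start at 0 dB mismatch; the value becomes
# more negative as the estimate improves.
start = mismatch_db([0.5, -0.3], [0.0, 0.0])
better = mismatch_db([0.5, -0.3], [0.45, -0.28])
```

For a zero-initialized filter the numerator equals the denominator, so the curves in Figure 4 all start at 0 dB, as in the paper's setup.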
Figure 4: Mismatch for the evaluated methods (mismatch in dB versus time in seconds, for filters 1–3; curves: NLMS updated every sample, periodic NLMS, random NLMS, M-Max NLMS, and the proposed NLMS updating scheme).

Figure 5: Mismatch for the evaluated methods, where an echo-path change occurs for filter 2 after 55 seconds.
NLMS and also to the random NLMS. The performance of the M-Max NLMS and the proposed solution is comparable, although the proposed solution performs better than or equal to it for all filters.

The algorithm automatically concentrates computational resources on filters with large error signals. This is demonstrated in Figure 5, where filter 2 undergoes an echo-path change, that is, a dislocation of the microphone. In Figure 5, it can be seen that the proposed algorithm basically follows the curve of the full-update NLMS immediately after the echo-path change.

If one specific microphone is subject to an extreme acoustic situation, for example, if it is placed in another room or immediately next to a strong noise source, there is a risk of “getting stuck,” that is, the corresponding filter has a large output error for all input vectors and thus is updated all
the time. This problem can be reduced by setting a limit on the lowest rate of updates for a filter, that is, if filter m has not been updated for the last U samples, it is forced to update in the next iteration. However, this does not resolve the issue optimally. A more sophisticated method is to monitor the echo reduction of the filters and bypass, or reduce, the resources allocated to filters not providing significant error reduction. Implementing these extra functions will of course add complexity.

7. CONCLUSIONS

In an acoustic multichannel solution with multiple adaptive filters, the computation power required to update all filters every sample can be vast. This paper has presented a solution which updates only one filter every sample and thus significantly reduces the complexity, while still performing well in terms of convergence speed. The solution also handles echo-path changes well, since the most misadjusted filter gets the most computation power, which is often a desirable feature in practice.

ACKNOWLEDGMENT

REFERENCES

[1] E. Hänsler and G. Schmidt, Acoustic Echo and Noise Control: A Practical Approach, John Wiley & Sons, New York, NY, USA, 2004.
[2] M. M. Sondhi, “An adaptive echo canceler,” Bell System Technical Journal, vol. 46, no. 3, pp. 497–510, 1967.
[3] B. Widrow and S. D. Stearns, Adaptive Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, USA, 1985.
[4] S. Haykin, Adaptive Filter Theory, Prentice-Hall, Englewood Cliffs, NJ, USA, 4th edition, 2002.
[5] S. C. Douglas, “Adaptive filters employing partial updates,” IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 44, no. 3, pp. 209–216, 1997.
[6] T. Aboulnasr and K. Mayyas, “Complexity reduction of the NLMS algorithm via selective coefficient update,” IEEE Transactions on Signal Processing, vol. 47, no. 5, pp. 1421–1424, 1999.
[7] P. A. Naylor and W. Sherliker, “A short-sort M-Max NLMS partial-update adaptive filter with applications to echo cancellation,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’03), vol. 5, pp. 373–376, Hong Kong, April 2003.
[8] K. Doğançay and O. Tanrikulu, “Adaptive filtering algorithms with selective partial updates,” IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 48, no. 8, pp. 762–769, 2001.
[9] T. Schertler, “Selective block update of NLMS type algorithms,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’98), vol. 3, pp. 1717–1720, Seattle, Wash, USA, May 1998.
[10] M. Godavarti and A. O. Hero III, “Partial update LMS algorithms,” IEEE Transactions on Signal Processing, vol. 53, no. 7, pp. 2382–2399, 2005.
[11] E. Hänsler and G. Schmidt, “Single-channel acoustic echo cancellation,” in Adaptive Signal Processing, J. Benesty and Y. Huang, Eds., Springer, New York, NY, USA, 2003.
[12] S. M. Kuo and J. Chen, “Multiple-microphone acoustic echo cancellation system with the partial adaptive process,” Digital Signal Processing, vol. 3, no. 1, pp. 54–63, 1993.
[13] S. Gollamudi, S. Kapoor, S. Nagaraj, and Y.-F. Huang, “Set-membership adaptive equalization and an updator-shared implementation for multiple channel communications systems,” IEEE Transactions on Signal Processing, vol. 46, no. 9, pp. 2372–2385, 1998.
[14] S. Werner, J. A. Apolinário Jr., M. L. R. de Campos, and P. S. R. Diniz, “Low-complexity constrained affine-projection algorithms,” IEEE Transactions on Signal Processing, vol. 53, no. 12, pp. 4545–4555, 2005.
[15] W. A. Gardner, “Learning characteristics of stochastic-gradient-descent algorithms: a general study, analysis, and critique,” Signal Processing, vol. 6, no. 2, pp. 113–133, 1984.
[16] ADSP-BF533 Blackfin Processor Hardware Reference, Analog Devices, Norwood, Mass, USA, 2005.
Hindawi Publishing Corporation
EURASIP Journal on Audio, Speech, and Music Processing
Volume 2007, Article ID 92528, 11 pages
doi:10.1155/2007/92528
Research Article
Time-Domain Convolutive Blind Source Separation
Employing Selective-Tap Adaptive Algorithms
School of Information Technology and Engineering, University of Ottawa, Ottawa, ON, Canada K1N 6N5
We investigate novel algorithms to improve the convergence and reduce the complexity of time-domain convolutive blind source separation (BSS) algorithms. First, we propose an MMax partial update time-domain convolutive BSS (MMax BSS) algorithm. We demonstrate that the partial update scheme applied in the MMax LMS algorithm for a single channel can be extended to multichannel time-domain convolutive BSS with little deterioration in performance and a possible computational complexity saving. Next, we propose an exclusive maximum selective-tap time-domain convolutive BSS algorithm (XM BSS) that reduces the interchannel coherence of the tap-input vectors and improves the conditioning of the autocorrelation matrix, resulting in an improved convergence rate and reduced misalignment. Moreover, the computational complexity is reduced since only half of the tap inputs are selected for updating. Simulation results have shown a significant improvement in convergence rate compared to existing techniques.
Copyright © 2007 Q. Pan and T. Aboulnasr. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited.
Figure 2: Structure of the convolutive blind source separation system.

The Kullback-Leibler divergence between the joint density p(y) and the product of the marginal densities is

D(p(y) ‖ q(y)) = ∫ p(y) log [ p(y) / ∏_{i=1}^{N} p_i(y_i) ] dy,  (3)

where p(y) is the joint probability density of the output signals, p_i(y_i) is the probability density of output signal y_i, and q(y) = ∏_{i=1}^{N} p_i(y_i):

D(p(y) ‖ q(y)) = ∫ p(y) log p(y) dy − ∫ p(y) log ∏_{i=1}^{N} p_i(y_i) dy
= −H(y) + Σ_{i=1}^{N} H_i(y_i)
= −H(x) − log det(W) − Σ_{i=1}^{N} E{log p_i(y_i)},  (4)

where H(·) is the entropy operation.

Using the standard gradient,

W(k + 1) = W(k) + ΔW,
ΔW_standard grad = −μ ∂D/∂W = μ (W^{−T} − E{ϕ(y) x^T}).  (6)

However, BSS algorithms have traditionally used the natural gradient [4], which is acknowledged as having better performance. In this case, ΔW is given by

ΔW_natural grad = −μ (∂D/∂W) W^T W = μ (I − E{ϕ(y) y^T}) W.  (7)

2.2. Convolutive BSS algorithm

The convolutive BSS model is illustrated in Figure 2. N source signals {s_i(k)}, 1 ≤ i ≤ N, pass through an unknown N-input, M-output linear time-invariant mixing system to yield the M mixed signals {x_j(k)}. All source signals s_i(k) are assumed to be statistically independent.

Defining the vectors s(k) = [s_1(k) · · · s_N(k)]^T and x(k) = [x_1(k) · · · x_M(k)]^T, the mixing system can be represented as

[x_1(k), ..., x_M(k)]^T = [h_11(l) · · · h_1N(l); ...; h_M1(l) · · · h_MN(l)] ∗ [s_1(k), ..., s_N(k)]^T,  (8)

where ∗ is the convolution operation. The jth sensor signal can be obtained as

x_j(k) = Σ_{i=1}^{N} Σ_{l=0}^{L−1} h_ji(l) s_i(k − l),  (9)

where h_ji(l) is the impulse response from source i to sensor j, and L defines the order of the FIR filters used to model this impulse response.

The task of the convolutive BSS algorithm is to obtain an unmixing system such that the outputs of this system, y(k) = [y_1(k) · · · y_N(k)]^T, become mutually independent as estimates of the N source signals. The separation system typically consists of a set of FIR filters w_ij(k) of length Q each. The unmixing system can also be represented as

[y_1(k), ..., y_N(k)]^T = [w_11(l) · · · w_1M(l); ...; w_N1(l) · · · w_NM(l)] ∗ [x_1(k), ..., x_M(k)]^T,  (10)

where W is the unmixing matrix with FIR filters as its components.

This approach is the natural extension and achieves good separation results once the algorithm converges. However, time-domain convolutive blind source separation suffers from high computational complexity and low convergence rate, especially for systems with long FIR filters.

Convolutive BSS can also be performed in the frequency domain by using the short-time Fourier transform. This method is very popular for convolutive mixtures and is based on transforming the convolutive blind source separation problem into an instantaneous BSS problem at every frequency bin.
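The mixing model (9) can be sketched as a direct triple loop over sensors, sources, and filter taps. The 2 × 2 toy responses below are illustrative only:

```python
# Convolutive mixing per (9): each sensor sums source signals filtered
# by FIR responses h[j][i] (from source i to sensor j).
def mix(sources, h):
    """sources: N lists of samples; h[j][i]: FIR taps from source i to sensor j."""
    N, M = len(sources), len(h)
    K = len(sources[0])
    x = [[0.0] * K for _ in range(M)]
    for j in range(M):
        for i in range(N):
            for k in range(K):
                for l, tap in enumerate(h[j][i]):
                    if k - l >= 0:
                        x[j][k] += tap * sources[i][k - l]   # h_ji(l) s_i(k - l)
    return x

s = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]             # unit impulses
h = [[[0.5, 0.2], [0.1, 0.0]],                               # sensor 1
     [[0.0, 0.3], [0.7, 0.1]]]                               # sensor 2
x = mix(s, h)
```

Feeding unit impulses through the mixer simply lays the impulse responses (delayed for the second source) onto each sensor, which makes the convolution sum easy to check by hand.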
The basic idea of partial update adaptive filtering is to allow for the use of filters with a number of coefficients L large enough to model the unknown system while reducing the overall complexity by updating only M coefficients at a time. This results in considerable savings for M ≪ L. Invariably, there are penalties for this partial update, the most obvious of which is a reduced convergence rate. The question then becomes which coefficients should be updated and how the impact of the partial update on the overall filter performance can be minimized. In this section, we review the MMax partial update adaptive algorithm for linear filters [10] since it forms the basis of our proposed MMax time-domain convolutive BSS algorithm.

Consider a standard adaptive filter set-up where x(n) is the input, y(n) is the output, and d(n) is the desired output, all at instant n. The output error e(n) is given by

e(n) = d(n) − y(n) = d(n) − w^T(n) x(n),  (13)

where w(n) is the L × 1 column vector of the filter coefficients and x(n) is the corresponding L × 1 input vector.

4. EXCLUSIVE MAXIMUM SELECTIVE-TAP ADAPTIVE ALGORITHM

Recently, an exclusive maximum (XM) partial update algorithm was proposed in [11] to deal with stereophonic echo cancellation. The XM algorithm was motivated by the MMax partial update scheme [10], as both select a subset of coefficients for updating in every adaptive iteration. However, in the XM partial update, the goal is not to reduce computational complexity. Rather, the exclusive maximum tap-selection strategy was proposed to reduce interchannel coherence in a two-channel stereo system and improve the conditioning of the input vector autocorrelation matrix. We now review the algorithm in [11] since it forms the basis of our proposed XM time-domain convolutive BSS algorithm.

In a stereophonic acoustic environment, the stereophonic signals x_1(n) and x_2(n) are transmitted to loudspeakers in the receiving room and coupled to the microphones in this room by the room impulse responses. In stereophonic acoustic echo cancellation, these coupled acoustic echoes have to be cancelled.
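The MMax rule reviewed in Section 3 reduces, per iteration, to ranking the current input vector by magnitude and updating only the top-M coefficients. A minimal LMS-based sketch follows; the step size, M, and the toy data are assumed values:

```python
# MMax LMS sketch [10]: update only the M coefficients whose input
# samples have the largest magnitude |x(n - i)|.
def mmax_lms_step(w, x_vec, d, mu=0.1, M=2):
    e = d - sum(wi * xi for wi, xi in zip(w, x_vec))   # output error, as in (13)
    # indices of the M largest-magnitude input samples
    sel = sorted(range(len(x_vec)), key=lambda i: abs(x_vec[i]), reverse=True)[:M]
    for i in sel:
        w[i] += mu * e * x_vec[i]                      # partial LMS update
    return e

w = [0.0] * 4
x_vec = [0.1, -2.0, 0.05, 1.5]
e = mmax_lms_step(w, x_vec, d=1.0, mu=0.1, M=2)
```

Only the coefficients at indices 1 and 3 (input magnitudes 2.0 and 1.5) move; the others stay frozen for this iteration, which is exactly the complexity saving the scheme trades against convergence rate.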
From the description of MMax partial update in Section 3, we know that the principle of the MMax partial update algorithm for a single channel is to update the subset of coefficients which has the most impact on Δw. Our proposed MMax partial update convolutive BSS algorithm is based on the same principle.

In the MMax LMS algorithm [10], given Δw(n) = e(n) x(n), the factor e(n) is common to all elements of Δw(n); hence, the larger |x(n − i)| is, the larger its impact on the error. Thus, in the MMax LMS algorithm, the coefficients corresponding to the M largest values in |x(n)| are updated.

However, in time-domain convolutive BSS, ΔW is as follows:

ΔW = −μ (∂D/∂W) W^T W = μ (I − E{ϕ(y) y^T}) W.  (19)

Every element of W is an FIR filter and there is no common value for all elements of ΔW.

6. PROPOSED EXCLUSIVE MAXIMUM SELECTIVE-TAP TIME-DOMAIN CONVOLUTIVE BSS ALGORITHM

As we already know from Section 4, exclusive maximum tap selection can reduce interchannel correlation and improve the conditioning of the input autocorrelation matrix. In this section, we examine the effect of tap selection on interchannel coherence reduction and extend this idea to our multichannel blind source separation case.

6.1. Interchannel decorrelation by tap selection

The squared coherence function of x_1 and x_2 is defined as

C_{x1 x2}(f) = |P_{x1 x2}(f)|² / (P_{x1 x1}(f) P_{x2 x2}(f)),  (20)

where P_{x1 x2}(f) is the cross-power spectrum between the two mixtures x_1, x_2 and f is the normalized frequency [11].
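The coherence measure (20) can be estimated by averaging per-segment periodograms and then forming |P12|²/(P11 P22). This plain-DFT sketch is illustrative only; the segment length and the O(n²) DFT are arbitrary choices, not taken from the paper:

```python
import cmath

# Rough segment-averaged estimate of the squared coherence (20).
def dft(seg):
    n = len(seg)
    return [sum(seg[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]

def squared_coherence(x1, x2, seg_len=8):
    p11 = [0.0] * seg_len
    p22 = [0.0] * seg_len
    p12 = [0j] * seg_len
    for s in range(len(x1) // seg_len):
        a = dft(x1[s * seg_len:(s + 1) * seg_len])
        b = dft(x2[s * seg_len:(s + 1) * seg_len])
        for f in range(seg_len):
            p11[f] += abs(a[f]) ** 2
            p22[f] += abs(b[f]) ** 2
            p12[f] += a[f] * b[f].conjugate()
    return [abs(p12[f]) ** 2 / (p11[f] * p22[f] + 1e-12) for f in range(seg_len)]

# Identical channels should come out fully coherent at every frequency bin.
x = [float((i * 4) % 11) - 5.0 for i in range(64)]
c_same = squared_coherence(x, x)
```

Averaging over several segments is essential here: a single-segment estimate is identically 1 regardless of the signals, which would hide the decorrelation effect the figures illustrate.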
Figure 4: Squared coherence for x_1 and x_2 with full tap inputs selected.

Figure 5: Squared coherence for x_1 and x_2 with 50% MMax tap inputs selected.
The mixing system is

H = [h_11  h_12; h_21  h_22],
h_11 = [1  0.8  −0.2  0.78  0.4  −0.2  0.1],
h_22 = [0.8  0.6  0.1  −0.1  0.3  −0.2  0.1],  (21)
h_12 = γ h_11 + (1 − γ) b,
h_21 = γ h_22 + (1 − γ) b,

where b is an independent white Gaussian noise with zero mean.

In the simulation, we set γ = 0.9 to reflect the high interchannel correlation found in practice between the observed mixtures in a convolutive environment. The two tap-input signals s_1 and s_2 are generated as zero-mean, unit-variance gamma signals. The mixtures x_1 and x_2 are obtained from the following equations:

x_1 = s_1 ∗ h_11 + s_2 ∗ h_12,
x_2 = s_1 ∗ h_21 + s_2 ∗ h_22,  (22)

where ∗ is the convolution operation.

The squared coherence for x_1 and x_2 with full taps selected is shown in Figure 4. In Figure 5, the squared coherence for inputs with taps selected according to the MMax selection criterion, as described in Section 4, is shown. We can see that the correlation is reduced, but not significantly. Figure 6 shows the squared coherence for signals with exclusive taps selected, that is, where the selection of the same tap index in both channels is not permitted. We can see that the correlation is reduced significantly. This confirms that the exclusive tap-selection strategy does indeed reduce interchannel coherence and as such improves the conditioning of the input autocorrelation matrix, even in the mixing environment of the blind source separation case.

Figure 6: Squared coherence for x_1 and x_2 with exclusive maximum tap inputs selected.

6.2. Proposed XM update algorithm for time-domain convolutive BSS

As a result of the improved conditioning of the input autocorrelation matrix, we expect an improved convergence rate in time-domain convolutive BSS when using this update algorithm for a two-by-two blind source separation system.

Based on the exclusive maximum tap-selection scheme proposed in [11], we propose the exclusive maximum time-domain convolutive BSS algorithm (XM BSS) as follows. Define p as the interchannel tap-input magnitude difference vector at time n:

p = |x_1| − |x_2|,  (23)

where the magnitudes are taken element-wise.
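Under (23), a simple exclusive split can be sketched as follows: taps where p is positive go to channel 1's update set, the remainder to channel 2's, so no tap index is selected in both channels. This sign-based split is a simplification; the scheme in [11] additionally balances the two set sizes:

```python
# Exclusive tap selection sketch based on the magnitude difference (23).
def xm_select(x1, x2):
    p = [abs(a) - abs(b) for a, b in zip(x1, x2)]   # equation (23)
    set1 = [i for i, v in enumerate(p) if v > 0]    # channel 1 update set
    set2 = [i for i, v in enumerate(p) if v <= 0]   # channel 2 update set
    return set1, set2

sel1, sel2 = xm_select([1.0, -0.2, 0.5, 0.1], [0.3, 0.9, -0.1, 0.4])
```

By construction the two sets are disjoint and together cover all tap indices, which is the exclusivity property that reduces interchannel coherence.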
When the target signal in our simulations is a speech signal, we will also use PESQ (perceptual evaluation of speech quality) as a measure confirming the quality of the separated signal. The PESQ standard [15] is described in ITU-T P.862 as a perceptual evaluation tool for speech quality. The key feature of the PESQ standard is that it uses a perceptual model analogous to assessment by the human auditory system. The output of the PESQ is a measure of the subjective assessment quality of the degraded signal and is rated as a value between −0.5 and 4.5, which is known as the mean opinion score (MOS). The larger the score, the better the speech quality.

8. SIMULATIONS

8.1. Experiment setup

In the following simulations, our source signals s_1 and s_2 are generated as gamma signals or speech signals. The gamma signals are generated with zero mean and unit variance. The speech signals used in our simulations include 3 female and 3 male utterances with a sample rate of 8000 Hz, forming 9 combinations. A simple mixing system is used in our simulations to demonstrate and compare separation performance. The mixing system is given by

H = [1.0  1.0  −0.75; −0.2  0.4  0.7].  (27)

8.2. MMax partial update time-domain BSS algorithm for convolutive mixtures

In this simulation, we test the performance of the MMax partial update time-domain BSS algorithm for convolutive mixtures. In the following diagrams, “reg” means the regular time-domain BSS algorithm; “par56” means the MMax partial update time-domain BSS algorithm with M = 56; “par48” means the MMax partial update time-domain BSS algorithm with M = 48; and “par32” means the MMax partial update time-domain BSS algorithm with M = 32, where M is the number of coefficients updated at each iteration in a given channel.

In the first experiment, we use generated gamma signals as the original signals and use (9) to get the mixture signals. The performance of the regular time-domain convolutive BSS algorithm and the MMax partial update convolutive BSS algorithm, evaluated by the SIR measure defined in (26), is shown in Figures 7 and 8.

Figure 7: Separation performance of time-domain regular convolutive BSS and MMax partial update BSS for a gamma signal, measured by SIR for the first output.

Figure 8: Separation performance of time-domain regular convolutive BSS and MMax partial update BSS for a gamma signal, measured by SIR for the second output.

From these diagrams, we can see that, as expected, the MMax partial update convolutive BSS algorithm converges slightly slower than the regular BSS algorithm, since only a subset of coefficients gets updated. However, it converges to similar SIR values.

In the second experiment, we use speech signals as the original signals and use the same mixing system to get the mixture signals. In Figures 9 and 10, we show the performance of the regular time-domain convolutive BSS algorithm and the MMax partial update convolutive BSS algorithm for one
M = 32 out2) present separated signals from the MMax BSS algorithm with M = 32.

From Table 1, we can see that the separation performance evaluated by PESQ is consistent with the SIR results. The separation algorithms make the separated signals more biased toward one source signal and away from the other source signal. The separation performance evaluated by PESQ and SIR is also consistent with our informal listening tests.

From the above simulation results, we can see that, similar to the MMax NLMS algorithm for single-channel linear filters, there is a slight deterioration in the performance of the proposed MMax partial update time-domain convolutive BSS algorithm as the number of updated coefficients is reduced. However, the performance at 50% of coefficients updated is still quite acceptable.

Figure 9: Separation performance of time-domain regular convolutive BSS and MMax partial update BSS for a speech signal, measured by SIR for the first output.
Table 1: Average PESQ scores for mixtures and separated signals from regular BSS algorithm and MMax BSS algorithm.
Figure 11: Separation performance of time-domain regular convolutive BSS and XM selective tap BSS for gamma signal measured by SIR for the first output. (SIR versus number of iterations ×10³.)

Figure 13: Separation performance of time-domain regular convolutive BSS and XM selective tap BSS for speech signal measured by SIR for the first output. (SIR versus number of iterations ×10³.)
Figure 12: Separation performance of time-domain regular convolutive BSS and XM selective tap BSS for gamma signal measured by SIR for the second output. (SIR versus number of iterations ×10³.)

Figure 14: Separation performance of time-domain regular convolutive BSS and XM selective tap BSS for speech signal measured by SIR for the second output. (SIR versus number of iterations ×10².)
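The XM (exclusive maximum) tap selection evaluated in Figures 11–14 assigns each tap to exactly one channel. A rough two-channel sketch (our own minimal illustration, not the authors' implementation): each tap is updated by whichever channel has the larger input magnitude there, so the two update sets are disjoint, which reduces the interchannel coherence of the tap-input vectors.

```python
import numpy as np

def xm_select(x1, x2):
    """Exclusive-maximum selection for two tap-input vectors:
    tap k is assigned to the channel with the larger |x(k)|, so the
    two update sets are disjoint (roughly half each for inputs with
    similar statistics)."""
    sel1 = np.abs(x1) > np.abs(x2)    # taps updated for channel 1
    sel2 = ~sel1                      # remaining taps go to channel 2
    return sel1, sel2
```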
Table 2: Average PESQ scores for mixtures and separated signals from regular BSS algorithm and XM BSS algorithm.

        Mixture           Regular BSS       Xmax BSS
PESQ    mix1     mix2     out1     out2     out1     out2
S1      1.871    0.948    2.037    0.591    2.643    0.463
S2      1.583    2.255    1.215    2.547    1.055    2.560

… signals from the regular BSS algorithm; (XM BSS out1, out2) present separated signals from XM BSS. The performance evaluation by PESQ is consistent with that measured by SIR. The separation performance evaluated by PESQ and SIR is also consistent with our informal listening tests.

Based on the above simulations, we can see that the XM BSS algorithm significantly improves the convergence rate compared with the regular time-domain convolutive BSS algorithm.

9. CONCLUSION

In this paper, we investigate the time-domain convolutive BSS algorithm and propose two novel algorithms to address the slow convergence rate and high computational complexity problems in time-domain BSS. In the proposed MMax partial update time-domain convolutive BSS algorithm (MMax BSS), only a subset of coefficients in the separation system gets updated at every iteration. We show that the partial update scheme applied in the MMax LMS algorithm for a single channel can be extended to multichannel natural gradient-based time-domain convolutive BSS with little deterioration in performance and possible computational complexity savings. In the proposed exclusive maximum selective-tap time-domain convolutive BSS algorithm (XM BSS), the exclusive tap-selection update procedure reduces the interchannel coherence of the tap-input vectors and improves the conditioning of the autocorrelation matrix so as to accelerate the convergence rate and reduce the misalignment. Moreover, the computational complexity is reduced as well, since only half of the tap inputs are selected for updating. Simulation results have shown a significant improvement in convergence rate compared with existing techniques. The extension of the proposed XM BSS algorithm to more than two channels is still an open problem.

REFERENCES

[1] S. Haykin, Ed., Unsupervised Adaptive Filtering, Volume 1: Blind Source Separation, John Wiley & Sons, New York, NY, USA, 2000.
[2] A. Cichocki and S. Amari, Adaptive Blind Signal and Image Processing, John Wiley & Sons, New York, NY, USA, 2000.
[3] A. Hyvarinen, J. Karhunen, and E. Oja, Independent Component Analysis, John Wiley & Sons, New York, NY, USA, 2001.
[4] S. Amari, S. C. Douglas, A. Cichocki, and H. H. Yang, "Multichannel blind deconvolution and equalization using the natural gradient," in Proceedings of the 1st IEEE Signal Processing Workshop on Signal Processing Advances in Wireless Communications (SPAWC '97), pp. 101–104, Paris, France, April 1997.
[5] S. C. Douglas and X. Sun, "Convolutive blind separation of speech mixtures using the natural gradient," Speech Communication, vol. 39, no. 1-2, pp. 65–78, 2003.
[6] P. Smaragdis, "Blind separation of convolved mixtures in the frequency domain," Neurocomputing, vol. 22, no. 1–3, pp. 21–34, 1998.
[7] L. Parra and C. Spence, "Convolutive blind separation of non-stationary sources," IEEE Transactions on Speech and Audio Processing, vol. 8, no. 3, pp. 320–327, 2000.
[8] H. Sawada, R. Mukai, S. Araki, and S. Makino, "A robust and precise method for solving the permutation problem of frequency-domain blind source separation," IEEE Transactions on Speech and Audio Processing, vol. 12, no. 5, pp. 530–538, 2004.
[9] M. Z. Ikram and D. R. Morgan, "A beamforming approach to permutation alignment for multichannel frequency-domain blind speech separation," in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '02), vol. 1, pp. 881–884, Orlando, Fla, USA, May 2002.
[10] T. Aboulnasr and K. Mayyas, "Complexity reduction of the NLMS algorithm via selective coefficient update," IEEE Transactions on Signal Processing, vol. 47, no. 5, pp. 1421–1424, 1999.
[11] A. W. H. Khong and P. A. Naylor, "Stereophonic acoustic echo cancellation employing selective-tap adaptive algorithms," IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 3, pp. 785–796, 2006.
[12] S. Werner, M. L. R. de Campos, and P. S. R. Diniz, "Partial-update NLMS algorithms with data-selective updating," IEEE Transactions on Signal Processing, vol. 52, no. 4, pp. 938–949, 2004.
[13] I. Pitas, "Fast algorithms for running ordering and max/min calculation," IEEE Transactions on Circuits and Systems, vol. 36, no. 6, pp. 795–804, 1989.
[14] S. Makino, H. Sawada, R. Mukai, and S. Araki, "Blind source separation of convolutive mixtures of speech in frequency domain," IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. E88-A, no. 7, pp. 1640–1654, 2005.
[15] ITU-T Recommendation P.862, "Perceptual evaluation of speech quality (PESQ), an objective method for end-to-end speech quality assessment of narrowband telephone networks and speech codecs," May 2000.
Hindawi Publishing Corporation
EURASIP Journal on Audio, Speech, and Music Processing
Volume 2007, Article ID 85438, 15 pages
doi:10.1155/2007/85438
Research Article
Underdetermined Blind Audio Source Separation Using
Modal Decomposition
Département TSI, École Nationale Supérieure des Télécommunications (ENST), 46 Rue Barrault,
75634 Paris Cedex 13, France
Received 1 July 2006; Revised 20 November 2006; Accepted 14 December 2006
This paper introduces new algorithms for the blind separation of audio sources using modal decomposition. Indeed, audio signals
and, in particular, musical signals can be well approximated by a sum of damped sinusoidal (modal) components. Based on this
representation, we propose a two-step approach consisting of a signal analysis (extraction of the modal components) followed by
a signal synthesis (grouping of the components belonging to the same source) using vector clustering. For the signal analysis, two
existing algorithms are considered and compared: namely, the EMD (empirical mode decomposition) algorithm and a parametric
estimation algorithm using the ESPRIT technique. A major advantage of the proposed method resides in its validity for both
instantaneous and convolutive mixtures and its ability to separate more sources than sensors. Simulation results are given to compare and
assess the performance of the proposed algorithms.
Copyright © 2007 Abdeldjalil Aı̈ssa-El-Bey et al. This is an open access article distributed under the Creative Commons
Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
Note that this modal representation of the sources is a particular case of the signal sparsity often used to separate the sources in the underdetermined case [23]. Indeed, a signal given by a sum of sinusoids (or damped sinusoids) occupies only a small region in the time-frequency (TF) domain; that is, its TF representation is sparse. This is illustrated by Figure 1, where we represent the time-frequency distribution of a three-modal-component signal.

The paper is organized as follows. Section 2 formulates the UBSS problem and introduces the assumptions necessary for the separation of audio sources using modal decomposition. Section 3 proposes two MD-UBSS algorithms for the instantaneous mixture case, while Section 4 introduces a modified version of MD-UBSS that relaxes the quasiorthogonality assumption of the source modal components. In Section 5, we extend our MD-UBSS algorithm to the convolutive mixture case. Some discussions on the proposed methods are given in Section 6. The performance of the above methods is numerically evaluated in Section 7. The last section is for the conclusion and final remarks.

2. PROBLEM FORMULATION IN THE INSTANTANEOUS MIXTURE CASE

The blind source separation model assumes the existence of N independent signals s_1(t), ..., s_N(t) and M observations x_1(t), ..., x_M(t) that represent the mixtures. These mixtures are supposed to be linear and instantaneous, that is,

    x_i(t) = Σ_{j=1}^{N} a_{ij} s_j(t),   i = 1, ..., M.   (1)

This can be represented compactly by the mixing equation

    x(t) = A s(t),   (2)

where s(t) = [s_1(t), ..., s_N(t)]^T is an N × 1 column vector collecting the real-valued source signals, and vector x(t) is defined similarly.

Assumption 1. Any two column vectors of A are linearly independent.

That is, for any index pair i ≠ j ∈ N, where N = {1, ..., N}, vectors a_i and a_j are linearly independent. This assumption is necessary because otherwise, if we have a_2 = α a_1 for example, then the input/output relation (2) can be reduced to

    x(t) = [a_1, a_3, ..., a_N] [s_1(t) + α s_2(t), s_3(t), ..., s_N(t)]^T,   (3)

and hence the separation of s_1(t) and s_2(t) is inherently impossible. This assumption is used later (in the clustering step) to separate the source modal components using their spatial directions given by the column vectors of A.

It is known that BSS is only possible up to some scaling and permutation [3]. We take advantage of these indeterminacies to make the following further assumption without loss of generality.

Assumption 2. The column vectors of A are of unit norm.

That is, ‖a_i‖ = 1 for all i ∈ N, where the norm hereafter is given in the Frobenius sense.

As mentioned previously, solving the UBSS problem requires strong a priori assumptions on the source signals. In our case, signal sparsity is considered in terms of the modal representation of the input signals, as stated by the fundamental assumption below.

Assumption 3. The source signals are sums of modal components.

Indeed, we assume here that each source signal s_i(t) is a sum of l_i modal components c_i^j(t), j = 1, ..., l_i, that is,

    s_i(t) = Σ_{j=1}^{l_i} c_i^j(t),   t = 0, ..., T − 1,   (4)

where the c_i^j(t) are damped sinusoids or (quasi)harmonic signals, and T is the sample size.
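To make the model in (1)–(4) concrete, the following sketch (with hypothetical poles, amplitudes, and mixing matrix of our own choosing) builds N = 3 sources as sums of damped sinusoids and forms M = 2 instantaneous mixtures x(t) = A s(t):

```python
import numpy as np

T = 1024
t = np.arange(T)

def damped_sinusoid(beta, d, omega, theta=0.0):
    # c(t) = Re{ beta * e^{j theta} * (e^{d + j omega})^t }
    return beta * np.exp(d * t) * np.cos(omega * t + theta)

# N = 3 sources, each a sum of modal components (Assumption 3)
s = np.stack([
    damped_sinusoid(1.0, -1e-3, 0.30) + damped_sinusoid(0.5, -2e-3, 0.90),
    damped_sinusoid(0.8, -1e-3, 0.55),
    damped_sinusoid(0.6, -5e-4, 1.40) + damped_sinusoid(0.4, -1e-3, 2.10),
])

# M = 2 sensors: underdetermined instantaneous mixture x(t) = A s(t)
A = np.array([[0.8, 0.6, 0.3],
              [0.6, 0.8, 0.95]], dtype=float)
A /= np.linalg.norm(A, axis=0)      # Assumption 2: unit-norm columns
x = A @ s                           # shape (2, T)
```

With M = 2 < N = 3 the mixture is underdetermined, which is exactly the regime the proposed MD-UBSS algorithms target.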
Assumption 4. The sources are quasiorthogonal, in the sense that

    ⟨c_i^j | c_{i'}^{j'}⟩ / (‖c_i^j‖ ‖c_{i'}^{j'}‖) ≈ 0   for (i, j) ≠ (i', j'),   (5)

where

    ⟨c_i^j | c_{i'}^{j'}⟩ := Σ_{t=0}^{T−1} c_i^j(t) c_{i'}^{j'}(t),   ‖c_i^j‖² = ⟨c_i^j | c_i^j⟩.   (6)

In the case of sinusoidal signals, the quasiorthogonality of the modal components is nothing else than the Fourier quasiorthogonality of two sinusoidal components with distinct frequencies. This can be observed in the frequency domain through the disjointness of their supports. This property is also preserved by filtering, which does not affect the frequency support and hence preserves the quasiorthogonality assumption of the signals (this is used later when considering the convolutive case).

3. MD-UBSS ALGORITHM

Based on the previous model, we propose an approach in two steps consisting of the following.

(i) An analysis step

In this step, one applies an algorithm of modal decomposition to each sensor output in order to extract all the harmonic components from them. For this modal component extraction, we compare two decomposition algorithms: the EMD (empirical mode decomposition) algorithm introduced in [16, 17], and a parametric algorithm which estimates the parameters of the modal components modeled as damped sinusoids.

(ii) A synthesis step

In this step, we group together the modal components corresponding to the same source in order to reconstitute the original signal. This is done by observing that all modal components of a given source signal "live" in the same spatial direction. Therefore, the proposed clustering method is based on the component's spatial direction, evaluated by correlation of the extracted (component) signal with the observed antenna signal.

Algorithm 1: MD-UBSS algorithm in instantaneous mixture case using modal decomposition.

Note that, by this method, each sensor output leads to an estimate of the source signals. Therefore, we end up with M estimates for each source signal. As the quality of source signal extraction depends strongly on the mixture coefficients, we propose a blind source selection procedure to choose the "best" of the M estimates. This algorithm is summarized in Algorithm 1.

3.1. Modal component estimation

3.1.1. Signal analysis using EMD

A new nonlinear technique, referred to as empirical mode decomposition (EMD), has recently been introduced by Huang et al. for representing nonstationary signals as sums of zero-mean AM-FM components [16]. The starting point of the EMD is to consider oscillations in signals at a very local level. Given a signal z(t), the EMD algorithm can be summarized as follows [17]:

(1) identify all extrema of z(t); this is done by the algorithm in [25];
(2) interpolate between the minima (resp., maxima), ending up with an envelope e_min(t) (resp., e_max(t)); several interpolation techniques can be used, and in our simulation we have used a spline interpolation as in [25];
(3) compute the mean m(t) = (e_min(t) + e_max(t))/2;
(4) extract the detail d(t) = z(t) − m(t);
(5) iterate on the residual¹ m(t) until m(t) = 0 (in practice, we stop the algorithm when ‖m(t)‖ ≤ ε, where ε is a given threshold value).

By applying the EMD algorithm to the ith mixture signal x_i, which is written as x_i(t) = Σ_{j=1}^{N} a_{ij} s_j(t) = Σ_{j=1}^{N} Σ_{k=1}^{l_j} a_{ij} c_j^k(t), one obtains estimates ĉ_j^k(t) of the components c_j^k(t) (up to the scalar constant a_{ij}).

¹ Indeed, the mean signal m(t) is also the residual signal after extracting the detail component d(t), that is, m(t) = z(t) − d(t).
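One sifting pass of steps (1)–(4) can be sketched as follows. This is a simplified illustration of our own: the envelopes are interpolated linearly for brevity, whereas the algorithm above uses spline interpolation as in [25].

```python
import numpy as np

def sift_once(z):
    """One EMD sifting pass: steps (1)-(4). Returns (detail, residual),
    or (None, z) when z has too few extrema to define envelopes."""
    n = len(z)
    t = np.arange(n)
    # (1) locate interior extrema
    maxima = [i for i in range(1, n - 1) if z[i - 1] < z[i] >= z[i + 1]]
    minima = [i for i in range(1, n - 1) if z[i - 1] > z[i] <= z[i + 1]]
    if len(maxima) < 2 or len(minima) < 2:
        return None, z                        # no oscillation left
    # (2) upper/lower envelopes (linear here; the paper uses splines)
    e_max = np.interp(t, maxima, z[maxima])
    e_min = np.interp(t, minima, z[minima])
    m = 0.5 * (e_max + e_min)                 # (3) envelope mean
    return z - m, m                           # (4) detail d(t), residual m(t)
```

Repeating this pass on the residual, as in step (5), peels off one AM-FM component at a time.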
3.1.2. Parametric signal analysis

In this section, we present an alternative solution for signal analysis. For that, we represent the source signal as a sum of damped sinusoids:

    s_i(t) = Σ_{j=1}^{l_i} ℜe{α_i^j (z_i^j)^t},   (7)

corresponding to

    c_i^j(t) = ℜe{α_i^j (z_i^j)^t},   (8)

where α_i^j = β_i^j e^{jθ_i^j} represents the complex amplitude and z_i^j = e^{d_i^j + jω_i^j} is the jth pole of the source s_i, d_i^j being the negative damping factor and ω_i^j the angular frequency. ℜe(·) represents the real part of a complex entity. We denote by L_tot the total number of modal components, that is, L_tot = Σ_{i=1}^{N} l_i.

For the extraction of the modal components, we propose to use the ESPRIT (estimation of signal parameters via rotational invariance techniques) algorithm, which estimates the poles of the signals by exploiting the row-shifting invariance property of the D × (T − D) data Hankel matrix [H(x_k)]_{n_1 n_2} := x_k(n_1 + n_2), D being a window parameter chosen in the range T/3 ≤ D ≤ 2T/3.

More precisely, we use Kung's algorithm given in [26], which can be summarized in the following steps:

(1) form the data Hankel matrix H(x_k);
(2) estimate the 2L_tot-dimensional signal subspace U^(L_tot) = [u_1, ..., u_{2L_tot}] of H(x_k) by means of the SVD of H(x_k) (u_1, ..., u_{2L_tot} are the principal left singular vectors of H(x_k));
(3) solve (in the least-squares sense) the shift-invariance equation

    U_↓^(L_tot) Ψ = U_↑^(L_tot)  ⟺  Ψ = (U_↓^(L_tot))^# U_↑^(L_tot),   (9)

where Ψ = Φ Δ Φ^{−1}, Φ being a nonsingular 2L_tot × 2L_tot matrix and Δ = diag(z_1^1, z_1^{1*}, ..., z_1^{l_1}, z_1^{l_1*}, ..., z_N^{l_N}, z_N^{l_N*}); (·)^* represents complex conjugation, (·)^# denotes the pseudoinversion operation, and the arrows ↓ and ↑ denote, respectively, the last- and the first-row-deleting operators;
(4) estimate the poles as the eigenvalues of matrix Ψ;
(5) estimate the complex amplitudes by solving the least-squares fitting criterion

    min_{α_k} ‖x_k − Z α_k‖²  ⟺  α_k = Z^# x_k,   (10)

where x_k = [x_k(0), ..., x_k(T − 1)]^T is the observation vector and Z is a Vandermonde matrix constructed from the estimated poles, that is,

    Z = [z_1^1, z_1^{1*}, ..., z_1^{l_1}, z_1^{l_1*}, ..., z_N^{l_N}, z_N^{l_N*}],   (11)

with z_i^j = [1, z_i^j, (z_i^j)², ..., (z_i^j)^{T−1}]^T, and α_k is the vector of complex amplitudes, that is,

    α_k = (1/2) [a_{k1} α_1^1, a_{k1} α_1^{1*}, ..., a_{k1} α_1^{l_1*}, ..., a_{kN} α_N^{l_N*}]^T.   (12)
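Steps (1)–(4) above can be sketched numerically as follows (a simplified noiseless single-channel illustration; the function name and its defaults are our own assumptions, not from the paper):

```python
import numpy as np

def esprit_poles(x, n_modes, D=None):
    """Estimate the 2*n_modes poles of a sum of n_modes real damped
    sinusoids from samples x, via the Hankel/SVD shift-invariance
    steps (1)-(4) of Kung's algorithm."""
    T = len(x)
    if D is None:
        D = T // 2                    # window parameter in [T/3, 2T/3]
    # (1) data Hankel matrix H[n1, n2] = x[n1 + n2]
    H = np.array([x[n1:n1 + T - D] for n1 in range(D)])
    # (2) signal subspace: 2*n_modes principal left singular vectors
    U = np.linalg.svd(H, full_matrices=False)[0][:, :2 * n_modes]
    # (3) least-squares shift invariance: U_down Psi = U_up
    Psi = np.linalg.pinv(U[:-1]) @ U[1:]
    # (4) the poles are the eigenvalues of Psi
    return np.linalg.eigvals(Psi)
```

For a real signal the poles come out in conjugate pairs z, z*, matching the structure of Δ in (9).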
Figure 2: Data clustering illustration, where we represent the different estimates ã_i^j and their centroids.

3.2. Clustering and source estimation

3.2.1. Signal synthesis using vector clustering

For the synthesis of the source signals, one observes that, thanks to the quasiorthogonality assumption, one has

    ⟨x | c_i^j⟩ / ‖c_i^j‖² := (1/‖c_i^j‖²) [⟨x_1 | c_i^j⟩, ..., ⟨x_M | c_i^j⟩]^T ≈ a_i,   (13)

where a_i represents the ith column vector of A. We can then associate each component ĉ_j^k with a spatial direction (column vector of A) that is estimated by

    ã_j^k = ⟨x | ĉ_j^k⟩ / ‖ĉ_j^k‖².   (14)

Vector ã_j^k would be approximately equal to a_i (up to a scalar constant) if ĉ_j^k is an estimate of a modal component of source i. Hence, two components of a same source signal are associated with colinear spatial directions, that is, with the same column vector of A. Therefore, we propose to gather these components by clustering their directional vectors into N classes (see Figure 2). For that, we first compute the normalized vectors

    ā_j^k = ã_j^k e^{−jψ_j^k} / ‖ã_j^k‖,   (15)

where ψ_j^k is the phase argument of the first entry of ã_j^k (this is to force the first entry to be real positive).
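In code, (14)-(15) reduce to a correlation followed by a phase/norm normalization. A minimal sketch with a hypothetical rank-one observation matrix (names are ours):

```python
import numpy as np

def direction_vector(X, c):
    """Estimate the spatial direction (14) of a modal component c
    from the M x T observation matrix X, then normalize it as in (15)
    so that the first entry is real and nonnegative."""
    a = (X @ c) / (c @ c)                  # <x | c> / ||c||^2
    a = a * np.exp(-1j * np.angle(a[0]))   # rotate first entry to real >= 0
    return a / np.linalg.norm(a)
```

Components of the same source produce nearly identical normalized directions, which is what the subsequent clustering exploits.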
Then, these vectors are clustered by the k-means algorithm [24], which can be summarized in the following steps.

(1) Place N points into the space represented by the vectors that are being clustered. These points represent the initial group centroids. One popular way to start is to randomly choose N vectors among the set of vectors to be clustered.
(2) Assign each vector ā_j^k to the group (cluster) that has the closest centroid; that is, if y_1, ..., y_N are the centroids of the N clusters, one assigns the vector ā_j^k to the cluster i_0 that satisfies

    i_0 = arg min_i ‖ā_j^k − y_i‖.   (16)

(3) When all vectors have been assigned, recalculate the positions of the N centroids in the following way: for each cluster, the new centroid vector is calculated as the mean value of the cluster's vectors.
(4) Repeat steps 2 and 3 until the centroids no longer move. This produces a separation of the vectors into N groups. In practice, in order to increase the convergence rate, one can also use a threshold value and stop the algorithm when the difference between the new and old centroid values is smaller than this threshold for all N clusters.

Finally, one will be able to rebuild the initial sources up to a constant by adding the various components within a same class, that is,

    ŝ_i(t) = Σ_{ĉ_i^j ∈ C_i} ĉ_i^j(t),   (17)

where C_i represents the ith cluster.

3.2.2. Source grouping and selection

Let us notice that, by applying the approach described previously (analysis plus synthesis) to all antenna outputs x_1(t), ..., x_M(t), we obtain M estimates of each source signal. The estimation quality of a given source signal varies significantly from one sensor to another. Indeed, it depends strongly on the matrix coefficients and, in particular, on the signal-to-interference ratio (SIR) of the desired source. Consequently, we propose a blind selection method to choose a "good" estimate among the M we have for each source signal. For that, we first need to pair the source estimates together. This is done by associating each source signal extracted from the first sensor with the (M − 1) signals extracted from the (M − 1) other sensors that are maximally correlated with it. The correlation factor of two signals s_1 and s_2 is evaluated by |⟨s_1 | s_2⟩| / (‖s_1‖ ‖s_2‖).

Once the source grouping is achieved, we propose to select the source estimate of maximal energy, that is,

    ŝ_i(t) = ŝ_i^ĵ(t)  with  ĵ = arg max_j { E_i^j := Σ_{t=0}^{T−1} |ŝ_i^j(t)|², j = 1, ..., M },   (18)

where E_i^j represents the energy of the ith source extracted from the jth sensor, ŝ_i^j(t). One can consider other methods of selection (based, e.g., on the dispersion around the centroid) or, instead, a diversity-combining technique for the different source estimates. However, the source estimates are very dissimilar in quality, and hence we have observed in our simulations that the energy-based selection, even though not optimal, provides the best results in terms of source estimation error.

3.3. Case of common modal components

We consider here the case where a given component c_j^k(t) associated with the pole z_j^k can be shared by several sources. This is the case, for example, for certain musical signals such as those treated in [27]. To simplify, we suppose that a component belongs to at most two sources. Thus, let us suppose that the sinusoidal component (z_j^k)^t is present in the sources s_{j1}(t) and s_{j2}(t) with the amplitudes α_{j1} and α_{j2}, respectively (i.e., one modal component of source s_{j1} (resp., s_{j2}) is ℜe{α_{j1} (z_j^k)^t} (resp., ℜe{α_{j2} (z_j^k)^t})). It follows that the spatial direction associated with this component is a linear combination of the column vectors a_{j1} and a_{j2}. More precisely, we have

    ã_j^k = (1/‖z_j^k‖²) [x_1^T z_j^k, ..., x_M^T z_j^k]^T ≈ α_{j1} a_{j1} + α_{j2} a_{j2}.   (19)

It is now a question of finding the indices j1 and j2 of the two sources associated with this component, as well as the amplitudes α_{j1} and α_{j2}. With this intention, one proposes an approach based on subspace projection. Let us assume that M > 2 and that matrix A is known and satisfies the condition that any triplet of its column vectors is linearly independent. Consequently, we have

    P_Ã^⊥ ã_j^k = 0  if and only if  Ã = [a_{j1} a_{j2}],   (20)

Ã being a matrix formed by a pair of column vectors of A, and P_Ã^⊥ represents the matrix of orthogonal projection onto the orthogonal complement of the range space of Ã, that is,

    P_Ã^⊥ = I − Ã(Ã^H Ã)^{−1} Ã^H,   (21)

where I is the identity matrix and (·)^H denotes the transpose conjugate. In practice, by taking the noise into account, one detects the columns j1 and j2 by minimizing

    (ĵ1, ĵ2) = arg min_{(l,m)} ‖P_Ã^⊥ ã_j^k‖  with  Ã = [a_l a_m].   (22)

Once Ã is found, one estimates the weightings α_{j1} and α_{j2} by

    [α̂_{j1}, α̂_{j2}]^T = Ã^# ã_j^k.   (23)

In this paper, we treated all the components as being associated with two source signals. If ever a component is present only in one source, one of the two coefficients estimated in (23) should be zero or close to zero.

In what precedes, the mixing matrix A is supposed to be known. This means that it has to be estimated before applying the subspace projection. This is performed here by clustering all the spatial direction vectors in (14), as for the previous MD-UBSS algorithm. Then, the ith column vector of A is estimated as the centroid of C_i, assuming implicitly that most modal components belong mainly to one source signal. This is confirmed by our simulation experiment shown in Figure 11.
4. MODIFIED MD-UBSS ALGORITHM

We propose here to improve the previous algorithm with respect to the computational cost and the estimation accuracy when Assumption 4 is poorly satisfied.² First, in order to exploit all the sensor outputs jointly, we form

    H̄(x) = Σ_{i=1}^{M} H(x_i)^H H(x_i),   (24)

and we apply steps 1 to 4 of Kung's algorithm described in Section 3.1.2 to obtain all the poles z_i^j, i = 1, ..., N, j = 1, ..., l_i. In this way, we significantly reduce the computational cost and avoid the "best source estimate" selection problem of the previous algorithm.

Now, to relax Assumption 4, we can rewrite the data model as

    Γ z(t) = x(t),   (25)

where Γ := [γ_1^1, γ̄_1^1, ..., γ_N^{l_N}, γ̄_N^{l_N}], γ_i^j = β_i^j e^{jφ_i^j} b_i^j and γ̄_i^j = β_i^j e^{−jφ_i^j} b_i^j, where b_i^j is a unit-norm vector representing the spatial direction of the ith component (i.e., b_i^j = a_k if the component (z_i^j)^t belongs to the kth source signal) and z(t) := [(z_1^1)^t, (z_1^{1*})^t, ..., (z_N^{l_N})^t, (z_N^{l_N*})^t]^T.

The estimation of Γ using the least-squares fitting criterion leads to

    min_Γ ‖X − Γ Z‖²  ⟺  Γ̂ = X Z^#,   (26)

where X = [x(0), ..., x(T − 1)] and Z = [z(0), ..., z(T − 1)]. After estimating Γ, we estimate the phase of each pole as

    φ̂_i^j = arg((γ̂_i^j)^H γ̄̂_i^j) / 2.   (27)

The spatial direction of each modal component is estimated by

    ã_i^j = γ̂_i^j e^{−jφ̂_i^j} + γ̄̂_i^j e^{jφ̂_i^j} = 2 β_i^j b_i^j.   (28)

Finally, we group together these components by clustering the vectors ã_i^j into N classes. After clustering, we obtain N classes with N unit-norm centroids â_1, ..., â_N corresponding to the estimates of the column vectors of the mixing matrix A. If the pole z_i^j belongs to the kth class, then, according to (28), its amplitude can be estimated by

    β̂_i^j = â_k^T ã_i^j / 2.   (29)

One will be able to rebuild the initial sources up to a constant by adding the various modal components within a same class C_k as follows:

    ŝ_k(t) = Σ_{(i,j) : z_i^j ∈ C_k} β̂_i^j ℜe{e^{jφ̂_i^j} (z_i^j)^t}.   (30)

² This is the case when the modal components are closely spaced or for modal components with strong damping factors.

5. GENERALIZATION TO THE CONVOLUTIVE CASE

The instantaneous mixture model is, unfortunately, not valid in real-life applications where multipath propagation with large channel delay spread occurs, in which case convolutive mixtures are considered.

Blind separation of convolutive mixtures and multichannel deconvolution has received wide attention in various fields such as biomedical signal analysis and processing (EEG, MEG, ECG), speech enhancement, geophysical data processing, and data mining [2].

In particular, acoustic applications are considered in situations where signals need to be processed from several microphones in a sound field produced by several speakers (the so-called cocktail-party problem) or from several acoustic transducers in an underwater sound field produced by the engine noises of several ships (the sonar problem).

In this case, the signal can be modeled by the following equation:

    x(t) = Σ_{k=0}^{K} H(k) s(t − k) + w(t),   (31)

where the H(k) are M × N matrices for k ∈ [0, K] representing the impulse response coefficients of the channel. We consider in this paper the underdetermined case (M < N). The sources are assumed, as in the instantaneous mixture case, to be decomposable into a sum of damped sinusoids satisfying approximately the quasiorthogonality Assumption 4. The channel satisfies the following diversity assumption.

Assumption 5. The channel is such that each column vector of

    H(z) := Σ_{k=0}^{K} H(k) z^{−k} = [h_1(z), ..., h_N(z)]   (32)

is irreducible, that is, the entries of h_i(z), denoted by h_{ij}(z), j = 1, ..., M, have no common zero for all i. Moreover, any two column vectors of H(z) form an irreducible polynomial matrix H̃(z), that is, rank(H̃(z)) = 2 for all z.

Knowing that the convolution preserves the different modes of the signal, we can exploit this property to estimate the different modal components of the source signals using the ESPRIT method considered previously in the instantaneous mixture case.
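The convolutive model (31) can be sketched directly (hypothetical shapes of our own choosing; the noise term w(t) is optional):

```python
import numpy as np

def convolutive_mix(H, s, noise_std=0.0):
    """Convolutive mixture (31): x(t) = sum_k H[k] s(t - k) + w(t),
    with H of shape (K+1, M, N) and s of shape (N, T)."""
    Kp1, M, N = H.shape
    T = s.shape[1]
    x = np.zeros((M, T))
    for k in range(Kp1):
        # delayed source block s(t - k), zero-padded at the start
        s_del = np.concatenate([np.zeros((N, k)), s[:, :T - k]], axis=1)
        x += H[k] @ s_del
    if noise_std > 0:
        x += noise_std * np.random.standard_normal(x.shape)
    return x
```

With K = 0 this collapses to the instantaneous model (2), which is why the instantaneous algorithms are a special case of the convolutive treatment.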
However, using the quasiorthogonality assumption, the correlation of a given modal component …

Algorithm 2: MD-UBSS algorithm in convolutive mixture case using modal decomposition. (Recovered fragment of the algorithm box: "(1) Channel estimation; AIC criterion [30] to detect the …".)

One will be able to rebuild the initial sources up to a constant by adding the various components within a same class using (17).

Similar to the instantaneous mixture case, one modal component can be assigned to two or more source signals, which relaxes the quasiorthogonality assumption and improves the estimation accuracy at moderate and high SNRs (see Figure 9).

(v) SIMO versus MIMO channel estimation

We have opted here to estimate the channels using SIMO techniques. However, it is also possible to estimate the channels using overdetermined blind MIMO techniques by considering the time slots where the number of sources is smaller than (M − 1), instead of using only those where the number of "effective" sources is one. The advantage of doing so would be the use of a larger number of time slots (see Figure 4). The drawback resides in the fact that blind identification of MIMO systems is more difficult compared to the SIMO case and leads, in particular, to higher estimation error (see Figure 12 for a comparative performance evaluation).

³ We minimize over the scalar α because of the inherent indeterminacy of blind channel identification, that is, h_i(z) is estimated up to a scalar constant, as shown by Theorem 1.
Figure 5: Blind source separation example for 4 audio sources and 3 sensors in instantaneous mixture case: the upper line represents the original source signals, the second line represents the source estimation by pseudoinversion of mixing matrix A assumed exactly known, and the bottom one represents estimates of sources by our algorithm using EMD.
Figure 6: NMSE versus SNR for 4 audio sources and 3 sensors in instantaneous mixture case: comparison of the performance of our algorithm (EMD and ESPRIT) with those given by the pseudoinversion of mixing matrix A (assumed exactly known).

Figure 7: NMSE versus L for 4 audio sources and 3 sensors in instantaneous mixture case: comparison of the performance of our algorithm (ESPRIT) for L varying in the range [10, ..., 40] with SNR = 10 dB and SNR = 30 dB.

Figure 8: NMSE versus N for 3 sensors in instantaneous mixture case: comparison of the performance of our algorithm (EMD and ESPRIT) for N ∈ [2, ..., 6].

Figure 9: NMSE versus SNR for 4 audio sources and 3 sensors in instantaneous mixture case: comparison of the performance of our algorithm (EMD) and the same algorithm with subspace projection.

In other words, there exists an optimal choice of L that depends on the signal type.

In Figure 8, we compare the separation performance loss that we have when the number of sources increases from 2 to 6 in the noiseless case. For N = 2 and N = 3 (overdetermined case), we estimate the sources by left inversion of the estimate of matrix A. In the underdetermined case, the EMD- and parametric-based algorithms present similar performance. However, the latter method is better in the overdetermined case.

In Figure 9, we compare the performance of our algorithm using ESPRIT with and without subspace projection. One can observe that using the subspace projection leads to a performance gain at moderate and high SNRs. At low SNRs, the performance is slightly degraded due to the noise effect. Indeed, when a given component belongs "effectively" …
12 EURASIP Journal on Audio, Speech, and Music Processing
Figure 10: NMSE versus SNR for 4 audio sources and 3 sensors: comparison of the performance of MD-UBSS algorithms with and without the quasiorthogonality assumption. (Legend: modified MD-UBSS; MD-UBSS with energy-based selection; MD-UBSS with optimal selection.)

Figure 11: Mixing matrix estimation: NMSE versus SNR for 4 speech sources and 3 sensors in the instantaneous mixture case.
Figure 13: NMSE versus SNR for 4 audio sources and 3 sensors in the convolutive mixture case: comparison, for the MD-UBSS algorithm, of the performance when the channel response H is known or disturbed by Gaussian noise, for different values of CNMSE. (Legend: UBSS algorithm with CNMSE = −15 dB; UBSS algorithm with CNMSE = −20 dB; UBSS algorithm with known H.)

8. CONCLUSION

This paper introduces a new blind separation method for audio-type sources using modal decomposition. The proposed method can separate more sources than sensors and, in that case, provides a better separation quality than the one obtained by pseudoinversion of the mixture matrix (even if the latter is known exactly) in the instantaneous mixture case. The separation method proceeds in two steps: an analysis step, in which all modal components are estimated, followed by a synthesis step that groups (clusters) the modal components and reconstructs the source signals. For the signal analysis step, two algorithms are used and compared, based respectively on the EMD and on the ESPRIT techniques. A modified MD-UBSS algorithm and a subspace projection approach are also proposed, respectively to relax the “quasiorthogonality” assumption and to allow the source signals to share common modal components. This approach leads to an improvement of the separation quality. For the convolutive mixture case, we propose to use again
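The ESPRIT-based analysis step can be illustrated with a minimal single-channel sketch that estimates the modal frequencies of a sum of sinusoids from the shift invariance of a Hankel data matrix. The signal, frequencies, and dimensions below are synthetic assumptions, not the paper's implementation:

```python
import numpy as np

# Synthetic single-channel signal: a sum of two real sinusoids whose
# normalized frequencies (0.1 and 0.23) play the role of modal components.
n = np.arange(256)
x = np.sin(2 * np.pi * 0.1 * n) + 0.5 * np.sin(2 * np.pi * 0.23 * n)

# Hankel data matrix and its dominant subspace (rank 4: two complex
# exponentials per real sinusoid).
L = 64
H = np.array([x[i:i + L] for i in range(len(x) - L + 1)]).T
U, _, _ = np.linalg.svd(H, full_matrices=False)
Us = U[:, :4]

# ESPRIT: the rotational invariance between the subspace and its
# one-sample shift yields the modal frequencies as eigenvalue angles.
phi = np.linalg.lstsq(Us[:-1], Us[1:], rcond=None)[0]
freqs = np.sort(np.abs(np.angle(np.linalg.eigvals(phi)))) / (2 * np.pi)
# freqs holds each frequency twice (conjugate pairs), near 0.1 and 0.23
```

In the multichannel setting of the paper, the components recovered in this way are then clustered in the synthesis step to reconstruct the individual sources.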
Abdeldjalil Aı̈ssa-El-Bey et al. 15