Abstract: To make audio watermarking provide both copyright protection and content authentication with tamper localization, a novel multipurpose audio watermarking scheme is proposed in this paper. The zero-watermarking idea is introduced into the design of the robust watermarking algorithm to ensure transparency and to avoid interference between the robust watermark and the semi-fragile watermark. The property of natural audio that the VQ indices of DWT-DCT coefficients among neighboring frames tend to be very similar is used to extract an essential feature from the host audio, which is then used for watermark extraction. The semi-fragile watermark, generated by a chaotic map, is embedded in the detail wavelet coefficients based on the instantaneous mixing model of the independent component analysis (ICA) system. Both the robust and semi-fragile watermarks can be extracted blindly, and the semi-fragile watermarking algorithm can localize tampering accurately. Simulation results demonstrate the effectiveness of the algorithm in terms of transparency, security, robustness and tampering localization ability.
tion, while the semi-fragile watermark is embedded in the high frequency range by quantizing single coefficient. Both the robust watermark and the semi-fragile watermark can be extracted without host audio. But, the semi-fragile watermarking scheme can not achieve tampering localization. A wavelet domain

VECTOR QUANTIZATION AND INDEPENDENT COMPONENT ANALYSIS

where $d(v, b_j)$ is the distortion between the input vector $v$ and the codeword $b_j$, and it can be calculated as follows:

$$d(v, b_j) = \sum_{l=0}^{k-1} (v_l - b_{jl})^2. \tag{2}$$

where s is the estimation of source signals vector.
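As a minimal sketch of how the distortion in Eq.(2) is typically used, the NumPy snippet below assigns each input vector the index of the codeword with the smallest squared-error distortion. The function name `vq_encode` and the toy codebook are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def vq_encode(vectors, codebook):
    """Return, for each input vector, the index of its nearest codeword."""
    vectors = np.asarray(vectors, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    # Squared-error distortion d(v, b_j) of Eq.(2) for every (vector, codeword) pair
    diff = vectors[:, None, :] - codebook[None, :, :]
    distortion = np.sum(diff ** 2, axis=2)
    # The VQ index is the codeword that minimises the distortion
    return np.argmin(distortion, axis=1)

# Toy data (hypothetical): three 2-D codewords and three input vectors
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])
vectors = np.array([[0.1, -0.2], [0.9, 1.2], [3.5, 4.1]])
indices = vq_encode(vectors, codebook)
print(indices)
```

Computing all pairwise distortions at once is convenient for small codebooks such as the 8-level one used later in the paper; for large codebooks a loop over codewords would use less memory.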
quence with the same size of $cD_mA_n$; finally, embed the watermark in $cD_mA_n$ based on the instantaneous mixing model of the ICA scheme (Fig.1).

Fig.1 Watermark embedding process (semi-fragile branch: wavelet decomposition, chaotic map, instantaneous mixing, wavelet reconstruction; robust branch: segmentation, wavelet decomposition, discrete cosine transform, VQ, binary pattern generation, XOR)

The specific procedures are as follows:
(1) Perform $m$-level wavelet decomposition on the host audio $s$ to get the coarse coefficients $cA_m$ and detail coefficients $cD_m,\ldots,cD_1$; then perform $n$-level wavelet decomposition on $cD_m$ to get its coarse coefficients $cD_mA_n$, denoted as $s_1=\{s_1(i)\,|\,i=1,\ldots,L_F\}$, and detail coefficients $cD_mD_n,\ldots,cD_mD_1$.
(2) Generate a chaotic sequence $c=\{c(i)\,|\,i=1,\ldots,L_F\}$ based on the logistic map [Eq.(5)] with the initial value $k_1\in(0, 1)$,

$$c(i+1) = 3.6 \cdot c(i)[1 - c(i)], \tag{5}$$

and map $c$ into a semi-fragile watermark $w_F=\{w_F(i)\,|\,i=1,\ldots,L_F\}$ with

$$w_F(i) = \begin{cases} -1, & \text{if } c(i) \ge 0.5, \\ 1, & \text{otherwise.} \end{cases} \tag{6}$$

(3) Embed $w_F$ into the wavelet coefficients $s_1$ based on the instantaneous mixing model of ICA as follows:

$$X = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = AS = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} s_1 \\ w_F \end{pmatrix}, \tag{7}$$

where $A$ is the mixing matrix. To ensure transparency, we make $a_{11} \gg a_{12}$ and $a_{21} \gg a_{22}$. When the values of $a_{11}$ and $a_{21}$ are fixed, increasing $a_{12}$ and $a_{22}$ increases robustness and decreases transparency.
(4) Replace $s_1$ with $x_1$. Perform two-stage wavelet reconstruction with $x_1$ and the other wavelet coefficients $cD_mD_n,\ldots,cD_mD_1,cA_m,cD_{m-1},\ldots,cD_1$ to get the watermarked audio signal $s'$. Keep $x_2$ as the secret key $k_2$ for watermark extraction.

Robust watermark embedding process
For natural audio signals, the VQ indices of DWT-DCT coefficients among neighboring frames tend to be very similar, so we make use of this property to generate a polarity $P$. The robust watermark is then embedded in the secret key by performing an exclusive-or (XOR) operation between the polarity $P$ and the watermark.
Let $w_R$ be the robust watermark, which is a binary-valued image of size $M \times N$. The specific embedding procedures can be described as follows:
(1) Segment $s'$ into $M \cdot N$ equal frames, denoted as $F=\{f(i)\,|\,i=1,\ldots,M \cdot N\}$.
(2) Perform $H$-level wavelet decomposition on each frame $f(i)$ to get its coarse signal $A^i_H$ and detail signals $D^i_H, D^i_{H-1}, \ldots, D^i_1$. To take advantage of the low-frequency coefficients, which have high energy and are robust against various signal processing manipulations, the DCT is performed only on $A^i_H$ to get $A^i_{HC}$:

$$A^i_{HC} = DCT(A^i_H). \tag{8}$$

(3) Perform the LBG algorithm (Gersho and Gray, 1992) on $A_{HC}=\{A^i_{HC}\,|\,i=1,\ldots,M \cdot N\}$ to get an $L$-level codebook $CB=\{cb_1,\ldots,cb_L\}$ and then perform VQ on $A_{HC}$ to get the indices vector $Y$:

$$Y = \bigcup_{i=1}^{M \cdot N} VQ(A^i_{HC}) = \bigcup_{i=1}^{M \cdot N} y(i). \tag{9}$$

(4) Generate a polarity $P$ as follows. First, calculate the variance of $y(i)$ and its surrounding indices with
520 Chen et al. / J Zhejiang Univ Sci A 2008 9(4):517-523
$$\sigma^2(i) = \frac{1}{3}\sum_{q=i-1}^{i+1} y^2(q) - \left(\frac{1}{3}\sum_{q=i-1}^{i+1} y(q)\right)^2. \tag{10}$$

Then, generate a polarity $P$ based on $\sigma^2(i)$ with

$$P = \bigcup_{i=1}^{M \cdot N} p(i), \tag{11}$$

$$p(i) = \begin{cases} 1, & \text{if } \sigma^2(i) \ge \operatorname{median}_i[\sigma^2(i)], \\ 0, & \text{otherwise.} \end{cases} \tag{12}$$

(5) Perform the XOR operation on $w_R$ and $P$ to get the secret key $k_3=\{k_3(l)\,|\,l=1,\ldots,M \cdot N\}$:

$$k_3(l) = w_R(i, j) \oplus p(l), \quad l = N(i-1) + j. \tag{13}$$

Finally, to prevent attackers from embedding another watermark in the same host audio signal and claiming the copyright, the host audio, the secret keys and the corresponding digital timestamp should be registered or associated with an authentication center for copyright demonstration.
In the watermark detection procedure, first generate a polarity $P'$ from the test audio $\hat{s}$, and then get the estimated robust watermark, denoted as $\hat{w}_R$, by performing the XOR operation between $P'$ and $k_3$. See Fig.2.

Semi-fragile watermark extraction process
The semi-fragile watermark can be extracted as follows (Fig.2).
(1) Perform two-stage wavelet decomposition on the test audio $\hat{s}$ to get the wavelet coefficients $cD_m(A_n)'$, denoted as $s_1'$.
(2) Generate the mixture observations $\hat{X}$ as

$$\hat{X} = \begin{pmatrix} s_1' \\ k_2 \end{pmatrix}, \tag{14}$$

and perform the FastICA algorithm (Hyvarinen, 1999) on $\hat{X}$ to get the estimated semi-fragile watermark $\hat{w}_F$.
(3) Calculate the cross-correlation coefficient $\lambda$ between $\hat{w}_F$ and the original semi-fragile watermark $w_F$, which is obtained by performing Eqs.(5) and (6) on the secret key $k_1$:

$$\lambda = \frac{\sum_{i=1}^{L_F} w_F(i) \cdot \hat{w}_F(i)}{\left[\sum_{i=1}^{L_F} w_F(i)^2 \cdot \sum_{i=1}^{L_F} \hat{w}_F(i)^2\right]^{1/2}}. \tag{15}$$

Fig.2 Watermark extraction process (semi-fragile branch: two-stage wavelet decomposition, chaotic map, FastICA, cross-correlation coefficient, threshold test on λ leading to either "audio is not tampered" or tampering localization; robust branch: binary pattern generation, XOR with $k_3$)

Then, the subsequent work can be classified into two types according to the value of $\lambda$:
(1) $|\lambda| \ge \tau$, where $\tau$ is a predetermined threshold (decreasing $\tau$ increases the false-alarm probability and decreases the false-dismissal probability). In this case, we believe that the test audio is secure and has not been tampered with.
(2) $|\lambda| < \tau$. In this case, we believe that the test audio has been tampered with, and the next step is to localize the tampering. First replace all the wavelet coefficients generated in the two-stage wavelet decomposition, except for $cD_m(A_n)'$, with zero vectors of corresponding length, and then perform two-stage wavelet reconstruction on them. Thus, the tampering can be localized accurately in the time domain.
As an example, Fig.3a shows a piece of speech signal with an embedded semi-fragile watermark. To test the tampering localization performance of the proposed semi-fragile watermarking algorithm, the watermarked speech is tampered with as follows: (1) The samples from 1 to 21 500 are set to zero; (2) The samples from 205 000 to 228 000 are replaced with the samples of another speech signal. The tampered speech signal and the tampering localization result are shown in Figs.3b and 3c, respectively.
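The semi-fragile watermark generation, the mixing step and the detection statistic can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the first sequence element is taken to be $k_1$ itself, values of $c(i)$ below 0.5 are mapped to +1, random numbers stand in for the wavelet coefficients $s_1$, and the FastICA separation is not reproduced (a deliberately corrupted copy of $w_F$ plays the role of a poor estimate). All function names are hypothetical.

```python
import numpy as np

def logistic_watermark(k1, length):
    """Eqs.(5)-(6): iterate the logistic map from k1 and threshold to +/-1.
    Assumptions: c(1) = k1, and values below 0.5 map to +1."""
    c = np.empty(length)
    c[0] = k1
    for i in range(length - 1):
        c[i + 1] = 3.6 * c[i] * (1.0 - c[i])
    return np.where(c >= 0.5, -1.0, 1.0)

def mix(s1, w_f, A):
    """Eq.(7): instantaneous mixing X = A S with S = [s1; w_F]."""
    return A @ np.vstack([s1, w_f])

def cross_correlation(w, w_hat):
    """Eq.(15): normalised cross-correlation coefficient lambda."""
    w = np.asarray(w, dtype=float)
    w_hat = np.asarray(w_hat, dtype=float)
    return float(np.sum(w * w_hat) / np.sqrt(np.sum(w ** 2) * np.sum(w_hat ** 2)))

def is_untampered(w, w_hat, tau=0.9):
    """Threshold test |lambda| >= tau (tau = 0.9 as in the paper's experiments)."""
    return abs(cross_correlation(w, w_hat)) >= tau

rng = np.random.default_rng(0)
s1 = rng.standard_normal(1000)              # stand-in for the coefficients cD_m A_n
A = np.array([[0.99, 0.01], [0.98, 0.02]])  # mixing matrix values from the paper
w_f = logistic_watermark(0.37, s1.size)     # 0.37 is an arbitrary example key k1
X = mix(s1, w_f, A)                         # X[0] replaces s1, X[1] is kept as k2

tampered = w_f.copy()
tampered[:300] *= -1.0                      # flip 30 % of the symbols: lambda = 0.4
print(is_untampered(w_f, -w_f), is_untampered(w_f, tampered))
```

The test on $-w_F$ illustrates why the absolute value of $\lambda$ is compared with $\tau$: ICA recovers sources only up to sign and scale, so a sign-inverted estimate should still count as a match.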
Fig.3 Tampering localization test results of the speech signal. (a) Watermarked speech; (b) Tampered watermarked speech; (c) Tampering localization result. (Axes: normalized amplitude versus sample index, ×10^5.)

Fig.4 Transparency test results of 201 pieces of signals. (a) Test results of music signals; (b) Test results of speech signals. (Axes: SNR in dB versus signal index.)
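The zero-watermarking step of the robust scheme, Eqs.(10) to (13), reduces to a local variance, a median threshold and an XOR. The sketch below assumes the VQ index vector $Y$ is already available (random indices are used here), and it repeats the first and last index so that every frame has a three-element window; the paper does not state how the boundary frames are handled, so that choice is an assumption. Function names are hypothetical.

```python
import numpy as np

def polarity(y):
    """Eqs.(10)-(12): local variance of VQ indices, thresholded at its median."""
    y = np.asarray(y, dtype=float)
    # Assumption: edge frames reuse their own index as the missing neighbour
    padded = np.pad(y, 1, mode="edge")
    windows = np.stack([padded[:-2], padded[1:-1], padded[2:]])
    variance = np.mean(windows ** 2, axis=0) - np.mean(windows, axis=0) ** 2
    return (variance >= np.median(variance)).astype(np.uint8)

def make_key(w_r, p):
    """Eq.(13): k3(l) = w_R(i, j) XOR p(l); row-major order gives l = N(i-1)+j."""
    return np.bitwise_xor(np.asarray(w_r).ravel().astype(np.uint8), p)

def extract_watermark(p_test, k3, shape):
    """Detection: XOR the polarity of the test audio with the key k3."""
    return np.bitwise_xor(p_test, k3).reshape(shape)

rng = np.random.default_rng(1)
M, N, L = 8, 8, 8                          # small toy sizes; the paper uses M = N = 64
y = rng.integers(0, L, size=M * N)         # stand-in for the VQ index vector Y
w_r = rng.integers(0, 2, size=(M, N))      # stand-in for the binary watermark image

p = polarity(y)
k3 = make_key(w_r, p)
w_r_hat = extract_watermark(p, k3, (M, N))
print(np.array_equal(w_r_hat, w_r))
```

Because the watermark lives only in the key $k_3$, the host audio is not modified by this step; extraction succeeds to the extent that the polarity computed from the test audio matches the one computed at embedding time.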
SIMULATION RESULTS AND ANALYSIS

A piece of music and a piece of speech, each of 524 288 samples, 16-bit signed and sampled at 44.1 kHz, were taken as the host audios. The parameter values used in this study were: $M=N=64$, $m=3$, $n=2$, $H=4$, $L=8$, $a_{11}=0.99$, $a_{12}=0.01$, $a_{21}=0.98$, $a_{22}=0.02$, $\tau=0.9$. Furthermore, the signal-to-noise ratio (SNR) and the normalized cross-correlation (NC) were employed to measure the transparency and robustness of the proposed scheme, respectively:

$$SNR(s, s') = 10 \cdot \lg\left(\frac{\sum_{i=1}^{L_s} s^2(i)}{\sum_{i=1}^{L_s} \left(s(i) - s'(i)\right)^2}\right), \tag{16}$$

$$NC(w_R, \hat{w}_R) = \frac{\sum_{i=1}^{M}\sum_{j=1}^{N} w_R(i, j) \cdot \hat{w}_R(i, j)}{\sqrt{\sum_{i=1}^{M}\sum_{j=1}^{N} w_R^2(i, j)} \cdot \sqrt{\sum_{i=1}^{M}\sum_{j=1}^{N} \hat{w}_R^2(i, j)}}, \tag{17}$$

where $L_s$ is the length of $s$.

Transparency test
The SNR between the original host music (resp. speech) and the watermarked music (resp. speech) was 40.4434 dB (resp. 33.8065 dB). It was difficult for the human ear to distinguish between them. To verify the stability of transparency, the transparency test was performed on another 200 pieces of music signals and speech signals. The results shown in Fig.4 verify that the proposed scheme has good and stable transparency.

Security test
To test the security of the proposed scheme, we attempted to extract the watermarks from non-watermarked audio signals with the keys needed for extracting $w_R$ and $w_F$ from the watermarked audio signal. In addition to the watermarked music (watermarked speech), another 200 pieces of music signals (speech signals) without watermarks were used in this study. The test results, including NC and cross-correlation coefficients, are shown in Figs.5 and 6 for music signals and speech signals, respectively. The peaks in these figures correspond to the watermarked music and watermarked speech. That is, the proposed technique correctly extracts the watermarks when the audio and keys match, and avoids false watermark estimation from unmatched audio, which indicates good security.

Fig.5 Security test results of 201 music signals. (a) Robust watermarking security; (b) Semi-fragile watermarking security. (Axes: NC and cross-correlation versus signal index.)
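The two quality measures of Eqs.(16) and (17) are straightforward to compute. The following NumPy sketch assumes that lg denotes the base-10 logarithm and that NC is normalised by the square roots of the two watermark energies; the test signals are synthetic and purely illustrative.

```python
import numpy as np

def snr_db(s, s_marked):
    """Eq.(16): signal-to-noise ratio in dB between host and watermarked audio."""
    s = np.asarray(s, dtype=float)
    s_marked = np.asarray(s_marked, dtype=float)
    return 10.0 * np.log10(np.sum(s ** 2) / np.sum((s - s_marked) ** 2))

def nc(w, w_hat):
    """Eq.(17): normalized cross-correlation between two watermark images."""
    w = np.asarray(w, dtype=float)
    w_hat = np.asarray(w_hat, dtype=float)
    return np.sum(w * w_hat) / (np.sqrt(np.sum(w ** 2)) * np.sqrt(np.sum(w_hat ** 2)))

# Synthetic host: a 440 Hz tone at 44.1 kHz; the "watermark" is a 1 % amplitude change
t = np.arange(44100) / 44100.0
host = np.sin(2.0 * np.pi * 440.0 * t)
marked = 1.01 * host
print(snr_db(host, marked))   # distortion energy is 1e-4 of the signal energy

w = np.array([[1, 0], [1, 1]])
print(nc(w, w), nc(w, 1 - w))
```

A distortion whose energy is 1e-4 of the signal energy corresponds to 40 dB, which is the same order as the 40.4434 dB reported for the watermarked music above.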
tampering localization results, the manipulations shown in Table 1 were performed on the watermarked audio signals and the tampered ones, and then the

robust watermarking algorithm against the attacks provided by practical audio watermarking evaluation tool “Stirmark for Audio v0.2” was compared with that of the scheme proposed in (Wang and Zhao, 2006). The comparison results summarized in Table 2 indicate that the performance of our method is better than that of the scheme proposed by Wang and Zhao (2006).

Tampering localization test
The tampering localization test result shown in Fig.3 indicates that the adopted semi-fragile water-

Fig.8 Tampering localization under commonly used audio signal processing manipulations. (A) Results for speech signal; (B) Results for music signal. (Horizontal axis: manipulations (a) to (h); panels labelled "After tampering".)
Table 2 Comparison of robustness against attacks provided by “Stirmark for Audio v0.2”

Attacks            Music (Ours)  Music (Wang's)  Speech (Ours)  Speech (Wang's)
addnoise_900       0.7630        0.5208          0.9301         0.8909
addsinus           0.9055        0.7084          0.7316         0.8821
compressor         0.7295        0.2689          0.9940         0.5553
fft_real_reverse   0.9797        0.9856          1.0000         0.9991
zero_cross         0.9739        0.9905          0.8830         0.4200
smooth             0.9716        0.3475          0.9957         0.9991
smooth2            0.8995        0.3378          0.9948         0.9389
stat1              0.9492        0.3181          0.9285         0.4200
stat2              0.9941        0.9023          0.9974         1.0000

CONCLUSION AND FUTURE WORK

A novel multipurpose audio watermarking scheme is proposed in this paper. The robust and semi-fragile watermarks are embedded in the host audio simultaneously to provide copyright protection as well as content authentication with localization. Compared with available multipurpose audio watermarking algorithms, the advantages of the proposed algorithm include: (1) The robust watermark and the semi-fragile watermark can be embedded without considering embedding order and be extracted independently; (2) Both the robust and semi-fragile watermarking achieve satisfactory security; (3) Both the robust and the semi-fragile watermarking have certain robustness against audio signal processing manipulations; (4) The semi-fragile watermarking can localize tampering accurately. However, some issues still deserve further exploration: (1) The semi-fragile watermarking algorithm cannot tell what kind of tampering the watermarked audio has suffered; (2) The psychoacoustic model is not adopted in the proposed scheme.

References
Cao, H.Q., Xiang, H., Li, X.T., Liu, M., Yi, S., Wei, F., 2006. A Zero-Watermarking Algorithm Based on DWT and Chaotic Modulation. Proc. SPIE, 6247:624716-1-624716-9. [doi:10.1117/12.663927]
Ding, X.Y., 2006. Study on the Digital Audio Watermarking Scheme Based on Blind Source Separation. MS Thesis, School of Electronic and Information Engineering, Dalian University of Technology, p.26-35 (in Chinese).
Gersho, A., Gray, R.M., 1992. Vector Quantization and Signal Compression. Kluwer Academic Publishers, Boston.
Hyvarinen, A., 1999. Fast and robust fixed-point algorithms for independent component analysis. IEEE Trans. on Neural Networks, 10(3):626-634. [doi:10.1109/72.761722]
Ji, Z., Xiao, W.W., Wang, J.H., Zhang, J.H., 2003. A multiple watermarking algorithm for digital image based on chaotic sequences. Chin. J. Computers, 26(11):1555-1561 (in Chinese).
Li, J., Liu, F.L., 2007. Double Zero-Watermarks Scheme Utilizing Scale Invariant Feature Transform and Log-polar Mapping. Proc. IEEE Int. Conf. on Multimedia and Expo, p.2118-2121.
Lu, C.S., Mark Liao, H.Y., Chen, L.H., 2000. Multipurpose Audio Watermarking. Proc. 15th Int. Conf. on Pattern Recognition, p.282-285.
Lu, Z.M., Xu, D.G., Sun, S.H., 2005. Multipurpose image watermarking algorithm based on multistage vector quantization. IEEE Trans. on Image Processing, 14(6):822-831. [doi:10.1109/TIP.2005.847324]
Lu, Z.M., Zheng, W.M., Pan, J.S., Sun, Z., 2006. Multipurpose image watermarking method based on mean-removed vector quantization. J. Inf. Assur. Secur., 1(1):33-42.
Ma, X.H., Liang, Z.J., Yin, F.L., 2006. A Digital Audio Watermarking Scheme Based on Independent Component Analysis and Singular Value Decomposition. Proc. Fifth Int. Conf. on Machine Learning and Cybernetics, p.2434-2438. [doi:10.1109/ICMLC.2006.258775]
Mintzer, F., Braudaway, G.W., 1999. If One Watermark Is Good, Are More Better? Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, 4:2067-2069.
Sang, J., Liao, X.F., Alam, M.S., 2006. Neural-network-based zero-watermark scheme for digital images. Opt. Eng., 45(9):097006-1-097006-9. [doi:10.1117/1.2354076]
Wang, R.D., Xu, D.W., Li, Q., 2005. Multiple Audio Watermarks Based on Lifting Wavelet Transform. Proc. IEEE Int. Conf. on Machine Learning and Cybernetics, p.1959-1964. [doi:10.1109/ICMLC.2005.1527266]
Wang, X.Y., Zhao, H., 2006. A novel synchronization invariant audio watermarking scheme based on DWT and DCT. IEEE Trans. on Signal Processing, 54(12):4835-4840. [doi:10.1109/TSP.2006.881258]
Wang, X.Y., Cui, Y.R., Yang, H.Y., Zhao, H., 2004. A New Content-based Digital Audio Watermarking Algorithm for Copyright Protection. Proc. Third Int. Conf. on Information Security, 85:62-68.
Xiong, S.H., Zhou, J.L., He, K., Lang, F.N., 2006. A Multipurpose Image Watermarking Method Based on Adaptive Quantization of Wavelet Coefficients. Proc. First Int. Multi-Symp. on Computer and Computational Sciences, 1:294-297. [doi:10.1109/IMSCCS.2006.14]
Yuan, J., Cui, G.H., Zhang, Y.J., 2006. A Practical Multipurpose Color Image Watermarking Algorithm for Copyright Protection and Image Authentication. Proc. Int. Conf. on Digital Telecommunications, p.72-75. [doi:10.1109/ICDT.2006.10]