Spread Spectrum-Based Multi-bit Watermarking
for Free-View Video
Huawei Tian1,2, Zheng Wang1,2, Yao Zhao1,2, Rongrong Ni1,2, and Lunming Qin1,2
1 Institute of Information Science, Beijing Jiaotong University, Beijing 100044, China
2 Beijing Key Laboratory of Advanced Information Science and Network Technology,
Beijing 100044, China
hwtian@live.cn
Abstract. In a Free-View Television (FTV) system, the user can freely generate a
realistic arbitrary view of a scene from a number of original views. The emerging
FTV system therefore raises a copyright problem for free-view video content.
In this paper, we propose a spread spectrum-based multi-bit watermarking
scheme for free-view video. The same watermark sequence is embedded into
every frame of multiple views. Watermark extraction is carried out in the
DCT domain of the virtual frame generated for an arbitrary view. Experimental
results show that the watermark in FTV not only resists common signal
processing but can also be detected from the virtual frame generated for an
arbitrary view.
Keywords: Free-view television, light field rendering, multi-view video,
multi-bit watermarking, 3-D watermarking.
1 Introduction
Digital media have spread widely with the growth of information science and
Internet technology. However, convenient manipulation and unrestricted copying of
digital media cause considerable financial loss to content providers and
media creators. Digital watermarking was introduced to counter such infringement.
Mono-view video watermarking has been widely studied [1-3] as a popular and
powerful technique for copyright protection in video transmission and processing.
It embeds copyright information in the mono-view video, and the ownership of the
video can be verified by detecting the embedded copyright information.
Recently, generating a realistic arbitrary view of a scene from a number of original
views has become faster and cheaper with advances in image-based rendering (IBR)
[4]. One of the main applications is FTV, where viewers can freely select the viewing
position and angle via IBR on the transmitted multi-view video. As with the earlier
copyright problems for mono-view video, the copyright problem for multi-view video
can also be addressed with watermarking. However, the requirements are more
challenging than those of well-studied mono-view video watermarking [5]. The
owner of the multi-view video should be able to prove his/her ownership not only on
the original views of the multi-view video, but also on any virtual view that the
user generates from the original views using IBR.
Y.Q. Shi, H.J. Kim, and F. Perez-Gonzalez (Eds.): IWDW 2011, LNCS 7128, pp. 156–166, 2012.
© Springer-Verlag Berlin Heidelberg 2012
Alper Koz et al. propose in [5] and [6] a watermarking approach that inserts the
watermark into each view frame of the multi-view video. The watermark is modulated
with the output image obtained by filtering each view frame with a high-pass
filter, and is then added spatially onto the view frame. The watermark is a sequence
drawn from a Gaussian distribution with zero mean and unit variance. The well-known
correlation-based detection scheme is used for watermark extraction: if the
correlation coefficient is large enough, the watermark is declared present.
In effect, this approach embeds only one bit of information, i.e., the
presence or absence of the watermark.
In this work, we propose a multi-bit watermarking scheme for free-view video.
Spread spectrum-based direct-sequence code division multiple access (DS-CDMA)
[7] watermarking is used to embed a multi-bit watermark sequence into the discrete
cosine transform (DCT) domain of each view frame. The watermark sequence is
extracted bit-by-bit with a correlation detector from a watermarked view frame or a
virtual frame generated for an arbitrary view. In general, the detection algorithm
should include a procedure to determine the position and direction of the virtual
camera, because the watermark detector does not know this information. However,
this estimation problem has already been investigated by Koz et al. in [5], so
we assume that the position and direction of the virtual camera are known a priori
in our proposed watermark extraction method.
The rest of the paper is organized as follows. Section 2 introduces the light field
rendering (LFR) [8] approach, which is one of the competing IBR technologies
for FTV systems [9]. Section 3 describes the details of the watermark embedding and
detection procedures. The experimental results are presented in Section 4, and
conclusions are drawn in Section 5.
2 Light Field Rendering
In the literature, the light field approach is the best-known and preferred IBR
technique. First, it does not require any geometry information but relies only on
scene images, which are easy to capture with common digital products. Second,
it avoids building complex models, such as depth values or image correspondences, to
extract the image values. Third, new views can be constructed in real time, and the
cost is independent of the scene complexity (it depends only on the size of the
rendered image). The basic assumption behind this technique is that the radiance
along a ray remains constant if there are no occluders. A light field is then built
to capture all the necessary rays within a certain sub-space so that every possible
view within a region can be synthesized [8].
Fig. 1. A representation of the light field
Fig. 2. A sample light field image array: Dragon [11]
In practice, a light ray is usually parameterized by its intersections with two
parallel planes, namely the camera plane and the focal plane (see Fig. 1). In Fig. 1, a
light ray is shown and indexed as an integer 4-tuple $(u_0, v_0, s_0, t_0)$, where
$(u_0, v_0)$ and $(s_0, t_0)$ are the intersections of the light ray with the camera
and focal planes, respectively. The two planes are usually discretized so that a
finite number of light rays can be recorded.
If the light rays from all the points on the focal plane arrive at one point on the camera
plane, then an image is generated (a 2D array of light rays). Therefore, the two planes
can also be interpreted as a 2D array of images, as shown in Fig. 2. To generate a
virtual view of the object for a randomly selected viewpoint, the light ray for each
pixel of the rendered image is calculated by quadrilinear interpolation of the existing
nearby light rays in the image array. Nearest-neighbor interpolation and bilinear
interpolation are two interpolation methods used in LFR. Bilinear interpolation gives
more natural and subjectively pleasing outputs than nearest-neighbor interpolation, so
we choose bilinear interpolation in our simulations (Section 4).
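The interpolation step can be illustrated with a short sketch. It assumes, purely for illustration, that the light field is stored as a grayscale 4-D NumPy array indexed as lf[u, v, s, t] and that the ray for a rendered pixel has already been mapped to fractional indices on the two planes; the function name sample_light_field is hypothetical and the mapping from a virtual camera to those indices is not shown.

```python
import numpy as np

def sample_light_field(lf, u, v, s, t):
    """Quadrilinear interpolation of a 4-D array lf[u, v, s, t] at fractional indices."""
    base, frac = [], []
    for coord, n in zip((u, v, s, t), lf.shape):
        coord = min(max(float(coord), 0.0), n - 1.0)  # clamp to the sampled grid
        i0 = min(int(coord), max(n - 2, 0))           # lower corner of the enclosing cell
        base.append(i0)
        frac.append(coord - i0)
    value = 0.0
    for corner in range(16):                          # 2^4 corners of the 4-D cell
        weight, index = 1.0, []
        for axis in range(4):
            bit = (corner >> axis) & 1
            weight *= frac[axis] if bit else 1.0 - frac[axis]
            index.append(min(base[axis] + bit, lf.shape[axis] - 1))
        value += weight * lf[tuple(index)]
    return value
```

Each rendered pixel is thus a weighted sum of up to 16 recorded rays, i.e. bilinear weights on the camera plane multiplied by bilinear weights on the focal plane.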
Fig. 3. Illustration of the watermark embedding procedure. ①~⑤ indicate Step 1) ~ Step 5).
3 Proposed Watermarking Method
3.1 Watermarking Embedding

The proposed scheme uses the spread spectrum-based DS-CDMA watermarking scheme [10],
which is well known for its robustness to common signal processing attacks, to embed
the watermark into every image of the light field image array. The watermark
embedding procedure is illustrated in Fig. 3 and summarized as follows:
1) Generate M 1-D binary pseudo-random sequences $p_i$, $i = 1, \ldots, M$, as
signature patterns using the private key as the seed. Each of these sequences has
zero mean and takes values from the binary alphabet {-1, 1}. M is the number of
bits in the watermark message. The length of $p_i$ is N, with N > M;
2) Create a 1-D DS-CDMA watermark signature $W_1$ by modulating the
watermark message with the patterns generated in Step 1), i.e.,
$$W_1 = \sum_{i=1}^{M} w_i p_i,$$
where $w_i$ is the ith bit (i.e., -1 or 1) of the watermark
message $w = [w_1\ w_2\ \cdots\ w_i\ \cdots\ w_M]$;
3) Convert the 1-D signature $W_1$ into a 2-D signature $W_2$ by placing it along a
zigzag scan at pre-selected positions (e.g., mid-range DCT coefficients); all other
coefficients are set to zero;
4) Apply the inverse discrete cosine transform (IDCT) to the 2-D signature $W_2$
to produce $W$;
5) Embed the final watermark signature $W$ into each original light field
image $I$ using the formula
$$I_w = I + \alpha W,$$
where $\alpha$ is the watermarking strength. This produces the watermarked light field
image $I_w$.
The whole procedure is equivalent to embedding the watermark signature into
the DCT domain of the light field image. The advantage of adding the signature in the
spatial domain is that the original image itself is never transformed, which avoids
any distortion that might otherwise be incurred [10].
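Steps 1) to 5) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the image is assumed square and grayscale, the "pre-selected" coefficients are taken to be N consecutive zigzag positions starting at a hypothetical parameter offset, and the patterns are drawn with NumPy's default generator, so they are zero mean only in expectation. All function names are illustrative.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C: C @ x @ C.T is the 2-D DCT, C.T @ X @ C its inverse."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def zigzag_indices(n):
    """(row, col) pairs of an n x n grid in zigzag order, lowest frequencies first."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def make_patterns(key, m, n):
    """Step 1: M key-seeded pseudo-random sequences of length N over {-1, +1}."""
    rng = np.random.default_rng(key)
    return rng.choice([-1.0, 1.0], size=(m, n))

def spatial_signature(seq, size, offset):
    """Steps 3-4: place seq on zigzag positions starting at offset, then apply the IDCT."""
    rows, cols = np.array(zigzag_indices(size)[offset:offset + len(seq)]).T
    grid = np.zeros((size, size))
    grid[rows, cols] = seq
    c = dct_matrix(size)
    return c.T @ grid @ c

def embed(image, bits, key, n, offset, alpha):
    """Steps 2 and 5: W1 = sum_i w_i p_i, then I_w = I + alpha * W."""
    w1 = np.asarray(bits, dtype=float) @ make_patterns(key, len(bits), n)
    return image + alpha * spatial_signature(w1, image.shape[0], offset)
```

Because the DCT matrix is orthonormal, taking the 2-D DCT of $I_w - I$ returns exactly $\alpha W_2$, which is the sense in which the spatial addition is equivalent to DCT-domain embedding.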
Fig. 4. Illustration of the watermark extraction procedure. ①~⑤ indicate Step 1) ~ Step 5).
3.2 Watermarking Extraction

Rather than dealing with the general attacks on image and video watermarking, the
major challenge in FTV is extracting the watermark message from an arbitrary view
generated by LFR. The strategy for estimating the position and rotation of the
virtual view has been investigated by Koz et al. in [5], so we focus only on the
case where the position and rotation of the virtual camera are known. The following
steps are taken to decode the embedded watermark message from a rendered image $I_{wr}$:
1) Regenerate the 1-D binary pseudo-random sequences $p_i$, $i = 1, \ldots, M$, using
the same key as in Step 1) of watermark embedding. M is the number of bits
in the watermark message. Each of these sequences has zero mean and takes
values from the binary alphabet {-1, 1};
2) Convert each 1-D pseudo-random sequence $p_i$ into a 2-D pattern $p_i'$ by placing
it along a zigzag scan at the pre-selected positions (e.g., mid-range DCT
coefficients); all other coefficients are set to zero;
3) Apply the IDCT to the 2-D pattern $p_i'$ to produce $P_i$;
4) Apply to $P_i$ the same rendering operations that were used to generate the
arbitrary view, in order to obtain a rendered watermark pattern $P_i^r$ (assuming the
position and rotation of the virtual camera are known);
5) Decode the watermark message bit-by-bit using a correlation detector. That is,
the ith bit of the watermark message is decoded as
$$\hat{w}_i = \begin{cases} 1, & \mathrm{corr}(I_{wr}, P_i^r) \ge 0 \\ -1, & \mathrm{corr}(I_{wr}, P_i^r) < 0, \end{cases}$$
where $\mathrm{corr}(\cdot)$ is the correlation of two vectors. The extracted watermark message is
$\hat{w} = [\hat{w}_1\ \hat{w}_2\ \cdots\ \hat{w}_i\ \cdots\ \hat{w}_M]$, $\hat{w}_i \in \{-1, 1\}$.
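The decoding steps can be sketched in the same spirit. The sketch below makes several assumptions that are not in the paper: the rendering operation of Step 4) is passed in as a hypothetical callable render, which defaults to the identity (the special case where the virtual view coincides with an original view); corr is taken to be the Pearson correlation coefficient, of which only the sign is used; and the pre-selected coefficients are N consecutive zigzag positions starting at a hypothetical offset.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C: C.T @ X @ C is the 2-D IDCT of X."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def zigzag_indices(n):
    """(row, col) pairs of an n x n grid in zigzag order, lowest frequencies first."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def spatial_pattern(seq, size, offset):
    """Steps 2-3: place seq on zigzag positions starting at offset, then apply the IDCT."""
    rows, cols = np.array(zigzag_indices(size)[offset:offset + len(seq)]).T
    grid = np.zeros((size, size))
    grid[rows, cols] = seq
    c = dct_matrix(size)
    return c.T @ grid @ c

def extract(rendered, key, m, n, size, offset, render=lambda img: img):
    """Steps 1, 4 and 5: regenerate each p_i, render P_i, decode by the sign of corr."""
    rng = np.random.default_rng(key)
    patterns = rng.choice([-1.0, 1.0], size=(m, n))
    decoded = np.empty(m)
    for i in range(m):
        p_r = render(spatial_pattern(patterns[i], size, offset))
        corr = np.corrcoef(rendered.ravel(), p_r.ravel())[0, 1]
        decoded[i] = 1.0 if corr >= 0 else -1.0
    return decoded
```

For a virtual view, render would have to reproduce the light field rendering with the known camera position and rotation, so that $P_i^r$ undergoes the same geometric transformation as the watermark contained in $I_{wr}$.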

Fig. 5. Location of camera & focal plane for Dragon light field
4 Experimental Results
A common light field, Dragon [11], is used in the simulations. The parameterization
of the focal and camera planes for the Dragon light field is shown in Fig. 5. Each
Dragon light field image is 256 × 256 pixels. The watermarking strength $\alpha$ is set
to 1.2. The length of the pseudo-random sequence $p_i$ is set to N = 30000. The length
of the watermark message is M = 50 bits. In the simulations, the watermark message is
embedded only into the brightness component of the color image. The capacity of
the watermarking scheme would triple if the watermark message were embedded into all
three components (i.e., the RGB channels). The decoding bit-error rate (BER), defined
as the ratio between the number of incorrectly decoded bits and the total number of
embedded bits, is used to evaluate the robustness of the watermarking scheme. Twenty
different randomly generated watermark sequences are tried, and the BER is taken as
the average over the 20 cases.
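As a small illustration of this metric, the sketch below computes the BER for one watermark and its average over several trials; the function names are illustrative. Assuming one decoded view per trial, M = 50 bits and 20 trials give 1000 decoded bits in total, so the averaged BER can only change in steps of 0.001.

```python
import numpy as np

def bit_error_rate(embedded, decoded):
    """Fraction of positions where the decoded bit differs from the embedded bit."""
    embedded = np.asarray(embedded)
    decoded = np.asarray(decoded)
    return float(np.mean(embedded != decoded))

def average_ber(trials):
    """Mean BER over (embedded, decoded) pairs, one pair per random watermark."""
    return float(np.mean([bit_error_rate(e, d) for e, d in trials]))
```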
4.1 Imperceptibility Test

Typical rendered views of the original and watermarked Dragon light fields are
presented in Fig. 6. The virtual camera is located at [0 0 2] with the normal direction
[0 0 -1] (Position-A in Fig. 5). The peak signal-to-noise ratio (PSNR) between
Fig. 6 (a) and Fig. 6 (b) is 32.6 dB. Fig. 6 (c) shows the difference between
Fig. 6 (a) and Fig. 6 (b), multiplied by 2 for better display. Another example
is given in Fig. 7, where the virtual camera is located at [1.5 0 2] with the direction
[-1 0 -1] (Position-E in Fig. 5). The PSNR between Fig. 7 (a) and Fig. 7 (b) is
34.2 dB. Fig. 6 and Fig. 7 show that the fidelity of the watermarked views is high.
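For reference, the PSNR between an original and a watermarked rendered view can be computed as in the sketch below. The peak value of 255 is an assumption (8-bit images); the paper does not state the peak it uses.

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    diff = np.asarray(reference, dtype=float) - np.asarray(test, dtype=float)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))
```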
Fig. 6. (a) Rendered view of the virtual camera at Position-A in the Dragon light field,
(b) watermarked view at Position-A in the Dragon light field, (c) the difference between
(a) and (b), multiplied by 2 for better display.
Fig. 7. (a) Rendered view of the virtual camera at Position-E in the Dragon light field,
(b) watermarked view at Position-E in the Dragon light field, (c) the difference between
(a) and (b), multiplied by 2 for better display.
4.2 Robustness Test for Rendering

In the robustness tests, the extraction scheme is applied to different virtual views,
which are determined by the virtual camera position and orientation. In the Dragon
light field, Position-A in Fig. 5 is taken as the reference for describing translation
and rotation in the results. The camera position of Position-A is [0 0 2] and its
normal direction is [0 0 -1]. Six cases are considered in the simulations, as shown in
Table 1. These cases cover translation, rotation and scaling of the rendered views.
The results for the six cases, evaluated as the average BER over
20 random watermark sequences, are shown in Table 2.
From Table 2 we can see that the proposed watermarking scheme performs very
well on different virtual views. For Cases I, II, IV and V, the BER values are all
below 1%; in particular, the BER is zero in Cases I and II. For Cases III and VI, the
watermark energy is reduced considerably because the rendered view shrinks.
Nevertheless, BER = 3.3% is a satisfactory value for Case III, where the shrinking is
very severe. The proposed watermarking scheme is therefore effective under rendering.
Table 1. Six cases for the creation of rendered views in the Dragon light field
Case      Translation on uv-plane?  Translation on z-axis?  Rotation?  Position     Direction   Label
Case I    No                        No                      No         [0 0 2]      [0 0 -1]    A
Case II   Yes                       No                      No         [0.5 0 2]    [0 0 -1]    B
Case III  Yes                       Yes                     No         [0.5 0 3]    [0 0 -1]    C
Case IV   Yes                       No                      Yes        [0 2 2]      [0 -1 -1]   D
Case V    Yes                       No                      Yes        [1.5 0 2]    [-1 0 -1]   E
Case VI   Yes                       Yes                     Yes        [1.5 0 2.5]  [-1 0 -1]   F
Table 2. Robustness test for six cases of light field rendering
       Case I  Case II  Case III  Case IV  Case V  Case VI
BER    0       0        0.033     0.005    0.002   0.015
Table 3. Robustness (BER) against various attacks of the proposed watermarking scheme
Attack                               Case I  Case II  Case III  Case IV  Case V  Case VI
No attacks                           0       0        0.033     0.005    0.002   0.015
Median filter 2×2                    0.015   0.014    0.170     0.14     0.094   0.137
Median filter 3×3                    0.007   0.029    0.192     0.117    0.121   0.155
Mean filter 2×2                      0.013   0.013    0.165     0.142    0.090   0.125
Mean filter 3×3                      0.034   0.061    0.229     0.180    0.143   0.208
Gaussian filter 3×3                  0       0        0.056     0.023    0.014   0.033
Uniform noise (β = 0.01)             0       0        0.032     0.006    0.002   0.015
Uniform noise (β = 0.02)             0       0        0.033     0.005    0.002   0.015
Uniform noise (β = 0.03)             0       0        0.033     0.005    0.002   0.015
Salt & pepper noise (scale = 0.05)   0.056   0.068    0.211     0.125    0.134   0.162
Salt & pepper noise (scale = 0.08)   0.113   0.133    0.271     0.189    0.172   0.204
Gaussian noise (var. = 0.01)         0.011   0.009    0.138     0.076    0.052   0.089
Gaussian noise (var. = 0.02)         0.026   0.040    0.188     0.136    0.101   0.152
Gaussian noise (var. = 0.04)         0.083   0.099    0.278     0.213    0.183   0.227
JPEG 80                              0       0        0.041     0.007    0.003   0.019
JPEG 60                              0       0        0.060     0.029    0.016   0.043
JPEG 40                              0.001   0.007    0.107     0.052    0.033   0.067
JPEG 30                              0.006   0.012    0.125     0.070    0.057   0.087
Cropping 5                           0       0        0.033     0.005    0.003   0.026
Cropping 20                          0       0        0.035     0.005    0.007   0.054
Cropping 30                          0       0        0.044     0.005    0.019   0.076
4.3 Robustness Test against Other Attacks

We also evaluate the robustness of the watermarking method against common signal
processing attacks, because they could occur in the transmission chain of FTV. The
performance of the proposed scheme under various common signal processing attacks
is shown in Table 3. These attacks include median filtering with sizes 2 × 2 and
3 × 3, mean filtering, Gaussian filtering with size 3 × 3, adding uniform noise,
salt & pepper noise and Gaussian noise, JPEG compression and center cropping. The
Gaussian filter matrix is
$$\begin{bmatrix} 0.0113 & 0.0838 & 0.0113 \\ 0.0838 & 0.6193 & 0.0838 \\ 0.0113 & 0.0838 & 0.0113 \end{bmatrix}.$$
The attacked image after adding uniform noise is
$$I'(x, y) = I(x, y) \cdot (1 + \beta \cdot n(x, y)),$$
where $I(x, y)$ is the pixel grayscale value of an input image at $(x, y)$, $\beta$ is a
parameter that controls the strength of the noise, $n(x, y)$ is noise with
uniform distribution, zero mean and unit variance, and $I'(x, y)$ is the pixel grayscale
value of the attacked image.
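The two attacks defined above can be sketched as follows. The uniform noise is drawn from $[-\sqrt{3}, \sqrt{3}]$, which is the uniform distribution with zero mean and unit variance. The value $\sigma = 0.5$ for the Gaussian filter is an assumption: the paper does not state it, but a sampled and normalized 3 × 3 Gaussian with this $\sigma$ reproduces the matrix above to four decimals. The function names are illustrative.

```python
import numpy as np

def uniform_noise_attack(image, beta, rng):
    """I'(x, y) = I(x, y) * (1 + beta * n(x, y)) with n uniform, zero mean, unit variance."""
    half_width = np.sqrt(3.0)  # Var of U(-a, a) is a^2 / 3, so a = sqrt(3) gives variance 1
    noise = rng.uniform(-half_width, half_width, size=image.shape)
    return image * (1.0 + beta * noise)

def gaussian_kernel(size=3, sigma=0.5):
    """Sampled 2-D Gaussian, normalized so that the weights sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()
```

Because the noise term is multiplied by $I(x, y)$, its amplitude scales with the local pixel value, so dark regions are perturbed less than bright ones.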
From Table 3 we can see that the proposed watermarking scheme is resistant not only
to common signal processing alone but also to signal processing attacks combined with
light field rendering in the six cases. The scheme performs especially well against
Gaussian filtering, uniform noise, JPEG compression and cropping.
5 Conclusion
In the emerging FTV system, the requirements are more challenging than those of
well-studied mono-view video watermarking. The ownership of the multi-view video
should be provable not only on the original views of the multi-view video, but also on
any virtual view generated for an arbitrary viewpoint. Alper Koz et al. propose a
watermarking approach for free-view video. However, it is only a one-bit
watermarking scheme. In this paper, a multi-bit watermarking scheme for free-view
video is proposed. The watermark message is embedded into every frame of multiple
views using the DS-CDMA embedding method. Watermark extraction is carried out
in the DCT domain of the virtual frame generated for an arbitrary view with a
correlation detector. Experimental results show that the watermark for FTV can be
detected from virtual views generated for arbitrary viewpoints. Moreover, the proposed
scheme is resistant to common signal processing including lowpass filtering, adding
noise, JPEG compression and cropping. More encouragingly, the watermarking
scheme is robust against signal processing attacks combined with the light field
rendering operation.
Acknowledgments. This work was supported in part by 973 Program
(2011CB302204), the National Science Foundation of China for Distinguished Young
Scholars (61025013), Sino-Singapore JRP (2010DFA11010), National NSF of China
(61073159) and Fundamental Research Funds for the Central Universities
(2009JBZ006, 2011YJS292).
References
1. Barni, M., Bartolini, F., Checcacci, N.: Watermarking of MPEG-4 Video Objects. J. IEEE
Trans. Multimedia 7(1), 23–32 (2005)
2. Hsu, C., Wu, J.: Digital Watermarking for Video. In: Proc. IEEE Int. Conf. Digital Signal
Processing, vol. 1, pp. 217–220. IEEE Press, Santorini (1997)
3. Tian, H., Zhao, Y., Ni, R., Cao, G.: Geometrically robust image watermarking by sector-
shaped partitioning of geometric-invariant regions. J. Optics Express 17(24), 21819–21836
(2009)
4. Zhang, C., Chen, T.: A Survey on Image Based Rendering-representation, Sampling and
Compression. J. EURASIP Signal Process.: Image Commun. 19(1), 1–28 (2004)
5. Koz, A., Cigla, C., Alatan, A.A.: Watermarking of Free-view Video. J. IEEE Trans. Image
Process. 19(7), 1785–1797 (2010)
6. Koz, A., Cigla, C., Alatan, A.A.: Free-View Watermarking for Free-View Television. In:
2006 IEEE International Conference on Image Processing, Atlanta, pp. 1405–1408 (2006)
7. Cox, I.J., Kilian, J., Leighton, F.T., Yang, Y., Shamoon, T.: Secure spread spectrum
watermarking for multimedia. J. IEEE Trans. Image Process. 6(12), 1673–1687 (1997)
8. Levoy, M., Hanrahan, P.: Light field rendering. In: Proc. ACM Siggraph 1996, New
Orleans, pp. 31–42 (1996)
9. Tanimoto, M.: FTV (Free-viewpoint Television) creating ray-based image engineering. In:
Proc. IEEE Int. Conf. Image Process., Genova, vol. 2, pp. 25–28 (2005)
10. Dong, P., Brankov, J.G., Galatsanos, N.P., Yang, Y., Davoine, F.: Digital Watermarking
Robust to Geometric Distortions. J. IEEE Trans. Image Process. 14(12), 2140–2150 (2005)
11. The Stanford Light Field Archive,
http://graphics.stanford.edu/software/lightpack/lifs.html
