
AN EFFICIENT COMPRESSION ALGORITHM FOR HYPERSPECTRAL IMAGES
BASED ON A MODIFIED CODING FRAMEWORK OF H.264/AVC*


Guizhong Liu¹, Fan Zhao¹, Guofu Qu²

¹ School of Electronic and Information Engineering, Xian Jiaotong University, 710049, Xian, China
² School of Machinery and Precision Instrument Engineering, Xian University of Technology, 710048, Xian, China
Email: liugz@xjtu.edu.cn, zhaofan@mail.xjtu.edu.cn, quguofu@gmail.com


* This work is supported in part by the National Natural Science Foundation of China (NSFC) under Project No. 60572045 and the Ministry of Education
of China Ph.D. Program Foundation under Project No. 20050698033.

ABSTRACT

In this paper, an efficient compression algorithm for
hyperspectral images is proposed, based on a modified
coding framework of H.264/AVC. Exploiting the flexible
and diverse prediction modes of H.264/AVC, the most
suitable mode is assigned to each macroblock (a 16×16
pixel region of a band) of the hyperspectral images rather
than to the whole band image. Only the 4×4 mode is
employed for the intra-band prediction, because the
correlation coefficients of pixels separated by no more
than four pixels in the spatial domain are greater than
0.65 in most cases. The inter-band prediction mode is
determined after the optimal reference band has been
chosen by a fast reference band selection algorithm. The
resulting coding scheme speeds up the implementation by
combining the fast reference band selection algorithm,
the integer DCT, and a quantization that needs only
multiplications and bit shifts. Several AVIRIS images are
used to evaluate the proposed algorithm. Compared with
state-of-the-art 3D-based compression algorithms, the
proposed algorithm achieves the best compression
performance at different rates.

Index Terms: Hyperspectral images, compression coding,
H.264/AVC, prediction mode, correlation coefficients

1. INTRODUCTION

Due to the huge size of three-dimensional (3D)
hyperspectral data sets, efficient compression is required
to transmit them to the ground station over the limited
available downlink bandwidth.
Recently, lossy compression algorithms for hyperspectral
images based on the 3DWT (three-dimensional wavelet
transform) have proved promising. Both the 3DSPIHT [1]
and the 3DSPECK [2] algorithms use a conventional
symmetric 3DWT, and they are very close in
rate-distortion performance. To exploit more efficiently
the fact that the correlation of hyperspectral images is
higher in the spectral direction than in the spatial
domain, some asymmetric 3DWT based methods have been
proposed. The algorithm AT-3DSPIHT (asymmetric tree
based 3DSPIHT) was presented in [3] and [4], and the
algorithm AT-3DSPECK was proposed in [5] for
hyperspectral image compression. JPEG2000-MC, which
supports multi-component images and was proposed in
[6] [7], has also proved to be a good choice for
hyperspectral image compression. However, the number of
bands required by the 3DWT is relatively large: the
coding efficiency improves as more band images are
involved in the 3DWT, and vice versa. A data block that
consists of at least 16 band images imposes a heavy
computation and storage burden, so processing such a
block is not suited to real-time implementations. Rao and
Bhargava [8] proposed a scheme with a simple block-based
inter-band linear prediction followed by a block-based
DCT (discrete cosine transform), which resembles the one
used in MPEG standard video coding. In their scheme, two
fixed bands, one with a lower wavelength and one with a
higher wavelength, are designated as reference bands.
After each of the reference bands has been coded with
high fidelity, the previous adjacent band and the nearer
of the two selected reference bands are used for a
bi-directional prediction of the current band. With a
large number of bands, however, the long distance between
the current band and the reference band increases the
distortion in coding the prediction error, especially
when the correlation between the current band and its
previous adjacent band is also comparably low.
We therefore advocate a macroblock-based multi-band
prediction without motion compensation, followed by a
macroblock-based integer DCT as in the H.264/AVC
standard [9] [10]. Each macroblock of the current band is
predicted either from the surrounding macroblocks in the
same band (intra-band) or from the reference macroblocks
at the identical position in the reference bands
(inter-band). Because it fully exploits the flexible
macroblock-based intra-band and inter-band prediction,
and because the on-board implementation is simplified by
the fast reference band selection algorithm, the integer
DCT and a quantization that needs only multiplications
and bit shifts, our lossy compression algorithm for
hyperspectral images is expected to achieve a better
coding performance.
The paper is organized as follows: The framework of
our codec is proposed in Section 2. Section 3 presents the
experimental results. Finally, Section 4 concludes this
paper.

2. AN EFFICIENT COMPRESSION ALGORITHM
FOR HYPERSPECTRAL IMAGES BASED ON A
MODIFIED CODING FRAMEWORK OF
H.264/AVC



Fig.1. Macroblock-based codec structure.

The modified macroblock-based codec we propose is
illustrated in Fig.1. Its basic functional modules, i.e.,
the transformation, the quantization, the coder control
and the entropy coding, are the same as in H.264/AVC, but
no motion compensation is used. All the input band images
are processed in GOB (group of bands) units, and each
band is in turn processed in macroblock units 16 pixels
wide and 16 pixels high (16×16). In the encoding process,
the optimal prediction of the current macroblock is
determined using the fast reference band selection
algorithm and subtracted from the current macroblock; the
prediction error is then transformed, quantized and
entropy encoded. The decoding process (shaded in Fig.1)
essentially works in the reverse order: the sample values
of the decoded residual macroblocks and the predicted
macroblocks are summed to form the reconstructed
macroblocks, which finally constitute the decoded band
images. Using the same classification scheme as
H.264/AVC, each input band is designated as an I-band, a
P-band or a B-band. The 4×4 intra-band prediction is used
for I-bands, and the fast reference band selection
algorithm is applied to P-bands and B-bands.
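
As a rough illustration of the transform and quantization stage, the sketch below applies the 4×4 forward integer core transform and the multiply-and-shift quantization as they are commonly described for H.264/AVC (e.g., in [10]). The function names, the restriction to a single 4×4 residual block and the rounding offsets are assumptions made for this example, not details given in this paper; the multiplication-factor table is quoted from the usual H.264/AVC descriptions and should be checked against the standard before use.

```python
# Forward 4x4 integer core transform matrix: Y = CF * X * CF^T.
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]

# Multiplication factors, one row per value of QP % 6.
# Column 0: coefficient positions with even row and even column.
# Column 1: coefficient positions with odd row and odd column.
# Column 2: all remaining positions.
MF = [
    (13107, 5243, 8066),
    (11916, 4660, 7490),
    (10082, 4194, 6554),
    (9362, 3647, 5825),
    (8192, 3355, 5243),
    (7282, 2893, 4559),
]


def matmul4(a, b):
    """Multiply two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose4(a):
    """Transpose a 4x4 matrix."""
    return [list(row) for row in zip(*a)]


def forward_transform(block):
    """Integer core transform of a 4x4 residual block (additions and shifts only in hardware)."""
    return matmul4(matmul4(CF, block), transpose4(CF))


def quantize(coeffs, qp, intra=True):
    """Quantize transform coefficients using only a multiplication and a right shift."""
    qbits = 15 + qp // 6
    # Rounding offset: 1/3 of the step for intra blocks, 1/6 for inter blocks (assumed).
    offset = (1 << qbits) // (3 if intra else 6)
    levels = []
    for i in range(4):
        row = []
        for j in range(4):
            if i % 2 == 0 and j % 2 == 0:
                mf = MF[qp % 6][0]
            elif i % 2 == 1 and j % 2 == 1:
                mf = MF[qp % 6][1]
            else:
                mf = MF[qp % 6][2]
            w = coeffs[i][j]
            level = (abs(w) * mf + offset) >> qbits
            row.append(level if w >= 0 else -level)
        levels.append(row)
    return levels
```

The scaling that a floating-point DCT would need is folded into the table MF, which is why no division appears in the quantizer.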

2.1. The Formal Steps

Our compression scheme consists of the following steps:
Step1: All the input band images are divided into GOBs as
in H.264/AVC, each of which consists of 1 I-band, 3
P-bands and 12 B-bands.
Step2: A fast prediction is done for each of the
macroblock units.
  If the current band is an I-band, the 4×4 intra-band
  prediction mode is used.
  If the current band is a P-band, the optimal anchor
  band $B_{ref}^{1}$ is chosen using the fast reference
  band selection algorithm described in Section 2.3 from
  the forward reference list $\{F_m,\ m = 1, 2, \ldots, M\}$
  in the DPB (decoded picture buffer). In addition to the
  coding types of the I-band, macroblocks of the P-band
  can also be predicted by $B_{OPT}^{1}$ in $B_{ref}^{1}$.
  The optimal prediction mode of the given macroblock is
  then the mode with the minimum SAE (sum of absolute
  errors) value.
  If the current band is a B-band, the forward optimal
  anchor band $B_{ref}^{1}$ and the backward one
  $B_{ref}^{2}$ are chosen using the fast reference band
  selection algorithm from the forward reference list
  $\{F_m,\ m = 1, 2, \ldots, M\}$ and the backward
  reference list $\{B_n,\ n = 1, 2, \ldots, N\}$
  respectively. In addition to the coding types available
  in a P-band, macroblocks of the B-band can also be
  predicted by a combination of $B_{OPT}^{1}$ and
  $B_{OPT}^{2}$, which are in $B_{ref}^{1}$ and
  $B_{ref}^{2}$ respectively. The optimal prediction mode
  of the given macroblock is determined as above.
Step3: The difference between the actual macroblock and
its prediction is calculated, followed by the integer
DCT and the division-free quantization.
Step4: The prediction mode and the prediction errors of
the macroblock are encoded using context-based
adaptive binary arithmetic coding (CABAC) [10], and
the final bitstream of a band is generated by collecting
the sub-bitstreams of the band type and all the
macroblocks in it.
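
The mode decision in Step2 and the residual computation in Step3 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the mode labels and the use of a rounded average as the "combination" of the forward and backward candidates are assumptions, since the paper does not specify how the two candidates are combined.

```python
def sae(block, prediction):
    """Sum of absolute errors between a macroblock and a candidate prediction."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block, prediction)
               for a, b in zip(row_a, row_b))


def average_prediction(forward, backward):
    """Bi-directional candidate; a rounded average is assumed here for illustration."""
    return [[(f + b + 1) // 2 for f, b in zip(row_f, row_b)]
            for row_f, row_b in zip(forward, backward)]


def select_mode(block, candidates):
    """Return (mode name, SAE) of the candidate prediction with the minimum SAE.

    candidates maps a hypothetical mode label to its predicted block.
    """
    best = min(candidates, key=lambda name: sae(block, candidates[name]))
    return best, sae(block, candidates[best])


def residual(block, prediction):
    """Prediction error passed on to the integer DCT and the quantizer."""
    return [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(block, prediction)]
```

For an I-band only intra candidates would be passed to `select_mode`; for a P-band the forward candidate is added, and for a B-band the backward and combined candidates as well.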

2.2. The Intra-Band Prediction

For a given macroblock, the intra-band prediction is
derived from the surrounding pixel values.
In H.264/AVC, the 16×16 intra-frame prediction mode can
be selected in the flat regions of video images. Since
hyperspectral images contain more detail, using the
16×16 intra-band prediction mode would result in greater
prediction error energy. Fortunately, the correlation
between pixels four pixels apart is rather strong: e.g.,
78% of the correlation coefficients of the "Cuprite"
scene 1 images are greater than 0.65 even at a distance
of 4 pixels in the row direction, and a similar situation
appears in the column direction. At a distance of 16
pixels, however, almost all the correlation coefficients
are less than 0.5. This is shown in Fig.2, in which
$P(\Delta i, \Delta j)$ is the correlation coefficient of
pixels separated by $\Delta i$ and $\Delta j$ pixels in
the row and column directions respectively. In order to
reduce the prediction error energy caused by predicting
from weakly correlated pixels, only the 4×4 intra-band
prediction mode is employed for intra-band prediction.
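
The spatial correlation statistic discussed above can be estimated with a short sketch such as the one below. It assumes the Pearson correlation coefficient over all pixel pairs of one band at a fixed offset; the paper does not state the exact estimator, and the function and parameter names are hypothetical. Here "along a row" is taken to mean a horizontal offset and "along a column" a vertical one.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # convention chosen here for constant data
    return cov / math.sqrt(var_x * var_y)


def spatial_correlation(band, d_along_row, d_along_col):
    """Correlation between each pixel and the pixel at the given offset in the same band.

    band is a list of rows; d_along_row shifts the column index (horizontal),
    d_along_col shifts the row index (vertical).
    """
    rows, cols = len(band), len(band[0])
    x, y = [], []
    for r in range(rows - d_along_col):
        for c in range(cols - d_along_row):
            x.append(band[r][c])
            y.append(band[r + d_along_col][c + d_along_row])
    return pearson(x, y)
```

Evaluating `spatial_correlation(band, 4, 0)` and `spatial_correlation(band, 16, 0)` for each band would give values comparable to the 4-pixel and 16-pixel cases quoted in the text.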

Fig.2. Correlation coefficients versus band number at different
spatial pixel distances.

Fig.3. Macroblock-based correlation coefficients between
different bands for "Cuprite" scene 1 images.

Fig.4. The fast reference band selection algorithm.

2.3. The Inter-Band Prediction

One of the most important advances in H.264/AVC is the
use of multi-frame motion-compensated prediction.
Reference pictures are stored in the DPB using a forward
reference list and a backward reference list. In most
cases, the use of multiple reference frames provides
significantly improved coding gain. We therefore employ
macroblock-based multi-band prediction, rather than
prediction from a single band, for inter-band prediction,
and no motion estimation is performed. Fig.3 shows the
macroblock-based correlation coefficient curves of two
pairs of bands, "band1-band3" and "band2-band3", of the
"Cuprite" scene 1 image. As can be seen, in most cases
the correlation of macroblocks at the same spatial
position between band3 and band2 is higher than that
between band3 and band1; for some macroblocks, however,
the opposite holds. For the prediction of a given
macroblock of "band3", it is therefore reasonable to take
the reference macroblock from "band2" in the former case
and from "band1" in the latter case. This modified
reference macroblock selection is implemented by a fast
algorithm, as shown in Fig.4. For a P-band, the
correlation coefficients between the current macroblock
and the reference macroblocks at the identical position
in the forward list $\{F_m,\ m = 1, 2, \ldots, M\}$ are
calculated, and the candidate $B_{OPT}^{1}$ with the
maximum correlation value is selected for predicting the
given macroblock. For a B-band, another candidate
$B_{OPT}^{2}$ is selected in the same way from the
backward list $\{B_n,\ n = 1, 2, \ldots, N\}$.
Instead of being selected from all the available modes of
all the reference bands as in H.264/AVC, the prediction
mode is determined after the fast reference band
selection. The effect is significant: compared with the
H.264/AVC scheme, 80% of the computation can be saved
when the size of the DPB is 5.
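
The selection rule described above can be sketched as follows, assuming the Pearson correlation coefficient between co-located macroblocks as the similarity measure. The function names and the list-of-rows macroblock layout are hypothetical, and the returned index is 0-based, whereas the paper numbers the reference bands from 1.

```python
import math


def correlation(mb_a, mb_b):
    """Pearson correlation coefficient between two macroblocks given as lists of rows."""
    x = [v for row in mb_a for v in row]
    y = [v for row in mb_b for v in row]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # convention chosen here for flat macroblocks
    return cov / math.sqrt(var_x * var_y)


def select_reference(current_mb, colocated_mbs):
    """Pick the co-located macroblock with the maximum correlation.

    colocated_mbs holds the macroblocks at the same spatial position in each
    band of one reference list. Returns (0-based band index, correlation).
    """
    scores = [correlation(current_mb, ref) for ref in colocated_mbs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

For a P-band the function would be called once with the forward list; for a B-band it would be called a second time with the backward list to obtain the second candidate.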

3. EXPERIMENTAL RESULTS

To evaluate its performance, the proposed compression
scheme is compared with state-of-the-art algorithms.
Coding experiments are carried out on three signed
16-bit radiance AVIRIS (http://aviris.jpl.nasa.gov) images,
Jasper Ridge, Lunarlake and Cuprite, all of scene 1. We
crop each scene to 224 × 512 × 512 pixels with 8-bit
resolution. PSNR is used as the quality metric. For a
fair comparison, all the 3D-based algorithms employ the
3-level 9/7 biorthogonal lifting DWT or WPT (wavelet
packet transform) in a GOB unit of size 16, and the DPB
size is set to 5 in our method.
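
For reference, PSNR can be computed as in the sketch below. The peak value of 255 is an assumption that follows from the 8-bit resolution stated above; the paper does not give its exact PSNR definition or say whether the value is averaged per band or computed over the whole data cube.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample sequences."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)
```

The samples of a band or of a whole cropped scene would be flattened into one sequence before calling `psnr`.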
(1) PSNR performance comparison
As can be seen from the PSNR results shown in
Table 1, at the lower bit rates, the coding performance of
JPEG2000-MC is better than the best 3DWT-based
algorithm, i.e., AT3DSPECK, but as the bit rate increases,
the case is reversed. As expected, our method
substantially outperforms all the other ones, which is
especially evident at the lower bit rates. The gain is
mainly attributed to the suitable macroblock-based
prediction mode selection.

Fig.4 flowchart, text content:
Input: a forward reference list $\{F_m,\ m = 1, 2, \ldots, M\}$ and
a backward reference list $\{B_n,\ n = 1, 2, \ldots, N\}$ in the DPB,
and the current macroblock $B_C$ in the current band $C_B$.
Decision: $type(C_B) \neq$ 'I'
Process: Calculate all the correlation coefficients $\rho(B_C, B_F)$
between $B_C$ and the identical-location macroblocks $B_F$ in
$\{F_m,\ m = 1, 2, \ldots, M\}$, and find the candidate macroblock
$B_{OPT}^{1} = \arg\max_{m} \rho(B_C, B_F)$.
Decision: $type(C_B) =$ 'B'
Process: Calculate all the correlation coefficients $\rho(B_C, B_B)$
between $B_C$ and the identical-location macroblocks $B_B$ in
$\{B_n,\ n = 1, 2, \ldots, N\}$, and find the candidate macroblock
$B_{OPT}^{2} = \arg\max_{n} \rho(B_C, B_B)$.
Decision: $type(C_B) =$ 'P'
Reference band selection: $B_{ref}^{1} = where(B_{OPT}^{1})$,
$B_{ref}^{2} = where(B_{OPT}^{2})$.
Branch labels in the figure: NO, YES, YES, YES.

Table 1. PSNR performance comparison (PSNR in dB at bit rates in bpp)

Image          Scheme         0.05     0.1      0.5      1.0
Jasper_s1      3DSPIHT        37.81    40.13    46.73    51.15
               3DSPECK        38.82    40.73    46.72    50.17
               AT3DSPIHT      38.56    41.60    50.69    55.69
               AT3DSPECK      39.81    42.82    51.57    55.24
               JPEG2000_MC    40.69    43.15    49.59    52.67
               Our Method     42.10    45.02    52.63    56.34
Cuprite_s1     3DSPIHT        35.51    38.27    45.73    50.02
               3DSPECK        37.38    39.74    46.77    50.69
               AT3DSPIHT      36.34    40.42    50.52    54.24
               AT3DSPECK      39.85    43.22    52.09    54.96
               JPEG2000_MC    42.68    45.13    50.21    52.52
               Our Method     44.83    47.67    52.67    55.72
Lunarlake_s1   3DSPIHT        35.58    38.34    45.97    50.59
               3DSPECK        37.45    39.77    47.23    51.29
               AT3DSPIHT      36.26    40.60    51.39    54.55
               AT3DSPECK      39.95    43.69    52.24    55.24
               JPEG2000_MC    42.94    45.28    50.76    52.47
               Our Method     45.07    47.71    52.44    55.30



Fig.5. Performance comparison for all the GOB groups of
"Cuprite s1" images at 0.1bpp.

Fig.6. CPU complexity comparison for the "Cuprite s1" images.

(2) RD comparison for all the GOB groups
Several of the best-performing algorithms were tested,
and the RD results for all the GOB groups are shown in
Fig.5. Our method achieves competitive performance for
all GOB groups at different rates, especially for the GOB
groups with lower correlation (i.e., GOB numbers 7 and
10). This advantage is attributed to the multi-band
prediction we use.
(3) Complexity comparison
To compare the computational complexity of the
prediction using the fast reference band selection
algorithm with that of the prediction scheme of
H.264/AVC, the CPU times consumed for the "Cuprite s1"
scene images are shown in Fig.6 in seconds per band.
These results show that our scheme significantly reduces
the computational complexity of prediction while leaving
the image quality unchanged compared with the standard
H.264/AVC prediction scheme. As shown in Fig.6, the
average time consumed by our scheme is 21% of that of the
H.264/AVC prediction.

4. CONCLUSION

In this paper, we propose a modified H.264/AVC
prediction scheme for the lossy compression of
hyperspectral images. Thanks to the fast reference band
selection and the careful choice of prediction mode for
the macroblock units of the hyperspectral images, this
scheme provides very satisfactory rate-distortion
performance. The experimental results also show that our
scheme significantly reduces the prediction time compared
with the original H.264/AVC prediction scheme without
motion estimation. It is thus a promising, efficient
lossy compression algorithm for hyperspectral images.

REFERENCES

[1] B. J. Kim and W. A. Pearlman, "An Embedded Wavelet
Video Coder Using Three-Dimensional Set Partitioning in
Hierarchical Trees (SPIHT)," Proceedings of DCC97, pp. 251-260,
Mar. 1997.
[2] X. Tang, W. A. Pearlman, and J. W. Modestino,
"Hyperspectral Image Compression Using Three-Dimensional
Wavelet Coding," Proceedings of SPIE/IS&T Electronic
Imaging, pp. 1037-1047, 2003.
[3] X. Tang, W. A. Pearlman, and J. W. Modestino, "3d set
partitioning coding methods in hyperspectral image
compression," Proceedings of IEEE ICIP, pp. 239-242, 2003.
[4] X. Tang, W. A. Pearlman, and J. W. Modestino, "Lossless
compression for three dimensional images," Proceedings of SPIE
Visual Comm. and Image Processing, vol. 5308, pp. 294-305,
2004.
[5] Jiaji Wu, Zhensen Wu, and Chengke Wu, "Lossy to lossless
compressions of hyperspectral images using three-dimensional
set partitioning algorithm," Optical Engineering 45, 027005,
2006.
[6] Justin T. Rucker, James E. Fowler, and Nicholas H. Younan,
"JPEG2000 Coding Strategies for Hyperspectral Data," in
Proceedings of the IEEE International Geoscience and Remote
Sensing Symposium, pp. 128-131, July 2005.
[7] Barbara Penna, Tammam Tillo, Enrico Magli, and Gabriella
Olmo, "Progressive 3-D Coding of Hyperspectral Images Based
on JPEG2000," IEEE Geoscience and Remote Sensing Letters,
vol. 3, no. 1, pp. 125-129, January 2006.
[8] A. K. Rao and S. Bhargava, "Multispectral data compression
using bidirectional interband prediction," IEEE Transactions on
Geoscience and Remote Sensing, vol. 34, no. 2, Mar. 1996.
[9] Joint Video Team of ITU-T and ISO/IEC JTC 1, "Draft ITU-T
Recommendation and Final Draft International Standard of
Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC 14496-10
AVC)," JVT-G050r1, May 2003.
[10] Iain E. G. Richardson, "H.264/MPEG-4 Part 10 White
Paper," http://www.vcode.com, Mar. 2003.
