
JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 26, 1011-1027 (2010)

Hybrid Image Compression Based on Set-Partitioning Embedded Block Coder and Residual Vector Quantization
SHENG-FUU LIN, HSI-CHIN HSIN* AND CHIEN-KUN SU+
Department of Electrical and Control Engineering, National Chiao Tung University, Hsinchu, 300 Taiwan
*Department of Computer Science and Information Engineering, National United University, Miaoli, 360 Taiwan
+Department of Electrical Engineering, Chung Hua University, Hsinchu, 300 Taiwan
A hybrid image coding scheme based on the set-partitioning embedded block coder (SPECK) and residual vector quantization (RVQ) is proposed for image compression. The scaling coefficients of an image are coded with the original SPECK algorithm, and the wavelet coefficients are coded with SPECK combined with RVQ. The hybrid strategy of combining SPECK with RVQ for the high-frequency wavelet coefficients exploits the energy clustering property of the wavelet transform. Experimental results show that, for gray-level still images, the proposed hybrid RVQ-SPECK coder outperforms SPECK; for example, the peak signal-to-noise ratio (PSNR) can be improved by 1.67 dB and 0.69 dB at a compression rate of 1 bit per pixel for the 256 × 256 gray-level Lena and Barbara images, respectively. An application to chroma subsampling images is also presented, and the proposed method usually outperforms the color SPECK method; the PSNR can be improved by 1.11 dB for the Y plane, 0.99 dB for the U plane, and 2.31 dB for the V plane at a bit budget of 81,920 bits for the test image Goldhill. In addition to high coding efficiency, the proposed method preserves the features of embeddedness, low decoding complexity, and exact bit-rate control.

Keywords: image compression, residual vector quantization (RVQ), set-partitioning embedded block coder (SPECK), chroma subsampling images, embeddedness

1. INTRODUCTION
Because of the need for high-quality images, fast transmission, and less storage space, the demand for image compression keeps increasing. Differential pulse code modulation, transform coding, subband coding, and many other image compression techniques have been developed [1-3]. State-of-the-art techniques can compress typical images by a factor ranging from 10 to 50 with acceptable quality [4]. The Joint Photographic Experts Group (JPEG) image standard [5], the most widely used transform-coding based algorithm, performs well at moderate compression ratios. Recently, the wavelet-based multiresolution representation has received a lot of attention in compression applications, as manifested in the JPEG2000 standard [6, 7]. Many wavelet-based image coding algorithms such as the embedded zero-tree wavelets (EZW) [8], set partitioning in hierarchical trees (SPIHT) [9], morphological representation of wavelet data (MRWD) [10], group testing for wavelets (GTW) [11], and the set-partitioning embedded block coder (SPECK) [12, 13] have been proposed with great success.
Received July 8, 2008; revised November 20, 2008; accepted February 12, 2009. Communicated by Liang-Gee Chen.


In the wavelet domain, the higher-detail components of an image are projected onto shorter basis functions with higher spatial resolutions, and the lower-detail components are projected onto larger basis functions with narrower bandwidths; this matches the characteristics of the human visual system [14]. In SPECK, the well-defined hierarchical structure with energy clustering within the high-frequency subbands is exploited so that the significant wavelet transform coefficients of an image can be coded as early as possible. SPECK has been incorporated into the verification model of JPEG 2000 as subband hierarchical block partitioning (SBHP) [15]. Another variant of SPECK, the embedded zero block coding (EZBC) algorithm [16], is much more complicated: it combines SPECK with a context-based adaptive arithmetic coder to improve compression performance.

According to Shannon's theory [17, 18], vector quantization (VQ) can significantly reduce the number of coding bits compared with scalar quantization. Hence, VQ plays an important role in many applications, e.g. speech recognition, volume rendering, and image compression. Gupta et al. utilized VQ to compress multispectral satellite images [19]. Su et al. developed a hybrid coding system using SPIHT and VQ for image compression [20]. Abdel-Galil et al. applied VQ to power systems for classifying power quality disturbances [21]. When the code vector and codebook sizes become large enough, the distortion of a vector quantizer approaches the lower bound of the distortion-rate relation. However, both the computational complexity and the memory requirement of the vector quantizer then increase exponentially, so an unconstrained full-search vector quantizer usually uses small vectors. To reduce the computational complexity and memory requirements of VQ, several variants of the original VQ have been proposed in the literature, such as residual vector quantization (RVQ) [22, 23, 25], hierarchical VQ [24], and tree-structured VQ (TSVQ) [18]. Each VQ variant makes a compromise between computational complexity and performance. RVQ, or multistage VQ [25], is a VQ variant with low computational complexity. Because the decoder of an RVQ is constrained by a direct-sum codebook structure and the encoder typically uses a suboptimal stage-sequential search procedure, RVQ suffers some performance degradation.

To code high-frequency wavelet coefficients with energy clustering efficiently while balancing the complexity and performance of an image coder, a hybrid coder using SPECK and residual VQ (RVQ) is proposed for image compression. Specifically, the significant high-frequency wavelet coefficients of an image are coded as coefficient vectors, which can be efficiently located with the significance coding procedure of SPECK. Recently, Chao et al. proposed a vector SPECK algorithm for gray-level still image compression [27], a variation of SPECK that uses VQ to code the significant coefficients. They used a very sophisticated VQ method to improve compression efficiency at the cost of added complexity.
Although both methods involve SPECK and VQ and achieve good performance, the hybrid method proposed in this paper and the vector SPECK method were developed independently and differ in many implementation details.


The remainder of this paper proceeds as follows. Section 2 describes the proposed hybrid image coder which combines SPECK and RVQ. Experimental results are given in section 3, and conclusions are given in section 4.

2. THE PROPOSED HYBRID IMAGE COMPRESSION METHOD


The SPECK algorithm, proposed by Pearlman et al., is a simple and efficient image coder with coding scalability. By recursively partitioning a significant block of a transformed image, SPECK locates the significant coefficients in the block and performs scalar quantization on them to generate a coded bit-stream. Since vector quantization is more efficient than scalar quantization according to Shannon's rate-distortion theory, a hybrid image coder combining SPECK with VQ is an attractive option. To reduce the computational complexity, RVQ was selected to be combined with SPECK in the proposed hybrid codec, and experimental results show that the proposed hybrid method is efficient for image compression. Subsection 2.1 discusses the application to gray-level still images, subsection 2.2 presents an application of the proposed hybrid method to chroma subsampling images, and subsection 2.3 discusses the computational complexity and memory requirements of the proposed hybrid method for gray-level image compression.

2.1 Application for Gray-Level Still Images

A hybrid image coding system combining SPECK with RVQ is proposed to improve compression performance; Fig. 1 shows its block diagram, which applies directly to still gray-level image compression. In the first block, an input gray-level image is transformed by a 2D discrete wavelet transform (DWT) to generate its transformed image for further processing. For example, Fig. 2 shows the result of a 4-decomposition-level 2D DWT. The coefficients of the transformed image are classified into two parts: the LL subband, which contains the scaling coefficients, and the high-frequency subbands, which contain all the transformed coefficients excluding those in the LL subband. The scaling coefficients represent the lowest frequency component of an image and can be coded efficiently with the original (scalar) SPECK algorithm. The wavelet coefficients in the high-frequency subbands, on the other hand, are coded with SPECK combined with RVQ. Finally, the coded bit-stream is obtained by a multiplex operation; a minimal sketch of this split is given after Fig. 1.
[Fig. 1 block diagram: input image -> DWT -> LL subband coded by scalar SPECK; H, D, V subbands coded by SPECK with RVQ -> Mux -> coded bit-stream]

Fig. 1. The proposed hybrid image coder.
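As a concrete illustration of the split in Fig. 1, the following minimal Python sketch (not from the paper; it assumes NumPy and PyWavelets, with the 'bior4.4' filter standing in for the 9/7 biorthogonal wavelet) performs a 4-level 2D DWT and separates the LL subband, which would go to the scalar SPECK branch, from the H, V, and D detail subbands, which would go to the RVQ-SPECK branch.

    import numpy as np
    import pywt

    def split_subbands(image, levels=4, wavelet="bior4.4"):
        """Return the LL subband and the per-level (H, V, D) detail subbands."""
        # 'periodization' keeps subband sizes exactly halved at each level.
        coeffs = pywt.wavedec2(image.astype(float), wavelet,
                               mode="periodization", level=levels)
        ll = coeffs[0]        # scaling coefficients -> scalar SPECK branch
        details = coeffs[1:]  # [(H, V, D) of level 4, ..., (H, V, D) of level 1]
        return ll, details

    if __name__ == "__main__":
        img = np.random.rand(256, 256)      # stand-in for a gray-level test image
        ll, details = split_subbands(img)
        print("LL subband:", ll.shape)      # (16, 16) for a 4-level transform
        for i, (h, v, d) in enumerate(details):
            print("detail level", 4 - i, ":", h.shape, v.shape, d.shape)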


Fig. 2. The partition and assignment of a 4-decomposition-level transformed image.

For the quantization of the coefficients in the LL subband, the original SPECK starts from the most significant bit plane n_max, where

n_max = max_{c_ij in LL} floor( log_2 |c_ij| ),                                (1)

and c_ij represents a coefficient in the LL subband. The scaling coefficients whose magnitudes are greater than or equal to 2^n_max are located in the first pass. The coefficients whose magnitudes lie in the interval [2^(n_max-1), 2^n_max) are located in the second pass, and the procedure continues until all the coefficients are located or the bit budget is exhausted. In the proposed hybrid method, all the coefficients in the LL subband are normalized by the absolute value of the coefficient with the largest magnitude before sorting. Hence, the normalized scaling coefficients with magnitudes in [2^-1, 2^0) are located in the first pass, the coefficients with magnitudes in [2^-2, 2^-1) are located in the second pass, and the procedure continues until all the coefficients are located or the bit budget is exhausted.

The coefficients in the high-frequency subbands are classified into three categories: H (horizontal), D (diagonal), and V (vertical), as shown in Fig. 2. All the coefficients in the high-frequency subbands are partitioned into 2 × 2 blocks, and each 2 × 2 block forms a corresponding 4D vector (Fig. 3). The vectors of the H, D, and V categories are normalized by the maximum L2 norm of their respective category, so that the L2 norm of each vector is less than or equal to one. If the L2 norm of a vector is greater than or equal to the current threshold, the vector is significant, and the block or subband containing this vector is also significant. Because the vectors are normalized, the thresholds for a 7-stage RVQ are 2^-1, 2^-2, 2^-3, 2^-4, 2^-5, 2^-6, and 0. For a 4-decomposition-level transformed image, in the initialization step, the H, D, and V blocks of the 4th decomposition level (i.e. in the top-left corner of Fig. 2) form the S set (significant set), and the other H, D, and V subbands form the I set (insignificant set) in the RVQ-SPECK block of the proposed hybrid method (Fig. 1).
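To make the vector formation concrete, the sketch below (an illustrative reading of the text, not the authors' code) groups a high-frequency subband into non-overlapping 2 × 2 blocks, stacks each block into a 4D vector, normalizes all vectors of one category by the largest L2 norm in that category, and reports the first RVQ stage threshold (2^-1, ..., 2^-6, 0) at which a vector becomes significant.

    import numpy as np

    def subband_to_vectors(subband):
        """Group a subband into 2x2 blocks and stack each block as a 4D row vector."""
        h, w = subband.shape
        blocks = subband[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2)
        # Rows are [c(i,j), c(i,j+1), c(i+1,j), c(i+1,j+1)], as in Fig. 3.
        return blocks.transpose(0, 2, 1, 3).reshape(-1, 4)

    def normalize_category(vectors):
        """Normalize all vectors of one category (H, D, or V) by the maximum L2 norm."""
        base = np.linalg.norm(vectors, axis=1).max()
        return vectors / base, base      # the base value must be sent to the decoder

    # Stage thresholds of the 7-stage RVQ described in the text.
    THRESHOLDS = [2.0 ** -k for k in range(1, 7)] + [0.0]

    def first_significant_stage(vector):
        """Index of the first RVQ stage at which a normalized vector is significant."""
        norm = np.linalg.norm(vector)
        for stage, t in enumerate(THRESHOLDS):
            if norm >= t:
                return stage
        return None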


[Fig. 3 illustration: the 2 × 2 coefficient block {c_{i,j}, c_{i,j+1}, c_{i+1,j}, c_{i+1,j+1}} is stacked into the 4D vector [c_{i,j}, c_{i,j+1}, c_{i+1,j}, c_{i+1,j+1}].]

Fig. 3. A 2 × 2 coefficient block and its corresponding 4D vector.

Fig. 4. The signal flow diagram of a p-stage RVQ.

The sorting pass over the significant vectors of the high-frequency subbands and the definitions of S and I in the proposed hybrid method are identical to those of the scalar SPECK, except that the significance path of the SPECK with RVQ ends when the block size reaches 2 × 2. Generally speaking, full-search VQ has better performance than RVQ, but RVQ was selected for the proposed hybrid method because of its low complexity and acceptable performance. The signal flow diagram of a p-stage RVQ is shown in Fig. 4, where x_i (1 <= i <= p) is the input vector of the ith VQ stage, x̂_i is the code vector with the smallest distance to x_i, and the residual x_{i+1} = x_i - x̂_i is the input vector of the (i+1)th VQ stage. Because the characteristics of the H, D, and V subbands are different, three RVQs are used, one for each of the H, D, and V subbands, in the RVQ-SPECK part of the proposed hybrid method. Since the information in the lowest frequency subband LL of an image is usually more important than that in the high-frequency subbands, the bit-plane resolution of the scalar SPECK is set higher than that of the RVQ-SPECK; thus, the transmission rate of the scalar SPECK is usually faster than that of the RVQ-SPECK. Based on the simulation results, the transmission rate of the scalar SPECK is empirically set to twice the RVQ-SPECK transmission rate, i.e. one pass of the proposed hybrid method includes two SPECK passes and one RVQ-SPECK pass. Finally, the output coded bit-stream contains the overhead, the binary output of SPECK, and the binary output of RVQ-SPECK; the relation is shown in Fig. 5.

Fig. 5. The coded bit-stream of the proposed hybrid method.
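The signal flow of Fig. 4 can be sketched as a stage-sequential nearest-codeword search in NumPy (illustrative only; the codebooks here are random placeholders, whereas the paper trains them with K-means, and the 64/32 stage sizes are taken from the experimental setup in section 3).

    import numpy as np

    def rvq_encode(x, codebooks):
        """Encode one 4D vector with a p-stage RVQ; returns one index per stage."""
        indices = []
        residual = x.astype(float).copy()
        for cb in codebooks:                      # cb has shape (codebook_size, 4)
            dists = np.linalg.norm(cb - residual, axis=1)
            k = int(np.argmin(dists))             # nearest codeword of this stage
            indices.append(k)
            residual = residual - cb[k]           # residual feeds the next stage
        return indices

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        # 7 placeholder codebooks: 64 codewords in stage 1, 32 in the later stages.
        codebooks = [rng.normal(size=(64, 4))] + [rng.normal(size=(32, 4)) for _ in range(6)]
        x = rng.normal(size=4)
        print(rvq_encode(x, codebooks))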


The decoder of the proposed method can be implemented by simply reversing the processing steps of the encoder. Apart from the overhead of the compression file, the bits in the compression file are ordered by importance, so the proposed method is embedded. The proposed encoder (decoder) can terminate the coding (decoding) process at any point, so it achieves exact bit-rate control, which is an important requirement of modern codecs. The compression performance can be improved by using arithmetic coding after SPECK, at the cost of increased computational complexity; for example, the PSNR of the decoded 512 × 512 Lena image can be improved by 0.22 dB at a compression rate of 1 bpp by using SPECK with arithmetic coding [26]. For simplicity, arithmetic coding is not performed in our experiments.

2.2 Application for Chroma Subsampling Images

The chroma subsampling format is used to balance efficiency and quality in sampling, and similar formats are used in picture formats to save bandwidth (memory) while maintaining good quality. CIF (Common Intermediate Format) and QCIF (Quarter CIF) are two such formats in H.261. For CIF, the size of the luminance plane is 352 × 288, and the sizes of the two chrominance planes are 176 × 144. Each of the two chrominance planes contains only one quarter of the data (pixels) of the luminance plane, since the human eye is less sensitive to chrominance than to luminance. The image sequence format of MPEG-4 is CIF or 4:2:0, and we discuss how to use the proposed hybrid method for the compression of the popular YUV 4:2:0 images.

Fig. 6 shows the block diagram of the application of the proposed hybrid method to chroma subsampling images. First, the Y, U, and V planes are each transformed by the 2D discrete wavelet transform. Then, each transformed plane is processed like the transformed image in the still gray-level case: the transformed coefficients of each plane are partitioned into LL, H, D, and V subbands. The scaling coefficients of each LL subband, which will be processed by color SPECK (CSPECK) [12, 13], are normalized by the maximum amplitude of the coefficients in that LL subband. Another three positive base values, determined from the L2 norms of the 4D vectors in the H, V, and D subbands, respectively, are used to normalize the vectors in their corresponding subbands such that the L2 norm of each normalized 4D vector is not greater than 1. Hence, 12 base values used in normalization have to be stored and transmitted to the decoder of the proposed method (a small sketch of this bookkeeping follows). After normalization and coefficient classification, the three LL subbands of the transformed Y, U, and V planes are processed by CSPECK, and the other coefficients (i.e. the H, D, and V subbands) are coded by CSPECK with RVQ. As in the quantization of the monochrome application, since the scaling coefficients in the LL subbands contain more important information than the coefficients in the H, D, or V subbands, one quantization cycle of the proposed hybrid method includes two CSPECK quantization passes of the LL subbands and one RVQ-CSPECK quantization pass of the H, D, and V subbands (Fig. 7).
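As a sketch of the normalization bookkeeping described above (hypothetical code, reusing the subband_to_vectors helper from the earlier sketch and the wavedec2-style detail list produced by split_subbands), the function below computes the four base values of one transformed plane; applying it to the transformed Y, U, and V planes gives the 12 base values that must be transmitted to the decoder.

    import numpy as np

    def base_values_for_plane(ll, details):
        """Return [LL base, H base, V base, D base] for one transformed plane."""
        bases = [np.abs(ll).max()]                    # LL: normalize by max magnitude
        for k in range(3):                            # H, V, D categories
            norms = []
            for level in details:                     # (H, V, D) tuple of each level
                vecs = subband_to_vectors(level[k])   # helper from the earlier sketch
                norms.append(np.linalg.norm(vecs, axis=1).max())
            bases.append(max(norms))                  # max L2 norm over all levels
        return bases

    # For a YUV 4:2:0 frame, calling base_values_for_plane() on the transformed
    # Y, U, and V planes yields 3 x 4 = 12 base values to store for the decoder.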


Fig. 6. The block diagram of the application of the proposed method for chroma subsampling images.

Fig. 7. One quantization cycle of the proposed hybrid method, which includes two CSPECK quantization passes and one RVQ-CSPECK quantization pass.


2.3 Memory Requirement and Computational Complexity of the Proposed Hybrid Method

In this subsection, the memory requirement and computational complexity of the proposed hybrid codec are discussed, under the assumption that the coefficients in the H, D, and V subbands are far more numerous than the scaling coefficients in the LL subband. The computational complexity of the proposed hybrid method can therefore be approximated by that of the RVQ-SPECK part; in other words, the RVQ-SPECK is the dominant part of the proposed hybrid method. Regarding memory, the proposed hybrid method needs extra memory for storing codebooks and parameters. Assume that three p-stage RVQs are used and that the codebook sizes of the H, D, and V subbands for the ith VQ stage are the same and equal to m_i words. Then all the codebooks together need 3 * sum_{i=1}^{p} m_i words. The proposed hybrid method also needs memory to store the 12 base values for normalization and the threshold information. On the other hand, the proposed hybrid method outperforms the scalar SPECK in the lengths of the list of significant vectors and the list of insignificant sets: since each vector in RVQ-SPECK contains 4 coefficients, the length of an RVQ-SPECK list is about one fourth of the length of the corresponding list in SPECK (e.g. LSP and LIP).

For the computational complexity of encoding, the proposed hybrid method uses L2 norms to test the significance of the vectors at each stage, whereas SPECK uses a 1-bit comparison to test significance at each bit plane. Hence, a single significance test of the proposed hybrid method is several times more expensive than a significance test in SPECK. The proposed hybrid method, however, has the advantage that its total number of significance tests is smaller than that of SPECK. If an N × N gray-level test image with n_max bit planes is coded by SPECK and by the proposed hybrid method with p-stage RVQs, the significance-test-number ratio of SPECK to the proposed hybrid method can be estimated by
(N · N · n_max) / ((N/2) · (N/2) · p) = 4 n_max / p.                            (2)

For a 512 × 512 gray-level test image with n_max = 12 and p = 7, the significance-test ratio of Eq. (2) is about 6.9; hence, for this example, the SPECK encoder needs about 6.9 times as many significance tests as the proposed hybrid encoder. Although SPECK can use a simple bitwise operation for each significance test, it suffers from the growth of the number of significance tests for large images. Both SPECK and the proposed hybrid method use the same algorithm to locate significant coefficients, but the proposed method usually has shorter significance paths because it uses a 4D vector instead of a single pixel (coefficient): when the proposed hybrid method locates a significant vector, SPECK needs one more quadtree partition and four more significance tests to complete the same significance path. For the decoding part, no significance tests are needed by either SPECK or the proposed hybrid method, so the computational complexity of both methods is greatly reduced. The proposed hybrid decoder is implemented with simple look-up tables, and the total number of significant vectors is about one quarter of the number of significant pixels in SPECK.


The actual computational complexity depends on the image characteristics, the codebook sizes, the bit allocation of the codewords, and so on. From the above discussion, the proposed hybrid decoder is as efficient as SPECK, which is consistent with our experimental results. To summarize, the proposed hybrid method is suitable for asymmetric-complexity applications in which images can be encoded off-line but need to be decoded fast.
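A minimal sketch of the look-up-table decoding mentioned above (illustrative, paired with the rvq_encode sketch given earlier): with a direct-sum codebook, reconstruction is just a sum of the codewords selected at each stage, so the decoder performs no search.

    import numpy as np

    def rvq_decode(indices, codebooks):
        """Reconstruct a 4D vector from per-stage indices (direct-sum codebook)."""
        x_hat = np.zeros(codebooks[0].shape[1])
        for k, cb in zip(indices, codebooks):
            x_hat += cb[k]          # pure table look-ups, no distance computations
        return x_hat

This is the main reason the hybrid decoder's complexity stays close to that of SPECK.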

3. EXPERIMENTAL RESULTS
In this section, two applications of the proposed hybrid compression method are presented. The first, in subsection 3.1, is gray-level still image compression; the other, in subsection 3.2, is the compression of chroma subsampling images. The simulation platform is an IBM PC with Windows XP, and SPECK, SPIHT, and the proposed hybrid method are coded in Matlab. Linear-phase biorthogonal wavelet filters with 9/7 coefficients are used in this paper, and the number of wavelet decomposition levels in our experiments is 4. Fig. 2 shows the classification of a 4-level transformed image, whose coefficients are classified into four types: LL, H, V, and D. The lowest frequency coefficients in subband LL are normalized such that their magnitudes are in the range [0, 1), and these coefficients are coded with the scalar SPECK. The wavelet coefficients in subbands of types H, V, and D are coded with SPECK combined with RVQ. For the coefficient vectors of the H, D, and V subbands, we empirically choose the number of RVQ stages as 10 and 7 for 256 × 256 and 512 × 512 test images, respectively. Because the characteristics of the H, V, and D subbands are different, each category has its own codebooks; therefore, 30 and 21 codebooks are trained with the K-means algorithm for 256 × 256 and 512 × 512 monochrome images, respectively (a minimal K-means training sketch is given below, just before Fig. 8). The codebook size of the first RVQ stage is 64 words, and that of the other RVQ stages is 32 words. Each codeword is a 4D vector in R^4.

3.1 Gray-Level Still Image Compression

In this subsection, we compare the proposed hybrid method with the SPECK and SPIHT image codecs by encoding and decoding several test images (Fig. 8). Both 256 × 256 and 512 × 512 test images are used. The proposed hybrid method and SPECK are compared on the 256 × 256 test images first, and then the three methods (including SPIHT) are compared on the 512 × 512 test images. The compression rate is measured in bits per pixel (bpp), and the peak signal-to-noise ratio (PSNR), measured in dB, is used to evaluate the decoded image quality. For the simulation of the 256 × 256 monochrome images, 41 images, which do not include the three test images, are used to train the codebooks of the RVQs of the proposed method. Table 1 shows the simulation results for the 256 × 256 test images, and Figs. 9-11 show the PSNR-bpp curves for the three test images, where the horizontal and vertical axes are the compression rate in bpp and the PSNR in dB, respectively. For the 256 × 256 monochrome image Lena, the proposed hybrid coder outperforms the SPECK coder by 1.67 dB at 1.0 bpp, and by 0.48 dB on average from 0.1 bpp to 1.5 bpp. For the 256 × 256 monochrome image Barbara, the proposed hybrid coder outperforms the SPECK coder by 1.23 dB at 1.1 bpp, and by 0.49 dB on average.
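The stage-sequential K-means training mentioned above can be sketched as follows (not the authors' code; it assumes scikit-learn's KMeans, 4D training vectors gathered from one subband category, and the 64/32 stage sizes described above): each stage is trained on the residuals left by the previous stage.

    import numpy as np
    from sklearn.cluster import KMeans

    def train_rvq_codebooks(train_vectors, stage_sizes=(64, 32, 32, 32, 32, 32, 32)):
        """Train one RVQ (one codebook per stage) on a set of 4D training vectors."""
        codebooks = []
        residuals = np.asarray(train_vectors, dtype=float)
        for size in stage_sizes:
            km = KMeans(n_clusters=size, n_init=10, random_state=0).fit(residuals)
            cb = km.cluster_centers_
            codebooks.append(cb)
            # Quantize with this stage's codebook and pass the residuals on.
            residuals = residuals - cb[km.labels_]
        return codebooks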


Fig. 8. Three 8-bit gray-level 256 × 256 test images: (a) Lena; (b) Barbara; (c) Goldhill.

Table 1. Simulation results of 256 × 256 test images.

                          PSNR (dB)
bpp      Lena                Barbara             Goldhill
         SPECK    Proposed   SPECK    Proposed   SPECK    Proposed
1.5      40.89    41.68      39.48    39.59      33.61    34.34
1.4      40.51    40.82      39.02    39.21      33.22    33.92
1.3      40.09    40.07      38.56    38.79      32.85    33.53
1.2      39.58    39.68      37.60    38.38      32.48    33.19
1.1      39.05    39.19      36.21    37.44      32.10    32.64
1.0      36.96    38.63      35.71    36.41      31.48    31.72
0.9      36.41    37.24      35.20    35.46      30.56    30.97
0.8      35.78    36.02      34.64    34.87      30.10    30.43
0.7      35.08    35.34      34.00    34.31      29.61    29.98
0.6      33.74    34.55      32.37    33.59      29.09    29.48
0.5      32.39    32.85      31.65    31.90      28.54    29.00
0.4      31.43    31.56      30.80    30.95      27.44    27.76
0.3      29.33    30.44      29.79    30.09      26.74    26.89
0.25     28.72    28.97      28.59    28.93      26.29    26.49
0.2      28.01    28.14      27.99    28.33      25.77    26.03
0.125    25.89    26.58      25.84    26.84      24.46    24.92
0.1      25.25    25.44      25.22    25.89      24.11    24.49

Fig. 9. The experimental results of the 256 × 256 gray-level image Lena.

Fig. 10. The experimental results of the 256 × 256 gray-level image Barbara.


Fig. 11. The experimental results of the 256 × 256 gray-level image Goldhill.

Fig. 12. The average improvements of the proposed hybrid coder compared with the original SPECK on additional 256 × 256 test images.

For the third 256 × 256 gray-level image, Goldhill, the proposed hybrid coder outperforms the SPECK coder by 0.73 dB at 1.5 bpp, and by 0.43 dB on average. The experimental results for more test images, obtained from the USC (University of Southern California) image database, are shown in Fig. 12, where the curve denotes the average improvement of the proposed hybrid coder over the pure SPECK coder. The proposed hybrid coder is thus preferable to the SPECK coder in terms of the PSNR-bpp curves.

For the experiments on 512 × 512 gray-level still images, the proposed hybrid method, SPECK, and SPIHT (with arithmetic coding) are simulated and compared with each other. SPIHT is selected for comparison because it is a wavelet-based method with very good performance and used in JPEG2000. A set of codebooks was trained with the K-means method on 8 training images downloaded from the USC image database. The number of stages of an RVQ in the proposed hybrid method was empirically reduced to 7, since using fewer stages in an RVQ usually gives better performance (higher PSNR values) at low bit rates. The vectors used for the 512 × 512 images are also 4D vectors (Fig. 3) in the vector space R^4. The 7 thresholds of the 3 RVQs in the proposed hybrid method are 2^-1, 2^-2, 2^-3, 2^-4, 2^-5, 2^-6, and 0. Table 2 shows the simulation results of the proposed hybrid method, SPECK, and SPIHT (with arithmetic coding) on the 512 × 512 test images. SPECK and SPIHT are two state-of-the-art techniques, and which one performs better usually depends on the image characteristics. According to the results in Table 2, although we cannot guarantee that the proposed hybrid method always has the best performance, it improves on the SPECK codec for most images, especially under low bit-rate conditions. Three 0.25-bpp decoded images produced by SPECK, the proposed method, and SPIHT are shown in Fig. 13; it is difficult to tell these images apart at a glance. By carefully inspecting the reconstructed images in Fig. 13, we found that the image of the SPECK codec is smoother than the others and that the proposed hybrid codec preserves more small details of the original image.

Chao et al. proposed a vector SPECK [27] for still gray-level image compression. Three types of VQ (full-search VQ, tree-structured VQ, and entropy-constrained VQ) are used in their method at the same time, and the vector dimension and vector entries depend on the subband and quantization level where the vector is located.


Table 2. Simulation results for 512 × 512 test images.

                                      PSNR (dB)
bpp      Lena                        Barbara                     Goldhill
         SPECK   Proposed  SPIHT     SPECK   Proposed  SPIHT     SPECK   Proposed  SPIHT
1.0      40.44   40.29     39.89     35.23   36.18     36.77     34.89   35.42     35.82
0.9      39.99   40.03     39.39     34.67   34.83     35.96     34.46   35.00     35.31
0.8      39.54   39.74     38.69     34.00   34.32     35.01     33.99   34.52     34.78
0.7      38.89   39.40     38.14     32.85   33.75     33.88     33.49   34.06     34.15
0.6      37.46   39.01     37.53     31.40   33.00     32.72     32.72   33.39     33.36
0.5      36.87   37.38     36.78     30.62   31.02     31.63     31.71   32.32     32.55
0.4      36.03   36.62     35.82     29.68   30.10     30.33     31.03   31.59     31.69
0.3      34.07   35.61     34.42     28.00   28.95     28.54     30.18   30.72     30.79
0.25     33.46   34.16     33.65     27.30   27.98     27.60     29.61   30.20     30.15
0.2      32.64   33.29     32.71     26.49   26.92     26.66     28.69   29.26     29.39
0.1      29.48   30.31     29.82     23.99   24.75     24.37     27.03   27.62     27.63

Table 3. Experimental results of SPECK, JPEG2000, and vector SPECK for the Lena image, from [27] (PSNR in dB).

Bit rate (bpp)   SPECK   JPEG2000   Vector SPECK
0.125            30.96   30.92      31.25
0.2              32.99   32.96      33.47
0.25             34.03   34.09      34.33

Fig. 13. Decoded images of (a) SPECK, (b) the proposed hybrid method, and (c) SPIHT at 0.25 bpp.

A large number (1,500) of training images and the Lloyd splitting method are used to train the codebooks. Vector SPECK can outperform the JPEG2000 codec under low bit-rate conditions at the cost of added complexity, but it does not handle the lower bit planes for n = 3, 2, 1, and 0. Compared with vector SPECK, the proposed hybrid method has the features of low complexity and a wide bit-rate range. Table 3 shows some experimental data from [27]; they used 5 decomposition levels, the 9/7 DWT, and arithmetic coding in SPECK. Since the conditions of Tables 2 and 3 are different, the results of the same method in Tables 2 and 3 are not equal; hence, we only compare the difference values of SPECK and the proposed hybrid method in Table 2 with the difference values of SPECK and JPEG2000 (or vector SPECK) in Table 3. For the Lena image at 0.25 bpp, vector SPECK outperforms SPECK by 0.3 dB and JPEG2000 outperforms SPECK by 0.06 dB (Table 3), whereas the proposed hybrid method outperforms SPECK by 0.7 dB (Table 2).


This shows that the proposed hybrid method is very competitive and efficient.

3.2 Chroma Subsampling Image Compression

The goal of this simulation is to compare the performance of the proposed hybrid coder with that of the CSPECK coder for YUV 4:2:0 images. Based on the simulation results, a proper coder can be chosen for applications that use such a format, e.g. MPEG-4, PAL DV, DVCAM, HDV, JPEG/JFIF, H.261, VC-1, and MJPEG. The test images used in the simulation have a 256 × 256 Y (luminance) plane and 128 × 128 U and V (chrominance) planes. The 9/7-tap biorthogonal wavelet transform is performed on each plane separately, and the number of decomposition levels is four. For a CSPECK codec, the decoder needs to know the maximum number of binary bit planes (n_max) used for coding the transformed image. Excluding the three LL subbands of the Y, U, and V planes, the other coefficients (in the H, V, and D subbands of each plane) of the transformed image are coded by CSPECK with three 10-stage RVQs. In our experiments, 55 color images are used to generate 90 codebooks for the proposed codec, since the RVQs have 10 stages and there are three YUV planes, each with three types (H, V, and D) of 4D coefficient vectors. For the vectors in the H subbands of the Y plane, 128 basis vectors are selected for the vectors with L2 norms in [0.5, 1), and each of the other 9 codebooks of the H subbands in the Y plane has 64 codewords. The same basis vector arrangement as in the H subbands is used in the D and V subbands of the Y plane. Because the human eye is less sensitive to chrominance than to luminance, fewer basis vectors are used for the U and V planes: for the vectors in the H, D, or V subbands of the U or V plane, 32 basis vectors are used in the highest (i.e. 10th) stage of the RVQs, and each of the other 9 stages has 16 basis vectors. All the codebooks are trained with the simple K-means method. The equivalent bit-per-pixel (ebpp) value defined in Eq. (3) is used to represent the compression rate of a coded YUV 4:2:0 image:

ebpp = (number of bits used) / (256^2 + 2 · 128^2).                            (3)

The 256 × 256 color test image Goldhill is used for simulation, and the curves in Figs. 14-16 show the simulation results. The test image is originally of size 256 × 256 in the R, G, and B planes (true color space), so it had to be preprocessed before simulation: first, the test image was transformed into the YUV space; then, the U and V planes were downsampled to 128 × 128 pixels, where the downsampling computes the arithmetic mean of each group of four adjacent pixel values (sketched below). We compare the PSNR values of the proposed hybrid method with those of the CSPECK coder in the Y, U, and V planes, respectively. The PSNR values are improved by 1.11 dB for the Y plane, 0.99 dB for the U plane, and 2.31 dB for the V plane at a bit budget of 98,304 bits (1.0 ebpp). For the same image, the average PSNR values (from 0.1 ebpp to 1.5 ebpp) of the proposed method are higher than those of CSPECK by 0.66 dB, 1.21 dB, and 2.22 dB in the Y, U, and V planes, respectively.
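The preprocessing and rate measure described above can be sketched as follows (illustrative only): the chrominance planes are downsampled by averaging each 2 × 2 neighborhood, and the equivalent bit-per-pixel value of Eq. (3) divides the number of coded bits by the total number of Y, U, and V samples.

    import numpy as np

    def downsample_420(plane):
        """Average each 2x2 block, e.g. a 256x256 chroma plane -> 128x128 (4:2:0)."""
        h, w = plane.shape
        blocks = plane[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2)
        return blocks.mean(axis=(1, 3))

    def ebpp(num_bits, y_size=256, c_size=128):
        """Equivalent bits per pixel for a YUV 4:2:0 image, as in Eq. (3)."""
        return num_bits / (y_size ** 2 + 2 * c_size ** 2)

    print(ebpp(98304))   # 1.0 ebpp for a 256x256 Y plane with two 128x128 chroma planes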


Fig. 14. The Y-plane experimental results of the chroma subsampling image Goldhill.

Fig. 15. The U-plane experimental results of the chroma subsampling image Goldhill.

Fig. 16. The V-plane experimental results of the chroma subsampling image Goldhill.

Based on the simulation results, the proposed method clearly offers larger improvements in the two chrominance planes (U and V), since the colors (chrominance information) of the four neighbors in a 2 × 2 block are usually similar. The luminance values, on the other hand, are more likely to change abruptly than the chrominance values because of sharp edges and corners. Even so, the proposed method also achieves good results in the Y plane. The major added cost of the proposed method is the need to train codebooks and determine parameters before encoding. Since the most time-consuming part, the codebook design, can be done off-line and the codebook sizes of the RVQs are small, the proposed hybrid method is efficient in both time and bit budget.

4. CONCLUSIONS
In this paper, we propose a hybrid image coder, based on SPECK and RVQ, for still gray-level and chroma subsampling images. Compared with SPIHT and SPECK (two state-of-the-art algorithms), the experimental results show that the proposed hybrid method is efficient for image compression. Depending on the application of interest, the flexible proposed hybrid codec can be designed to improve its low-bit-rate or high-bit-rate performance by using a short RVQ or a long RVQ.


We have also shown that the proposed hybrid method has superior performance for the chrominance planes (i.e. the U and V planes in the YUV color space) in chroma subsampling image compression. Because of the asymmetry property of VQ, the proposed hybrid method is suitable for applications whose load is also asymmetric and heavy on the decoding side (e.g. the image archiving of an image database, where an image is encoded once but decoded many times). Although the proposed hybrid codec is asymmetric, using RVQ instead of full-search VQ keeps the increased complexity affordable and worthwhile.

ACKNOWLEDGEMENTS
The authors would like to thank the anonymous reviewers for their comments that significantly helped improve this paper.

REFERENCES
1. H. G. Musmann, P. Pirsch, and H. J. Grallert, Advances in picture coding, in Proceedings of the IEEE, Vol. 73, 1985, pp. 523-548.
2. R. J. Clarke, Transform Coding of Images, Academic Press, New York, 1985.
3. O. J. Kwon and R. Chellappa, Region adaptive subband image coding, IEEE Transactions on Image Processing, Vol. 7, 1998, pp. 632-648.
4. K. R. Rao and J. J. Hwang, Techniques and Standards for Image, Video, and Audio Coding, Prentice Hall, New Jersey, 1996.
5. W. B. Pennebaker and J. L. Mitchell, JPEG Still Image Data Compression Standard, Van Nostrand Reinhold, New York, 1993.
6. JPEG2000 Core Coding System (Part 1), ISO/IEC 15444-1, Dec. 2000.
7. B. E. Usevitch, A tutorial on modern lossy wavelet image compression: Foundations of JPEG2000, IEEE Signal Processing Magazine, Vol. 18, 2001, pp. 22-35.
8. J. M. Shapiro, Embedded image coding using zerotrees of wavelet coefficients, IEEE Transactions on Signal Processing, Vol. 41, 1993, pp. 3445-3462.
9. A. Said and W. A. Pearlman, A new, fast, and efficient image codec based on set partitioning in hierarchical trees, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 6, 1996, pp. 243-250.
10. S. D. Servetto, K. Ramchandran, and M. T. Orchard, Image coding based on a morphological representation of wavelet data, IEEE Transactions on Image Processing, Vol. 8, 1999, pp. 1161-1174.
11. E. S. Hong and R. E. Ladner, Group testing for image compression, IEEE Transactions on Image Processing, Vol. 11, 2002, pp. 901-911.
12. A. Islam and W. A. Pearlman, An embedded and efficient low-complexity hierarchical image coder, in Proceedings of SPIE Visual Communications and Image Processing, Vol. 3653, 1999, pp. 294-305.
13. W. A. Pearlman, A. Islam, N. Nagaraj, and A. Said, Efficient, low-complexity image coding with a set-partitioning embedded block coder, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 14, 2004, pp. 1219-1235.
14. G. Strang and T. Nguyen, Wavelets and Filter Banks, Wellesley-Cambridge Press, MA, 1996.
15. C. Chrysafis, A. Said, A. Drukarev, and W. A. Pearlman, SBHP - a low complexity wavelet coder, in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, 2000, pp. 2035-2038.
16. S. T. Hsiang and J. W. Woods, Embedded image coding using zero blocks of subband/wavelet coefficients and context modeling, in Proceedings of IEEE International Conference on Circuits and Systems, 2000, pp. 662-665.
17. C. E. Shannon, A mathematical theory of communication, The Bell System Technical Journal, Vol. 27, 1948, pp. 379-423, 623-656.
18. A. Gersho and R. M. Gray, Vector Quantization and Signal Compression, Kluwer Academic Publishers, MA, 1992.
19. S. Gupta and A. Gersho, Feature predictive vector quantization of multispectral images, IEEE Transactions on Geoscience and Remote Sensing, Vol. 30, 1992, pp. 491-501.
20. C. K. Su, H. C. Hsin, and S. F. Lin, Wavelet tree classification and hybrid coding for image compression, IEE Proceedings - Vision, Image, and Signal Processing, Vol. 152, 2005, pp. 752-756.
21. T. K. Abdel-Galil, E. F. El-Saadany, A. M. Youssef, and M. M. Salama, Disturbance classification using hidden Markov models and vector quantization, IEEE Transactions on Power Delivery, Vol. 20, 2005, pp. 2129-2135.
22. C. F. Barnes, Residual quantizers, Ph.D. Dissertation, Department of Electrical and Computer Engineering, Brigham Young University, Provo, UT, 1989.
23. F. Kossentini, M. J. T. Smith, and C. F. Barnes, Image coding using entropy-constrained residual vector quantization, IEEE Transactions on Image Processing, Vol. 4, 1995, pp. 1349-1357.
24. Y. Shoham, Hierarchical vector quantization with application to speech waveform coding, Ph.D. Dissertation, Department of Electrical and Computer Engineering, University of California at Santa Barbara, 1985.
25. B. H. Juang and A. H. Gray, Multiple stage vector quantization for speech coding, in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 1, 1982, pp. 597-600.
26. G. Xie and H. Shen, Highly scalable, low-complexity image coding using zeroblocks of wavelet coefficients, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 15, 2005, pp. 762-770.
27. C. C. Chao and R. M. Gray, Image compression with a vector SPECK algorithm, in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 2, 2006, pp. 445-448.

Sheng-Fuu Lin was born in Taiwan, R.O.C., in 1954. He received the B.S. and M.S. degrees in Mathematics from National Taiwan Normal University in 1976 and 1979, respectively, a second M.S. degree in Computer Science from the University of Maryland in 1985, and the Ph.D. degree in Electrical Engineering from the University of Illinois, Champaign, in 1988. Since 1988, he has been on the faculty of the Department of Electrical and Control Engineering at National Chiao Tung University, Hsinchu, Taiwan.


His research interests include fuzzy theory, automatic target recognition, scheduling, image processing, and image recognition. Professor Lin is a member of the IEEE Control Society, the Chinese Fuzzy System Association, and the Chinese Automatic Control Society.

Hsi-Chin Hsin received the M.S. and Ph.D. degrees in Electrical Engineering from the University of Pittsburgh, Pittsburgh, PA, in 1992 and 1995, respectively. He is a Professor in the Department of Computer Science and Information Engineering at National United University, Taiwan. His research interests include the wavelet transform, image processing, CORDIC, DSP architectures, and systems on chip.

Chien-Kun Su was born in 1962. He received the B.S. degree from National Taiwan University, Taiwan, in 1989, the M.S. degree from the University of Southern California, U.S.A., in 1992, and the Ph.D. degree from National Chiao Tung University, Taiwan, in 2008. He has been on the faculty of the Department of Electrical Engineering at Chung Hua University, Hsinchu, Taiwan, since 1995. His research interests include image processing and computer vision.
