

Digital Video Compression using Embedded Zerotree Wavelet Encoding and Non-uniform Quantization
Samir J. AL-Muraab

Abstract- Video compression is the process of reducing file size without significantly degrading video quality. This is achieved by reducing the redundancy between consecutive frames and within a single frame. Motion estimation reduces the temporal redundancy between frames by dividing each frame into blocks and searching for the best match according to a cost function. The proposed compression system uses the conventional red, green and blue (RGB) color space representation and applies a two-dimensional discrete wavelet transform (DWT) before estimating the motion between frames, so that the error residual signal is formed in the wavelet domain. One threshold value is assigned to the first-level detail subband coefficients, and a second, smaller threshold value is assigned to the second-level detail subband coefficients. The resulting coefficients are quantized using a non-uniform quantizer and encoded using the Embedded Zerotree Wavelet (EZW) encoding algorithm. The system is tested using two kinds of video (simple and complex). Its performance is measured by three main parameters: compression ratio (CR), peak signal-to-noise ratio (PSNR) and processing time (PT). All code is executed and all graphics are produced using MATLAB 2008a.

Index Terms- Compression ratio, non-uniform quantizer, peak signal-to-noise ratio, processing time, threshold.



Digital video processing has become very important; digital video is now used in many different applications such as broadcasting, teleconferencing, mobile telephony, surveillance and entertainment. Compression of video has become a necessary operation because the transmission and storage of uncompressed video would be very costly and impractical. For example, a video sequence running for 60 minutes at 25 frames per second, at a resolution of 240*320 and with 8 bits per pixel, would require 6912*10^6 bytes, which demands a lot of storage and transmission bandwidth. Compression is the process of reducing the number of bits required to represent an image or video without losing much of its quality. The time required to compress the video sequence must also be reduced, especially for online transmission over slow internet connections [1].
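The storage figure quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch; the function name and its parameters are illustrative and do not come from the original MATLAB code.

```python
def uncompressed_video_bytes(minutes, fps, height, width, bits_per_pixel):
    """Raw storage, in bytes, of an uncompressed video sequence."""
    frames = minutes * 60 * fps                      # total number of frames
    bits = frames * height * width * bits_per_pixel  # total number of bits
    return bits // 8                                 # 8 bits per byte


size = uncompressed_video_bytes(minutes=60, fps=25, height=240, width=320,
                                bits_per_pixel=8)
print(size)  # 6912000000, i.e. 6912*10^6 bytes
```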

Video data has a great deal of redundancy, of two kinds. Spatial Redundancy: The redundancy that exists within a single frame; it can be reduced, or ideally eliminated, using spatial compression. Spatial compression can be defined as the process of decorrelating frame pixels. Many techniques are used to conduct this process; for example, the discrete cosine transform is used to analyze the image. The most common signal transform is the discrete wavelet transform (DWT).

The DWT provides sufficient information for analysis of the original signal, and using it helps reduce computation time. The time-scale representation of a digital signal is obtained using digital filtering: filters with different cutoff frequencies are used to analyze the signal at different scales. The resolution of the signal is a measure of the amount of information in the signal. In a one-dimensional DWT, the signal is passed through wavelet filters and every second sample is then discarded (downsampling), producing a low-frequency band (L) and a high-frequency band (H). In image and video applications a two-dimensional DWT is used, which applies the one-dimensional DWT along the vertical and horizontal directions. This produces four frequency subbands: LL, LH, HL and HH. LL is a smaller version of the original image, called the "Approximation", and is the most important subband [2]. The other three are called detail subbands. The Approximation band can be decomposed further, up to L levels, each level producing another four subbands, as shown in Figure (1).

Figure (1): 1-level and 2-level DWT decomposition of an image
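The row-then-column decomposition described above can be sketched with the simplest wavelet, the Haar filter. This is an illustrative example only: it uses the averaging form of the Haar filter (not the orthonormal scaling), and the subband naming, in which the first letter refers to row filtering and the second to column filtering, is an assumption, since conventions differ between texts. All function names are hypothetical.

```python
def haar_1d(signal):
    """One Haar analysis step: pairwise averages (L) and half-differences (H)."""
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal) - 1, 2)]
    low = [(a + b) / 2 for a, b in pairs]
    high = [(a - b) / 2 for a, b in pairs]
    return low, high


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def filter_rows(matrix):
    """Apply haar_1d to every row; returns the L and H halves as two matrices."""
    lows, highs = [], []
    for row in matrix:
        low, high = haar_1d(row)
        lows.append(low)
        highs.append(high)
    return lows, highs


def haar_2d(image):
    """One 2-D level: filter the rows, then the columns of each half."""
    row_low, row_high = filter_rows(image)
    ll_t, lh_t = filter_rows(transpose(row_low))    # columns of the L half
    hl_t, hh_t = filter_rows(transpose(row_high))   # columns of the H half
    return transpose(ll_t), transpose(lh_t), transpose(hl_t), transpose(hh_t)


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
LL, LH, HL, HH = haar_2d(image)      # first level: four subbands
LL2, LH2, HL2, HH2 = haar_2d(LL)     # second level: decompose LL again
print(LL)   # [[3.5, 5.5], [11.5, 13.5]]
print(LL2)  # [[8.5]]
```

After two levels the image is represented by seven subbands: LL2, LH2, HL2, HH2 and the three first-level detail subbands LH, HL and HH.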

Temporal Redundancy: Consecutive video frames are similar, so there is a great deal of redundancy between them. One technique used to reduce this redundancy is motion estimation, which compares two frames and uses a cost function to find the best match between blocks. Several search strategies exist for motion estimation; the one used here is Adaptive Rood Pattern Search (ARPS). ARPS exploits the fact that the motion of blocks belonging to the same object is coherent. The search starts at the center of the search window, and the cost function is used to find the best match. Once the best match of this first step is obtained, the search switches to a small diamond search pattern. The procedure continues until the minimum of the cost function is found [3].
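The role of the cost function in block matching can be illustrated with a short sketch. Note that this is not the adaptive search of [3]: for simplicity it tests every displacement in the search window (exhaustive search), and it assumes the mean absolute difference (MAD) as the cost function, which the text does not specify. The function names and the synthetic test frames are hypothetical.

```python
def mad(cur, ref, cy, cx, ry, rx, n):
    """Mean absolute difference between the n*n block of cur at (cy, cx)
    and the n*n block of ref at (ry, rx)."""
    total = 0
    for i in range(n):
        for j in range(n):
            total += abs(cur[cy + i][cx + j] - ref[ry + i][rx + j])
    return total / (n * n)


def best_match(cur, ref, cy, cx, n=4, p=4):
    """Exhaustive search within +/-p pixels; returns ((dy, dx), cost)."""
    height, width = len(ref), len(ref[0])
    best = ((0, 0), mad(cur, ref, cy, cx, cy, cx, n))
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            ry, rx = cy + dy, cx + dx
            # Skip candidate blocks that fall outside the reference frame.
            if 0 <= ry <= height - n and 0 <= rx <= width - n:
                cost = mad(cur, ref, cy, cx, ry, rx, n)
                if cost < best[1]:
                    best = ((dy, dx), cost)
    return best


# Synthetic 12*12 reference frame and a current frame whose content is the
# reference shifted by 1 row and 2 columns.
ref = [[16 * y + x for x in range(12)] for y in range(12)]
cur = [[ref[y + 1][x + 2] if y + 1 < 12 and x + 2 < 12 else 0
        for x in range(12)] for y in range(12)]
vector, cost = best_match(cur, ref, 4, 4)
print(vector, cost)  # (1, 2) 0.0
```

An adaptive search such as ARPS aims to find the same minimum while evaluating far fewer candidate positions than the 81 tested here for p = 4.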


Embedded Zerotree Wavelet (EZW) compression is a well-known wavelet coding algorithm. Shapiro's EZW encoder was a major breakthrough [4]. The EZW encoder was originally designed to operate on images (2D signals), but it can also be used on signals of other dimensions. It is based on progressive encoding, which compresses an image into a bit stream with increasing accuracy [5].

2012 JOT


The EZW is based on two essential observations:
1. Images have a low-pass spectrum. When an image is wavelet transformed, the energy in the subbands decreases as the scale decreases; therefore the wavelet coefficients will, on average, be smaller in the higher subbands than in the lower subbands. This shows that progressive encoding is a very natural choice for compressing wavelet-transformed images, since the higher subbands only add detail.
2. Large wavelet coefficients are more important than small wavelet coefficients.
These two observations are exploited by encoding the wavelet coefficients in decreasing order, in several passes. For every pass a threshold is chosen against which all the wavelet coefficients are measured. If a wavelet coefficient is larger than the threshold, it is encoded and removed from the image. If it is smaller, it is left for the next pass. When all the wavelet coefficients have been visited, the threshold is lowered and the image is scanned again to add more detail to the already encoded image.
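The pass structure described above can be sketched as follows. This is a deliberately simplified illustration: it only groups coefficients by the pass in which they first exceed the threshold, and it omits the zerotree symbols, the subordinate (refinement) pass and the entropy coding of a full EZW encoder. It assumes the usual choice of a power-of-two initial threshold that is halved after each pass; the function name and the stop_threshold parameter are hypothetical.

```python
import math


def significance_passes(coeffs, stop_threshold=1):
    """Group coefficients by the pass in which they first become significant.

    Returns a list of (threshold, [(index, value), ...]) tuples, one per pass.
    """
    peak = max(abs(c) for c in coeffs)
    if peak < 1:
        return []
    # Initial threshold: largest power of two not exceeding the peak magnitude.
    threshold = 2 ** int(math.floor(math.log2(peak)))
    remaining = list(enumerate(coeffs))
    passes = []
    while threshold >= stop_threshold and remaining:
        found = [(i, c) for i, c in remaining if abs(c) >= threshold]
        remaining = [(i, c) for i, c in remaining if abs(c) < threshold]
        passes.append((threshold, found))
        threshold //= 2   # lower the threshold for the next pass
    return passes


coeffs = [63, -34, 49, 10, 7, 13, -12, 7]
for threshold, found in significance_passes(coeffs, stop_threshold=8):
    print(threshold, found)
# 32 [(0, 63), (1, -34), (2, 49)]
# 16 []
# 8 [(3, 10), (5, 13), (6, -12)]
```

Stopping at a threshold of 8 leaves the two coefficients of magnitude 7 uncoded, which illustrates how a larger stopping threshold trades reconstruction accuracy for a shorter bit stream.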

The proposed system shown in Figure (2) performs the two major steps of a compression system: spatial compression and temporal compression. The spatial compression stage reduces the spatial redundancy of each frame individually. A single color frame is a three-dimensional array; the color space used is RGB. The three-dimensional frame is spatially compressed by first applying a two-level, two-dimensional DWT; the output of this stage is seven subbands for each color layer. This stage is followed by hard thresholding. A threshold value (s1) is assigned to the second-level detail subbands, leaving the approximation subband unchanged, while a second threshold value (s2) is assigned to the first-level detail subbands; this is applied to each color layer. A non-uniform quantizer is then used to quantize the wavelet-domain coefficients. The quantizer takes the mean of the image layer and divides it by a chosen value (such as 0.2, 3, 5, 6, etc.) to set the step size relative to the mean value of the sub-group; each coefficient is then integer-divided by the step size and multiplied by the step size to give the output of the quantization stage.

The temporal compression stage uses motion estimation to reduce temporal redundancy, and the search used is ARPS. Temporal compression is conducted after spatial compression, which speeds up compression because ARPS reaches the best match (minimum cost function) sooner than when it is applied before spatial compression; this reduces the time taken by the algorithm. The residual error frame, which results from subtracting the motion-compensated frame from the spatially compressed frame in the wavelet domain, is encoded using the EZW algorithm. The EZW algorithm stops scanning when the EZW threshold (sth) is reached (the target point). The decompression process is the inverse of the compression process, as shown in Figure (2).

Figure (2): The proposed video compression and decompression system using EZW and Huffman coding in RGB color space
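The hard thresholding and mean-based quantization steps of the spatial compression stage can be sketched as below. Several details are assumptions, because the text does not fix them: the mean is taken over coefficient magnitudes so that the step size is positive, "integer division" is interpreted as truncation toward zero, and coefficients whose magnitude equals the threshold are kept. The function names are hypothetical.

```python
def hard_threshold(coeffs, threshold):
    """Set every coefficient whose magnitude is below the threshold to zero."""
    return [c if abs(c) >= threshold else 0 for c in coeffs]


def quantize(coeffs, divisor):
    """Quantize with step = (mean magnitude) / divisor.

    Each value is divided by the step, truncated to an integer and
    multiplied by the step again. Returns (quantized values, step).
    """
    mean_magnitude = sum(abs(c) for c in coeffs) / len(coeffs)
    step = mean_magnitude / divisor
    if step == 0:                     # all-zero subband: nothing to quantize
        return list(coeffs), step
    return [int(c / step) * step for c in coeffs], step


detail = [40, -3, 12, -24, 7, 0, -60, 8]
kept = hard_threshold(detail, 8)
print(kept)             # [40, 0, 12, -24, 0, 0, -60, 8]
quantized, step = quantize(kept, divisor=3)
print(step)             # 6.0
print(quantized)        # [36.0, 0.0, 12.0, -24.0, 0.0, 0.0, -60.0, 6.0]
```

Because the step size follows the mean of the data being quantized, subbands with larger coefficients receive a coarser step than subbands with smaller ones.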


Two sample videos are used to test the system. Each test video has a frame size of 256*256*3 and 60 frames. The search parameter of ARPS is 4 and the macroblock size of the search algorithm is 4*4. All MATLAB code is executed on a dual-core Pentium IV processor with 1 GB of RAM.

The filter used for the DWT is dbi, with 2-level decomposition. Several threshold values (s1), (s2) are used to test the system, and three different values of (sth) are used for every video sample. The step size value SS is set to 3. Table (1) shows the CR, PSNR and compression time taken by the algorithm; Figures (3-6) show the CR of each of the sixty frames of the test video samples. Figures (7, 8) show the first and the fifty-sixth reconstructed frames.
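The two quality measures can be computed as in the following sketch. The PSNR follows the standard definition for 8-bit samples. The paper does not define CR explicitly; since Table (1) reports it as a percentage, the sketch assumes that CR is the percentage reduction in size, which is an interpretation rather than a stated formula. The function names are hypothetical.

```python
import math


def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return float("inf")           # identical signals
    return 10 * math.log10(peak ** 2 / mse)


def compression_ratio_percent(original_bits, compressed_bits):
    """Assumed definition: percentage of the original size that was removed."""
    return 100 * (original_bits - compressed_bits) / original_bits


print(compression_ratio_percent(1000, 250))                 # 75.0
print(round(psnr([10, 20, 30, 40], [11, 19, 31, 39]), 2))   # 48.13
```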


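Table (1): PSNR, CR and PT for various thresholds and EZW thresholds

Video name & size          Thresholds      EZW thresh   PSNR in dB   CR        PT in sec
Rhinos 256*256             (10, 30, 3)     3            30.05        82.82%    15960.0
Rhinos 256*256             (10, 30, 3)     8            29.91        85.23%    12404.0
Rhinos 256*256             (10, 30, 3)     14           24.43        91.04%    8667.0
Rhinos 256*256             (70, 150, 3)    3            30.12        91.34%    7188.4
Rhinos 256*256             (70, 150, 3)    8            29.196       93.10%    6103.0
Rhinos 256*256             (70, 150, 3)    14           24.308       95.38%    4701.6
Mother&daughter 256*256    (10, 30, 3)     3            32.06        91.32%    10208
Mother&daughter 256*256    (10, 30, 3)     8            31.27        93.47%    5742.3
Mother&daughter 256*256    (10, 30, 3)     14           29.18        98.21%    4044.2
Mother&daughter 256*256    (70, 150, 3)    3            29.89        95.39%    5079.8
Mother&daughter 256*256    (70, 150, 3)    8            29.46        96.74%    3924.0
Mother&daughter 256*256    (70, 150, 3)    14           28.47        99.09%    1318.8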

Figure (7): The first and fifty-second reconstructed frames for various EZW thresholds in Table (1), using the Rhinos movie sample

Figure (8): The first and fifty-second reconstructed frames for various EZW thresholds in Table (1), using the Mother & daughter movie sample

Performing motion estimation in the wavelet domain and applying EZW to the error residual signal has decreased the time used by the algorithm. The proposed system achieves a high CR (82.82% to 99.09% in Table (1)) with PSNR between 24.31 dB and 32.06 dB. The CR increases as the thresholds s1 and s2 increase, and decreases as they decrease, because the number of retained coefficients varies with these two parameters. The PSNR increases as the EZW threshold (sth) decreases, since more coefficients are included in the dominant pass. The algorithm produces good picture quality.


1. T. Sikora, "MPEG-1 and MPEG-2 digital video coding standard", McGraw-Hill Book Company, Ed. R. Jurgens, 1998.
2. N. Venkateswaran and Y. V. Ramana Rao, "K-Means Clustering Based Image Compression in Wavelet Domain", Asian Network for Scientific Information, Information Technology Journal 6(1), 148-153, 2007.
3. Yao Nie and Kai-Kuang Ma, "Adaptive Rood Pattern Search for Fast Block-Matching Motion Estimation", IEEE Transactions on Image Processing, Vol. 11, No. 12, December 2002.
4. J. M. Shapiro, "Embedded Image Coding Using Zerotrees of Wavelet Coefficients", IEEE Transactions on Signal Processing, Vol. 41, No. 12, 1993, pp. 3445-3462.
5. V. S. Shingate, T. R. Sontakke and S. N. Talbar, "Still Image Compression using Embedded Zerotree Wavelet Encoding", International Journal of Computer Science & Communication (IJCSC), Vol. 1, January-June 2010, pp. 21

Samir Jasam Mohammad (Member, IEEE) was born in Babylon, Iraq, in 1959. He received the B.Sc. degree from the Electronics and Communications Department of the University of Baghdad, Iraq, in 1984, and the M.Sc. and Ph.D. degrees in Electronics and Communication Engineering from the University of Technology, Iraq, in 1987 and 2004 respectively. Since 2004, he has been with the University of Babylon, Iraq, where he is a lecturer in the Electrical Engineering Department. His research interests include DVB, CDMA, modulation techniques and image processing.