
Lossless Compression of Medical Images Using Hilbert Space-Filling Curves

Jan-Yie Liang, Chih-Sheng Chen, Chua-Huang Huang, and Li Liu
Department of Information Engineering and Computer Science, Feng Chia University, Taichung, Taiwan
{liangty, chenc, chh}@pmlab.iecs.fcu.edu.tw
Graduate Institute of Medical Informatics, Taipei Medical University, Taipei, Taiwan
david@mail.tmch.org.tw

Abstract
A Hilbert space-filling curve is a curve in two-dimensional space that visits neighboring points consecutively without crossing itself. The application of Hilbert space-filling curves in image processing is to rearrange image pixels in order to enhance pixel locality. An iterative program of the Hilbert space-filling curve ordering, generated from a tensor product formulation, is used to rearrange the pixels of medical images. We implement four lossless encoding schemes, run-length encoding, LZ77 coding, LZW coding, and Huffman coding, along with the Hilbert space-filling curve ordering. Combinations of these encoding schemes are also implemented to study the effectiveness of various compression methods. In addition, differential encoding is applied to the medical images to study the effect of a different image representation on the above encoding schemes. In this paper, we report the test results of compression ratio and performance evaluation. The experiments show that the pre-processing operation of differential encoding followed by the Hilbert space-filling curve ordering, combined with the compression method of LZW coding followed by Huffman coding, gives the best compression result.

Keywords: Hilbert space-filling curve, run-length encoding, LZ77 coding, LZW coding, Huffman coding, differential encoding.

1 Introduction

Modern medical diagnostics are often based on X-ray computerized tomography (CT) and magnetic resonance imaging (MRI) techniques. The raw data delivered by such imaging devices usually take several megabytes of disk space. The diagnostic images for radiologic interpretation must be efficiently stored and transmitted to physicians for future medical or legal purposes. Digital medical image processing generates large and data-rich electronic files. To speed up electronic transmission and to minimize computer storage space, medical images are often compressed into files of smaller size. Compressed medical images must preserve all the original data details when they are restored for image presentation. That is, medical image compression and uncompression must be lossless.

Data compression is a long-standing human activity. Abbreviation and other devices for shortening the length of transmitted messages have been used in every human society. Before Shannon, the activity of data compression was informal. It was Shannon who first created a formal intellectual discipline for data compression. A remarkable outcome of Shannon's formalization of data compression has been the use of sophisticated theoretical ideas [24]. For example, the JPEG standard, proposed in the 1980s and in use for transmitting and storing images, uses discrete cosine transformation, quantization, run-length coding, and entropy coding [20]. In this paper, the concept of information theory is reviewed. Several coding schemes, including run-length coding, Huffman coding, LZ77 coding, and LZW coding, are used to compress medical images. The entropies of CT images before and after pre-processing are shown. The approach is to lower the entropy of the image to be compressed, increasing the amount of compression obtained from these compression methods.

Run-length encoding is a simple and effective compression scheme [6]. An example of a real-world use of run-length coding is the ITU-T T.4 (G3 fax) standard for facsimile data compression. This is the standard for all home and business facsimile machines used over regular phone lines. The basic idea is to identify strings of adjacent messages of equal value and replace them with a single occurrence along with a count.

The Huffman coding scheme first scans the source data to obtain a probability model of the source data and then generates a coding tree using the probability model [8]. The Huffman coding scheme is probably the most widely used one in data and image compression programs, such as GZIP and JPEG. The Lempel-Ziv coding schemes, known as the LZ77 and LZW coding schemes, are dictionary based [15, 16]. A dictionary is created at the time the source data is parsed. The LZW coding scheme is often enhanced with a probability model as in the Huffman coding scheme. Examples of enhanced Lempel-Ziv coding are GZIP and ITU-T V.42bis.

The Hilbert space-filling curve is a one-to-one mapping between an n-dimensional space and a one-dimensional space [7]. The Hilbert space-filling curve scans all points of the n-dimensional space without crossing itself. Also, the curve preserves neighborhoods of points as much as possible. Since David Hilbert presented the Hilbert space-filling curve in 1891, there have been several research works on formal specifications of operational or functional models. A mathematical history of Hilbert space-filling curves is given by Sagan [22]. Butz proposes an algorithm to compute the mapping function of Hilbert space-filling curves using bit operations [2, 3]. Quinqueton and Berthod propose an algorithm for computing all addresses of the scanning path as a recursive procedure [21]. Kamata et al. propose a non-recursive algorithm for the n-dimensional Hilbert space-filling curve using look-up tables [11, 12]. Lin et al. present a tensor product based algebraic formulation of two-dimensional and three-dimensional Hilbert space-filling curves [4, 18]. The tensor product formulas are also used to generate Hilbert space-filling curves in the C programming language. Jagadish analyzes the clustering properties of Hilbert space-filling curves [9]. He shows that the Hilbert space-filling curve achieves the best clustering, i.e., it is the best space-filling curve in minimizing the number of clusters. Moon et al. provide closed-form formulas for the number of clusters required by a given query region of an arbitrary shape for Hilbert space-filling curves [19].

Khuri and Hsu propose a run-length encoding algorithm with Hilbert space-filling curve ordering [14]. Some experiments show that storage can be reduced by up to 60% in the Hilbert space-filling curve order compared to the row/column major order for images with high spatial homogeneity [1, 5]. In this paper, we investigate lossless compression of medical images using the Hilbert space-filling curve order. The programs generated from tensor product formulas [18] are used to perform pixel reordering. Four coding schemes, run-length coding, Huffman coding, LZ77 coding, and LZW coding, are tested to compress medical images. These schemes are also combined in image compression. In addition to the coding schemes, images are pre-processed with order-one to order-three differentiation.

The paper is organized as follows. An overview of lossless compression methods is given in Section 2. The Hilbert space-filling curve is described in Section 3. Programs generated from tensor product formulas for both the Hilbert space-filling curve ordering and its inverse are included in Section 3; the inverse Hilbert space-filling curve order is needed for data uncompression. Experiments and performance evaluation are presented and discussed in Section 4. Concluding remarks and future works are given in Section 5.

2 Lossless Data Compression

Compression is a process used to reduce the physical size of information. A compression process has several goals: to store more information on the same media (reduce disk space usage), to reduce transmission time and bandwidth on a network, and to reuse a digital image. When we speak of a compression technique, we actually refer to two dual processes: construction of the compressed data and reconstruction of the original data. Based on the result of reconstruction, data compression schemes can be divided into two classes: lossless compression schemes and lossy compression schemes [23]. The compression ratio is defined as the ratio of the number of original uncompressed bytes to the number of compressed bytes, including overhead bytes. A compression technique is considered optimum when the information content of the compressed data is close to the entropy of the original image. Shannon shows that any definition of entropy satisfying his assumptions will be of the form H = -k * sum_i p_i log p_i, where k is the constant of a measurement unit [24]. This implies that the efficiency of a source alphabet can be defined simply as being equal to its entropy.

Hilbert space-filling curves have been applied to image compression broadly. In most of the works, image pixels initially stored in the row-major order are reordered using the Hilbert space-filling curve permutation, and then a given procedure is employed to perform data compression. If an image does not consist of horizontal or vertical stripes, the Hilbert space-filling curve ordering places similar color pixels in adjacent areas consecutively, i.e., the compressibility of an image under the Hilbert space-filling curve order is better than under the row-major order. Lempel and Ziv prove that the compression rate of an image is lower-bounded if it is scanned in the Hilbert space-filling curve order [17]. Kamata et al. use a simple zero-order interpolation compression algorithm to compress gray images and color images [10, 13]. Before the application of the zero-order interpolation, the pixels of an image are rearranged into the Hilbert space-filling curve order. Experimental results are reported. The compressed images lose some content details and are irreversible. However, according to their study, the quality of the compressed images is close to that of JPEG images.

Lossless compression implies no loss of information. If data has been losslessly compressed, the original data can be totally recovered from the compressed data. Lossless compression is used for applications that cannot tolerate any difference between the original and reconstructed data. Lossless compression is generally implemented using one of two different types of modeling: statistical and dictionary-based. A statistical model reads in and encodes a single symbol at a time using the probability of the character's appearance: reading a symbol, calculating its probability, and then generating its compressed code. An example of this compression scheme is Huffman coding. A dictionary-based compression technique uses a different concept. It reads in input data and looks for groups of symbols that appear in a dictionary. If a string match is found, a pointer or index into the dictionary can be output instead of the codes for the symbols. Intuitively, the longer the match, the better the compression. Static dictionary and adaptive dictionary schemes are the two kinds of dictionary schemes. The genesis of most modern adaptive dictionary schemes can be traced to two landmark papers by Lempel and Ziv in 1977 and 1978 [15, 16]. The coding schemes based on the 1977 paper are named LZ77 schemes; those based on the 1978 paper are named LZ78 schemes. For example, LZW is one of the LZ78 schemes. In the rest of this section, we briefly review the algorithms of the Huffman, LZ77, and LZW coding schemes. We also review a simple coding scheme, run-length coding. In data compression, some pre-processing steps may be performed before the application of a coding scheme. One such pre-process, called differential coding, computes the differences of adjacent data elements. The review therefore also includes the differential coding scheme.

2.1 Huffman Coding Algorithm

A Huffman code is an optimal prefix code generated from a set of probabilities [8]. The scheme first scans the source data to obtain a probability model of the source data and then generates a coding tree using the obtained probability model. The prefix-code tree is generated as follows:

1. Start with a forest of trees, one for each data element. Each tree contains a single vertex whose weight is the probability of the data element on the vertex.

2. Repeat until only a single tree remains:

   (a) Select the two trees with the lowest weight roots, say w1 and w2.

   (b) Combine them into a single tree by adding a new root with weight w1 + w2 and making the two trees its children. It does not matter which subtree is the left or right child, but the convention is to put the lower weight root on the left if w1 ≠ w2.

When building the prefix-code tree, we must consider the conditions for an optimal variable-length binary code [23]. We list these conditions as follows:

1. Given any two letters a_j and a_k, if P(a_j) ≥ P(a_k), then l_j ≤ l_k, where l_j is the number of bits in the codeword for a_j.

2. The two least probable letters have codewords with the same maximum length.

3. In the tree corresponding to the optimum code, there must be two branches stemming from each intermediate node.

4. Suppose we change an intermediate node into a leaf node by combining all the leaves descending from it into a composite word of a reduced alphabet. Then, if the original tree was optimal for the original alphabet, the reduced tree is optimal for the reduced alphabet.

In this paper, we build an optimal prefix-code tree when applying the Huffman coding scheme.

2.2 LZ77 Coding Algorithm

The LZ77 algorithm compresses by building a dictionary of previously seen strings consisting of groups of characters of varying lengths. At the highest level, the algorithm can be described as follows. The LZ77 algorithm and its variants use a sliding window that starts from the beginning of the data and moves along with the cursor. The window is divided into two parts: the part before the cursor, called the dictionary, and the part starting at the cursor, called the lookahead buffer. The sizes of these two parts are parameters of the program and are fixed during execution of the algorithm. The algorithm repeats the following steps:

1. Find the longest match of a string starting at the cursor and completely contained in the lookahead buffer to a string in the dictionary.

2. Output a triple (p, l, c) containing the position p of the occurrence in the window, the length l of the match, and the next character c past the match.

3. Move the cursor l + 1 characters forward.

We will use the LZ77 coding scheme in the experiments.

2.3 LZW Coding Algorithm

Unlike LZ77, the LZW coding scheme does not start with an existing dictionary. The LZW algorithm reads the data and tries to match a sequence of data bytes as long as possible with an encoded string in the dictionary. The matched data sequence and its succeeding character are grouped together and added to the dictionary for encoding later data sequences. For an image with b-bit pixels, the compressed code of each pixel occupies b bits or more. While a smaller compressed code results in a higher compression rate, it also limits the size of the dictionary. For example, a common arrangement uses a 12-bit compressed code for each 8-bit data element. A 12-bit code size allows 4096 entries in the dictionary. If the encoder runs out of space in the dictionary, the traditional LZW encoder must abort and try again with a larger compressed code size. The initial dictionary is a collection of roots containing all possible values of a b-bit pixel. Let S be a matched string that is empty at the beginning. The LZW compression algorithm starts from the beginning of the original data stream and performs the following steps:

1. Let c be the next character in the data stream.

2. Is the string S + c present in the dictionary?
   2.1. If yes, let S = S + c (extend S with c).
   2.2. If no,
        2.2.1. Output the code word that denotes S to the code stream.
        2.2.2. Add the string S + c to the dictionary.
        2.2.3. Let S = c (S now contains only the character c).

3. Is it the end of the data stream?
   3.1. If no, go to Step 1.
   3.2. If yes, output the code word that denotes S to the code stream.

We implement the LZW coding scheme with 8-bit pixels and a 12-bit code size.

2.4 Run-Length Encoding Algorithm

The run-length encoding scheme analyzes the data to be compressed by looking for runs of repeated characters. It stores each run as a single character preceded by a number representing how many times the character is repeated in the run. Random data without runs of repeated characters will not be compressed well; the algorithm cannot perform any compression at all. For example, the string ABAB becomes 1A1B1A1B, twice the length of the original string. In this case run-length encoding increases the data size. Run-length encoding has the advantage of being simple to implement, but it is incapable of achieving high levels of compression for most data. Although some graphical images may compress well using run-length encoding, most textual data does not contain long runs of repeated characters. The run-length encoding compression algorithm steps are:

1. Step through the source data from beginning to end, watching for repeated sequences of characters.

2. Build the compressed data string as the source data is scanned.

In this paper, we also compress medical images using the run-length encoding scheme and compare its results with those of the other encoding schemes.

2.5 Differential Encoding Algorithm

Let S = s_1 s_2 ... s_m be the source data string and s_i be the i-th character of S. A difference d_i is generated by taking d_i = s_i - s_{i-1}. Let D = d_1 d_2 ... d_m be the resulting data string. The differential encoding algorithm is described below:

1. Let i = 2, d_1 = s_1, and D = d_1.

2. Compute the difference d_i = s_i - s_{i-1}.

3. If d_i < 0 then let d_i = d_i + 256.

4. Let D = D d_i, i.e., append d_i to D.

5. If i is less than the length m of the source string then
   5.1. Let i = i + 1.
   5.2. Go to Step 2.

The value 256 in Step 3 is determined by the gray levels 00 to FF. Note that the resulting string of the differential encoding scheme has the same length as the source data string. Hence, the differential encoding scheme is only a pre-processing procedure of a data compression algorithm. It must be followed by another encoding scheme such as the Huffman, LZW, LZ77, or run-length encoding scheme.

3 Hilbert Space-Filling Curves

Peano, in 1890, discovered the existence of a continuous curve which passes through every point of a closed square. In 1891, Hilbert [7] presented a curve having the space-filling property in two-dimensional space, as shown in Figure 1(a). An understanding of the way in which a Hilbert curve is drawn is rapidly gained from Figure 1, which shows the first three steps of a recursive construction for the two-dimensional case. The order-2 Hilbert space-filling curve, as in Figure 1(b), is constructed from four copies of order-1 curves, and the order-3 curve, as in Figure 1(c), is constructed from four copies of order-2 curves. In the recursive construction, the subcurves are connected in a given order, i.e., the order of the four-point Gray permutation, which is also the order of the Hilbert space-filling curve, and the orientation of each subcurve must be adjusted to fit into the connection order.


The Hilbert space-filling curve visits all points of a continuous plane if we iterate the curve formation ad infinitum. For a finite space, the Hilbert space-filling curve is a curve on a 2^n × 2^n grid that visits consecutive neighboring points without crossing itself. An order-n Hilbert space-filling curve is recursively constructed from four order-(n-1) Hilbert space-filling curves. The recursive tensor product formula of the Hilbert space-filling curve, mapping two-dimensional points to one-dimensional points, is presented in [18]. This recursive tensor product formula is manipulated into an iterative formula. Operations of a tensor product can be mapped into program constructs of high-level programming languages. Therefore, the recursive and iterative tensor product formulas of Hilbert space-filling curves are translated into C programs. We use such a program to reorder the pixels of medical images before applying a compression algorithm described in Section 2. In addition to image compression, a compressed image must be uncompressed to restore the original format. In order to perform image uncompression, the inverse Hilbert space-filling curve must be used. With the iterative tensor product formula of the Hilbert space-filling curve, it is possible to derive the formula of the inverse Hilbert space-filling curve [18] and to generate its corresponding C program. We conclude this section with a C program of the Hilbert space-filling curve ordering.
#include <math.h>  /* for pow() */

void hilbert(int n, int *a) {
    int i1, j1, i;
    int b[(int)pow(4,n)];
    int tmp;

    if (n == 1) {
        tmp = a[2]; a[2] = a[3]; a[3] = tmp;
    } else {
        for (i1 = 0; i1 < (int)pow(2,n-1); i1++) {
            for (j1 = 0; j1 < (int)pow(2,n-1); j1++) {
                /* i0=0, j0=0 */
                b[j1*(int)pow(2,n-1)+i1] =
                    a[i1*(int)pow(2,n)+j1];
                /* i0=0, j0=1 */
                b[i1*(int)pow(2,n-1)+j1+(int)pow(4,n-1)] =
                    a[i1*(int)pow(2,n)+j1+(int)pow(2,n-1)];
                /* i0=1, j0=0 */
                b[((int)pow(2,n-1)-1-j1)*(int)pow(2,n-1)+
                  ((int)pow(2,n-1)-1-i1)+3*(int)pow(4,n-1)] =
                    a[i1*(int)pow(2,n)+j1+(int)pow(2,2*n-1)];
                /* i0=1, j0=1 */
                b[i1*(int)pow(2,n-1)+j1+2*(int)pow(4,n-1)] =
                    a[i1*(int)pow(2,n)+j1+(int)pow(2,2*n-1)+(int)pow(2,n-1)];
            }
        }
        for (i = 0; i < (int)pow(4,n); i++)
            a[i] = b[i];
        hilbert(n-1, &a[0]);
        hilbert(n-1, &a[(int)pow(4,n-1)]);
        hilbert(n-1, &a[2*(int)pow(4,n-1)]);
        hilbert(n-1, &a[3*(int)pow(4,n-1)]);
    }
}

4 Experiments and Performance Evaluation

Figure 1: Hilbert space-filling curves

The goal of this paper is to test and compare various lossless compression algorithms. These algorithms are applied with and without Hilbert space-filling curve ordering. The intention is to verify the effectiveness of Hilbert space-filling curve ordering in lossless medical image compression. Similarly to differential encoding, the Hilbert space-filling curve ordering is not a compression scheme but a pre-processing procedure. The experiments are carried out on a personal computer with a Pentium III 850 processor and 384 megabytes of memory running Microsoft Windows 2000. The programs are written in the C language and are developed using Microsoft Visual C++ 6.0. JPEG-LS, a lossless compression standard published jointly by ISO/ITU, is also tested and compared. The JPEG-LS program is a shareware published by the Signal Processing and Multimedia Group (SPMG). The images tested are 40 CT images, shown in Figure 2 (a1) to (a40). Each image is an 8-bit grayscale image stored in the bitmap format and occupies 263,222 bytes.

At first, we measure the average entropies of the images. The average entropy of the original CT images is 3.381574. The average entropy of the images rearranged into the Hilbert space-filling curve order and then processed by the differential scheme is 2.715905. For the CT images processed by the differential scheme and then rearranged into the Hilbert space-filling curve order, the average entropy is 2.626565. We observe that the CT images processed by the differential scheme followed by rearrangement into the Hilbert space-filling curve order have the lowest entropy.

In the experiments, we test various encoding schemes, including run-length encoding, LZ77 coding, LZW coding, and Huffman coding, and their combinations. Along with the encoding schemes, differentiation and Hilbert space-filling curve ordering are applied as pre-processing operations. The experimental cases are summarized as follows:

(1) The CT images are compressed by LZ77 coding.
(2) The CT images are compressed by LZW coding.
(3) The CT images are compressed by run-length encoding.
(4) The CT images are compressed by Huffman coding.
(5) The CT images are compressed by LZ77 coding followed by Huffman coding.
(6) The CT images are compressed by LZW coding followed by Huffman coding.
(7) The CT images are processed by JPEG-LS alone.

For each of the above encoding schemes, the following pre-processing operations are applied before image compression:

(a) The pixels of a CT image are rearranged into the Hilbert space-filling curve order.
(b) The CT images are processed by the differential scheme.
(c) The CT images are rearranged into the Hilbert space-filling curve order and then processed by the differential scheme.
(d) The CT images are processed by the differential scheme and then rearranged into the Hilbert space-filling curve order.

All the compression methods are tested on the 40 CT images shown in Figure 2. The average size and average execution time over the 40 images are reported for each compression method.

Table 1: Average image size (bytes) of each compression method and pre-processing operation

        None      (1)       (2)       (3)      (4)       (5)       (6)      (7)
None   263,222   111,888   103,709   97,312   118,354   108,223   93,170   169,533
(a)    263,222   112,266   101,230   98,242   118,354   108,473   92,298   210,914
(b)    263,222   110,868    83,915   93,360   108,189    98,869   77,827   180,519
(c)    263,222   113,614    86,869   92,125   110,398   102,156   80,248   220,065
(d)    263,222   111,300    81,535   94,988   108,189    99,145   76,931   203,540
Table 1 shows the average image size for each of the seven encoding schemes (1) to (7) and each of the four pre-processing operations (a) to (d). The first data column, marked None, means that no encoding scheme is applied. The first data row, marked None, means that no pre-processing operation is applied. All the image sizes include the file header. Note that all the values in the first column equal the original image size, because pre-processing operations do not reduce the image size. Table 2 shows the average bit-per-pixel of the same compression methods and pre-processing operations. Let s be an image size in bytes; the bit-per-pixel value is calculated as 8s/(512 × 512), since each image contains 512 × 512 pixels. The experiments show that case (6d), which applies the differential scheme followed by Hilbert space-filling curve ordering as pre-processing operations and then compresses by LZW coding followed by Huffman coding, has the best compression result, a size reduction of 70.77%. The experiments reveal that Hilbert space-filling curve ordering alone may not help to reduce the compressed image size. However, with the differential pre-processing, the source data are transformed into another form with better locality. In this case, Hilbert space-filling curve ordering does play its role in enhancing the compression effect. Finally, we would like to point out that JPEG-LS does better without any pre-processing operation. However, its compression ratio is not as good as those of the other encoding schemes.

We also report the execution times for image compression and uncompression of the various encoding schemes and pre-processing operations in Table 3. In Table 3, two execution times are given for each entry.


Figure 2: CT medical images

Table 2: Bit-per-pixel of each compression method and pre-processing operation

        None   (1)    (2)    (3)    (4)    (5)    (6)    (7)
None    8.03   3.41   3.16   2.97   3.61   3.30   2.84   5.17
(a)     8.03   3.43   3.09   3.00   3.61   3.31   2.82   6.44
(b)     8.03   3.38   2.56   2.85   3.30   3.01   2.38   5.51
(c)     8.03   3.47   2.65   2.81   3.37   3.12   2.45   6.72
(d)     8.03   3.40   2.49   2.90   3.30   3.03   2.35   6.21

Table 3: Execution time for CT image compression and uncompression (seconds; compression time / uncompression time)

        None            (a)             (b)             (c)             (d)
(1)  0.328 / 0.048   0.482 / 0.237   0.341 / 0.056   0.497 / 0.248   0.497 / 0.248
(2)  0.082 / 0.051   0.238 / 0.242   0.094 / 0.058   0.250 / 0.251   0.250 / 0.251
(3)  0.043 / 0.042   0.198 / 0.234   0.053 / 0.049   0.210 / 0.241   0.210 / 0.241
(4)  0.105 / 0.160   0.263 / 0.353   0.115 / 0.168   0.271 / 0.361   0.272 / 0.361
(5)  0.408 / 0.196   0.565 / 0.387   0.419 / 0.202   0.576 / 0.396   0.576 / 0.396
(6)  0.155 / 0.181   0.315 / 0.377   0.167 / 0.191   0.325 / 0.385   0.325 / 0.385

The first value is the execution time for image compression and the second is the execution time for image uncompression. It is worth noting that it takes about 0.156 seconds to rearrange pixels into the Hilbert space-filling curve order and about 0.191 seconds to restore pixels from the Hilbert space-filling curve order back to the row-major order. The differential and reverse differential operations take only about 0.010 second. The combination of Hilbert space-filling curve ordering and differential encoding takes about the sum of the two individual execution times. The execution time overhead for Hilbert space-filling curve ordering ranges from 38.24% to 362.79%. The least overhead is for the compression combining the LZW and Huffman coding schemes. The most overhead is for compression with the run-length encoding scheme. If we consider the best compression scheme, differential followed by Hilbert space-filling curve ordering as pre-processing and LZW followed by Huffman coding as compression, the overhead is about 41.17%.

5 Conclusions and Future Works

This paper studies the effectiveness of Hilbert space-filling curves in lossless medical image compression. We rearrange the pixels of a CT image according to the Hilbert space-filling curve order before applying each of the four encoding schemes: run-length encoding, LZ77 coding, LZW coding, and Huffman coding. Combinations of these encoding schemes are also tested. From the experiments, LZW coding followed by Huffman coding yields the best compression result. Also, by measuring the entropy, we can verify that lower entropy leads to a better compression ratio. However, the pre-processing operation of Hilbert space-filling curve ordering alone does not make a major improvement. The LZW coding scheme is very sensitive to the size of the table used in the algorithm; the CT images are not compressed better simply by enlarging the table size. Changing the word size in the LZW scheme to match the size of frequently repeated strings usually improves the compression result.

The differential pre-processing operation is applied to the CT images. Since a CT image has most pixels close to the very white and very dark gray levels, differential pre-processing converts pixels into differences of closer values. It yields more data elements with similar values and improves the compression ratio. Following the differential pre-processing operation, Hilbert space-filling curve ordering enhances the locality of the differences and makes the encoding schemes much more effective. We also test CT image compression with the differential operation applied multiple times; it does not improve the compression further. Non-medical images, such as colored human portraits and scenery images, are also tested with the same pre-processing operations and encoding schemes. For these images, Hilbert space-filling curve ordering gives much better results than for CT images. However, in reality, these images often do not require lossless compression.

In future work, we will study the application of Hilbert space-filling curve ordering to lossy compression methods. The Hilbert space-filling curves presented in this paper rearrange source data of 2^n × 2^n points. Many images do not satisfy this limitation. One simple compensation is to extend an image with padding pixels. A real solution is to develop space-filling curves of arbitrary size. We will work on the design of space-filling curves for rectangular spaces with sizes other than powers of two.

References

[1] D. J. Abel and D. M. Mark. A comparative analysis of some 2-dimensional orderings. International Journal of Geographical Information Systems, 4(1):21–31, 1990.
[2] A. R. Butz. Space filling curves and mathematical programming. Information and Control, 12(4):314–330, 1968.
[3] A. R. Butz. Convergence with Hilbert's space filling curve. Journal of Computer and System Sciences, 3(2):128–146, 1969.
[4] C.-S. Chen, S.-Y. Lin, and C.-H. Huang. Algebraic formulation and program generation of three-dimensional Hilbert space-filling curves. In The 2004 International Conference on Imaging Science, Systems, and Technology, 2004. To appear.
[5] V. Gaede and O. Günther. Multidimensional access methods. ACM Computing Surveys, 30(2):170–231, 1998.
[6] S. W. Golomb. Run-length encodings. IEEE Transactions on Information Theory, IT-12:140–149, 1966.
[7] D. Hilbert. Über die stetige Abbildung einer Linie auf ein Flächenstück. Mathematische Annalen, 38:459–460, 1891.
[8] D. A. Huffman. A method for the construction of minimum-redundancy codes. Proceedings of the IRE, 40(9):1098–1101, 1952.
[9] H. V. Jagadish. Linear clustering of objects with multiple attributes. In Proceedings of the 1990 ACM SIGMOD International Conference on Management of Data, pages 332–342. ACM Press, 1990.
[10] S. Kamata, Y. Bandoh, and N. Nishi. Color image compression using a Hilbert scan. In Proceedings of the Fourteenth International Conference on Pattern Recognition, volume 3, pages 1575–1578, 1998.
[11] S. Kamata, R. O. Eason, and Y. Bandou. A new algorithm for N-dimensional Hilbert scanning. IEEE Transactions on Image Processing, 8(7):964–973, 1999.
[12] S. Kamata, M. Niimi, R. O. Eason, and E. Kawaguchi. An implementation of an N-dimensional Hilbert scanning algorithm. In Proceedings of the 9th Scandinavian Conference on Image Analysis, pages 431–440, 1995.
[13] S. Kamata, M. Niimi, and E. Kawaguchi. A gray image compression using a Hilbert scan. In Proceedings of the Thirteenth International Conference on Pattern Recognition, volume 2, pages 905–909, 1996.
[14] S. Khuri and H.-C. Hsu. Interactive packages for learning image compression algorithms. In Proceedings of the 5th Annual Conference on Innovation and Technology in Computer Science Education, pages 73–76, 2000.
[15] J. Ziv and A. Lempel. A universal algorithm for sequential data compression. IEEE Transactions on Information Theory, 23(3):337–343, 1977.
[16] J. Ziv and A. Lempel. Compression of individual sequences via variable-rate coding. IEEE Transactions on Information Theory, 24(5):530–536, 1978.
[17] A. Lempel and J. Ziv. Compression of two-dimensional data. IEEE Transactions on Information Theory, 32(1):2–8, 1986.
[18] S.-Y. Lin, C.-S. Chen, L. Liu, and C.-H. Huang. Tensor product formulation for Hilbert space-filling curves. In Proceedings of the 2003 International Conference on Parallel Processing, pages 99–106, 2003.
[19] B. Moon, H. V. Jagadish, C. Faloutsos, and J. H. Saltz. Analysis of the clustering properties of the Hilbert space-filling curve. IEEE Transactions on Knowledge and Data Engineering, 13(1):124–141, 2001.
[20] W. B. Pennebaker and J. L. Mitchell. JPEG Still Image Data Compression Standard. Van Nostrand Reinhold, 1993.
[21] J. Quinqueton and M. Berthod. A locally adaptive Peano scanning algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 3(4):403–412, 1981.
[22] H. Sagan. Space-Filling Curves. Springer-Verlag, 1994.
[23] K. Sayood. Introduction to Data Compression. Morgan Kaufmann, 2nd edition, 2000.
[24] C. E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27:379–423, 623–656, 1948.
