(2.3)

where $x_j$ is the $j$th component of the input vector, and $y_{ij}$ is the $j$th component of the codeword $y_i$. Associated with each codeword $y_i$ is a nearest-neighbor region called the Voronoi region. The set of Voronoi regions partitions the entire space $R^k$. The Voronoi region is defined by:

$$V_i = \{x \in R^k : \|x - y_i\| \le \|x - y_j\|, \ \text{for all } j \ne i\} \qquad (2.4)$$
Fig-2.5(a): A VQ encoder. The input vector $x$ is compared against the codebook entries $x_1, x_2, \ldots, x_N$, and the index $i$ achieving $\min[d(x, x_i)]$, $i = 1, 2, \ldots, N$, is output.

Fig-2.5(b): A VQ decoder. The received index $i$ addresses a look-up table (the same codebook $x_1, x_2, \ldots, x_N$) to produce the output vector $x_i$.
Fig-2.6: Codewords in 2-dimensional space. Input vectors are marked with an x, codewords are marked with red circles, and the Voronoi regions are separated with boundary lines.

Figure 2.6 is a two-dimensional Voronoi diagram. Here codewords are present in 2-dimensional space. A vector quantizer with minimum encoding distortion is called a Voronoi quantizer or nearest-neighbor quantizer. A good vector quantization system should have low computational complexity and a high compression ratio. It has received great interest in many applications because it can provide a high compression ratio and a simple decoding process. Vector quantization is a well-known lossy compression method which can be applied to both images and signals, including biosignal applications. The general vector quantization deals with vectors, whereas scalar quantization is its special case (dealing with vectors with one element). There are many vector quantization variations which typically perform better, but usually they have larger computational complexity.
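To make the encoder of Fig-2.5(a) and the decoder of Fig-2.5(b) concrete, the following is a minimal NumPy sketch of nearest-neighbor VQ; the function and variable names are illustrative rather than taken from the thesis.

```python
import numpy as np

def vq_encode(vectors, codebook):
    """Return, for each input vector, the index of the nearest codeword,
    i.e. the codeword whose Voronoi region contains the vector."""
    # Squared Euclidean distance between every input vector and every codeword.
    d2 = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def vq_decode(indices, codebook):
    """Table look-up: replace each transmitted index by its codeword."""
    return codebook[indices]

# Example: 100 four-dimensional vectors quantized with an 8-word codebook.
rng = np.random.default_rng(0)
x = rng.random((100, 4))
codebook = rng.random((8, 4))
x_hat = vq_decode(vq_encode(x, codebook), codebook)
```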
CHAPTER 3
LEARNING VECTOR QUANTIZATION ALGORITHM
3.1 LEARNING VECTOR QUANTIZATION (LVQ) ALGORITHMS FOR IMAGE
COMPRESSION
In 1989 Teuvo Kohonen initiated the study of the prototype generation algorithm called learning vector quantization (LVQ). LVQ is the name used for vector quantization algorithms implemented by training a competitive neural network with a gradient descent algorithm. Gradient descent based minimization allows the development of LVQ algorithms capable of minimizing a broad variety of objective functions that cannot be treated by conventional optimization methods frequently used for developing clustering and vector quantization algorithms [5]. LVQ is a competitive network where the output is known. The algorithm is used to maximize the number of correctly classified inputs. In an LVQ network, target values are available for the input training patterns and the learning is supervised. The LVQ scheme is a simple one and it gives a class of adaptive techniques for constructing vector quantizers. An advantage of the LVQ algorithm is that it creates prototypes that are easy for experts in the field to interpret. It can be applied to pattern recognition, multi-class classification and data compression tasks, e.g. speech recognition, image compression, image processing or custom classification. In many cases, LVQ algorithms can achieve better results than other neural network classifiers in spite of their simple and time-efficient training process. Before describing the LVQ algorithm, neural networks are briefly discussed.
3.2 NEURAL NETWORK (NN)
A neural network (NN) can be defined as a massively parallel distributed processor that
has a natural propensity for storing experiential knowledge and making it available for use [7].
NNs derive their intelligence from the collective behavior of simple computational mechanisms at individual neurons. NNs are useful for recognizing patterns, classifying inputs, and adapting to dynamic environments by learning. However, the internal mapping structure of an
NN is often treated like a black box. Typically, these networks are composed of
collections of processing elements. The nodes in neural networks are called processing elements,
and the directed links (information channels) are called interconnects. An NN topology is
specified by its interconnection scheme, the number of layers and the number of nodes per layer.
Figure 3.1 gives the structure of a three-layered neural network, represented by a set of nodes and arrows. In this figure three types of nodes are present (input/hidden/output). The input
nodes receive the signals and the output nodes encode the concepts (or values) to be assigned.
The nodes in the hidden layers are not directly observable. Hidden layer nodes provide the
required degree of non-linearity for the network. According to the learning process, neural
networks are divided into two kinds:
a) Supervised and
b) Unsupervised.
The difference between them lies in how the networks are trained to recognize and categorize
the objects.
Fig-3.1: The structure of a representative three-layered neural network. Input nodes are marked as $x_1$ and $x_2$. Output nodes are marked as $y_1$, $y_2$ and $y_3$. $w^1_{ij}$ and $w^2_{ij}$ are the weights associated with the links between the input and the hidden layer nodes and the links between the hidden and the output layer nodes, respectively.
3.2.1 SUPERVISED LEARNING
Learning with supervision, or with a teacher, is known as supervised learning. In
this learning method, the network is given input samples from a training data set, along with the
current classification of each sample, and the network produces an output, signifying its best
guess for the classification of each input object. The network compares its output with the
correct, or target output which was specified by the user along with the input data. The network
then adjusts its internal components (connection weights) to make its output agree more closely
with the target output. In this way the network learns the correct classification of its training data
set [8]. We can say that, in this type of learning, training inputs are provided with the desired
outputs. For example, in a classification problem, the learner approximates a function mapping
a vector into classes by looking at input-output examples of the function. The output of the
function can be a continuous value (called regression), or can predict a class label of the input
object (called classification). The task of the supervised learner is to predict the value of the
function for any valid input object after having seen a number of training examples (i.e. pairs of
input and target output). To achieve this, the learner has to generalize from the presented data to
unseen situations in a "reasonable" way [9]. Depending on the nature of the teacher's information, there are two approaches to supervised learning. One is based on the correctness of the decision and the other on the optimization of a training cost criterion. Supervised learning can generate models of two types. Most commonly, supervised learning generates a global model that maps input objects to desired outputs. In some cases, however, the map is implemented as a set of local models (such as in case-based reasoning or the nearest neighbor algorithm). This learning has been used in a wide range of applications including signal and image processing, speech recognition, system identification, automatic diagnosis, prediction of stock prices, signature authentication, detection of events in high-energy physics, etc. [10]. Popular supervised learning algorithms include the Perceptron learning algorithm, the Least Mean Square (LMS) algorithm, and the Back propagation algorithm.
3.2.2 UNSUPERVISED LEARNING
Learning without supervision is known as unsupervised learning. Using no supervision from any teacher, unsupervised networks adapt the weights and verify the results only on the input patterns. Unsupervised networks are also called self-organizing networks. In the unsupervised method, samples are input into the network and the network must determine the correlations between the objects and produce an output in the correct class for each input object. In essence, the unsupervised algorithm must have some internal means of differentiating objects in order to classify them [8]. In this learning, the system parameters are adapted using only the information of the input and are constrained by pre-specified internal rules. These neural networks cluster, code or categorize input data. Similar inputs are classified as
being in the same category, and should activate the same output unit, which corresponds to a
prototype of the category [10]. Unsupervised classification procedures are often based on some
kind of clustering strategy, which forms groups of similar patterns. The clustering technique is
very useful for pattern classification problems. Furthermore, it plays an important role in many
competitive learning networks. Unsupervised learning is very important in neural networks
because it leads to effective dimensionality reduction in the input data. This is achieved by
discovering a smaller number of features to work with than were present in the raw data, based
on statistical regularities in this data [11]. This is important since it is likely to be much more
common in the brain than supervised learning. This kind of learning, using the LVQ technique, has been successfully employed in several application fields. Two very simple classic examples of unsupervised learning are clustering and dimensionality reduction. These types of NNs have been widely used in clustering tasks, feature extraction, data dimensionality reduction, data mining (data organization for exploring and search), information extraction, density approximation, etc. [10]. Unsupervised learning includes competitive learning and the self-organizing map, which are discussed next.
3.3 TYPES OF UNSUPERVISED LEARNING
Now we discuss competitive learning and the self-organizing map in this sub-section.
3.3.1 COMPETITIVE LEARNING (CL)
Competitive learning (CL) networks [12] are unsupervised neural networks where only
the active neurons are allowed to update their weight vectors. A basic competitive learning
model consists of feed forward and lateral networks with fixed output nodes (i.e. fixed number of
clusters). Here the output neurons of a neural network compete among themselves to become
active. As a result only one output neuron is activated at any given time [11]. This is achieved by means of a so-called winner-take-all (WTA) operation [13]. The network, in its simplest form, works in accordance with the WTA strategy. The input and output nodes are assumed to have binary
values (1 or 0). An input pattern x is a sample point in the n-dimensional real or binary vector
space. That is, there are as many output neurons as the number of classes and each output node
represents a pattern category. CL is useful for classification of input patterns into a discrete set of
output classes. The simplest CL network is shown in figure-3.2, in which all inputs are connected to a single layer of output neurons.
Fig-3.2: Competitive learning network with inputs $x_1, x_2, x_3$ and outputs $y_1, y_2, y_3$. The solid lines indicate excitatory connections whereas the dashed lines indicate inhibitory connections.
The winner-take-all operation is implemented by connecting the outputs to the other neurons. As a result of competition, the winner of each iteration, element $i^*$, is the element whose total weighted input is the largest:

$$W_{i^*} \cdot X \ge W_i \cdot X, \quad \text{for all } i \ne i^*, \qquad (3.1)$$

i.e., the unit with the largest activation becomes the winner. In the case of normalized inputs, the unit with $i^*$ produces the smallest activation in terms of

$$\|W_{i^*} - X\| \le \|W_i - X\|, \quad \text{for all } i \ne i^*, \qquad (3.2)$$

that is, the unit with normalized weight closest to the input vector becomes the winner. In fact, the winning neuron can be found by a simple search for the maximum or minimum activation [10]. This neuron updates its weight while the weights of the other neurons remain unchanged. A simple competitive weight updating rule is the following:

$$\Delta w_{ij} = \begin{cases} \eta\,(x_j - w_{ij}) & \text{if } i = i^* \\ 0 & \text{if } i \ne i^* \end{cases} \qquad (3.3)$$

where $\eta$ is a constant, $x_j$ is the $j$th component of the $n$-dimensional input vector and $\Delta w_{ij}$ is the change in the weight vectors. These types of learning algorithms are frequently based on a minimum loss function.
Although Kohonen's competitive learning (KCL) network is originally not a clustering method, it can be used as a prototype generation algorithm called learning vector quantization (LVQ) [12]. In recent years, competitive learning algorithms have been widely used for vector quantization methods. Vector quantization is based on the competitive learning paradigm, so it is closely related to the self-organizing map model, described next.
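As an illustration, one winner-take-all step in the sense of Eqs. (3.2) and (3.3) might be sketched as follows; NumPy is assumed, and the names and the value of $\eta$ are illustrative.

```python
import numpy as np

def cl_step(x, W, eta=0.05):
    """One competitive learning step: find the winner by minimum distance
    (Eq. 3.2) and move only its weight vector toward x (Eq. 3.3)."""
    i_star = np.argmin(((W - x) ** 2).sum(axis=1))  # winning neuron
    W[i_star] += eta * (x - W[i_star])              # losers stay unchanged
    return i_star
```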
3.3.2 SELF-ORGANIZING MAP (SOM)
SOM [8] or self-organizing feature map (SOFM) is a special type of competitive
learning network, where the neurons have a spatial arrangement, i.e. the neurons are typically
organized in a line or a plane [10]. SOFM has formed a basis for a great deal of research into
applying network models to the problem of codebook design in vector quantization. Professor
Teuvo Kohonen introduced SOM as the concept of classes ordered in a topological map. One
of the most interesting aspects of SOMs is that they learn to classify data without supervision.
This unsupervised Artificial Neural Network (ANN) is mathematically characterized by transforming high-dimensional data into a two-dimensional representation, enabling automatic clustering of the input while preserving higher order topology [8]. It is trained using unsupervised learning to produce a low-dimensional (typically two-dimensional), discretized representation of the input space of the training samples, called a map. During training, the weight vectors associated with the winner neuron and its neighbors are updated [10]. The SOM architecture is given in figure 3.4. It is sometimes called a Kohonen map. Kohonen's method consists of one layer of neurons and uses the method of competitive learning with a winner-take-all approach. Its architecture consists of input or training data vectors $X$ which are mapped onto a two-dimensional lattice, and each node on the lattice has an associated reference or node weight vector $W$.
Fig-3.4: A Self-Organizing Map, showing a very small Kohonen network of 3 x 3 nodes connected to the input layer ($x_1$, $x_2$, $x_3$) representing a two dimensional vector.
The lines connecting the nodes in Figure 3.4 are only there to represent adjacency and do
not signify a connection as normally indicated when discussing a neural network. There are no
lateral connections between nodes within the lattice. To determine the best matching unit
(BMU), one method is to iterate through all the nodes and calculate the Euclidean distance
between each node's weight vector and the current input vector. The node with a weight vector
closest to the input vector is tagged as the BMU.
The Euclidean distance is given as:

$$D = \sum_{i=0}^{n} (x_i - W_i)^2 \qquad (3.4)$$
The weights of the BMU and neurons close to it in the SOM lattice are adjusted towards the input vector. The magnitude of the change decreases with time and with distance from the BMU. The reference vectors are then updated using the following rule:

$$W(t+1) = W(t) + \Theta(t)\,\alpha(t)\,(X(t) - W(t)) \qquad (3.5)$$

where $t$ represents the time-step and $\alpha$ is a small variable called the learning rate, $0 < \alpha(t) < 1$, which decreases with time and modulates the weight update. $\Theta(t)$ is used to represent the amount of influence a node's distance from the BMU has on its learning. Thus $\Theta(t)$ is given as:

$$\Theta(t) = \exp\left(-\frac{dist^2}{2\sigma^2(t)}\right), \quad t = 1, 2, 3, \ldots \qquad (3.6)$$

where $dist$ represents the distance of a node from the BMU and $\sigma$ is the width of the neighborhood function, which can be calculated as:
$$\sigma(t) = \sigma_0 \exp\left(-\frac{t}{\lambda}\right), \quad t = 1, 2, 3, \ldots \qquad (3.7)$$

where $\sigma_0$ denotes the width of the lattice at time $t_0$, $\lambda$ denotes a time constant, and $t$ is the current time-step (the same as the current iteration of the loop). The decay of the learning rate is calculated at each iteration using the following equation:

$$\alpha(t) = \alpha_0 \exp\left(-\frac{t}{\lambda}\right), \quad t = 1, 2, 3, \ldots \qquad (3.8)$$

where $\alpha_0$ is the learning rate at time $t_0$.
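A compact sketch of one SOM training step combining Eqs. (3.4)-(3.8) is given below; here `grid` is assumed to hold the two-dimensional lattice coordinates of each node, and all names and parameter values are illustrative.

```python
import numpy as np

def som_step(x, W, grid, t, lam, sigma0, alpha0):
    """One SOM update: BMU search (Eq. 3.4), decayed width and learning
    rate (Eqs. 3.7-3.8), neighborhood (Eq. 3.6), weight move (Eq. 3.5)."""
    bmu = np.argmin(((W - x) ** 2).sum(axis=1))       # Eq. (3.4)
    sigma = sigma0 * np.exp(-t / lam)                 # Eq. (3.7)
    alpha = alpha0 * np.exp(-t / lam)                 # Eq. (3.8)
    dist2 = ((grid - grid[bmu]) ** 2).sum(axis=1)     # lattice distance to BMU
    theta = np.exp(-dist2 / (2.0 * sigma ** 2))       # Eq. (3.6)
    W += alpha * theta[:, None] * (x - W)             # Eq. (3.5)
    return bmu
```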
SOMs are different from other artificial neural networks in the sense that they use a
neighborhood function to preserve the topological properties of the input space. An SOM enables
us to have an idea about the statistical distribution of the input vectors on the output layer. It is a
competitive process which can also be called vector quantization. Most SOM applications use
numerical data and are popularly employed in several applications e.g. automatic speech
recognition, clinical voice analysis, monitoring of the condition of industrial plants and
processes, cloud classification from satellite images, analysis of electrical signals from the brain,
organization of and retrieval from large document collections, environmental modeling, analysis
and visualization of large collections of statistical data, etc. The SOFM learning method may be viewed as the first of two stages of a classification algorithm, with LVQ performing the second and final stage. Improved classification performance can be obtained by using this algorithm in combination with a supervised learning technique, such as LVQ, described next.
3.4 LEARNING VECTOR QUANTIZATION (LVQ)
LVQ is an online algorithm whose observations are processed one at a time. It is based
on the SOM or Kohonen feature map. It should be noted that LVQ is not a clustering algorithm.
This is a "nearest neighbor" neural net in which each node is designated a class via its desired output.
This learning technique uses the class information to reposition the Voronoi vectors slightly, so
as to improve the quality of the classifier decision regions. Here classes are predefined and the
model vectors are labeled by symbols corresponding to the predefined classes. This
algorithm can be used when we have labeled input data. The basic LVQ algorithm is actually
quite simple. This algorithm improves the separation of classes in the solution suggested by the unsupervised training, for example by the SOM algorithm. For a given input, the method consists in bringing the most activated neuron closer if it is in the right class (supervised training), or pushing it further away in the opposite case. The other neurons (i.e. losers) remain unchanged. Each
neuron thus becomes class-representative. LVQ belongs to the hard vector quantization group. In the hard approach, each input is associated only with the group with the nearest center. The goal is to determine a set of prototypes that best represent each class. It is applicable to p-dimensional unlabeled data [13]. In LVQ, cluster substructure hidden in unlabeled p-dimensional data can be discovered. The architecture of the LVQ is also similar to that of the Kohonen feature map, without a topological structure assumed for the output units. The network has three layers: an input layer, a Kohonen classification layer, and a competitive output layer. The basic architecture of the LVQ neural network is shown in Fig. 3.5, where the LVQ network contains an input layer, a Kohonen layer which learns and performs the classification, and an output layer. The input layer contains one node for each input feature; the Kohonen layer contains equal numbers of nodes for each class; in the output layer, each output node represents a particular class [6], or we can say the output unit has a known class, since it uses supervised learning. Hence, it differs from Kohonen's SOM, which uses unsupervised learning. In terms of neural networks, an LVQ is a feed forward net with one hidden layer of neurons, fully connected with the input layer. A codebook vector (CV) can be seen as a hidden neuron (Kohonen neuron) or a weight vector of the weights between all input neurons and the regarded Kohonen neuron, respectively.
Fig-3.5: LVQ architecture: one hidden layer with Kohonen neurons, adjustable weights (CVs) between the input layer ($x_1, x_2, x_3$) and the hidden layer, and a winner-takes-all output mechanism (one $y = 1$, the others $= 0$).
As we know, LVQ uses the same internal architecture as SOM: a set of n-dimensional input vectors is mapped onto a two-dimensional lattice, and each node on the lattice has an associated n-dimensional reference vector. The learning algorithm in LVQ, i.e., the method of updating the reference vectors, is different from that in SOM. Because LVQ is a supervised
method, during the learning phase, the input data are tagged with their correct class and each
output neuron represents a known category [8]. We define the input vector x as
$$x = (x_1, x_2, x_3, \ldots, x_n)$$

and the reference vector for the $i$th output neuron $w_i$ as

$$w_i = (w_{1i}, w_{2i}, w_{3i}, \ldots, w_{ni}).$$

We define the Euclidean distance between the input vector and the reference vector of the $i$th neuron as:

$$D(i) = \sum_{j=1}^{n} (x_j - w_{ji})^2 \qquad (3.9)$$

The input vectors are compared to the reference vectors and the closest match, for which $D(i)$ is a minimum, is found. The winning reference vector $w_{i^*}$ is then obtained by the formula

$$\|w_{i^*} - x\| \le \|w_i - x\|. \qquad (3.10)$$
The reference vectors are then updated using the following rules:
$$w_{i^*}(\text{new}) = w_{i^*}(\text{old}) + \alpha(t)\,(x - w_{i^*}(\text{old})) \quad \text{if } x \text{ is in the same class as } w_{i^*},$$

$$w_{i^*}(\text{new}) = w_{i^*}(\text{old}) - \alpha(t)\,(x - w_{i^*}(\text{old})) \quad \text{if } x \text{ is in a different class from } w_{i^*},$$

$$w_i(\text{new}) = w_i(\text{old}) \quad \text{if } i \text{ is not the index of the winning reference vector.}$$

The learning rate $0 < \alpha(t) < 1$ should generally be made to decrease monotonically with time and can be defined as:

$$\alpha(t) = \alpha_0\,(1 - t/T) \qquad (3.11)$$
where $\alpha_0$ is the learning rate at time $t_0$ and $T$ is the total number of learning iterations. The LVQ training algorithm aims at producing highly discriminative reference vectors through learning [6]. There are several versions of the LVQ algorithm for which different learning rules are used. LVQ algorithms are a family of training algorithms for nearest-neighbor classifiers, which include OLVQ1 (the optimized version of LVQ1), LVQ2 and its improved versions LVQ2.1, LVQ3, etc. [8]. All these algorithms are intended to be applied as extensions to the previously discussed (O)LVQ1 (Kohonen recommends an initial use of OLVQ1 and continuation with LVQ1, LVQ2.1 or LVQ3 with a low initial learning rate) [19]. OLVQ1 is the same as LVQ1, except that each codebook vector has its own learning rate. The popular LVQ2, LVQ2.1 and LVQ3 algorithms are briefly discussed next.
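Before turning to those variants, the basic LVQ1 loop implied by Eqs. (3.9)-(3.11) can be sketched as follows; this is a simplified one-vector-per-step scheme with illustrative names and parameters.

```python
import numpy as np

def lvq1_train(X, labels, W, W_labels, alpha0=0.1, T=1000):
    """LVQ1 sketch: the nearest reference vector wins (Eq. 3.10); it is
    attracted or repelled depending on class agreement, with the
    learning rate decayed as in Eq. (3.11)."""
    n = len(X)
    for t in range(T):
        alpha = alpha0 * (1.0 - t / T)                  # Eq. (3.11)
        x, c = X[t % n], labels[t % n]
        i = np.argmin(((W - x) ** 2).sum(axis=1))       # winner, Eq. (3.10)
        if W_labels[i] == c:
            W[i] += alpha * (x - W[i])                  # same class: attract
        else:
            W[i] -= alpha * (x - W[i])                  # wrong class: repel
    return W
```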
3.4.1 LVQ2 ALGORITHM
An improved LVQ algorithm, known as the LVQ2 algorithm, is sometimes preferred
because it comes closer in effect to Bayesian decision theory. The same weight or vector update
equations are used as in the standard LVQ, but they only get applied under certain conditions,
namely when:
1. The input vector x is incorrectly classified by the associated Voronoi vector.
2. The next nearest Voronoi vector does give the correct classification, and
3. The input vector x is sufficiently close to the decision boundary (perpendicular bisector plane)
between the Voronoi vector and the nearest Voronoi vector. In this case, both the Voronoi and the nearest Voronoi vectors are updated (using the incorrect and correct classification update equations respectively). In the LVQ2 algorithm, adaptation only occurs in regions with cases of misclassification in order to get finer and better class boundaries.
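A sketch of one LVQ2 step enforcing these three conditions is shown below; the window test follows Kohonen's commonly used relative-distance window, and the parameter values are illustrative.

```python
import numpy as np

def lvq2_step(x, c, W, W_labels, alpha=0.05, w=0.3):
    """Update the two nearest reference vectors only when the winner
    misclassifies x, the runner-up classifies it correctly, and x lies
    inside a window of relative width w around the decision boundary."""
    d = np.sqrt(((W - x) ** 2).sum(axis=1))
    i, j = np.argsort(d)[:2]                    # nearest and next nearest
    in_window = d[i] / d[j] > (1 - w) / (1 + w)
    if W_labels[i] != c and W_labels[j] == c and in_window:
        W[i] -= alpha * (x - W[i])              # push the wrong winner away
        W[j] += alpha * (x - W[j])              # pull the correct one closer
```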
3.4.2 LVQ2.1 and LVQ3 Algorithms
The LVQ2.1 algorithm is an improved version of the LVQ2 algorithm which aims at eliminating LVQ2's detrimental effects. It is to be used only after LVQ1 has been applied. LVQ2.1 allows
adaptation for correctly classifying codebook vectors [19]. The LVQ2.1 algorithm is based
on the idea of shifting the decision boundaries toward the Bayes limits with attractive and
repulsive forces. Here two BMUs are selected and only updated if one belongs to the desired
class and one does not, and the distance ratio is within a defined window.
LVQ3 leads to even more weight-adjusting operations due to less restrictive adaptation rules [19]. LVQ3 has been proposed to ensure that the reference vectors continue approximating the class distributions, but it must be noted that if only one reference vector is assigned to each class, LVQ3 operates the same as LVQ2.1. If both BMUs are of the correct class, they are still updated, but adjusted using an epsilon value (an adjusted learning rate used instead of the global learning rate).
Basically the developer of an LVQ algorithm has to prepare a learning schedule: a plan specifying which LVQ algorithm(s) (LVQ1, OLVQ1, LVQ2.1, etc.) should be used, with which values for the main parameters, at different training phases. Also, the number of codebook vectors for each class must be decided in order to reach high classification accuracy and generalization, while avoiding under- or over-fitting. In LVQ, it is difficult to determine a good number of codebook vectors for a given problem. Here accuracy is highly dependent on the initialization of the model as well as the learning parameters used (learning rate, training iterations, etc.). In the domain of neural networks, LVQ and its extensions are among the best known algorithms for classification. However, these often end at local minima of the distortion surface because they only accept new solutions which maximally reduce the distortion, resulting in suboptimal networks whose performance is inferior to globally optimal networks [20]. LVQ is an alternative to the Generalized Lloyd Algorithm (GLA), better known as the Linde-Buzo-Gray (LBG) [14] algorithm. LVQ algorithms are commonly grouped with GLA in the discussion of VQ techniques. GLA is simple and has relatively good fidelity, so it is a widely used VQ method. This algorithm starts with a good initial codebook, and a global codebook can also be generated. The strategy of LVQ is the same as that of GLA, except that the codevector update function differs. GLA is discussed next.
3.5 GENERALIZED LLOYD ALGORITHM
A well-known codebook design method is the Generalized Lloyd algorithm (GLA) or Linde, Buzo and Gray (LBG) algorithm [14]. In 1980, Linde, Buzo and Gray proposed a VQ design algorithm based on a training sequence, which is known as the LBG algorithm or GLA. This
method operates more in the input domain, clustering the input vectors and moving the centroid
to develop a new and better representation for the next iteration of the codebook. It is similar to
the k-means algorithm. GLA is an iterative gradient descent algorithm that tries to minimize an
average squared error distortion measure. This algorithm plays an important role in the design of
vector quantizer and in nearest neighbor feature clustering for pattern recognition. GLA begins
with a set of input vectors and an initial codebook. For each input vector, a codeword from the
codebook is chosen that yields the minimum distortion. If the sum of distortions from all input
vectors does not improve beyond some threshold, the algorithm stops. Otherwise the codebook is
modified as follows: Each codeword is replaced by the centroid of all input vectors that have
previously chosen it as their output vector. This completes one iteration. Then the GLA
iteratively keeps refining a codebook. In this algorithm, in each iteration the average distortion is
reduced; this corresponds to a local change in the codebook, i.e., the new codebook is not
drastically different from the old codebook. Given an initial codebook, the algorithm leads to the
nearest local codebook, which may not be optimal. As codebook design is a complex
optimization problem, it has many local minima. GLA performance is sensitive to the
initialization of the codebook. The task of codeword search is to find the best-match codeword
from the given codebook for the input vector. This means the nearest codeword
$y_j = (y_{j1}, y_{j2}, \ldots, y_{jk})$ in the codebook $c$ is found for each input vector $x = (x_1, x_2, \ldots, x_k)$ such that the distortion between this codeword and the input vector is the smallest among all codewords. The most common distortion measure between $x$ and $y_j$ is the Euclidean distance:

$$D(j) = \sum_{i=0}^{k} (x_i - y_{ji})^2 \qquad (3.12)$$
Now we describe the GLA steps. It consists of two phases, as shown in figure 3.6.
a) Initialization of the codebook, and
b) Optimization of the codebook.
Fig-3.6: The GLA Procedure.
A. Codebook Initialization
The codebook initialization process is very important. In the initialization phase,
two methods are mainly used: in a random manner and by splitting.
Random initialization. The initial code words are randomly chosen [12]. Generally,
they are chosen inside the convex hull of the input data set [18].
Initialization by splitting. The original GLA algorithm uses a splitting technique to
initialize the codebook. This technique basically doubles the size of the codebook in
every iteration. This procedure starts with one code vector $c_1(0)$ that is set to the average of all training vectors.

Step 1: In a general iteration there will be $N$ code vectors in the codebook, $c_i(0)$, $i = 1, 2, \ldots, N$. Split each code vector into two code vectors $c_i(0)$ and $c_i(0) + r$, where $r$ is a fixed perturbation vector. Set $N \leftarrow 2N$.

Step 2: If there are enough code vectors, stop the splitting process. The current set of $N$ code vectors can now serve as the initial set $c_i(0)$ for the codebook optimization phase. If more code vectors are needed, execute the optimization algorithm on the current set of $N$ entries, to converge them to a better set; then go to Step 1.
B. Codebook Optimization

Step 1: Select a threshold value $\epsilon$, set $k = 0$ and $D(-1) = +\infty$. Start with an initial codebook with code vectors $c_i(k)$ (where $k$ is currently zero, but will be incremented in each iteration). Training vectors are denoted as $T$.

Step 2: For each code vector $c_i(k)$, find the set of all training vectors $T$ that satisfy

$$d(T, c_i) < d(T, c_j), \quad j \ne i. \qquad (3.13)$$

This set or cell (also called a Voronoi region) is denoted as $P_i(k)$. Repeat Step 2 for all values of $i$.

Step 3: Calculate the distortion $D_i(k)$ between each code vector $c_i(k)$ and the set of training vectors $P_i(k)$ found for it in Step 2. Repeat for all $i$, then calculate the average $D(k)$ of all the $D_i(k)$. A distortion $D_i(k)$ for a given $i$ is calculated by computing the distances $d(c_i(k), T_m)$ for all training vectors $T_m$ in the set $P_i(k)$ and then calculating the average distance.
Step 4: If

$$\frac{D(k-1) - D(k)}{D(k-1)} \le \epsilon, \qquad (3.14)$$

stop. Otherwise, continue.

Step 5: Set $k \leftarrow k + 1$, and find new code vectors $c_i(k)$ that are the averages of the training vectors in the cells $P_i(k-1)$ that were computed in Step 2. Go to Step 2. Since the code vectors (including unused ones) are doubled in each splitting step, such doubling might result in a final codebook with many unused codevectors.
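The optimization phase above can be sketched in batch form as follows; empty cells are simply left unchanged here, and all names are illustrative.

```python
import numpy as np

def gla(X, codebook, eps=1e-4, max_iter=100):
    """GLA sketch: Voronoi partition (Step 2), average distortion
    (Step 3), stop test (Step 4, Eq. 3.14), centroid update (Step 5)."""
    prev = None
    for _ in range(max_iter):
        d2 = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = d2.argmin(axis=1)                       # Step 2
        D = d2[np.arange(len(X)), nearest].mean()         # Step 3
        if prev is not None and (prev - D) / prev <= eps: # Step 4
            break
        prev = D
        for i in range(len(codebook)):                    # Step 5
            cell = X[nearest == i]
            if len(cell):
                codebook[i] = cell.mean(axis=0)
    return codebook
```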
3.5.1 GLA Design Algorithm

Begin

1. Set a threshold $\epsilon$ to be a "small" number. Let $X = \{x_1, x_2, x_3, \ldots, x_M\}$, $x_i \in R^L$, $i = 1, 2, \ldots, M$.

2. Select an initial codebook $y = \{y_1, y_2, \ldots, y_N\}$, $y_j \in R^L$, $j = 1, 2, \ldots, N$.

3. Calculate
$$D^{(0)} = \frac{1}{M} \sum_{i=1}^{M} \min_{y_j \in y} d^2(x_i, y_j).$$
Set $\nu = 0$.

4. Set $\nu \leftarrow \nu + 1$, $i = 0$.

5. Set $i = i + 1$. Evaluate
$$\mu_j(x_i) = \begin{cases} 1, & \text{if } d(x_i, y_j) = d_{\min}(x_i) \\ 0, & \text{otherwise,} \end{cases} \quad j = 1, 2, \ldots, N.$$

6. If $i < M$, then go to step 5.
7. Calculate
$$y_j = \frac{\sum_{i=1}^{M} \mu_j(x_i)\, x_i}{\sum_{i=1}^{M} \mu_j(x_i)}, \quad j = 1, 2, \ldots, N.$$

8. Calculate
$$D^{(\nu)} = \frac{1}{M} \sum_{i=1}^{M} \min_{y_j \in Y} d^2(x_i, y_j).$$
9. If
$$\frac{D^{(\nu-1)} - D^{(\nu)}}{D^{(\nu-1)}} > \epsilon,$$
then go to step 4.

End.

CHAPTER 4

FUZZY LEARNING VECTOR QUANTIZATION ALGORITHMS

$$u_j(x_i) = \left[\sum_{l=1}^{N}\left(\frac{d(x_i, y_j)}{d(x_i, y_l)}\right)^{\frac{1}{m-1}}\right]^{-1} \qquad (4.2)$$
where $m$ is a parameter that controls the fuzziness of the membership and is also called the fuzzifier; $\lambda$ is a positive integer and $m$ can be expressed as $m = 1 + \frac{1}{\lambda}$. The codebook vectors are then given by

$$y_j = \frac{\sum_{i=1}^{M} (u_j(x_i))^m\, x_i}{\sum_{i=1}^{M} (u_j(x_i))^m}, \quad j = 1, 2, \ldots, N. \qquad (4.3)$$
At each iteration of FKM some or all of the codevectors will be found to change. The movement of a particular codevector is determined only by its member training vectors. The FKM design algorithm is summarized later. The codebook design process is terminated if the fractional decrease of distortion $\Delta^{(\nu)}$ is below a threshold $\epsilon$; $\epsilon$ is a very small value, normally $10^{-3}$ to $10^{-4}$. $\Delta^{(\nu)}$ is defined as:

$$\Delta^{(\nu)} = \frac{D^{(\nu-1)} - D^{(\nu)}}{D^{(\nu-1)}} \qquad (4.4)$$

where $\nu$ is the index of iterations commencing with $\nu = 1$. The derivation of the fuzzy k-means algorithms was based on the constrained minimization of the objective function [33]:

$$J_m = \sum_{j=1}^{N}\sum_{i=1}^{M} (u_j(x_i))^m\, \|x_i - y_j\|^2 \qquad (4.5)$$

where $1 < m < \infty$. The parameters in this equation, the cluster centroid vectors $y_j$ and the components of the membership vectors $u_j(x_i)$, can be optimized by Lagrange's method.
4.2.1 FKM Design Algorithm

Begin

1. Select a threshold $\epsilon$.

2. Select an initial codebook $y = \{y_1, y_2, \ldots, y_N\}$.

3. Evaluate $D^{(0)}$ according to Eq. (4.1). Set $\nu = 0$.

4. Set $\nu \leftarrow \nu + 1$, $i = 0$.

5. Set $i = i + 1$. Evaluate $u_j(x_i)$ using Eq. (4.2), $j = 1, 2, \ldots, N$.

6. If $i < M$, then go to step 5.

7. Calculate $y_j$ using Eq. (4.3), $j = 1, 2, \ldots, N$.

8. Calculate $D^{(\nu)}$ according to Eq. (4.1).

9. If
$$\frac{D^{(\nu-1)} - D^{(\nu)}}{D^{(\nu-1)}} > \epsilon,$$
then go to step 4.

End.
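A compact sketch of this loop, using the standard fuzzy c-means membership (on squared Euclidean distances) for Eq. (4.2) and the fuzzy-weighted centroid for Eq. (4.3), is given below; names and the small constant added for numerical safety are illustrative.

```python
import numpy as np

def fkm(X, Y, m=1.5, eps=1e-4, max_iter=100):
    """FKM sketch: memberships (Eq. 4.2), fuzzy centroids (Eq. 4.3),
    and the fractional-distortion stop test of Eq. (4.4)."""
    prev = None
    for _ in range(max_iter):
        d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2) + 1e-12
        u = 1.0 / d2 ** (1.0 / (m - 1.0))         # unnormalized memberships
        u /= u.sum(axis=1, keepdims=True)         # each row now sums to one
        um = u ** m
        Y = (um.T @ X) / um.sum(axis=0)[:, None]  # Eq. (4.3)
        D = (um * d2).sum() / len(X)              # objective in the Eq. (4.5) sense
        if prev is not None and (prev - D) / prev <= eps:
            break                                 # Eq. (4.4) below threshold
        prev = D
    return Y, u
```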
In recent years, this method has been employed in a wide range of applications, including fuzzy control and machine vision. FKM generally produces better results in codebook design than GLA and also reduces the dependence of the resulting codebook on the selection of the initial codebook. This benefit is generally obtained at the expense of increased computation time, caused by the need to calculate the fuzzy membership and also because more iterations are required due to the slow convergence. To overcome this deficiency, Karayiannis proposed fuzzy vector quantization (FVQ) [24] algorithms as fast alternatives to the FKM algorithm. The FVQ algorithm is popularly employed in speech and speaker recognition and is used to train codebooks in the vector quantization approach [27]. This FVQ algorithm, which is a fast alternative to the FKM algorithm, is described next.
4.3 FUZZY VECTOR QUANTIZATION (FVQ) ALGORITHM
An FVQ model is a set of cluster centers determined using fuzzy c-means (FCM)
clustering to cluster the training dataset. So, it can be said that FVQ usually uses the FCM clustering algorithm to obtain clusters or codebooks. It was proposed by Karayiannis and Pai [9], and Tsekouras [26]. FVQ algorithms are based on a flexible strategy that allows gradual transition of the membership function from soft to hard decisions [26]. FVQ algorithms achieve similar qualities of codebook design to FKM but with much less computational effort. The FVQ algorithm allows each training vector to be assigned to multiple codewords in the early stages of the codebook design. Although the FVQ algorithm reduces the dependence of the resulting codebook on the initial codebook, the codewords are calculated in batch mode. The iterative fuzzy vector quantization approach is based on a gradient descent approach, and the concept of fuzzy logic is introduced into it. In FVQ, the source image is approximated coarsely by fixed
basis blocks, and the codebook is self-trained from the coarsely approximated image, rather than
from an outside training set or the source image itself. Therefore, FVQ is capable of eliminating
the redundancy in the codebook without any side information, in addition to exploiting the self-
similarity in real images effectively. FVQ is a clustering algorithm based on soft decisions that leads to crisp decisions at the end of the codebook design process [33]. The FVQ makes a soft decision about which codeword is closest to the input vector, generating an output vector whose components indicate the relative closeness (membership) of each codeword to the input vector. In the training process, the output vector not only provides the membership description but also provides a detailed codeword distribution in the feature space, which guides the FVQ codebook
updating. The FVQ keeps the fuzziness parameter constant throughout the training process. In
the initialization step of this algorithm, each training vector is assigned to a codebook vector
which is concentrated at a cluster center.
$$u_j(x_i) = \left(1 - \frac{\|x_i - y_j\|^2}{d_{\max}(x_i)}\right)^{\lambda} \qquad (4.6)$$

where $\lambda$ is a positive integer that controls the fuzzification of the clustering process. Each
training vector is assigned to one cluster. Similar to FKM, the FVQ algorithm does not classify fuzzy data [FVQ2]. The advantages of FVQ versus FKM are the elimination of the effect of the initial codebook selection on the quality of clustering and the avoidance of a priori assumptions about the level of fuzziness needed for a clustering task [FVQ2]. FVQ algorithms are
categorized as FVQ1, FVQ2, and FVQ3 which are described below.
4.3.1. Fuzzy Vector Quantization I (FVQ1) [24]
The development of this algorithm is attempted by constructing a family of membership
functions. According to these conditions, the membership function $u_j(x_i)$ approaches unity as $d(x_i, y_j)$ approaches zero and decreases monotonically to zero as the distance $d(x_i, y_j)$ increases from zero to

$$d_{\max}(x_i) = \max_{y_j} \{d(x_i, y_j)\}. \qquad (4.7)$$
Since $\frac{d(x_i, y_j)}{d_{\max}(x_i)}$ is an increasing function of the distance $d(x_i, y_j)$, the membership function $u_j(x_i)$ can be of the form

$$u_j(x_i) = f\big(d(x_i, y_j), d_{\max}(x_i)\big) = \left(1 - \frac{d(x_i, y_j)}{d_{\max}(x_i)}\right)^{\lambda} \qquad (4.8)$$

where $\lambda$ is a positive integer. This family of membership functions has been experimentally
evaluated, mainly because of its simplicity and low computational requirements. Nevertheless,
there may be other functions satisfying the conditions mentioned above that could be used as
well [24]. The vector assignment is based on crisp decisions towards the end of the vector
quantizer design. This can be guaranteed by the minimization with respect to $y_j$ of the discrepancy measure $J_1 = J_1(y_j,\ j = 1, 2, \ldots, k)$, defined as

$$J_1 = \sum_{j=1}^{k}\sum_{i=1}^{M} u_j(x_i)\, \|x_i - y_j\|^2, \qquad (4.9)$$
which results in the formula

$$y_j = \frac{\sum_{i=1}^{M} u_j(x_i)\, x_i}{\sum_{i=1}^{M} u_j(x_i)}, \quad j = 1, 2, \ldots, k. \qquad (4.10)$$
The direct implication of this selection is that the proposed algorithm reduces to the crisp k-
means algorithm after all the training vectors have been transferred from the fuzzy to the crisp
mode [24].
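A short sketch of the FVQ1 quantities of Eqs. (4.7), (4.8) and (4.10) follows; the membership form matches Eq. (4.8) as reconstructed above, and all names are illustrative.

```python
import numpy as np

def fvq1_memberships(X, Y, lam=2):
    """Memberships of Eq. (4.8): 1 when x_i coincides with y_j, falling
    monotonically to 0 as d(x_i, y_j) approaches d_max(x_i) of Eq. (4.7)."""
    d = np.sqrt(((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2))
    d_max = d.max(axis=1, keepdims=True)          # Eq. (4.7)
    return (1.0 - d / d_max) ** lam               # Eq. (4.8)

def fvq1_update(X, u):
    """Codebook update of Eq. (4.10) (membership-weighted means)."""
    return (u.T @ X) / u.sum(axis=0)[:, None]
```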
4.3.2. Fuzzy Vector Quantization 2 (FVQ2) [24]
This algorithm is based on the certainty measures used for training vector assignment by the family of fuzzy k-means algorithms. The codebook vectors can be evaluated in this case by

$$y_j = \frac{\sum_{i=1}^{M} (u_j(x_i))^m\, x_i}{\sum_{i=1}^{M} (u_j(x_i))^m} \qquad (4.11)$$

resulting from the minimization of $J_m = J_m(y_j,\ j = 1, 2, \ldots, k)$, which is also used in fuzzy k-means algorithms [32]. Training vector assignment is entirely based on crisp decisions towards the end of the vector quantizer design. If the assignment of the training vector is based on crisp decisions, the corresponding membership function $u_j(x_i)$ takes the values zero and one. In this case, $(u_j(x_i))^m = u_j(x_i)$ regardless of the value of $m$. Therefore, the codebook vectors are evaluated by the same formula used in the crisp k-means algorithm towards the end of the vector quantizer design [24]. The combination of the formulae used in fuzzy k-means algorithms for evaluating the membership functions and the codebook vectors with the vector assignment strategy results in the FVQ2 algorithm.
4.3.3. Fuzzy Vector Quantization 3 (FVQ3) [24]
The proposed strategy for vector assignment can lead to a broad variety of algorithms,
which employ different schemes for evaluating the membership functions and the codebook
vectors [24]. The evaluation of the membership functions is again based on the formula:
$$u_j(x_i) = \frac{1}{\sum_{l=1}^{k}\left(\dfrac{d(x_i, y_j)}{d(x_i, y_l)}\right)^{\frac{1}{m-1}}} \qquad (4.12)$$
which is associated with the fuzzy k-means algorithm. Since image compression is based on a crisp interpretation of the designed codebook, it is reasonable that the evaluation of each codebook vector be influenced more by its closest training vectors. If $m$ approaches unity asymptotically, this requirement is satisfied by evaluating the codebook vectors using the crisp formula Eq. (4.10) instead of Eq. (4.11). Consider the membership function $u_j(x_i)$ defined in Eq. (4.12). If $m$ is sufficiently close to unity and $u_j(x_i) \approx 1$, then $(u_j(x_i))^m \approx u_j(x_i)$. As the value of $u_j(x_i)$ decreases, $(u_j(x_i))^m < u_j(x_i)$. For a fixed $m$, the difference between $(u_j(x_i))^m$ and $u_j(x_i)$ increases as $u_j(x_i)$ approaches zero. In conclusion, the evaluation of the codebook vectors by Eq. (4.10) guarantees that each codebook vector is not significantly affected by the training vectors that are assigned membership values significantly smaller than unity [24]. Another advantage of this choice is that the codebook evaluation formula Eq. (4.10) is computationally less demanding than Eq. (4.11), since it does not require the computation of $(u_j(x_i))^m$. In addition, the computational burden associated with the evaluation of the membership functions can be significantly moderated by requiring that $m - 1 = \frac{1}{\lambda}$, where $\lambda$ is a positive integer. Such a choice results in a wide range of values of $m$ close to unity, given in terms of $\lambda$ as $m = 1 + \frac{1}{\lambda}$.
4.4 FUZZY LEARNING VECTOR QUANTIZATION (FLVQ)
Fuzzy learning vector quantization (FLVQ) is developed based on learning vector
quantization (LVQ) and extended by using fuzzy theory. It was already discussed in section 3.2
that LVQ is the name used for unsupervised learning algorithms associated with a competitive
neural network. Bezdek et al. [13], [23] originally proposed a batch learning scheme, known as fuzzy learning vector quantization (FLVQ). Karayiannis et al. [4], [33] presented a formal
derivation of batch FLVQ algorithms, which were originally introduced on the basis of intuitive
arguments. This derivation was based on the minimization of a function defined as the average
generalized distance between the feature vectors and the prototypes. This minimization problem
is actually a reformulation of the problem of determining fuzzy c-partitions that was solved by fuzzy c-means algorithms [23], [28]. Reformulation is the process of reducing an objective
function treated by alternating optimization to a function that involves only one set of unknowns,
namely the prototypes [36]. The function resulting from this process is referred to as the
reformulation function. FLVQ [38] has quickly gained popularity as a fairly successful batch
clustering algorithm. FLVQ employs a smaller set of user defined parameters. Both FVQ and
FLVQ are batch procedures, which indicates that in each iteration they process all the training data at once. This algorithm combines local and global information in the computation of a
relative fuzzy membership function [13], [38]. The update equations for FLVQ involve the
membership functions of the fuzzy c-means (FCM) algorithm. This membership function
provides the degree of compatibility of an input pattern with the vague concept represented by a
cluster center or we can say membership functions are used to determine the strength of
attraction between each prototype and the input vectors. FLVQ employs metrical neighbors in
the input space. In this case, the transition from fuzzy to crisp mode is accomplished by
manipulating the fuzziness parameter. FLVQ manipulates the fuzziness parameter throughout the
training process [34]. The fuzzy LVQ method presented here makes use of a fuzzy objective
function. This function is defined as the sum of the squares of the Euclidean distances between each prototype vector and each input vector. Each of these squares is weighted by the
characteristic function of the input vectors [13]. Suppose that there are a set of n training data
vectors $X = \{x_1, x_2, x_3, \ldots, x_n\} \subset R^p$. The objective function of the fuzzy c-means algorithm is

$$J_m(U, V) = \sum_{k=1}^{n}\sum_{i=1}^{c} (u_{ik})^m\, \|x_k - v_i\|^2 \qquad (4.13)$$

where $u_{ik}$ is the membership degree of the $k$th training vector in the $i$th cluster, $U = \{[u_{ik}],\ 1 \le i \le c,\ 1 \le k \le n\}$ is the partition matrix, $V = \{[v_i],\ 1 \le i \le c\}$ the cluster center matrix, and $m \in (1, \infty)$ the fuzziness parameter. The problem is to minimize $J_m(U, V)$ under the following constraint:

$$\sum_{i=1}^{c} u_{ik} = 1, \quad \forall k. \qquad (4.14)$$
To achieve this task, FLVQ follows certain steps [33], which are described in the FLVQ design algorithm presented next.
4.4.1 FLVQ Design Algorithm

Begin

1. Select the number of clusters $c$, the initial values for the cluster centers $v_1, v_2, \ldots, v_c$ and a value for the parameter $\epsilon$.

2. Set the maximum number of iterations $t_{\max}$ and select the initial $m_0$ and the final $m_f$ values for the parameter $m$.

3. For $t = 0, 1, 2, \ldots, t_{\max}$:

(a) Calculate the fuzziness parameter
$$m(t) = m_0 - \frac{t\,(m_0 - m_f)}{t_{\max}}.$$

(b) Set
$$a_{ik}(t) = \left[\sum_{j=1}^{c}\left(\frac{\|x_k - v_i\|}{\|x_k - v_j\|}\right)^{\frac{2}{m(t)-1}}\right]^{-m(t)}.$$

(c) Update the cluster centers according to the following learning rule:
$$v_i(t) = \frac{\sum_{k=1}^{n} a_{ik}(t)\, x_k}{\sum_{k=1}^{n} a_{ik}(t)}.$$

(d) If
$$E(t) = \sum_{i=1}^{c} \|v_i(t) - v_i(t-1)\| < \epsilon,$$
then stop.
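A compact sketch of steps 3(a)-3(d) is given below; it uses the fact that $a_{ik}(t)$ in step (b) equals the fuzzy c-means membership raised to the power $m(t)$, and the names and default parameter values are illustrative.

```python
import numpy as np

def flvq(X, V, m0=3.0, mf=1.1, t_max=50, eps=1e-4):
    """FLVQ sketch: decay m(t) from m0 to mf (step a), compute the
    learning rates a_ik(t) (step b), update the centers (step c), and
    stop when the total center movement E(t) falls below eps (step d)."""
    for t in range(t_max + 1):
        m = m0 - t * (m0 - mf) / t_max                     # step (a)
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2) + 1e-12
        u = 1.0 / d2 ** (1.0 / (m - 1.0))
        u /= u.sum(axis=1, keepdims=True)                  # FCM memberships
        a = u ** m                                         # step (b)
        V_new = (a.T @ X) / a.sum(axis=0)[:, None]         # step (c)
        E = np.sqrt(((V_new - V) ** 2).sum(axis=1)).sum()  # step (d)
        V = V_new
        if E < eps:
            break
    return V
```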
(5.2)

where 255 is the peak signal value, and $F_{ij}$ and $\hat{F}_{ij}$