A Compact Gaussian Representation of Fuzzy Information Granules

Giovanna Castellano, Anna M. Fanelli, Member, IEEE and Corrado Mencar
Abstract: In this paper we propose a new method to represent information granules by Gaussian functional forms. First, the fuzzy granules are extracted from data by a fuzzy clustering algorithm. Then, they are properly represented by Gaussian functions determined by solving a constrained quadratic programming problem on the membership values returned by the clustering algorithm. Simulation results show that compact and robust fuzzy granules are attained, with the appreciable feature of being represented in a short functional form.

Index Terms: Fuzzy information granulation, Gaussian membership functions, Constrained quadratic programming, Fuzzy clustering.

I. INTRODUCTION
Fuzzy information granulation is the process of discovering
pieces of information, called information granules,
expressed in terms of fuzzy theory [1], [2], [3]. The
attained granules can be successively used in Fuzzy
Information Systems (FIS) to perform inferences on the
working environment.
Fuzzy clustering is a general unsupervised method to
induce fuzzy granules (i.e. clusters) that represent groups of
observations that are close in the sense of some predefined
metric. Many fuzzy clustering algorithms return a prototype
vector and a partition matrix that contains the membership
values of each observation to each cluster [4]. Such a partition matrix has large memory requirements, since its space complexity is linear in both the number of observations and the number of clusters. Moreover, the partition matrix does not
convey any direct information about fuzzy memberships of
new data. For these reasons, when a FIS is built on the derived
clusters, only prototype information is usually used to define
the fuzzy granules, while the partition matrix is partially or
totally ignored.
When Gaussian functions are adopted to represent fuzzy
granules, one problem is choosing the widths of membership


G. Castellano is with the Computer Science Department, University of Bari, Via Orabona 4, 70125 Bari, Italy (phone: +39-080-5442456; fax: +39-080-5443156; e-mail: castellano@di.uniba.it).
A. M. Fanelli is with the Computer Science Department, University of Bari, Via Orabona 4, 70125 Bari, Italy (e-mail: fanelli@di.uniba.it).
C. Mencar is with the Computer Science Department, University of Bari,
Via Orabona 4, 70125 Bari, Italy (e-mail: mencar@di.uniba.it).

functions. Indeed, while the centers of the Gaussian functions
can coincide with the prototypes calculated by the clustering
algorithm, there is no analytical way to define their widths if
the partition matrix is ignored.
In literature, some heuristic techniques have been proposed
to define the Gaussian widths [5]; however, most of them
require the introduction of some user-defined parameters and
do not exploit the information about the fuzzy clusters
discovered by the clustering algorithm. Often, the widths are
chosen by trial-and-error, so that the Gaussian functions are neither too flat (too much overlap) nor too peaked (failing to cover the whole input space). The consequence is a large waste of time, compounded by the useful information provided by the clustering algorithm that remains unexploited.
In this work, we propose a new method to define fuzzy
information granules represented in terms of Gaussian
membership functions, with the main feature that widths are
calculated by exploiting the information conveyed by the
partition matrix of the clustering algorithm. The key
advantage of the proposed approach is its ability to automatically find good Gaussian representations of fuzzy granules in terms of mean squared error. The approach
does not require trial-and-error procedures or strong
constraints, such as imposing the same width for all the
granules (i.e. isotropic Gaussian functions).
The proposed method can be applied with any fuzzy
clustering algorithm that returns a prototype vector and the
corresponding partition matrix. In this work, we use Fuzzy C-
Means [4] as basic clustering algorithm from which the
Gaussian representation is derived.
The paper is organized as follows. Section II defines the
Gaussian representation as a constrained quadratic
programming problem. In Section III, a real-world experiment is carried out to validate the approach, and in
Section IV some final conclusions are drawn.
II. GAUSSIAN REPRESENTATION OF FUZZY GRANULES
A fuzzy clustering algorithm can be described as a function
that accepts a training set of observations and returns a set of
prototypes together with a partition matrix. The number of
clusters may be predefined or determined by the algorithm.
Hence, a generic fuzzy clustering algorithm may be
formalized as:


$fc : \mathbb{X}^m \to \mathbb{X}^c \times [0,1]^{m \times c}, \qquad \mathbb{X} \subseteq \mathbb{R}^n, \; m > c \geq 1$   (1)
such that:
$fc(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_m) = (P, U)$   (2)
where:
$P = [\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_c]$   (3)
is the matrix of all prototypes (one for each column), and:
$U = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_c] = [u_{ij}]_{i=1,\ldots,m;\; j=1,\ldots,c}$   (4)
is the partition matrix, which contains the membership value of each observation in each cluster.
The objective is to find, for each discovered cluster, a Gaussian representation corresponding to the following functional form:
$u_{[\boldsymbol{\mu}, C]}(\mathbf{x}) := \exp\!\left( -(\mathbf{x} - \boldsymbol{\mu})^T C \, (\mathbf{x} - \boldsymbol{\mu}) \right)$   (5)
where $\boldsymbol{\mu}$ is the center and C is the inverse of the width matrix. Matrix C should be symmetric positive definite (s.p.d.) in order for the graph of the function to have the classical bell shape centered on $\boldsymbol{\mu}$. In many cases, a simpler diagonal positive width matrix is required. Indeed, if C is a diagonal matrix, that is:
$C := \operatorname{diag}(\mathbf{c}) = \operatorname{diag}(c_1, c_2, \ldots, c_n), \qquad c_i > 0$   (6)
then the fuzzy granule can be represented as a product of independent scalar exponential functions:
$u_{[\boldsymbol{\mu}, C]}(\mathbf{x}) = \prod_{i=1}^{n} u_{[\mu_i, c_i]}(x_i) = \prod_{i=1}^{n} \exp\!\left( -c_i (x_i - \mu_i)^2 \right)$   (7)
The problem can be decomposed into c independent sub-
problems that find the best representation for each cluster
discovered by the clustering algorithm. Hence, in the
following we concentrate on a single cluster and we will omit
the cluster index j when unnecessary.
Generally, there is no exact solution to the problem, i.e. there is no pair $(\boldsymbol{\mu}, C)$ such that:
$\forall i: \; u_{[\boldsymbol{\mu}, C]}(\mathbf{x}_i) = u_i \qquad \text{and} \qquad \forall \mathbf{x} \neq \mathbf{0}: \; \mathbf{x}^T C \, \mathbf{x} > 0$   (8)
In order to choose the best Gaussian representation,
some error function has to be defined. Because of the
nonlinearity of the equations in (8), it is not possible to apply
general linear systems theory. On the other hand, the equation
system in (8) is equivalent to the following:
$\forall i: \; (\mathbf{x}_i - \boldsymbol{\mu})^T C \, (\mathbf{x}_i - \boldsymbol{\mu}) = -\log u_i, \qquad C \text{ s.p.d.}$   (9)
The system (9) can be rewritten as:
$\forall i: \; \tilde{\mathbf{x}}_i^T C \, \tilde{\mathbf{x}}_i = -\log u_i, \qquad C \text{ s.p.d.}$   (10)
where the center $\boldsymbol{\mu}$ of the Gaussian membership function is set equal to the cluster prototype:
$\boldsymbol{\mu} = \mathbf{p}_j$   (11)
and the following change of variables is done:

$\tilde{\mathbf{x}}_i = \mathbf{x}_i - \boldsymbol{\mu}$   (12)
By imposing C to be positive diagonal, the system can be
further simplified as:
$\forall i: \; \sum_{k=1}^{n} \tilde{x}_{ik}^{2} \, c_k = -\log u_i, \qquad c_k > 0$   (13)
where $\tilde{\mathbf{x}}_i = [\tilde{x}_{ik}]_{k=1,2,\ldots,n}$.
The equations in (13) form a constrained linear system; generally, it has no exact solution, so a constrained least-squares minimization problem can be formulated as follows:
minimize:  $f(\mathbf{c}) = \dfrac{1}{m} \sum_{i=1}^{m} \left( \sum_{k=1}^{n} \tilde{x}_{ik}^{2} \, c_k + \log u_i \right)^{2}$
subject to:  $\mathbf{c} > \mathbf{0}$   (14)
If the following matrix is defined:
$H = \left[ \tilde{x}_{ik}^{2} \right]_{i=1,\ldots,m;\; k=1,\ldots,n}$   (15)
then, excluding the constant terms (and the immaterial 1/m factor, which does not affect the minimizer), the problem (14) can be reformulated as:
minimize:  $f(\mathbf{c}) = \tfrac{1}{2} \, \mathbf{c}^T G \, \mathbf{c} + \mathbf{g}^T \mathbf{c}$
subject to:  $\mathbf{c} > \mathbf{0}$   (16)
where:
$G = 2 H^T H$   (17)
and:
$\mathbf{g} = 2 H^T \log \mathbf{u}$   (18)
The problem can be solved with classical constrained
quadratic programming techniques. Usually, quadratic
programming algorithms only accept constraints in the form:
$A \mathbf{c} \leq \mathbf{b}$   (19)
In this case, it is useful to express the constraint of problem (16) in the form:

$\mathbf{c} \geq \mathbf{c}_{\min}$   (20)

(i.e. $A = -I$ and $\mathbf{b} = -\mathbf{c}_{\min}$ in (19)), where the vector $\mathbf{c}_{\min}$ defines the maximum admissible amplitudes and is provided manually. If $\mathbf{c}_{\min} = \mathbf{0}$, then all possible amplitudes are admissible, even infinite ones.
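To make the construction concrete, the following Python sketch fits the diagonal widths of a single granule. It is an illustration of ours (the experiments in this paper used MATLAB's Optimization toolbox): the names fit_gaussian_widths and granule_membership, and the use of SciPy's lsq_linear, are assumptions rather than part of the original method.

import numpy as np
from scipy.optimize import lsq_linear

def fit_gaussian_widths(X, u, p, c_min=0.0, eps=1e-12):
    # X: (m, n) observations; u: (m,) membership values of one granule
    # (a column of the partition matrix); p: (n,) cluster prototype.
    # Solves min ||H c + log u||^2 subject to c >= c_min, cf. (13)-(16) and (20).
    X_tilde = X - p                                   # change of variables, eq. (12)
    H = X_tilde ** 2                                  # squared deviations, eq. (15)
    b = -np.log(np.clip(u, eps, 1.0))                 # right-hand side; clipping avoids log(0)
    res = lsq_linear(H, b, bounds=(c_min, np.inf))    # bound-constrained least squares
    return res.x                                      # diagonal of C: inverse squared widths

def granule_membership(x, p, c):
    # Gaussian membership of a granule with prototype p and diagonal widths c, eqs. (5)/(7).
    return np.exp(-np.sum(c * (x - p) ** 2))

Because the only constraints are simple bounds on c, the quadratic program (16) is equivalent (up to a constant factor) to this bound-constrained least-squares problem, so either formulation can be solved.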
The objective function f can be usefully rewritten as:


$f(\mathbf{c}) = \dfrac{1}{m} \sum_{i=1}^{m} \left( \log \dfrac{\exp\!\left( -\tilde{\mathbf{x}}_i^T \operatorname{diag}(\mathbf{c}) \, \tilde{\mathbf{x}}_i \right)}{u_i} \right)^{2}$   (21)
which is the mean squared log-ratio between the Gaussian membership approximation and the actual membership value assigned by the clustering algorithm. The squared log-ratio is a nonnegative function of the ratio, with global minimum at 1, where its value is 0. By expanding the squared log-ratio in a Taylor series centered at 1, it can be observed that, in a sufficiently small neighborhood of 1, the function can be approximated by:
$\left( \log x \right)^{2} = (x-1)^{2} + O\!\left( (x-1)^{3} \right)$   (22)
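For completeness, (22) follows from the standard expansion of the logarithm around 1, $\log x = (x-1) - \tfrac{1}{2}(x-1)^2 + O\!\left((x-1)^3\right)$, whose square gives

$\left( \log x \right)^{2} = (x-1)^{2} - (x-1)^{3} + O\!\left( (x-1)^{4} \right),$

which agrees with (22) up to the stated order.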
In such a neighborhood, the following approximation can be made:
$\left( \log \dfrac{u_{[\boldsymbol{\mu}, C]}(\mathbf{x}_i)}{u_i} \right)^{2} \approx \left( \dfrac{u_{[\boldsymbol{\mu}, C]}(\mathbf{x}_i)}{u_i} - 1 \right)^{2}$   (23)
This implies that:
$\left( u_{[\boldsymbol{\mu}, C]}(\mathbf{x}_i) - u_i \right)^{2} \approx u_i^{2} \left( \log \dfrac{u_{[\boldsymbol{\mu}, C]}(\mathbf{x}_i)}{u_i} \right)^{2}$   (24)
As a consequence, since $u_i \leq 1$, if the objective function assumes small values, the resulting Gaussian membership function approximates the partition matrix with a small mean squared error. This property validates the proposed approach.
The space complexity of the proposed representation is
O(nc), while the memory required for storing the partition
matrix is O((m+n)c). In this sense, the proposed approach
leads to a compact representation of fuzzy granules.
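For instance, with the dataset used in Section III (m = 123,593 observations, n = 2 coordinates, c = 3 clusters), the partition matrix alone contains m · c = 370,779 membership values, whereas the proposed representation stores only 2nc = 12 parameters (a center coordinate and a width per dimension for each granule).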
III. SIMULATION RESULTS
In this Section we use a computer experiment to illustrate
the proposed approach. As an information granulation
problem, we have chosen the North East dataset (Fig. 1), containing 123,593 postal addresses (represented as points), which represent three metropolitan areas (New York, Philadelphia and Boston) [6]. The dataset can be grouped into three clusters, with a large amount of noise in the form of uniformly distributed rural areas and smaller population centers.
We have used FCM to generate three fuzzy clusters from the dataset. Successively, the prototype vector and the partition matrix returned by FCM were used by the proposed method to obtain a Gaussian representation of the three clusters. For FCM and quadratic programming, the MATLAB R11.1 Fuzzy toolbox and Optimization toolbox have been used, respectively.

Figure 1: The North East dataset.
Figure 2: Fuzzy cluster for Philadelphia city and its Gaussian representation.
Figure 3: Fuzzy cluster for Boston city and its Gaussian representation.
Figure 4: Fuzzy cluster for New York city and its Gaussian representation.

Centers and widths of the derived Gaussian functions are reported in Table I. Figures 2, 3 and 4 depict, for each cluster, both the membership values in the partition matrix (as grey levels) and the radial contours of the corresponding Gaussian function.
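For readers who wish to reproduce the pipeline outside MATLAB, the following minimal Python sketch (our own illustration, not the code used for the experiments) implements a basic FCM; its output can then be passed to the hypothetical fit_gaussian_widths helper sketched in Section II.

import numpy as np

def fcm(X, c, fuzzifier=2.0, max_iter=100, tol=1e-5, seed=0):
    # Minimal Fuzzy C-Means; returns prototypes P (c, n) and partition matrix U (m, c).
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, c))
    U /= U.sum(axis=1, keepdims=True)                 # each row sums to one
    for _ in range(max_iter):
        W = U ** fuzzifier
        P = (W.T @ X) / W.sum(axis=0)[:, None]        # fuzzy-weighted prototypes
        d2 = ((X[:, None, :] - P[None, :, :]) ** 2).sum(axis=2) + 1e-12
        U_new = d2 ** (-1.0 / (fuzzifier - 1.0))      # standard FCM membership update
        U_new /= U_new.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return P, U

# Example pipeline (X: (m, 2) array of postal-address coordinates):
#   P, U = fcm(X, c=3)
#   granules = [(P[j], fit_gaussian_widths(X, U[:, j], P[j])) for j in range(3)]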
As can be seen in the figures, the Gaussian granules obtained by the proposed approach properly model some qualitative concepts about the available data. Specifically, regarding each cluster as one of the three metropolitan areas (Boston, New York, Philadelphia), the membership values of postal addresses can be interpreted as degrees of closeness to one city (the cluster prototype). Such a concept is not easily captured with the clusters discovered by FCM alone, since, as the figures illustrate, the membership values of the addresses do not always decrease as the distance from the cluster prototype increases.
Table I also reports the Mean Squared Error (MSE) between Gaussian granules and fuzzy clusters, defined as:
$E_j = \dfrac{1}{m} \sum_{i=1}^{m} \left( u_{[\boldsymbol{\mu}_j, C_j]}(\mathbf{x}_i) - u_{ij} \right)^{2}$   (25)
The low MSE value for each granule demonstrates how well the resulting Gaussian membership functions approximate the partition matrix of FCM.
TABLE I
PARAMETERS OF GAUSSIAN INFORMATION GRANULES

Measure       Boston              New York            Philadelphia
Center        (0.6027, 0.6782)    (0.3858, 0.4870)    (0.1729, 0.2604)
Amplitudes    (0.0906, 0.1027)    (0.0580, 0.0606)    (0.1013, 0.1151)
MSE           0.0360              0.0203              0.0347
In order to quantitatively evaluate the derived Gaussian information granules, the Xie-Beni index has been used as a compactness and separation validity measure [7]. This measure is defined as:
$S = \dfrac{\sum_{j=1}^{c} \sum_{i=1}^{m} \mu_{ij}^{2} \, \left\| \mathbf{p}_j - \mathbf{x}_i \right\|^{2}}{m \cdot \min_{i \neq j} \left\| \mathbf{p}_i - \mathbf{p}_j \right\|^{2}}$   (26)
where:
$\mu_{ij} = \begin{cases} u_{ij} & \text{for FCM clusters} \\ u_{[\boldsymbol{\mu}_j, C_j]}(\mathbf{x}_i) & \text{for Gaussian granules} \end{cases}$   (27)
In other words, the Xie-Beni index for the FCM clusters has
been directly computed on the partition matrix returned by the
clustering algorithm. Conversely, for Gaussian granules the
measure has been computed by re-calculating the membership
values of each observation of the dataset with the derived
Gaussian membership functions.
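As a sketch of how such values can be computed (again our own illustration with assumed function names, not the original code), the index in (26) can be evaluated from either membership matrix as follows.

import numpy as np

def xie_beni(X, P, M):
    # Xie-Beni validity index, eq. (26).
    # X: (m, n) observations; P: (c, n) prototypes;
    # M: (m, c) membership matrix (FCM's U, or memberships re-computed
    # from the Gaussian granules as in eq. (27)).
    m = X.shape[0]
    d2 = ((X[:, None, :] - P[None, :, :]) ** 2).sum(axis=2)   # (m, c) squared distances
    num = (M ** 2 * d2).sum()                                 # compactness term
    pd2 = ((P[:, None, :] - P[None, :, :]) ** 2).sum(axis=2)  # (c, c) prototype distances
    min_sep = pd2[~np.eye(len(P), dtype=bool)].min()          # separation term
    return num / (m * min_sep)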
Table II summarizes a comparison between fuzzy granules
extracted by FCM alone, and those obtained by the proposed
approach, in terms of Xie-Beni index, number of floating point operations (FLOPS) and time/memory requirements on an Intel Pentium III 500 MHz with 128 MB of RAM.
As can be seen, the Xie-Beni index values for the Gaussian granules and the FCM clusters are comparable. The slight difference is due to the nature of the proposed method, which generates convex (Gaussian) approximations of the partition matrix; the partition matrix is generally not convex, i.e. it assumes high values even for points very distant from the prototype (see Figs. 2-4).
TABLE II
PERFORMANCE MEASUREMENTS

Quantity           FCM                        Gaussian representation
Xie-Beni index     0.1656                     0.2687
FLOPS              792M                       14.1M
Time required      138.1 s (81 iterations)    14.7 s
Memory required    2,966,280 B                144 B
The time required for representing granules with Gaussian
functional forms is negligible compared to the time required
for FCM, hence the total computational cost of the proposed method (FCM + Gaussian representation) is comparable with that of FCM alone. More importantly, the method provides a compact representation of the granules. Indeed, each Gaussian granule is fully described by just a prototype vector and a diagonal width matrix.
represented by Gaussian functions, the partition matrix can be
discarded, thus saving a large amount of memory.
IV. CONCLUSIONS
In this paper, we have proposed a method to derive a
Gaussian representation of information granules by solving a
constrained quadratic programming problem. Unlike heuristic
techniques, no hyper-parameter has to be specified, and the granule representation fully exploits the information returned by the fuzzy clustering algorithm used to extract granules from data. The derived granules have good features in terms of approximation accuracy and compactness of representation. Moreover, they are very robust against noise, as the real-world experiment showed, and they can be usefully integrated in most fuzzy inference systems to perform fuzzy reasoning about the working environment.
REFERENCES
[1] L.A. Zadeh, Fuzzy sets and information granularity. In M.M. Gupta,
R.K. Ragade and R.R. Yager, eds., Advances in Fuzzy Sets Theory and
Applications, North Holland, Amsterdam, 1979, pp. 3-18.
[2] L.A. Zadeh, Towards a theory of fuzzy information granulation and its
centrality in human reasoning and fuzzy logic. Fuzzy Sets and Systems,
Vol. 90(1997), pp. 111-127.
[3] W. Pedrycz, Granular computing: an introduction. In Proc. of IFSA-
NAFIPS 2001, Vancouver, Canada, 2001, pp. 1349-1354.


[4] J.C. Bezdek, Pattern recognition with fuzzy objective function
algorithms. Plenum, New York, 1981.
[5] S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice-
Hall, NJ, 1999.
[6] G. Kollios, D. Gunopulos, N. Koudas, S. Berchtold, Efficient Biased
Sampling for Approximate Clustering and Outlier Detection in Large
Datasets. IEEE Transactions on Knowledge and Data Engineering, to
appear, 2002. Available: http://dias.cti.gr/~ytheod/research/datasets/spatial.html
[7] X.L. Xie, G. Beni, A Validity Measure for Fuzzy Clustering, IEEE
Transactions on Pattern Analysis and Machine Intelligence, 1991, 13(4),
pp. 841-846.
