Performance Comparison of Fuzzy Queries On Fuzzy Database and Classical Database

International Journal of Computer Application
Issue 3, Volume 1 (February 2013)

Available online on http://www.rspublication.com/ijca/ijca_index.htm
ISSN: 2250-1797
PERFORMANCE COMPARISON OF FUZZY QUERIES ON FUZZY DATABASE AND CLASSICAL DATABASE
Sujit kumar Mondal#1, Joyassree Sen#2, Md. Rabiul Islam#3, Md. Shamim Hossian#4
#1,2,4 Department of Computer Science and Engineering, Islamic University, Kushtia-7003, Bangladesh.
#3 Department of Computer Science and Engineering, Northen University, Khulna campus, Bangladesh
ABSTRACT
In this paper we design a sample classical database and a sample fuzzy database and apply
both classical and fuzzy queries to them. At present, most relational database systems
retrieve data with queries whose syntax and semantics are defined precisely. Sometimes,
however, users want to use vague (fuzzy) terms in a query. Our implementation gives the
user the flexibility to query the database using natural-language terms. We also calculate
the time cost of classical and fuzzy queries on the classical and fuzzy databases (DB) and
show that the time cost of a fuzzy query is lower on the sample fuzzy database.
Keywords: Fuzzy logic, FDBMS, Fuzzy query, Classical database
Corresponding Author: Sujit kumar Mondal
INTRODUCTION
A database-management system (DBMS) consists of a collection of interrelated data and
a set of programs to access those data. The primary goal of a DBMS is to provide an
environment that is both convenient and efficient for retrieving and storing information.
Of the operations performed on a DBMS, searching is among the most important, and it
takes a significant amount of time. As the size of a DB increases, the search time also
increases. A number of algorithms have been developed to improve the performance of
searching by query, but only for classical DBs. A traditional DBMS cannot properly
handle incomplete, imprecise and vague data such as "very high" or "about 30". To
overcome this problem, the FDBMS (Fuzzy Database Management System) has been
introduced. The primary focus of fuzzy logic is on natural language, where approximate
reasoning with imprecise propositions is rather typical.
As databases grow day by day, programmers aim to reduce the time needed to access data
from a large database [1]. A large database may hold millions of records, and finding a
particular record in it costs a significant amount of time. The search time may be reduced
by indexing the database with the B-tree algorithm [2]. In this paper we address the lack
of expressiveness and also reduce the search time by designing a fuzzy database and
applying fuzzy queries to it.
FUZZY DATABASE AND FUZZY QUERY
A database that contains incomplete, imprecise and vague data is called a fuzzy
database. A fuzzy database is founded on fuzzy logic. There are two feasible ways to
incorporate fuzziness in a DBMS [3]: one is making fuzzy queries to classical databases
and the other is adding fuzzy information to the system. In this paper, the fuzzy database
is based on the relational data model; this fuzzy relational database enhances the
relational model by modelling imprecision in the data and/or the query. A fuzzy relation
is described by fuzzy and crisp attributes. Crisp attributes hold precise data, such as a
balance of Tk. 7980, whereas a fuzzy attribute holds imprecise data. There are four
possible types of fuzzy attribute [6]: Type 1, Type 2, Type 3 and Type 4. Among these,
only Type 1, which has precise data with linguistic labels defined over them, is used to
build the fuzzy database in this paper. The schema of the relation used in this paper is:
Account_schema = (account_no, branch_name, balance), where all attributes are crisp
for the classical database. For the fuzzy database, the only fuzzy attribute is balance and
the others are crisp. In both cases, account_no is the primary key. A sample account
relation is shown in Table 1.
Table 1: Account Relation
account_no.   branch_name   balance
103           Dhaka         7890
104           Narial        4500
107           Bagura        8500
120           Bagura        8100
108           Kushtia       2500
110           Dhaka         3250
101           Narial        15000
We use three linguistic terms [5] over the balance attribute: high, moderate and low.
The membership functions [4] over the fuzzy attribute balance are illustrated in Figure 1.
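[Figure 1 plot: X axis with breakpoints 1000, 3000, 5000, 7000, 9000 and 12000; Y axis showing the degree of membership μ(x) up to 1; three fuzzy sets LOW, MODERATE and HIGH with edges labelled Slope1 to Slope4 and distances D1 to D4.]
Figure 1: Fuzzy Sets characterizing Balance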
Degree of membership of x for low:

$$D_1 = 5000 - x, \qquad \mu_{low}(x) = \begin{cases} 0, & D_1 \le 0 \\ 1, & x \le 1000 \\ Slope_1 \cdot D_1, & \text{otherwise} \end{cases} \qquad \text{(i)}$$

Degree of membership of x for moderate:

$$D_2 = x - 3000, \quad D_3 = 9000 - x, \qquad \mu_{moderate}(x) = \begin{cases} 0, & D_2 \le 0 \text{ or } D_3 \le 0 \\ \min(D_2 \cdot Slope_2,\; D_3 \cdot Slope_3,\; 1), & \text{otherwise} \end{cases} \qquad \text{(ii)}$$

Degree of membership of x for high:

$$D_4 = x - 7000, \qquad \mu_{high}(x) = \begin{cases} 0, & D_4 \le 0 \\ 1, & x \ge 12000 \\ Slope_4 \cdot D_4, & \text{otherwise} \end{cases} \qquad \text{(iii)}$$
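The three membership functions can be sketched in Python as follows. The paper does not state the slope values, so the constants here are assumptions: Slope1 = 1/4000 follows from the breakpoints 1000 and 5000, while Slope2 = Slope3 = 1/3000 and Slope4 = 1/5000 are inferred from sample values that appear later in the paper (for example .178 for a balance of 7890 under high, and .5 for 4500 under moderate). D4 is taken as x - 7000, which is what reproduces those high values. The function names are illustrative only.

```python
# Assumed slopes: not given in the paper, inferred from breakpoints and sample values.
SLOPE1 = 1 / 4000   # falling edge of LOW, between 1000 and 5000
SLOPE2 = 1 / 3000   # rising edge of MODERATE, starting at 3000
SLOPE3 = 1 / 3000   # falling edge of MODERATE, ending at 9000
SLOPE4 = 1 / 5000   # rising edge of HIGH, between 7000 and 12000


def mu_low(x):
    """Degree of membership of balance x in the fuzzy set low, formula (i)."""
    d1 = 5000 - x
    if d1 <= 0:
        return 0.0
    if x <= 1000:
        return 1.0
    return SLOPE1 * d1


def mu_moderate(x):
    """Degree of membership of balance x in the fuzzy set moderate, formula (ii)."""
    d2 = x - 3000
    d3 = 9000 - x
    if d2 <= 0 or d3 <= 0:
        return 0.0
    return min(d2 * SLOPE2, d3 * SLOPE3, 1.0)


def mu_high(x):
    """Degree of membership of balance x in the fuzzy set high, formula (iii)."""
    d4 = x - 7000   # assumption: HIGH starts rising at 7000
    if d4 <= 0:
        return 0.0
    if x >= 12000:
        return 1.0
    return SLOPE4 * d4
```

With these assumed slopes, a balance of 7890 has zero membership in low and a membership of about .178 in high, so the same record can belong to more than one fuzzy set at once.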
A query statement that retrieves data from a database and involves imprecise
information is called a fuzzy query [6]. We explore only queries over the nonkey attribute
balance, so a single query may retrieve multiple records.
BUILDING CLASSICAL AND FUZZY DATABASE
To build the classical database, we construct an index file based on the B-tree algorithm
while building the data file of the desired relation. The internal structure of the index file
of the sample database after applying the B-tree algorithm [2] is shown in Figure 2.
[Figure 2 diagram: B-tree index nodes holding account_no search keys, with leaf pointers into the seven records of the account data file listed in Table 1.]
Figure 2: The internal structure of classical database

Here, every pointer of a leaf node points to exactly one record of the account data file,
and each leaf node contains four pointers.
To build the fuzzy database, we build a separate index file for each linguistic term of the
fuzzy attribute. In this paper, therefore, three index files based on the B-tree index
structure have been built, one per linguistic term: the low index file, the moderate index
file and the high index file. The search key of each index file consists of the primary key,
account_no, together with the fuzzy value of the balance. These index files are built while
the account data file is built. To build the account fuzzy database, we follow these steps
for each record:
1. Insert the record into the account data file after testing the validity of its primary
key, and set ad to the address at which the record is stored in the account data
file.
2. Compute the fuzzy value of the balance of the current record for low, moderate and
high through formulas (i), (ii) and (iii), respectively.
3. For each linguistic term whose fuzzy value is not zero, make a search key by
packing account_no with that fuzzy value.
4. If the fuzzy value for high is not zero, then
a. insert the search key into the high index file; suppose it is inserted at the
i-th position of a leaf node of the high index file;
b. assign the address of the current record, ad, to the pointer $P_i$.
5. Repeat step 4 for the linguistic terms moderate and low.
For example, let the current record be (103, Dhaka, 7890). The record is inserted into the
account data file and its address in that file is taken. The membership degree for the
fuzzy set high is $\mu_{high}(7890) = .178$, which is greater than 0, so the search key
containing account_no 103 and fuzzy value .178 for balance 7890 is inserted into the
high index file. The membership degree for the fuzzy set moderate is
$\mu_{moderate}(7890) = .333$, which is greater than 0, so the search key containing
account_no 103 and fuzzy value .333 for balance 7890 is inserted into the moderate index
file. Similarly, for the fuzzy set low, $\mu_{low}(7890) = 0$, so no search key for balance
7890 is inserted into the low index file. The remaining records are handled in the same
way. The structure of a leaf node with 4 pointers for the low, moderate and high index
files is shown in Figures 3, 4 and 5, respectively.
[Figure 3 diagram: leaf node of the low index file with search keys (account_no, fuzzy value) (104, .75), (108, 1) and (110, .125), whose pointers reference the matching records of the account data file.]
Figure 3: The internal structure of account relation for low linguistic term
[Figure 4 diagram: leaf node of the moderate index file with search keys (account_no, fuzzy value) (103, .37), (104, .5) and (107, .167), whose pointers reference the matching records of the account data file.]
Figure 4: The internal structure of account relation for moderate linguistic term
[Figure 5 diagram: leaf node of the high index file with search keys pairing account_no with its fuzzy value, including (103, .178) and (107, .3), whose pointers reference the matching records of the account data file.]
Figure 5: The internal structure of account relation for high linguistic term
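The construction steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: plain Python dictionaries stand in for the three B-tree index files, a list stands in for the account data file, and the slopes inside the hypothetical helper `membership` are assumed values (1/4000 for low, 1/3000 for moderate, 1/5000 for high), so the stored degrees may differ from those drawn in the figures.

```python
def membership(balance):
    """Return the degree of membership for each linguistic term (assumed slopes)."""
    low = 0.0 if balance >= 5000 else (1.0 if balance <= 1000 else (5000 - balance) / 4000)
    if balance <= 3000 or balance >= 9000:
        moderate = 0.0
    else:
        moderate = min((balance - 3000) / 3000, (9000 - balance) / 3000, 1.0)
    high = 0.0 if balance <= 7000 else (1.0 if balance >= 12000 else (balance - 7000) / 5000)
    return {"low": low, "moderate": moderate, "high": high}


def build_fuzzy_database(records):
    """Build the data file and one index per linguistic term (steps 1 to 5)."""
    data_file = []                                   # stand-in for the account data file
    index_files = {"low": {}, "moderate": {}, "high": {}}
    seen_keys = set()
    for account_no, branch_name, balance in records:
        # Step 1: validate the primary key, store the record, remember its address.
        if account_no in seen_keys:
            raise ValueError(f"duplicate primary key {account_no}")
        seen_keys.add(account_no)
        ad = len(data_file)
        data_file.append((account_no, branch_name, balance))
        # Steps 2 to 5: index the record under every term with a non-zero fuzzy value.
        for term, value in membership(balance).items():
            if value > 0:
                search_key = (account_no, round(value, 3))
                index_files[term][search_key] = ad
    return data_file, index_files


ACCOUNTS = [
    (103, "Dhaka", 7890), (104, "Narial", 4500), (107, "Bagura", 8500),
    (120, "Bagura", 8100), (108, "Kushtia", 2500), (110, "Dhaka", 3250),
    (101, "Narial", 15000),
]
data_file, index_files = build_fuzzy_database(ACCOUNTS)
```

For the sample relation, the high index ends up with four search keys (accounts 103, 107, 120 and 101), which is the value of k used later for the fuzzy query on the fuzzy database.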
MEASUREMENTS OF QUERY COST
The query cost is measured in terms of disk accesses, the CPU time needed to execute a
query and the cost of communication [2]. In large database systems, only the number of
disk accesses is considered, because disk access is much slower than memory operations.
In this paper, we have measured the cost of classical and fuzzy queries over the classical
database and the cost of a fuzzy query over the fuzzy database.
A. Query Cost of Classical Query over Classical Database:
In query processing, we traverse a path in the tree from the root to some leaf node. If
there are k search-key values in the file, at most

$$\log_{n/2} k$$

nodes are accessed, where n is the number of pointers in a node. A node is typically made
the same size as a disk block, which is typically 4 KB. The cost of a query for a single
record is expressed in I/O operations: the height of the tree plus one I/O to fetch the
record, each requiring a seek and a block transfer. Thus the cost is

$$(\log_{n/2} k + 1)(t_S + t_T),$$

where $t_S$ and $t_T$ are the seek time and block transfer time, respectively [2]. If the
query produces m records as output, the cost becomes

$$(\log_{n/2} k + m)(t_S + t_T).$$

Typical values for high-end disks today are $t_S = 4$ milliseconds and $t_T = 0.1$
milliseconds. For example, the classical query Q.1 is "find all account numbers whose
balance is greater than 7000". The output of this query is shown in Table 2.
In our example, k = 8, n = 4 and m = 4, so a lookup requires only $\log_{4/2} 8 = 3$ nodes
or blocks to be accessed, and the cost of the query is $(3 + 4)(4 + 0.1) = 28.7$ ms.
Table 2: Output of query statement Q.1
Account_no   Branch_name   Balance
103          Dhaka         7890
107          Bagura        8500
120          Bagura        8100
101          Narial        15000
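The cost formula of this subsection can be evaluated with a short helper. This is a sketch: the function name and parameter names are illustrative, the defaults are the typical seek and transfer times quoted in the text, and the logarithm is left unrounded.

```python
import math


def query_cost_ms(k, n, m=1, t_seek=4.0, t_transfer=0.1):
    """Cost in ms of a B-tree lookup returning m records.

    k: number of search-key values in the index file
    n: number of pointers per node
    m: number of records produced by the query
    """
    nodes_accessed = math.log(k) / math.log(n / 2)   # log base n/2 of k
    return (nodes_accessed + m) * (t_seek + t_transfer)


# Query Q.1 on the sample classical database: k = 8, n = 4, m = 4,
# i.e. (3 + 4) * (4 + 0.1) ms.
cost_q1 = query_cost_ms(k=8, n=4, m=4)
```

Because the cost grows with the logarithm of k, halving the number of search keys in an index saves roughly one node access when n = 4.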
B. Query Cost of Fuzzy Query over Classical Database:

The steps to retrieve records from the classical database with a fuzzy query, with
equality on a nonkey attribute, are as follows:
1. Set C to the root node of the index file of the database named in the query statement.
2. Repeat steps 3 and 4 while C is not a leaf node.
3. Find the smallest search key $K_i$ of node C.
4. Set C to the node pointed to by $P_i$.
5. Use $P_i$ to extract the record from the database.
6. Calculate the fuzzy value of the record in the fuzzy set named as the linguistic term
in the fuzzy query statement.
7. If the fuzzy value is greater than zero, select the record; otherwise skip it.
8. Set i = i + 1.
9. Repeat steps 5, 6, 7 and 8 until the end of the index file.
For example, the fuzzy query Q.2 is "find all account numbers whose balance is
high". The output of query Q.2 is shown in Table 3.
Table 3: Output of Desired Query Q.2
Account_no   Branch_name   Balance   High(balance)
103          Dhaka         7890      .178
107          Bagura        8500      .3
120          Bagura        8100      .22
101          Narial        15000     1
The cost of the fuzzy query Q.2 is $(3 + 4)(4 + 0.1) = 28.7$ ms, the same as that of the
classical query on the classical database. The difference between a linguistic term in
traditional logic and one in fuzzy logic is that the former provides no degree of
membership, whereas the latter does. For example, the fuzzy query Q.3 is "find all
account numbers whose balance is high with threshold .3". The output of this query is
shown in Table 4.
Table 4: Output of Desired Query Q.3
Account_no   Branch_name   Balance   High(balance)
107          Bagura        8500      .3
101          Narial        15000     1
Thus, although a fuzzy query on a classical database increases expressiveness, it has no
effect on search time.
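A fuzzy query over the classical database can be sketched as a scan that computes the membership degree of every record at query time. The sketch below is an illustration under assumptions: a list stands in for walking the leaf level of the B-tree index, the function names are hypothetical, and `mu_high` uses an assumed rising edge from 7000 to 12000, chosen because it reproduces the High(balance) values .178, .3 and .22 in Table 3.

```python
ACCOUNTS = [
    (103, "Dhaka", 7890), (104, "Narial", 4500), (107, "Bagura", 8500),
    (120, "Bagura", 8100), (108, "Kushtia", 2500), (110, "Dhaka", 3250),
    (101, "Narial", 15000),
]


def mu_high(balance):
    """Membership in the fuzzy set high (assumed edge from 7000 to 12000)."""
    if balance <= 7000:
        return 0.0
    if balance >= 12000:
        return 1.0
    return (balance - 7000) / 5000


def fuzzy_query_classical(records, membership, threshold=0.0):
    """Scan every record, compute its fuzzy value, keep those that qualify."""
    selected = []
    for record in records:                 # steps 5 to 9: visit each record in turn
        degree = membership(record[2])     # step 6: fuzzy value computed per query
        if degree > 0 and degree >= threshold:
            selected.append(record + (round(degree, 3),))
    return selected


q2 = fuzzy_query_classical(ACCOUNTS, mu_high)                 # "balance is high"
q3 = fuzzy_query_classical(ACCOUNTS, mu_high, threshold=0.3)  # threshold .3
```

Every record is visited and its degree recomputed on each query, which is why the cost matches that of the classical query on the same index.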
C. Query Cost of Fuzzy Query over Fuzzy Database:
The steps to retrieve records from the fuzzy database with a fuzzy query are as
follows:
1. Set C to the root node of the index file that corresponds to the linguistic term named
in the query statement.
2. Repeat steps 3 and 4 while C is not a leaf node.
3. Find the smallest search key $K_i$ of node C.
4. Set C to the node pointed to by $P_i$.
5. Use $P_i$ to extract the record from the database.
6. Test the record against the selection condition.
7. If it satisfies the selection condition, select the record; otherwise skip it.
8. Set i = i + 1.
9. Repeat steps 5, 6, 7 and 8 until the end of the index file.

For example, the fuzzy query Q.2, "find all account numbers whose balance is high",
yields the same result as in Table 3 when applied to the fuzzy database.
The fuzzy value is calculated while the fuzzy database is built and is stored in the index
file named after the linguistic term, so no extra time is spent computing fuzzy values
while searching for a record. In our example, the high index file has 4 search keys, i.e.,
k = 4 and n = 4, so a fuzzy lookup requires only

$$\log_{4/2} 4 = 2$$

nodes or blocks to be accessed. The result contains four records, i.e., m = 4, so the cost
of the fuzzy query Q.2 on the fuzzy database is $(2 + 4)(4 + 0.1) = 24.6$ ms, which is less
than the cost of Q.2 on the classical database.
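The lookup on the fuzzy database can be sketched as a walk over the precomputed high index. As before, this is an illustration rather than the authors' code: a dictionary stands in for the B-tree index file, the names are hypothetical, and the stored degrees are the High(balance) values listed in Table 3.

```python
# Stand-in for the account data file; list positions play the role of addresses.
DATA_FILE = [
    (103, "Dhaka", 7890), (104, "Narial", 4500), (107, "Bagura", 8500),
    (120, "Bagura", 8100), (108, "Kushtia", 2500), (110, "Dhaka", 3250),
    (101, "Narial", 15000),
]

# Stand-in for the high index file: (account_no, fuzzy value) -> record address.
HIGH_INDEX = {
    (103, 0.178): 0,
    (107, 0.3): 2,
    (120, 0.22): 3,
    (101, 1.0): 6,
}


def fuzzy_query_fuzzy_db(index_file, data_file, threshold=0.0):
    """Read degrees from the index; fetch only records that meet the threshold."""
    result = []
    for (account_no, degree), ad in index_file.items():
        if degree >= threshold:            # no membership computation at query time
            result.append(data_file[ad] + (degree,))
    return result


q2 = fuzzy_query_fuzzy_db(HIGH_INDEX, DATA_FILE)
q3 = fuzzy_query_fuzzy_db(HIGH_INDEX, DATA_FILE, threshold=0.3)
```

Only the four search keys of the high index are touched, and the threshold test reads a stored degree instead of evaluating a membership function.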
Table 5: Experimental Results (K = number of search keys; query cost in ms)
For Classical Query    For Fuzzy Query
                       Low               Moderate          High
K        Cost          K        Cost     K        Cost     K        Cost
10000    20.5          7000     19.86    4000     18.87    2000     17.63
                       6000     19.6     4500     19.07    3500     18.63
                       4000     18.9     5000     19.3     3000     18.4
                       2000     17.63    4555     19.1     6500     19.73
                       1500     17.12    6000     19.6     5500     19.43
50000    23.4          10000    20.5     30000    22.5     15000    21.2
                       5000     19.7     25000    22.13    25000    22.13
                       22000    21.9     17000    21.44    13000    20.97
                       31453    22.54    24316    22.08    9761     20.46
                       19673    21.7     11435    20.74    29431    22.42
RESULTS AND DISCUSSION

Table 5 shows the query cost of retrieving one record from classical and fuzzy databases
of different sizes using classical and fuzzy queries, where the number of pointers in each
node, n, is 20. If the database holds N distinct records, then each index file of the fuzzy
database contains N or fewer search keys, whereas the index file of the classical database
contains exactly N. The time to search for one record in the fuzzy database with a fuzzy
query is therefore reduced. The saving is most significant when the database is large and
the node size is also large.
The number of search keys in an index file of the fuzzy database depends on the number
of linguistic terms defined over its fuzzy attributes; constructing a separate index file for
each linguistic term reduces the number of search keys per file.
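As a rough consistency check (an observation on the tabulated numbers, not a statement made by the authors), the entries of Table 5 appear to follow the single-record cost formula with n = 20, so that the logarithm base is n/2 = 10, and with $t_S = 4$ ms and $t_T = 0.1$ ms:

$$(\log_{10} 10000 + 1)(4 + 0.1) = 5 \times 4.1 = 20.5 \text{ ms}, \qquad (\log_{10} 7000 + 1)(4 + 0.1) \approx 4.845 \times 4.1 \approx 19.86 \text{ ms}.$$

Under this reading, the saving for a fuzzy index comes entirely from its smaller K, and it grows only logarithmically with the reduction in search keys.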
CONCLUSION
We have shown that a query costs less time on the fuzzy database, which speeds up the
system. Our proposed system is therefore more useful than ordinary queries on a classical
database, although it requires more space, because the number of index files grows with
the number of linguistic terms in the fuzzy database. We do not consider this a serious
problem, given advances in memory technology. In future we will extend this work to
fuzzy queries with fuzzy quantifiers, such as "very". We also plan to study fuzzy
information processing in other database models, such as object-oriented ones.
REFERENCES
[1]. Seymour Lipschutz, Theory and Problems of Data Structures, McGraw Hill
Education, 1999.
[2]. Abraham Silberschatz, Henry F. Korth, S. Sudarshan, Database System Concepts,
McGraw Hill Education, 4th Edition, 2002.
[3]. Liberios Vokorokos, Anton Balaz, Norbert Adam, Parallelism in fuzzy databases,
Teachmedia, 5th Edition, 2006.
[4]. S. Rajasekaran and G.A. Vijayalakshmi Pai, Neural Networks, Fuzzy Logic and
Genetic Algorithm Synthesis and Applications, Prentice Hall India, 3rd Edition,
2003.
[5]. George J. Klir, Bo Yuan, Fuzzy Sets and Fuzzy Logic: Theory and Applications,
Prentice Hall of India, 1997.
[6]. Jose Galindo, Angelica Urrutia, Mario Piattini, Fuzzy Databases Modeling, Design
and Implementation, John Wiley & Sons, Inc., 4th Edition, 2004.