ABSTRACT
In this paper we design a sample classical database and a sample fuzzy database and apply
both classical and fuzzy queries to them. At present, most relational database systems
retrieve data with queries whose syntax and semantics are precisely defined. Sometimes,
however, users want to use vague (fuzzy) terms in a query. Our implementation gives the
user the flexibility to query the database using natural-language terms. We also calculate
the time cost of classical and fuzzy queries on the classical and fuzzy databases (DB) and
show that the time cost of a fuzzy query is reduced on the sample fuzzy database.
Keywords: Fuzzy logic, FDBMS, Fuzzy query, Classical database
Corresponding Author: Sujit kumar Mondal
INTRODUCTION
A database-management system (DBMS) consists of a collection of interrelated data and
a set of programs to access those data. The primary goal of a DBMS is to provide an
environment that is both convenient and efficient for people to use in retrieving and
storing information. Of the many operations performed on a DBMS, searching is one of the
most important, and it takes a significant amount of time. As the size of a DB increases,
the search time also increases. A number of algorithms have been developed to improve the
performance of query-based searching, but only for classical DBs. A traditional DBMS
cannot properly handle incomplete, imprecise and vague data such as "very high" or
"about 30". To overcome this problem, the FDBMS (Fuzzy Database Management System) has
been introduced. The primary focus of fuzzy logic is on natural language, where
approximate reasoning with imprecise propositions is rather typical.
As the size of DBs increases day by day, programmers aim to reduce the time complexity of
accessing data in a large database [1]. A large database may hold millions of records, and
finding any particular record in it costs a significant amount of time. The search time
may be reduced by indexing the database through the B-tree algorithm [2]. In this paper we
address the lack of expressiveness and also reduce the search time by designing a fuzzy
database and applying fuzzy queries to it.
FUZZY DATABASE AND FUZZY QUERY
A database that contains incomplete, imprecise and vague data is called a fuzzy
database. A fuzzy database is founded on fuzzy logic. There are two feasible ways to
incorporate fuzziness in a DBMS [3]: one is to make fuzzy queries against classical
databases, and the other is to add fuzzy information to the system. In this paper, the
fuzzy database is based on the relational data model; this fuzzy relational database
enhances the relational model by modelling imprecision in the data and/or the query. A
fuzzy relation consists of fuzzy or crisp attributes. Crisp attributes hold precise data,
such as a balance of Tk. 7980, whereas fuzzy attributes hold imprecise data. There are
four possible types of fuzzy attribute [6]: Type 1, Type 2, Type 3 and Type 4. Of these,
only Type 1, which has precise data and linguistic labels over them, is used to build the
fuzzy database in this paper. The schema of the relation used in this paper is as follows:
Account_schema = (account_no, branch_name, balance), where all attributes are crisp
for the classical database. For the fuzzy database, the only fuzzy attribute is balance
and the others are crisp. In both cases, account_no is the primary key. A sample account
relation is shown in Table 1.
Table 1: Account Relation
account_no   branch_name   balance
103          Dhaka         7890
104          Narial        4500
107          Bagura        8500
120          Bagura        8100
108          Kushtia       2500
110          Dhaka         3250
101          Narial        15000
We use three linguistic terms [5] over the balance attribute: high, moderate and low.
The membership functions [4] over the fuzzy attribute balance are illustrated in
Figure 1.
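As an illustration of a Type 1 linguistic label, the following minimal sketch evaluates a membership function for the term high. The breakpoints 7000 and 12000 and the linear ramp between them are assumptions: they are inferred from the High(balance) values reported in Table 3 (for example 7890 gives .178) and are not stated explicitly in the text. The function name `high_membership` is hypothetical.

```python
def high_membership(balance: float,
                    lower: float = 7000.0,
                    upper: float = 12000.0) -> float:
    """Degree to which a balance counts as 'high'.

    Assumption: a linear ramp that is 0 up to `lower` and 1 from `upper` on.
    The breakpoints are inferred from the values in Table 3, not given
    by the paper.
    """
    if balance <= lower:
        return 0.0
    if balance >= upper:
        return 1.0
    return (balance - lower) / (upper - lower)


if __name__ == "__main__":
    # Balances taken from the sample account relation (Table 1).
    for balance in (4500, 7890, 8100, 8500, 15000):
        print(balance, round(high_membership(balance), 3))
```

The low and moderate terms would be defined in the same way with their own breakpoints, which are not recoverable from the text and are therefore not sketched here.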
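[Figure 1: Membership functions LOW, MODERATE and HIGH over the balance attribute. X axis: balance, with ticks at 1000, 3000, 5000, 7000, 9000 and 12000. Y axis: membership degree μ(x), from 0 to 1. The ramps are labelled Slope1 to Slope4 and their widths D1 to D4.]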
[Figure: index node with search keys 101, 103 and 104 whose pointers lead to the records of the account relation in Table 1.]
for the fuzzy set low, $\mu_{low}(7890) = 0$, so the search key for balance 7890 is not
inserted into the low index file, and so on for the other records. The structure of a leaf
node with 4 pointers for each of the low, moderate and high index files is shown in
Figures 3, 4 and 5, respectively.
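The rule described here, that a record enters the index file of a linguistic term only when its membership degree is greater than zero, can be sketched as follows. This is a simplified illustration and not the paper's implementation: a plain dictionary stands in for the leaf level of the index, the names `build_term_index` and `high_membership` are hypothetical, and the ramp for high (0 at 7000, 1 at 12000) is an assumption inferred from the values in Table 3.

```python
from typing import Callable, Dict, List, Tuple

Record = Tuple[int, str, float]  # (account_no, branch_name, balance)

# Sample account relation from Table 1.
ACCOUNTS: List[Record] = [
    (103, "Dhaka", 7890), (104, "Narial", 4500), (107, "Bagura", 8500),
    (120, "Bagura", 8100), (108, "Kushtia", 2500), (110, "Dhaka", 3250),
    (101, "Narial", 15000),
]


def high_membership(balance: float) -> float:
    """Assumed linear ramp for 'high': 0 at 7000, 1 at 12000."""
    return min(1.0, max(0.0, (balance - 7000.0) / 5000.0))


def build_term_index(records: List[Record],
                     membership: Callable[[float], float]) -> Dict[int, float]:
    """Map account_no to membership degree, keeping only degrees above zero.

    A dict is used here in place of the leaf nodes of a real index file.
    """
    index: Dict[int, float] = {}
    for account_no, _branch, balance in records:
        degree = membership(balance)
        if degree > 0:  # records with degree 0 never enter this index
            index[account_no] = round(degree, 3)
    return index


if __name__ == "__main__":
    print(build_term_index(ACCOUNTS, high_membership))
```

Because records with degree zero are left out, the index for one term holds fewer search keys than an index over the whole relation, which is what later shortens the lookup path.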
Leaf node of the low index file, as (account_no, membership degree) entries:
(104, .75), (108, 1), (110, .125)
Each entry points to its record in the account relation of Table 1.
Figure 3: The internal structure of account relation for low linguistic term
Leaf node of the moderate index file, as (account_no, membership degree) entries:
(103, .37), (104, .5), (107, .167)
Each entry points to its record in the account relation of Table 1.
Figure 4: The internal structure of account relation for moderate linguistic term
Leaf node of the high index file, as (account_no, membership degree) entries:
(101, 1), (103, .178), (107, .3)
Each entry points to its record in the account relation of Table 1.
Figure 5: The internal structure of account relation for high linguistic term
MEASUREMENTS OF QUERY COST
The query cost is measured in terms of disk accesses, the CPU time needed to execute a
query and the cost of communication [2]. In large database systems, only the number of
disk accesses is considered, because disk access is far slower than in-memory operations.
In this paper, we measure the cost of classical and fuzzy queries over the classical
database and the cost of a fuzzy query over the fuzzy database.
A. Query Cost of Classical Query over Classical Database:
In query processing, we traverse a path in the tree from the root to some leaf node. If
there are k search-key values in the file and each node holds up to n pointers, at most
$\lceil \log_{\lceil n/2 \rceil}(k) \rceil$
nodes are accessed. Typically a node is made to be the same size as a disk block, which is
typically 4 KB. The cost of the query for a single record is expressed in I/O operations:
the height of the tree plus one I/O to fetch the record. Each of these I/O operations
requires a seek and a block transfer. Thus the cost is
$(\lceil \log_{\lceil n/2 \rceil}(k) \rceil + 1)(t_S + t_T)$,
where $t_S$ and $t_T$ are the seek time and the block transfer time, respectively [2]. If
the query produces m records as output, the cost is
$(\lceil \log_{\lceil n/2 \rceil}(k) \rceil + m)(t_S + t_T)$.
Typical values for high-end disks today are $t_S = 4$ milliseconds and $t_T = 0.1$
milliseconds. For example, the classical query Q.1 is: find all account numbers whose
balance is greater than 7000. The output of this query is shown in Table 2.
In our example, k=8, n=4 and m=4. A lookup therefore requires only
$\lceil \log_{\lceil 4/2 \rceil}(8) \rceil = 3$ nodes or blocks to be accessed, and the
cost of the query is $(3 + 4) \times (4 + 0.1) = 28.7$ ms.
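The cost estimate above can be checked with a short calculation. The sketch below is illustrative only: the function names `tree_height` and `query_cost` are hypothetical, the default seek and transfer times are the typical values quoted in the text, and the height is computed with an integer loop rather than a floating-point logarithm to avoid rounding surprises.

```python
def tree_height(k: int, n: int) -> int:
    """Smallest h with ceil(n/2)**h >= k, i.e. ceil(log_{ceil(n/2)}(k))."""
    base = -(-n // 2)  # integer ceil(n / 2)
    if base < 2:
        raise ValueError("n must be at least 3 so that ceil(n/2) >= 2")
    height, reach = 0, 1
    while reach < k:
        reach *= base
        height += 1
    return height


def query_cost(k: int, n: int, m: int,
               t_seek: float = 4.0, t_transfer: float = 0.1) -> float:
    """Estimated cost in ms: (height + m) I/Os, each a seek plus a transfer."""
    return (tree_height(k, n) + m) * (t_seek + t_transfer)


if __name__ == "__main__":
    # Values of the running example: k=8 search keys, n=4 pointers, m=4 results.
    print(tree_height(8, 4))
    print(f"{query_cost(8, 4, 4):.2f} ms")
```

Only the height term depends on k, so any reduction in the number of indexed search keys shows up directly as fewer node accesses.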
Table 2: Output of query statement Q.1
Account_no   Branch_name   Balance
103          Dhaka         7890
107          Bagura        8500
120          Bagura        8100
101          Narial        15000
query statement.
7. If the fuzzy value is greater than zero, select the record; otherwise skip it.
8. Set i=i+1
9. Repeat steps 5, 6, 7 and 8 until the end of the index file.
For example, the fuzzy query Q.2 is: find all account numbers whose balance is
high. The output of Q.2 is shown in Table 3.
Table 3: Output of Desired Query Q.2
Account_no   Branch_name   Balance   High(balance)
103          Dhaka         7890      .178
107          Bagura        8500      .3
120          Bagura        8100      .22
101          Narial        15000     1
The cost of the fuzzy query Q.2 is $(3 + 4) \times (4 + 0.1) = 28.7$ ms, which is the
same as the cost of the classical query on the classical database. Unlike a linguistic
term in traditional logic, a linguistic term in fuzzy logic carries a degree of
membership, so a query can also specify a minimum degree. For example, the fuzzy query
Q.3 is: find all account numbers whose balance is high with threshold .3. The output of
Q.3 is shown in Table 4.
Table 4: Output of Desired Query Q.3
Account_no   Branch_name   Balance   High(balance)
107          Bagura        8500      .3
101          Narial        15000     1
Thus, although a fuzzy query on the classical database makes queries more expressive,
it has no effect on the search time.
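The selection behind Q.2 and Q.3 can be sketched as a scan that computes the membership degree of every record and keeps those above zero, optionally also requiring a minimum degree. This is a minimal sketch under stated assumptions: the names `fuzzy_select` and `high_membership` are hypothetical, the ramp for high (0 at 7000, 1 at 12000) is inferred from the reported High(balance) values, a threshold is treated as inclusive, and an in-memory list stands in for the records reached through the classical index.

```python
from typing import Callable, List, Tuple

Record = Tuple[int, str, float]          # (account_no, branch_name, balance)
Match = Tuple[int, str, float, float]    # record plus membership degree

# Sample account relation from Table 1.
ACCOUNTS: List[Record] = [
    (103, "Dhaka", 7890), (104, "Narial", 4500), (107, "Bagura", 8500),
    (120, "Bagura", 8100), (108, "Kushtia", 2500), (110, "Dhaka", 3250),
    (101, "Narial", 15000),
]


def high_membership(balance: float) -> float:
    """Assumed linear ramp for 'high': 0 at 7000, 1 at 12000."""
    return min(1.0, max(0.0, (balance - 7000.0) / 5000.0))


def fuzzy_select(records: List[Record],
                 membership: Callable[[float], float],
                 threshold: float = 0.0) -> List[Match]:
    """Keep records whose degree is > 0 and at least `threshold`."""
    selected: List[Match] = []
    for account_no, branch, balance in records:
        degree = round(membership(balance), 3)
        if degree > 0 and degree >= threshold:
            selected.append((account_no, branch, balance, degree))
    return selected


if __name__ == "__main__":
    print(fuzzy_select(ACCOUNTS, high_membership))        # query like Q.2
    print(fuzzy_select(ACCOUNTS, high_membership, 0.3))   # query like Q.3
```

Every record is still examined in this variant, which is consistent with the observation that the fuzzy query adds expressiveness on the classical database without changing the search cost.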
C. Query Cost of Fuzzy Query over Fuzzy Database:
The steps needed to retrieve records from the fuzzy database with a fuzzy query are as
follows:
1. Set C to the root node of the index file for the linguistic term used in the
query statement.
2. Repeat steps 3 and 4 while C is not a leaf node.
3. Find the smallest search key $K_i$ of node C.
4. Set C to the node pointed to by $P_i$.
5. Use $P_i$ to extract the record from the database.
6. Test whether the record satisfies the selection condition.
7. If it satisfies the selection condition, it is a desired record; otherwise skip it.
8. Set i=i+1.
$\lceil \log_{\lceil 4/2 \rceil}(4) \rceil = 2$
nodes or blocks to be accessed. The result contains four records, i.e., m=4. So the cost
of the fuzzy query Q.2 on the fuzzy database is $(2 + 4) \times (4 + 0.1) = 24.6$ ms,
which is less than the cost of Q.2 as a fuzzy query on the classical database.
Table 5: Experimental Results
For Classical Query
K        Query cost (ms)
10000    20.5
50000    23.4

K        Low: Query cost (ms)   High: Query cost (ms)
7000     19.86                  17.63
6000     19.6                   18.63
4000     18.9                   18.4
2000     17.63                  19.73
1500     17.12                  19.43
10000    20.5                   21.2
5000     19.7                   22.13
22000    21.9                   20.97
31453    22.54                  20.46
19673    21.7                   22.42
CONCLUSION
We have shown that a query takes less time on the fuzzy database, which speeds up the
system. Our proposed system is therefore more useful than ordinary queries on a classical
database. It does require some additional space, because the number of index files grows
with the number of linguistic terms in the fuzzy database, but we do not consider this a
significant problem given the advances in memory technology. In future we will extend
this work to fuzzy queries with fuzzy quantifiers, such as very. We also plan to study
fuzzy information processing in other database models, such as object-oriented ones.
REFERENCES
[1]. Seymour Lipschutz, Theory and Problems of Data Structures, McGraw-Hill
Education, 1999.
[2]. Abraham Silberschatz, Henry F. Korth, S. Sudarshan, Database System Concepts,