
International Journal on Computational Sciences & Applications (IJCSA) Vol.5, No.3, June 2015

A SERIAL COMPUTING MODEL OF AGENT ENABLED MINING OF GLOBALLY STRONG ASSOCIATION RULES

G.S.Bhamra 1 , A. K.Verma 2 and R.B.Patel 3

1 M. M. University, Mullana, Haryana, 133207 - India 2 Thapar University, Patiala, Punjab, 147004- India 3 Chandigarh College of Engineering & Technology, Chandigarh- 160019- India

ABSTRACT

The intelligent agent based model is a popular approach for constructing Distributed Data Mining (DDM) systems that address scalable mining over large-scale and ever-increasing distributed data. In an agent based distributed system, a variety of agents coordinate and communicate with each other to perform the various tasks of the Data Mining (DM) process. In this study, a serial computing model of a multi-agent system (MAS) called Agent enabled Mining of Globally Strong Association Rules (AeMGSAR) is presented, based on the serial itinerary of the mobile agents. A running environment is also designed for the implementation and performance study of the AeMGSAR system.

KEYWORDS

Knowledge Discovery, Association Rules, Intelligent Agents, Multi-Agent System

1.INTRODUCTION

Data Mining (DM) techniques are used to extract interesting and valid data patterns implicitly stored in large databases [1], [2]. Intelligent software agent technology is an interdisciplinary technology dealing with the development and efficient utilization of autonomous software objects called agents, which have access to geographically distributed and heterogeneous resources. Agents are autonomous, adaptive, reactive, pro-active, social, cooperative, collaborative and flexible. They also support temporal continuity and mobility within the network. An intelligent agent with the mobility feature is known as a Mobile Agent (MA). An MA migrates from node to node in a heterogeneous network without losing its operability. On reaching a network node, the MA is delivered to an Agent Execution Environment (AEE), where its executable parts start running. Upon completion of the desired task, it delivers the results to the home node. A Mobile Agent Platform (MAP), or Agent Execution Environment (AEE), is a server application that provides the appropriate functionality for MAs to authenticate, execute, communicate, migrate to other platforms, and use system resources in a secure way. A Multi Agent System (MAS) is a distributed application comprising multiple interacting intelligent agent components [3].

Let $DB = \{T_j,\ j = 1 \ldots D\}$ be a transactional dataset of size $D$, where each transaction $T$ is assigned an identifier ($TID$), and let $I = \{d_i,\ i = 1 \ldots m\}$ be the total $m$ data items in $DB$. A set of items in a particular transaction $T$ is called an itemset or pattern. An itemset $P = \{d_i,\ i = 1 \ldots k\}$, which is a set of $k$ data items in a particular transaction $T$ with $P \subseteq I$, is called a k-itemset. The support of an itemset, $s(P) = \frac{\text{No\_of\_T\_containing\_P}}{D} \times 100\,\%$, is the frequency of occurrence of itemset $P$ in $DB$, where No_of_T_containing_P is the support count (sup_count) of itemset $P$. Frequent Itemsets (FIs) are the itemsets that appear in $DB$ frequently, i.e., if $s(P) \geq$ min_th_sup (given minimum threshold support), then $P$ is a frequent k-itemset. Finding such FIs plays an essential role in mining the interesting relationships among itemsets. Frequent Itemset Mining (FIM) is the task of finding the set of all the subsets of FIs in a transactional database [2].

DOI:10.5121/ijcsa.2015.5307
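The support definition can be illustrated with a minimal Java sketch. The class and method names (`SupportSketch`, `supportCount`, `supportPercent`, `isFrequent`) and the representation of a transaction as a `Set<Integer>` are assumptions made for this example only; they do not come from the AeMGSAR implementation.

```java
import java.util.List;
import java.util.Set;

/** Illustrative helper (hypothetical names): support of an itemset as a percentage. */
final class SupportSketch {

    /** sup_count(P): number of transactions that contain every item of the pattern. */
    static long supportCount(List<Set<Integer>> db, Set<Integer> pattern) {
        return db.stream().filter(t -> t.containsAll(pattern)).count();
    }

    /** s(P) = sup_count(P) / D * 100, where D is the number of transactions. */
    static double supportPercent(List<Set<Integer>> db, Set<Integer> pattern) {
        if (db.isEmpty()) {
            return 0.0; // assumption: an empty dataset gives zero support
        }
        return 100.0 * supportCount(db, pattern) / db.size();
    }

    /** P is frequent when s(P) >= min_th_sup (both expressed in percent). */
    static boolean isFrequent(List<Set<Integer>> db, Set<Integer> pattern, double minThSup) {
        return supportPercent(db, pattern) >= minThSup;
    }

    public static void main(String[] args) {
        List<Set<Integer>> db = List.of(Set.of(1, 2, 3), Set.of(1, 2), Set.of(2, 3), Set.of(1, 3));
        System.out.println(supportPercent(db, Set.of(1, 2)));
    }
}
```

In the example dataset, the itemset {1, 2} occurs in two of the four transactions, so it is frequent for any min_th_sup up to 50.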

Association Rules (ARs) are used to discover the associations among items in a database [4]. An AR is an implication of the form $P \Rightarrow Q\ [support, confidence]$ where $P \subset I$, $Q \subset I$ and $P \cap Q = \emptyset$. An AR is measured in terms of its support and confidence factor, where the support of the rule, $s(P \Rightarrow Q)$, is the probability of both $P$ and $Q$ appearing in $T$, i.e., $p(P \cup Q)$, and the confidence of the rule, $c(P \Rightarrow Q)$, is the conditional probability of $Q$ given $P$, i.e., $p(Q \mid P)$. An AR is said to be strong if $s(P \Rightarrow Q) \geq$ min_th_sup (given minimum threshold support) and $c(P \Rightarrow Q) \geq$ min_th_conf (given minimum threshold confidence). Association Rule Mining (ARM) today is one of the most important aspects of DM tasks. In ARM, all the strong ARs are generated from the FIs. ARM can be viewed as a two-step process [5], [6]:

1. Find all the frequent k-itemsets ($L_k$).

2. Generate strong ARs from $L_k$:

a. For each frequent itemset $l \in L_k$, generate all non-empty subsets of $l$.

b. For every non-empty subset $s$ of $l$, output the rule "$s \Rightarrow (l - s)$" if $\frac{sup\_count(l)}{sup\_count(s)} \geq$ min_th_conf.
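Step 2 can be sketched in Java as follows. This is only an illustration under stated assumptions: the names `RuleGenSketch` and `strongRules` are hypothetical, support counts are assumed to be available in a map keyed by itemset, confidence is expressed in percent, and only proper subsets are enumerated so that the right-hand side of a rule is never empty.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative rule generation from one frequent itemset (hypothetical names). */
final class RuleGenSketch {

    /** Emits "s => (l - s)" for each non-empty proper subset s of l meeting the confidence threshold. */
    static List<String> strongRules(List<Integer> l, Map<Set<Integer>, Integer> supCount, double minThConf) {
        List<String> rules = new ArrayList<>();
        int n = l.size();
        int supL = supCount.get(new TreeSet<>(l));
        // Bit masks 1 .. 2^n - 2 enumerate the non-empty proper subsets of l.
        for (int mask = 1; mask < (1 << n) - 1; mask++) {
            Set<Integer> s = new TreeSet<>();
            Set<Integer> rest = new TreeSet<>();
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    s.add(l.get(i));
                } else {
                    rest.add(l.get(i));
                }
            }
            // confidence = sup_count(l) / sup_count(s), as a percentage
            double conf = 100.0 * supL / supCount.get(s);
            if (conf >= minThConf) {
                rules.add(s + " => " + rest);
            }
        }
        return rules;
    }

    public static void main(String[] args) {
        Map<Set<Integer>, Integer> supCount = Map.of(Set.of(1), 4, Set.of(2), 2, Set.of(1, 2), 2);
        System.out.println(strongRules(List.of(1, 2), supCount, 60.0));
    }
}
```

The sketch assumes every subset of a frequent itemset has an entry in the support count map, which holds for Apriori-style mining because every subset of a frequent itemset is itself frequent.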

Distributed Association Rule Mining (DARM) is the task of generating the globally strong association rules from the global FIs in a distributed environment. A few preliminary notations and definitions required for defining DARM and to make this study self-contained are as follows:

$S = \{S_i,\ i = 1 \ldots n\}$, $n$ distributed sites.

$S_{CENTRAL}$, Central Site.

$DB_i = \{T_j,\ j = 1 \ldots D_i\}$, Horizontally partitioned data set of size $D_i$ at the local site $S_i$, where each transaction $T_j$ is assigned an identifier (TID).

$DB = \bigcup_{i=1}^{n} DB_i$, the aggregated dataset of size $D = \sum_{i=1}^{n} D_i$, with $DB_i \cap DB_j = \emptyset$.

$I = \{d_i,\ i = 1 \ldots m\}$, total $m$ data items in each $DB_i$.

$L_k^{FI(i)}$, Local frequent k-itemsets at site $S_i$.

$L_k^{FISC(i)}$, List of support counts of the itemsets in $L_k^{FI(i)}$.

$L_i^{LSAR}$, List of locally strong association rules at site $S_i$.

$L^{TLSAR} = \bigcup_{i=1}^{n} L_i^{LSAR}$, List of total locally strong association rules.

$L_k^{TFI} = \bigcup_{i=1}^{n} L_k^{FI(i)}$, List of total frequent k-itemsets.

$L_k^{GFI} = \bigcap_{i=1}^{n} L_k^{FI(i)}$, List of global frequent k-itemsets.

$L_{CENTRAL}^{GSAR}$, List of globally strong association rules.

Local Knowledge Base (LKB), at site $S_i$, comprises $L_k^{FI(i)}$, $L_k^{FISC(i)}$ and $L_i^{LSAR}$, which can provide reference to the local supervisor for local decisions. Global Knowledge Base (GKB), at $S_{CENTRAL}$, comprises $L^{TLSAR}$, $L_k^{TFI}$, $L_k^{GFI}$ and $L_{CENTRAL}^{GSAR}$ for the global decision making [7]. Like ARM, the DARM task can also be viewed as a two-step process [6]:

1. Find the global frequent k-itemsets ($L_k^{GFI}$) from the distributed local frequent k-itemsets ($L_k^{FI(i)}$) from the partitioned datasets.

2. Generate globally strong association rules ($L_{CENTRAL}^{GSAR}$) from $L_k^{GFI}$.
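The definitions of $L_k^{TFI}$ (a union over sites) and $L_k^{GFI}$ (an intersection over sites) can be sketched in Java as below. The class and method names are hypothetical, and each site's frequent itemsets are assumed to be held in memory as a set of item sets.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative union/intersection of per-site frequent itemsets (hypothetical names). */
final class GlobalItemsetSketch {

    /** Total frequent itemsets: union of the local lists of all sites. */
    static Set<Set<Integer>> totalFrequent(List<Set<Set<Integer>>> localLists) {
        Set<Set<Integer>> total = new HashSet<>();
        for (Set<Set<Integer>> site : localLists) {
            total.addAll(site);
        }
        return total;
    }

    /** Global frequent itemsets: itemsets that are locally frequent at every site. */
    static Set<Set<Integer>> globalFrequent(List<Set<Set<Integer>>> localLists) {
        if (localLists.isEmpty()) {
            return new HashSet<>(); // assumption: no sites means no global itemsets
        }
        Set<Set<Integer>> global = new HashSet<>(localLists.get(0));
        for (Set<Set<Integer>> site : localLists.subList(1, localLists.size())) {
            global.retainAll(site);
        }
        return global;
    }

    public static void main(String[] args) {
        Set<Set<Integer>> site1 = Set.of(Set.of(1), Set.of(2), Set.of(1, 2));
        Set<Set<Integer>> site2 = Set.of(Set.of(1), Set.of(3));
        System.out.println(globalFrequent(List.of(site1, site2)));
    }
}
```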

The existing agent based systems specifically dealing with the DARM task are: Knowledge Discovery Management System (KDMS) [8], Efficient Distributed Data Mining using Intelligent Agents [9], Mobile Agent based Distributed Data Mining [10], An Agent based Framework for Association Rule Mining of Distributed Data (AFARMDD) [11], [12], and Multi-Agent Distributed Association Rule Miner (MADARM) [13]. All these systems are academic research projects. A qualitative comparison of these DARM frameworks is provided in [14]. Most of the existing agent based frameworks for the DARM task are only prototype models and lack an appropriate underlying AEE, scalability, privacy preserving techniques, global knowledge generation and implementation using real datasets.

The rest of the paper is organised as follows. Section 2 describes the running environment for the proposed system along with the various algorithms involved. The serial computing model of AeMGSAR is presented in Section 3, where the algorithms for all the agents involved in this system are also discussed. Section 4 describes the implementation and performance study of the system, and finally the article is concluded in Section 5.

2.ENVIRONMENT FOR THE PROPOSED SYSTEM

Every MAS needs an underlying AEE to provide a running infrastructure on which agents can be deployed and tested. A running environment has been designed in Java. Various attributes of the MA are encapsulated within a data structure known as AgentProfile. It contains the name of MA (AgentName), version number (AgentVersion), entire byte code (BC), list of nodes to be visited by MA, i.e., itinerary plan ($L^{NODES}$), type of the itinerary (ItinType) which can be serial or parallel, a reference of current execution state (AObject) and an additional data structure known as Briefcase that acts as a result bag of MA to store final resultant knowledge ($Result\_S_i$) at a particular site. Computational time (CPUTime) taken by a MA at a particular site is also stored in $Result\_S_i$. In addition to results, Briefcase also contains the system time for start of agent journey ($TripTime_{start}$), system time for end of journey ($TripTime_{end}$) and total round trip time of MA (TripTime) calculated using $TripTime \leftarrow TripTime_{end} - TripTime_{start}$. Stationary as well as mobile agents involved in the models would be discussed later on. This environment consists of the following three components:


Data Mining Agent Execution Environment (DM_AEE): It is the key component that acts as a Server. DM_AEE is deployed on any distributed site $S_i$ and is responsible for receiving, executing and migrating all the visiting DM agents. It receives the incoming AgentProfile at site $S_i$, retrieves the entire BC of the agent and saves it as AgentName.class in the local file system of the site $S_i$, after which execution of the agent is started using AObject. Steps are shown in Algorithm 1.

Agent Launcher (AL): It acts as a Client at the agent launching station ($S_{CENTRAL}$) and launches the goal oriented DM agents on behalf of the user through a user interface to the DM_AEE running at the distributed sites. Agent Pool (or Zone) at $S_{CENTRAL}$ is a repository of all mobile as well as stationary agents (SAs). AL first reads and stores AgentName in AgentProfile. The entire BC of the AgentName is loaded from the Agent Pool and stored in AgentProfile. $L^{NODES}$ and ItinType are retrieved and stored in AgentProfile. $TripTime_{start}$ is maintained in Briefcase which is further added to AgentProfile. In case of the serial computing model, i.e., if ItinType = Serial, AL dispatches a specific single MA along with $L^{NODES}$, and it travels from node to node. AgentVersion is set as 1 for this agent. AL also contacts the Result Manager (RM) for processing the Briefcase of an agent. Detailed steps are given in Algorithm 2.

Result Manager (RM): It manages and processes the Briefcase of all MAs. RM is either contacted by a MA for submitting its results or by AL for processing the results of the specific MA. On completion of the itinerary, each DM agent submits its results to RM, which computes the total round trip time (TripTime) of that MA and saves it in the Briefcase of that agent. If ItinType = Serial then it saves the updated AgentProfile of the agent at $S_{CENTRAL}$. When it is contacted by AL for processing the results of a specific agent, it sends back the AgentProfile of that agent. Steps are defined in Algorithm 3.
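To make the AgentProfile and Briefcase description concrete, the following Java sketch shows one possible shape for these data structures. All field and class layouts here are assumptions for illustration; the paper names the attributes but does not give their Java declarations.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Result bag carried by a mobile agent (illustrative layout, not the authors' code). */
final class Briefcase {
    long tripTimeStart; // system time at the start of the agent journey
    long tripTimeEnd;   // system time at the end of the agent journey
    // Result_S_i for each visited site, keyed by an assumed site address string
    final Map<String, Object> resultsBySite = new LinkedHashMap<>();

    /** TripTime = TripTime_end - TripTime_start, as computed by RM. */
    long tripTime() {
        return tripTimeEnd - tripTimeStart;
    }
}

/** Attributes of a mobile agent as listed in the text (illustrative layout). */
final class AgentProfile {
    enum ItinType { SERIAL, PARALLEL }

    String agentName;                              // AgentName
    int agentVersion;                              // AgentVersion
    byte[] byteCode;                               // BC
    final List<String> nodes = new ArrayList<>();  // itinerary plan, IP addresses
    ItinType itinType;                             // ItinType
    Runnable aObject;                              // AObject, assumed to be runnable
    final Briefcase briefcase = new Briefcase();

    public static void main(String[] args) {
        AgentProfile profile = new AgentProfile();
        profile.agentName = "LFIGA";
        profile.itinType = ItinType.SERIAL;
        profile.briefcase.tripTimeStart = 100L;
        profile.briefcase.tripTimeEnd = 350L;
        System.out.println(profile.briefcase.tripTime());
    }
}
```

A real mobile agent profile would also have to be serializable so that it can be transferred between sites; that aspect is left out of this sketch.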

Algorithm 1 DATA MINING AGENT EXECUTION ENVIRONMENT (DM_AEE)

procedure DM_AEE( )
    while TRUE do
        AgentProfile ← listen and receive AgentProfile at S_i
        AgentName ← get AgentName from AgentProfile
        BC ← retrieve the BC of agent from AgentProfile
        save the BC with AgentName.class in the local file system of S_i
        AObject ← get AObject from AgentProfile    ▷ current state
        AObject.run()    ▷ start executing mobile agent
    end while
end procedure

Algorithm 2 AGENT LAUNCHER (AL)

procedure AL( )
    option ← read option (dispatch / result)
    switch option do
        case dispatch    ▷ dispatch the mobile agent to DM_AEE
            AgentName ← read Mobile Agent's name
            add AgentName to AgentProfile
            BC ← load entire byte code of AgentName from AgentPool
            add BC to AgentProfile
            L^NODES ← read Itinerary (IP addresses) of mobile agent
            ItinType ← read ItinType (Serial / Parallel)
            add ItinType to AgentProfile
            if ItinType = "Serial" then    ▷ Serial Itinerary
                AgentVersion ← 1
                add AgentVersion to AgentProfile
                add L^NODES to AgentProfile
                switch AgentName do
                    case LFIGA
                        minthrsup ← read minimum threshold support
                        AObject ← new LFIGA(AgentProfile, minthrsup)
                    end case
                    case LKGA
                        minthrconf ← read minimum threshold confidence
                        AObject ← new LKGA(AgentProfile, minthrconf)
                    end case
                    case TFICA
                        AObject ← new TFICA(AgentProfile)
                    end case
                    case LKCA
                        AObject ← new LKCA(AgentProfile)
                    end case
                    case GKDA
                        L^GSAR_CENTRAL ← load L^GSAR_CENTRAL generated by GKGA at S_CENTRAL
                        add L^GSAR_CENTRAL to Briefcase
                        add updated Briefcase to AgentProfile
                        AObject ← new GKDA(AgentProfile)
                    end case
                end switch
                add AObject to AgentProfile    ▷ current state
                transfer AgentProfile to DM_AEE at first IP address in L^NODES
            end if
        end case
        case result    ▷ process the result of mobile agent
            AgentName ← read mobile agent's name
            ItinType ← read mobile agent's ItinType
            add AgentName to L^AgentInfo
            add ItinType to L^AgentInfo
            if ItinType = "Serial" then    ▷ Result processing for Serial Itinerary Agents
                AgentProfile ← contact RM for L^AgentInfo
                Briefcase ← retrieve Briefcase from AgentProfile
                switch AgentName do
                    case LFIGA
                        process the Briefcase of LFIGA
                    end case
                    case LKGA
                        process the Briefcase of LKGA
                    end case
                    case TFICA
                        call GFIGA(Briefcase)    ▷ stationary agent
                    end case
                    case LKCA
                        call GKGA(Briefcase)    ▷ stationary agent
                    end case
                    case GKDA
                        process the Briefcase of GKDA
                    end case
                end switch
            end if
        end case
    end switch
end procedure

Algorithm 3 RESULT MANAGER (RM)

procedure RM( )
    while TRUE do
        listen and receive the incoming request
        if contacted by a mobile agent for submitting results from site S_i then
            AgentProfile ← receive the incoming AgentProfile from site S_i
            ItinType ← retrieve ItinType from AgentProfile
            Briefcase ← retrieve mobile agent's Briefcase from AgentProfile
            TripTime_start ← retrieve TripTime_start from Briefcase
            TripTime_end ← retrieve TripTime_end from Briefcase
            TripTime ← TripTime_end − TripTime_start
            add TripTime to Briefcase
            add updated Briefcase to AgentProfile
            if ItinType = "Serial" then
                save AgentProfile at S_CENTRAL
            end if
        end if
        if contacted by AL for processing the results then
            AgentName ← retrieve AgentName from incoming L^AgentInfo
            ItinType ← retrieve ItinType from incoming L^AgentInfo
            if ItinType = "Serial" then
                AgentProfile ← load AgentProfile for AgentName from S_CENTRAL
                dispatch AgentProfile to AL
            end if
        end if
    end while
end procedure


The overall working of the AeMGSAR system may be divided into the following six stages:

1. Request Stage: Request for the DARM is initiated at $S_{CENTRAL}$ by AL on behalf of the user with necessary credentials.

2. Preparation Stage: AL, through the User Interface, reads the agent name and version number; the itinerary for the MA's journey is obtained in terms of the IP addresses of the distributed nodes to be visited by the MA; any specific additional data for a specific MA is obtained; the agent code for the specific MA is loaded from AgentPool; for a serial itinerary, a single specific MA is dispatched by AL to travel and visit the n distributed sites one after another.

3. Local Mining Stage: The ARM process is performed locally by specific DM agents on each distributed site and the results are kept as the local knowledge base at that site.

4. Result Collection Stage: Collector agents visit each site, collect the results generated by the DM agents and submit the results back to RM at $S_{CENTRAL}$.

5. Knowledge Integration and Global Knowledge Generation Stage: Knowledge or result integration is carried out by the RM with the help of a stationary agent, and Global Knowledge in the form of Globally Strong Association Rules may be generated with the help of other stationary agents at $S_{CENTRAL}$.

6. Global Knowledge Dispatching Stage: Global knowledge is dispatched to the distributed sites by a dispatching agent to compare it with the local knowledge at each site.

Figure 1. AeMGSAR Serial Computing Model
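The serial itinerary used in these stages can be simulated in a few lines of Java. The sketch below is an in-memory stand-in, not the networked implementation: the class name `SerialItinerarySketch`, the `run` method and the use of a `Function` to represent the agent's task at a site are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Function;

/** In-memory simulation of a single agent visiting sites one after another (hypothetical names). */
final class SerialItinerarySketch {

    /** Visits each node in order; workAtSite stands in for the agent's task at a site. */
    static List<String> run(List<String> nodes, Function<String, String> workAtSite) {
        Deque<String> itinerary = new ArrayDeque<>(nodes);
        List<String> briefcase = new ArrayList<>();
        while (!itinerary.isEmpty()) {
            String site = itinerary.removeFirst();   // the current site counts as visited
            briefcase.add(workAtSite.apply(site));   // store the result for this site
            // if the itinerary is not empty, a real agent would migrate to the next DM_AEE here
        }
        return briefcase;                            // would be handed to RM at the central site
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("10.0.0.1", "10.0.0.2"), site -> "mined at " + site));
    }
}
```

The loop mirrors the hop logic shared by the mobile agents: remove the visited address, continue while addresses remain, and return the accumulated results once the list is empty.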


3.SERIAL COMPUTING MODEL OF AEMGSAR

The serial computing model of the AeMGSAR system is shown in Figure 1. It consists of a total of seven agents: five of these are MAs dispatched from $S_{CENTRAL}$ with serial itinerary multi-hop migration and the other two are intelligent SAs running at $S_{CENTRAL}$ to perform different tasks. The CPU time taken by a MA while processing on each site, along with some other specific information, is carried back in the result bag at $S_{CENTRAL}$. Agents in serial number 1-5 visit the n sites serially; other parameters are collected from different resources. The detailed relationship among these agents and the working behaviour of each agent is as follows:

1. Local Frequent Itemset Generator Agent (LFIGA): This is a MA that carries the AgentProfile & min_th_sup. LFIGA generates and stores $L_k^{FI(i)}$ and $L_k^{FISC(i)}$ at site $S_i$ by scanning the local $DB_i$ at that site with the constraint of min_th_sup. It carries back the computational time (CPUTime) at each site $S_i$ and $TripTime_{end}$. This agent is embedded with the Apriori algorithm [15] for generating all the frequent k-itemset lists. It may be equipped with decision making capability to select other FIM algorithms based on the density of the dataset at a particular site. More details are available in Algorithm 4.

2. Local Knowledge Generator Agent (LKGA): This is a MA that carries the AgentProfile & min_th_conf. LKGA applies the constraint of min_th_conf to generate and store $L_i^{LSAR}$ by using the $L_k^{FI(i)}$ and $L_k^{FISC(i)}$ lists already generated by the LFIGA agent at site $S_i$. The $L_i^{LSAR}$ list also holds the support and confidence for a particular association rule along with the site name. It carries back the computational time (CPUTime) at each site $S_i$ and $TripTime_{end}$. Detailed steps are given in Algorithm 7.

3. Total Frequent Itemset Collector Agent (TFICA): This is a MA that carries the AgentProfile. TFICA collects the list of local frequent k-itemsets ($L_k^{FI(i)}$) generated by the LFIGA agent and carries back the list of total frequent k-itemsets ($L_k^{TFI}$) in the result bag to RM at $S_{CENTRAL}$. In addition to this resultant knowledge, it also carries back the computational time (CPUTime) at each site $S_i$ and $TripTime_{end}$. It executes Algorithm 8.

4. Local Knowledge Collector Agent (LKCA): This is a MA that carries the AgentProfile. LKCA collects the list of locally strong association rules ($L_i^{LSAR}$) generated by the LKGA agent and carries back the list of total locally strong association rules ($L^{TLSAR}$) in the result bag to RM at $S_{CENTRAL}$. In addition to this resultant knowledge, it also carries back the computational time (CPUTime) at each site $S_i$ and $TripTime_{end}$. Steps are shown in Algorithm 9.

5. Global Knowledge Dispatcher Agent (GKDA): This is a MA that carries the AgentProfile containing global knowledge ($L_{CENTRAL}^{GSAR}$). It dispatches global knowledge to every site for further decision making and comparison with the local knowledge at that site. It executes Algorithm 12.

6. Global Frequent Itemset Generator Agent (GFIGA): It is a stationary agent at $S_{CENTRAL}$, mainly used for processing the result bag of TFICA, i.e., the total frequent k-itemset list ($L_k^{TFI}$) generated by TFICA, to generate the global frequent itemset list, $L_k^{GFI}$. More details are available in Algorithm 10.

7. Global Knowledge Generator Agent (GKGA): It is also a stationary agent at $S_{CENTRAL}$, mainly used for processing the $L_k^{GFI}$ list and the $L^{TLSAR}$ list to compile the global knowledge, i.e., the list of globally strong association rules, $L_{CENTRAL}^{GSAR}$. Detailed steps are shown in Algorithm 11.

Algorithm 4 LOCAL FREQUENT ITEMSET GENERATOR AGENT (LFIGA)

Input:
    AgentProfile, a collection of agent attributes set by the AL
    min_th_sup, the given minimum threshold support
Output:
    L^FI&SC, the list of frequent itemsets and their support counts

procedure LFIGA(AgentProfile, min_th_sup)
    CPUTime_start ← get system time
    Briefcase ← get Briefcase from AgentProfile
    DB_i ← load DB_i from local file system of site S_i
    T ← DB_i.get(0)    ▷ No. of records
    I ← DB_i.get(1)    ▷ No. of items
    DB[T][I] ← DB_i.get(3)    ▷ itemset data bank
    minsupcount ← (T × min_th_sup) / 100
    ▷ generate frequent-1 itemset list (FIL_1) and support count list (FISC_1)
    CFIL_1 ← {1, 2, 3, …, I}    ▷ candidate frequent-1 itemset
    for i ← 1, I do    ▷ initialize the support count array SCFIL_1 to zero
        SCFIL_1[i] ← 0
    end for
    k ← 1
    for all candidate c ∈ CFIL_1 do
        for all transaction t ∈ DB do
            if c ⊆ t then
                SCFIL_1[k] ← SCFIL_1[k] + 1
            end if
        end for
        k ← k + 1
    end for
    ▷ prune CFIL_1 to generate FIL_1 and FISC_1
    for k ← 1, I do
        if SCFIL_1[k] ≥ minsupcount then
            add c_k ∈ CFIL_1 to FIL_1
            add SCFIL_1[k] to FISC_1
        end if
    end for
    if FIL_1 ≠ ∅ then
        add FIL_1 to L^FI
        add FISC_1 to L^FISC
    end if
    k ← 2
    while FIL_(k−1) ≠ ∅ do
        CFIL_k ← call GenerateCFIL(FIL_(k−1))    ▷ see Algorithm 5
        for i ← 1, CFIL_k.length do    ▷ initialize the array SCFIL_k to zero
            SCFIL_k[i] ← 0
        end for
        i ← 1
        for all candidate c ∈ CFIL_k do    ▷ find support count for every candidate
            for all transaction t ∈ DB do    ▷ scan DB
                if c ⊆ t then
                    SCFIL_k[i] ← SCFIL_k[i] + 1
                end if
            end for
            i ← i + 1
        end for
        ▷ prune CFIL_k to generate FIL_k and FISC_k
        for i ← 1, SCFIL_k.length do
            if SCFIL_k[i] ≥ minsupcount then
                add c_i ∈ CFIL_k to FIL_k
                add SCFIL_k[i] to FISC_k
            end if
        end for
        if FIL_k ≠ ∅ then
            add FIL_k to L^FI
            add FISC_k to L^FISC
        end if
        k ← k + 1
    end while
    add T to L^FI&SC
    add L^FI to L^FI&SC
    add L^FISC to L^FI&SC
    save L^FI&SC in the local file system of this site S_i
    CPUTime_end ← get system time
    CPUTime ← CPUTime_end − CPUTime_start
    add CPUTime to Result_S_i
    add Result_S_i to Briefcase
    add updated Briefcase to AgentProfile
    L^NODES ← get itinerary list from AgentProfile
    L^NODES ← remove first IP address from L^NODES    ▷ visited site
    add updated L^NODES to AgentProfile
    if L^NODES ≠ ∅ then    ▷ itinerary not empty
        AObject ← new LFIGA(AgentProfile, min_th_sup)
        add AObject to AgentProfile
        transfer AgentProfile to DM_AEE at first IP address in L^NODES
    else
        TripTime_end ← get system time for end of agent journey
        add TripTime_end to Briefcase
        add updated Briefcase to AgentProfile
        transfer AgentProfile to RM at S_CENTRAL
    end if
end procedure
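The first pass of Algorithm 4 (counting single items and pruning against minsupcount) can be sketched in Java as follows. The binary matrix representation `db[t][i]`, the class name and the returned map are assumptions for this illustration; the paper only states that binary versions of the datasets exist.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative first Apriori pass over a binary data bank (hypothetical names). */
final class FrequentOneItemsetSketch {

    /**
     * db[t][i] is true when transaction t contains item i.
     * Returns a map from 1-based item id to support count for the frequent 1-itemsets.
     */
    static Map<Integer, Integer> frequentOneItemsets(boolean[][] db, int items, double minThSup) {
        int records = db.length;
        // minsupcount = (T x min_th_sup) / 100, as in Algorithm 4
        double minSupCount = records * minThSup / 100.0;
        int[] supCount = new int[items]; // zero-initialised support count array
        for (boolean[] transaction : db) {
            for (int i = 0; i < items; i++) {
                if (transaction[i]) {
                    supCount[i]++;
                }
            }
        }
        Map<Integer, Integer> frequent = new LinkedHashMap<>();
        for (int i = 0; i < items; i++) {
            if (supCount[i] >= minSupCount) {
                frequent.put(i + 1, supCount[i]); // prune step: keep only frequent items
            }
        }
        return frequent;
    }

    public static void main(String[] args) {
        boolean[][] db = {
            {true, true, false},
            {true, false, false},
            {true, false, true},
            {false, true, false}
        };
        System.out.println(frequentOneItemsets(db, 3, 50.0));
    }
}
```

With four transactions and min_th_sup of 50, minsupcount is 2, so items 1 and 2 survive and item 3 is pruned.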


Algorithm 5 GENERATECFIL

Input: L_(k−1), frequent (k−1)-itemsets
Output: C_k, candidate frequent k-itemsets

procedure GENERATECFIL(L_(k−1))
    for all itemset l_1 ∈ L_(k−1) do
        for all itemset l_2 ∈ L_(k−1) do
            if (l_1[1] = l_2[1]) ∧ (l_1[2] = l_2[2]) ∧ … ∧ (l_1[k−2] = l_2[k−2]) ∧ (l_1[k−1] < l_2[k−1]) then
                c ← l_1 ⊗ l_2    ▷ join step: generate candidates
                if HASINFREQUENTSUBSET(c, L_(k−1)) then    ▷ see Algorithm 6
                    delete c
                else
                    add c to C_k
                end if
            end if
        end for
    end for
    return C_k
end procedure

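The join step of Algorithm 5 can be sketched in Java as below, using the standard Apriori join condition (equal first k−2 items, strictly smaller last item in the first itemset). The class name `CandidateJoinSketch` is hypothetical, itemsets are assumed to be non-empty sorted lists of item ids, and pruning is deliberately left out of this block.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative Apriori join step over sorted (k-1)-itemsets (hypothetical names). */
final class CandidateJoinSketch {

    /** Joins pairs of itemsets that share their first k-2 items; pruning is left to the caller. */
    static List<List<Integer>> join(List<List<Integer>> previous) {
        List<List<Integer>> candidates = new ArrayList<>();
        for (List<Integer> l1 : previous) {
            for (List<Integer> l2 : previous) {
                if (l1.isEmpty() || l1.size() != l2.size()) {
                    continue; // assumption: all itemsets of one level have the same non-zero size
                }
                int last = l1.size() - 1;
                boolean samePrefix = l1.subList(0, last).equals(l2.subList(0, last));
                if (samePrefix && l1.get(last) < l2.get(last)) {
                    List<Integer> candidate = new ArrayList<>(l1);
                    candidate.add(l2.get(last)); // l1 joined with the last item of l2
                    candidates.add(candidate);
                }
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        System.out.println(join(List.of(List.of(1, 2), List.of(1, 3), List.of(2, 3))));
    }
}
```

Requiring the last item of the first itemset to be strictly smaller avoids generating the same candidate twice.
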
Algorithm 6 HASINFREQUENTSUBSET

Input: c, candidate k-itemset; L_(k−1), frequent (k−1)-itemsets
Output: TRUE if c has an infrequent (k−1)-subset, FALSE otherwise

procedure HASINFREQUENTSUBSET(c, L_(k−1))
    for all (k−1)-subset s of c do
        if s ∉ L_(k−1) then
            return TRUE
        end if
    end for
    return FALSE
end procedure

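The prune check of Algorithm 6 can be sketched in Java as follows. The names are hypothetical, and itemsets are assumed to be sorted lists so that a (k−1)-subset obtained by dropping one item compares equal to the stored frequent itemset.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustrative Apriori prune check (hypothetical names). */
final class PruneSketch {

    /** Returns true when some (k-1)-subset of the candidate is not a frequent (k-1)-itemset. */
    static boolean hasInfrequentSubset(List<Integer> candidate, Set<List<Integer>> previousFrequent) {
        for (int drop = 0; drop < candidate.size(); drop++) {
            List<Integer> subset = new ArrayList<>(candidate);
            subset.remove(drop); // int argument: removes the element at this index
            if (!previousFrequent.contains(subset)) {
                return true;
            }
        }
        return false; // every (k-1)-subset was found, so the candidate survives
    }

    public static void main(String[] args) {
        Set<List<Integer>> l2 = Set.of(List.of(1, 2), List.of(1, 3), List.of(2, 3));
        System.out.println(hasInfrequentSubset(List.of(1, 2, 3), l2));
    }
}
```

The result is only decided as FALSE after all subsets have been checked; returning inside the loop on the first frequent subset would let candidates with a later infrequent subset through.
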
Algorithm 7 LOCAL KNOWLEDGE GENERATOR AGENT (LKGA)

Input:
    AgentProfile, a collection of agent attributes set by the AL
    min_th_conf, the given minimum threshold confidence
Output:
    L^LSAR, the list of locally strong association rules

procedure LKGA(AgentProfile, min_th_conf)
    CPUTime_start ← get system time
    Briefcase ← get Briefcase from AgentProfile
    L^FI&SC ← load L^FI&SC from local file system of this site S_i
    T ← L^FI&SC.get(0)    ▷ No. of records
    L^FI ← L^FI&SC.get(1)    ▷ frequent k-itemset list
    L^FISC ← L^FI&SC.get(2)    ▷ support count list
    for k ← 2, L^FI.size do
        L_k ← L^FI.get(k)    ▷ get frequent k-itemset list
        for all l ∈ L_k do
            l_subsets ← generate all non-empty subsets of l
            l_spcount ← get support count of l from L^FISC
            AR_support ← (l_spcount / T) × 100    ▷ support of the association rule
            for all non-empty subset s ∈ l_subsets do
                s_spcount ← get support count of s from L^FISC
                AR_conf ← (l_spcount / s_spcount) × 100    ▷ confidence of the association rule
                if AR_conf ≥ min_th_conf then
                    AR_strong ← "s ⇒ l − s [AR_support %, AR_conf %]"
                    print AR_strong
                    add l to AR_strong
                    S_i^IP ← get IP address of this site S_i
                    add S_i^IP to AR_strong
                    add AR_strong to L_i^LSAR
                end if
            end for
        end for
    end for
    save L_i^LSAR in the local file system of this site S_i
    CPUTime_end ← get system time
    CPUTime ← CPUTime_end − CPUTime_start
    add CPUTime to Result_S_i
    add Result_S_i to Briefcase
    add updated Briefcase to AgentProfile
    L^NODES ← get itinerary list from AgentProfile
    L^NODES ← remove first IP address from L^NODES    ▷ visited site
    add updated L^NODES to AgentProfile
    if L^NODES ≠ ∅ then    ▷ itinerary not empty
        AObject ← new LKGA(AgentProfile, min_th_conf)
        add AObject to AgentProfile
        transfer AgentProfile to DM_AEE at first IP address in L^NODES
    else
        TripTime_end ← get system time for end of agent journey
        add TripTime_end to Briefcase
        add updated Briefcase to AgentProfile
        transfer AgentProfile to RM at S_CENTRAL
    end if
end procedure


Algorithm 8 TOTAL FREQUENT ITEMSET COLLECTOR AGENT (TFICA)

Input: AgentProfile, a collection of agent attributes set by the AL
Output: L^FI, the list of locally frequent itemsets

procedure TFICA(AgentProfile)
    CPUTime_start ← get system time
    Briefcase ← get Briefcase from AgentProfile
    L^FI&SC ← load L^FI&SC from local file system of this site S_i
    L^FI ← L^FI&SC.get(1)    ▷ frequent k-itemset list
    add L^FI to Result_S_i
    CPUTime_end ← get system time
    CPUTime ← CPUTime_end − CPUTime_start
    add CPUTime to Result_S_i
    add Result_S_i to Briefcase
    add updated Briefcase to AgentProfile
    L^NODES ← get itinerary list from AgentProfile
    L^NODES ← remove first IP address from L^NODES    ▷ visited site
    add updated L^NODES to AgentProfile
    if L^NODES ≠ ∅ then    ▷ itinerary not empty
        AObject ← new TFICA(AgentProfile)
        add AObject to AgentProfile
        transfer AgentProfile to DM_AEE at first IP address in L^NODES
    else
        TripTime_end ← get system time for end of agent journey
        add TripTime_end to Briefcase
        add updated Briefcase to AgentProfile
        transfer AgentProfile to RM at S_CENTRAL
    end if
end procedure

Algorithm 9 LOCAL KNOWLEDGE COLLECTOR AGENT (LKCA)

Input: AgentProfile, a collection of agent attributes set by the AL
Output: L^LSAR, the list of locally strong association rules

procedure LKCA(AgentProfile)
    CPUTime_start ← get system time
    Briefcase ← get Briefcase from AgentProfile
    L^LSAR ← load L^LSAR from local file system of this site S_i
    add L^LSAR to Result_S_i
    CPUTime_end ← get system time
    CPUTime ← CPUTime_end − CPUTime_start
    add CPUTime to Result_S_i
    add Result_S_i to Briefcase
    add updated Briefcase to AgentProfile
    L^NODES ← get itinerary list from AgentProfile
    L^NODES ← remove first IP address from L^NODES    ▷ visited site
    add updated L^NODES to AgentProfile
    if L^NODES ≠ ∅ then    ▷ itinerary not empty
        AObject ← new LKCA(AgentProfile)
        add AObject to AgentProfile
        transfer AgentProfile to DM_AEE at first IP address in L^NODES
    else
        TripTime_end ← get system time for end of agent journey
        add TripTime_end to Briefcase
        add updated Briefcase to AgentProfile
        transfer AgentProfile to RM at S_CENTRAL
    end if
end procedure

Algorithm 10 GLOBAL FREQUENT ITEMSET GENERATOR AGENT (GFIGA)

Input: Briefcase, result bag of TFICA agent
Output: L^GFI, the list of global frequent itemsets

procedure GFIGA(Briefcase)
    CPUTime_start ← get system time
    L^TFI ← retrieve total frequent itemsets (∪_(i=1..n) L_i^FI) from Briefcase
    L^GFI ← retrieve global frequent itemsets (∩_(i=1..n) L_i^FI) from Briefcase
    print L^GFI
    save L^GFI in the local file system of site S_CENTRAL
    CPUTime_end ← get system time
    CPUTime ← CPUTime_end − CPUTime_start
    print CPUTime
    return
end procedure

Algorithm 11 GLOBAL KNOWLEDGE GENERATOR AGENT (GKGA)

Input: Briefcase, result bag of LKCA agent
Output: L^GSAR_CENTRAL, the list of globally strong association rules

procedure GKGA(Briefcase)
    CPUTime_start ← get system time
    L^TLSAR ← retrieve total strong rules (∪_(i=1..n) L_i^LSAR) from Briefcase
    L^GFI ← load global frequent itemsets (L^GFI) from S_CENTRAL
    for all AR_strong ∈ L^TLSAR do
        L ← get frequent itemset from AR_strong
        if L ∈ L^GFI then
            print AR_strong along with the site address (S_i^IP)
            add AR_strong to L^GSAR_CENTRAL
        end if
    end for
    save L^GSAR_CENTRAL in the local file system of site S_CENTRAL
    CPUTime_end ← get system time
    CPUTime ← CPUTime_end − CPUTime_start
    print CPUTime
    return
end procedure
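The filtering performed by Algorithm 11 can be sketched in Java as follows: a locally strong rule is kept as globally strong when the frequent itemset it was derived from appears in the global frequent itemset list. The `LocalRule` class, its fields and the method name are assumptions introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustrative filter from locally strong to globally strong rules (hypothetical names). */
final class GlobalRuleFilterSketch {

    /** A locally strong rule with the itemset it was derived from and its site address. */
    static final class LocalRule {
        final String text;
        final Set<Integer> itemset;
        final String siteIp;

        LocalRule(String text, Set<Integer> itemset, String siteIp) {
            this.text = text;
            this.itemset = itemset;
            this.siteIp = siteIp;
        }
    }

    /** Keeps the rules whose source itemset is globally frequent. */
    static List<LocalRule> globallyStrong(List<LocalRule> totalLocalRules, Set<Set<Integer>> globalFrequent) {
        List<LocalRule> result = new ArrayList<>();
        for (LocalRule rule : totalLocalRules) {
            if (globalFrequent.contains(rule.itemset)) {
                result.add(rule);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<LocalRule> rules = List.of(
            new LocalRule("1 => 2", Set.of(1, 2), "10.0.0.1"),
            new LocalRule("2 => 3", Set.of(2, 3), "10.0.0.2"));
        for (LocalRule r : globallyStrong(rules, Set.of(Set.of(1, 2)))) {
            System.out.println(r.text + " from " + r.siteIp);
        }
    }
}
```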

Algorithm 12 GLOBAL KNOWLEDGE DISPATCHER AGENT (GKDA)

Input: AgentProfile, a collection of agent attributes set by the AL

Output: Dispatch L^GSAR_CENTRAL at each distributed site S_i

procedure GKDA(AgentProfile)
    CPUTime_start ← get system time
    Briefcase ← get Briefcase from AgentProfile
    L^GSAR_CENTRAL ← get L^GSAR_CENTRAL from Briefcase
    save L^GSAR_CENTRAL in the local file system of site S_i
    CPUTime_end ← get system time
    CPUTime ← CPUTime_end − CPUTime_start
    add CPUTime to Result_S_i
    add Result_S_i to Briefcase
    add updated Briefcase to AgentProfile
    L_NODES ← get itinerary list from AgentProfile
    remove first IP address from L_NODES                        ▷ visited site
    add updated L_NODES to AgentProfile
    if L_NODES ≠ ∅ then                                         ▷ itinerary not empty
        AObject ← new GKDA(AgentProfile)
        add AObject to AgentProfile
        transfer AgentProfile to DM_AEE at first IP address in L_NODES
    else
        TripTime_end ← get system time for end of agent journey
        add TripTime_end to Briefcase
        add updated Briefcase to AgentProfile
        transfer AgentProfile to RM at S_CENTRAL
    end if
end procedure
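The serial itinerary logic at the end of GKDA (drop the visited site, then either move on or return home) can be sketched in a few lines of Java. This is an illustration only: the class `ItinerarySketch` and the method `nextHop` are hypothetical, agent transfer is reduced to printing the next address, and the IP addresses are those listed in Table 1.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of the serial itinerary handling; all names are illustrative.
class ItinerarySketch {

    // Drops the site just visited and returns where the agent should go next:
    // the new head of the itinerary, or the central site once the itinerary is empty.
    static String nextHop(Deque<String> itinerary, String centralIp) {
        itinerary.pollFirst();
        return itinerary.isEmpty() ? centralIp : itinerary.peekFirst();
    }

    public static void main(String[] args) {
        Deque<String> itinerary = new ArrayDeque<>(
                List.of("192.168.46.212", "192.168.46.189", "192.168.46.213"));
        String central = "192.168.46.5";

        // Simulate the agent visiting each site in turn.
        while (!itinerary.isEmpty()) {
            String current = itinerary.peekFirst();
            String next = nextHop(itinerary, central);
            System.out.println(current + " -> " + next);
        }
    }
}
```

With the three sites of Table 1 in the itinerary, the last hop printed is from the final site back to the central site, which corresponds to the else branch of the listing.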


Figure 2. Control Panel of AeMGSAR

4.IMPLEMENTATION AND PERFORMANCE STUDY

All the agents, as well as the control panel shown in Figure 2, are implemented in Java. Synthetic datasets (DB_i) with 3500, 3850 and 3900 transactions, respectively, and 10 items each were generated using the Transactional Data Set Generator (TDSG) tool [16] and stored across three distributed sites S_1, S_2 and S_3. Binary and transactional versions of these datasets are shown in Appendix A. The required configuration of the system is shown in Table 1, with additional deployment of DM_AEE at each distributed site and of AL and RM at S_CENTRAL. The round trip time taken by the various MAs is shown in Figure 3. The CPU time consumed by the various MAs at sites S_1, S_2 and S_3 is shown in Figure 4, Figure 5 and Figure 6, respectively. The CPU time for GFIGA and GKGA is 101357102 nanoseconds and 33317458 nanoseconds, respectively. The lists L^FI_k(i) and L^FISC_k(i) generated at the distributed sites by the LFIGA agent with 20% min_th_sup are shown in Appendix B.1, B.2 and B.3. The lists L^LSAR_i generated at the distributed sites by the LKGA agent with 50% min_th_conf are shown in Appendix B.4, B.5 and B.6. The globally frequent itemsets generated by GFIGA at S_CENTRAL are shown in Figure 7. Fifteen 2-itemsets and eight 3-itemsets in the L^TFI_k list are globally frequent, whereas the 4-, 5- and 6-itemsets, which are locally frequent, are not globally frequent. The globally strong association rules (L^GSAR_CENTRAL) generated by GKGA at S_CENTRAL for globally frequent 3-itemsets are shown in Figure 8, and L^GSAR_CENTRAL for 2-itemsets is shown in Appendix B.7.
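The CPU times above are reported in nanoseconds, which corresponds to roughly 101.4 ms for GFIGA and 33.3 ms for GKGA. A minimal Java sketch of this kind of measurement and conversion follows. It assumes that "get system time" in the algorithms is realised with `System.nanoTime()`, which measures elapsed time rather than true per-thread CPU time; the class and method names are hypothetical.

```java
// Hypothetical sketch of timing an agent task; all names are illustrative.
class CpuTimeSketch {

    // Runs the task and returns the elapsed time in nanoseconds (end minus start).
    static long elapsedNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        long end = System.nanoTime();
        return end - start;
    }

    // Converts nanoseconds to milliseconds.
    static double toMillis(long nanos) {
        return nanos / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Values reported in the text for GFIGA and GKGA.
        System.out.println("GFIGA ms: " + toMillis(101357102L));
        System.out.println("GKGA ms: " + toMillis(33317458L));

        // Timing an arbitrary placeholder task.
        long nanos = elapsedNanos(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
        });
        System.out.println("Placeholder task ns: " + nanos);
    }
}
```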

Compared with the traditional central data warehouse (DW) based approach to ARM, in which the entire data from the distributed sites is collected centrally in a DW [17], this system reduces the storage cost because data is mined locally and only the resultant knowledge is carried to the central site by mobile agents. Because the resultant data carried by the mobile agents is small, the network communication cost is also reduced. Data mining is performed locally by the agents, so the computational cost at the central site is also minimised. AeMGSAR reflects the global knowledge because all the strong association rules generated are also strong at each distributed site. The system relies on Java's built-in security system. Because MAs are scalable in nature, performance is not expected to be affected by adding more sites.

Table 1. Network Configuration

Site Name    Processor    OS        LAN Configuration
                                    IP (a)            Network
S_CENTRAL    Intel (b)    MS (c)    192.168.46.5      NW (d)
S_1          Intel (b)    MS (c)    192.168.46.212    NW (d)
S_2          Intel (b)    MS (c)    192.168.46.189    NW (d)
S_3          Intel (b)    MS (c)    192.168.46.213    NW (d)

a. IP address with Mask: 255.255.255.0 and Gateway 192.168.46.1

b. Intel Pentium Dual Core(3.40 GHz, 3.40 GHz) with 512 MB RAM