
DATA ANALYSIS USING

APRIORI ALGORITHM & NEURAL NETWORK

By

Ashutosh Padhi

Registration Number:
FMS/MBA/2015-17/000246

Sri Sri University


CUTTACK - 754006
May-June, 2016

DATA ANALYSIS USING
APRIORI ALGORITHM & NEURAL NETWORK

By

Ashutosh Padhi

Under the guidance of

Dr. Bhagirathi Nayak


Assistant Professor,
Sri Sri University, Cuttack

Sri Sri University


CUTTACK - 754006
May-June, 2016

1. INTRODUCTION
1.1. Mining Frequent Itemsets – Apriori Algorithm

The Apriori algorithm was proposed by Agrawal and Srikant in 1994. Apriori is designed to
operate on databases containing transactions (for example, collections of items bought by
customers, or details of website visits). Other algorithms are designed for finding
association rules in data having no transactions (Winepi and Minepi), or having no timestamps
(DNA sequencing). Each transaction is seen as a set of items (an itemset). Given a support
threshold C, the Apriori algorithm identifies the item sets which are subsets of at least C
transactions in the database.

Apriori uses a "bottom up" approach, where frequent subsets are extended one item at a time
(a step known as candidate generation), and groups of candidates are tested against the data.
The algorithm terminates when no further successful extensions are found.

Apriori uses breadth-first search and a hash tree structure to count candidate item sets
efficiently. It generates candidate item sets of length k from item sets of length k - 1. Then it
prunes the candidates which have an infrequent sub-pattern. According to the downward closure
lemma, the candidate set contains all frequent k-length item sets. After that, it scans the
transaction database to determine frequent item sets among the candidates.

The pseudo code for the algorithm is given below for a transaction database T and a support
threshold of ε. Usual set-theoretic notation is employed, though note that T is a multiset. C_k is
the candidate set for level k. At each step, the algorithm is assumed to generate the candidate
sets from the large item sets of the preceding level, heeding the downward closure lemma.
count[c] accesses a field of the data structure that represents candidate set c, which is initially
assumed to be zero. Many details are omitted; usually the most important part of the
implementation is the data structure used for storing the candidate sets and counting their
frequencies.
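Since that pseudo code is not reproduced in this report, the following is a minimal Python sketch of the candidate-generation-and-count loop described above. The function and variable names are illustrative only (they are not taken from the report); the transaction database is assumed to be a list of item sets and the support threshold a minimum transaction count.

from itertools import combinations

def apriori(transactions, min_support_count):
    # Return every itemset contained in at least min_support_count transactions
    transactions = [frozenset(t) for t in transactions]
    items = {i for t in transactions for i in t}
    # Frequent 1-itemsets
    current = {frozenset([i]) for i in items
               if sum(1 for t in transactions if i in t) >= min_support_count}
    frequent = set(current)
    k = 2
    while current:
        # Candidate generation: join frequent (k-1)-itemsets into k-itemsets ...
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # ... and prune candidates with an infrequent (k-1)-subset (downward closure)
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k - 1))}
        # Scan the database and keep the candidates that are frequent
        current = {c for c in candidates
                   if sum(1 for t in transactions if c <= t) >= min_support_count}
        frequent |= current
        k += 1
    return frequent

# Toy database of four transactions, minimum support count of 2
db = [{"Bread", "Butter", "Eggs"}, {"Butter", "Eggs", "Milk"}, {"Butter"}, {"Bread", "Butter"}]
print(sorted(tuple(sorted(s)) for s in apriori(db, 2)))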

Theoretical aspects

In data mining, association rule learning is a popular and well researched method for discovering
interesting relations between variables in large databases. Piatetsky-Shapiro describes analyzing
and presenting strong rules discovered in databases using different measures of interestingness.
Based on the concept of strong rules, Agrawal introduced association rules for discovering
regularities between products in large scale transaction data recorded by point-of-sale (POS)
systems in supermarkets. For example, the rule {onion, potatoes} => {burger} found in the sales
data of a supermarket would indicate that if a customer buys onions and potatoes together, he
or she is likely to also buy burgers. Such information can be used as the basis for decisions about
marketing activities such as promotional pricing or product placements. In addition to the
above example from market basket analysis, association rules are employed today in many
application areas including Web usage mining, intrusion detection and bioinformatics. In
computer science and data mining, Apriori is a classic algorithm for learning association rules.


Example 1: A grocery store has weekly specials for which advertising supplements are created
for the local newspaper. When an item, such as peanut butter, has been designated to go on
sale, management determines what other items are frequently purchased with peanut butter.
They find that bread is purchased with peanut butter 30% of the time and that jelly is purchased
with it 40% of the time. Based on these associations, special displays of jelly and bread are placed
near the peanut butter which is on sale. They also decide not to put these items on sale. These
actions are aimed at increasing overall sales volume by taking advantage of the frequency with
which these items are purchased together.

There are two association rules mentioned in Example 1. The first one states that when
peanut butter is purchased, bread is purchased 30% of the time. The second one states that 40%
of the time when peanut butter is purchased so is jelly. Association rules are often used by retail
stores to analyze market basket transactions. The discovered association rules can be used by
management to increase the effectiveness (and reduce the cost) associated with advertising,
marketing, inventory, and stock location on the floor. Association rules are also used for other
applications such as prediction of failure in telecommunications networks by identifying what
events occur before a failure. Most of our emphasis in this paper will be on market basket
analysis; however, in later sections we will look at other applications as well.
The objective of this paper is to provide a thorough survey of previous research on association
rules. In the next section we give a formal definition of association rules. Section 3 contains the
description of sequential and parallel algorithms as well as other algorithms to find association
rules. Section 4 provides a new classification and comparison of the basic algorithms. Section 5
presents generalization and extension of association rules. In Section 6 we examine the
generation of association rules when the database is being modified. In appendices we provide
information on different association rule products, data source and source code available in the
market, and include a table summarizing notation used throughout the paper.

ASSOCIATION RULE PROBLEM
A formal statement of the association rule problem is [Agrawal1993] [Cheung1996c]:
Definition 1: Let I = {I1, I2, …, Im} be a set of m distinct attributes, also called literals. Let D be a
database, where each record (tuple) T has a unique identifier and contains a set of items such
that T ⊆ I. An association rule is an implication of the form X ⇒ Y, where X, Y ⊂ I are sets of items
called itemsets, and X ∩ Y = ∅. Here, X is called the antecedent, and Y the consequent.

Two important measures for association rules, support (s) and confidence (α), can be defined
as follows.

Definition 2: The support (s) of an association rule is the ratio (in percent) of the records that
contain X ∪ Y to the total number of records in the database.

Therefore, if we say that the support of a rule is 5%, it means that 5% of the total records
contain X ∪ Y. Support is the statistical significance of an association rule. Grocery store
managers probably would not be concerned about how peanut butter and bread are related if
less than 5% of store transactions have this combination of purchases. While a high support is
often desirable for association rules, this is not always the case. For example, if we were using
association rules to predict the failure of telecommunications switching nodes based on what set
of events occur prior to failure, even if these events do not occur very frequently association rules
showing this relationship would still be important.

Definition 3: For a given number of records, confidence (α) is the ratio (in percent) of the number
of records that contain X ∪ Y to the number of records that contain X.

Thus, if we say that a rule has a confidence of 85%, it means that 85% of the records
containing X also contain Y. The confidence of a rule indicates the degree of correlation in the
dataset between X and Y. Confidence is a measure of a rule’s strength. Often a large confidence
is required for association rules. If a set of events occur a small percentage of the time before a
switch failure or if a product is purchased only very rarely with peanut butter, these relationships
may not be of much use for management.
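A minimal sketch of these two measures in Python, assuming the transactions are given as a list of item sets (the helper names are illustrative, not part of the surveyed work):

def support(itemset, transactions):
    # Definition 2: fraction of records that contain every item of the itemset
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)

def confidence(x, y, transactions):
    # Definition 3: support(X ∪ Y) divided by support(X) for the rule X ⇒ Y
    return support(set(x) | set(y), transactions) / support(x, transactions)

db = [{"Bread", "Butter"}, {"Butter", "Eggs"}, {"Butter"}, {"Bread", "Butter"}]
print(support({"Bread", "Butter"}, db))        # 0.5  (50% of the records)
print(confidence({"Bread"}, {"Butter"}, db))   # 1.0  (every record with Bread also has Butter)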
Mining of association rules from a database consists of finding all rules that meet the user-
specified threshold support and confidence. The problem of mining association rules can be
decomposed into two subproblems [Agrawal1994] as stated in Algorithm 1.

Algorithm 1. Basic:
Input:
I, D, s, α
Output:
Association rules satisfying s and α

Algorithm:
1) Find all sets of items which occur with a frequency that is greater than or equal to the
user-specified threshold support, s.
2) Generate the desired rules using the large itemsets, which have user-specified threshold
confidence, α.

The first step in Algorithm 1 finds large or frequent item sets. Item sets other than those are
referred to as small item sets. Here an item set is a subset of the total set of items of interest from
the database. An interesting (and useful) observation about large item sets is that:
If an item set X is small, any superset of X is also small.
Of course the contrapositive of this statement (if X is a large item set, then so is any subset of X)
is also important to remember. In the remainder of this paper we use L to designate the set of
large item sets. The second step in Algorithm 1 finds association rules using large item sets
obtained in the first step. Example 2 illustrates this basic process for finding association rules
from large item sets.
Example 2: Consider a small database with four items I={Bread, Butter, Eggs, Milk} and four
transactions as shown in Table 1. Table 2 shows all item sets for I. Suppose that the minimum
support and minimum confidence of an association rule are 40% and 60%, respectively. There
are several potential association rules. For discussion purposes we only look at those in Table 3.
At first, we have to find out whether all sets of items in those rules are large. Secondly, we have
to verify whether a rule has a confidence of at least 60%. If the above conditions are satisfied for
a rule, we can say that there is enough evidence to conclude that the rule holds with a confidence
of 60%. Itemsets associated with the aforementioned rules are: {Bread, Butter}, and {Butter,
Eggs}. The support of each individual itemset is at least 40% (see Table 2). Therefore, all of these
itemsets are large. The confidence of each rule is presented in Table 3. It is evident that the first
rule (Bread ⇒ Butter) holds. However, the second rule (Butter ⇒ Eggs) does not hold because
its confidence is less than 60%.

Table 1 Transaction Database for Example 2


Transaction ID  Items
T1  Bread, Butter, Eggs
T2  Butter, Eggs, Milk
T3  Butter
T4  Bread, Butter

Table 2 Support for Itemsets in Table 1 and Large Itemsets with a support of 40%
Itemset Support, s Large/Small
Bread 50% Large
Butter 100% Large
Eggs 50% Large
Milk 25% Small
Bread, Butter 50% Large
Bread, Eggs 25% Small
Bread, Milk 0% Small
Butter, Eggs 50% Large
Butter, Milk 25% Small
Eggs, Milk 25% Small
Bread, Butter, Eggs 25% Small
Bread, Butter, Milk 0% Small
Bread, Eggs, Milk 0% Small
Butter, Eggs, Milk 25% Small
Bread, Butter, Eggs, Milk 0% Small

Table 3 Confidence of Some Association Rules for Example 2 where α = 60%


Rule  Confidence  Rule Holds
Bread ⇒ Butter  100%  Yes
Butter ⇒ Bread  50%  No
Butter ⇒ Eggs  50%  No
Eggs ⇒ Butter  100%  Yes

The identification of the large itemsets is computationally expensive [Agrawal1994].


However, once all sets of large itemsets (l ∈ L) are obtained, there is a straightforward algorithm
for finding association rules given in [Agrawal1994], which is restated in Algorithm 2.

Algorithm 2. Find Association Rules Given Large Itemsets:


Input:
I, D, s, L
Output:
Association rules satisfying s and α

Algorithm:
1) Find all nonempty subsets, x, of each large itemset, l ∈ L.
2) For every subset, obtain a rule of the form x ⇒ (l - x) if the ratio of the frequency of
occurrence of l to that of x is greater than or equal to the threshold confidence.

For example, suppose we want to see whether the first rule (Bread ⇒ Butter) holds for
Example 2. Here l = {Bread, Butter}, and x = {Bread}. Therefore, (l - x) = {Butter}. Now, the ratio
of support(Bread, Butter) to support(Bread) is 100%, which is greater than the minimum
confidence. Therefore, the rule holds. For a better understanding, let us consider the third rule,
Butter ⇒ Eggs, where x = {Butter} and (l - x) = {Eggs}. The ratio of support(Butter, Eggs) to
support(Butter) is 50%, which is less than 60%. Therefore, we can say that there is not enough
evidence to conclude {Butter} ⇒ {Eggs} with 60% confidence.
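The same check can be expressed as a short Python sketch of Algorithm 2, assuming the large itemsets and their supports (here taken from Table 2) are already known; the names are illustrative only.

from itertools import combinations

def generate_rules(large_itemsets, support, min_confidence):
    # Algorithm 2: for each large itemset l and each nonempty subset x of l,
    # emit the rule x ⇒ (l - x) when support(l) / support(x) meets the confidence threshold
    rules = []
    for l in large_itemsets:
        for r in range(1, len(l)):
            for x in combinations(sorted(l), r):
                x = frozenset(x)
                conf = support[frozenset(l)] / support[x]
                if conf >= min_confidence:
                    rules.append((set(x), set(l) - set(x), conf))
    return rules

# Supports of the large itemsets of Table 2 (as fractions of the four transactions)
sup = {frozenset(k): v for k, v in {
    ("Bread",): 0.50, ("Butter",): 1.00, ("Eggs",): 0.50,
    ("Bread", "Butter"): 0.50, ("Butter", "Eggs"): 0.50}.items()}
large = [{"Bread", "Butter"}, {"Butter", "Eggs"}]
for x, y, c in generate_rules(large, sup, 0.60):
    print(sorted(x), "=>", sorted(y), f"(confidence {c:.0%})")

Running this sketch on the Example 2 data prints exactly the two rules that hold in Table 3, Bread ⇒ Butter and Eggs ⇒ Butter, each with 100% confidence.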
Since finding large itemsets in a huge database is very expensive and dominates the
overall cost of mining association rules, most research has been focused on developing efficient
algorithms to solve step 1 in Algorithm 1 [Agrawal1994] [Cheung1996c] [Klemettinen1994]. The
following section provides an overview of these algorithms.

1.2. Introduction to neural networks


What is a Neural Network?

An Artificial Neural Network (ANN) is an information processing paradigm that is inspired by the
way biological nervous systems, such as the brain, process information. The key element of this
paradigm is the novel structure of the information processing system. It is composed of a large
number of highly interconnected processing elements (neurons) working in unison to solve
specific problems. ANNs, like people, learn by example. An ANN is configured for a specific
application, such as pattern recognition or data classification, through a learning process.
Learning in biological systems involves adjustments to the synaptic connections that exist
between the neurones. This is true for ANNs as well.

Historical background

Neural network simulations appear to be a recent development. However, this field was
established before the advent of computers, and has survived at least one major setback and
several eras.

Many important advances have been boosted by the use of inexpensive computer emulations.
Following an initial period of enthusiasm, the field survived a period of frustration and disrepute.
During this period when funding and professional support was minimal, important advances were
made by relatively few researchers. These pioneers were able to develop convincing technology
which surpassed the limitations identified by Minsky and Papert. Minsky and Papert published a
book in 1969 in which they summed up a general feeling of frustration with neural networks
among researchers; the book was accepted by most without further analysis. Currently, the
neural network field enjoys a resurgence of interest and a corresponding increase in funding.

The first artificial neuron was produced in 1943 by the neurophysiologist Warren McCulloch and
the logician Walter Pitts. But the technology available at that time did not allow them to do too
much.

Why use neural networks?

Neural networks, with their remarkable ability to derive meaning from complicated or imprecise
data, can be used to extract patterns and detect trends that are too complex to be noticed by
either humans or other computer techniques. A trained neural network can be thought of as an
"expert" in the category of information it has been given to analyze. This expert can then be used
to provide projections given new situations of interest and answer "what if" questions.
Other advantages include:

1. Adaptive learning: An ability to learn how to do tasks based on the data given for training
or initial experience.
2. Self-Organization: An ANN can create its own organization or representation of the
information it receives during learning time.
3. Real Time Operation: ANN computations may be carried out in parallel, and special
hardware devices are being designed and manufactured which take advantage of this
capability.
4. Fault Tolerance via Redundant Information Coding: Partial destruction of a network leads
to the corresponding degradation of performance. However, some network capabilities
may be retained even with major network damage.

Neural networks versus conventional computers

Neural networks take a different approach to problem solving than that of conventional
computers. Conventional computers use an algorithmic approach i.e. the computer follows a set
of instructions in order to solve a problem. Unless the specific steps that the computer needs to
follow are known, the computer cannot solve the problem. That restricts the problem solving
capability of conventional computers to problems that we already understand and know how to
solve. But computers would be so much more useful if they could do things that we don't exactly
know how to do.

Neural networks process information in a similar way the human brain does. The network is
composed of a large number of highly interconnected processing elements (neurones) working
in parallel to solve a specific problem. Neural networks learn by example. They cannot be
programmed to perform a specific task. The examples must be selected carefully, otherwise
useful time is wasted or, even worse, the network might function incorrectly. The
disadvantage is that because the network finds out how to solve the problem by itself, its
operation can be unpredictable.

On the other hand, conventional computers use a cognitive approach to problem solving; the
way the problem is to be solved must be known and stated in small unambiguous instructions. These
instructions are then converted to a high level language program and then into machine code
that the computer can understand. These machines are totally predictable; if anything goes
wrong, it is due to a software or hardware fault.

Neural networks and conventional algorithmic computers are not in competition but
complement each other. There are tasks that are more suited to an algorithmic approach, like
arithmetic operations, and tasks that are more suited to neural networks. Moreover, a large
number of tasks require systems that use a combination of the two approaches (normally a
conventional computer is used to supervise the neural network) in order to perform at maximum
efficiency.

Human and Artificial Neurones - investigating the similarities


How the Human Brain Learns?

Much is still unknown about how the brain trains itself to process information, so theories
abound. In the human brain, a typical neuron collects signals from others through a host of fine
structures called dendrites. The neuron sends out spikes of electrical activity through a long, thin
strand known as an axon, which splits into thousands of branches. At the end of each branch, a
structure called a synapse converts the activity from the axon into electrical effects that inhibit
or excite activity in the connected neurones. When a neuron receives excitatory input that is
sufficiently large compared
with its inhibitory input, it sends a spike of electrical activity down its axon. Learning occurs by
changing the effectiveness of the synapses so that the influence of one neuron on another
changes.

[Figures: Components of a neuron; The synapse]

From Human Neurones to Artificial Neurones

We construct these neural networks by first trying to deduce the essential features of neurones
and their interconnections. We then typically program a computer to simulate these features.
However, because our knowledge of neurones is incomplete and our computing power is limited,
our models are necessarily gross idealizations of real networks of neurones.

Some interesting numbers

                        BRAIN                            PC
Signal speed            v_prop ≈ 100 m/s                 v_prop ≈ 3×10^8 m/s
Switching frequency     ν ≈ 100 Hz                       ν ≈ 10^9 Hz
Processing elements     N = 10^10-10^11 neurons          N = 10^9
The degree of parallelism of the brain is about 10^14: roughly like 10^14 processors with 100 Hz
frequency, with about 10^4 connections active at the same time.

An engineering approach

A neuron

A more sophisticated neuron is the McCulloch and Pitts model (MCP). The difference from the
previous model is that the inputs are 'weighted': the effect that each input has on decision making
depends on the weight of the particular input. The weight of an input is a number which, when
multiplied with the input, gives the weighted input. These weighted inputs are then added
together and, if they exceed a pre-set threshold value, the neuron fires. In any other case the
neuron does not fire.

In mathematical terms, the neuron computes the weighted sum of its inputs and fires if and only
if this sum reaches the pre-set threshold. With inputs x_1, …, x_d, weights w_1, …, w_d and a
bias/threshold term w_0, its output can be written as

    y = f( Σ_{j=1..d} w_j x_j + w_0 ),

where f is the threshold activation function. The addition of input weights and of the threshold
makes this neuron a very flexible and powerful one. The MCP neuron has the ability to adapt to a
particular situation by changing its weights and/or threshold. Various algorithms exist that cause
the neuron to 'adapt'; the most used ones are the Delta rule and the back error propagation. The
former is used in feed-forward networks and the latter in feedback networks.
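As an illustration of the threshold neuron described above, here is a minimal Python sketch; the weights and threshold are illustrative placeholders, not values from the report.

def mcp_neuron(inputs, weights, threshold):
    # Weighted sum of the inputs; the neuron fires (returns 1) iff the sum reaches the threshold
    u = sum(w * x for w, x in zip(weights, inputs))
    return 1 if u >= threshold else 0

# Illustrative weights and threshold: a two-input neuron that behaves like a logical AND
print(mcp_neuron([1, 1], [0.6, 0.6], 1.0))   # 1 -> fires
print(mcp_neuron([1, 0], [0.6, 0.6], 1.0))   # 0 -> does not fire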
Architecture of neural networks
Feed-forward networks
Feed-forward ANNs allow signals to travel one way only; from input to output. There
is no feedback (loops) i.e. the output of any layer does not affect that same layer.
Feed-forward ANNs tend to be straightforward networks that associate inputs with
outputs. They are extensively used in pattern recognition. This type of organization
is also referred to as bottom-up or top-down.

Feedback networks
Feedback networks can have signals traveling in both directions
by introducing loops in the network. Feedback networks are very powerful
and can get extremely complicated. Feedback networks are dynamic; their
'state' is changing continuously until they reach an equilibrium point. They
remain at the equilibrium point until the input changes and a new
equilibrium needs to be found. Feedback architectures are also referred to as interactive or
recurrent, although the latter term is often used to denote feedback connections in single-layer
organizations.

Network layers
The commonest type of artificial neural network consists of three groups, or layers, of units: a
layer of "input" units is connected to a layer of "hidden" units, which is connected to a layer of
"output" units.

The activity of the input units represents the raw information that is fed into the network.

The activity of each hidden unit is determined by the activities of the input units and the weights
on the connections between the input and the hidden units.

The behavior of the output units depends on the activity of the hidden units and the weights
between the hidden and output units.
This simple type of network is interesting because the hidden units are free to construct their
own representations of the input. The weights between the input and hidden units determine
when each hidden unit is active, and so by modifying these weights, a hidden unit can choose
what it represents.

We also distinguish single-layer and multi-layer architectures. The single-layer organization, in
which all units are connected to one another, constitutes the most general case and is of more
potential computational power than hierarchically structured multi-layer organizations. In multi-
layer networks, units are often numbered by layer, instead of following a global numbering.
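As a concrete illustration of the input-to-hidden-to-output organization described above, here is a minimal Python sketch of the forward pass of such a network with sigmoid units. The layer sizes loosely mirror the model of Section 3.2 (six inputs, ten hidden nodes, two output classes), but the weights are random placeholders rather than the trained values, and a single hidden layer is used for brevity.

import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    # Each unit's activity = sigmoid(weighted sum of the previous layer's activities + bias)
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

random.seed(0)
n_in, n_hidden, n_out = 6, 10, 2   # loosely mirrors Section 3.2: 6 inputs, 10 hidden nodes, 2 classes
w_hidden = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b_hidden = [random.uniform(-1, 1) for _ in range(n_hidden)]
w_out = [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
b_out = [random.uniform(-1, 1) for _ in range(n_out)]

x = [0.5, 0.1, 0.3, 0.8, 0.2, 0.4]      # one illustrative input record
hidden = layer(x, w_hidden, b_hidden)   # activities of the hidden units
output = layer(hidden, w_out, b_out)    # e.g. scores for the 'profit' / 'loss' classes
print(output)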

2. TOOL USED

RapidMiner uses a client/server model, with the server offered either on-premise or in public
or private cloud infrastructures.

According to Bloor Research, RapidMiner provides 99% of an advanced analytical solution
through template-based frameworks that speed delivery and reduce errors by nearly eliminating
the need to write code. RapidMiner provides data mining and machine learning procedures
including: data loading and transformation (Extract, transform, load (ETL)), data preprocessing
and visualization, predictive analytics and statistical modeling, evaluation, and deployment.
RapidMiner is written in the Java programming language. RapidMiner provides a GUI to design
and execute analytical workflows. Those workflows are called “Processes” in RapidMiner and
they consist of multiple “Operators”. Each operator performs a single task within the process,
and the output of each operator forms the input of the next one. Alternatively, the engine can
be called from other programs or used as an API. Individual functions can be called from the
command line. RapidMiner provides learning schemes, models and algorithms and can be
extended using R and Python scripts.

RapidMiner functionality can be extended with additional plugins which are made available via
RapidMiner Marketplace. The RapidMiner Marketplace provides a platform for developers to
create data analysis algorithms and publish them to the community.

With version 7.0, RapidMiner included updates to its getting started materials, an updated user
interface, and improvements to its data preparation capabilities.

3. SIMULATIONS & RESULTS
3.1. For association rules

3.1.2. RESULT

BEST 15 Association Rules

[att3, att2] --> [att4] (confidence: 0.805)

[att3] --> [att4] (confidence: 0.815)

[att5, att3] --> [att4] (confidence: 0.815)

[TID] --> [att3, att2] (confidence: 0.876)

[att4] --> [att2] (confidence: 0.906)

[att4] --> [att3, att2] (confidence: 0.906)

[TID, att4] --> [att2] (confidence: 0.906)

[att4] --> [att3, att2] (confidence: 0.906)

[att3, att4] --> [att2] (confidence: 0.906)

[att3, att4] --> [TID, att2] (confidence: 0.906)

[att5] --> [att2] (confidence: 0.908)

[att5] --> [att3, att2] (confidence: 0.908)

[att3, att5] --> [att2] (confidence: 0.908)

[att5] --> [att2, att4] (confidence: 0.908)

[att4, att5] --> [att2] (confidence: 0.908)

3.2. For neural network

3.2.2. Results

1 PerformanceVector
PerformanceVector:
accuracy: 95.00% +/- 10.00% (mikro: 94.44%)
ConfusionMatrix:
True: profit loss
profit: 26 1
loss: 1 8
precision: 88.89% (positive class: loss)
ConfusionMatrix:
True: profit loss
profit: 26 1
loss: 1 8
recall: 88.89% (positive class: loss)
ConfusionMatrix:
True: profit loss
profit: 26 1
loss: 1 8
AUC (optimistic): unknown (positive class: loss)
AUC: unknown (positive class: loss)
AUC (pessimistic): unknown (positive class: loss)

2 ImprovedNeuralNet
Hidden 1
========

Node 1 (Sigmoid)
----------------
Advertising: 1.904
transportations: 0.313
maintance & others: 3.141
Sales: -1.931
profits: -3.915
month: 0.632
Bias: 0.330

Node 2 (Sigmoid)
----------------
Advertising: 0.715
transportations: 0.345
maintance & others: 1.296
Sales: -0.925
profits: -1.648
month: 0.337
Bias: -0.420

Node 3 (Sigmoid)
----------------
Advertising: -0.132
transportations: -0.183
maintance & others: 0.116
Sales: 0.185
profits: 0.213
month: 0.351
Bias: 0.226

Node 4 (Sigmoid)
----------------
Advertising: 0.284
transportations: 0.291
maintance & others: 0.765
Sales: -0.490
profits: -0.911
month: 0.290
Bias: -0.457

Node 5 (Sigmoid)
----------------
Advertising: 1.694
transportations: 0.323
maintance & others: 2.833
Sales: -1.776
profits: -3.454
month: 0.566
Bias: 0.219

Node 6 (Sigmoid)
----------------
Advertising: -0.054
transportations: 0.001
maintance & others: 0.275
Sales: 0.030
profits: -0.136
month: 0.304
Bias: -0.047

Node 7 (Sigmoid)
----------------
Advertising: 1.918
transportations: 0.331
maintance & others: 3.150
Sales: -1.975
profits: -3.985
month: 0.583
Bias: 0.384

Node 8 (Sigmoid)
----------------
Advertising: 1.221
transportations: 0.317
maintance & others: 2.111
Sales: -1.336
profits: -2.611
month: 0.427
Bias: -0.089

Node 9 (Sigmoid)
----------------
Advertising: 0.871
transportations: 0.337
maintance & others: 1.565
Sales: -1.052
profits: -2.028

month: 0.378
Bias: -0.311

Node 10 (Sigmoid)
-----------------
Advertising: 0.585
transportations: 0.365
maintance & others: 1.116
Sales: -0.817
profits: -1.456
month: 0.270
Bias: -0.420

Hidden 2
========

Node 1 (Sigmoid)
----------------
Node 1: -1.627
Node 2: -0.487
Node 3: 0.430
Node 4: -0.175
Node 5: -1.484
Node 6: 0.239
Node 7: -1.704
Node 8: -0.999
Node 9: -0.718
Node 10: -0.442
Bias: 1.328

Node 2 (Sigmoid)
----------------
Node 1: -1.609
Node 2: -0.516
Node 3: 0.375
Node 4: -0.145
Node 5: -1.322
Node 6: 0.181
Node 7: -1.578
Node 8: -0.961
Node 9: -0.687
Node 10: -0.363
Bias: 1.161

Node 3 (Sigmoid)
----------------
Node 1: -1.487
Node 2: -0.465
Node 3: 0.264
Node 4: -0.194
Node 5: -1.271
Node 6: 0.175
Node 7: -1.451
Node 8: -0.913
Node 9: -0.606
Node 10: -0.387

Bias: 1.004

Node 4 (Sigmoid)
----------------
Node 1: -1.515
Node 2: -0.550
Node 3: 0.379
Node 4: -0.174
Node 5: -1.398
Node 6: 0.158
Node 7: -1.592
Node 8: -0.965
Node 9: -0.651
Node 10: -0.458
Bias: 1.211

Node 5 (Sigmoid)
----------------
Node 1: -1.497
Node 2: -0.546
Node 3: 0.359
Node 4: -0.125
Node 5: -1.291
Node 6: 0.139
Node 7: -1.466
Node 8: -0.938
Node 9: -0.633
Node 10: -0.412
Bias: 1.032

Node 6 (Sigmoid)
----------------
Node 1: -1.654
Node 2: -0.528
Node 3: 0.442
Node 4: -0.150
Node 5: -1.513
Node 6: 0.274
Node 7: -1.731
Node 8: -0.987
Node 9: -0.712
Node 10: -0.426
Bias: 1.341

Node 7 (Sigmoid)
----------------
Node 1: -1.781
Node 2: -0.570
Node 3: 0.505
Node 4: -0.157
Node 5: -1.494
Node 6: 0.311
Node 7: -1.805
Node 8: -1.047
Node 9: -0.712
Node 10: -0.435
Bias: 1.432

Node 8 (Sigmoid)
----------------
Node 1: -1.796
Node 2: -0.545
Node 3: 0.516
Node 4: -0.122
Node 5: -1.556
Node 6: 0.342
Node 7: -1.813
Node 8: -1.072
Node 9: -0.732
Node 10: -0.436
Bias: 1.451

Node 9 (Sigmoid)
----------------
Node 1: -1.250
Node 2: -0.442
Node 3: 0.135
Node 4: -0.200
Node 5: -1.100
Node 6: 0.065
Node 7: -1.277
Node 8: -0.781
Node 9: -0.573
Node 10: -0.372
Bias: 0.679

Node 10 (Sigmoid)
-----------------
Node 1: -1.769
Node 2: -0.516
Node 3: 0.436
Node 4: -0.133
Node 5: -1.526
Node 6: 0.305
Node 7: -1.773
Node 8: -1.090
Node 9: -0.768
Node 10: -0.393
Bias: 1.463

Output
======

Class 'profit' (Sigmoid)


------------------------
Node 1: 2.181
Node 2: 1.989
Node 3: 1.799
Node 4: 1.985
Node 5: 1.909
Node 6: 2.152
Node 7: 2.294
Node 8: 2.326

Node 9: 1.506
Node 10: 2.295
Threshold: -5.297

Class 'loss' (Sigmoid)


----------------------
Node 1: -2.114
Node 2: -2.008
Node 3: -1.868
Node 4: -2.016
Node 5: -1.840
Node 6: -2.198
Node 7: -2.290
Node 8: -2.322
Node 9: -1.512
Node 10: -2.285
Threshold: 5.301

4. FINDINGS

After completion of the project, the best 15 association rules obtained for ABC mart are:

BEST 15 Association Rules

[att3, att2] --> [att4] (confidence: 0.805)

[att3] --> [att4] (confidence: 0.815)

[att5, att3] --> [att4] (confidence: 0.815)

[TID] --> [att3, att2] (confidence: 0.876)

[att4] --> [att2] (confidence: 0.906)

[att4] --> [att3, att2] (confidence: 0.906)

[TID, att4] --> [att2] (confidence: 0.906)

[att4] --> [att3, att2] (confidence: 0.906)

[att3, att4] --> [att2] (confidence: 0.906)

[att3, att4] --> [TID, att2] (confidence: 0.906)

[att5] --> [att2] (confidence: 0.908)

[att5] --> [att3, att2] (confidence: 0.908)

[att3, att5] --> [att2] (confidence: 0.908)

[att5] --> [att2, att4] (confidence: 0.908)

[att4, att5] --> [att2] (confidence: 0.908)

The performance vector obtained for ABC mart shows an accuracy of 95.00%.

PerformanceVector
PerformanceVector:
accuracy: 95.00% +/- 10.00% (mikro: 94.44%)
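These figures can be checked directly from the confusion matrix reported in Section 3.2.2 (26 correctly classified profit cases, 8 correctly classified loss cases, and one error in each direction). The 95.00% ± 10.00% value appears to be the accuracy averaged over the cross-validation folds, while the pooled ("mikro") accuracy and the precision and recall for the loss class can be recomputed as in the following sketch:

# Counts taken from the confusion matrix above (rows = predicted class, columns = true class)
pred_profit_true_profit = 26   # correctly predicted 'profit'
pred_profit_true_loss   = 1    # 'loss' cases predicted as 'profit'
pred_loss_true_profit   = 1    # 'profit' cases predicted as 'loss'
pred_loss_true_loss     = 8    # correctly predicted 'loss'

total = pred_profit_true_profit + pred_profit_true_loss + pred_loss_true_profit + pred_loss_true_loss
accuracy = (pred_profit_true_profit + pred_loss_true_loss) / total                      # 34/36 ≈ 94.44%
precision_loss = pred_loss_true_loss / (pred_loss_true_loss + pred_loss_true_profit)    # 8/9 ≈ 88.89%
recall_loss = pred_loss_true_loss / (pred_loss_true_loss + pred_profit_true_loss)       # 8/9 ≈ 88.89%
print(f"accuracy {accuracy:.2%}  precision(loss) {precision_loss:.2%}  recall(loss) {recall_loss:.2%}")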

5. BIBLIOGRAPHY
1. Guido Deutsch, "RapidMiner from Rapid-I at CeBIT 2010," Data Mining Blog, March 18, 2010.

2. McCulloch, Warren; Walter Pitts (1943). "A Logical Calculus of the Ideas Immanent in Nervous
Activity". Bulletin of Mathematical Biophysics.

3. Rochester, N.; J.H. Holland; L.H. Haibt; W.L. Duda (1956). "Tests on a cell assembly theory
of the action of the brain, using a large digital computer". IRE Transactions on Information Theory.

4. Kurzweil AI interview with Jürgen Schmidhuber on the eight competitions won by his Deep
Learning team 2009-2012 (2012).
http://www.kurzweilai.net/how-bio-inspired-deep-learning-keeps-winning-competitions

5. Bayardo Jr, Roberto J. (1998). "Efficiently mining long patterns from databases".

6. Hahsler, Michael (2005). "Introduction to arules – A computational environment for mining
association rules and frequent item sets".
