ASSIGNMENT-7

Q1) Go to the WEKA explorer environment and load the training file “iris.arff”. Remove the
class attribute using the preprocessing dialog. Go to the clustering dialog. Cluster the iris dataset
using the k-Means Clustering algorithm with k=5. Hand in the result given by WEKA (Cluster
mean and standard deviation).

Opening iris.arff and removing class attribute

Applying clustering with SimpleKMeans (k=2)


Open the testt.arff file in the data folder of weka3-7 and apply the Discretize filter.

Apply clustering with SimpleKMeans by clicking the Start button.


Right-click the last entry in the result list and select Visualize cluster assignments.
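The clustering step above can be illustrated outside the GUI. The following is a minimal sketch of plain k-means in Python, run on the nine points t1 to t9 from Q5 (the contents of testt.arff). It is not WEKA's SimpleKMeans: the function names, the random choice of starting centroids, and the lack of attribute normalization are assumptions made for illustration, so the numbers it produces need not match the WEKA output.

```python
import math
import random

# The nine data points t1..t9 listed in Q5.
POINTS = [
    (4, 2, 3, 5, 2, 2, 2, 1),
    (3, 2, 5, 4, 3, 2, 1, 4),
    (1, 3, 3, 5, 2, 3, 2, 1),
    (4, 2, 0, 5, 2, 2, 2, 1),
    (3, 2, 3, 4, 3, 2, 1, 4),
    (2, 5, 3, 5, 2, 2, 2, 1),
    (4, 1, 3, 7, 2, 1, 2, 1),
    (3, 1, 5, 4, 3, 2, 1, 4),
    (2, 5, 2, 5, 2, 5, 2, 1),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, seed=1, max_iter=100):
    """Plain k-means; starting centroids are k distinct points chosen by seed."""
    rng = random.Random(seed)
    centroids = [tuple(float(v) for v in p) for p in rng.sample(points, k)]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # nothing moved, so the clustering has converged
        assignment = new_assignment
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignment


def cluster_std(points, assignment, centroids):
    """Per-attribute population standard deviation of each cluster."""
    stds = []
    for c, centre in enumerate(centroids):
        members = [p for p, a in zip(points, assignment) if a == c]
        if not members:
            stds.append(None)
            continue
        stds.append(tuple(
            math.sqrt(sum((p[i] - centre[i]) ** 2 for p in members) / len(members))
            for i in range(len(centre))
        ))
    return stds


centroids, assignment = k_means(POINTS, k=2)
for c, (mean, std) in enumerate(zip(centroids, cluster_std(POINTS, assignment, centroids))):
    print("cluster", c, "mean", mean, "std", std)
```

Changing the seed argument changes the starting centroids, which is the same effect the seed parameter has in the clustering dialog.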
Q2) Cluster the iris dataset using the k-Means Clustering algorithm with k=3 and k=4 and with ten
different values of the seed parameter, using the option: Classes to clusters evaluation, and store the
results in an Excel file. Compute the mean of the two different k values for the k-Means

Add a new field, seedsize, to the iris.arff file in the data folder by opening it in WordPad, then save the
changed file and open it in Notepad.

Open the irises.arff file in the data folder.


Apply the NumericToNominal filter in the Preprocess tab.

Apply clustering: select the ‘Store clusters for visualization’ option, set the nominal seedsize field, and
click the Start button.
Excel sheet

Right-click the last entry in the result list, select Save result buffer, and save the file in the data folder
with the .xls extension (result.xls).
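The bookkeeping behind Q2 (an error count per seed, then a mean per value of k) can be sketched in Python as follows. The function names and the toy labels are hypothetical, and mapping each cluster to its majority class is a simplification; WEKA's classes-to-clusters evaluation may pair classes and clusters differently, so treat this as an illustration of the idea rather than a reproduction of WEKA's figures.

```python
from collections import Counter
from statistics import mean


def incorrectly_clustered(cluster_ids, class_labels):
    """Count instances whose class differs from the majority class of their cluster."""
    by_cluster = {}
    for cid, label in zip(cluster_ids, class_labels):
        by_cluster.setdefault(cid, Counter())[label] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in by_cluster.values())
    return len(class_labels) - correct


def mean_error_per_k(results):
    """results maps each k to a list of error counts, one per seed."""
    return {k: mean(errors) for k, errors in results.items()}


# Hypothetical toy run: six instances in two clusters (not real iris output).
toy_clusters = [0, 0, 0, 1, 1, 1]
toy_classes = ["setosa", "setosa", "versicolor",
               "versicolor", "versicolor", "setosa"]
toy_errors = incorrectly_clustered(toy_clusters, toy_classes)
print("incorrectly clustered instances:", toy_errors)
```

In practice the lists passed to mean_error_per_k would hold the ten error counts copied from the result buffer for k=3 and the ten for k=4.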

Q3) Visualize the cluster mean values and standard deviation for sepallength versus sepalwidth and for
petallength versus petalwidth. Don’t erase the result of the k-Means algorithm.

Select the Visualize option, then select two fields (sepallength, sepalwidth), and then click Update.
Q4) Use the EM algorithm with the iris dataset. Try to run EM with the automatic estimation of
the number of clusters. Then try to repeat the experiments of point 2 of the previous section with the
EM algorithm.

Click the Cluster tab, choose EM, select the “Use training set” radio button, and click the Start
button.
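To show what EM does on each run, here is a minimal sketch of expectation-maximization for a mixture of two one-dimensional Gaussians in Python. It is a simplification under stated assumptions: a single attribute, a fixed number of two components (no automatic estimation of the number of clusters), a fixed iteration count, and hypothetical toy values rather than the iris data. The function names are illustrative and are not WEKA's.

```python
import math


def gaussian_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_two_gaussians(values, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with a fixed number of EM steps."""
    n = len(values)
    overall_mean = sum(values) / n
    overall_var = sum((v - overall_mean) ** 2 for v in values) / n
    # Deterministic start: one mean at each end of the data range.
    mus = [min(values), max(values)]
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            dens = [w * gaussian_pdf(x, m, v)
                    for w, m, v in zip(weights, mus, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(2):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            mus[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var_j = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, values)) / nj
            variances[j] = max(var_j, 1e-6)  # floor to avoid a zero variance
    return weights, mus, variances


# Hypothetical toy values forming two well-separated groups.
toy_values = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.8, 5.1]
weights, mus, variances = em_two_gaussians(toy_values)
print("weights", weights)
print("means", mus)
print("variances", variances)
```

Unlike k-means, each value contributes to both components in proportion to its responsibility, which is why EM reports a mean and a standard deviation per cluster and attribute.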
Q5) Create an “arff”-file containing the data points

t1 = (4, 2, 3, 5, 2, 2, 2, 1)

t2 = (3, 2, 5, 4, 3, 2, 1, 4)

t3 = (1, 3, 3, 5, 2, 3, 2, 1)

t4 = (4, 2, 0, 5, 2, 2, 2, 1)

t5 = (3, 2, 3, 4, 3, 2, 1, 4)

t6 = (2, 5, 3, 5, 2, 2, 2, 1)

t7 = (4, 1, 3, 7, 2, 1, 2, 1)

t8 = (3, 1, 5, 4, 3, 2, 1, 4)

t9 = (2, 5, 2, 5, 2, 5, 2, 1)

Create the ARFF file (testt.arff) in Notepad and save it in the data folder of weka3-7.
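As an alternative to typing the file by hand, the same ARFF text can be generated with a short Python script. This is a sketch: the relation name testt and the attribute names a1 to a8 are assumptions, since the question gives only the values, and all eight attributes are declared numeric.

```python
# The nine data points t1..t9 listed in Q5.
POINTS = [
    (4, 2, 3, 5, 2, 2, 2, 1),
    (3, 2, 5, 4, 3, 2, 1, 4),
    (1, 3, 3, 5, 2, 3, 2, 1),
    (4, 2, 0, 5, 2, 2, 2, 1),
    (3, 2, 3, 4, 3, 2, 1, 4),
    (2, 5, 3, 5, 2, 2, 2, 1),
    (4, 1, 3, 7, 2, 1, 2, 1),
    (3, 1, 5, 4, 3, 2, 1, 4),
    (2, 5, 2, 5, 2, 5, 2, 1),
]


def build_arff(relation, points):
    """Return ARFF text with one numeric attribute per column."""
    n_attrs = len(points[0])
    lines = ["@relation " + relation, ""]
    # Attribute names a1..aN are placeholders chosen for this sketch.
    lines += ["@attribute a%d numeric" % i for i in range(1, n_attrs + 1)]
    lines += ["", "@data"]
    lines += [",".join(str(v) for v in p) for p in points]
    return "\n".join(lines) + "\n"


arff_text = build_arff("testt", POINTS)
print(arff_text)
# To save it next to the other datasets, write arff_text to testt.arff:
# with open("testt.arff", "w") as handle:
#     handle.write(arff_text)
```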


Q6) Create a small Excel file (excel.file) and save it on the desktop. Then open the Excel file, set the CSV
property, and save the file with the .csv extension (excel.csv). Import this file in WEKA by clicking open
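A CSV file of the kind described in Q6 can also be produced directly with Python's csv module. The column names and rows below are hypothetical placeholders. The first row is a header because WEKA's CSV import reads attribute names from it.

```python
import csv
import io

# Hypothetical contents; the first row holds the attribute names.
ROWS = [
    ["id", "height", "label"],
    [1, 5.1, "yes"],
    [2, 4.9, "no"],
    [3, 6.3, "yes"],
]


def to_csv_text(rows):
    """Serialize rows to CSV text using the standard csv writer."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


csv_text = to_csv_text(ROWS)
print(csv_text)
# To create the file itself, write csv_text to excel.csv:
# with open("excel.csv", "w", newline="") as handle:
#     handle.write(csv_text)
```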
Q7) Load in Weka the Weather dataset and select the Classify tab

a) Select the classifier J48 and run with the default options.

b) Visualize the tree

c) Visualize the errors

d) Consider different parameters of J48 to improve the quality of the classifier

Select the weather.numeric file from the data folder in weka3-7. Then select the Preprocess tab and
apply the Discretize filter.
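What the Discretize filter does to a numeric attribute can be sketched as equal-width binning, which is the behaviour of the unsupervised Discretize filter with its default settings as far as I know (ten bins of equal width). The function name and the sample values are hypothetical, and the example uses three bins only to keep the output short.

```python
def equal_width_bins(values, bins=10):
    """Map each value to the index of its equal-width bin (0 .. bins-1)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        return [0] * len(values)  # all values identical: a single bin
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), bins - 1) for v in values]


# Hypothetical temperature readings, split into three bins.
sample = [64, 70, 75, 85]
print(equal_width_bins(sample, bins=3))
```

After this step each numeric value is replaced by a nominal bin label, which is what allows nominal-only schemes to use the attribute.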

In the Classify tab, choose J48 under the trees group and click the Start button.
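J48 is WEKA's implementation of C4.5, which chooses split attributes by gain ratio. The sketch below computes information gain and gain ratio for one nominal attribute in Python. The class counts are an assumption: they are the (yes, no) counts per outlook value in the standard 14-instance weather data as commonly published, and should be checked against the loaded file. The function names are illustrative.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


def gain_and_gain_ratio(class_counts_per_value):
    """Information gain and gain ratio of a nominal attribute.

    class_counts_per_value holds one tuple of class counts per attribute value.
    """
    total = sum(sum(counts) for counts in class_counts_per_value)
    overall = [sum(col) for col in zip(*class_counts_per_value)]
    # Expected entropy after splitting on the attribute.
    remainder = sum(sum(counts) / total * entropy(counts)
                    for counts in class_counts_per_value)
    gain = entropy(overall) - remainder
    # Split information penalizes attributes with many values.
    split_info = entropy([sum(counts) for counts in class_counts_per_value])
    return gain, gain / split_info


# Assumed (yes, no) counts for outlook = sunny, overcast, rainy.
OUTLOOK = [(2, 3), (4, 0), (3, 2)]
gain, ratio = gain_and_gain_ratio(OUTLOOK)
print("information gain:", round(gain, 3))
print("gain ratio:", round(ratio, 3))
```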
Visualize tree

Right-click the result set and select the Visualize tree option.

Visualize errors

Right-click the result set and select the Visualize classifier errors option.
Q8) Load the Iris dataset

a) Select the classifier J48 and run with the default options.

b) Visualize the tree

c) Visualize the errors

d) Consider different parameters of J48 to improve the quality of the classifier

Select the iris.numeric file from the data folder in weka3-7. Then select the Preprocess tab and apply the
Discretize filter.

In the Classify tab, choose J48 under the trees group and click the Start button.

Visualize the tree

Right-click the result set and select the Visualize tree option.
Visualize the error

Right-click the result set and select the Visualize classifier errors option.
Q9) Load the bank dataset and discretize the attributes that are numeric. This pre-processing can
be done with filtering.

a) Now run the association rule algorithm and play with the parameters.

b) Which rules are always true? Write them down.

c) Write down a couple of interesting rules and a couple of trivial rules.

Select the credit file from the data folder and apply the Discretize filter to it in the Preprocess tab.

Select the Associate tab and choose the Apriori algorithm. Then double-click Apriori and play with the
parameters.

Confidence option:-

Conviction option:-
Q10) Load the Iris dataset and run association analysis (note that association rules work on nominal
attributes!)

a) Generate 10 association rules and discuss some inferences you would make from
them

b) Then change the support and confidence to lower values. Discuss the difference?

c) Select the Output Item set option and run again the Apriori algorithm. Discuss the
difference?

Conviction:-
Confidence:-

The results with confidence set to true and with conviction set to false are the same.
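For reference, the two rule metrics compared above can be computed by hand. The sketch below uses the usual definitions: confidence of A => B is supp(A and B) / supp(A), and conviction is (1 - supp(B)) / (1 - confidence). The transactions and item names are hypothetical and are not taken from the iris or credit data.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in items."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and conviction of the rule antecedent => consequent.

    Assumes the antecedent occurs in at least one transaction.
    """
    supp_a = support(transactions, antecedent)
    supp_c = support(transactions, consequent)
    supp_rule = support(transactions, set(antecedent) | set(consequent))
    confidence = supp_rule / supp_a
    # A rule that always holds has confidence 1 and infinite conviction.
    if confidence == 1:
        conviction = float("inf")
    else:
        conviction = (1 - supp_c) / (1 - confidence)
    return supp_rule, confidence, conviction


# Hypothetical transactions over nominal attribute=value items.
TRANSACTIONS = [
    {"children=0", "save_act=YES"},
    {"children=0", "save_act=YES"},
    {"children=0", "save_act=NO"},
    {"children=1", "save_act=YES"},
]
supp, conf, conv = rule_metrics(TRANSACTIONS, {"children=0"}, {"save_act=YES"})
print("support", supp, "confidence", conf, "conviction", conv)
```

A conviction below 1, as in this toy rule, means the antecedent makes the consequent less likely than its baseline frequency, even though the confidence looks moderately high.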