
S7: Intro to Weka Lab


Shawndra Hill Spring 2013 TR 1:30-3pm and 3-4:30


Preprocessing

Preprocessing: supervised sampling


When the classes are unbalanced, supervised sampling can be used to balance the data set. Click Choose in the Filter section and follow the path to the supervised Resample filter (filters > supervised > instance > Resample).

Preprocessing: supervised sampling


Click the Resample box next to the Choose button; a pop-up window opens in which you set the desired parameters.

After the filter is applied, a balanced sample is obtained.
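
The idea behind balancing by resampling can be sketched in a few lines of Python. This is only an illustration, not Weka's implementation: the function name, the toy data, and the choice to sample with replacement are assumptions. In Weka's Resample filter, the biasToUniformClass parameter controls how far the class distribution is pushed toward uniform; the sketch mimics the fully uniform case.

```python
import random
from collections import Counter, defaultdict


def balanced_resample(rows, class_index, sample_size, seed=1):
    """Draw sample_size rows with replacement so that each class
    contributes (almost) the same number of rows."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[class_index]].append(row)

    labels = sorted(by_class)
    per_class, remainder = divmod(sample_size, len(labels))
    sample = []
    for i, label in enumerate(labels):
        # Spread any remainder over the first few classes.
        n = per_class + (1 if i < remainder else 0)
        sample.extend(rng.choices(by_class[label], k=n))
    rng.shuffle(sample)
    return sample


# Hypothetical unbalanced data: 90 "no" rows and 10 "yes" rows.
data = [[i, "no"] for i in range(90)] + [[i, "yes"] for i in range(10)]
balanced = balanced_resample(data, class_index=1, sample_size=100)
print(Counter(row[1] for row in balanced))
```

Because the minority class is sampled with replacement, the balanced sample contains repeated minority rows; keep this in mind when evaluating a model on data that was resampled before splitting.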

Preprocessing: unsupervised sampling

Preprocessing: filters

To apply a numeric transformation filter, click Choose and follow the path to NumericTransform (filters > unsupervised > attribute > NumericTransform). Click the NumericTransform box next to the Choose button to open the pop-up window in which the transformation is set up. To revert the transformation, press the Undo button in the Preprocess tab.
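
What a numeric transformation followed by Undo does to the data can be sketched as follows. The Dataset class, its method names, and the toy values are hypothetical and stand in for the data held in the Preprocess tab; this is not how Weka stores its data internally.

```python
import copy
import math


class Dataset:
    """Tiny stand-in for the data held in the Preprocess tab."""

    def __init__(self, rows):
        self.rows = rows
        self._history = []  # earlier versions of the rows, kept for undo

    def numeric_transform(self, func, columns):
        """Apply func to the given column indices of every row."""
        self._history.append(copy.deepcopy(self.rows))
        for row in self.rows:
            for c in columns:
                row[c] = func(row[c])

    def undo(self):
        """Restore the rows as they were before the last transform."""
        if self._history:
            self.rows = self._history.pop()


# Hypothetical data: columns 0 and 1 are numeric, column 2 is the class.
ds = Dataset([[4.0, -9.0, "a"], [16.0, 25.0, "b"]])
ds.numeric_transform(math.sqrt, columns=[0])
print(ds.rows)  # column 0 replaced by its square root
ds.undo()
print(ds.rows)  # original values restored
```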


Feature selection
Attribute evaluator options: Search method options:

Go to the Select attributes tab to perform feature selection. You need to choose two things: which attribute evaluator and which search method to use.

Click on the Choose buttons to access and set the options


Feature selection: (InfoGain, Ranker)

Click on the Choose buttons to access and set the options
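
The combination of an information-gain evaluator with a ranking search can be illustrated with a minimal sketch: score each attribute by how much it reduces class entropy, then sort the attributes by score. The function names and the toy data are assumptions, and the sketch handles nominal attributes only; it is not Weka's code.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def info_gain(rows, attr_index, class_index):
    """Class entropy minus the expected class entropy after splitting
    the rows on one nominal attribute."""
    labels = [row[class_index] for row in rows]
    groups = {}
    for row in rows:
        groups.setdefault(row[attr_index], []).append(row[class_index])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_attributes(rows, names, class_index):
    """Score every non-class attribute and sort best first."""
    scores = [(info_gain(rows, i, class_index), name)
              for i, name in enumerate(names) if i != class_index]
    return sorted(scores, reverse=True)


# Hypothetical toy data: outlook, windy, play (class).
names = ["outlook", "windy", "play"]
rows = [
    ["sunny", "no", "no"],
    ["sunny", "yes", "no"],
    ["rainy", "no", "yes"],
    ["rainy", "yes", "yes"],
]
for score, name in rank_attributes(rows, names, class_index=2):
    print(f"{name}: {score:.3f}")
```

In this toy data, outlook separates the classes perfectly and therefore ranks above windy, which carries no information about the class.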

Feature selection: (PCA, Ranker)

Feature selection: (Wrapper, Greedy STW)

Feature selection: models with selected attributes

Remove the unselected attributes in the TRAINING set; save the modified TRAINING file.

Then open the TEST set file and remove the unselected attributes in the same way. Save the modified TRAINING and TEST set files, open them in WEKA, and run your model.
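
The key point of this step is that exactly the same attributes must be removed from both files, so that the training and test sets still have matching attributes. A minimal sketch of that idea, using hypothetical attribute names and in-memory rows instead of real .arff files:

```python
def keep_attributes(header, rows, selected):
    """Return a new header and rows containing only the selected
    attribute names, in their original order."""
    keep = [i for i, name in enumerate(header) if name in selected]
    new_header = [header[i] for i in keep]
    new_rows = [[row[i] for i in keep] for row in rows]
    return new_header, new_rows


# Hypothetical attribute names and data; "class" must always be kept.
header = ["age", "income", "zip", "class"]
train = [[25, 40000, "19104", "yes"], [52, 90000, "10001", "no"]]
test = [[37, 61000, "60601", "yes"]]

selected = {"age", "income", "class"}  # e.g. the result of feature selection

# Apply the SAME selection to both sets so their attributes still match.
train_header, train_small = keep_attributes(header, train, selected)
test_header, test_small = keep_attributes(header, test, selected)
print(train_header, train_small)
print(test_header, test_small)
```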

Classification: Test set options

Classification: K-NN

To create a K-NN classifier, go to the Classify tab and select the options as indicated above. Once you select IBk, click the IBk box (right next to the Choose button) and set the parameters in the pop-up window (figure on the right).
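
What a K-NN classifier does with the chosen number of neighbours can be sketched as below. This is a bare-bones illustration under simple assumptions (numeric attributes, Euclidean distance, unweighted majority vote); it skips refinements such as attribute normalisation or distance weighting and is not IBk's implementation. The function name and the data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify query by majority vote among its k nearest training
    instances, using Euclidean distance on the numeric attributes.
    Each training instance is (attribute_list, class_label)."""
    neighbours = sorted(train, key=lambda inst: math.dist(inst[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical two-attribute training set.
train = [
    ([1.0, 1.0], "a"),
    ([1.2, 0.8], "a"),
    ([0.9, 1.1], "a"),
    ([5.0, 5.0], "b"),
    ([5.2, 4.9], "b"),
]
print(knn_predict(train, [1.1, 1.0], k=3))
print(knn_predict(train, [4.8, 5.1], k=1))
```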

Classification: Naive Bayes

Classification: Decision trees

Numeric prediction: Linear regression

Numeric prediction: Neural Networks


Click More to learn about the NN parameters. Weka's descriptions of some of the important parameters are shown below:

Classification: Output

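Correctly classified instances: 5269/9636 = 54.68%

TP rate:
Class 1: 2056/(2056+2770) = 0.426
Class 2: 3213/(3213+1597) = 0.668
In the two-class table, the diagonals of the TP rate and FP rate columns sum to 1: the TP rate of one class plus the FP rate of the other class equals 1.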
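
The way these summary figures follow from a confusion matrix can be checked with a short sketch. The counts below are made up for illustration and are not the figures from the slide; the function name and the matrix layout (rows are actual classes, columns are predicted classes) are assumptions.

```python
def rates(confusion):
    """confusion[i][j] = number of class-i instances predicted as class j
    (rows are actual classes, columns are predicted classes)."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[i][i] for i in range(n_classes)) / total

    tp_rate, fp_rate = [], []
    for c in range(n_classes):
        actual_c = sum(confusion[c])
        predicted_c = sum(confusion[r][c] for r in range(n_classes))
        tp = confusion[c][c]
        fp = predicted_c - tp
        tp_rate.append(tp / actual_c)
        fp_rate.append(fp / (total - actual_c))
    return accuracy, tp_rate, fp_rate


# Made-up counts for illustration only.
confusion = [[40, 10],
             [30, 20]]
accuracy, tp_rate, fp_rate = rates(confusion)
print(f"accuracy = {accuracy:.3f}")
print("TP rate:", tp_rate)
print("FP rate:", fp_rate)
# With two classes, TP rate of one class + FP rate of the other = 1.
print(tp_rate[0] + fp_rate[1], tp_rate[1] + fp_rate[0])
```

With two classes, every instance of class 1 is either predicted correctly (counted in the TP rate of class 1) or predicted as class 2 (counted in the FP rate of class 2), which is why the two rates sum to 1.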

Classification: Output
Predicted class; Class probability estimate
NOTES:
1:1 means class 1, label 1
2:0 means class 2, label 0
Class labels can be anything.
The first (second) column of the class probability estimates gives the probability of being class 1 (class 2). E.g., the predicted probability of observation 1 being class 1 is 0.578.
+ indicates a mistake in the classification
* indicates the predicted class
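
How the predicted class, the + error flag, and the * marker relate to the class probability estimates can be sketched as follows. The function name is hypothetical, and the output layout only loosely imitates Weka's; the 0.422 value is simply the assumed complement of the 0.578 estimate mentioned above, and the second row and the actual classes are made up.

```python
def describe_prediction(actual_index, distribution, labels):
    """Return (predicted label, error flag, marked distribution) for one
    instance, given its class probability estimates."""
    predicted_index = max(range(len(distribution)),
                          key=lambda i: distribution[i])
    error = "+" if predicted_index != actual_index else ""
    marked = [("*" if i == predicted_index else "") + f"{p:.3f}"
              for i, p in enumerate(distribution)]
    predicted = f"{predicted_index + 1}:{labels[predicted_index]}"
    return predicted, error, marked


labels = ["1", "0"]  # class 1 has label 1, class 2 has label 0
# Both instances are assumed to belong to class 1 (index 0).
print(describe_prediction(0, [0.578, 0.422], labels))
print(describe_prediction(0, [0.310, 0.690], labels))
```

The predicted class is the one with the highest probability estimate, so the second instance is predicted as 2:0 and flagged as a mistake.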

Classification: Output
Predicted class; Class probability estimate
Class probability estimates can be stored using the command line; an example is shown below. To get to the command line in Windows, type cmd in the Run dialog box.

Copy your training and test .arff files to the Weka directory, and then run the following command:
java -cp weka.jar weka.classifiers.trees.J48 -t TRAIN.arff -T TEST.arff -p 0 > filename.probs

Classification: ROC curves

Visualization tab
