
1.

For the first question, after opening step1_preprocessing.R and loading the data
(creditworthiness.csv), we assign it to the data_raw variable. The next step is
opening step2_SOMtraining.R. Initially, data_raw with all of its features is assigned
to data_train, and the SOM model is trained on these data. Next, step3_Visualization.R
loads the plotHeatMap function with source('./plotHeatMap.R'); calling
plotHeatMap(som_model, data_train, variable = 0) then plots a heatmap for each
feature. After that, we compare every plot with the heatmap of the target feature,
which in this study is credit_rating. By inspecting the correlations between
credit_rating and the other features, we choose the five features most correlated
with the target. The SOM model is then trained a second time, this time on those
features only; in other words, we assign data_raw[, c(1, 2, 8, 12, 22, 46)] to
data_train, retrain the SOM model, and plot the heatmaps for these features. This
lets us confirm the strong correlation between the target and the chosen features
before using them to train the MLP network.
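The correlation-based selection described above can also be cross-checked numerically. The following is a minimal sketch, not part of the course scripts: the stand-in data frame is an assumption for illustration (in the real workflow data_raw comes from creditworthiness.csv), but the ranking logic in the last two statements applies unchanged.

```r
# Stand-in data for illustration only; the real data_raw is loaded
# from creditworthiness.csv in step1_preprocessing.R
set.seed(1)
data_raw <- as.data.frame(matrix(rnorm(200 * 46), ncol = 46))
names(data_raw)[46] <- "credit_rating"

# correlation of every feature (columns 1-45) with the target (column 46)
correlations <- cor(data_raw[, 1:45], data_raw[, 46])

# indices of the five features most correlated with the target
# (by absolute value)
top5 <- order(abs(correlations), decreasing = TRUE)[1:5]
top5
```

The heatmap comparison remains the method used in this study; the correlation ranking only serves as a sanity check on the chosen column indices.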

# separate features (columns 1-45) from the target (column 46)
data <- data_raw[, 1:45]
target <- data_raw[, 46]
plotHeatMap(som_model, data_train, variable = 0)
#======
# Code for clustering; this code uses an MLP classifier for classification
# target/labels: attribute 46 of the dataset
# first install the RSNNS package to use mlp
#install.packages(c("RSNNS"))
library(RSNNS)
# divide the dataset into train and test sets (75% train)
smp_size <- floor(0.75 * nrow(data))
set.seed(123)
train_ind <- sample(seq_len(nrow(data)), size = smp_size)
train <- data_raw[train_ind, ]
test <- data_raw[-train_ind, ]
# separate features and labels
Y_train <- train[46]
X_train <- scale(train[, c(1, 2, 8, 12, 22)])
Y_test <- test[46]
X_test <- scale(test[, c(1, 2, 8, 12, 22)])
# MLP model
model <- mlp(X_train, Y_train, size = 500, learnFuncParams = c(0.2),
             learnFunc = "Std_Backpropagation",
             maxit = 1000)
# predict
Y_train_predict <- predict(model, X_train)
Y_test_predict <- predict(model, X_test)
# compute mean signed prediction errors
error_train <- sum(Y_train - Y_train_predict) / nrow(Y_train)
error_test <- sum(Y_test - Y_test_predict) / nrow(Y_test)
error_train
error_test
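Since credit_rating is a class label rather than a continuous value, RSNNS also provides helpers for encoding it before training. The following is a sketch of that alternative, not the method used in the scripts above; the helper names decodeClassLabels and splitForTrainingAndTest come from the RSNNS package itself.

```r
library(RSNNS)

# one-hot encode the class labels before training
labels <- decodeClassLabels(data_raw[, 46])
features <- scale(data_raw[, c(1, 2, 8, 12, 22)])

# RSNNS's built-in split (15% of rows held out for testing)
sets <- splitForTrainingAndTest(features, labels, ratio = 0.15)

# a small network for illustration; the sizes above are the study's choice
model <- mlp(sets$inputsTrain, sets$targetsTrain,
             size = 5, learnFuncParams = c(0.2),
             learnFunc = "Std_Backpropagation", maxit = 100)
```

With one-hot targets, predict() returns one score per class, and the predicted class is the column with the highest score.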
2.
For this question, the MLP model is trained with the five features chosen in the previous
steps. To do that, we first open step4_clustering.R and install the RSNNS package with
install.packages(c("RSNNS")); we then use RSNNS's mlp model. The overall data set is
divided into train and test data, and the labels (the label is credit_rating) are
separated from the train and test features, as in the following sample code:

smp_size <- floor(0.75 * nrow(data))
set.seed(123)
train_ind <- sample(seq_len(nrow(data)), size = smp_size)
train <- data_raw[train_ind, ]
test <- data_raw[-train_ind, ]
# separate features and labels
Y_train <- train[46]
X_train <- scale(train[, c(1, 2, 8, 12, 22)])
Y_test <- test[46]
X_test <- scale(test[, c(1, 2, 8, 12, 22)])
Then we build the MLP model and train it on the training data with the following code:
model <- mlp(X_train, Y_train, size = c(1000, 4), learnFuncParams = c(0.05),
             learnFunc = "Std_Backpropagation",
             maxit = 1000)
After training, the train and test data are fed to the model and their labels are
predicted; finally, the train and test errors are calculated by comparing the actual
labels with the predicted labels, using the following code:

# predict
Y_train_predict <- predict(model, X_train)
Y_test_predict <- predict(model, X_test)
# compute mean signed prediction errors
error_train <- sum(Y_train - Y_train_predict) / nrow(Y_train)
error_test <- sum(Y_test - Y_test_predict) / nrow(Y_test)
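Because the target is a class label, the signed error above can let positive and negative mistakes cancel out. As a complementary check, RSNNS ships a confusionMatrix helper; the sketch below assumes the Y_test and Y_test_predict objects produced by the code above, and adds a simple accuracy figure computed in base R.

```r
# cross-tabulate actual labels against predicted labels
# (rows = actual class, columns = predicted class)
confusionMatrix(Y_test, Y_test_predict)

# fraction of test rows whose rounded prediction matches the true label
accuracy <- mean(round(Y_test_predict) == as.matrix(Y_test))
accuracy
```

A diagonal-heavy confusion matrix and an accuracy near 1 indicate that the selected features carry real signal about credit_rating.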
Heatmaps for the best features selected are depicted in the figures below.
