
Big Data

Volume 6 Number 2, 2018


© Mary Ann Liebert, Inc.
DOI: 10.1089/big.2018.0023

ORIGINAL ARTICLE

Deep Learning Method for Denial of Service Attack Detection Based on Restricted Boltzmann Machine
Yadigar Imamverdiyev and Fargana Abdullayeva*

Downloaded by SUNY Stony Brook package(NERL) from www.liebertpub.com at 06/21/18. For personal use only.

Abstract
In this article, the application of the deep learning method based on Gaussian–Bernoulli type restricted Boltzmann
machine (RBM) to the detection of denial of service (DoS) attacks is considered. To increase the DoS attack
detection accuracy, seven additional layers are added between the visible and the hidden layers of the RBM.
Accurate results in DoS attack detection are obtained by optimization of the hyperparameters of the proposed
deep RBM model. A form of the RBM that accepts continuous data is used. In this type of RBM,
the probability distribution of the visible layer is replaced by a Gaussian distribution. The accuracy of the
proposed method is compared with that of Bernoulli-Bernoulli RBM, Gaussian–Bernoulli RBM, and deep belief network
type deep learning methods on DoS attack detection. Detection accuracy of the methods is verified
on the NSL-KDD data set. The proposed multilayer deep Gaussian–Bernoulli type RBM achieves the highest
accuracy.
Keywords: deep learning; deep belief network; restricted Boltzmann machine; NSL-KDD

Introduction
In the CIA (confidentiality, integrity, and availability) triad, the primary aim of denial of service (DoS) attacks is to violate the availability of information. Natural causes (e.g., technical accidents), errors (poor configuration of systems and services), and DoS attacks can all violate the availability of information resources, networks, and e-services. DoS attacks are malicious actions that prevent legitimate users from accessing a system, network, application software, or information. DoS attacks may be single-source (launched from one system) or distributed (launched from numerous systems). Distributed DoS (DDoS) attacks are a specific type of DoS attack. DDoS attacks target Internet infrastructure objects (including routers and Domain Name System (DNS) servers), bandwidth, and servers (including Hyper Text Transfer Protocol (HTTP), DNS, and other services).1

Nowadays the implementation of DoS attacks does not require special knowledge in the information technology field. Such an attack can be ordered at a low price or carried out with a special tool designed by hackers. DoS attacks have economic (struggle against an adversary), political, terrorist, and cybercriminal motives. DDoS attacks have been used for political purposes against countries, political parties, and social movements.2 According to the "DDoS attacks in Q3 2017" report by Kaspersky Lab, resources in 98 countries were attacked in the third quarter of 2017.3 Hackers can use special technology and distributed infrastructure (botnets) to reach very high DDoS attack rates.4 This makes it possible to isolate even the information infrastructure of a whole country from the Internet.

The problem of DDoS attack detection has been widely studied in the past decade and is still of high interest. Machine learning methods have been developed to detect DDoS attacks using classification and clustering algorithms. One of the drawbacks of traditional machine learning algorithms is that they use hand-crafted features for the recognition task.

This is not "true" machine learning. It is desirable that the machine itself find and structure the features for attack detection.
Institute of Information Technology, Azerbaijan National Academy of Sciences, Baku, Azerbaijan.

*Address correspondence to: Fargana Abdullayeva, Institute of Information Technology, Azerbaijan National Academy of Sciences, 9A, B. Vahabzade Street, AZ1141, Baku,
Azerbaijan, E-mail: a_farqana@mail.ru

Motivations
The main purpose of a network attack detection system is to differentiate malicious behavior from normal network traffic. In cybersecurity, the nature of new attacks is not known in advance and they appear continuously in real time, so this analysis should be carried out in a flexible and effective way. It can be done by constructing a model of every type of attack that can affect the network, or simply by constructing a general model of normal traffic.

Such a model is usually built on training data and is used to classify previously unknown or suspicious events. Classification is therefore one of the fundamental issues in attack detection. By learning through classification, the system recognizes complex traffic patterns, distinguishes different events according to matching templates, and makes intelligent decisions. To classify data accurately into one of the attack types, machine learning methods such as neural networks5 and support vector machines (SVMs) are used.6,7

Currently, deep learning is one of the most intensive research trends in the field of artificial intelligence8,9 and opens wide opportunities to overcome the constraints of traditional machine learning methods. In traditional machine learning algorithms, the features are extracted by humans; there is a dedicated research direction, feature engineering. But in big data processing, deep neural networks work better than a human in feature extraction. In speech recognition, natural language processing, computer vision, and other fields, they show better results than alternative methods.10

Successful applications of deep learning methods in different fields attract attention in the cybersecurity field, too. Deep learning is often considered identical with deep neural networks (DNNs). DNNs do not have a universally accepted definition; usually, neural networks with more than one hidden layer are called DNNs. One of the reasons for the successful use of DNNs is that these networks automatically extract from data the important features needed to solve the cyber attack detection problem. In traditional machine learning algorithms, features have to be extracted by a human,11 which is the subject of feature engineering, whereas in big data processing DNNs work better than humans at feature extraction. A DNN uses several processing layers to learn representations of the data with multiple levels of abstraction. The representation on each layer is formed on the basis of the representation of the previous layer.

Still, deep learning methods are rarely applied to DDoS attack detection; only a few studies have been published.12,13

The purpose of this article is to develop an approach based on DNNs for accurate and adaptive detection of DoS attacks in real time. To achieve this purpose, the self-learning discriminative restricted Boltzmann machine (RBM), based on the energy model of a network of stochastic neurons, is used. As the data classes are not used in the learning process, the RBM model can detect attacks in real time with high precision and adaptability.

Contributions
The contributions of this article are the following:
• A multilayer RBM-based DoS attack detection method is proposed. To obtain accurate results in DoS attack detection, the hyperparameters of the proposed deep RBM model are optimized.
• The RBM is trained on a standardized (normalized) data set. In the training process of the RBM, the labels of the data set are not used.
• The trained features of the RBM are used as input data for learning the next layers of the neural network. As a result, the high-dimensional unlabeled data are reduced to an optimal small set of features.
• The effectiveness of the RBM method is evaluated on the NSL-KDD data set. The advantage in detection accuracy of the proposed deep RBM model over the SVM, decision tree, Bernoulli-Bernoulli RBM, Gaussian–Bernoulli RBM, and deep belief network (DBN) is demonstrated.

The remainder of this article is organized as follows: the DDoS Attacks and Detection Methods: A Brief Overview section introduces brief information on DDoS attacks. The Related Works section presents the related work on DNNs and on DDoS detection using deep learning. The Restricted BM section introduces the proposed solution for detecting DDoS attacks by RBMs. The Experiments and Discussion section presents the results of experiments. Finally, the Conclusions section presents the concluding remarks.

DDoS Attacks and Detection Methods: A Brief Overview
The first large-scale DDoS attacks were conducted in early February 2000 against large companies such as Yahoo!, eBay, and CNN, and since then, the development of preventive mechanisms against DDoS attacks
has been at the center of attention of network practitioners and security researchers.

There are the following types of DDoS attacks:
• Volumetric attacks: consume the bandwidth of the target server; measured in bits per second. Examples: ICMP, UDP, and TCP flood attacks.
• Protocol-based attacks: capture the resources of the target server; measured in packets per second. Examples: Ping flood, SYN flood, and Smurf attacks.
• Application level attacks: crash the target server by exploiting application level vulnerabilities. Examples: Hash DoS attacks and Teardrop attacks.

A great deal of research on the topic of DDoS attacks has been carried out, and in recent years several important review articles have been published that analyze the taxonomy of DDoS attacks, attack mechanisms, proposed architectures, and methods for protection from DDoS attacks, together with their advantages and disadvantages. Suggested approaches for detecting DDoS attacks can be divided into initial approaches, statistical analysis, knowledge-based soft computing, and machine learning methods.14–19

Statistical methods are based on quantitative analysis of traffic, and signature methods are based on qualitative analysis of traffic. Standard methods of statistical analysis and knowledge-based methods cannot detect previously unknown attacks, and, therefore, machine learning methods are widely used to solve this problem.

Related Works
Deep neural networks
There are many methods referred to as deep learning (DL), and in Ref.20 they are divided into three groups: generative, discriminative, and hybrid. Generative architectural methods include autoencoders,21 recurrent neural networks,22 and Boltzmann machines (BMs). Each layer of the deep network learns independently during the pretraining procedure. This provides a good initial approximation for running the backpropagation algorithm. Depending on the selected model, each layer may be an RBM or a CNN (convolutional neural network).23

A BM is a network of symmetrically connected stochastic binary units. The units are divided into two groups, describing visible and hidden states (by analogy with hidden Markov models). The states of the visible and the hidden neurons vary according to probabilistic activation functions.

An RBM is a BM that has no connections between the hidden layer neurons. Owing to the special bipartite graph structure, it is possible to compute the probabilities of the hidden layer neurons explicitly. If a sufficient number of neurons are used in the hidden layer, an RBM can represent any discrete distribution.

The RBM is a key structural unit for constructing the DBN. A DBN is a multilayer network in which the lower layers form a sigmoid belief network and the upper layer is an RBM.

A deep BM (DBM) is sometimes used in the pretraining step instead of the autoencoder. The multilayer architecture of the DBM is its main difference from the RBM.

A hybrid DL architecture integrates the generative and discriminative architectures. The DNN can be given as an example of a hybrid architecture. In Ref.,24 a DNN is a cascade of fully connected hidden layers and often uses an RBM stack as a pretraining stage.

DDoS attack detection using deep learning
Detection of computer network attacks has been investigated for a long time, and new ideas have been developed in numerous approaches. In Ref.,25 a hybrid method based on a DBN and SVM is proposed. Here, the DBN is used for feature selection and the SVM for classification, and the NSL-KDD data set is classified into five classes: Normal, Remote to Local (R2L), DoS, User to Root (U2R), and Probing. Using the DBN, the data set is reduced from 41 to 5 features, and the SVM is then applied to these reduced data to perform the classification.6 A network attack detection method based on a sparse autoencoder and a softmax regression type self-taught learning approach is proposed. The effectiveness of the intrusion detection system built on self-taught learning has been tested on the NSL-KDD data set.

In Ref.26 a malware detection method based on an autoencoder and a DBN is proposed. First, the autoencoder extracts the key features of the data and reduces the data volume. Then the DBN learning method is applied to detect the malicious code. The experiments show that the precision of the proposed hybrid method is higher than that of a single DBN method. The evaluation of the proposed method is conducted on the KDDCUP'99 data set. The data are classified into the probe, U2R,
R2L, DoS, and Normal classes. Ref. 27 offers a hybrid approach based on the combination of spectral clustering and a DNN for intrusion detection in sensor networks. In the first stage, the network features are clustered and divided into k subsets. In the second stage, a deep learning network is applied to the subsets generated in the clustering phase to obtain more understandable features, and finally the attacks are detected using a test set.

In Ref.28 a DoS attack detection method based on a multilayer DBN is proposed. The DBN consists of numerous RBMs. In the learning process, an RBM is trained first. Then the trained features of this RBM are used as input data for learning the RBM of the next layer of the DBN stack. The effectiveness of the DBN method is tested on the KDD CUP 1999 data set. The detection accuracy of the DBN model is better than that of the SVM and ANN methods. In Ref.,29 a network anomaly detection method based on a semisupervised approach is proposed. To carry out this analysis, the discriminative RBM is used. In the semisupervised anomaly detection system, the classifier is trained on the normal profile of the data, and any deviation from this state is modeled as an anomaly signal. To evaluate the effectiveness of the proposed method, experiments are carried out on the "real" KDD '99 data set. Here, instead of the 41 features of the data set, only the 28 features related to network traffic are selected. In the selection of hyperparameters, the learning rate of the neural network is set to 0.1, and experiments are carried out with the number of epochs set first to 15, then to 10 and 5.

In Ref.30 a flow-based anomaly detection approach based on deep learning in the software defined networking (SDN) environment is proposed. In this study, a DNN for intrusion detection is constructed, and the model is trained on the NSL-KDD data set. To evaluate the effectiveness of the model, the accuracy, precision, recall, and f-measure metrics are used. The feature vector of the model consists of six features of the NSL-KDD data set: duration, protocol_type, src_bytes, dst_bytes, count, and srv_count. This work differs from others in that it uses simple preprocessing and feature extraction technologies in the SDN context. The proposed neural network method is compared on the accuracy metric with the existing J48, Naive Bayes (NB), NB tree, random forest, random tree, multilayer perceptron, and SVM methods. When tested on all the features of the NSL-KDD data set, the deep learning method obtains a very low result compared with the mentioned methods, but when tested on six features, it achieves a high accuracy of 75.75%.

In Ref.31 an approach to detect malicious zero-day Adobe Flash applications is proposed. The proposed approach consists of three stages: (1) extraction, (2) conversion, and (3) classification. In the extraction stage, header, tag, and action features are extracted. The conversion stage converts the data generated at the extraction stage into (0, 1) values or matrices. The classification stage trains deep learning models on the data acquired at the conversion stage. If raw values are given, they are scaled to between 0 and 1 to reduce computational complexity. If a sequence of instructions or API calls is given, it is projected onto an N*M matrix, where N is the number of sequences and M is the number of types. At the end, the results of the deep learning methods are combined with the ensemble method. The proposed detection system is tested on randomly collected malicious Flash applications. In the data set, benign and malicious classes are used. Our approach differs from existing works in that a multilayered deep RBM architecture is formed by adding seven additional layers between the hidden and visible layers of the RBM. This considerably increases the efficiency of DoS attack detection.

In Ref.,32 a deep learning method is used to detect DDoS attacks in the web-application layer. It is even more difficult to detect application-level attacks than DDoS attacks on the TCP and IP layers because in such attacks the attackers launch legitimate HTTP requests from genuine network computers. In that article, the Stacked AutoEncoder is applied to extract higher level features.

In Ref.33 an anomaly detection approach based on statistics computed from network packets is applied to detect application-level DDoS attacks that use encrypted protocols. In the proposed scheme, negotiations between the web server and its clients are clustered and normal user behavior is established. With the help of the Stacked AutoEncoder, the distribution of negotiations over these clusters is analyzed. Client negotiations that deviate from the normal samples are classified as anomalies.

In Ref.,12 deep learning is used for the distributed detection of attacks in the IoT fog ecosystem. Experiments have shown that the distributed detection system outperforms the centralized system, and in terms of detection
accuracy, it is more effective than shallow neural networks.13 For detecting DDoS attacks in the SDN environment, a deep learning-based defense system model is proposed. The model can learn samples from network traffic sequences and track network attack activities chronologically. Using a defense system based on this model, it is possible to efficiently clean DDoS attack traffic on SDN.

Summarizing the works mentioned above, the deep learning-based network intrusion detection methods are listed in Table 1.

Restricted BM
An RBM is a stochastic network of neurons made up of two layers: the visible and the hidden layers. The visible layer represents the collected data, whereas the hidden layer tries to learn features from the visible layer, aiming to represent a probabilistic distribution of the data.34 The network is called restricted because the neurons in a layer have connections only to the neurons in the other layer. Connections between the layers are symmetric and bidirectional, allowing information transfer in both directions.

The RBM is graphically illustrated in Figure 1, where m is the number of neurons in the visible layer (v_1, ..., v_m), n is the number of neurons in the hidden layer (h_1, ..., h_n), a and b are bias vectors, and w is the weight matrix.

The RBM is an energy-based model that uses a layer of hidden variables to model a probabilistic distribution over visible variables.35 The visible units constitute the first layer and correspond to the components of an observation, that is, one visible unit for each feature of an input pattern. The hidden units model dependencies between the components of observations, that is, the dependencies between the features. Energy-based model means that the probability distribution over the variables v and h is defined by an energy function. This function is described by Equations (1) and (2),36 first in its vector form and second in its expanded form:

$$E(\mathbf{v}, \mathbf{h}) = -\mathbf{h}^{T} W \mathbf{v} - \mathbf{a}^{T} \mathbf{v} - \mathbf{b}^{T} \mathbf{h}. \tag{1}$$

$$E(\mathbf{v}, \mathbf{h}) = -\sum_{i,j=1}^{m,n} v_i h_j w_{ij} - \sum_{i=1}^{m} a_i v_i - \sum_{j=1}^{n} b_j h_j. \tag{2}$$

Using the energy function, it is possible to assign a probability to each pair of a visible vector and a hidden vector of the network, giving the probabilistic distribution described in Ref.35:

$$p(\mathbf{v}, \mathbf{h}) = \frac{e^{-E(\mathbf{v}, \mathbf{h})}}{\sum_{\mathbf{v}, \mathbf{h}} e^{-E(\mathbf{v}, \mathbf{h})}}. \tag{3}$$

The probability of a visible vector v is obtained by summing the joint probabilities over all hidden vectors, as described in Ref.35:

$$p(\mathbf{v}) = \frac{\sum_{\mathbf{h}} e^{-E(\mathbf{v}, \mathbf{h})}}{\sum_{\mathbf{v}, \mathbf{h}} e^{-E(\mathbf{v}, \mathbf{h})}}. \tag{4}$$
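Equations (1) to (4) can be checked numerically on a very small binary RBM, where all configurations can be enumerated. The following Python sketch is only an illustration under stated assumptions: the sizes (3 visible, 2 hidden units), the random parameters, and the function names `energy` and `joint_and_marginal` are hypothetical and do not come from the article. The weight matrix is stored as W[i, j] = w_ij, so the coupling term of Equation (2) is computed as v @ W @ h.

```python
import itertools
import numpy as np

def energy(v, h, W, a, b):
    """Energy of one joint configuration, Equation (2); W[i, j] couples v[i] and h[j]."""
    return -(v @ W @ h) - a @ v - b @ h

def joint_and_marginal(W, a, b):
    """Enumerate all binary (v, h) pairs of a tiny RBM; return p(v, h) and p(v)."""
    m, n = W.shape
    vs = [np.array(c, dtype=float) for c in itertools.product([0, 1], repeat=m)]
    hs = [np.array(c, dtype=float) for c in itertools.product([0, 1], repeat=n)]
    unnorm = np.array([[np.exp(-energy(v, h, W, a, b)) for h in hs] for v in vs])
    Z = unnorm.sum()         # partition function: denominator of Equations (3) and (4)
    p_vh = unnorm / Z        # Equation (3)
    p_v = p_vh.sum(axis=1)   # Equation (4): sum over all hidden configurations
    return vs, hs, p_vh, p_v

# Hypothetical toy parameters, chosen only so that the enumeration stays small.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(3, 2))
a = rng.normal(scale=0.1, size=3)
b = rng.normal(scale=0.1, size=2)

vs, hs, p_vh, p_v = joint_and_marginal(W, a, b)
print(p_vh.shape)   # one row per visible vector, one column per hidden vector
print(p_v.sum())    # the marginal probabilities sum to one (up to rounding)
```

The enumeration grows as 2^(m+n), which is why it is only feasible for toy sizes; for realistic layer sizes the partition function is intractable and training relies on sampling-based approximations.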

Table 1. Deep learning-based distributed denial of service detection methods

Article | Feature extraction | Classifier | Goal | Data set | Classes
25 | DBN | DBN and SVM | Attack detection | NSL-KDD | Normal, R2L, DoS, U2R, and Probing
10 | - | Sparse auto-encoder and soft-max regression type self-taught learning | Network attack detection | NSL-KDD | Normal, R2L, DoS, U2R, and Probing
26 | AutoEncoder | DBN | Malicious code detection | KDDCUP'99 | Normal, R2L, DoS, U2R, and Probing
27 | Spectral clustering | Combination of the spectral clustering and deep neural network | Intrusion detection in sensor networks | KDD-Cup99, NSL-KDD, sensor network data set | Normal, R2L, DoS, U2R, and Probing
28 | Normalized | DBN | Reducing volume of the data based on feature extraction and DoS attacks detection | KDD CUP 1999 | Normal, R2L, DoS, U2R, and Probing
29 | Normalized | Discriminative RBM | Classification of network traffic in normal and anomaly classes | KDD '99 | Normal and anomaly
31 | Normalized | Deep feed-forward neural network, deep recurrent neural network | Malicious zero-day Adobe Flash software detection | Synthetic data | Benign and malicious

DBN, deep belief network; DoS, denial of service; RBM, restricted Boltzmann machine; U2R, user to root; R2L, remote to local; SVM, support vector machine.

FIG. 1. Restricted Boltzmann machine (RBM).

Since RBMs do not have connections between neighboring neurons of the same layer, the units of one layer are conditionally independent given the other layer. This facilitates the calculation of the conditional probabilities, which are35:

$$p(\mathbf{h} \mid \mathbf{v}) = \prod_{j} p(h_j \mid \mathbf{v}). \tag{5}$$

$$p(\mathbf{v} \mid \mathbf{h}) = \prod_{i} p(v_i \mid \mathbf{h}). \tag{6}$$

The first versions of the RBM were developed to solve problems with binary data. For binary data, the conditional probabilities of the individual units are given by Ref.35:

$$p(h_j = 1 \mid \mathbf{v}) = \mathrm{sigm}\left(b_j + \sum_{i=1}^{m} v_i w_{ij}\right) \tag{7}$$

$$p(v_i = 1 \mid \mathbf{h}) = \mathrm{sigm}\left(a_i + \sum_{j=1}^{n} h_j w_{ij}\right), \tag{8}$$

where sigm(x) is the sigmoid function.

Using the RBM for only binary data obviously limits its potential for problem solving. To maximize the accuracy, a form of the RBM that can deal with continuous data is used. The best known RBM of this kind is the Gaussian type RBM. In this RBM, the probability distribution of the visible layer is changed to a Gaussian distribution. This variation is called the Gaussian–Bernoulli RBM.37 The probability distribution of the Gaussian–Bernoulli type RBM is defined as follows:

$$p(h_j = 1 \mid \mathbf{v}) = \mathrm{sigm}\left(b_j + \frac{1}{\sigma^{2}} \sum_{i=1}^{m} v_i w_{ij}\right) \tag{9}$$

$$p(v_i \mid \mathbf{h}) = \mathcal{N}\left(a_i + \sum_{j=1}^{n} h_j w_{ij},\ \sigma^{2}\right), \tag{10}$$

where \sigma^{2} is the variance of the Gaussian visible units.

Learning in RBM
The training of an RBM involves minimizing the negative log-likelihood, which gives the weight update

$$\Delta w_{ij} = \epsilon \frac{\partial \log p(\mathbf{v})}{\partial w_{ij}} = \epsilon \left( \langle v_i h_j \rangle_{d} - \langle v_i h_j \rangle_{m} \right), \tag{11}$$

where \epsilon is the learning rate, and \langle \cdot \rangle_{d} and \langle \cdot \rangle_{m} represent the expected values under the data and the model, respectively.

The expectations can be obtained by Equations (7) and (8) for binary data, whereas Equations (9) and (10) give the expectations for continuous data.

Data set description
For the experiments in the classification process, the NSL-KDD data set is used, which is a fundamental data set for evaluating the effectiveness of proposed approaches in the field of network intrusion detection systems.

Each record of the NSL-KDD data set consists of 41 features (e.g., protocol type, service, and flag) and these
Table 2. Evaluating the effectiveness of the methods

Method                   Accuracy  f-measure  g-mean  Precision  Recall  Sensitivity  Specificity  TN      TP      Average value
SVM radial basis         0.7161    0.7400     0.7173  0.6096     0.9416  0.9416       0.5464       0.5464  0.9416  0.7445
SVM (epsilon-SVR)        0.7269    0.7550     0.7251  0.6139     0.9804  0.9804       0.5363       0.5363  0.9804  0.6998
Decision tree            0.6744    0.7190     0.6620  0.5710     0.9705  0.9705       0.4516       0.4516  0.9705  0.7157
Gaussian–Bernoulli RBM   0.7323    0.7530     0.7348  0.6233     0.9509  0.9509       0.5678       0.5678  0.9509  0.7591

TN, true negative; TP, true positive.
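The metric columns of Table 2 are related to each other through the binary confusion counts. The Python sketch below uses the standard textbook definitions, which the article does not spell out, so they are an assumption here; the function names `binary_metrics` and `average_value` are illustrative. It also assumes that the "Average value" column is the arithmetic mean of the nine metric columns, which reproduces the value listed for the Gaussian–Bernoulli RBM row.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)           # identical to sensitivity and TP rate
    specificity = tn / (tn + fp)      # identical to TN rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    g_mean = math.sqrt(recall * specificity)
    return {
        "accuracy": accuracy,
        "f_measure": f_measure,
        "g_mean": g_mean,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
    }

def average_value(row):
    """Arithmetic mean of the metric columns of one table row (assumed definition)."""
    return sum(row) / len(row)

# Gaussian-Bernoulli RBM row of Table 2, in column order (without "Average value").
gb_rbm_row = [0.7323, 0.7530, 0.7348, 0.6233, 0.9509, 0.9509, 0.5678, 0.5678, 0.9509]

print(round(average_value(gb_rbm_row), 4))     # 0.7591, the tabulated average
print(round(math.sqrt(0.9509 * 0.5678), 4))    # 0.7348, the tabulated g-mean
```

Because recall, sensitivity, and TP rate coincide, and specificity coincides with TN rate, only six of the nine metric columns carry independent information.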

records are labeled as normal or as specific types of attacks. Here the attacks are divided into four classes:
• DoS: For example, Neptune, Smurf, Pod, and Teardrop.
• R2L: Unauthorized access from a remote machine to a local machine. For example, Guess-password, Ftp-write, Imap, and Phf.
• U2R: Unauthorized access to root privileges. For example, Buffer-overflow, Load-module, Perl, and Spy.
• Probing: For example, Port-sweep, IP-sweep, Nmap, and Satan.

Experiments and Discussion
We implemented three deep learning and three traditional machine learning architectures: Bernoulli-Bernoulli RBM, Gaussian–Bernoulli RBM, DBN, SVM (radial basis), SVM (epsilon-SVR), and decision tree.

In the experiments for detecting DoS attacks, an RBM with 7 layers of 100 neurons each and 38 visible neurons is used. The experiments are conducted on the NSL-KDD data set containing five classes (probe, U2R, R2L, DoS, and normal).

The number of neurons in the layers is taken as 100, 100, 100, 100, 100, 100, and 100, respectively. The data set is normalized column-wise to the [0, 1] interval. To train the network, 5 epochs are used. The weights of the 100 hidden neurons are initialized randomly. For the training and testing of the network, the NSL-KDDTrain+_20Percent data set is used. The training data contain 25,194 samples and the test data contain 4508 samples. Experiments are carried out on 38 features of the data set. Three features that do not have a significant impact on the effectiveness of DoS attack detection are not taken into account in the classification process. The activation function of the model is sigmoid.

To evaluate the results of the experiments, the accuracy, f-measure, g-mean, precision, recall, sensitivity, specificity, true negative (TN) rate, true positive (TP) rate, and average value indexes are used. Table 2 compares the RBM results with classic classification algorithms on various metrics.

As seen in Table 2, the results of the RBM algorithm outperform the results of the other algorithms. The accuracy of the SVM (radial basis) algorithm was 0.7161, of SVM (epsilon-SVR) 0.7269, and of the decision tree 0.6744, whereas the RBM algorithm achieved 0.7323. Thus, the DoS attack detection efficiency of the RBM algorithm is better than that of the other algorithms.

To determine the significance of the proposed methods, it is desirable to provide statistical tests.38,39 Here the RBM algorithm is launched on the NSL-KDD data set 22 times. The effectiveness of the method is evaluated on the basis of the accuracy, f-measure, g-mean, precision, recall, sensitivity, specificity, TN rate, and TP rate metrics. For detailed information, see Supplementary Tables S1–S3 given in the Supplementary Data (Supplementary Data are available online at www.liebertpub.com/big).

FIG. 2. Boxplot representation of the average values of various metrics of deep learning methods for denial of service attack detection.
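The preprocessing and training settings described in this section (column-wise scaling to [0, 1], 38 visible units, 100 hidden units, sigmoid activation, 5 epochs, random initial weights) can be sketched in Python with NumPy. This is a minimal sketch and not the authors' implementation, which was run in Matlab: it trains a single Gaussian–Bernoulli layer rather than the seven-layer stack, uses one-step contrastive divergence as the estimate of the update in Equation (11), fixes the visible variance at σ = 1, and substitutes random numbers for the NSL-KDD features. The learning rate, batch size, and the names `minmax_columns` and `cd1_epoch` are assumptions made for illustration.

```python
import numpy as np

def minmax_columns(X):
    """Scale every column to the [0, 1] interval; constant columns map to 0."""
    lo = X.min(axis=0)
    hi = X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (X - lo) / span

def sigm(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_epoch(V, W, a, b, rng, lr=0.01, batch=100):
    """One epoch of CD-1 for a Gaussian-Bernoulli RBM with sigma = 1 (updates in place)."""
    for start in range(0, len(V), batch):
        v0 = V[start:start + batch]
        ph0 = sigm(b + v0 @ W)                               # Equation (9) with sigma = 1
        h0 = (rng.random(ph0.shape) < ph0).astype(float)     # sample binary hidden states
        v1 = a + h0 @ W.T + rng.standard_normal(v0.shape)    # sample from Equation (10)
        ph1 = sigm(b + v1 @ W)
        # Data expectation minus one-step reconstruction expectation, Equation (11).
        W += lr * (v0.T @ ph0 - v1.T @ ph1) / len(v0)
        a += lr * (v0 - v1).mean(axis=0)
        b += lr * (ph0 - ph1).mean(axis=0)
    return W, a, b

rng = np.random.default_rng(0)
# Synthetic placeholder for the 38 selected features; not NSL-KDD data.
X = minmax_columns(rng.random((500, 38)) * 50.0)

W = rng.normal(scale=0.01, size=(38, 100))   # random initial weights
a = np.zeros(38)
b = np.zeros(100)
for _ in range(5):                           # 5 epochs, as in the experiments
    cd1_epoch(X, W, a, b, rng)

features = sigm(b + X @ W)                   # hidden activations used as learned features
print(features.shape)
```

In a stacked architecture, `features` would serve as the input to the next layer, in line with the contribution stated earlier that trained RBM features feed the following layers.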

FIG. 3. Comparison of the methods on NSL-KDD data set.

To evaluate the robustness of the Bernoulli-Bernoulli As seen from Figure 2, Gaussian–Bernoulli-type


RBM, Gaussian–Bernoulli RBM, and the DBN algo- RBM gives better results than the other methods and
rithms, the program in the Matlab environment is because the deviation of this method is very small,
launched 22 times on the NSL-KDD data set and a box- the boxplot representation of this method is too tight.
plot representation based on the average value of re- For better demonstration of the results, Figure 3 vi-
sults obtained on different metrics is created. Boxplot sually illustrates the comparison of methods. Notice
representation of the average values of various metrics that in these figures, we subtract the recall, sensitivity,
for deep learning methods of DoS attack detection is and TP rate values from all the methods, thus the dif-
shown in Figure 2. ference can be observed more clearly.
The worst, mean, and best case values obtained from 22 runs of each algorithm are given in Table 3.
From Table 3, it is evident that the worst result obtained by the Gaussian–Bernoulli RBM is even better than the best results obtained by the other methods.
DNNs are models with a large number of parameters, and these parameters must be tuned during the training process. Training such models requires a large volume of data. However, the NSL-KDD data set used here is small. Training a deep learning model on a small data set with a large number of iterations leads to overfitting of the network,40 which reduces the effectiveness of the model. To avoid this problem, the number of iterations in this article is kept small.

Table 3. Comparison of the results of the deep learning methods according to the accuracy metric

Method                     Worst    Mean     Best
Bernoulli-Bernoulli RBM    0.6422   0.7255   0.6828
Gaussian–Bernoulli RBM     0.6894   0.7591   0.7323
DBN                        0.4343   0.5299   0.4809

Experiments on the number of epochs were also conducted. The DoS attack detection accuracy obtained by training the Gaussian–Bernoulli type RBM for different numbers of epochs is given in Supplementary Table S4.
As Supplementary Table S4 shows, the RBM's DoS attack detection accuracy gradually decreases as the number of epochs increases. However, the accuracy gradually increases as the number of epochs increases from 2 to 5 (Supplementary Table S5) (see Supplementary Data).
To illustrate the results in Supplementary Table S5, Figure 4 shows the dynamics of the RBM's DoS attack detection accuracy.

FIG. 4. Accuracy dynamics of the RBM by number of epochs from two to five.

FIG. 5. Detection rate dynamics of the RBM over various metrics by number of epochs from two to five.

As seen from Figures 5 and 6, when the number of iterations over the data is small, the model detects DoS attacks with high precision (Fig. 5), but as the number of iterations increases, the accuracy of the model in detecting DoS attacks gradually falls (Fig. 6).
Since the TN rate has the same value as the specificity metric, and the TP rate and sensitivity metric have the same values as the recall, the TN rate, TP rate, and sensitivity are not included in Figure 6.
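The equivalences between these metrics follow directly from their definitions. The sketch below is only an illustration: the `ConfusionCounts` type, the function names, and the example counts are assumptions introduced here and are not taken from the experiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, with the attack class as positive."""
    tp: int  # attacks correctly flagged
    fp: int  # normal records flagged as attacks
    tn: int  # normal records correctly passed
    fn: int  # attacks that were missed


def tp_rate(c: ConfusionCounts) -> float:
    # Share of real attacks that were detected.
    return c.tp / (c.tp + c.fn)


# Recall and sensitivity are other names for the TP rate.
recall = tp_rate
sensitivity = tp_rate


def tn_rate(c: ConfusionCounts) -> float:
    # Share of real normal records that were passed as normal.
    return c.tn / (c.tn + c.fp)


# Specificity is another name for the TN rate.
specificity = tn_rate


def precision(c: ConfusionCounts) -> float:
    # Share of flagged records that really were attacks.
    return c.tp / (c.tp + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    # Share of all records that were classified correctly.
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = ConfusionCounts(tp=80, fp=10, tn=90, fn=20)
print(recall(example), specificity(example), accuracy(example))
```

Because recall, sensitivity, and the TP rate are one and the same quantity, plotting only one of them loses no information; the same holds for specificity and the TN rate.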

FIG. 6. Accuracy dynamics of the RBM by number of epochs from 5 to 65.



Table 4. Restricted Boltzmann machine denial of service attack detection statistics based on the number of epochs

                                  No. of predicted points
            No. of real
Classes     points        5 epochs  10 epochs  15 epochs  20 epochs  25 epochs  30 epochs  35 epochs
Normal 1    1935          2908      2780       2885       3009       1356       3227       2874
DoS 2       1526          1077      1555       844        895        1323       918        1027
U2R 3       46            0         0          0          0          0          0          0
R2L 4       513           0         0          429        520        1638       228        594
Probe 5     488           523       173        350        84         191        135        13
Total       4508          4508      4508       4508       4508       4508       4508       4508
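The counts in Table 4 can be compared against the real class sizes with a few lines of code. The sketch below copies the values from Table 4; the variable and function names are assumptions made for illustration. Note that Table 4 reports how many points were assigned to each class, not how many of those assignments were correct, so a surplus of predictions over real points gives only a lower bound on the false detections for that class.

```python
# Values copied from Table 4.
EPOCHS = [5, 10, 15, 20, 25, 30, 35]
REAL = {"Normal": 1935, "DoS": 1526, "U2R": 46, "R2L": 513, "Probe": 488}
PREDICTED = {
    "Normal": [2908, 2780, 2885, 3009, 1356, 3227, 2874],
    "DoS":    [1077, 1555, 844, 895, 1323, 918, 1027],
    "U2R":    [0, 0, 0, 0, 0, 0, 0],
    "R2L":    [0, 0, 429, 520, 1638, 228, 594],
    "Probe":  [523, 173, 350, 84, 191, 135, 13],
}


def surplus_by_class(epochs: int) -> dict[str, int]:
    """Predicted minus real point count per class for one epoch setting.

    A positive value means at least that many points were wrongly
    assigned to the class; a negative value means at least that many
    real points of the class were missed.
    """
    i = EPOCHS.index(epochs)
    return {cls: PREDICTED[cls][i] - REAL[cls] for cls in REAL}


def column_total(epochs: int) -> int:
    """Total number of predicted points for one epoch setting."""
    i = EPOCHS.index(epochs)
    return sum(counts[i] for counts in PREDICTED.values())


for e in EPOCHS:
    print(e, surplus_by_class(e))
```

For example, at 25 epochs the R2L class receives 1638 predictions against 513 real points, so at least 1125 of those predictions are false detections, while the U2R class receives no predictions at any epoch setting.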

As seen in Figure 5, if we increase the number of iterations from two to five, the DoS attack detection accuracy of the Gaussian–Bernoulli RBM gradually increases. But if we increase this value from 5 to 65, the detection accuracy of the algorithm gradually decreases. This case is depicted in Figure 6.
Attacks in the test data are divided into four categories: probe (e.g., IP sweep, vulnerability scanning), DoS (e.g., mail bomb, UDP storm), U2R (e.g., buffer overflow attacks, rootkits), and R2L (e.g., password guessing, worm attack).
Since the probe and DoS classes contain far more records than the U2R and R2L classes, the attack detection effectiveness of the algorithm always decreases.36 This is associated with the inability of the algorithm to handle the U2R and R2L classes well. Researchers therefore try to study the attack classes separately. Table 4 gives the number of points in the DoS, U2R, R2L, probe (attack), and normal classes, and shows how well the Gaussian–Bernoulli RBM method detects these points for different numbers of iterations.
As the number of iterations increases in the Gaussian–Bernoulli RBM method, points from the R2L class are detected, but in this case the detection accuracy of the method decreases. This is because many false detections occur. In the experiments, points from the probe, DoS, and normal classes are detected with high accuracy; however, the U2R and R2L attack classes give the worst results compared with the other classes. It should be noted that the U2R and R2L classes are in general less visible in the data, and a lower accuracy should therefore be expected.

Conclusions
In this article, the discriminative RBM was applied to the network attack detection problem. The results of the experiments on the NSL-KDD data set showed that the proposed multilayer deep Gaussian–Bernoulli RBM method gave better results in detecting attacks than the Bernoulli-Bernoulli RBM and DBN type deep learning methods. This method also outperforms the SVM, radial basis, SVM (epsilon-SVR), and decision tree type machine learning methods.
We have shown that our model significantly improves the detection accuracy on DoS attack detection tasks compared with previous work. In future work we also plan on experimenting with LSTM decoders as well as deep and bidirectional LSTM encoders.

Acknowledgment
This work was supported by the Science Development Foundation under the President of the Republic of Azerbaijan (Grant No. EIF-KETPL-2-2015-1(25)-56/05/1).

Author Disclosure Statement
No competing financial interests exist.

References
1. Mansfield-Devine S. DDoS goes mainstream: How headline-grabbing attacks could make this threat an organisation’s biggest nightmare. Netw Secur. 2016;11:7–13.
2. Nazario J. Politically motivated denial of service attacks. In: Czosseck C, Geers K, eds. The virtual battlefield perspectives on cyber warfare. Amsterdam: Ios Press, 2009, pp. 163–181.
3. Kaspersky DDoS attacks in Q3 2017. Kaspersky. Available online at https://securelist.com/ddos-attacks-in-q3-2017/83041/ (last accessed March 22, 2018).
4. Kolias C, Kambourakis G, Stavrou A, Voas J. DDoS in the IoT: Mirai and other botnets. Computer. 2017;50:80–84.
5. Kobojek P, Saeed K. Application of recurrent neural networks for user verification based on keystroke dynamics. J Telecommun Inf Technol. 2016;3:60–70.
6. Chandola V, Banerjee A, Kumar V. Anomaly detection: A survey. J ACM Comput Surv (CSUR). 2009;41:1–58.
7. Garcia-Teodoro P, Diaz-Verdejo J, Macia-Fernandez G, Vazquez E. Anomaly-based network intrusion detection: Techniques, systems and challenges. Comput Secur. 2009;28:18–28.
8. Aminanto ME, Kim K. Deep learning in intrusion detection system: An overview. In: Proceedings of the International Research Conference on Engineering and Technology, Jakarta, Indonesia, 2016, pp. 1–12. Available online at https://pdfs.semanticscholar.org/c0fa/578c1fae002e02834806a576d811002cb4a4.pdf (last accessed February 20, 2018).
9. Wang Z. The applications of deep learning on traffic identification. Blackhat USA, 2015, pp. 1–10.
10. Niyaz Q, Sun W, Javaid AY, Alam M. A Deep learning approach for network intrusion detection system. In: Proceedings of the 9th EAI International

Conference on Bio-inspired Information and Communications Technologies (BICT), New York, 2016, pp. 21–26.
11. Imamverdiyev YN, Sukhostat LV. Network traffic anomalies detection based on informative features. Radio Electron Comput Sci Control. 2017;3:113–120.
12. Diro AA, Chilamkurti N. Distributed attack detection scheme using deep learning approach for Internet of Things. Future Gener Comput Syst. 2018;82:761–768.
13. Li C, Wu Y, Yuan X, et al. Detection and defense of DDoS attack–based on deep learning in OpenFlow-based SDN. Int J Commun Syst. 2018:1–15.
14. Mirkovic J, Reiher P. A taxonomy of DDoS attack and DDoS defense mechanisms. ACM SIGCOMM Comput Commun Rev. 2004;34:39–53.
15. Peng T, Leckie C, Ramamohanarao K. Survey of network-based defense mechanisms countering the DoS and DDoS problems. ACM Comput Surv. 2007;39:1–42.
16. Loukas G, Oke G. Protection against denial of service attacks: A survey. Comput J. 2010;53:1020–1037.
17. Bhuyan MH, Kashyap HJ, Bhattacharyya DK, Kalita JK. Detecting distributed denial of service attacks: Methods, tools and future directions. Comput J. 2013;57:537–556.
18. Tama BA, Rhee KH. Data mining techniques in DoS/DDoS attack detection: A literature review. In: Proceedings of the 3rd International Conference on Computer Applications and Information Processing Technology (CAIPT 2015), Yangon, Myanmar, 2015, pp. 1–4.
19. Girma A, Garuba M, Goel R. Advanced machine language approach to detect DDoS attack using DBSCAN clustering technology with entropy. In: Latifi S, ed. Information Technology - New Generations. Advances in Intelligent Systems and Computing, 2018, vol. 558. Cham, Switzerland: Springer, pp. 125–131.
20. Deng L. A tutorial survey of architectures, algorithms, and applications for deep learning, APSIPA. Trans Signal Inf Process. 2014;3:1–29.
21. Vincent P, Larochelle H, Lajoie I, et al. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res. 2010;11:3371–3408.
22. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9:1735–1780.
23. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Adv Neural Inf Process Syst. 2012;25:1097–1105.
24. Deng L, Yu D. Deep learning: Methods and applications. Found Trends Signal Process. 2014;7:197–387.
25. Salama MA, Eid HF, Ramadan RA, et al. Hybrid Intelligent Intrusion Detection Scheme. In: Gaspar-Cunha A, Takahashi R, Schaefer G, Costa L, eds. Soft Computing in Industrial Applications. Advances in Intelligent and Soft Computing, 2011, vol. 96. Berlin, Heidelberg, Germany: Springer, pp. 293–303.
26. Li Y, Ma R, Jiao R. A hybrid malicious code detection method based on Deep Learning. Int J Secur Appl. 2015;9:205–216.
27. Ma T, Wang F, Cheng J, et al. A hybrid spectral clustering and deep neural network ensemble algorithm for intrusion detection in sensor networks. Sensors. 2016;16:1–23.
28. Gao N, Gao L, Gao Q. An intrusion detection model based on deep belief networks. In: Proceedings of the Second International Conference on Advanced Cloud and Big Data (CBD), Huangshan, China, 2014, pp. 247–252.
29. Fiore U, Palmieri F, Castiglione A, Santis AD. Network anomaly detection with the restricted Boltzmann machine. Neurocomputing. 2013;122:13–23.
30. Tang TA, Mhamdi L, McLernon D, et al. Deep Learning approach for network intrusion detection in software defined networking. In: Proceedings of the IEEE International Conference on Wireless Networks and Mobile Communications (WINCOM), Fez, Morocco, 2016, pp. 258–263.
31. Jung W, Kim S, Choi S. Poster: Deep learning for zero-day flash malware detection. In: Proceedings of the 36th IEEE Symposium on Security and Privacy, San Jose, California, 2015. Available online at https://www.ieee-security.org/TC/SP2015/program-posters.html (last accessed January 10, 2018).
32. Yadav S, Subramanian S. Detection of application layer DDoS attack by feature learning using stacked autoencoder. In: Proceedings of the International Conference on Computational Techniques in Information and Communication Technologies (ICCTICT), New Delhi, India, 2016, pp. 361–366.
33. Zolotukhin M, Hämäläinen T, Kokkonen T, Siltanen J. Increasing web service availability by detecting application-layer DDoS attacks in encrypted traffic. In: Proceedings of the 23rd International Conference on Telecommunications (ICT), Thessaloniki, Greece, 2016, pp. 1–6.
34. Memisevic R, Hinton GE. Learning to represent spatial transformations with factored higher-order Boltzmann machines. Neural Comput. 2010;22:1473–1492.
35. Larochelle H, Bengio Y. Classification using discriminative restricted Boltzmann machines. In: Proceedings of the 25th ACM International Conference on Machine Learning, Helsinki, Finland, 2008, pp. 536–543.
36. Saied A, Overill RE, Radzik T. Artificial neural networks in the detection of known and unknown DDoS attacks: Proof-of concept. Commun Comput Inf Sci. 2014;430:300–320.
37. Hinton GE, Salakhutdinov R. Reducing the dimensionality of data with neural networks. Science. 2006;313:504–507.
38. Alguliyev RM, Aliguliyev RM, Isazade NR. An unsupervised approach to generating generic summaries of documents. Appl Soft Comput. 2015;34:236–250.
39. Alguliev RM, Aliguliyev RM, Mehdiyev CA. pSum-SaDE: A modified p-median problem and selfadaptive differential evolution algorithm for text summarization. Appl Comp Intell Soft Comput. 2011;2011:13.
40. Xiao L, Shao Z, Liu G. K-means algorithm based on Particle Swarm Optimization algorithm for anomaly intrusion detection. In: The sixth world congress on intelligent control and automation, Dalian, China, 2006, pp. 5854–5854.

Cite this article as: Imamverdiyev Y, Abdullayeva F (2018) Deep learning method for denial of service attack detection based on restricted Boltzmann machine. Big Data 6:2, 159–169, DOI: 10.1089/big.2018.0023.

Abbreviations Used
BM = Boltzmann machine
CNN = convolutional neural network
DBM = deep BM
DBN = deep belief network
DDoS = distributed DoS
DNN = deep neural network
DoS = denial of service
NB = Naive Bayes
R2L = remote to local
RBM = restricted Boltzmann machine
RNN = recurrent neural networks
SDN = software defined networking
SVM = support vector machine
TN = true negative
TP = true positive
U2R = user to root
