ORIGINAL ARTICLE
Downloaded by SUNY Stony Brook package(NERL) from www.liebertpub.com at 06/21/18. For personal use only.
Abstract
In this article, the application of the deep learning method based on Gaussian–Bernoulli type restricted Boltzmann
machine (RBM) to the detection of denial of service (DoS) attacks is considered. To increase the DoS attack
detection accuracy, seven additional layers are added between the visible and the hidden layers of the RBM.
Accurate results in DoS attack detection are obtained by optimization of the hyperparameters of the proposed
deep RBM model. A form of the RBM that accepts continuous data is used. In this type of RBM,
the probability distribution of the visible layer is replaced by a Gaussian distribution. The accuracy of the
proposed method on DoS attack detection is compared with that of the Bernoulli–Bernoulli RBM, Gaussian–Bernoulli RBM,
and deep belief network type deep learning methods. Detection accuracy of the methods is verified
on the NSL-KDD data set. The proposed multilayer deep Gaussian–Bernoulli type RBM achieves
higher accuracy.
Keywords: deep learning; deep belief network; restricted Boltzmann machine; NSL-KDD
*Address correspondence to: Fargana Abdullayeva, Institute of Information Technology, Azerbaijan National Academy of Sciences, 9A, B. Vahabzade Street, AZ1141, Baku,
Azerbaijan, E-mail: a_farqana@mail.ru
160 IMAMVERDIYEV AND ABDULLAYEVA
focuses on the center of attention of network practitioners and security researchers.
There are the following types of DDoS attacks:
Volumetric attacks: consume the bandwidth of the target server; measured in bits per second. Examples: ICMP, UDP, and TCP flood attacks.
Protocol-based attacks: consume the resources of the target server; measured in packets per second. Examples: Ping flood, SYN flood, and Smurf attacks.
Application level attacks: by using the applica-

and the hidden neurons are varied according to the probability activation functions.
RBM is a BM that has no connections between the hidden layer neurons. Owing to this special bipartite graph structure, the probabilities of the hidden layer neurons can be computed explicitly. If a sufficient number of neurons are used in the hidden layer, an RBM can represent any discrete distribution.
RBM is a key structural unit for constructing the DBN. A DBN is a multilayer network in which the lower layers form a sigmoid belief network and the upper layer is an RBM.
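The claim that the bipartite structure makes the hidden probabilities explicitly computable can be checked numerically. The following is a minimal sketch, assuming binary units and the standard RBM energy with weights w[i][j], visible biases a, and hidden biases b; all numeric values and function names are hypothetical and chosen only for illustration. It compares the unit-by-unit sigmoid formula with a brute-force marginalization over every hidden state.

```python
import itertools
import math


def energy(v, h, w, a, b):
    """Standard RBM energy: -sum_ij v_i h_j w_ij - sum_i a_i v_i - sum_j b_j h_j."""
    pair_term = sum(v[i] * h[j] * w[i][j] for i in range(len(v)) for j in range(len(h)))
    visible_term = sum(ai * vi for ai, vi in zip(a, v))
    hidden_term = sum(bj * hj for bj, hj in zip(b, h))
    return -pair_term - visible_term - hidden_term


def hidden_probs_closed_form(v, w, b):
    """p(h_j = 1 | v), computed independently for each hidden unit with a sigmoid."""
    probs = []
    for j in range(len(b)):
        activation = b[j] + sum(v[i] * w[i][j] for i in range(len(v)))
        probs.append(1.0 / (1.0 + math.exp(-activation)))
    return probs


def hidden_probs_brute_force(v, w, a, b):
    """p(h_j = 1 | v) by enumerating all 2^n hidden states (feasible only for tiny n)."""
    n = len(b)
    states = list(itertools.product([0, 1], repeat=n))
    weights = [math.exp(-energy(v, h, w, a, b)) for h in states]
    total = sum(weights)
    return [
        sum(wt for h, wt in zip(states, weights) if h[j] == 1) / total
        for j in range(n)
    ]


# Hypothetical toy RBM: 3 visible units, 2 hidden units.
W = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
A = [0.0, 0.1, -0.1]
B = [0.2, -0.4]
V = [1, 0, 1]

closed = hidden_probs_closed_form(V, W, B)
brute = hidden_probs_brute_force(V, W, A, B)
```

Both lists agree up to floating-point error. The brute-force version grows exponentially with the number of hidden units, whereas the closed form is linear in it, which is the practical benefit of having no hidden-to-hidden connections.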
R2L, DoS, and Normal classes. Ref. 27 offers a hybrid approach based on the combination of spectral clustering and DNN for intrusion detection in sensor networks. In the first stage, the network features are clustered and divided into k subsets. In the second stage, a deep learning network is applied to the subsets generated in the clustering phase to obtain more understandable features, and at the end the attacks are detected by using a test set.
In Ref. 28, a DoS attack detection method based on the multilayer DBN is proposed. The DBN consists of numerous RBMs. Here, an RBM is trained first in the learning process. Then the features learned by this RBM are used as input data for training the RBM of the next layer of the DBN stack. The effectiveness of the DBN method is tested on the KDD CUP 1999 data set. The detection accuracy of the DBN model is better than that of the SVM and ANN methods. In Ref. 29, a network anomaly detection method based on a semisupervised approach is proposed. To carry out this analysis, the discriminative RBM is used. In the semisupervised anomaly detection system, the classifier is trained on the normal profile of the data, and any deviation from this state is modeled as an anomaly signal. To evaluate the effectiveness of the proposed method, experiments are conducted on the "real" KDD '99 data set. Here, instead of the 41 features of the data set, only the 28 features related to network traffic are selected. In the selection of hyperparameters, the neural network learning rate is set to 0.1, and experiments are carried out with the number of epochs set first to 15, then to 10 and 5.
In Ref. 30, a flow-based anomaly detection approach based on deep learning in the software defined networking (SDN) environment is proposed. In this study, a DNN for intrusion detection is constructed, and the model is trained on the NSL-KDD data set. Six features of the NSL-KDD data set are used. To evaluate the effectiveness of the model, the accuracy, precision, recall, and F-measure metrics are used. To detect attacks, the feature vector of the model consists of six features of the NSL-KDD data set: duration, protocol_type, src_bytes, dst_bytes, count, and srv_count. The difference of this work from others is that it uses simple initial processing and feature extraction technologies in the SDN context. The accuracy of the proposed neural network method is compared with that of the existing J48, Naive Bayes (NB), NB tree, random forest, random tree, multilayer perceptron, and SVM methods. When tested on all the features of the NSL-KDD data set, the deep learning method obtains a very low result compared with the mentioned methods, but when it is tested on six features, the method achieves a high accuracy of 75.75%.
In Ref. 31, an approach to detect malicious zero-day Adobe Flash applications is proposed. The proposed approach consists of three stages: (1) extraction, (2) conversion, and (3) classification. In the extraction stage, header, tag, and action features are extracted. The conversion stage converts the data generated at the extraction stage into (0, 1) values or matrices. The classification stage trains deep learning models on the data acquired at the conversion stage. If raw values are given, they are scaled to between 0 and 1 to reduce computational complexity. If a sequence of instructions or API calls is given, it is projected onto an N × M matrix, where N is the number of sequences and M is the number of types. At the end, the results of the deep learning methods are combined with the ensemble method. The proposed detection system is tested on randomly collected malicious Flash applications. In the data set, benign and malicious classes are used. The difference of our approach from existing works is that in this article a multilayered deep RBM architecture is formed by adding seven additional layers between the hidden and visible layers of the RBM. This increases the DoS attack detection efficiency.
In Ref. 32, a deep learning method is used to detect DDoS attacks in the web application layer. Application-level attacks are even more difficult to detect than DDoS attacks on the TCP and IP layers because in such attacks the attackers launch legitimate HTTP requests from genuine network computers. In that work, the Stacked AutoEncoder is applied for the extraction of higher level features.
In Ref. 33, for the detection of application-level DDoS attacks that use encrypted protocols, an anomaly detection approach based on statistics computed from network packets is applied. In the proposed scheme, negotiations between the web server and its clients are clustered and normal user behavior is established. With the help of the Stacked AutoEncoder, the distribution of negotiations over these clusters is analyzed. Client negotiations that deviate from the normal samples are classified as anomalies.
In Ref. 12, deep learning is used for the distributed detection of attacks in the IoT fog ecosystem. Experiments have shown that the distributed detection system outperforms the centralized system, and in terms of detection
DEEP LEARNING 163
accuracy, it is effective against shallow neural networks.13 For detecting DDoS attacks in the SDN environment, a deep learning-based defense system model is proposed. The model can learn samples from network traffic sequences and track network attack activities chronologically. Using a defense system based on this model, it is possible to efficiently clean DDoS attack traffic on SDN.
Summarizing the mentioned problems, the list of deep learning-based network intrusion detection methods can be described as follows (Table 1).

Restricted BM
RBM is a stochastic network of neurons made up of two layers: the visible and the hidden layers. The visible units correspond to the components of an observation, that is, one visible unit for each feature of an input pattern. The hidden units model dependencies between the components of observations, that is, the dependencies between the features. Energy-based model means that the probability distribution over the variables v and h is defined by an energy function. This function is described by Equations (1) and (2),36 first in its vectorial form in bold and second in its extended form:

$$E(\mathbf{v}, \mathbf{h}) = -\mathbf{h}^{T} W \mathbf{v} - \mathbf{a}^{T} \mathbf{v} - \mathbf{b}^{T} \mathbf{h}. \tag{1}$$

$$E(\mathbf{v}, \mathbf{h}) = -\sum_{i,j=1}^{m,n} v_i h_j w_{ij} - \sum_{i=1}^{m} a_i v_i - \sum_{j=1}^{n} b_j h_j. \tag{2}$$
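To make the two forms of the energy function concrete, the following minimal sketch evaluates the vector form of Equation (1) and the extended form of Equation (2) on a toy model and builds the resulting joint distribution by enumeration. It assumes binary units, a matrix W stored with one row per hidden unit (so that W[j][i] equals w[i][j]), and the usual Boltzmann form p(v, h) = exp(-E(v, h)) / Z; the parameter values and function names are hypothetical.

```python
import itertools
import math


def energy_vector_form(v, h, W, a, b):
    """Equation (1): E = -h^T W v - a^T v - b^T h, with W stored as n rows by m columns."""
    Wv = [sum(W[j][i] * v[i] for i in range(len(v))) for j in range(len(h))]
    hWv = sum(h[j] * Wv[j] for j in range(len(h)))
    aTv = sum(x * y for x, y in zip(a, v))
    bTh = sum(x * y for x, y in zip(b, h))
    return -hWv - aTv - bTh


def energy_extended_form(v, h, w, a, b):
    """Equation (2): explicit sums over i = 1..m and j = 1..n, with w indexed as w[i][j]."""
    m, n = len(v), len(h)
    pair_term = sum(v[i] * h[j] * w[i][j] for i in range(m) for j in range(n))
    visible_term = sum(a[i] * v[i] for i in range(m))
    hidden_term = sum(b[j] * h[j] for j in range(n))
    return -pair_term - visible_term - hidden_term


# Hypothetical parameters: m = 3 visible units, n = 2 hidden units.
w = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]           # w[i][j]
W = [[w[i][j] for i in range(3)] for j in range(2)]  # transpose of w, W[j][i]
a = [0.0, 0.1, -0.1]
b = [0.2, -0.4]

# All joint binary configurations (v, h) of the toy model.
configs = [
    (v, h)
    for v in itertools.product([0, 1], repeat=3)
    for h in itertools.product([0, 1], repeat=2)
]

# Partition function Z and joint distribution p(v, h) = exp(-E(v, h)) / Z.
Z = sum(math.exp(-energy_extended_form(v, h, w, a, b)) for v, h in configs)
joint = {
    (v, h): math.exp(-energy_extended_form(v, h, w, a, b)) / Z
    for v, h in configs
}
```

The two functions return the same value for every configuration, and the configuration with the lowest energy receives the highest probability. Enumerating Z is only possible for toy sizes, which is why practical training relies on the conditional distributions rather than on the joint.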
| Article | Feature extraction | Classifier | Goal | Data set | Classes |
|---|---|---|---|---|---|
| 25 | DBN | DBN and SVM | Attack detection | NSL-KDD | Normal, R2L, DoS, U2R, and Probing |
| 10 | (none) | Sparse auto-encoder and soft-max regression type self-taught learning | Network attack detection | NSL-KDD | Normal, R2L, DoS, U2R, and Probing |
| 26 | AutoEncoder | DBN | Malicious code detection | KDDCUP'99 | Normal, R2L, DoS, U2R, and Probing |
| 27 | Spectral clustering | Combination of the spectral clustering and deep neural network | Intrusion detection in sensor networks | KDD-Cup99, NSL-KDD, sensor network data set | Normal, R2L, DoS, U2R, and Probing |
| 28 | Normalized | DBN | Reducing volume of the data based on feature extraction and DoS attacks detection | KDD CUP 1999 | Normal, R2L, DoS, U2R, and Probing |
| 29 | Normalized | Discriminative RBM | Classification of network traffic in normal and anomaly classes | KDD '99 | Normal and anomaly |
| 31 | Normalized | Deep feed-forward neural network, deep recurrent neural network | Malicious zero-day Adobe Flash software detection | Synthetic data | Benign and malicious |

DBN, deep belief network; DoS, denial of service; RBM, restricted Boltzmann machine; U2R, user to root; R2L, remote to local; SVM, support vector machine.
Since RBMs do not have connections between neighboring neurons of the same layer, the units within a layer are conditionally independent given the other layer. This facilitates the calculation of the conditional probabilities, which are35:

$$p(\mathbf{h} \mid \mathbf{v}) = \prod_{j} p(h_j \mid \mathbf{v}). \tag{5}$$

$$p(\mathbf{v} \mid \mathbf{h}) = \prod_{i} p(v_i \mid \mathbf{h}). \tag{6}$$

The first versions of RBM were developed to solve problems with binary data. For binary data, the conditional probabilities take the form of Equations (7) and (8), given by Ref. 35:

$$p(h_j = 1 \mid \mathbf{v}) = \mathrm{sigm}\left(b_j + \sum_{i=1}^{m} v_i w_{ij}\right) \tag{7}$$

$$p(v_i = 1 \mid \mathbf{h}) = \mathrm{sigm}\left(a_i + \sum_{j=1}^{n} h_j w_{ij}\right), \tag{8}$$

where sigm(x) is the sigmoid function.
Using RBM for only binary data obviously limits its potential for problem solving. To maximize the accuracy, a form of the RBM that can deal with continuous data is used. The best known type of such RBM is the Gaussian type RBM. In this RBM, the probability distribution of the visible layer is changed to a Gaussian distribution. This variation is called Gaussian–Bernoulli RBM.37 The probability distribution of the Gaussian–Bernoulli type RBM is defined as follows:

$$p(h_j = 1 \mid \mathbf{v}) = \mathrm{sigm}\left(b_j + \frac{1}{\sigma^{2}} \sum_{i=1}^{m} v_i w_{ij}\right) \tag{9}$$

$$p(v_i \mid \mathbf{h}) = \mathcal{N}\left(a_i + \sum_{j=1}^{n} h_j w_{ij},\ \sigma^{2}\right), \tag{10}$$

where σ² is the variance of the Gaussian visible units.

Learning in RBM
The training of an RBM involves minimizing the negative log-likelihood, with the weight update given by

$$\Delta w_{ij} = \varepsilon \frac{\partial \log p(V)}{\partial w_{ij}} = \varepsilon \left( \langle v_i h_j \rangle_{d} - \langle v_i h_j \rangle_{m} \right), \tag{11}$$

where ε is the learning rate, and ⟨·⟩_d and ⟨·⟩_m represent the expected values under the data and the model, respectively.
The expectations can be obtained by Equations (7) and (8) for binary data, whereas Equations (9) and (10) give expectations for continuous data.

Data set description
For conducting experiments in the classification process, the NSL-KDD data set is used, which is a fundamental data set for evaluating the effectiveness of proposed approaches in the field of network intrusion detection systems.
Each record of the NSL-KDD data set consists of 41 features (e.g., protocol type, service, and flag) and these
SVM radial basis 0.7161 0.7400 0.7173 0.6096 0.9416 0.9416 0.5464 0.5464 0.9416 0.7445
SVM (epsilon-SVR) 0.7269 0.7550 0.7251 0.6139 0.9804 0.9804 0.5363 0.5363 0.9804 0.6998
Decision tree 0.6744 0.7190 0.6620 0.5710 0.9705 0.9705 0.4516 0.4516 0.9705 0.7157
Gaussian–Bernoulli RBM 0.7323 0.7530 0.7348 0.6233 0.9509 0.9509 0.5678 0.5678 0.9509 0.7591
records are labeled as normal and specific types of attacks. Here the attacks are divided into four classes:
DoS: For example, Neptune, Smurf, Pod and Tear-
and average value indexes are used. In Table 2, comparative analysis of RBM results with classic classification algorithms on various metrics is described.
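This excerpt does not name the column headers of Table 2, so the following is only a sketch of how comparison metrics of this kind are commonly derived from true positives (TP), true negatives (TN), false positives, and false negatives. It assumes the usual definitions of accuracy, precision, recall, and F-measure, treats the attack class as the positive label, and uses made-up labels; the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN for a binary labeling where `positive` marks an attack."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def detection_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F-measure; undefined ratios are reported as 0.0."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Made-up labels: 1 = DoS attack, 0 = normal traffic.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
scores = detection_metrics(*counts)
```

Reporting precision and recall alongside accuracy matters on imbalanced data, because a classifier can reach high accuracy while missing most records of a rare class.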
Table 4. Restricted Boltzmann machine denial of service attack detection statistics based on the number of epochs
As seen in Figure 5, if we increase the number of iterations from two to five, the DoS attack detection accuracy of the Gaussian–Bernoulli RBM gradually increases. But if we increase this value from 5 to 65, the detection accuracy of the algorithm gradually decreases. A visual representation of this case is depicted in Figure 6.
Attacks on the test data are divided into four categories: probe (e.g., IP sweep, vulnerability scanning), DoS (e.g., mail bomb, UDP storm), U2R (e.g., buffer overflow attacks, rootkits), and R2L (e.g., password guessing, worm attack).
Since the number of records in the probe and DoS classes is much larger than in U2R and R2L, there is always a decrease in the attack detection effectiveness of the algorithm.36 This condition is associated with the inability of the algorithm to handle the U2R and R2L classes well. Therefore researchers try to study the attack classes separately. The number of points in the DoS, U2R, R2L, probe (attacks), and normal classes, and the ability of the Gaussian–Bernoulli RBM method to detect these points at different iteration values, are given in Table 4.
As the number of iterations increases in the Gaussian–Bernoulli RBM method, the points from the R2L class are detected, but in this case, the detection accuracy of the method decreases. This is because many false detections occur. In the experiments, the points from the probe, DoS, and normal classes are detected with high accuracy; however, the U2R and R2L attack classes give the worst results compared with the other classes. It should be noted that the U2R and R2L classes are in general less frequent in the data and a lower accuracy should, therefore, be expected.

Conclusions
In this article, the discriminative RBM was applied to the network attack detection problem. The results of the experiments on the NSL-KDD data set showed that the proposed multilayer deep Gaussian–Bernoulli RBM method gave better results in detecting attacks than the other Bernoulli–Bernoulli RBM and DBN type deep learning methods. This method also outperforms the SVM radial basis, SVM (epsilon-SVR), and decision tree type machine learning methods.
We have shown that our model significantly improves the detection accuracy on DoS attack detection tasks compared with previous work. In future work we also plan on experimenting with LSTM decoders as well as deep and bidirectional LSTM encoders.

Acknowledgment
This work was supported by the Science Development Foundation under the President of the Republic of Azerbaijan, Grant No. EIF-KETPL-2-2015-1(25)-56/05/1.

Author Disclosure Statement
No competing financial interests exist.

References
1. Mansfield-Devine S. DDoS goes mainstream: How headline-grabbing attacks could make this threat an organisation’s biggest nightmare. Netw Secur. 2016;11:7–13.
2. Nazario J. Politically motivated denial of service attacks. In: Czosseck C, Geers K, eds. The virtual battlefield perspectives on cyber warfare. Amsterdam: Ios Press, 2009, pp. 163–181.
3. Kaspersky DDoS attacks in Q3 2017. Kaspersky. Available online at https://securelist.com/ddos-attacks-in-q3-2017/83041/ (last accessed March 22, 2018).
4. Kolias C, Kambourakis G, Stavrou A, Voas J. DDoS in the IoT: Mirai and other botnets. Computer. 2017;50:80–84.
5. Kobojek P, Saeed K. Application of recurrent neural networks for user verification based on keystroke dynamics. J Telecommun Inf Technol. 2016;3:60–70.
6. Chandola V, Banerjee A, Kumar V. Anomaly detection: A survey. J ACM Comput Surv (CSUR). 2009;41:1–58.
7. Garcia-Teodoro P, Diaz-Verdejo J, Macia-Fernandez G, Vazquez E. Anomaly-based network intrusion detection: Techniques, systems and challenges. Comput Secur. 2009;28:18–28.
8. Aminanto ME, Kim K. Deep learning in intrusion detection system: An overview. In: Proceedings of the International Research Conference on Engineering and Technology, Jakarta, Indonesia, 2016, pp. 1–12. Available online at https://pdfs.semanticscholar.org/c0fa/578c1fae002e02834806a576d811002cb4a4.pdf (last accessed February 20, 2018).
9. Wang Z. The applications of deep learning on traffic identification. Blackhat USA, 2015, pp. 1–10.
10. Niyaz Q, Sun W, Javaid AY, Alam M. A Deep learning approach for network intrusion detection system. In: Proceedings of the 9th EAI International
Conference on Bio-inspired Information and Communications Technologies (BICT), New York, 2016, pp. 21–26.
11. Imamverdiyev YN, Sukhostat LV. Network traffic anomalies detection based on informative features. Radio Electron Comput Sci Control. 2017;3:113–120.
12. Diro AA, Chilamkurti N. Distributed attack detection scheme using deep learning approach for Internet of Things. Future Gener Comput Syst. 2018;82:761–768.
13. Li C, Wu Y, Yuan X, et al. Detection and defense of DDoS attack–based on deep learning in OpenFlow-based SDN. Int J Commun Syst. 2018:1–15.
14. Mirkovic J, Reiher P. A taxonomy of DDoS attack and DDoS defense mechanisms. ACM SIGCOMM Comput Commun Rev. 2004;34:39–53.
15. Peng T, Leckie C, Ramamohanarao K. Survey of network-based defense mechanisms countering the DoS and DDoS problems. ACM Comput Surv. 2007;39:1–42.
16. Loukas G, Oke G. Protection against denial of service attacks: A survey. Comput J. 2010;53:1020–1037.
17. Bhuyan MH, Kashyap HJ, Bhattacharyya DK, Kalita JK. Detecting distributed denial of service attacks: Methods, tools and future directions. Comput J. 2013;57:537–556.
18. Tama BA, Rhee KH. Data mining techniques in DoS/DDoS attack detection: A literature review. In: Proceedings of the 3rd International Conference on Computer Applications and Information Processing Technology (CAIPT 2015), Yangon, Myanmar, 2015, pp. 1–4.
19. Girma A, Garuba M, Goel R. Advanced machine language approach to detect DDoS attack using DBSCAN clustering technology with entropy. In: Latifi S, ed. Information Technology - New Generations. Advances in Intelligent Systems and Computing, 2018, vol. 558. Cham, Switzerland: Springer, pp. 125–131.
20. Deng L. A tutorial survey of architectures, algorithms, and applications for deep learning. APSIPA Trans Signal Inf Process. 2014;3:1–29.
21. Vincent P, Larochelle H, Lajoie I, et al. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res. 2010;11:3371–3408.
22. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9:1735–1780.
23. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Adv Neural Inf Process Syst. 2012;25:1097–1105.
24. Deng L, Yu D. Deep learning: Methods and applications. Found Trends Signal Process. 2014;7:197–387.
25. Salama MA, Eid HF, Ramadan RA, et al. Hybrid Intelligent Intrusion Detection Scheme. In: Gaspar-Cunha A, Takahashi R, Schaefer G, Costa L, eds. Soft Computing in Industrial Applications. Advances in Intelligent and Soft Computing, 2011, vol. 96. Berlin, Heidelberg, Germany: Springer, pp. 293–303.
26. Li Y, Ma R, Jiao R. A hybrid malicious code detection method based on Deep Learning. Int J Secur Appl. 2015;9:205–216.
27. Ma T, Wang F, Cheng J, et al. A hybrid spectral clustering and deep neural network ensemble algorithm for intrusion detection in sensor networks. Sensors. 2016;16:1–23.
28. Gao N, Gao L, Gao Q. An intrusion detection model based on deep belief networks. In: Proceedings of the Second International Conference on Advanced Cloud and Big Data (CBD), Huangshan, China, 2014, pp. 247–252.
29. Fiore U, Palmieri F, Castiglione A, Santis AD. Network anomaly detection with the restricted Boltzmann machine. Neurocomputing. 2013;122:13–23.
30. Tang TA, Mhamdi L, McLernon D, et al. Deep Learning approach for network intrusion detection in software defined networking. In: Proceedings of the IEEE International Conference on Wireless Networks and Mobile Communications (WINCOM), Fez, Morocco, 2016, pp. 258–263.
31. Jung W, Kim S, Choi S. Poster: Deep learning for zero-day flash malware detection. In: Proceedings of the 36th IEEE Symposium on Security and Privacy, San Jose, California, 2015. Available online at https://www.ieee-security.org/TC/SP2015/program-posters.html (last accessed January 10, 2018).
32. Yadav S, Subramanian S. Detection of application layer DDoS attack by feature learning using stacked autoencoder. In: Proceedings of the International Conference on Computational Techniques in Information and Communication Technologies (ICCTICT), New Delhi, India, 2016, pp. 361–366.
33. Zolotukhin M, Hämäläinen T, Kokkonen T, Siltanen J. Increasing web service availability by detecting application-layer DDoS attacks in encrypted traffic. In: Proceedings of the 23rd International Conference on Telecommunications (ICT), Thessaloniki, Greece, 2016, pp. 1–6.
34. Memisevic R, Hinton GE. Learning to represent spatial transformations with factored higher-order Boltzmann machines. Neural Comput. 2010;22:1473–1492.
35. Larochelle H, Bengio Y. Classification using discriminative restricted Boltzmann machines. In: Proceedings of the 25th ACM International Conference on Machine Learning, Helsinki, Finland, 2008, pp. 536–543.
36. Saied A, Overill RE, Radzik T. Artificial neural networks in the detection of known and unknown DDoS attacks: Proof-of-concept. Commun Comput Inf Sci. 2014;430:300–320.
37. Hinton GE, Salakhutdinov R. Reducing the dimensionality of data with neural networks. Science. 2006;313:504–507.
38. Alguliyev RM, Aliguliyev RM, Isazade NR. An unsupervised approach to generating generic summaries of documents. Appl Soft Comput. 2015;34:236–250.
39. Alguliev RM, Aliguliyev RM, Mehdiyev CA. pSum-SaDE: A modified p-median problem and self-adaptive differential evolution algorithm for text summarization. Appl Comp Intell Soft Comput. 2011;2011:13.
40. Xiao L, Shao Z, Liu G. K-means algorithm based on Particle Swarm Optimization algorithm for anomaly intrusion detection. In: The sixth world congress on intelligent control and automation, Dalian, China, 2006, pp. 5854–5854.

Cite this article as: Imamverdiyev Y, Abdullayeva F (2018) Deep learning method for denial of service attack detection based on restricted Boltzmann machine. Big Data 6:2, 159–169, DOI: 10.1089/big.2018.0023.

Abbreviations Used
BM = Boltzmann machine
CNN = convolutional neural network
DBM = deep BM
DBN = deep belief network
DDoS = distributed DoS
DNN = deep neural network
DoS = denial of service
NB = Naive Bayes
R2L = remote to local
RBM = restricted Boltzmann machine
RNN = recurrent neural networks
SDN = software defined networking
SVM = support vector machine
TN = true negative
TP = true positive
U2R = user to root