
International Conference on Energy, Communication, Data Analytics and Soft Computing (ICECDS-2017)

Data Mining Techniques in Detecting and Predicting Cyber Crimes in Banking Sector

K. Chitra Lekha
Ph.D Research Scholar
Department of Computer Science and Applications
SCSVMV University, Kanchipuram, India
chithuparthi03@gmail.com

Dr. S. Prakasam
Associate Professor
Department of Computer Science and Applications
SCSVMV University, Kanchipuram, India
sp@kanchiuniv.ac.in

978-1-5386-1887-5/17/$31.00 ©2017 IEEE

Abstract—Data mining applications are used across the banking sector for client segmentation and productivity, credit scoring and authorization, predicting payment default, advertising, detecting fraudulent transactions, etc. This paper presents an overview of data mining techniques and of the diverse cyber crimes that affect banking applications. It also surveys effective data mining techniques for cyber crime data analysis. The objective of cyber crime data mining is to recognize patterns in criminal behavior in order to predict crime, anticipate criminal activity and prevent it. This paper implements novel data mining techniques such as K-Means, the Influenced Association Classifier and the J48 prediction tree to investigate cyber crime data sets and address the existing problems. The K-Means algorithm is used for unsupervised clustering within Influenced Association Classification. K-Means selects the initial centroids so that the classifier can mine the records and formulate predictions of cyber crimes with the J48 algorithm. The combination of K-Means, the Influenced Association Classifier and the J48 prediction tree tends to give an improved, integrated and precise result for cyber crime prediction in the banking sector. Law enforcement organizations need to be adequately equipped to defeat and prevent cyber crime.

Keywords—Cyber crime, Data mining, clustering, Influenced Association Classification, J48.

I. INTRODUCTION

Data mining is the computer-assisted process of digging through and analyzing large amounts of data and then extracting the meaning of the data. It is also the process of analyzing data from different perspectives and summarizing it into useful information [1]. Data mining prediction techniques can improve the accuracy, performance and speed of cyber crime prediction. Cyber crime has been increasing in complexity and financial cost since corporations started to use computers in the course of doing business. Cyber criminals are becoming more sophisticated and are targeting consumers as well as public and private organizations [2]. Cyber crime analysis is a very important responsibility of the law enforcement system in any country. Cyber crime involves the breach of privacy, or damage to computer system assets such as files, web pages or software [3].

It is essential to develop a high-quality cyber crime tool that recognizes crime patterns quickly and efficiently so that future cyber crime patterns can be detected. The banking sector has been a hotspot for cyber crime, whether natural or unnatural. As technology becomes an indispensable part of banking, it has made things easier for users but also for attackers, who now have more ways to exploit vulnerabilities [4]. The banking sector is prone to many disruptions caused by a variety of threat categories; these threats fall into distinct groups, namely cyber fraud, business continuity planning and information security. Cyber crimes committed against banks include hacking, credit card fraud, money laundering, DoS attacks, phishing, salami attacks, ATM card cloning, etc. Cyber threats such as pharming, phishing and being lured into revealing private details, as in identity theft, are the security concerns in the minds of customers in the banking and financial sectors. Detecting cyber crime can also be extremely difficult, because the many online business transactions and heavy network traffic generate an enormous quantity of data, only a fraction of which relates to illegal activity.

Crime prediction analyzes past data in order to predict future crimes along with their location and time. At present, serial criminal cases occur in rapid succession, so predicting future crime accurately and with good performance is a challenging task [5]. Credit card and web-based crimes are increasing as more technologies emerge. Clustering and classification techniques are applied to deal with and overcome fraud [6]. Fraud detection is a difficult process, not only technically but also in crime investigation. Fraud detection methods are based on simple comparisons as well as on association, clustering, prediction and outlier detection [7]. Association rule mining generates the "n" best association rules for a selected n, and Classification and Regression Trees (CART) predict categorical class labels [8]. Clustering can be formulated as a multi-objective optimization problem. The appropriate clustering algorithm and parameter settings depend on the individual dataset and the intended use of the results. Clustering as such is not an automatic task, but an iterative process of knowledge discovery or interactive multi-objective optimization that involves trial and error.

Cyber crimes in banking transactions can be reduced by applying up-to-date technology and
appointing reliable officers and devices [9]. Cyber attacks on the banking sector may take the form of unauthorized access, destruction, corruption or alteration of data, or any kind of malicious procedure intended to cause network failure, reboot or hang. Current security systems have made cracking very tedious, but not impossible. Some basic level of security must therefore be established before business on the internet can be conducted reliably. To protect against cyber crimes, intrusion detection methods should be designed, implemented and managed. The proposed model is designed to access a massive quantity of criminal records so that predictions can be made from the past behavior of criminals. As the number of records in the system grows, there is no need to insert the information. In the current digital era, protecting massive quantities of data against a diverse community of cyber criminals is a huge challenge.

II. LITERATURE SURVEY

Dr. Zakaria Suliman Zubi and Ayman Altaher Mahmmud (2014) proposed a model for analyzing crime and criminal data that uses the simple K-means algorithm for clustering and the Apriori algorithm for association rules. The model is intended to help specialists discover patterns and trends, make forecasts, find relationships and possible explanations, map criminal networks and identify possible suspects. They reported promising results for their proposed model, based on the crime and criminal attributes and the output of the K-means algorithm. They also gave overall statistics on criminal age versus crime type, which provided the input to the K-means algorithm [10].

Rasoul Kiani, Silamak Mahdavi and Amin Keshavarzi (2015) applied a theoretical model based on data mining techniques such as clustering and classification to a real crime dataset recorded by police in England and Wales between 1990 and 2011. They assigned weights to the features in order to improve the quality of their model and removed the low-valued ones. They employed a genetic algorithm to optimize the parameters of the outlier detection operator using the RapidMiner tool [11].

Dr. K. Chitra and B. Subhashini (2013) analyzed data mining techniques and their applications in the banking sector, such as fraud detection and prevention, customer retention, marketing and risk management. They discussed the need for data mining techniques in banking for better targeting and acquisition of new customers, retention of the most valuable customers, automatic credit approval, which is used for fraud detection and prevention in real time, segment-based products, analysis of customers' transaction patterns over time for better retention and relationships, risk management and marketing [12].

Akshay Kumar Singh, Neha Prasad, Nohil Narkhede and Siddharth Mehta (2016) described a system for analyzing crime and discussed how to increase the accuracy of crime prediction in order to prevent crimes. They used the Apriori algorithm to identify trends and patterns in crime. They also used decision trees for crime prediction because they are robust and can be applied to large datasets. Their system gives its result in the form of a crime class or category, on the basis of which local law enforcement agencies can deploy effective measures to avoid crime and secure the neighborhood. It can also predict the time frame of crimes if the collected data is rich enough [13].

Tushar Sonaqwanev, Shirin Shaikh, Shaista Shaikh, Rahul Shinde and Asif Sayyad (2015) grouped crime data according to the various types of crimes committed against women in different states and cities of India. They used the K-means algorithm for clustering, Pearson's correlation coefficient for correlating crimes between two variables, and linear regression for crime prediction [14].

Uttam Mande, Y. Srinivas and J. V. R. Murthy (2012) collected a crime dataset from the Andhra Pradesh police department and aimed to identify a criminal based on witness statements or clues at the crime scene. They used binary clustering and classification techniques to analyze the criminal data. They also tried to identify the criminal by mapping criminals using autocorrelation; the way in which the incident took place and the features of the crime were considered in order to confirm the criminal [15].

Lawrence McClenden and Natarajan Meghanathan (2015) showed how effective and accurate the machine learning algorithms used in data mining analysis can be at predicting violent crime patterns. Using the WEKA tool, they observed that the Linear Regression algorithm was more effective and accurate than the Additive Regression and Decision Stump algorithms when all three were run with the same finite set of features on the Communities and Crime dataset [16].

Javad Hosseinkhani, Suhaimi Ibrahim, Suriyati Chuprat and Javid Hosseinkhani Naniz (2014) reviewed how useful information can be extracted by means of data mining in order to find crime hot spots and predict crime trends for them, and evaluated state-of-the-art approaches for doing so [17].

Raghavendra Patidar and Lokesh Sharma (2011) tried to detect fraudulent transactions using a neural network together with a genetic algorithm. They used genetic algorithms to decide the network topology, the number of hidden layers and the number of nodes to be used in the design of the neural network for their credit card fraud detection problem. To train the artificial neural network they used the supervised feed-forward back-propagation algorithm [18].

Anisha Agarwal, Dhanashree Chougule, Arpita Agarwal and Divya Chimote (2016) used frequent pattern mining with association rule mining to analyze the various crimes committed by a criminal and to predict the chance of each crime being committed again by that criminal. The prediction was based on attributes such as criminal record, education, occupation, friend circle, family background and other factors. They implemented the Apriori algorithm to generate frequent item sets. They designed an application
that is accessible and available to authorized users anytime and anywhere, with the main functionality of predicting further crimes by an individual criminal [20].

Linda Delamaire, Hussein Abdou and John Pointon (2009) set out to identify the different types of credit card fraud and to review alternative techniques that have been used in fraud detection. They discussed how decision trees, genetic algorithms, clustering techniques and neural networks can be used as detection techniques for credit card fraud. They identified different types of fraud, such as bankruptcy fraud and counterfeit fraud, and discussed measures to detect them using data mining techniques [21].

Atul Bamrara, Gajendra Singh and Mamta Bhati (2013) attempted to reveal the varied cyber attack strategies adopted by cyber criminals to target selected banks in India, where spoofing, brute overflow, etc. were found to be positively correlated with public and private sector banks. Their findings also showed a positive correlation between intrusion detection and cyber attacks, and between system monitoring and online identity theft, DoS attacks, and credit card or ATM fraud [22].

III. PROPOSED MODEL FOR CYBER CRIME PREDICTION

Evaluating the progression of cyber crime involves discovering and exploring cyber crimes and investigating their relationships with the criminals behind them. The proposed work presents a model for cyber crime prediction that uses the K-Means clustering technique, the Influenced Association Classifier and the J48 classifier. The proposed model provides an improved prediction outcome for cyber crime in the banking sector. The Influenced Association Classifier offers a systematic way to combine classification with association rule mining, which improves classification accuracy. It also employs an influenced (weighted) support and confidence framework for extracting association rules from the crime database. Integrating the J48 technique with K-Means and the Influenced Association Classifier provides an improved prediction of cyber crime threats in the banking sector.

Fig. 1. Proposed model for Cyber crime prediction

A. Collection of cyber crime dataset

A variety of cyber crime data has to be collected in order to predict the class of cyber crime in the banking sector by analyzing crime patterns. This data is collected from various news feeds, articles, blogs and police department websites on the internet. The collected cyber crime data is stored in a crime database for further processing.

B. Pre-processing of cyber crime dataset

The cyber crime dataset stored in the crime database has to be pre-processed before data mining techniques are applied to it, because pre-processing removes noisy data, missing values, etc.

C. Data mining Techniques

Data mining techniques and algorithms are applied to the pre-processed data to identify or predict fraud through knowledge discovery from unusual patterns; data mining has also gained recognition in combating cyber credit-card fraud. Data mining helps to solve problems in the banking sector by discovering patterns, relationships and links that are hidden in the business information accumulated in the crime databases.

1. Association Rule mining

Association rule mining produces rules for the cyber crime dataset based on frequently occurring crime patterns. The generated rules help decision makers in security organizations to take preventive action. The procedure comprises the following steps:

• Finding the frequently occurring item sets in the cyber crime database.
• Identifying patterns in program execution and user behavior as association rules, known as intrusion detection.

2. Clustering

Splitting a set of records or items into a number of groups is called clustering. Clustering is applied to discover relationships between cyber crime and criminal
characteristics that share previously unknown common features. Clustering techniques are used to discover fraud in the banking sector. Clustering is termed unsupervised learning because its classes are not defined and determined in advance, and the grouping of data is done without supervision. The K-means partitioning algorithm is used to cluster cyber crime datasets because of its simplicity and low computational complexity. First, the data items are grouped into a specified number "k" of clusters. The mean value is calculated from the mean distances between objects. An iterative relocation method is then used to improve the partitions by moving items from one cluster to another. The iterations are repeated until convergence occurs.

3. Classification

Classification is the most frequently used data mining technique; it uses a set of pre-classified examples to build a model that can classify records at large scale. The classification technique establishes a relationship between a dependent variable and an independent variable by mapping the data points. Within a given dataset, classification is used to determine the group to which each data instance belongs [19]. Classification is also used to build models of unknown patterns and to estimate future outcomes on the basis of previous decisions. Automatic credit approval is one of the most important processes in the banking sector and financial organizations. Fraud can be prevented by building a good credit approval model using classification models based on decision trees such as J48, CART, etc.

4. Influenced Association Classification

To achieve higher accuracy, associative classification is a new and improved method that integrates association rule mining and classification into the prediction model. This method is used to find the links and associations among item sets. The association rule mining part comes under unsupervised learning, since it does not involve any class attribute for rule extraction. Two steps are employed:

1. Classes are generated from the cyber crime data set based on the association rules.
2. The classification of the dataset is analyzed with respect to the class labels.

Influenced Association Classification is a new concept for rule-based classification. It uses a weighted support and confidence framework for mining association rules from the cyber crime data set. The steps implemented in the Influenced Association Classifier are summarized below:

• First, the cyber crime dataset is pre-processed so that further mining can be performed on it.
• Every attribute is assigned a weight within a range, to reflect its importance in the prediction model.

5. Cyber crime Prediction using J48

For classification problems in cyber crime prediction analysis, the J48 algorithm is sharp and precise. J48 involves two steps:

• Building the tree.
• Validating the built tree against the cyber crime data set.

The J48 algorithm uses pruning when constructing the tree. Pruning reduces the size of the tree by removing overfitted data that leads to poor prediction performance. The J48 algorithm classifies the data until it is categorized as completely as possible, which gives maximum accuracy on the cyber crime training data. It also balances accuracy and flexibility. J48 is an implementation of the C4.5 decision tree algorithm. It produces the classifier output in the form of rule sets and a decision tree. The rule sets are easy to understand and easy to use within an application.

IV. CONCLUSION

The proposed model offers an improved approach to cyber crime prediction by combining data mining techniques, namely K-Means and Influenced Association Classification with the J48 prediction tree. Influenced Association Classification is an improved model for classification and association that uses weighted support and confidence measures. The K-Means algorithm clusters the item sets from the cyber crime datasets. Classification performance and accuracy can be improved by using K-Means and Influenced Association Classification together with the J48 prediction tree. In the banking sector, customers have to be supported by specific features in the application software that raise an alert when a serious intrusion is detected. Intrusion detection tools should be deployed wherever practicable and reviewed on a regular basis. To fight cyber attacks, customer education must be organized in cooperation with government and other private organizations. Awareness programs should be put into practice to ensure that customers understand data issues, the level of privacy, and how to make banking transactions secure.

References
[1] Ms. H. N. Gangavane and Prof. Ms. M. C. Nikose, “A Survey on Document Clustering for identifying Criminal”, IJRITCC, Vol. 2, Issue 2, February 2015, pp. 459-463.
[2] Ms M. Lakshmi Prasanthi and Tata A S K Ishwarya, “Cyber Crime Prevention & Detection”, IJARCCE, Vol. 4, Issue 3, March 2015, pp. 45-48.
[3] Shubham Kumar, Dr. Santanu Koley and Uday Kumar, “Present Scenario of Cyber Crime in INDIA and its Preventions”, IJSER, Vol. 6, Issue 4, April 2015, pp. 1972-1976.
[4] Manpreet Kumar, Divya Bansal and Sanjeev Sofat, “Study of Cyber Frauds and BCP Related Attacks in Financial Institutes”, IJICT, Vol. 4, Issue 16, 2014, pp. 1647-1652.
[5] Nikhil Dubhey and Setu Kumar Chaturvedi, “A Survey of Crime Prediction Technique using Data mining”, IJERA, Vol. 4, Issue 3, March 2014, pp. 396-400.
[6] Muhammad Arif, Khabaib Amjad Alam and Mehdi Hussain, “Crime Mining: A Comprehensive Survey”, International Journal of u- and e- Service, Science and Technology, Vol. 8, Issue 2, 2015, pp. 357-364.
[7] Dr. R. Jayabrabu, Dr. V. Saravanan and Dr. J. Jebamalar Tamilselvi, “A Framework for Fraud Detection System in Automated Data mining using Intelligent agent for better Decision making process”, Science Direct, March 2014.
[8] K. Chitra Lekha and Dr. S. Prakasam, “An analysis of finding the Influencing Factors of supporting for the “GiveitUp” LPG Subsidy for the Government using Data mining Techniques”, IJCA, Vol. 143, Issue 5, June 2016, pp. 34-39.
[9] Dr. M. Imran Siddique and Sama Rehman, “Impact of Electronic Crime in Indian Banking Sector”, IJBIT, Vol. 1, Issue 2, September 2011, pp. 159-164.
[10] Dr. Zakaria Suliman Zubi and Ayman Altaher Mahmmud, “Crime Data Analysis using Data mining Techniques to Improve Crimes Prevention”, International Journal of Computers, Vol. 8, 2014, pp. 39-45.
[11] Rasoul Kiani, Silamak Mahdavi and Amin Keshavarzi, “Analysis and Prediction of Crimes by Clustering and Classification”, IJARAI, Vol. 4, Issue 8, 2015, pp. 1-7.
[12] Dr. K. Chitra and B. Subhashini, “Data mining Techniques and its Applications in Banking Sector”, IJETAE, Vol. 3, Issue 8, August 2013, pp. 219-226.
[13] Akshay Kumar Singh, Neha Prasad, Nohil Narkhede and Siddharth Mehta, “Crime: Classification and Pattern Prediction”, IARJSET, Vol. 3, Issue 2, February 2016, pp. 41-43.
[14] Tushar Sonaqwanev, Shirin Shaikh, Shaista Shaikh, Rahul Shinde and Asif Sayyad, “Crime Pattern Analysis, Visualization and Prediction using Data Mining”, IJARIIE, Vol. 1, Issue 4, 2015, pp. 681-686.
[15] Uttam Mande, Y. Srinivas and J. V. R. Murthy, “An Intelligent Analysis of Crime Data using Data mining & Auto Correlation Models”, IJERA, Vol. 2, Issue 4, August 2012, pp. 149-153.
[16] Lawrence McClenden and Natarajan Meghanathan, “Using Machine Learning Algorithms to Analyze Crime Data”, Machine Learning and Applications: An International Journal, Vol. 2, Issue 1, March 2015, pp. 1-12.
[17] Javad Hosseinkhani, Suhaimi Ibrahim, Suriyati Chuprat and Javid Hosseinkhani Naniz, “Web Crime mining by means of Data mining Techniques”, Research Journal of Applied Sciences, Engineering and Technology, Vol. 7, Issue 10, 2014, pp. 2027-2032.
[18] Raghavendra Patidar and Lokesh Sharma, “Credit card Fraud Detection using Neural Network”, IJSCE, Vol. 1, Issue NCAI2011, May 2011.
[19] K. Chitra Lekha and Dr. S. Prakasam, “Performance Assessment of Different Classification Techniques”, CiiT International Journal of Data mining and Knowledge Engineering, Vol. 9, Issue 1, January 2017, pp. 20-23.
[20] Anisha Agarwal, Dhanashree Chougule, Arpita Agarwal and Divya Chimote, “Application for Analysis and Prediction of Crime data using Data mining”, Proceedings of IRF-IEEEforum International Conference, India, April 2016, pp. 35-38.
[21] Linda Delamaire, Hussein Abdou and John Pointon, “Credit Card fraud and Detection techniques: A review”, Banks and Bank Systems, UK, Vol. 4, Issue 4, 2009.
[22] Atul Bamrara, Gajendra Singh and Mamta Bhati, “Cyber Attacks and Defense Strategies in India: An Empirical Assessment of Banking Sector”, IJCC, Vol. 7, Issue 7, January-June 2013, pp. 49-61.
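
As a rough illustration of the K-means procedure described under Clustering in Section III-C (fix k clusters, compute cluster means, relocate items between clusters, and repeat until convergence), the following is a minimal sketch in Python. The function name `kmeans`, the toy transaction features (amount, hour of day) and the use of the first k points as initial centroids are assumptions made for illustration; the paper does not specify them.

```python
import math

def kmeans(points, k, max_iter=100):
    """Plain K-means on a list of equal-length numeric tuples."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:  # converged: no centroid moved
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical toy data: (transaction amount, hour of day).
transactions = [(20.0, 10.0), (25.0, 11.0), (22.0, 9.0),
                (900.0, 3.0), (950.0, 2.0), (980.0, 4.0)]
labels, centroids = kmeans(transactions, k=2)
```

On this toy input the small daytime transactions and the large night-time ones end up in separate clusters. A production system would normally scale the features first and choose the initial centroids more carefully.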

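Section III-C describes the Influenced Association Classifier as mining association rules with weighted support and confidence, after assigning each attribute a weight. The sketch below shows one plausible way such measures could be computed. The weighting scheme (a record's weight is the mean of its item weights, and weighted support is the share of total record weight held by the records containing an item set), the item names and the weights are all assumptions for illustration and are not taken from the paper.

```python
def record_weight(record, item_weights):
    """Assumed scheme: a record's weight is the mean weight of its items."""
    return sum(item_weights[i] for i in record) / len(record)

def weighted_support(itemset, records, item_weights):
    """Share of total record weight held by records containing `itemset`."""
    total = sum(record_weight(r, item_weights) for r in records)
    hit = sum(record_weight(r, item_weights)
              for r in records if itemset <= r)
    return hit / total

def weighted_confidence(antecedent, consequent, records, item_weights):
    """Weighted support of the whole rule divided by that of its antecedent."""
    both = weighted_support(antecedent | consequent, records, item_weights)
    return both / weighted_support(antecedent, records, item_weights)

# Hypothetical item weights and records (sets of observed attributes).
weights = {"foreign_ip": 0.8, "night_login": 0.6,
           "large_transfer": 0.9, "fraud": 1.0, "normal": 0.2}
records = [
    frozenset({"foreign_ip", "large_transfer", "fraud"}),
    frozenset({"foreign_ip", "night_login", "fraud"}),
    frozenset({"night_login", "normal"}),
    frozenset({"large_transfer", "normal"}),
]
# Confidence of the hypothetical rule {foreign_ip} -> {fraud}.
rule_conf = weighted_confidence(frozenset({"foreign_ip"}),
                                frozenset({"fraud"}), records, weights)
```

Rules whose weighted support and confidence exceed chosen thresholds would then be kept as classification rules; the thresholds themselves are a tuning decision that the paper leaves open.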

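The J48 step in Section III-C builds a C4.5-style decision tree, and C4.5 chooses the attribute to split on by gain ratio (information gain divided by split information). The following sketch computes that criterion for categorical attributes only. The attribute names and the tiny labelled data set are hypothetical, and pruning and numeric attributes are left out.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(rows, attr, target):
    """Gain ratio of splitting `rows` (a list of dicts) on categorical `attr`."""
    n = len(rows)
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    # Information gain: parent entropy minus weighted child entropy.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy([row[target] for row in rows]) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = -sum(len(g) / n * math.log2(len(g) / n)
                      for g in groups.values())
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical labelled records.
rows = [
    {"channel": "online", "night": "yes", "label": "fraud"},
    {"channel": "online", "night": "no",  "label": "fraud"},
    {"channel": "atm",    "night": "yes", "label": "normal"},
    {"channel": "atm",    "night": "no",  "label": "normal"},
]
# The attribute with the highest gain ratio becomes the root split.
best = max(["channel", "night"], key=lambda a: gain_ratio(rows, a, "label"))
```

Here `channel` separates the two classes perfectly while `night` carries no information, so `channel` is selected. A full tree builder would apply the same choice recursively to each resulting subset.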