
2017 IEEE 2nd International Conference on Big Data Analysis

Review: An Evaluation of Major Threats in Cloud Computing Associated with Big Data

Kamalpreet Kaur, Ali Syed, Azeem Mohammad, Malka N. Halgamuge


School of Computing and Mathematics, Charles Sturt University, Melbourne, Victoria 3000, Australia
e-mail: ASyed@studygroup.com, AMohammad@studygroup.com, MHalgamuge@studygroup.com

Abstract-In today's corporate society, the productivity, stability, and management of an organization rely upon the power of databases. Most organizations outsource their databases in the form of big data and then transfer them into the cloud. Although cloud computing technology brings many benefits for an organization, its security risk factor still remains a big barrier to its widespread adoption. This problem therefore poses a critical question: is information secure in the cloud? Due to this uncertainty, the primary aim of this study is to describe and identify the most vulnerable aspects of security threats in the cloud environment through content analysis, and to highlight and evaluate gaps in the literature to draw scholarly attention. This paper analyzed content to source data that helps to identify gaps in the literature. These gaps have then been identified and evaluated to answer questions with possible solutions. This research will help both vendors and users with security issues that have been heightened by recent population advancements and demands that have been pointed out for improvement. This study has reviewed literature in the field over a span of six years, and endeavors to seek answers to the question and cast solutions through thorough evaluation and analysis: the security-related issues in cloud computing associated with big data must be taken into account by security practitioners when assessing the needs of service providers. This study has found that the cloud environment is an innovation, and that the blend of parallel computing and cloud computing can offer various advantages. It is ideal for different kinds of applications that can suit different needs if the expenses of application modifications consolidate the cost of setup and maintenance of cloud computing. This study has analyzed content and has also found that the management solution of a single secure big data store, after integrating it with the cloud, has yet to be designed.

Keywords-Big Data, MapReduce, Cloud computing, RDBM

978-1-5090-3619-6/17/$31.00 2017 IEEE

I. INTRODUCTION

Big data plays an essential role in the business world, as storing significant amounts of complex data securely is one of the most crucial aspects of corporate operations. In fact, big data is a term that describes huge amounts of structured, semi-structured and unstructured data. The demand for innovative, cost-effective methods of processing data, in order to provide process automation and enable decision making, is a need that sparks scholarly interest. Big data has 3Vs: high volume of data, high velocity of data, and high variety of types of information that needs to be mined. Although the specific quantity of big data is not defined anywhere, it can be in petabytes or exabytes. According to Peter Skomoroch, before big data technologies like MapReduce and Hadoop, it was very difficult, error-prone and time-consuming to process large datasets; with these technologies, dataset processing is much easier than with the traditional processing tools.

Many organizations collect this complex data through worldwide surveys to improve their decision-making process, which is extremely important to sustain a healthy future for their business [1]. Thus, storing big data in the cloud comes with security challenges: content owners who do not know where their data is kept do not trust the cloud. First of all, what is cloud computing? Cloud computing is a set of IT services provided to end-users over the internet so that they can scale their service requirements up or down. Cloud computing is the fastest growing sector in the IT industry, as its capacity can be increased dynamically without investing in new infrastructure and licensing software. There are many advantages of cloud computing, such as cost efficiency, storage capacity, backup and recovery, quick deployment, easy access to information from anywhere, and much more. Despite its advantages, there are also some disadvantages that we need to be aware of before using it, for instance technical issues (network or connectivity problems), security, and proneness to attack (hack attacks). However, its pros outweigh its cons, and there is only one thing that we need to be aware of: cloud computing technology needs to be secured so that it does not leak stored sensitive information [2]. Moreover, cloud computing provides virtual resources to consumers via the internet so that they can use cloud infrastructure, services, and software. This is cheaper than other computing systems, with zero maintenance cost. The service provider is only responsible for the availability of its services, which are also known as "IT on demand" or utility computing.

Google introduced MapReduce as a programming framework; its open-source implementation, Hadoop, runs on the Hadoop Distributed File System (HDFS) [3]. In MapReduce, a vast amount of data is converted into tuples (key-value pairs), and these tuples are then passed as input to a reduce step that combines them into a smaller set of tuples. In this way big data can be managed; nonetheless, it can still create a security problem, due to business growth and monitoring. MapReduce alone is not sufficient either, as it lacks protection for sensitive data, and confidentiality becomes a huge problem. Additionally, we will discuss how the Hadoop framework can be used to solve this issue by using different techniques. Although there are many security and privacy issues, they
still can be solved by using encryption, multi-factor authentication, security hardware, and applications [3].

II. OUTSOURCING DATA AND APPLICATIONS

Cloud computing provides access to data anywhere at any time; nevertheless, the challenge is to verify that the entities accessing it are authorized.

Due to significant challenges caused by monitoring and so on, cloud computing struggles to reach its full potential. However, we have to rely on third parties, and cloud providers need to make the cloud secure and communicate its reliability to their customers. When we use a cloud environment to make decisions about complex data and platforms, we do so in a way that has no precedent in earlier forms of computing [4], [5], and there is no technical mechanism to prevent cloud providers from accessing customers' data unethically. Therefore, it is important to build a new layer that can support a contract negotiation phase between the cloud provider and end-users. Thus, we need a combination of technical and nontechnical mechanisms to gain clients' trust in the cloud environment [6].

III. SECURITY AND PRIVACY CHALLENGES

The service level agreement is a contract that is used to build a new layer that can support a contract negotiation phase between the cloud provider and customers. However, it is still very difficult to bargain about security, privacy, and trust; although there are some ways to assure customers that the cloud provider's services are subject to a contract agreement, this remains difficult.

(i) Access control: A credential-based access control requirement must be in place to control access to data and services so that customers' provenance information is not leaked.

(ii) Trust management: To manage trust among customers, the service provider needs to build new access control policies rather than simply telling customers that their data will never be breached.

(iii) Authentication and identity management: IDM (Identity Management) mechanisms are used to authenticate users and services based on credentials and characteristics.

(iv) Privacy and data protection: Privacy is the biggest issue among all the challenges discussed so far. Most companies are not comfortable storing their confidential data outside their premises. By outsourcing data to shared infrastructure, customers' background information might be put at risk. This information can be used for many purposes, such as auditing, traceback, and history-based access control. This shows that the balance between clients' personal information and privacy is the biggest drawback of the cloud environment [7], [8].

IV. MATERIAL AND METHOD

The research method used for this study to interpret the trends of data sourced from analyzing content is to find the proceedings and measure progress to assist in solving this perplexing, complex issue. This research uses a historic-logic method, which assists in interpreting the findings on security matters in cloud computing in previous research. Gathering information and opinions from experts and evaluating them against the data retrieved allows the study to shed light on the area for further development.

V. RESULTS AND DISCUSSION

The cloud environment has different domains, and each has different security and privacy challenges that we need to consider before migrating data to the cloud. The security challenges are categorized into different levels, including network level, data level, authentication level, etc. However, the protection of sensitive information is a major concern. To enhance security, it is most important to provide access control, authentication, and authorization for the data stored in the cloud. There are three main elements of information security, known as CIA (Confidentiality, Integrity, and Availability), and it is the responsibility of the cloud provider to supply these basic elements to customers. Figure 1 shows that confidentiality (31%), integrity (24%), and availability (19%) are the most threatened attributes compared to other attributes, including usability, accountability, security, and reliability. Data security, integrity, and recovery are the biggest risks in cloud computing. It is very important to know what happens to the stored data and, if the cloud fails, how we recover the applications from backup. Cloud computing technology uses third-party networks to access cloud services; thus, there are many risks in accessing services in the cloud. There is a need to compare different cloud providers' services before moving to the cloud. These services are becoming more popular as they are convenient and affordable and provide storage space [9].

Moreover, cloud systems face various attacks, such as loss of data, denial of service, unauthorized access to data, and unauthorized data modification. Bleikertz et al. [14] have claimed that the advantages of using the cloud are accompanied by security, privacy, and safety challenges. These difficulties and challenges have prevented the adoption of cloud computing, as it has not been chosen when everything is considered. They state that highly adaptable but exceptionally complex cloud computing services are configured by clients through a web interface. However, misconfiguration of cloud computing by clients may lead to security vulnerabilities. Their paper used a particular type of methodology that can help us understand issues from both the user side and the server side. Amazon's Elastic Compute Cloud (EC2) was chosen for this evaluation. The strength of this work is that it provides a robust analysis of vulnerabilities and security attacks; therefore, this analysis helps vendors to enhance the
security policies. The weakness is that it is specific to Amazon, and it could contribute more if it were general [14].

Figure 1. The percentage of compromised attributes in cloud environment associated with big data

A new architecture, the Transparent Cloud Protection System (TCPS), has been discussed by Lombardi et al. [15] to improve the security of cloud resources in response to the integrity protection problems in the cloud environment. They state that they have identified the integrity protection problems and, to address them, have proposed a framework called TCPS to increase the security of cloud resources. According to them, TCPS can be used to monitor guest integrity while still preserving transparency and virtualization. In the TCPS system, they used image filters and scanners to manage system images and detect malicious images, in order to prevent security vulnerabilities and attacks. The strength of this work is that it proposes a mechanism that provides enhanced security, transparency, and an intrusion detection system. The limitation is that they have not validated their work, nor have they deployed it in a professional cloud computing scenario [15].

In MapReduce, a massive amount of data is converted into tuples, and these tuples are then provided as input to a reduce step, which combines them into a smaller set of tuples [16], [17]. In this way big data can be managed, although it still creates a security problem, because data monitoring and business continuity can cause glitches. However, MapReduce alone is not sufficient because it lacks protection for sensitive data. Therefore, the proposed method to reduce security issues in the cloud is Airavat. Airavat is a MapReduce-based system that is used to store sensitive data (healthcare, shopping transactions, etc.) and provide high security and privacy for it. It is a new integration of access control. Airavat runs MapReduce on clusters for parallel processing. Taking healthcare data as an example, if this sensitive information leaks to a third party, for instance an insurance company that can access this data, then the insurance company can find out about medical conditions, which would result in an increase in premiums. Therefore, to protect data from breaches of confidentiality, it is crucial to provide strong security, and this suggests that Airavat is well suited to securing confidentiality. With this technique, the untrusted MapReduce program is sent to Airavat, where it can be run in a protected manner, as seen in Figure 1. Performing the MapReduce computations could otherwise cause leakage of information. It uses Security-Enhanced Linux (SELinux) to add mandatory access control when Airavat is implemented on Hadoop. This technique provides strong security and privacy by preventing leakage of sensitive data and uses an access control mechanism; it is the first system that combines access control with differential privacy without auditing untrusted code. The weakness of this system is that it supports only a small set of reducers and must generate enough noise to assure the differential privacy of values [17].

Cloud computing is a valuable tool. However, organizations still need to understand and manage it in depth and prioritize the execution of any agreements. Fortunately, there are some mitigation strategies that, if cloud customers follow them, may reduce the level of risk [19]. Gatewood [20] suggests examining a vendor's internal audit process: how frequently the vendor is evaluated by external organizations, the standards the vendor is held to, and whether it is open to being audited regularly. Demonstrating ongoing compliance with security policies and regulatory requirements can be difficult. Gatewood warns that as vendors hurry to create and introduce cloud-based solutions, they may fall short on including essential records management controls. Moreover, investigating various security features [21-23] could be an interesting path to explore in the future to protect big data [24].

Researchers have discovered many issues in the cloud environment and have started working on these matters in order to minimize them. There are threats in using utility computing, and some of the significant results correspond to our results given in the table below. This table summarizes the different techniques used by different studies to address the security and privacy of big data in cloud computing. New security and privacy issues identified in the rest of the papers are not relevant to the argument of this study.

TABLE I. THE COMPARISON OF DIFFERENT CLOUD SECURITY TECHNIQUES

1. Narwal et al. (2015) [11]
Technique/method used: Kerberos encryption technology based on the Needham-Schroeder protocol
Description: Kerberos is an authentication system that uses cryptographic tickets to avoid sending plain text passwords over the connecting wire.
Problem identified: Kerberos encrypts a considerably shorter one-way hash.
Result: This technique provides secure authentication in an open environment, and it is very costly regarding CPU power and time.

2. Bertino et al. (2014) [12]
Technique/method used: XML document cryptography and digital signature technique
Description: Queries are processed according to the policy provided by the cloud provider, instead of processing all queries.
Problem identified: The information size was increased in XML format, and it created some integrity issues in the government, health, and finance areas because of the mode of delivery of content.
Result: An XML data document is used in a secure environment for access control of the third party, which introduces another trusted layer of security to the model.

3. Kevin Hamlen (2013) [2]
Technique/method used: Cryptographic
Description: The sensitive data can be stored in encrypted form in the database rather than as plain text.
Problem identified: Managing private and public keys.
Result: Even if an intruder gets the database, they cannot get the actual data because it is encrypted.

4. Zhou et al. (2016, 2012) [10, 9]
Technique/method used: Declarative Secure Distributed Systems (DS2)
Description: This technique is used to explore the security premises of secure data sharing between the apps hosted on the clouds.
Problem identified: Data management issues: system analysis and forensics; distributed query processing; query correction assurance.
Result: Data-centric security provides secure query processing, efficient end-to-end verification of data, and system analysis and forensics.

5. Rongxing et al. (2010) [13]
Technique/method used: Bilinear pairing technique
Description: This system uses five steps to control unauthorized user access and resolve disputes of big data. The five steps are: Setup, key generation, AnonyAuth, AuthAccess, and Provenance tracking.
Problem identified: Data forensics and post examination.
Result: Difficult to implement because it is based on a complex mathematical model. However, this system pushes the use of cloud computing for full recognition to the public.

6. Bleikertz et al. (2010) [14]
Technique/method used: Amazon's EC2
Description: A specialized query policy language is applied to a security analysis model and evaluated for the practical domain. This security analysis has been implemented in Python and evaluated on Amazon EC2.
Problem identified: Reachability audit of Amazon security graphs and groups.
Result: Provides a robust analysis of security attacks and vulnerabilities to enhance the security policies. However, it is specific to Amazon, and it could contribute more if it were general.

7. Lombardi et al. (2010) [15]
Technique/method used: Transparent Cloud Protection System (TCPS)
Description: TCPS can be used to monitor guest integrity while preserving transparency and virtualization.
Problem identified: Cloud security vulnerabilities and attacks.
Result: TCPS gives enhanced security, transparency, and an intrusion detection system; however, the authors have not validated their work, nor have they deployed it in a professional cloud computing scenario.

8. Roy et al. (2010) [16], [17]
Technique/method used: Airavat MapReduce-based system
Description: This technique provides strong security and privacy by preventing leakage of sensitive data using an access control mechanism.
Problem identified: It supports only a small set of reducers. Airavat generates enough noise in order to assure the differential privacy of values.
Result: Airavat is the first system that combines access control with differential privacy without auditing untrusted code.

9. Mladen et al. (2008) [18]
Technique/method used: Virtual Computing Laboratory (VCL) technique
Description: VCL is an open source implementation that provides NYU students with virtual access to software applications that are academically relevant.
Problem identified: End-to-end service insulation via VPN, SSH tunnels, and VLANs.
Result: A theoretical concept, so it is not proposed as much. It could have contributed more if practical matters had been discussed in this work.
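
The Airavat entry in Table I names two ingredients: a small set of trusted reducers, and enough noise to assure differential privacy. The following Python sketch illustrates that idea on a toy MapReduce job. It is not Airavat's implementation and is not taken from the reviewed papers. It assumes that each record yields exactly one key-value pair, that every emitted value is clamped to a declared range, that the only reducer is a sum, and that noise is drawn from a Laplace distribution whose scale is the largest possible contribution of one record divided by a privacy parameter epsilon. The names run_map, laplace_noise, noisy_sum_reduce, and the sample records are hypothetical.

```python
import random
from collections import defaultdict


def run_map(records, mapper, lo, hi):
    """Run an untrusted mapper and clamp each emitted value to [lo, hi]."""
    groups = defaultdict(list)
    for record in records:
        key, value = mapper(record)  # assumption: one pair per record
        groups[key].append(min(max(value, lo), hi))
    return dict(groups)


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample, built as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_sum_reduce(groups, lo, hi, epsilon, rng):
    """Trusted reducer: per-key sum plus Laplace noise."""
    # The most a single clamped record can change any sum.
    sensitivity = max(abs(lo), abs(hi))
    scale = sensitivity / epsilon
    return {key: sum(values) + laplace_noise(scale, rng)
            for key, values in groups.items()}


# Hypothetical (condition, claim amount) records, for illustration only.
records = [("flu", 120.0), ("flu", 80.0), ("asthma", 300.0), ("flu", 10000.0)]
groups = run_map(records, lambda r: (r[0], r[1]), lo=0.0, hi=500.0)
print(groups)  # the 10000.0 outlier is clamped to 500.0 before reduction
print(noisy_sum_reduce(groups, 0.0, 500.0, epsilon=0.5, rng=random.Random(0)))
```

A smaller epsilon gives more noise and stronger privacy, while a larger one gives sums closer to the exact values. Clamping is what bounds the influence of any single record, which is why the mapper itself does not need to be audited in this sketch. Other leakage channels, for example which keys appear in the output, are not addressed here.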

VI. CONCLUSION

This study has reviewed selected publications from different sources to identify and evaluate gaps regarding security issues in cloud computing associated with big data. Cloud computing is an emerging technology that most organizations are developing, and it is constantly evolving. It has numerous benefits; nonetheless, security and privacy issues are at the top of the list when it comes to controlling sensitive data, and many technologies associated with cloud computing, such as databases, networking, virtualization, and operating systems, are problematic as well. Using different studies, we have identified some major research problems that should be taken into account to ensure the security and success of big data. Moreover, the management solution of a single secure big data store, after integrating it with the cloud, has yet to be designed. In RDBM (Relational Database Management), protecting the system from attacks by utilizing resources and mitigating risks is only one important problem. However, security solutions are still being discovered, and even
leading providers like Amazon, Google, etc. are facing security issues. Therefore, the decision to adopt cloud computing is still in progress and could be based on the ratio of benefits to threats and risks.

REFERENCES

[1] B. Matturdi, X. Zhou, S. Li and F. Lin, "Big Data security and privacy: A review", China Communications, vol. 11, no. 14, pp. 135-145, 2014.
[2] T. Erl, R. Puttini and Z. Mahmood, Cloud computing. 2013.
[3] W. Wei and X. Gu, "SecureMR: A Service Integrity Assurance Framework for MapReduce", Proceedings of IEEE CCIS2012, pp. 240-244, 2012.
[4] A. Gholami and E. Laure, "Big Data Security and Privacy Issues in the CLOUD", International Journal of Network Security & Its Applications, vol. 8, no. 1, pp. 59-79, 2016.
[5] F. Shaikh and S. Haider, "Security threats in Cloud Computing", Int. Conf. for Internet Technology and Secured Transactions (ICITST), Abu Dhabi, pp. 214-219, 2011.
[6] P. Hoving and J. Essn, "Minutes from the first meeting of TC 11, security and protection in information processing systems", Computers & Security, vol. 4, no. 2, pp. 149-152, 1985.
[7] "IEEE Cloud Computing Special Issue on Cloud Security", IEEE Cloud Comput., vol. 2, no. 5, pp. c2-c2, 2015.
[8] C. Pfleeger, Security in Computing. Upper Saddle River, NJ: Prentice Hall PTR, 1997.
[9] Y. Zhang and Y. Zhou, "TransOS: a transparent computing-based operating system for the cloud", International Journal of Cloud Computing, vol. 1, no. 4, p. 287, 2012.
[10] W. Zhou, "Towards a Data-centric View of Cloud Security", 2016.
[11] A. Narwal and S. Tomar, "Kerberos Protocol: A Review", IJERT, vol. 4, no. 04, 2015.
[12] V. Inukollu, S. Arsi and S. Rao Ravuri, "Security Issues Associated with Big Data in Cloud Computing", International Journal of Network Security & Its Applications, vol. 6, no. 3, pp. 45-56, 2014.
[13] Rongxing et al., "Secure Provenance: The Essential Bread and Butter of Data Forensics in Cloud Computing", ASIACCS'10, Beijing, China.
[14] S. Bleikertz et al., "Security Audits of Multi-tier Virtual Infrastructures in Public Infrastructure Clouds", 2010.
[15] F. Lombardi and R. Pietro, "Transparent Security for Cloud", SAC '10 Proceedings of the 2010 ACM Symposium on Applied Computing, no. 978-1-60558-639-7, pp. 414-415, 2010.
[16] I. Roy et al., "Airavat: Security and Privacy for MapReduce", NSDI, vol. 10, pp. 297-312, 2010.
[17] I. Roy, "Airavat: Security and Privacy for MapReduce", google.com, 2016.
[18] M. Vouk, "Cloud Computing- Issues, Research and Implementations", Journal of Computing and Information Technology, vol. 16, no. 4, p. 235, 2008.
[19] T. Betcher, "Cloud Computing: Key IT-Related Risks and Mitigation Strategies for Consideration by IT Security Practitioners", 2010.
[20] B. Gatewood, "Clouds On the Information Horizon: How To Avoid The Storm", CRM, vol. 43, no. 4, pp. 32-36, 2009.
[21] D. V. Pham, A. Syed, A. Mohammad and M. N. Halgamuge, "Threat Analysis of Portable Hack Tools from USB Storage Devices and Protection Solutions", International Conference on Information and Emerging Technologies, pp. 1-5, Karachi, Pakistan, June 2010.
[22] D. V. Pham, A. Syed and M. N. Halgamuge, "Universal serial bus based software attacks and protection solutions", Digital Investigation, vol. 7, no. 3, pp. 172-184, 2011.
[23] D. V. Pham, M. N. Halgamuge, A. Syed and P. Mendis, "Optimizing windows security features to block malware and hack tools on USB storage devices", Progress in Electromagnetics Research Symposium, pp. 350-355, 2010.
[24] V. Vargas, A. Syed, A. Mohammad and M. N. Halgamuge, "Pentaho and Jaspersoft: A Comparative Study of Business Intelligence Open Source Tools Processing Big Data to Evaluate Performances", Int. Journal of Advanced Computer Science and Applications (IJACSA), vol. 7, no. 10, pp. 20-29, November 2016.
