Big Data Cloud Computing Safety Measures: A Case Study

By Dr. Ifath Nazia Ghori,
Lecturer, Jazan University – Kingdom of Saudi Arabia.
Email: ghori.ing@gmail.com

ABSTRACT:
Big data and cloud computing have been two of the most important topics of the last two decades. Cloud computing enables computing resources to be provided as information technology services with high effectiveness and success. Big data is a broad term for data sets so large or complex that traditional data applications are inadequate. Challenges include analysis, capture, search, sharing, storage, transfer, visualization, and information privacy. The term often refers simply to the use of predictive analytics or other advanced methods to extract value from data, and seldom to a particular size of data set. Accuracy in big data may lead to more confident decision making, and better decisions can mean greater operational efficiency, cost reduction and reduced risk. Nowadays big data is one of the main problems that researchers are trying to solve, focusing their research on how big data can be managed in current systems and with cloud computing. One of the most important issues is how to achieve ideal security for big data in cloud computing. This paper surveys big data security in cloud computing and the mechanisms used to protect and secure data and to preserve the privacy of big data in the available clouds.

Keywords: Big Data, Large Data, Cloud Providers, Cloud Computing, NAS, Security, Data Privacy, Data Protection

1. INTRODUCTION
Big data refers to datasets whose size is beyond the ability of today's software tools to manage and process within an acceptable time. Characterized by variety, volume and velocity, big data such as military data or other data that must not reach unauthorized parties needs to be protected in a scalable and efficient way. Information privacy and security is one of the most pressing concerns for cloud computing because of its open environment and very limited user-side control. It is also an important challenge for big data. Within a few years, more of the world's data will be touched by cloud computing, which provides strong storage, computation and distributed capability in support of big data processing. Information privacy and security challenges in both cloud computing and big data must therefore be investigated. Work on privacy and security also provides a forum for researchers and developers to exchange the latest experience, research ideas and developments on fundamental issues and applications concerning security and privacy in cloud and big data environments.

The cloud helps organizations by enabling rapid on-demand provisioning of server resources such as CPUs, storage and bandwidth, and lets them manage, share and analyze their big data in a reasonable and simple-to-use way. Cloud infrastructure as a service platforms, supported by on-demand analytics solution vendors, make large-scale data analytics very affordable. Cloud computing is location independent and involves shared services that provide resources, software and data to systems and hardware on demand. Storage networking in the cloud is a very strong driver for high performance.

FIG 1. BIG DATA AND CLOUDS


The requirements of cloud storage are assumed to be spread over a group of sub-nodes, with operations performed by some of the units and advanced CPUs. For example, Arista Networks provides specifications and a product line of switching solutions, as shown in figures 2 and 3 below. 10 Gigabit Ethernet prevails as the mainstream technology for cloud storage, with iSCSI-based block storage and Network Attached Storage (NAS). With non-blocking throughput, record density, low latency and leading total cost of ownership, Arista Networks switches are ideal for cloud storage applications [3].

FIG 2. ISCSI STORAGE AREA NETWORK

FIG 3. FIBER CHANNEL STORAGE AREA NETWORK

FIG. 4. NETWORK ATTACHED STORAGE

In the Storage Area Networks shown in the figures above, storage is accessed at the block level. In network attached storage, as in figure 4, the user accesses data remotely through a file system over the network.

2. BIG DATA: PRESENT AND FUTURE
These days data are produced from many sources such as social networks, websites and sensor networks. Moreover, the huge amount of data is recorded mostly in nonstandard forms which cannot be analyzed using traditional data models and methods. According to a recent report, most data are unstructured or semi-structured, and the amount of data in existence is doubling every two years. The total data volume is also expanding constantly: between 2013 and 2020 it will grow from 4.4 trillion GB to 44 trillion GB. Big data refers essentially to the following data types: conventional enterprise data, such as customer information in databases and the transactions of company websites; machine-generated and sensor data, such as smart meters and manufacturing sensors; and social data from social network and application platforms like Facebook, WhatsApp, LinkedIn, Twitter and YouTube.

Big data today faces a wide range of challenges, but opportunities also exist, such as making the right decisions, marketing strategies, improved customer relations, better public services and so on. According to a Gartner report, by 2015 4.4 million new big data related jobs will be created globally and only one third of these will be filled. Employment opportunities in the big data job market are enormous; however, there are very few training and education offerings focusing on this market.
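The two growth figures quoted in this section can be checked against each other with a short calculation. The sketch below assumes constant exponential growth between 2013 and 2020; the function names are illustrative and are not taken from any cited report.

```python
import math


def doubling_time_years(start_volume, end_volume, elapsed_years):
    """Doubling time implied by constant exponential growth between two points."""
    growth_factor = end_volume / start_volume
    return elapsed_years * math.log(2) / math.log(growth_factor)


def projected_volume(start_volume, elapsed_years, doubling_years):
    """Volume after elapsed_years if it doubles every doubling_years."""
    return start_volume * 2 ** (elapsed_years / doubling_years)


# Figures quoted in the text: 4.4 trillion GB (2013) to 44 trillion GB (2020).
implied = doubling_time_years(4.4, 44.0, 2020 - 2013)
print(f"implied doubling time: {implied:.2f} years")  # about 2.11 years

# Assumption: strict doubling every 2 years, as the quoted report states.
strict = projected_volume(4.4, 2020 - 2013, 2.0)
print(f"2020 volume at strict 2-year doubling: {strict:.1f} trillion GB")
```

A tenfold increase over seven years implies a doubling time of roughly 2.1 years, so the "doubling every two years" statement and the 4.4 to 44 trillion GB projection are approximately consistent.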

Big data has also driven growing interest in new tools. Beginning with the introduction of Apache Hadoop and MapReduce, many open source tools have been implemented and developed by companies such as Teradata, IBM, Cloudera, SAP, Oracle, SAS, Amazon and many others. Most big data products are based mainly on open-source technologies. Consequently, standards are especially important and needed for the interoperability of the hardware and software components of commercial solutions. The lack of official standards also aggravates privacy and security problems.

3. CLOUD COMPUTING IN BIG DATA
The rise of cloud computing and cloud data stores has been a precursor and facilitator to the emergence of big data. Cloud computing is the commodification of computing time and data storage by means of standardized technologies. It has significant advantages over traditional physical deployments. However, cloud platforms come in several forms and sometimes have to be incorporated with traditional architectures. This leads to confusion for decision makers in charge of big data projects, who face the question of how and which cloud computing is the optimal choice for their computing needs, especially if it is a big data project. These projects regularly exhibit changeable, bursting, or immense computing power and storage needs. At the same time, business stakeholders expect swift, inexpensive, and dependable products and project outcomes. This paper introduces cloud computing and cloud storage and the core cloud architectures, and discusses what to look for and how to get started with cloud computing.

4. BIG DATA CLOUD STORAGE
The cloud storage challenges in big data analytics fall into two categories: capacity and performance. Scaling capacity, from a platform perspective, is something all cloud providers need to watch closely. Data retention continues to double and triple year over year because customers are keeping more of it. Certainly, that impacts providers, because they need to provide capacity.

5. BIG DATA CLOUD PROVIDERS
Cloud providers come in all shapes and sizes and offer many different products for big data. Some are household names while others are recently emerging. Some of the cloud providers that offer IaaS services that can be used for big data include Amazon.com, AT&T, Rackspace, IBM, and Verizon/Terremark. Currently, one of the most high-profile IaaS service providers is Amazon Web Services with its Elastic Compute Cloud (Amazon EC2). Amazon didn't start out with a vision to build a big infrastructure services business.

The cloud computing space has been dominated by Amazon Web Services until recently. Increasingly, serious alternatives are emerging, like Google Cloud Platform and the other clouds mentioned above. Therefore, the choice of a cloud platform standard has implications on which tools are available and which alternative providers with the same technology are available. The options include Amazon Web Services compatible solutions, i.e. Amazon's own offering or companies with application programming interface compatible offerings, and OpenStack, an open source project with wide industry backing.

Professional cloud storage needs to be highly available, highly durable, and has to scale from a few bytes to terabytes. Amazon's S3 cloud storage is the most prominent solution in the space. S3 promises 99.9% monthly availability and 99.999999999% durability per year. This is less than an hour of outage per month. The durability can be illustrated with an example: if a customer stores 10,000 objects, he can expect to lose one object every 10,000,000 years on average. S3 achieves this by storing data in multiple facilities with error checking and self-healing processes to detect and repair errors and device failures. This is completely transparent to the user and requires no actions or knowledge. A company could build and achieve a similarly reliable storage solution, but it would require enormous capital expenditures and operational challenges.
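The availability and durability figures quoted for S3 can be reproduced with simple arithmetic. The following sketch is only an illustration under stated assumptions: a 30-day month, object losses that are independent of each other, and percentages passed as strings so that exact fractions avoid floating-point rounding. The function names are hypothetical.

```python
from fractions import Fraction

MINUTES_PER_MONTH = 30 * 24 * 60  # assumption: a 30-day month


def max_downtime_minutes(availability_percent):
    """Largest monthly outage that still meets the availability target."""
    unavailable = 1 - Fraction(availability_percent) / 100
    return unavailable * MINUTES_PER_MONTH


def years_per_lost_object(durability_percent, stored_objects):
    """Average number of years between single-object losses."""
    annual_loss_probability = 1 - Fraction(durability_percent) / 100
    expected_losses_per_year = annual_loss_probability * stored_objects
    return 1 / expected_losses_per_year


print(float(max_downtime_minutes("99.9")))                 # 43.2
print(int(years_per_lost_object("99.999999999", 10_000)))  # 10000000
```

Under these assumptions, 99.9% availability allows about 43 minutes of outage per month, which matches "less than an hour", and 10,000 objects at 99.999999999% annual durability give one expected loss every 10,000,000 years.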

Global data-centered companies like Google or Facebook have the expertise and scale to do this efficiently. Big data projects and start-ups, however, benefit from using a cloud storage service. They can trade capital expenditure for an operational one, which is excellent since it requires no capital outlay or risk. From the first byte it provides reliable and scalable storage solutions of a quality otherwise unachievable. This gives new products and projects a viable option to start on a small scale with low costs. When a product proves successful, these storage solutions scale virtually indefinitely. Cloud storage is effectively a boundless data sink. Importantly for computing performance, many solutions also scale horizontally, i.e. when data is copied in parallel by cluster or parallel computing processes, the throughput scales linearly with the number of nodes reading or writing.

6. PRIVATE CLOUD
Private clouds are dedicated to one organization and do not share physical resources. The resource can be provided in-house or externally. A typical underlying requirement of private cloud deployments is security requirements and regulations that need a strict separation of an organization's data storage and processing from accidental or malicious access through shared resources. Private cloud setups are challenging since the economic advantages of scale are usually not achievable within most projects and organizations despite the utilization of industry standards. The return on investment compared to public cloud offerings is rarely obtained, and the operational overhead and risk of failure are significant. Additionally, cloud providers have captured the trend for increased security and provide special environments, i.e. dedicated hardware to rent and encrypted virtual private networks as well as encrypted storage, to address most security concerns. Cloud providers may also offer data storage, transfer, and processing restricted to specific geographic regions to ensure compliance with local privacy laws and regulations.

Another reason for private cloud deployments is legacy systems with special hardware needs or exceptional resource demand, e.g. extreme memory or computing instances which are not available in public clouds. These are valid concerns; however, if these demands are extraordinary, the question of whether a cloud architecture is the correct solution has to be raised. One reason can be to establish a private cloud for a period to run legacy and demanding systems in parallel while their services are ported to a cloud environment, culminating in a switch to a cheaper public or hybrid cloud.

As big data volume, variety and velocity grow exponentially, the enterprise infrastructure must adapt in order to utilize it, becoming increasingly scalable, agile, and efficient if organizations are to remain competitive. To meet these requirements, Synnex Corporation, a leading distributor of IT products and solutions, announced that it is the first distributor to offer the channel a fully integrated, big data turnkey private cloud appliance powered by Nebula. The rack-level appliance is a fully integrated private cloud system engineered to deliver workloads for both Apache Cassandra and MongoDB. The new rack system includes industry-standard servers, the Nebula One Cloud Controller, Apache Cassandra and MongoDB, backed by SYNNEX's powerful distribution model. This big data appliance enables an open and elastic infrastructure to store and manage big data. As an open source NoSQL database technology, Apache Cassandra provides multi-site distributed computing of big data across multiple data centers, while MongoDB is a cross-platform, document-oriented open source database system.

7. PUBLIC CLOUD
Public clouds share physical resources for data transfers, storage, and processing. However, customers have private virtualized computing environments and isolated storage. Security concerns, which entice a few to adopt private clouds or custom deployments, are irrelevant for the vast majority of customers and projects. Virtualization makes access to other customers' data extremely difficult. Real-world problems around public cloud computing are more mundane, such as data lock-in and fluctuating performance of individual instances.
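The data lock-in effect is easiest to see with numbers. The sketch below compares the cost of moving a dataset into and out of a provider under an asymmetric price list. The per-GB prices and the `TransferPricing` type are hypothetical placeholders chosen for illustration, not any provider's actual rates or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferPricing:
    """Hypothetical per-GB transfer prices (illustrative values only)."""
    ingress_per_gb: float  # cost of copying data into the provider
    egress_per_gb: float   # cost of copying data out of the provider


def transfer_costs(size_gb, pricing):
    """Return (cost to move the data in, cost to move the same data out)."""
    cost_in = size_gb * pricing.ingress_per_gb
    cost_out = size_gb * pricing.egress_per_gb
    return cost_in, cost_out


# Assumed example: free inflow, paid outflow, 50 TB dataset.
pricing = TransferPricing(ingress_per_gb=0.0, egress_per_gb=0.09)
cost_in, cost_out = transfer_costs(50_000, pricing)
print(f"move in:  {cost_in:.2f}")
print(f"move out: {cost_out:.2f}")
```

With prices shaped like this, the cost of leaving grows with every gigabyte stored, while arriving costs nothing. That asymmetry is what nudges a customer to keep processing inside one provider.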

Data lock-in is a soft measure and works by making data inflow to the cloud provider free or very cheap. Copying data out to local systems or other providers is often more expensive. This is not an insurmountable problem, and in practice it encourages utilizing more services from a cloud provider instead of moving data in and out for different services or processes. Usually this is not sensible anyway, due to network speed and the complexities of dealing with numerous platforms.

8. BIG DATA PRIVACY AND SECURITY
Big Data remains one of the most talked about technology trends in 2013. But lost among all the excitement about the potential of Big Data are the very real security and privacy challenges that threaten to slow this momentum. Security and privacy issues are magnified by the three V's of big data: Velocity, Volume, and Variety. These factors include variables such as large-scale cloud infrastructures, diversity of data sources and formats, the streaming nature of data acquisition and the increasingly high volume of inter-cloud migrations. Consequently, traditional security mechanisms, which are tailored to securing small-scale static (as opposed to streaming) data, often fall short [17].

The CSA's Big Data Working Group followed a three-step process to arrive at the top security and privacy challenges presented by Big Data. First, it interviewed CSA members and surveyed security-practitioner-oriented trade journals to draft an initial list of high-priority security and privacy problems. Second, it studied published solutions. Third, it characterized a problem as a challenge if the proposed solution does not cover the problem scenarios. Following this exercise, the Working Group researchers compiled their list of the Top 10 challenges, as shown in figure 5 below. The Expanded Top 10 Big Data challenges have evolved from the initial list of challenges presented at CSA Congress to an expanded version that addresses three new distinct issues.

Modeling: Formalizing a threat model that covers most of the cyber-attack or data-leakage scenarios.
Analysis: Finding tractable solutions based on the threat model.
Implementation: Implementing the solution in existing infrastructures.

FIG. 5 CHALLENGES OF CSA's BIG DATA WORKING GROUP

The information security practitioners at the Cloud Security Alliance know that big data and analytics systems are here to stay. They also agree on the big questions that come next: How can we make the systems that store and compute the data secure? And how can we ensure private data stays private as it moves through different stages of analysis, input and output? The answers to those questions prompted the group's latest 39-page report detailing 10 major security and privacy challenges facing infrastructure providers and customers. By outlining the issues involved, along with analysis of internal and external threats and summaries of current approaches to mitigating those risks, the alliance's members hope to prod technology vendors, academic researchers and practitioners to collaborate on computing techniques and business practices that reduce the risks associated with analyzing massive datasets using innovative data analytics.

Given the very large data sets that contribute to a Big Data implementation, it is virtually certain that either protected information or critical intellectual property (IP) will be present. This information is distributed throughout the Big Data implementation as needed, with the result that the entire data storage layer needs security protection. Many types of protection and security are used, such as Vormetric Encryption, the Data Security Platform, Encryption and Key Management, Fine-grained Access Controls, Security Intelligence, etc.

Vormetric Encryption: Seamlessly protects Big Data environments at the file system and volume level. This Big Data analytics security solution allows organizations to gain the benefits of the intelligence gleaned from Big Data analytics while maintaining the security of their data, with no changes to the operation of the application or to system operation or administration.

Data Security Platform: The Vormetric Data Security Platform secures critical data by placing the safeguards and access controls for your data with your data. The data security platform includes strong encryption, key management, fine-grained access controls and the security intelligence information needed to identify the latest advanced persistent threats (APTs) and other security attacks on your data.

FIG. 6 VORMETRIC PERCEPTION

Encryption and Key Management: Data breach mitigation and compliance regimes require encryption to safeguard data. Vormetric provides strong, centrally managed encryption and key management that enables compliance and is transparent to processes, applications and users.

Fine-grained Access Controls: Vormetric provides fine-grained, policy-based access controls that restrict access to data that has been encrypted, allowing only approved access to data by processes and users as required to meet strict compliance requirements. Privileged users of all types (including system, network and even cloud administrators) can see plaintext information only if specifically enabled to do so. System update and administrative processes continue to work freely, but see only encrypted data, not the plaintext source.

Security Intelligence: Vormetric logs capture all access attempts to protected data, providing high-value security intelligence information that can be used with a Security Information and Event Management solution to identify compromised accounts and malicious insiders, as well as to find access patterns by processes and users that may represent an APT attack in progress.

CONCLUSION
Recently, researchers have focused their efforts on how to manage, handle and process the huge amounts of data known as big data. Big data is characterized by three concepts, volume, variety and velocity, which require new mechanisms to manage, distribute, store, analyze and secure it. Managing and processing big data raises many problems and requires more effort, and security is one of the challenges that arise when systems try to handle big data. More research is required to address the security of big data beyond current safekeeping algorithms and methods.

REFERENCES
1. Vahid Ashktorab, Seyed Reza Taghizadeh and Dr. Kamran Zamanifar, "A Survey on Cloud Computing and Current Solution Providers", International Journal of Application or Innovation in Engineering & Management (IJAIEM), October 2012.
2. Addison Snell, "Solving Big Data Problems with Private Cloud Storage", Cloud Computing in HPC: Usage and Types, Intersect360 Research, June 2011.
3. http://talkincloud.com/cloud-computing/041814/gartner-hybrid-cloud-big-data-analytics-top-smart-government-trends
4. http://www.dummies.com/how-to/content/big-data-cloud-providers.html
5. http://fcw.com/Articles/2012/04/30/FEAT-BizTech
6. Addison Snell, "Solving Big Data Problems with Private Cloud Storage", Cloud Computing in HPC: Usage and Types, Intersect360 Research, June 2011.
7. http://www.businessweek.com/articles/2013-08-07/the-future-of-big-data-apps-and-corporate-knowledge
8. http://www.techrepublic.com/blog/the-enterprise-cloud/cloud-computing-and-the-rise-of-big-data/
9. http://www.networkworld.com/article/2176086/cloud-computing/big-data-drives-47--growth-for-top-50-public-cloud-companies.html
10. https://en.wikipedia.org/wiki/Big_data
11. https://en.wikipedia.org/wiki/Cloud_computing
12. https://en.wikipedia.org/wiki/Vormetric