
Big Data Cloud Computing Safety Measures: A Case Study

By Dr. Ifath Nazia Ghori,
Lecturer, Jazan University – Kingdom of Saudi Arabia.
Email: ghori.ing@gmail.com

ABSTRACT:
Big Data and cloud computing have been two important topics of the last two decades. Cloud computing enables computing resources to be provided as information technology services with high effectiveness and success. Big data is a broad term for data sets so large or complex that traditional data applications are inadequate. Challenges include analysis, capture, search, sharing, storage, transfer, visualization, and information privacy. The term often refers simply to the use of predictive analytics or other advanced methods to extract value from data, and seldom to a particular size of data set. Accuracy in big data may lead to more confident decision making, and better decisions can mean greater operational efficiency, cost reduction and reduced risk. Nowadays big data is one of the main problems that researchers are trying to solve, focusing their research on how big data can be managed in current systems and with cloud computing. One of the most important issues is how to achieve ideal security for big data in cloud computing. This paper surveys big data with cloud computing security, the mechanisms that are used to protect and secure data, and the privacy of big data in the available clouds.

Keywords: Big Data, Large Data, Cloud Providers, Cloud Computing, NAS, Security, Data Privacy, Data Protection

1. INTRODUCTION
Big data is known as datasets whose size is beyond the ability of the software tools used today to manage and process the data within a dedicated time. With its Variety, Volume and Velocity, Big Data such as military data or other data that is closed to unauthorized access needs to be protected in a scalable and efficient way. Information privacy and security is one of the most pressing concerns for Cloud Computing due to its open environment with very limited user-side control. It is also an important challenge for Big Data. Within a few years, more of the world's data will be touched by Cloud Computing, which provides strong storage, computation and distributed capability in support of Big Data processing. Information privacy and security challenges in both Cloud Computing and Big Data must therefore be investigated, providing a forum for researchers and developers to exchange the latest experience, research ideas and developments on fundamental issues and applications concerning security and privacy in cloud and big data environments.

The cloud helps organizations by enabling rapid on-demand provisioning of server resources such as CPUs, storage and bandwidth, and lets them manage, share and analyze their Big Data in a reasonable and simple way. Cloud infrastructure-as-a-service platforms, supported by on-demand analytics solution vendors, make large-scale data analytics very affordable. Because location-independent cloud computing involves shared services that provide resources, software and data to systems and hardware on demand, storage networking in the cloud is a very strong driver for high performance.

FIG 1. BIG DATA AND CLOUDS


However, the requirements of cloud storage are advanced. 10 Gigabit Ethernet prevails as the mainstream technology for Cloud Storage with iSCSI based block storage and Network Attached Storage (NAS). For example, Arista Networks provides specifications and a product line of switching solutions, as shown in figures 2 and 3 below. With non-blocking throughput, record density, low latency and leading total cost of ownership, Arista Networks switches are ideal for cloud storage applications [3].

FIG 2. ISCSI STORAGE AREA NETWORK

FIG 3. FIBER CHANNEL STORAGE AREA NETWORK

In the Storage Area Networks shown in the figures above, storage is accessed at the block level. In network attached storage, as in figure 4 below, the user accesses data remotely at the file-system level via the network, with operations distributed to a group of sub-nodes, each holding some of the units and CPUs.

FIG. 4. NETWORK ATTACHED STORAGE

2. BIG DATA: PRESENT AND FUTURE
These days data are produced from many sources such as social networks, websites and sensor networks, and the total data volume is expanding constantly. According to a recent report, most data are unstructured or semi-structured, and the amount of data in existence is doubling every two years: between 2013 and 2020 it will grow from 4.4 trillion GB to 44 trillion GB. Moreover, the huge amount of data is recorded mostly in nonstandard forms which cannot be analyzed using traditional data models and methods. Big data therefore refers essentially to the following data types: conventional enterprise data, such as customer information in databases, the transactions of company websites and so on; machine-generated and sensor data, such as smart meters, manufacturing sensors etc.; and social data, such as social network and application platforms like Facebook, Twitter, YouTube, LinkedIn and WhatsApp.

Big Data today faces a wide range of challenges, but the opportunities also exist: right decision making, marketing strategies and improved customer relations, better public services and so on. According to a Gartner report, by 2015 4.4 million new big data related jobs will be created globally and only one third of these will be filled. Employment opportunities are therefore enormous in the big data job market, but there are very few training and education offerings focusing on this market.
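The growth figures quoted above can be checked with a short calculation. The sketch below assumes smooth exponential growth between the two quoted data points (an assumption, since the underlying report is not reproduced here), and its function names are illustrative only. It derives the doubling time implied by growth from 4.4 trillion GB in 2013 to 44 trillion GB in 2020, and the volume that a strict two-year doubling would give by 2020.

```python
import math

# Figures quoted in the text; everything else here is an illustrative assumption.
START_YEAR, END_YEAR = 2013, 2020
START_GB, END_GB = 4.4e12, 44e12


def implied_doubling_time(start, end, years):
    """Years per doubling if volume grows exponentially from start to end."""
    return years * math.log(2) / math.log(end / start)


def projected_size(start, years, doubling_years):
    """Volume after `years` if it doubles every `doubling_years`."""
    return start * 2 ** (years / doubling_years)


years = END_YEAR - START_YEAR
doubling = implied_doubling_time(START_GB, END_GB, years)
two_year_projection = projected_size(START_GB, years, 2.0)

print(f"implied doubling time: {doubling:.2f} years")
print(f"2020 volume with strict 2-year doubling: {two_year_projection / 1e12:.1f} trillion GB")
```

Under these assumptions the implied doubling time is about 2.1 years, and a strict two-year doubling would reach roughly 50 trillion GB by 2020, so the two statements in the text agree to within rounding.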

Big data has opened up a growing interest in the production of new tools. Beginning with the introduction of Apache Hadoop and MapReduce, many open source tools have been implemented and developed by companies such as Teradata, IBM, SAP, SAS, Oracle, Cloudera, Amazon and many others. Most big data products are mainly based on open-source technologies. The lack of official standards also aggravates privacy and security problems. Certainly, standards are especially important and needed for interoperability of the hardware and software components of commercial solutions.

3. CLOUD COMPUTING IN BIG DATA
The rise of cloud computing and cloud data stores has been a precursor and facilitator to the emergence of big data. Cloud computing is the commodification of computing time and data storage by means of standardized technologies. It has significant advantages over traditional physical deployments. However, cloud platforms come in several forms and sometimes have to be incorporated with traditional architectures. This leads to confusion for decision makers in charge of big data projects, i.e. to the question of how and which cloud computing is the optimal choice for their computing needs, especially if it is a big data project. These projects regularly exhibit changeable, bursting, or immense computing power and storage needs. At the same time business stakeholders expect swift, inexpensive, and dependable products and project outcomes. This article introduces cloud computing and cloud storage, the core cloud architectures, and discusses what to look for and how to get started with cloud computing.

4. BIG DATA CLOUD PROVIDERS
Cloud providers come in all shapes and sizes and offer many different products for big data. Some are household names while others are recently emerging. Some of the cloud providers that offer IaaS services that can be used for big data include Amazon.com, AT&T, Rackspace, IBM, and Verizon/Terremark. Currently, one of the most high-profile IaaS service providers is Amazon Web Services with its Elastic Compute Cloud (Amazon EC2). Amazon didn't start out with a vision to build a big infrastructure services business.

The cloud computing space has been dominated by Amazon Web Services until recently, whether through Amazon's own offering or through companies with application programming interface compatible offerings, i.e. Amazon Web Services compatible solutions. Increasingly serious alternatives are emerging, like Google Cloud Platform and the other clouds mentioned above, and OpenStack, an open source project with wide industry backing. Consequently, the choice of a cloud platform standard has implications on which tools are available and which alternative providers with the same technology are available.

5. BIG DATA CLOUD STORAGE
The cloud storage challenges in big data analytics fall into two categories: capacity and performance. Scaling capacity, from a platform perspective, is something all cloud providers need to watch closely. Data retention continues to double and triple year-over-year because customers are keeping more of it. That impacts providers, because they need to provide the capacity. Therefore, cloud storage needs to be highly available and highly durable, and has to scale from a few bytes to terabytes. Amazon's S3 cloud storage is the most prominent solution in the space. S3 promises 99.9% monthly availability and 99.999999999% durability per year. This availability is less than an hour of outage per month. The durability can be illustrated with an example: if a customer stores 10,000 objects, he can expect to lose one object every 10,000,000 years on average. S3 achieves this by storing data in multiple facilities with error checking and self-healing processes to detect and repair errors and device failures. This is completely transparent to the user and requires no actions or knowledge. A company could build and achieve a similarly reliable storage solution, but it would require fabulous capital expenditures and operational challenges.
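The availability and durability figures quoted for S3 can be verified with simple arithmetic. The sketch below is only a worked check of the numbers in the text. It treats durability as an independent per-object annual survival probability and assumes a 30-day month, both of which are simplifying assumptions, and its function names are illustrative rather than part of any Amazon API. Exact fractions are used to avoid floating-point rounding on a value as close to 1 as 99.999999999%.

```python
from fractions import Fraction

# Figures quoted in the text, written as exact fractions.
DURABILITY_PER_YEAR = Fraction(99_999_999_999, 100_000_000_000)  # 99.999999999%
MONTHLY_AVAILABILITY = Fraction(999, 1000)                       # 99.9%


def years_per_lost_object(objects, durability=DURABILITY_PER_YEAR):
    """Average number of years between single-object losses."""
    expected_losses_per_year = objects * (1 - durability)
    return 1 / expected_losses_per_year


def max_downtime_minutes(days_in_month=30, availability=MONTHLY_AVAILABILITY):
    """Longest monthly outage that still meets the availability figure."""
    return (1 - availability) * days_in_month * 24 * 60


print(int(years_per_lost_object(10_000)))   # years between losses for 10,000 objects
print(float(max_downtime_minutes()))        # minutes of allowed downtime per month
```

With 10,000 stored objects the expected loss is one object per 10,000,000 years, matching the example in the text, and 99.9% monthly availability allows about 43 minutes of downtime in a 30-day month, which is indeed less than an hour.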

Global data centered companies like Google or Facebook have the expertise and scale to do this efficiently. Big data projects and start-ups, however, benefit from using a cloud storage service. They can trade capital expenditure for an operational one, which is excellent since it requires no capital outlay or risk. It provides, from the first byte, reliable and scalable storage solutions of a quality otherwise unachievable. This gives new products and projects a viable option to start on a small scale with low costs. When a product proves successful, these storage solutions scale virtually indefinitely. Cloud storage is effectively a boundless data sink. Importantly for computing performance, many solutions also scale horizontally, i.e. when data is copied in parallel by cluster or parallel computing processes, the throughput scales linearly with the number of nodes reading or writing.

6. PUBLIC CLOUD
Public clouds share physical resources for data transfers, storage, and processing. However, customers have private virtualized computing environments and isolated storage. Security concerns, which entice a few to adopt private clouds or custom deployments, are unrelated for the vast majority of customers and projects. Virtualization makes access to other customers' data extremely difficult.

7. PRIVATE CLOUD
Private clouds are dedicated to one organization and do not share physical resources. The resource can be provided in-house or externally. A typical underlying requirement of private cloud deployments is security requirements and regulations that need a strict separation of an organization's data storage and processing from accidental or malicious access through shared resources. Private cloud setups are challenging, since the economic advantages of scale are usually not achievable within most projects and organizations despite the utilization of industry standards. The return on investment compared to public cloud offerings is rarely obtained, and the operational overhead and risk of failure are significant.

Additionally, cloud providers have captured the trend for increased security and provide special environments, i.e. dedicated hardware to rent and encrypted virtual private networks as well as encrypted storage, to address most security concerns. Cloud providers may also offer data storage, transfer, and processing restricted to specific geographic regions to ensure compliance with local privacy laws and regulations. Another reason for private cloud deployments is legacy systems with special hardware needs or exceptional resource demand, e.g. extreme memory or computing instances which are not available in public clouds. These are valid concerns; however, if these demands are extraordinary, the question of whether a cloud architecture is the correct solution has to be raised. One reason can be to establish a private cloud for a period to run legacy and demanding systems in parallel while their services are ported to a cloud environment, culminating in a switch to a cheaper public or hybrid cloud.

As big data volume, variety and velocity grow exponentially, the enterprise infrastructure must adapt in order to utilize it, becoming increasingly scalable, agile, and efficient if organizations are to remain competitive. To meet these requirements, Synnex Corporation, a leading distributor of IT products and solutions, announced it is the first distributor to offer a fully integrated, big data turnkey private cloud appliance powered by Nebula to the channel. The rack level appliance is a fully integrated private cloud system engineered to deliver workloads for both Apache Cassandra and MongoDB, enabling an open and elastic infrastructure to store and manage big data. The new rack system includes industry-standard servers, the Nebula One Cloud Controller, Apache Cassandra and MongoDB, backed by SYNNEX' powerful distribution model. As an open source NoSQL database technology, Apache Cassandra provides multi-site distributed computing of big data across multiple data centers, while MongoDB is a cross-platform document-oriented open source database system.
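The horizontal-scaling claim in the cloud storage discussion (throughput grows linearly with the number of nodes reading or writing) can be illustrated with a small sketch. Everything in it is hypothetical: the "nodes" are worker threads copying in-memory chunks rather than real storage servers, and the per-node throughput figure is an assumed input, not a measurement. The threads only demonstrate the partitioning pattern, and the linear speed-up itself is expressed by the idealized model in ideal_copy_time.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data: bytes, nodes: int) -> list[bytes]:
    """Partition data into at most `nodes` contiguous chunks of near-equal size."""
    if not data:
        return []
    size = -(-len(data) // nodes)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_copy(data: bytes, nodes: int) -> bytes:
    """Copy each chunk on its own worker, then reassemble the chunks in order."""
    chunks = split_into_chunks(data, nodes)
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        copied = list(pool.map(bytes, chunks))  # map() preserves chunk order
    return b"".join(copied)


def ideal_copy_time(dataset_mb: float, nodes: int, per_node_mb_s: float) -> float:
    """Idealized copy time when aggregate throughput is nodes * per-node rate."""
    return dataset_mb / (nodes * per_node_mb_s)


payload = b"abcdefghij"
print(split_into_chunks(payload, 3))
print(parallel_copy(payload, 3) == payload)
print(ideal_copy_time(1000, 1, 100), ideal_copy_time(1000, 4, 100))
```

In this idealized model, quadrupling the node count cuts the copy time for the same dataset to a quarter. Real deployments fall short of this whenever the network, the source system or the coordination between nodes becomes the bottleneck.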

Real-world problems around public cloud computing are more mundane, like data lock-in and fluctuating performance of individual instances. Data lock-in is a soft measure and works by making data inflow to the cloud provider free or very cheap. Copying data out to local systems or other providers is often more expensive. This is not an impossible problem, and in practice it encourages utilizing more services from one cloud provider instead of moving data in and out for different services or processes. Usually moving data around is not sensible anyway, due to network speed and the complexities of dealing with numerous platforms.

8. BIG DATA PRIVACY AND SECURITY
Big Data remains one of the most talked about technology trends in 2013. But lost among all the excitement about the potential of Big Data are the very real security and privacy challenges that threaten to slow this momentum. Security and privacy issues are magnified by the three V's of big data: Velocity, Volume, and Variety. These factors include variables such as large-scale cloud infrastructures, diversity of data sources and formats, the streaming nature of data acquisition and the increasingly high volume of inter-cloud migrations. Consequently, traditional security mechanisms, which are tailored to securing small-scale static (as opposed to streaming) data, often fall short [17].

The information security practitioners at the Cloud Security Alliance (CSA) know that big data and analytics systems are here to stay. They also agree on the big questions that come next: How can we make the systems that store and compute the data secure? And how can we ensure private data stays private as it moves through different stages of analysis, input and output? The answers to those questions prompted the group's latest 39-page report detailing 10 major security and privacy challenges facing infrastructure providers and customers, along with analysis of internal and external threats and summaries of current approaches to mitigating those risks. By outlining the issues involved, the alliance's members hope to prod technology vendors, academic researchers and practitioners to collaborate on computing techniques and business practices that reduce the risks associated with analyzing massive datasets using innovative data analytics.

The CSA's Big Data Working Group followed a three step process to arrive at the top security and privacy challenges presented by Big Data. First, it interviewed CSA members and surveyed security practitioner oriented trade journals to draft an initial list of high priority security and privacy problems. Second, it studied published solutions. Third, it characterized a problem as a challenge if the proposed solution does not cover the problem scenarios. Following this exercise, the Working Group researchers compiled their list of the Top 10 challenges, as shown in figure 5 below. The Expanded Top 10 Big Data challenges have evolved from the initial list of challenges presented at CSA Congress to an expanded version that addresses three new distinct issues:

Modeling: Formalizing a threat model that covers most of the cyber-attack or data-leakage scenarios.
Analysis: Finding tractable solutions based on the threat model.
Implementation: Implementing the solution in existing infrastructures.

FIG. 5 CHALLENGES OF CSA's BIG DATA WORKING GROUP

Given the very large data sets that contribute to a Big Data implementation, there is a virtual certainty that either protected information or critical Intellectual Property (IP) will be present. This information is distributed throughout the Big Data implementation as needed, with the result that the entire data storage layer needs security protection. There are many types of security protection in use, such as Vormetric Encryption, Data Security Platform, Encryption and Key Management, Fine-grained Access Controls, Security Intelligence etc.

Vormetric Encryption: Seamlessly protects Big Data environments at the file system and volume level. This Big Data analytics security solution allows organizations to gain the benefits of the intelligence gleaned from Big Data analytics while maintaining the security of their data, with no changes to operation of the application or to system operation or administration.

Data Security Platform: The Vormetric Data Security Platform secures critical data, placing the safeguards and access controls for the data with the data itself. The data security platform includes strong encryption, key management, fine-grained access controls and the security intelligence information needed to identify the latest in advanced persistent threats (APTs) and other security attacks on the data.

Encryption and Key Management: Data breach mitigation and compliance regimes require encryption to safeguard data. Vormetric provides strong, centrally managed encryption and key management that enables compliance and is transparent to processes, applications and users.

Fine-grained Access Controls: Vormetric provides fine-grained, policy based access controls that restrict access to data that has been encrypted, allowing only approved access to data by processes and users as required to meet strict compliance requirements. Privileged users of all types (including system, network and even cloud administrators) can see plaintext information only if specifically enabled to do so. System update and administrative processes continue to work freely, but see only encrypted data, not the plaintext source.

Security Intelligence: Vormetric logs capture all access attempts to protected data, providing high value security intelligence information that can be used with a Security Information and Event Management solution to identify compromised accounts and malicious insiders, as well as finding access patterns by processes and users that may represent an APT attack in process.

FIG. 6 VORMETRIC PERCEPTION

8. CONCLUSION
Recently, researchers have focused their efforts on how to manage, handle and process the huge amount of data known as Big Data. Big Data is characterized by three concepts, Volume, Variety and Velocity, and requires new mechanisms to manage, hand out, store, analyze and secure it. Managing and processing big data raises many problems and requires more effort to meet these requirements, and security is one of the challenges that arise when systems try to handle big data. More research is required to address the security of big data beyond current security algorithms and methods.

9. REFERENCES
1. Vahid Ashktorab, Seyed Reza Taghizadeh and Dr. Kamran Zamanifar, "A Survey on Cloud Computing and Current Solution Providers", International Journal of Application or Innovation in Engineering & Management (IJAIEM), October 2012.
2. Addison Snell, "Solving Big Data Problems with Private Cloud Storage"; "Cloud Computing in HPC: Usage and Types", Intersect360 Research, June 2011.
3. http://talkincloud.com/cloud-computing/041814/gartner-hybrid-cloud-big-data-analytics-top-smart-government-trends
4. http://www.networkworld.com/article/2176086/cloud-computing/big-data-drives-47--growth-for-top-50-public-cloud-companies.html
5. http://fcw.com/Articles/2012/04/30/FEAT-BizTech
6. Addison Snell, "Solving Big Data Problems with Private Cloud Storage"; "Cloud Computing in HPC: Usage and Types", Intersect360 Research, June 2011.
7. http://www.businessweek.com/articles/2013-08-07/the-future-of-big-data-apps-and-corporate-knowledge
8. http://www.techrepublic.com/blog/the-enterprise-cloud/cloud-computing-and-the-rise-of-big-data/
9. http://www.dummies.com/how-to/content/big-data-cloud-providers.html
10. https://en.wikipedia.org/wiki/Big_data
11. https://en.wikipedia.org/wiki/Cloud_computing
12. https://en.wikipedia.org/wiki/Vormetric