
 Many schemes have recently been proposed for storing data on multiple
clouds. Distributing data over different cloud storage providers (CSPs)
automatically gives users a certain degree of control over information
leakage, since no single point of attack can leak all the information. However,
unplanned distribution of data chunks can lead to high information
disclosure even when multiple clouds are used. In this paper, we study an
important information leakage problem caused by unplanned data
distribution in multicloud storage services.
 We present StoreSim, an information-leakage-aware storage system for the multicloud.
StoreSim aims to store syntactically similar data on the same cloud, thus minimizing the
user's information leakage across multiple clouds. We design an approximate algorithm to
efficiently generate similarity-preserving signatures for data chunks based on MinHash and
Bloom filters, and also design a function to compute the information leakage based on these
signatures. Next, we present an effective storage plan generation algorithm based on
clustering for distributing data chunks across multiple clouds with minimal information
leakage. Finally, we evaluate our scheme using two real datasets, from Wikipedia and GitHub.
We show that our scheme can reduce information leakage by up to 60% compared to
unplanned placement. Furthermore, our analysis of system attackability demonstrates that
our scheme makes attacks on information more complex.


 In fact, data deduplication, which is widely adopted by current
cloud storage services such as Dropbox, is one example of exploiting the
similarities among different data chunks to save disk space and avoid data
retransmission. It identifies identical data chunks by their fingerprints, which
are generated by fingerprinting algorithms such as SHA-1 or MD5. Any change
to the data will, with high probability, produce a very different fingerprint.
However, these fingerprints can only detect whether or not data chunks
are duplicates, which is good only for exact equality testing. Determining
identical chunks is relatively straightforward, but efficiently determining
similarity between chunks is an intricate task due to the lack of similarity-
preserving fingerprints (or signatures).
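The limitation of exact-match fingerprints is easy to see in a short sketch (illustrative Java, not the paper's code): SHA-1 digests of two chunks that differ by a single byte share nothing, so a fingerprint can only answer "identical or not", never "how similar".

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class FingerprintDemo {

    // Hex-encoded SHA-1 fingerprint of a data chunk.
    static String fingerprint(byte[] chunk) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : sha1.digest(chunk)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-1 is always available in the JDK
        }
    }

    public static void main(String[] args) {
        String a = "The quick brown fox jumps over the lazy dog";
        String b = "The quick brown fox jumps over the lazy cog"; // one letter changed
        // Identical chunks share a fingerprint, but a one-byte edit yields a
        // completely different digest, so similarity is not preserved.
        System.out.println(fingerprint(a.getBytes())); // 2fd4e1c67a2d28fced849ee1bb76e7391b93eb12
        System.out.println(fingerprint(b.getBytes())); // de9f2c7fd25e1b3afad3e85a0bd17d9b100db4b3
    }
}
```

This is exactly why deduplication works for equality testing but gives no handle on near-duplicate chunks.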
 Unplanned distribution of data chunks can lead to high information
disclosure even while using multiple clouds.

 Frequent modifications of files by users result in a large number of similar
chunks;
 Similar chunks occur across files, which is why existing CSPs use the data
deduplication technique.
 We present StoreSim, an information-leakage-aware multicloud storage
system which incorporates three important distributed entities, and we
formulate the information leakage optimization problem in the multicloud.

 We propose an approximate algorithm, BFSMinHash, based on MinHash, to
generate similarity-preserving signatures for data chunks.

 Based on the information match measured by BFSMinHash, we develop an
efficient storage plan generation algorithm, Clustering, for distributing users'
data to different clouds.
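As a rough sketch of the MinHash step behind such signatures (the Bloom-filter packing of BFSMinHash is omitted here, and the shingle width and hash family below are assumptions for illustration), a similarity-preserving signature can be built like this:

```java
import java.util.*;

public class MinHashSketch {
    // Number of min-hash rows in a signature; more rows = lower estimate variance.
    static final int K = 64;

    // Slide a window of width w over the chunk to get its set of shingles.
    static Set<String> shingles(String text, int w) {
        Set<String> s = new HashSet<>();
        for (int i = 0; i + w <= text.length(); i++) s.add(text.substring(i, i + w));
        return s;
    }

    // Signature = per-row minimum over all shingles, using K seeded hash mixes.
    static long[] signature(Set<String> shingles) {
        long[] sig = new long[K];
        Arrays.fill(sig, Long.MAX_VALUE);
        for (String sh : shingles) {
            for (int i = 0; i < K; i++) {
                long h = (sh.hashCode() * 0x9E3779B97F4A7C15L) ^ (i * 0xC2B2AE3D27D4EB4FL);
                h = Long.rotateLeft(h, 31) * 0x165667B19E3779F9L;
                if (h < sig[i]) sig[i] = h;
            }
        }
        return sig;
    }

    // Fraction of matching rows estimates the Jaccard similarity of the shingle sets.
    static double similarity(long[] a, long[] b) {
        int match = 0;
        for (int i = 0; i < K; i++) if (a[i] == b[i]) match++;
        return (double) match / K;
    }

    public static void main(String[] args) {
        long[] s1 = signature(shingles("a file chunk edited by the user over time", 4));
        long[] s2 = signature(shingles("a file chunk edited by the user over days", 4));
        long[] s3 = signature(shingles("completely unrelated binary payload data", 4));
        System.out.printf("similar pair:   %.2f%n", similarity(s1, s2));
        System.out.printf("unrelated pair: %.2f%n", similarity(s1, s3));
    }
}
```

Unlike a SHA-1 fingerprint, two nearly identical chunks here produce signatures that agree on most rows, which is what makes a leakage measure between chunks computable.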
 However, previous works employed only a single cloud that has both
compute and storage capacity. Our work is different in that we consider a
multicloud in which each storage cloud serves only as storage, without the
ability to compute.

 Our work is not alone in storing data across multiple CSPs; however,
these works focused on different issues such as cost optimization, data
consistency and availability.
 DATA OWNER
 METADATA SERVER
 CLOUD SERVICE PROVIDER
 DATA USER

 In this module, we develop the customer features and
functionalities. The customer first registers his/her details
and logs in. The customer can outsource sensitive and
valuable data to the cloud by encrypting the data and
splitting it into multiple parts.
 The data owner has the option to modify data that is
uploaded to the cloud. In this process, when the user
updates data stored in cloud1 with data that is already
available in cloud2, the total data becomes visible in
cloud1 alone. To avoid this, the owner checks data
similarity using MinHash, calculates the data matching
percentage, and advises the user where to upload the
data.
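The routing decision above can be sketched as a greedy rule (illustrative only; the match percentages below are hypothetical, and the paper's Clustering algorithm is more involved than a simple argmax): send each new chunk to the cloud whose stored data it matches best, so that similar chunks co-locate on one CSP.

```java
public class PlacementSketch {
    // Pick the cloud whose already-stored chunks best match the new chunk.
    // Co-locating similar chunks keeps the information any single CSP can
    // see (and leak) as low as possible.
    static int chooseCloud(double[] matchPercentPerCloud) {
        int best = 0;
        for (int c = 1; c < matchPercentPerCloud.length; c++) {
            if (matchPercentPerCloud[c] > matchPercentPerCloud[best]) best = c;
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical matching percentages of a new chunk against the data
        // already stored on cloud0, cloud1 and cloud2.
        double[] match = { 0.10, 0.82, 0.34 };
        System.out.println("upload to cloud" + chooseCloud(match)); // upload to cloud1
    }
}
```

In StoreSim these scores would come from comparing BFSMinHash signatures rather than being given directly.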
 Metadata servers are used to store the
metadata database with information about
files, CSPs and users, which is usually
structured data representing the whole cloud
file system.
 In this module, we design the cloud
functionalities. The cloud entity can view all
customer details, file upload details and
customer file download details. In this module,
we use the DriveHQ Cloud Service API for
cloud integration and develop the project.
 We consider a system of s storage servers S1, . . .
, Ss, each of which stores part of the data uploaded
by the data owner. We assume that each server
appropriately authenticates users. For simplicity
and without loss of generality, we focus on the
read/update storage abstraction, which exports two
operations: read and update.
 The data user module lets the user download
data by requesting a file from the cloud. The
read routine fetches the stored value v from
the servers: for each j ∈ [1 . . . s], piece vj is
downloaded from server Sj, and all pieces are
combined into v.
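A minimal sketch of the split/combine pair behind this abstraction (plain byte-range splitting is an assumption for illustration; StoreSim additionally encrypts the data and chooses chunk placement by similarity):

```java
import java.util.*;

public class SplitCombine {
    // Split a value v into s pieces, one per storage server S1..Ss.
    static byte[][] split(byte[] v, int s) {
        byte[][] pieces = new byte[s][];
        int base = v.length / s, rem = v.length % s, off = 0;
        for (int j = 0; j < s; j++) {
            int len = base + (j < rem ? 1 : 0); // spread the remainder evenly
            pieces[j] = Arrays.copyOfRange(v, off, off + len);
            off += len;
        }
        return pieces;
    }

    // The read routine: download piece vj from each server Sj and concatenate.
    static byte[] combine(byte[][] pieces) {
        int total = 0;
        for (byte[] p : pieces) total += p.length;
        byte[] v = new byte[total];
        int off = 0;
        for (byte[] p : pieces) {
            System.arraycopy(p, 0, v, off, p.length);
            off += p.length;
        }
        return v;
    }

    public static void main(String[] args) {
        byte[] v = "sensitive user data".getBytes();
        byte[][] pieces = split(v, 4); // one piece per server
        System.out.println(new String(combine(pieces))); // sensitive user data
    }
}
```

Because no single server holds the whole value, a compromise of one CSP exposes only its own piece; the leakage question is how much that piece reveals.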
HARDWARE REQUIREMENTS:

 System : Pentium IV 2.4 GHz.

 Hard Disk : 140 GB.

 Floppy Drive : 1.44 MB.

 Mouse : Logitech.

 RAM : 1 GB.

SOFTWARE REQUIREMENTS:

 Operating system : Windows XP/7.

 Coding Language : JAVA/J2EE

 Data Base : MYSQL

 Tool : Netbeans 7.4


 Many operating system designs can be placed into one of two very rough categories,
depending upon how they implement and use the notions of process and synchronization. One
category, the "Message-oriented System," is characterized by a relatively small, static number
of processes with an explicit message system for communicating among them. The other
category, the "Procedure-oriented System," is characterized by a large, rapidly changing
number of small processes and a process synchronization mechanism based on shared data. In
this paper, it is demonstrated that these two categories are duals of each other and that a
system which is constructed according to one model has a direct counterpart in the other. The
principal conclusion is that neither model is inherently preferable, and the main consideration
for choosing between them is the nature of the machine architecture upon which the system is
being built, not the application which the system will ultimately support. ("On the Duality of
Operating System Structures.")
 The increasing popularity of cloud storage services has led companies that handle critical
data to think about using these services for their storage needs. Medical record databases,
large biomedical datasets, historical information about power systems and financial data are
some examples of critical data that could be moved to the cloud. However, the reliability and
security of data stored in the cloud still remain major concerns. In this work we present
DepSky, a system that improves the availability, integrity, and confidentiality of information
stored in the cloud through the encryption, encoding, and replication of the data on diverse
clouds that form a cloud-of-clouds. We deployed our system using four commercial clouds
and used PlanetLab to run clients accessing the service from different countries. We observed
that our protocols improved the perceived availability, and in most cases, the access latency,
when compared with cloud providers individually. Moreover, the monetary cost of using
DepSky in this scenario is at most twice the cost of using a single cloud, which is optimal and
seems to be a reasonable cost, given the benefits.
 System analysis is an important activity that takes
place when we are building a new system or
changing an existing one. Analysis helps us
understand the existing system and the
requirements necessary for building the new
system. If there is no existing system, then analysis
defines only the requirements.
 One of the most important factors in system
analysis is understanding the system and its
problems. A good understanding of the system
enables the designer to identify and correct problems.
The new system is planned based on the drawbacks
of the existing system, so the given problem has to
be defined and analyzed.
1 Identification of customer needs
2 Create system definitions
3 Perform technical analysis
 Systems design is the process or art of
defining the architecture, components,
modules, interfaces, and data for a system to
satisfy specified requirements. One could see
it as the application of systems theory to
product development. There is some overlap
and synergy with the disciplines of systems
analysis, systems architecture and systems
engineering.
 J. Crowcroft, "On the duality of resilience and privacy," in Proceedings of the Royal Society of
London A: Mathematical, Physical and Engineering Sciences, vol. 471, no. 2175. The Royal
Society, 2015, p. 20140862.

 A. Bessani, M. Correia, B. Quaresma, F. André, and P. Sousa, "DepSky: dependable and secure
storage in a cloud-of-clouds," ACM Transactions on Storage (TOS), vol. 9, no. 4, p. 12, 2013.

 H. Chen, Y. Hu, P. Lee, and Y. Tang, "NCCloud: A network-coding-based storage system in a
cloud-of-clouds," 2013.

 T. G. Papaioannou, N. Bonvin, and K. Aberer, "Scalia: an adaptive scheme for efficient multi-
cloud storage," in Proceedings of the International Conference on High Performance Computing,
Networking, Storage and Analysis. IEEE Computer Society Press, 2012, p. 20.

 Z. Wu, M. Butkiewicz, D. Perkins, E. Katz-Bassett, and H. V. Madhyastha, "SpanStore: Cost-
effective geo-replicated storage spanning multiple cloud services," in Proceedings of the
Twenty-Fourth ACM Symposium on Operating Systems Principles. ACM, 2013, pp. 292–308.

 G. Greenwald and E. MacAskill, "NSA Prism program taps in to user data of Apple, Google and
others," The Guardian, vol. 7, no. 6, pp. 1–43, 2013.
