ABSTRACT
In cloud storage services, deduplication technology is commonly used to reduce the
space and bandwidth requirements of services by eliminating redundant data and storing only
a single copy of them. Deduplication is most effective when multiple users outsource the
same data to the cloud storage, but it raises issues relating to security and ownership. Proof
of ownership schemes allow any owner of the same data to prove to the cloud storage server
that he owns the data in a robust way. However, many users are likely to encrypt their data
before outsourcing them to the cloud storage to preserve privacy, but this hampers
deduplication because of the randomization property of encryption. Recently, several
deduplication schemes have been proposed to solve this problem by allowing each owner to
share the same encryption key for the same data. However, most of the schemes suffer from
security flaws, since they do not consider the dynamic changes in the ownership of
outsourced data that occur frequently in a practical cloud storage service. In this paper, we
propose a novel server-side deduplication scheme for encrypted data. It allows the cloud
server to control access to outsourced data even when the ownership changes dynamically by
exploiting randomized convergent encryption and secure ownership group key distribution.
This prevents data leakage not only to revoked users even though they previously owned that
data, but also to an honest-but-curious cloud storage server. In addition, the proposed scheme
guarantees data integrity against any tag inconsistency attack. Thus, security is enhanced in
the proposed scheme. The efficiency analysis results demonstrate that the proposed scheme is
almost as efficient as the previous schemes, while the additional computational overhead is
negligible.
CHAPTER 1
INTRODUCTION
The term cloud computing is a recent buzzword in the IT world. Behind this fancy
poetic phrase lies a true picture of the future of computing, from both a technical
and a social perspective. Though the term Cloud Computing is recent, the idea of
centralizing computation and storage in distributed data centers maintained by third
party companies is not new; it dates back to the 1990s, along with distributed
computing approaches like grid computing. Cloud computing aims to provide IT as a
service to cloud users on demand, with greater flexibility, availability, reliability and
scalability, under a utility computing model.
private networks, i.e., WAN, LAN or VPN. Applications such as e-mail, web conferencing,
customer relationship management (CRM), all run in the cloud.
Basic Concepts
There are certain services and models working behind the scenes that make
cloud computing feasible and accessible to end users. The working
models for cloud computing are:
Deployment Models
Service Models
Deployment models define the type of access to the cloud. A cloud can have any of four
types of access: Public, Private, Hybrid and Community. The Public Cloud allows systems
and services to be easily accessible to the general public. A public cloud may be less secure
because of its openness, e.g., e-mail. The Private Cloud allows systems and services to be
accessible within an organization. It offers increased security because of its private nature.
The Community Cloud allows systems and services to be accessible by a group of
organizations. The Hybrid Cloud is a mixture of public and private clouds, in which critical
activities are performed using the private cloud while non-critical activities are performed
using the public cloud.
1.2 SERVICE MODELS
Service Models are the reference models on which Cloud Computing is based. These can be
categorized into three basic service models as listed below:
Infrastructure as a Service (IaaS)
Platform as a Service (PaaS)
Software as a Service (SaaS)
There are many other service models, all of which take the form XaaS, i.e., Anything
as a Service. This can be Network as a Service, Business as a Service, Identity as a Service,
Database as a Service or Strategy as a Service. Infrastructure as a Service (IaaS) is the
most basic level of service. Each service model makes use of the underlying service
model, i.e., each inherits the security and management mechanisms of the underlying
model, as shown in the following diagram:
PaaS provides the runtime environment for applications, development and deployment
tools, etc. In a PaaS model, a cloud provider delivers hardware and software tools, usually
those needed for application development, to its users as a service. A PaaS provider hosts
the hardware and software on its own infrastructure. As a result, PaaS frees users from having
to install in-house hardware and software to develop or run a new application.
PaaS does not typically replace a business' entire infrastructure. Instead, a business
relies on PaaS providers for key services, such as Java development or application hosting.
For example, deploying a typical business tool locally might require an IT team to buy and
install hardware, operating systems, middleware (such as databases, Web servers and so on)
and the actual application, define user access or security, and then add the application to
existing systems management or application performance monitoring (APM) tools. IT teams must
then maintain all of these resources over time. A PaaS provider, however, supports all the
underlying computing and software; users only need to log in and start using the platform,
usually through a Web browser interface. Most PaaS platforms are geared toward software
development, and they offer developers several advantages. For example, PaaS allows
developers to frequently change or upgrade operating system features. It also helps
development teams collaborate on projects.
Users typically access PaaS through a Web browser. PaaS providers then charge for that
access on a per-use basis. Some PaaS providers charge a flat monthly fee to access the
platform and the apps hosted within it. It is important to discuss pricing, service uptime and
support with a PaaS provider before engaging their services. Since users rely on a provider's
infrastructure and software, vendor lock-in can be an issue in PaaS environments. Other risks
associated with PaaS are provider downtime or a provider changing its development
roadmap. If a provider stops supporting a certain programming language, users may be forced
to change their programming language, or the provider itself. Both are difficult and disruptive
steps. Common PaaS vendors include Salesforce.com's Force.com, which provides an
enterprise customer relationship management (CRM) platform. PaaS platforms for software
development and management include Appear IQ, Mendix, Amazon Web Services (AWS)
Elastic Beanstalk, Google App Engine and Heroku.
1.1.3 SOFTWARE AS A SERVICE (SaaS)
The SaaS model delivers software applications to end users as a service. SaaS
removes the need for organizations to install and run applications on their own computers or
in their own data centers. This eliminates the expense of hardware acquisition, provisioning
and maintenance, as well as software licensing, installation and support. Other benefits of the
SaaS model include:
Accessibility and persistence: Since SaaS applications are delivered over the
Internet, users can access them from any Internet-enabled device and location.
But SaaS also poses some potential disadvantages. Businesses must rely on outside
vendors to provide the software, keep that software up and running, track and report accurate
billing and facilitate a secure environment for the business' data. A provider that experiences
service disruptions, imposes unwanted changes to service offerings, or suffers a security
breach or any other issue can have a profound effect on its customers' ability to use those
SaaS offerings. As a result, users should understand their SaaS provider's service-level
agreement, and make sure it is enforced. SaaS is closely related to the ASP (application
service provider) and on-demand computing software delivery models. The hosted
application management model of SaaS is similar to ASP: the provider hosts the customer's
software and delivers it to approved end users over the Internet. In the software on
demand SaaS model, the provider gives customers network-based access to a single copy of
an application that the provider created specifically for SaaS distribution. The application's
source code is the same for all customers, and when new features or functionalities are rolled
out, they are rolled out to all customers. Depending upon the service level agreement (SLA),
the customer's data for each model may be stored locally, in the cloud or both locally and in
the cloud.
Organizations can integrate SaaS applications with other software
using application programming interfaces (APIs). For example, a business can write its own
software tools and use the SaaS provider's APIs to integrate those tools with the SaaS
offering. There are SaaS applications for fundamental business technologies, such as email,
sales management, customer relationship management (CRM), financial management, human
resource management, billing and collaboration. Leading SaaS providers include Salesforce,
Oracle, SAP, Intuit and Microsoft.
The concept of cloud computing came into existence in the 1950s with the implementation of
mainframe computers, accessible via thin/static clients. Since then, cloud computing has
evolved from static clients to dynamic ones and from software to services. Cloud
Computing has numerous advantages. Users can access applications as
utilities over the Internet, and can manipulate and configure them online at any time. No
specific piece of software needs to be installed to access or manipulate a cloud
application. Cloud Computing offers online development and deployment tools and a
programming runtime environment through the Platform as a Service model. Cloud resources
are available over the network in a manner that provides platform-independent access to any
type of client. Cloud Computing offers on-demand self-service: resources can be used
without interaction with the cloud service provider. Cloud Computing is highly cost effective
because it operates at higher efficiencies with greater utilization; it requires only an Internet
connection. Cloud Computing also offers load balancing, which makes it more reliable.
Cloud users use the services offered by cloud providers to build their
applications on the Internet and deliver them to their end users. Cloud users therefore do not
have to worry about installing and maintaining the hardware and software needed. They can also
afford these services, since they pay only for what they use. Cloud users can thus reduce
their expenditure and effort in the field of IT by using cloud services instead of establishing IT
infrastructure themselves. The cloud is essentially provided by large distributed data centers.
These data centers are often organized as a grid, and the cloud is built on top of the grid
services. Cloud users are provided with virtual images of the physical machines in the data
centers. This virtualization is one of the key concepts of cloud computing, as it essentially
builds an abstraction over the physical system. Many cloud applications are gaining
popularity day by day for their availability, reliability, scalability and utility model. These
applications have made distributed computing easy, as the critical aspects are
handled by the cloud provider itself.
utilizing cloud computing in a proper way. Both of these applications of cloud computing
have technological as well as social challenges to overcome.
How can the cloud server detect that the underlying file M is the same?
Even if it can detect this, how can it allow both parties to recover the stored
data based on their separate secret keys?
A convergent encryption algorithm encrypts an input file with the hash value of the
input file as an encryption key. The ciphertext is given to the server and the user retains the
encryption key. Since convergent encryption is deterministic, identical files are always
encrypted into identical ciphertext, regardless of who encrypts them. Thus, the cloud storage
server can perform deduplication over the ciphertext, and all owners of the file can download
the ciphertext (optionally after a proof-of-ownership (PoW) process) and decrypt it later
since they have the same encryption key for the file.
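The behaviour described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the construction used by any cited scheme: the cipher is a toy SHA-256 counter-mode keystream standing in for a real block cipher, and the names `convergent_encrypt` and `DedupStore` are hypothetical. It contrasts per-user random keys, which defeat deduplication, with a key derived from the file itself.

```python
import hashlib
import os


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode); a stand-in for a real cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, data: bytes) -> bytes:
    """Deterministic XOR encryption: same key and data give the same output."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


decrypt = encrypt  # an XOR stream cipher is its own inverse


def convergent_encrypt(data: bytes):
    """Return (key, ciphertext) with key = H(data), as in convergent encryption."""
    key = hashlib.sha256(data).digest()
    return key, encrypt(key, data)


class DedupStore:
    """Hypothetical server that keeps one copy per distinct ciphertext."""

    def __init__(self):
        self.blobs = {}

    def upload(self, ciphertext: bytes) -> str:
        tag = hashlib.sha256(ciphertext).hexdigest()
        self.blobs.setdefault(tag, ciphertext)  # a duplicate is not stored again
        return tag

    def download(self, tag: str) -> bytes:
        return self.blobs[tag]


file_m = b"quarterly report, identical for both users"

# Per-user random keys: same file, different ciphertexts, no deduplication.
c1 = encrypt(os.urandom(32), file_m)
c2 = encrypt(os.urandom(32), file_m)

# Convergent encryption: same file, same key, same ciphertext.
store = DedupStore()
key_a, ct_a = convergent_encrypt(file_m)
key_b, ct_b = convergent_encrypt(file_m)
tag_a = store.upload(ct_a)
tag_b = store.upload(ct_b)
recovered = decrypt(key_b, store.download(tag_b))
```

Both users end up with the same tag, the store holds a single blob, and either user can decrypt it with the key derived locally from the file.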
Convergent encryption has long been studied in commercial systems and has different
encryption variants for secure deduplication [8],[16],[17],[18]; it was later formalized as
message locked encryption in [20]. However, convergent encryption suffers from
security flaws with regard to tag consistency and ownership revocation. As an example of the
tag consistency attack, suppose User1 and User2 have the same data M. User1
generates ciphertext CA from M, and then maliciously generates another ciphertext CA' from
M' (≠ M). Next, she uploads CA' with an honestly generated tag T(CA) = H(M) for a
cryptographic hash function H, which plays the role of data index. When User2 generates
ciphertext CB from M and tries to upload CB, the cloud server checks that T(CA) = T(CB). Then
it deletes CB and keeps only CA'. Afterwards, when User2 downloads and decrypts it, the
data would be M', not M, which means the integrity of his data has been compromised.
Recently, message locked encryption (MLE) [20] and leakage-resilient deduplication [19]
schemes have been proposed to solve this problem by introducing an additional integrity check
phase for decrypted data.
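The attack and the post-decryption check can be illustrated with a small sketch. It rests on several assumptions that are not taken from the cited schemes: the cipher is a toy SHA-256 keystream, the `Server` class is hypothetical and trusts the tag supplied by the uploader, and the key and tag are derived from M with different prefixes so that the tag does not reveal the key.

```python
import hashlib


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy deterministic cipher (SHA-256 counter-mode keystream); self-inverse."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def derive_key(data: bytes) -> bytes:
    """Convergent key K derived from the data (prefix is an assumption)."""
    return hashlib.sha256(b"key|" + data).digest()


def derive_tag(data: bytes) -> bytes:
    """Tag T derived from the data; used by the server as the data index."""
    return hashlib.sha256(b"tag|" + data).digest()


class Server:
    """Hypothetical server that trusts the tag supplied by the uploader."""

    def __init__(self):
        self.by_tag = {}

    def upload(self, tag: bytes, ciphertext: bytes) -> None:
        # Deduplication: a later upload with a known tag is discarded.
        self.by_tag.setdefault(tag, ciphertext)

    def download(self, tag: bytes) -> bytes:
        return self.by_tag[tag]


def checked_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Decrypt, then verify that the plaintext derives back to the same key."""
    plaintext = xor_cipher(key, ciphertext)
    if derive_key(plaintext) != key:
        raise ValueError("tag inconsistency: plaintext does not match key")
    return plaintext


m = b"contract v1: pay 100"
m_forged = b"contract v1: pay 999"
key, tag = derive_key(m), derive_tag(m)

server = Server()
# User1 (malicious): encrypts M' but registers it under the honest tag of M.
server.upload(tag, xor_cipher(key, m_forged))
# User2 (honest): uploads CB; the server sees a matching tag and drops it.
server.upload(tag, xor_cipher(key, m))

# Without a check, User2 silently recovers M' instead of M.
naive = xor_cipher(key, server.download(tag))

# With the post-decryption check, the substitution is detected.
try:
    checked_decrypt(key, server.download(tag))
    detected = False
except ValueError:
    detected = True
```

The check only detects the substitution after download; it does not restore User2's original data, which the server already discarded.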
On the other hand, when a user uploads data that already exist in the cloud storage,
the user should be deterred from accessing the data that were stored before he obtained
ownership by uploading them (backward secrecy). Such dynamic ownership changes may occur
very frequently in a practical cloud system, and thus they should be properly managed in order
to avoid degrading the security of the cloud service. However, the previous deduplication
schemes could not achieve secure access control in an environment where ownership changes
dynamically, despite its importance to secure deduplication, because the encryption key is
derived deterministically and rarely updated after the initial key derivation. Therefore, for as
long as revoked users keep the encryption key, they can access the corresponding data in the
cloud storage at any time, regardless of the validity of their ownership.
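The revocation problem can be made concrete with a short sketch. It is an illustration only and not the construction proposed in this report: it assumes a hypothetical server (`OwnershipAwareStore`) that wraps the convergent ciphertext in an outer layer under a random group key and replaces that key whenever the set of owners changes. The cipher is again a toy SHA-256 keystream.

```python
import hashlib
import os


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy deterministic cipher (SHA-256 counter-mode keystream); self-inverse."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class OwnershipAwareStore:
    """Hypothetical server: outer layer under a group key, rotated on change."""

    def __init__(self, convergent_ciphertext: bytes, owners):
        self._inner = convergent_ciphertext
        self.owners = set(owners)
        self.group_key = os.urandom(32)

    def revoke(self, user: str) -> None:
        self.owners.discard(user)
        self.group_key = os.urandom(32)  # fresh key after every ownership change

    def key_for(self, user: str) -> bytes:
        if user not in self.owners:
            raise PermissionError(f"{user} is not a current owner")
        return self.group_key

    def download(self) -> bytes:
        # The stored form is always wrapped under the current group key.
        return xor_cipher(self.group_key, self._inner)


m = b"shared design document"
k = hashlib.sha256(m).digest()  # static convergent key, never updated
store = OwnershipAwareStore(xor_cipher(k, m), {"alice", "bob"})

# Bob records the group key while he is still an owner, then is revoked.
bob_old_group_key = store.key_for("bob")
store.revoke("bob")

# Alice, still an owner, removes both layers and recovers M.
alice_view = xor_cipher(k, xor_cipher(store.key_for("alice"), store.download()))

# Bob still holds K and the old group key, but they no longer fit the stored form.
bob_view = xor_cipher(k, xor_cipher(bob_old_group_key, store.download()))
```

With the static key K alone, Bob could decrypt the stored data indefinitely; the rotated outer layer is what makes his retained keys useless against the copy held by the server.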