Unity Id: vjsangha
Paper Title: Dynamic Data Deduplication on Cloud Storage
Authors: Waraporn Leesakul, Paul Townend, Jie Xu (School of Computing, University of Leeds, Leeds)
Link: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6830924&tag=1
The paper presents a dynamic data deduplication model. The authors' main motivation is to apply deduplication to the dynamic nature of cloud data; current approaches to deduplication in the cloud focus mostly on static data. They emphasize that we are not using the full power of deduplication, since most of the data in the cloud is dynamic in nature. The paper starts with a detailed discussion of cloud computing and deduplication. The authors then analyze existing work on cloud deduplication.
So it makes perfect sense to proceed in that flow and introduce subsequent topics in order. Another strength I noticed is that when the authors want to introduce their new architecture, they do not jump directly into explaining it. They first analyze existing architectures such as SAM, AA-Dedupe, CABdedupe, and SHHC, noting the benefits and shortcomings of each one. They then conclude that all of these architectures fail to consider the dynamic nature of data during deduplication; this is how they establish the need for their own architecture and introduce it. Two big strengths of the architecture are that it handles the issue of dynamicity and that it considers quality of service when deciding how many copies of a file or chunk should be replicated. The paper is also very strong on experimental results for the proposed solution. The authors test the model under many setups: changing the number of deduplicators, changing the type of operation, and changing the value of the quality-of-service factor. Overall, they cover every relevant setup to conclude that more than 90% time savings can be achieved. Another strong point is that the authors are very clear about which factors are compromised in the study, and they discuss its scope. For example, they note that a possible extension could monitor access patterns.
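To make the deduplication mechanism concrete, here is a minimal sketch of the hash-based, chunk-level duplicate detection that such architectures build on. This is an illustration, not the authors' implementation; the fixed 4 KB chunk size and the in-memory index are assumptions for the example.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size for illustration

# In-memory index mapping chunk fingerprint -> stored chunk.
# A real deduplicator would use a persistent, distributed index.
chunk_store: dict[str, bytes] = {}

def store_file(data: bytes) -> list[str]:
    """Split data into chunks, store only unseen chunks,
    and return the list of fingerprints (the file recipe)."""
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        fp = hashlib.sha256(chunk).hexdigest()
        if fp not in chunk_store:       # duplicate check
            chunk_store[fp] = chunk     # store each unique chunk once
        recipe.append(fp)
    return recipe

def restore_file(recipe: list[str]) -> bytes:
    """Rebuild the original file from its chunk fingerprints."""
    return b"".join(chunk_store[fp] for fp in recipe)
```

Savings come from files that share chunks: a second upload of mostly identical data stores only the chunks that changed, which is why the number of deduplicators and the replication factor matter in the experiments.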
Cloud computing is the use of remote servers over the internet to store, manage, and process data instead of using a personal computer. It is a set of information technology services with the ability to scale service requirements up or down, and most cloud services are provided by a third-party service provider. With cloud computing, organizations can use IT services without up-front investment. Despite the benefits of cloud computing, organizations have been slow to adopt it due to security issues and challenges. Security is one of the major problems hindering the growth of the cloud: it is not wise to hand important data over to another company, so clients need to be vigilant in understanding the risks of data breaches in this new environment. This paper presents a detailed analysis of cloud computing security issues and challenges. (Ayoleke)
This document serves as an overview of Data Loss Prevention (DLP). The following research content will help individuals understand the various concepts of Data Loss Prevention. The paper also examines the different reasons why Data Loss Prevention is essential, the various types of DLP, their operational modes, DLP planning and design strategies, possible deployment scenarios, workflow and best practices for DLP operations, and various other elements.
This paper clearly illustrates the concepts of data warehousing and cloud computing, and discusses the benefits and disadvantages of implementing a data warehouse (DWH) in the cloud. Both cloud computing and data warehousing are current trends in modern computing. A DWH is an integrated software component of the cloud that provides timely and accurate responses to complex queries through Online Analytical Processing (OLAP) and data mining tools. Cloud computing provides reasonable service speed in less time compared to an in-house data warehouse deployment; reduced cost, a pay-per-use payment model, and backups are also made available by the cloud. Besides these benefits, there are several challenges when deploying data warehouses in the cloud, including security, computational, and network problems. These problems are often caused by functional requirements that fit poorly with cloud deployment of a DWH: cloud providers typically offer low-end nodes for computation, whereas a local data warehousing system is strong in CPU, memory, and disk bandwidth. The growing need for cloud computing will drive its further evolution to accommodate critical DWH workloads, and the evolving nature of cloud computing and DWH will help small and medium-sized businesses.
Because of its wide range of applications, the cloud allows users to store their data remotely, enjoy on-demand, high-quality cloud applications, and shed the burden of local storage, its cost, and its maintenance. From the user's perspective, both individuals and enterprises find the cloud appealing: storing data remotely in a flexible, on-demand manner relieves the burden of storage management, provides universal data access independent of geographical location, and avoids capital expenditure on software, hardware, personnel management, maintenance, and so on.
ABSTRACT: Cloud computing is a method of providing a set of shared computing resources that includes applications, computing, storage, networking, development and deployment platforms, and business processes. In cloud computing, when data is transferred, stored, and later retrieved, there is no assurance that the stored data is secure or that it has not been changed by the cloud or the third-party auditor (TPA). Security and control over data therefore remain significant in plans for cloud computing initiatives. Existing research allows data integrity to be verified, but it still has several drawbacks. First, a basic and essential authorization/authentication process between the cloud service provider and the TPA is missing; techniques such as authentication and encryption are important, and our system provides them to address these security issues. Second, in the recent proof-of-retrievability (POR) protocol, in which the verifier stores only a single cryptographic key, POR is only capable of detecting file corruption or loss, not preventing it; maintaining the storage can be a tough task, and the implementation requires high resource costs. This paper proposes a formal analysis method called full
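As a rough illustration of the kind of integrity check such schemes build on, here is a minimal challenge-response sketch in which a verifier keeps a single secret key and checks the data returned by the cloud against an HMAC tag. This is a simplification for illustration only, not the POR protocol or the paper's scheme; all names and parameters are assumptions. Like the single-key POR described above, it can detect corruption or loss but cannot prevent it.

```python
import hashlib
import hmac
import os

def make_tag(key: bytes, data: bytes) -> bytes:
    """Owner computes an integrity tag before uploading."""
    return hmac.new(key, data, hashlib.sha256).digest()

def verify(key: bytes, stored_data: bytes, tag: bytes) -> bool:
    """Verifier re-computes the tag over the data returned by the
    cloud and compares in constant time."""
    return hmac.compare_digest(make_tag(key, stored_data), tag)

key = os.urandom(32)           # the single secret the verifier stores
data = b"file contents"
tag = make_tag(key, data)

assert verify(key, data, tag)             # intact file passes
assert not verify(key, b"tampered", tag)  # corruption is detected
```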
As a Business System Analyst, most of my responsibilities involve working with big data. Data management is key to the success of any company; it includes data security, data availability, and assessment. Many companies need to replicate data that resides in the cloud or in other infrastructures for various purposes, including uptime and resilience, remote backup and storage, branch and other offices, and building a business in a box, to mention a few (Kleyman, 2013). Since data replication in a cloud environment is so important, companies should ensure that they use the best technique for carrying out this activity. This document provides an analysis of different techniques of data replication in a cloud environment, along with my findings on the analysis, a recommendation, and a conclusion.
We consider a cloud computing environment consisting of a cloud service provider (CSP), a data owner, and many users (some with read-only permission and some with both read and write permission). The CSP maintains cloud infrastructures, which pool the bandwidth, storage space, and CPU power of many cloud servers to provide 24/7 services. The CSP mainly provides two services: data storage and re-encryption. After obtaining the encrypted data from the data owner, the CSP stores it. On receiving a data access request from a user, the CSP re-encrypts the ciphertext based on the user's attributes and returns the re-encrypted ciphertext.
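To make the storage and re-encryption flow concrete, here is a minimal sketch using symmetric keys, where the CSP naively decrypts and re-encrypts on request. Real attribute-based proxy re-encryption schemes transform the ciphertext directly so the CSP never sees the plaintext; this simplification, and every key and function name below, are assumptions for illustration only.

```python
from cryptography.fernet import Fernet  # pip install cryptography

# Data owner encrypts before uploading to the CSP.
owner_key = Fernet.generate_key()
ciphertext = Fernet(owner_key).encrypt(b"sensitive record")

def csp_reencrypt(ct: bytes, re_key: bytes, user_key: bytes) -> bytes:
    """Naive stand-in for proxy re-encryption: decrypt with the
    re-encryption key, then encrypt under the requesting user's key.
    A real attribute-based scheme would transform the ciphertext
    without ever exposing the plaintext to the CSP."""
    plaintext = Fernet(re_key).decrypt(ct)
    return Fernet(user_key).encrypt(plaintext)

# A user with valid attributes requests access.
user_key = Fernet.generate_key()
reencrypted = csp_reencrypt(ciphertext, owner_key, user_key)
assert Fernet(user_key).decrypt(reencrypted) == b"sensitive record"
```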
We propose to check for duplication at the earliest stage. In the old architecture, duplication was checked only after the data had been converted into another form, that is, after encryption.
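A minimal sketch of this idea follows, with all names assumed for illustration: the client fingerprints the plaintext and consults the duplicate index before encrypting and uploading. Checking after encryption is far less effective, because the same file encrypted under different keys produces different ciphertexts and the duplicates go undetected.

```python
import hashlib

server_index: set[str] = set()  # fingerprints of already-stored data

def upload(plaintext: bytes, encrypt) -> str:
    """Check for duplicates on the plaintext *before* encryption."""
    fp = hashlib.sha256(plaintext).hexdigest()
    if fp in server_index:
        return "duplicate: upload skipped"
    server_index.add(fp)
    ciphertext = encrypt(plaintext)  # encrypt only new data
    # ... send ciphertext to storage ...
    return "uploaded new data"
```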
Though client-side deduplication brings in advantages such as reduction of network bandwidth and data upload time, several security threats are associated with it. These threats have to be addressed in order to reap the full potential of client-side deduplication. Halevi et al. [3] identified various threats that affect a remote storage system that implements client-side deduplication.
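One such threat is that a short fingerprint can act as a proxy for the whole file: an attacker who learns only the hash could claim to own the file and retrieve it. As a rough sketch of a mitigation in that spirit (a simplified stand-in for the Merkle-tree proofs of ownership proposed in that line of work; the block size and names are assumptions), the server can challenge the client for the contents of randomly chosen blocks, which only a party holding the full file can answer:

```python
import hashlib
import secrets

BLOCK = 1024  # assumed block size for illustration

def blocks(data: bytes) -> list[bytes]:
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def challenge(n_blocks: int, k: int = 3) -> list[int]:
    """Server picks k random block indices the client must prove."""
    return [secrets.randbelow(n_blocks) for _ in range(k)]

def respond(data: bytes, idxs: list[int]) -> list[str]:
    """Client answers with hashes of the challenged blocks,
    which it can only compute if it actually holds the file."""
    bs = blocks(data)
    return [hashlib.sha256(bs[i]).hexdigest() for i in idxs]

def verify(data: bytes, idxs: list[int], answers: list[str]) -> bool:
    """Server, which holds the file, checks the client's answers."""
    return respond(data, idxs) == answers
```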
The new cloud storage, known as OneDrive, features many new updates that can be used with Microsoft Word as well as other Microsoft Office applications such as Excel and PowerPoint. What sets it apart from its predecessor, SkyDrive, is that OneDrive gives you one place for all of your files, including photos, videos, and documents, available across the devices you use every day. One of the main problems with SkyDrive was that, for a cloud storage service, it was very difficult to use effectively. In fact, the vast majority of Microsoft customers still had their data spread out across numerous folders despite being fully aware of SkyDrive's purpose, and many users familiar with the cloud still had content stored on a device that was not backed up elsewhere.
Moreover, we will discuss some threats and challenges of using cloud services to reduce cost, minimise data loss, and shorten restore time.
The National Institute of Standards and Technology describes cloud storage as a model for enabling ubiquitous, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal effort or service provider interaction. It comprises a collection of hardware and software that allows the cloud infrastructure to work in a seamless, unified way. Depending on the classification of the information and the service provider, the remote servers may be located within the same facility. The stored data is
Instead of occupying all the available space on a personal computer, a user can keep files in the cloud and retrieve them on demand.