
An Effective Image File Storage Technique Using Data De-Duplication


ABSTRACT
Recent years have seen rapid growth in the number of virtual machines and virtual machine images that are managed to support infrastructure as a service (IaaS). For example, Amazon Elastic Compute Cloud (EC2) has 6,521 public virtual machine images. This creates several challenges in the management of image files in a cloud computing environment.
In particular, the large amount of duplicate data in image files consumes significant storage space. To address this problem, we propose an effective image file storage technique using data de-duplication with a modified fixed-size block scheme. When a user requests to store an image file, the technique first calculates the fingerprint of the image file and then compares it with the fingerprints in a fingerprint library. If the fingerprint of the image is already in the library, the image is stored as a pointer to the existing fingerprint. Otherwise, the image is processed using the fixed-size block image segmentation method.
The experiments show that this technique can significantly reduce the transmission time of image files that already exist in storage. In addition, the deletion rate for image groups that have the same version of the operating system but different versions of software applications reaches about 58%.
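
The storage flow described above (whole-file fingerprint lookup first, fixed-size block segmentation otherwise) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe how the fixed-size scheme is modified, so plain fixed-size blocks are used here, and the SHA-256 fingerprint, the 4096-byte block size, and the `DedupStore` class with its in-memory dictionaries are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; the abstract does not specify one


def fingerprint(data: bytes) -> str:
    """Return a hex digest used as a fingerprint (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


class DedupStore:
    """Hypothetical in-memory store: file-level check, then block-level dedup."""

    def __init__(self, block_size: int = BLOCK_SIZE):
        self.block_size = block_size
        self.blocks = {}  # block fingerprint -> block bytes (stored once)
        self.images = {}  # image fingerprint -> ordered block fingerprints
        self.names = {}   # image name -> image fingerprint (the "pointer")

    def store(self, name: str, data: bytes) -> bool:
        """Store an image; return True if the whole file was already present."""
        image_fp = fingerprint(data)
        if image_fp in self.images:
            # Fingerprint found in the library: record a pointer only.
            self.names[name] = image_fp
            return True

        # New image: split into fixed-size blocks and keep each unique block once.
        recipe = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            block_fp = fingerprint(block)
            self.blocks.setdefault(block_fp, block)
            recipe.append(block_fp)

        self.images[image_fp] = recipe
        self.names[name] = image_fp
        return False

    def load(self, name: str) -> bytes:
        """Rebuild an image from its ordered list of block fingerprints."""
        recipe = self.images[self.names[name]]
        return b"".join(self.blocks[fp] for fp in recipe)
```

In this sketch, storing a byte-identical image a second time adds only a name-to-fingerprint entry, and two images that share some aligned blocks store each shared block once. That is the kind of saving the abstract reports for image groups with the same operating system version.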

Keywords: cloud computing, image files, data deduplication.

CHAPTER 1
INTRODUCTION

Cloud Computing enables universal, expedient network access to a shared
