Research Article

De-duplication Approaches in Cloud Computing Environment: A Survey

by Fatemeh Shieh, Mostafa Ghobaei Arani, Mahboubeh Shamsi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 120 - Number 13
Year of Publication: 2015
Authors: Fatemeh Shieh, Mostafa Ghobaei Arani, Mahboubeh Shamsi
DOI: 10.5120/21285-4233

Fatemeh Shieh, Mostafa Ghobaei Arani, Mahboubeh Shamsi. De-duplication Approaches in Cloud Computing Environment: A Survey. International Journal of Computer Applications. 120, 13 (June 2015), 6-10. DOI=10.5120/21285-4233

@article{ 10.5120/21285-4233,
author = { Fatemeh Shieh and Mostafa Ghobaei Arani and Mahboubeh Shamsi },
title = { De-duplication Approaches in Cloud Computing Environment: A Survey },
journal = { International Journal of Computer Applications },
issue_date = { June 2015 },
volume = { 120 },
number = { 13 },
month = { June },
year = { 2015 },
issn = { 0975-8887 },
pages = { 6-10 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume120/number13/21285-4233/ },
doi = { 10.5120/21285-4233 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:06:06.029141+05:30
%A Fatemeh Shieh
%A Mostafa Ghobaei Arani
%A Mahboubeh Shamsi
%T De-duplication Approaches in Cloud Computing Environment: A Survey
%J International Journal of Computer Applications
%@ 0975-8887
%V 120
%N 13
%P 6-10
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Growing demand for cloud services has made increasing data storage capacity one of today's important challenges. Several approaches have been proposed to identify and remove duplicated data in virtual machines before their data is sent to a shared storage resource. The method of storing information should therefore be efficient, and the method of finding data should be as intelligent as possible. However, none of the existing data storage approaches can be expected to give the best performance in bandwidth use for storage in every case. One useful strategy for fast and efficient data storage is de-duplication. In this paper, we survey various de-duplication approaches and consider their advantages and disadvantages.
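
The abstract does not say how any particular approach detects duplicates. As a rough illustration of the general idea only, the sketch below assumes fixed-size chunking with SHA-256 fingerprints, which is one common de-duplication technique and not necessarily one the paper evaluates. The names `CHUNK_SIZE`, `deduplicate`, and `restore`, and the in-memory `store` dictionary standing in for shared storage, are all hypothetical.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4096  # hypothetical chunk size in bytes (an assumption)


def deduplicate(data: bytes, store: Dict[str, bytes],
                chunk_size: int = CHUNK_SIZE) -> List[str]:
    """Split data into fixed-size chunks and store only unseen chunks.

    Returns the ordered list of chunk fingerprints (a "recipe") that is
    enough to rebuild the original data from the store.
    """
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        # Only a chunk whose fingerprint is new is written to the store;
        # a duplicate chunk costs one fingerprint entry in the recipe.
        if fingerprint not in store:
            store[fingerprint] = chunk
        recipe.append(fingerprint)
    return recipe


def restore(recipe: List[str], store: Dict[str, bytes]) -> bytes:
    """Rebuild the original data by concatenating the referenced chunks."""
    return b"".join(store[fp] for fp in recipe)
```

In this sketch, data made of two identical 4096-byte chunks followed by a different one yields a recipe of three fingerprints but only two stored chunks, so the duplicate chunk would not need to be sent to shared storage a second time.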

Index Terms

Computer Science
Information Sciences

Keywords

Cloud computing, Virtual machine, Data storage system, De-duplication