Research Article

AES – MR: A Novel Encryption Scheme for securing Data in HDFS Environment using MapReduce

by Viplove Kadre, Sushil Chaturvedi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 129 - Number 12
Year of Publication: 2015
Authors: Viplove Kadre, Sushil Chaturvedi
10.5120/ijca2015906994

Viplove Kadre, Sushil Chaturvedi. AES – MR: A Novel Encryption Scheme for securing Data in HDFS Environment using MapReduce. International Journal of Computer Applications. 129, 12 (November 2015), 12-19. DOI=10.5120/ijca2015906994

@article{ 10.5120/ijca2015906994,
author = { Viplove Kadre, Sushil Chaturvedi },
title = { AES – MR: A Novel Encryption Scheme for securing Data in HDFS Environment using MapReduce },
journal = { International Journal of Computer Applications },
issue_date = { November 2015 },
volume = { 129 },
number = { 12 },
month = { November },
year = { 2015 },
issn = { 0975-8887 },
pages = { 12-19 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume129/number12/23124-2015906994/ },
doi = { 10.5120/ijca2015906994 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:23:12.720916+05:30
%A Viplove Kadre
%A Sushil Chaturvedi
%T AES – MR: A Novel Encryption Scheme for securing Data in HDFS Environment using MapReduce
%J International Journal of Computer Applications
%@ 0975-8887
%V 129
%N 12
%P 12-19
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Data security is an important concern wherever sensitive data is stored. Hadoop is widely used to store large amounts of data through its storage layer, the Hadoop Distributed File System (HDFS). By default, Hadoop does not include any security mechanism, but because it has grown so much and has become the first choice of business analysts and industries for storing and managing data, security solutions must be introduced to protect important data in the Hadoop environment. Authentication, authorization, data encryption, and security against various attacks are the key levels of data and information security in the Hadoop environment. In recent years, efforts have been made to attain each of these levels. Kerberos is one such effort, aimed at authentication and authorization, and it succeeded in providing them. However, attackers equipped with new technologies and hacking tools can easily bypass the security provided by Hadoop's Kerberos authentication system, and the data, which is unencrypted at the storage level, can then easily be stolen or damaged, which is a major concern. Encrypting large data stored in HDFS takes a lot of time, and this cost should be controlled by encrypting the data in parallel. This study discusses a new technique, AES-MR (Advanced Encryption Standard based encryption using MapReduce), for performing encryption in parallel in the MapReduce paradigm. The time taken to perform encryption and decryption is relatively low for user-generated content. Results show that the AES-MR encryption process is faster with the mapper function alone than with both the mapper and reducer functions.
Here, a new encryption scheme is obtained by combining AES and MapReduce in order to secure data in the HDFS environment. The encryption results on different data show that the proposed technique is well suited to protecting user-generated sensitive data deployed in the HDFS environment.
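
The idea of encrypting independent splits in parallel with a mapper alone can be illustrated with a minimal sketch. It is not the authors' implementation, and it does not use the Hadoop API: a Java parallel stream stands in for the map tasks, and the class name `AesMapOnlySketch`, the methods `transformSplit` and `runMapOnly`, the split size, and the choice of AES in CTR mode with a nonce-plus-block-counter layout are all assumptions made for illustration, since the abstract does not state the cipher mode or key handling. CTR is used here because each split can be processed without any data from the other splits, so no reduce step is needed to combine results.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import java.util.stream.IntStream;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: a map-only style AES transform over independent splits.
class AesMapOnlySketch {

    // Stand-in for one map task: transforms a single split with AES-CTR.
    // The 16-byte counter block is assumed to be: 8-byte nonce || 8-byte block index.
    static byte[] transformSplit(byte[] key, long nonce, long firstBlock,
                                 byte[] data, int off, int len) {
        try {
            byte[] iv = ByteBuffer.allocate(16).putLong(nonce).putLong(firstBlock).array();
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                        new IvParameterSpec(iv));
            return cipher.doFinal(data, off, len);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Stand-in for the job: every split is handled independently and in parallel.
    // In CTR mode the same operation encrypts and decrypts.
    static byte[] runMapOnly(byte[] key, long nonce, byte[] input, int splitSize) {
        if (splitSize <= 0 || splitSize % 16 != 0) {
            throw new IllegalArgumentException("splitSize must be a positive multiple of 16");
        }
        int splits = (input.length + splitSize - 1) / splitSize;
        byte[] output = new byte[input.length];
        IntStream.range(0, splits).parallel().forEach(i -> {
            int off = i * splitSize;
            int len = Math.min(splitSize, input.length - off);
            // Each split starts at the AES block index that matches its byte offset.
            byte[] part = transformSplit(key, nonce, (long) off / 16, input, off, len);
            System.arraycopy(part, 0, output, off, len); // splits write disjoint regions
        });
        return output;
    }

    public static void main(String[] args) {
        byte[] key = "0123456789abcdef".getBytes(StandardCharsets.UTF_8); // demo key only
        byte[] plain = "sensitive user generated content".getBytes(StandardCharsets.UTF_8);
        byte[] encrypted = runMapOnly(key, 7L, plain, 16);
        byte[] decrypted = runMapOnly(key, 7L, encrypted, 16);
        System.out.println("round trip ok: " + Arrays.equals(plain, decrypted));
    }
}
```

In an actual Hadoop job, a map-only configuration is normally obtained by setting the number of reduce tasks to zero, so that mapper output is written directly to HDFS. The sketch omits what a real deployment would need: a randomly generated key, a nonce that is never reused with the same key, and an integrity check, since CTR mode on its own provides confidentiality only.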

Index Terms

Computer Science
Information Sciences

Keywords

Hadoop, Hadoop distributed file systems (HDFS), Data Encryption, MapReduce, AES-MR