Research Article

Performance Analysis of Hadoop Map Reduce on Eucalyptus Private Cloud

by Jobby P Jacob, Anirban Basu
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 79 - Number 17
Year of Publication: 2013
Authors: Jobby P Jacob, Anirban Basu
10.5120/13960-1661

Jobby P Jacob, Anirban Basu. Performance Analysis of Hadoop Map Reduce on Eucalyptus Private Cloud. International Journal of Computer Applications. 79, 17 (October 2013), 10-13. DOI=10.5120/13960-1661

@article{ 10.5120/13960-1661,
author = { Jobby P Jacob and Anirban Basu },
title = { Performance Analysis of Hadoop Map Reduce on Eucalyptus Private Cloud },
journal = { International Journal of Computer Applications },
issue_date = { October 2013 },
volume = { 79 },
number = { 17 },
month = { October },
year = { 2013 },
issn = { 0975-8887 },
pages = { 10-13 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume79/number17/13960-1661/ },
doi = { 10.5120/13960-1661 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:53:14.695725+05:30
%A Jobby P Jacob
%A Anirban Basu
%T Performance Analysis of Hadoop Map Reduce on Eucalyptus Private Cloud
%J International Journal of Computer Applications
%@ 0975-8887
%V 79
%N 17
%P 10-13
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Cost effectiveness and ease of maintenance are the reasons behind the increasing popularity of Cloud Computing. The need to reduce the execution time of programs on Cloud platforms has led to the development of Hadoop [12]. This paper analyzes the performance of the K-Means Clustering Algorithm running on Hadoop MapReduce on the Eucalyptus [5] platform. Running Hadoop on Eucalyptus requires a lot of customization of the software, as discussed here. Several tools, including Ganglia, TestDFSIO.java, and Linux performance measuring tools, have been used to measure the performance. The paper discusses how the performance of the K-Means Clustering Algorithm scales with the number of nodes on the Eucalyptus cloud. Measurements of disk, network, and memory bandwidth, data throughput, and average I/O are presented.
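
As a rough illustration of the workload the abstract describes, one K-Means iteration can be phrased as a map step (assign each point to its nearest centroid) followed by a reduce step (average the points in each group). The sketch below is a single-process Python approximation of that pattern, not the authors' Hadoop implementation; the function names `map_phase`, `reduce_phase`, and `kmeans` are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_phase(points, centroids):
    """Emit (nearest-centroid index, point) pairs, as a mapper would."""
    for p in points:
        idx = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield idx, p


def reduce_phase(pairs, centroids):
    """Average the points grouped under each centroid, as a reducer would."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    # A centroid that received no points keeps its previous position.
    new_centroids = list(centroids)
    for idx, pts in groups.items():
        new_centroids[idx] = tuple(sum(c) / len(pts) for c in zip(*pts))
    return new_centroids


def kmeans(points, centroids, max_iter=20):
    """Repeat map and reduce until the centroids stop changing."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


if __name__ == "__main__":
    sample = [(0, 0), (0, 2), (10, 10), (10, 12)]
    print(kmeans(sample, [(0, 0), (10, 10)]))
```

In a real Hadoop job each iteration would typically be a separate MapReduce job, with the current centroids distributed to every mapper; here the loop in `kmeans` stands in for that job chaining.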

References
  1. Grace Nila Ramamoorthy: K-Means Clustering Using Hadoop MapReduce. Published by UCD School of Computer Science and Informatics.
  2. Chen He, Derek Weitzel, David Swanson, Ying Lu. HOG: Distributed Hadoop MapReduce on the Grid. Published by 2012 SC Companion: High Performance Computing, Networking Storage and Analysis.
  3. Xuan Wang, "Clustering in the cloud: Clustering Algorithms to Hadoop Map/Reduce Framework" (2010), Published by Technical Reports-Computer Science by Texas State University.
  4. Apache Software Foundation, HDFS user guide, http://hadoop.apache.org/hdfs/docs/current/hdfsuserguide
  5. MapReduce tutorial, Apache Hadoop 1.2.1 documentation by Hadoop wiki.
  6. Eucalyptus Systems, Inc. Eucalyptus 3.3.1, Eucalyptus Administration Guide (2.0), 2010.
  7. J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. Proceedings of 6th Symposium on Operating System Design and Implementation, Published by Communications of ACM, Volume 51 Issue 1, January 2008.
  8. Cloudera. Cloudera's distribution including Apache Hadoop.
  9. A. T. Velte, T. J. Velte, and R. Elsenpeter. Cloud Computing: A Practical Approach, Published by The McGraw-Hill Companies, 2010.
  10. Tom White. Hadoop: The Definitive Guide, Published by O'Reilly Media/Yahoo Press, 2nd edition, 2010.
  11. Huan Liu and Dan Orban, Cloud MapReduce: a MapReduce Implementation on top of a Cloud Operating System. Published in Cluster, Cloud and Grid Computing (CCGrid) 2011, 11th IEEE/ACM International Symposium.
  12. Weizhong Zhao, Huifang Ma, Qing He, Parallel K-Means clustering based on MapReduce. Published by The Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences.
  13. The Apache Hadoop Ecosystem, University of Cloudera, Online Resources.
  14. Amazon. Amazon Elastic Block Storage (EBS), AWS documentation by Amazon Elastic Compute Cloud User Guide.
  15. Blaise Barney, Introduction to Parallel Computing, Published by Lawrence Livermore National Laboratory.
Index Terms

Computer Science
Information Sciences

Keywords

BigData Hadoop MapReduce Eucalyptus Cloud Ganglia.