Information Processing and Remote Computing
Foundation of Computer Science USA
IPRC - Number 1 |
August 2012 |
Authors: G Sudha Sadasivam, S Sangeetha, R Radhakrishnan |
G Sudha Sadasivam, S Sangeetha, R Radhakrishnan. Energy Efficient and Reliable Job Submission in Hadoop Clusters. Information Processing and Remote Computing. IPRC, 1 (August 2012), 6-11.
The MapReduce paradigm is highly suitable for large-scale, data-intensive applications in the cloud environment. The scale of these applications necessitates minimizing cluster power consumption to reduce operational costs and carbon footprint. Energy consumption can be reduced by selectively powering down nodes during periods of low utilization. Hadoop is primarily used for batch processing of large jobs. Before jobs are submitted, the files they use are uploaded into the cluster; each file is split into a number of chunks that are distributed across the Hadoop cluster. This paper addresses the problem of block allocation in the distributed file system to improve reliability and energy efficiency. A framework has been designed to reduce the power requirements of a cluster by identifying the number of replicas and their placement needed for reliable completion of a job. It addresses issues such as block allocation, reliable job submission, and minimization of active cluster nodes to reduce power consumption. The framework is integrated with Hadoop's NameNode, and the Hadoop scheduler component has been modified to submit jobs to active DataNodes containing the data to be operated on. A greedy approach and an evolutionary approach using Particle Swarm Optimization (PSO) have been designed to identify suitable nodes to activate in a cluster. Experimental results demonstrate the performance of these approaches.
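The abstract mentions a greedy approach for choosing which nodes to keep active. As a rough illustration only (the paper's actual algorithm is not given here), the sketch below models node activation as a set-cover problem: given a hypothetical mapping from DataNodes to the block replicas they hold, it repeatedly picks the node that covers the most still-unserved blocks of a job until every block is available on an active node. All names (greedy_active_nodes, node_blocks, job_blocks) are illustrative assumptions, not part of Hadoop or the paper.

```python
def greedy_active_nodes(node_blocks, job_blocks):
    """Greedy set-cover heuristic for selecting nodes to power on.

    node_blocks: dict mapping a node id to the set of block ids it stores
                 (a hypothetical stand-in for NameNode block metadata).
    job_blocks:  set of block ids the submitted job needs to read.
    Returns a list of node ids to keep active.
    """
    uncovered = set(job_blocks)
    active = []
    while uncovered:
        # Pick the node whose replicas cover the most still-uncovered blocks.
        best = max(node_blocks, key=lambda n: len(node_blocks[n] & uncovered))
        gain = node_blocks[best] & uncovered
        if not gain:
            # No active-able node holds a replica of the remaining blocks.
            raise ValueError(f"blocks without replicas: {uncovered}")
        active.append(best)
        uncovered -= gain
    return active

# Example: three DataNodes, a job reading blocks b1..b4; two nodes suffice.
nodes = {"dn1": {"b1", "b2"}, "dn2": {"b2", "b3", "b4"}, "dn3": {"b4"}}
print(greedy_active_nodes(nodes, {"b1", "b2", "b3", "b4"}))  # ['dn2', 'dn1']
```

Greedy set cover is a natural baseline for this kind of node-minimization problem; an evolutionary method such as the PSO variant the abstract mentions would instead search over candidate node subsets, trading the greedy heuristic's speed for potentially smaller activation sets.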