Survey on MapReduce Scheduling Algorithms

Liya Thomas; Syama R

Call for Paper

September Edition

IJCA solicits high quality original research papers for the upcoming September edition of the journal. The last date of research paper submission is 20 August 2025

Submit your paper

Know more

The week's pick

Assessing LLMs as Cognitive Interpreters of Student Prompts: A Typological Framework

Tadeu da Ponte Matevz Vremec Matej Mertik

Random Articles

Reseach Article

Survey on MapReduce Scheduling Algorithms

by Liya Thomas, Syama R

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 95 - Number 23

Year of Publication: 2014

Authors: Liya Thomas, Syama R

10.5120/16733-6903

Liya Thomas, Syama R . Survey on MapReduce Scheduling Algorithms. International Journal of Computer Applications. 95, 23 ( June 2014), 9-13. DOI=10.5120/16733-6903

@article{ 10.5120/16733-6903,

author = { Liya Thomas, Syama R },

title = { Survey on MapReduce Scheduling Algorithms },

journal = { International Journal of Computer Applications },

issue_date = { June 2014 },

volume = { 95 },

number = { 23 },

month = { June },

year = { 2014 },

issn = { 0975-8887 },

pages = { 9-13 },

numpages = {9},

url = { https://ijcaonline.org/archives/volume95/number23/16733-6903/ },

doi = { 10.5120/16733-6903 },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Journal Article

%1 2024-02-06T22:20:12.042658+05:30

%A Liya Thomas

%A Syama R

%T Survey on MapReduce Scheduling Algorithms

%J International Journal of Computer Applications

%@ 0975-8887

%V 95

%N 23

%P 9-13

%D 2014

%I Foundation of Computer Science (FCS), NY, USA

Abstract

MapReduce is a programming model used by Google to process large amount of data in a distributed computing environment. It is usually used to perform distributed computing on clusters of computers. Computational processing of data stored on either a file system or a database usually occurs. MapReduce takes the advantage of locality of data, processing data on or near the storage areas, thereby avoiding unnecessary data transmission. The simplicity of the programming model and the automatic handling of node failures hiding the complexity of fault tolerance make MapReduce to be used for both commercial and scientific applications. As MapReduce clusters have become popular these days, their scheduling is one of the important factor which is to be considered. In order to achieve good performance a MapReduce scheduler must avoid unnecessary data transmission. Hence different scheduling algorithms for MapReduce are necessary to provide good performance. This paper provides an overview of four different scheduling algorithms for MapReduce namely; Scheduling algorithm in Hadoop, Longest Approximate Time to End (LATE) MapReduce scheduling algorithm, Self-Adaptive MapReduce(SAMR) scheduling algorithm and Enhanced Self-Adaptive MapReduce scheduling algorithm(ESAMR). An overview of these techniques is provided through this paper. Advantages and disadvantages of these algorithms are identified.

References

J. Dean and S. Ghemawat, "Mapreduce: simplied data processing on large clusters," (2004) in OSDI 2004: Proceedings of 6th Symposium on Operating System Design and Implemention,(New York), pp. 137–150, ACM Press.
J. Dean and S. Ghemawat, "MapReduce: a flexible data processing tool,"(2010) Communications of the ACM, vol. 53, no. 1, pp. 72–77.
C. Jin and R. Buyya, "MapReduce programming model for . NET-based distributed computing," (2009) in Proceedings of the 15th European Conference on Parallel Processing.
" Apache Hadoop. " http://hadoop. apache. org.
Hadoop Distributed File System, http://hadoop. apache. org/hdfs.
M. Zaharia, A. Konwinski, A. D. Joseph, R. Katz, and I. Stoica, "Improving mapreduce performance in heterogeneous environments," (2008) in 8th Usenix Symposium on Operating Systems Design and Implementation, (New York), pp. 29–42, ACM Press.
Quan Chen; Daqiang Zhang; Minyi Guo; Qianni Deng; Song Guo; , "SAMR: A Self-adaptive MapReduce Scheduling Algorithm in Heterogeneous Environment,"(2010) Computer and Information Technology (CIT), 2010 IEEE 10th International Conference on , vol. , no. , pp. 2736-2743.
Xiaoyu Sun, Chen He and Ying Lu "ESAMR: An Enhanced Self-Adaptive MapReduce Scheduling Algorithm"(2012) IEEE 18th International Conference on Parallel and Distributed Systems.
R. Nanduri, N. Maheshwari, A. Reddyraja, and V. Varma, "Job aware scheduling algorithm for mapreduce framework,"(2011) in Proceedings of the 3rd International Conference on Cloud Computing Technology and Science, CLOUDCOM '11, (Washington, DC, USA), pp. 724–729, IEEE Computer Society.
"K-means. " http://en. wikipedia. org/wiki/K-means clustering.
G. Hamerly and C. Elkan, "Alternatives to the k-means algorithm that find better clusterings,"(2002) in Proceedings of the 11th international conference on Information and knowledge management, CIKM '02, (New York, NY, USA), pp. 600–607, ACM.

Index Terms

Computer Science

Information Sciences

Keywords

MapReduce Programming model Scheduling algorithms