Boosting the Performance of MapReduce by Better Resource Utilization in Cluster

Pooja Malikwade; S.B.Jadhav

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 112 - Number 16

Year of Publication: 2015

Authors: Pooja Malikwade, S.B.Jadhav

DOI: 10.5120/19753-1535

Pooja Malikwade, S.B.Jadhav. Boosting the Performance of MapReduce by Better Resource Utilization in Cluster. International Journal of Computer Applications. 112, 16 (February 2015), 29-33. DOI=10.5120/19753-1535

@article{10.5120/19753-1535,
  author = {Pooja Malikwade and S. B. Jadhav},
  title = {Boosting the Performance of MapReduce by Better Resource Utilization in Cluster},
  journal = {International Journal of Computer Applications},
  issue_date = {February 2015},
  volume = {112},
  number = {16},
  month = {February},
  year = {2015},
  issn = {0975-8887},
  pages = {29-33},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume112/number16/19753-1535/},
  doi = {10.5120/19753-1535},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T22:49:40.737786+05:30
%A Pooja Malikwade
%A S.B.Jadhav
%T Boosting the Performance of MapReduce by Better Resource Utilization in Cluster
%J International Journal of Computer Applications
%@ 0975-8887
%V 112
%N 16
%P 29-33
%D 2015
%I Foundation of Computer Science (FCS), NY, USA

Abstract

MapReduce implementations are widely used to process large data sets, relying on parallel computation to speed up job processing. During parallel execution, skew caused by large indivisible records or an uneven distribution of data slows job execution and lowers cluster throughput. We propose an automatic skew-handling system that is compatible with the MapReduce framework and transparent to users. The system uses idle resources in the cluster to handle skew by repartitioning tasks, and it preserves the output order after repartitioning. It requires no extra input from users and imposes minimal overhead when no skew is present.
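The idea of task repartitioning with preserved output order can be illustrated with a small sketch. This is not the authors' implementation, only a toy model under stated assumptions: a slow task has processed the first `done` records of its input, worker threads stand in for idle cluster slots, and the function names `split_remaining` and `finish_with_idle_slots` are hypothetical. The unprocessed input is cut into contiguous chunks, the chunks run in parallel, and the partial outputs are concatenated in chunk order, so the result matches what the original task would have produced on its own.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def split_remaining(remaining: Sequence[T], slots: int) -> List[Sequence[T]]:
    """Cut the unprocessed input into contiguous, near-equal chunks.

    Hypothetical helper: at most one chunk per available slot, and always
    at least one chunk so the caller can fall back to no repartitioning.
    """
    parts = max(1, min(slots, len(remaining)))
    base, extra = divmod(len(remaining), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # first `extra` chunks get one more record
        chunks.append(remaining[start:start + size])
        start += size
    return chunks


def finish_with_idle_slots(
    records: Sequence[T], done: int, idle_slots: int, fn: Callable[[T], R]
) -> List[R]:
    """Process records[done:] on idle slots and return outputs in input order."""
    chunks = split_remaining(records[done:], idle_slots)
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        # Executor.map yields results in chunk order, not completion order,
        # which is what keeps the concatenated output ordered.
        partial_outputs = pool.map(lambda chunk: [fn(r) for r in chunk], chunks)
        return [out for part in partial_outputs for out in part]


if __name__ == "__main__":
    recs = list(range(10))
    # The straggler finished 3 records; 3 idle slots share the other 7.
    print(finish_with_idle_slots(recs, 3, 3, lambda x: x * x))
```

When no slot is idle (`idle_slots` of 0), the sketch falls back to a single chunk, which mirrors the abstract's claim that little overhead is added when there is nothing to repartition.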

Index Terms

Computer Science

Information Sciences

Keywords

Data skew, MapReduce, parallel database systems, performance gain, skew handling