Research Article

Enhancing Dynamic Capacity Scheduler for Data Intensive Jobs

by Sukhmani Goraya, Vikas Khullar
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 121 - Number 12
Year of Publication: 2015
Authors: Sukhmani Goraya, Vikas Khullar
10.5120/21592-4682

Sukhmani Goraya, Vikas Khullar. Enhancing Dynamic Capacity Scheduler for Data Intensive Jobs. International Journal of Computer Applications. 121, 12 (July 2015), 21-24. DOI=10.5120/21592-4682

@article{ 10.5120/21592-4682,
author = { Sukhmani Goraya and Vikas Khullar },
title = { Enhancing Dynamic Capacity Scheduler for Data Intensive Jobs },
journal = { International Journal of Computer Applications },
issue_date = { July 2015 },
volume = { 121 },
number = { 12 },
month = { July },
year = { 2015 },
issn = { 0975-8887 },
pages = { 21-24 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume121/number12/21592-4682/ },
doi = { 10.5120/21592-4682 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:08:47.422033+05:30
%A Sukhmani Goraya
%A Vikas Khullar
%T Enhancing Dynamic Capacity Scheduler for Data Intensive Jobs
%J International Journal of Computer Applications
%@ 0975-8887
%V 121
%N 12
%P 21-24
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Managing Big Data is a challenging problem. The MapReduce environment is a widely used solution for data-intensive jobs. We analyze MapReduce pipelining along with the processing of the Map and Reduce phases. The core Hadoop schedulers (FIFO, Fair, and Capacity) are discussed. The scheduler assigns MapReduce tasks to resources, and the challenge is to schedule those tasks in a way that scales. Existing work shows that Hadoop's performance depends on the input data and on the configuration of the cluster. In this paper, we analyze the execution time of data-intensive jobs as the volume of the data set increases. We also compare task execution time under the existing scheduler with that under our proposed scheduling method.
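As a rough illustration of how the schedulers named in the abstract differ, the following toy sketch contrasts FIFO ordering with a simplified capacity-style ordering. It is not the method proposed in the paper, and it does not use Hadoop's API: the function names, queue names, and capacity shares are all hypothetical, and real Capacity Scheduler behavior (containers, preemption, hierarchical queues) is left out.

```python
from collections import deque


def fifo_order(jobs):
    """FIFO: run jobs strictly in submission order, ignoring queues."""
    return [name for name, _queue in jobs]


def capacity_order(jobs, capacities):
    """Simplified capacity-style ordering (an assumption, not Hadoop's code).

    Each queue has a configured share. The next job is taken from the
    non-empty queue whose served/share ratio is lowest, i.e. the queue
    that is furthest below its configured capacity.
    """
    queues = {q: deque() for q in capacities}
    for name, queue_name in jobs:
        queues[queue_name].append(name)

    served = {q: 0 for q in capacities}
    order = []
    while any(queues.values()):
        candidates = [q for q in queues if queues[q]]
        chosen = min(candidates, key=lambda q: served[q] / capacities[q])
        order.append(queues[chosen].popleft())
        served[chosen] += 1
    return order


# Hypothetical workload: three jobs in one queue, then one in another.
jobs = [("a1", "analytics"), ("a2", "analytics"),
        ("a3", "analytics"), ("e1", "etl")]
capacities = {"analytics": 0.5, "etl": 0.5}

print(fifo_order(jobs))                  # ['a1', 'a2', 'a3', 'e1']
print(capacity_order(jobs, capacities))  # ['a1', 'e1', 'a2', 'a3']
```

Under FIFO the single `etl` job waits behind all three `analytics` jobs, while the capacity-style ordering serves it second because its queue is below its share.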

Index Terms

Computer Science
Information Sciences

Keywords

Hadoop, MapReduce, Capacity Scheduler, Fair Scheduler, FIFO Scheduler, HDFS