National Conference on Advances in Computing
Foundation of Computer Science USA
NCAC2015 - Number 6
December 2015
Authors: Poonam S. Patil, Rajesh N. Phursule
Poonam S. Patil, Rajesh N. Phursule. Size based Multithreaded Scheduler for Hadoop Framework. National Conference on Advances in Computing. NCAC2015, 6 (December 2015), 20-23.
The majority of large-scale data-intensive applications executed by data centers are based on MapReduce or its open-source implementation, Hadoop. To process large volumes of data in parallel, the Hadoop framework provides the Hadoop Distributed File System (HDFS) [2] and the MapReduce programming model [3]. Job scheduling is an important process in Hadoop MapReduce. Hadoop comes with three schedulers: FIFO, Fair, and Capacity. In some processing scenarios, these traditional scheduling algorithms cannot meet the performance requirements and fairness criteria of Big Data processing. To address this issue, a new, efficient scheduler is required that first identifies the data size and then processes the data accordingly to improve performance. The new MapReduce scheduling scheme will improve MapReduce performance and ensure high-speed data processing. The proposed system will analyze the data size on each individual DataNode and create threads based on a threshold value decided by the proposed scheduler. The TaskTracker processes these threads in parallel on each DataNode, which will ultimately improve data processing performance. As a result, the TaskTracker will complete the work in less time than the traditional schedulers require.
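The size-based threading idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no threshold value, thread cap, or partitioning rule, so `THRESHOLD_BYTES`, `MAX_THREADS`, `thread_count`, `split_ranges`, and `process_node_data` are all hypothetical names and values chosen for illustration, and the "one thread per threshold-sized portion" rule is an assumption.

```python
import math
from concurrent.futures import ThreadPoolExecutor

# Hypothetical values: the abstract does not state the threshold or any thread cap.
THRESHOLD_BYTES = 64 * 1024 * 1024
MAX_THREADS = 8


def thread_count(data_size, threshold=THRESHOLD_BYTES, max_threads=MAX_THREADS):
    """Return how many worker threads to use for data_size bytes.

    Assumed rule: one thread per threshold-sized portion, capped at max_threads.
    """
    if data_size <= 0:
        return 0
    return min(max_threads, math.ceil(data_size / threshold))


def split_ranges(data_size, parts):
    """Split [0, data_size) into `parts` contiguous, near-equal byte ranges."""
    if parts <= 0:
        return []
    base, extra = divmod(data_size, parts)
    ranges, start = [], 0
    for i in range(parts):
        # The first `extra` ranges take one additional byte each.
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def process_node_data(data, work, threshold=THRESHOLD_BYTES, max_threads=MAX_THREADS):
    """Apply `work` to each slice of `data` on its own thread; results keep slice order."""
    n = thread_count(len(data), threshold, max_threads)
    if n == 0:
        return []
    ranges = split_ranges(len(data), n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(lambda r: work(data[r[0]:r[1]]), ranges))
```

For example, `process_node_data(payload, len, threshold=4)` on an 11-byte `payload` would use three threads, one per slice. The sketch only shows the sizing and partitioning logic on a single node; a real scheduler would run inside Hadoop's TaskTracker and would have to respect record boundaries when splitting the data.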