International Conference on Emerging Trends in Technology and Applied Sciences
Foundation of Computer Science USA
ICETTAS2015 - Number 2
September 2015
Authors: Reema Rhine, Nikhila T Bhuvan
Reema Rhine, Nikhila T Bhuvan. Improved Input Data Splitting in MapReduce. International Conference on Emerging Trends in Technology and Applied Sciences. ICETTAS2015, 2 (September 2015), 23-26.
The performance of MapReduce depends heavily on the input data splitting process, which takes place before the map phase. Splitting is usually done with naive methods that are far from optimal. This paper describes Improved Input Splitting, a locality-based technique that addresses the input data splitting problems that seriously degrade job performance. Improved Input Splitting clusters the data blocks stored on the same node into a single partition, so that they are processed by one map task. This avoids the time otherwise spent on slot reallocation and on initializing multiple tasks. Experimental results show that the method improves MapReduce processing performance considerably compared with the traditional Hadoop implementation.
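The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `split_by_locality`, the block IDs, and the node names are hypothetical, and each block is assumed to live on exactly one node (real HDFS blocks are replicated, which is ignored here).

```python
from collections import defaultdict


def split_by_locality(block_locations):
    """Group block IDs by the node that stores them.

    block_locations: dict mapping block ID -> node name (assumed layout,
    one node per block; replication is not modeled).
    Returns a dict mapping node name -> sorted list of block IDs. Each list
    stands for one input split, to be handled by a single map task.
    """
    splits = defaultdict(list)
    for block_id, node in block_locations.items():
        splits[node].append(block_id)
    return {node: sorted(block_ids) for node, block_ids in splits.items()}


# Hypothetical placement of five blocks across three nodes.
blocks = {
    "b1": "node-a",
    "b2": "node-b",
    "b3": "node-a",
    "b4": "node-c",
    "b5": "node-b",
}

splits = split_by_locality(blocks)
for node, block_ids in sorted(splits.items()):
    print(node, block_ids)
```

With one split per block, this input would need five map tasks; grouped by node it needs three, which is where the savings in task initialization and slot reallocation would be expected to come from.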