International Conference on Advancements in Engineering and Technology
Foundation of Computer Science USA
ICAET2015 - Number 3
August 2015
Authors: Ruchi Mittal, Harpreet Kaur
Ruchi Mittal, Harpreet Kaur. A Survey on Data Placement and Workload Scheduling Algorithms in Heterogeneous Network for Hadoop. International Conference on Advancements in Engineering and Technology. ICAET2015, 3 (August 2015), 22-28.
The elastic scalability and fault tolerance of cloud computing have led to a wide range of real-world applications. However, the Big Data processing requirements of these applications pose a major challenge to achieving the desired performance levels. MapReduce is an effective parallel, distributed programming model for handling large unstructured datasets in cloud applications. Hadoop, an open-source implementation of the MapReduce model, is currently employed for high-performance processing of Big Data. The current Hadoop implementation assumes a homogeneous cluster in which every node has the same computing capacity and workload. In real-world deployments, however, nodes may differ in computing capacity and workload, resulting in a heterogeneous environment. In such an environment, the default Hadoop implementation does not yield the expected performance. This paper surveys the algorithms proposed by different authors for (a) data placement and (b) workload scheduling for Hadoop in heterogeneous networks.