International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 92 - Number 5
Year of Publication: 2014
Authors: Gurpreet Singh Bedi, Ashima Singh
DOI: 10.5120/16009-5051
Gurpreet Singh Bedi, Ashima Singh. Big Data Analysis with Dataset Scaling in Yet Another Resource Negotiator (YARN). International Journal of Computer Applications. 92, 5 (April 2014), 46-50. DOI=10.5120/16009-5051
Data volumes are growing day by day, and some organizations need to analyze and process very large datasets. This is the big data problem these organizations often face: a single machine cannot handle data at that scale. We therefore used the Apache Hadoop Distributed File System (HDFS) for storage and analysis. This paper presents experimental work on a MapReduce application run on a health sector dataset. The results show how the MapReduce application framework behaves when mapping and reducing a large volume of data. The main objective is to examine the behavior of MapReduce applications as the dataset size increases, and our analysis focuses on understanding the performance of the Apache MapReduce application. We expected execution time to increase linearly with dataset size, but our analysis shows that execution time sometimes varies non-linearly as the dataset grows. The experimental results show that execution time varies as the datasets are scaled.
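As a minimal illustration of the map and reduce steps, and of how execution time can be compared across dataset sizes, the following Python sketch may help. It is a single-process toy and not the Hadoop/YARN application used in the experiments. The `patient_id,diagnosis` record layout, the function names, and the synthetic records are all assumptions introduced for illustration; the sketch does not reproduce any measurement from the paper.

```python
from collections import defaultdict
from itertools import chain
import time


def map_phase(record):
    # Emit one (diagnosis, 1) pair per record.
    # The "patient_id,diagnosis" layout is hypothetical, not the paper's schema.
    _, diagnosis = record.split(",", 1)
    return [(diagnosis.strip(), 1)]


def reduce_phase(pairs):
    # Sum the emitted values for each key.
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(records):
    # Chain the map outputs together and feed them to the reducer.
    mapped = chain.from_iterable(map_phase(r) for r in records)
    return reduce_phase(mapped)


def time_job(records):
    # Return the job result together with its wall-clock duration in seconds.
    start = time.perf_counter()
    result = run_job(records)
    return result, time.perf_counter() - start


def synthetic_records(n):
    # Generate n placeholder records; these are not real health data.
    diagnoses = ["flu", "asthma", "diabetes"]
    return [f"{i},{diagnoses[i % 3]}" for i in range(n)]


if __name__ == "__main__":
    # Compare time per record across growing input sizes.
    for n in (1_000, 10_000, 100_000):
        _, elapsed = time_job(synthetic_records(n))
        print(f"{n:>7} records: {elapsed:.4f} s "
              f"({elapsed / n * 1e6:.2f} us/record)")
```

If execution time scaled linearly, the time per record printed for each size would stay roughly constant; a drifting value would indicate non-linear behavior. On a real cluster, factors such as scheduling and data distribution would also affect this figure, which a single-process sketch cannot capture.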