International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 113 - Number 1
Year of Publication: 2015
Authors: Satish Gopalani, Rohan Arora
10.5120/19788-0531
Satish Gopalani, Rohan Arora. Comparing Apache Spark and Map Reduce with Performance Analysis using K-Means. International Journal of Computer Applications. 113, 1 (March 2015), 8-11. DOI=10.5120/19788-0531
Big Data has long fascinated computer science enthusiasts around the world, and it has gained even more prominence in recent times with the continuous explosion of data from sources such as social media and the quest of tech giants for deeper analysis of their data. This paper compares two frameworks, Hadoop Map Reduce and the recently introduced Apache Spark, both of which provide a processing model for analyzing big data. Although both are built for Big Data processing, their performance varies significantly depending on the use case. This variability makes the two frameworks worth analyzing in the dynamic field of Big Data. In this paper we compare the two frameworks and provide a performance analysis using a standard machine learning algorithm for clustering (K-Means).