Research Article

Spark is superior to Map Reduce over Big Data

by Shaik Farook, G. Lakshmi Narayana, B. Tarakeswara Rao
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 133 - Number 1
Year of Publication: 2016
Authors: Shaik Farook, G. Lakshmi Narayana, B. Tarakeswara Rao
10.5120/ijca2016907721

Shaik Farook, G. Lakshmi Narayana, B. Tarakeswara Rao. Spark is superior to Map Reduce over Big Data. International Journal of Computer Applications. 133, 1 (January 2016), 13-16. DOI=10.5120/ijca2016907721

@article{ 10.5120/ijca2016907721,
author = { Shaik Farook and G. Lakshmi Narayana and B. Tarakeswara Rao },
title = { Spark is superior to Map Reduce over Big Data },
journal = { International Journal of Computer Applications },
issue_date = { January 2016 },
volume = { 133 },
number = { 1 },
month = { January },
year = { 2016 },
issn = { 0975-8887 },
pages = { 13-16 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume133/number1/23749-2016907721/ },
doi = { 10.5120/ijca2016907721 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:29:54.437731+05:30
%A Shaik Farook
%A G. Lakshmi Narayana
%A B. Tarakeswara Rao
%T Spark is superior to Map Reduce over Big Data
%J International Journal of Computer Applications
%@ 0975-8887
%V 133
%N 1
%P 13-16
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In the Big Data community, MapReduce has been seen as one of the key enabling approaches for meeting the continuously increasing demands on computing resources imposed by Big Data sets. The reason is the high scalability of the MapReduce paradigm, which allows massively parallel and distributed execution over a large number of computing nodes. Many issues still arise, however, when MapReduce is asked to handle a much wider array of workloads integrated with Hadoop's native file system. This paper addresses how to replace MapReduce with Apache Spark as the default processing engine for Hadoop. It argues that Apache Spark is superior to MapReduce in dealing with the issues and challenges of handling Big Data, with the objectives of giving an overview of the field, facilitating better planning and management of Big Data projects, and offering a higher-level abstraction and generalization of MapReduce.
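
As an illustration of the MapReduce paradigm the abstract refers to, the following is a minimal single-process sketch of its three phases (map, shuffle, reduce) applied to a word count. It is not taken from the paper: the function names and the sample input are hypothetical, and the phases run sequentially here, whereas on a cluster each input split would be mapped and reduced on different nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the counts collected for each word."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    # Each line stands in for an input split that a separate node would map.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    # Hypothetical sample input, used only for illustration.
    sample = ["big data big clusters", "spark and mapreduce process big data"]
    print(word_count(sample))
```

In Spark's RDD API the same pipeline is commonly written as a chain of `flatMap`, `map`, and `reduceByKey` transformations, which is one sense in which Spark can be read as a higher-level abstraction over the explicit phases sketched above.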

Index Terms

Computer Science
Information Sciences

Keywords

Big Data, Big Data Analytics, MapReduce, Apache Spark