Research Article

Heterogeneous Data Processing using Hadoop and Java Map/Reduce

by Jasmeet Singh Puaar, Ramanjeet Kaur
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 146 - Number 9
Year of Publication: 2016
Authors: Jasmeet Singh Puaar, Ramanjeet Kaur
DOI: 10.5120/ijca2016910846

Jasmeet Singh Puaar, Ramanjeet Kaur. Heterogeneous Data Processing using Hadoop and Java Map/Reduce. International Journal of Computer Applications 146, 9 (Jul 2016), 13-16. DOI=10.5120/ijca2016910846

@article{ 10.5120/ijca2016910846,
author = { Jasmeet Singh Puaar, Ramanjeet Kaur },
title = { Heterogeneous Data Processing using Hadoop and Java Map/Reduce },
journal = { International Journal of Computer Applications },
issue_date = { Jul 2016 },
volume = { 146 },
number = { 9 },
month = { Jul },
year = { 2016 },
issn = { 0975-8887 },
pages = { 13-16 },
numpages = { 4 },
url = { https://ijcaonline.org/archives/volume146/number9/25425-2016910846/ },
doi = { 10.5120/ijca2016910846 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%A Jasmeet Singh Puaar
%A Ramanjeet Kaur
%T Heterogeneous Data Processing using Hadoop and Java Map/Reduce
%J International Journal of Computer Applications
%@ 0975-8887
%V 146
%N 9
%P 13-16
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In this paper, the objective is to analyse heterogeneous sample data from the New York Stock Exchange (NYSE) using Java MapReduce on the Hadoop platform. Java, together with the Java MapReduce API, is used to process a very large volume of data, i.e. Big Data. The source data is heterogeneous: the data files differ in both format and structure, so it was challenging to handle the data and route it to the mappers to obtain a single reduced output file. The NYSE data is analysed to find the maximum and minimum price of each stock for every year, and to calculate the average stock price of a stock for a particular year using the record of its dividends in the sample data. This is done using data from two different files, dividends.csv and sample_prices.csv. The output of the program is saved to the HDFS file system. This output can then be exported to an NTFS file system using Sqoop, or the files can be copied manually for further processing.
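The processing described in the abstract maps naturally onto Hadoop's MultipleInputs class (see reference 7), which lets a single job read two differently structured files through two separate mappers that feed one shared reducer. The sketch below is a minimal illustration of that pattern under stated assumptions, not the authors' implementation: the class names, the column layouts assumed for sample_prices.csv and dividends.csv, and the symbol_year key format are all assumptions made for the example.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class NyseDriver {

    // Mapper for the price file. Assumed layout (hypothetical):
    // exchange,symbol,date,open,high,low,close,...
    // Emits (symbol_year, "P:<closingPrice>").
    public static class PriceMapper extends Mapper<Object, Text, Text, Text> {
        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] f = value.toString().split(",");
            if (f.length < 7 || f[0].startsWith("exchange")) return; // skip header/bad rows
            String symbolYear = f[1] + "_" + f[2].substring(0, 4);    // year assumed to lead the date
            context.write(new Text(symbolYear), new Text("P:" + f[6]));
        }
    }

    // Mapper for the dividend file. Assumed layout (hypothetical):
    // exchange,symbol,date,dividend
    // Emits (symbol_year, "D:<dividend>").
    public static class DividendMapper extends Mapper<Object, Text, Text, Text> {
        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] f = value.toString().split(",");
            if (f.length < 4 || f[0].startsWith("exchange")) return;
            String symbolYear = f[1] + "_" + f[2].substring(0, 4);
            context.write(new Text(symbolYear), new Text("D:" + f[3]));
        }
    }

    // Single reducer: sees the tagged values from both mappers and computes
    // the maximum and minimum price plus the average dividend per symbol-year.
    public static class StockReducer extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            double max = -Double.MAX_VALUE, min = Double.MAX_VALUE;
            double divSum = 0.0;
            int divCount = 0;
            for (Text v : values) {
                String s = v.toString();
                double d = Double.parseDouble(s.substring(2));
                if (s.startsWith("P:")) {
                    max = Math.max(max, d);
                    min = Math.min(min, d);
                } else {
                    divSum += d;
                    divCount++;
                }
            }
            double avgDividend = divCount == 0 ? 0.0 : divSum / divCount;
            context.write(key, new Text("max=" + max + " min=" + min + " avgDividend=" + avgDividend));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "NYSE heterogeneous analysis");
        job.setJarByClass(NyseDriver.class);
        // Two heterogeneous inputs, each with its own mapper, one shared reducer.
        MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, PriceMapper.class);
        MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, DividendMapper.class);
        job.setReducerClass(StockReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileOutputFormat.setOutputPath(job, new Path(args[2]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Assuming the job is packaged as nyse-analysis.jar (a hypothetical name), it would be launched as, for example, "hadoop jar nyse-analysis.jar NyseDriver /input/sample_prices.csv /input/dividends.csv /output/nyse", after which the single reduced output sits under /output/nyse on HDFS, matching the abstract's final step.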

References
  1. Thomas H. Davenport. 2014. Big Data @ Work. Harvard Business Review Press.
  2. T. Kraska. 2013. Finding the Needle in the Big Data Systems Haystack. IEEE Internet Computing, vol. 17, no. 1, pp. 84-86.
  3. Lekha R. Nair. 2014. Research in Big Data and Analytics: An Overview. International Journal of Computer Applications (IJCA), Volume 108.
  4. Siddharth Mehta. 2015. Big Data Analytics Made Easy with SQL and MapReduce.
  5. Online Searcher: Information Discovery, Technology, Strategies, Volume 38, Number 2, March/April 2014.
  6. J. Dean and S. Ghemawat. 2008. MapReduce: Simplified Data Processing on Large Clusters. Communications of the ACM, 51(1): 107-113.
  7. https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapreduce/lib/input/MultipleInputs.html
  8. P. Amuthabala and Kavya T. C. 2016. Outlook on Various Scheduling Approaches in Hadoop. International Journal on Computer Science and Engineering (IJCSE).
  9. Manisha R. Thakare, S. W. Mohod, and A. N. Thakare. Various Data-Mining Techniques for Big Data. International Journal of Computer Applications (IJCA), Number 8.
  10. K. V. N. Krishna Mohan and K. Prem Sai Reddy. 2016. Efficient Big Data Processing in Hadoop MapReduce. IJARCSSE, Volume 6, Issue 3.
Index Terms

Computer Science
Information Sciences

Keywords

Heterogeneous data processing, MapReduce, Big data, Data Analysis, HDFS, multiple input, NYSE data.