International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 146 - Number 9
Year of Publication: 2016
Authors: Jasmeet Singh Puaar, Ramanjeet Kaur
10.5120/ijca2016910846
Jasmeet Singh Puaar, Ramanjeet Kaur. Heterogeneous Data Processing using Hadoop and Java Map/Reduce. International Journal of Computer Applications. 146, 9 (Jul 2016), 13-16. DOI=10.5120/ijca2016910846
This paper analyzes heterogeneous sample data from the New York Stock Exchange (NYSE) using Java MapReduce on the Hadoop platform. Java and the Java MapReduce API were used to process large volumes of data, i.e. big data. The source data is heterogeneous: the input files differ in both format and structure, which made it challenging to feed them to the mappers and still obtain a single reduced output file. The analysis finds the maximum and minimum price of each stock for every year, and calculates the average price of a stock for a particular year using the record of its dividends in the sample data. Two input files were used: dividends.csv and sample_prices.csv. The program writes its output to HDFS. From there, the output can be saved to the local NTFS file system using Sqoop, or the files can be copied manually to the local system for further processing.
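The per-year minimum and maximum computation described above can be illustrated with a minimal plain-Java sketch. This is not the authors' Hadoop job: it only mimics the map and reduce phases in memory, without the Hadoop API. The class name StockMinMax, the assumed column layout of sample_prices.csv (exchange, stock_symbol, date, open, high, low, close, ...), the choice of the closing price, and the sample rows are all assumptions made for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: in-memory imitation of a map phase and a reduce phase.
public class StockMinMax {

    // "Map" phase: emit (symbol + TAB + year, closing price) for each data row.
    // Assumed columns: exchange,stock_symbol,date,open,high,low,close,...
    static List<Map.Entry<String, Double>> map(List<String> lines) {
        List<Map.Entry<String, Double>> out = new ArrayList<>();
        for (String line : lines) {
            String[] f = line.split(",");
            // Skip the header row and rows that are too short to hold a close price.
            if (f.length < 7 || f[0].equals("exchange")) {
                continue;
            }
            String year = f[2].substring(0, 4); // assumed date format: yyyy-MM-dd
            out.add(new SimpleEntry<>(f[1] + "\t" + year, Double.parseDouble(f[6])));
        }
        return out;
    }

    // "Reduce" phase: fold all prices that share a key into {min, max}.
    static Map<String, double[]> reduce(List<Map.Entry<String, Double>> pairs) {
        Map<String, double[]> result = new TreeMap<>();
        for (Map.Entry<String, Double> p : pairs) {
            double v = p.getValue();
            double[] mm = result.computeIfAbsent(p.getKey(), k -> new double[] {v, v});
            mm[0] = Math.min(mm[0], v);
            mm[1] = Math.max(mm[1], v);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample rows, not taken from the paper's data set.
        List<String> sample = List.of(
            "exchange,stock_symbol,date,open,high,low,close,volume,adj_close",
            "NYSE,AEA,2010-02-08,4.42,4.42,4.21,4.24,205500,4.24",
            "NYSE,AEA,2010-02-05,4.42,4.54,4.22,4.41,194300,4.41",
            "NYSE,AEA,2009-12-31,5.10,5.20,5.00,5.05,100000,5.05");
        reduce(map(sample)).forEach(
            (key, mm) -> System.out.println(key + "\t" + mm[0] + "\t" + mm[1]));
    }
}
```

In a real Hadoop job the same key and value would presumably be emitted by a Mapper and folded by a Reducer, with the framework performing the grouping by key that the TreeMap stands in for here.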