A Map Reduce Hadoop Implementation of Random Tree Algorithm based on Correlation Feature Selection

Aman Gupta; Pranita Jain

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 160 - Number 5

Year of Publication: 2017

DOI: 10.5120/ijca2017913055

Aman Gupta, Pranita Jain. A Map Reduce Hadoop Implementation of Random Tree Algorithm based on Correlation Feature Selection. International Journal of Computer Applications. 160, 5 (Feb 2017), 41-44. DOI=10.5120/ijca2017913055

@article{ 10.5120/ijca2017913055,
author = { Aman Gupta and Pranita Jain },
title = { A Map Reduce Hadoop Implementation of Random Tree Algorithm based on Correlation Feature Selection },
journal = { International Journal of Computer Applications },
issue_date = { Feb 2017 },
volume = { 160 },
number = { 5 },
month = { Feb },
year = { 2017 },
issn = { 0975-8887 },
pages = { 41-44 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume160/number5/27073-2017913055/ },
doi = { 10.5120/ijca2017913055 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article
%1 2024-02-07T00:05:53.839921+05:30
%A Aman Gupta
%A Pranita Jain
%T A Map Reduce Hadoop Implementation of Random Tree Algorithm based on Correlation Feature Selection
%J International Journal of Computer Applications
%@ 0975-8887
%V 160
%N 5
%P 41-44
%D 2017
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Random Tree is a popular classifier in machine learning. Feature reduction is an important research issue in big data, and most existing feature reduction algorithms face two challenges. First, they have rarely taken granular computing into account. Second, they cannot handle massive data: traditional feature reduction algorithms are generally time-consuming on big data. To speed up processing, we introduce a scalable, fast, approximate attribute reduction algorithm built on Map Reduce. We divide the original data into many small chunks and apply the reduction algorithm to each chunk. The reduction algorithm is based on correlation feature selection, and decision rules are generated with the Random Tree classifier. Finally, the feature reduction algorithm is implemented in a data-parallel and task-parallel manner using the Hadoop Map Reduce framework with the WEKA environment. Experimental results demonstrate that the proposed classifier scales well and processes big data efficiently.
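The chunk-wise reduction described in the abstract can be illustrated with a minimal sketch. This is not the authors' Hadoop/WEKA implementation: it runs in a single Python process, and it assumes numeric features, a numeric class label, absolute Pearson correlation as the correlation measure, greedy forward search, and a simple majority vote across chunks. The function names (`map_select`, `reduce_votes`) and the `min_share` threshold are hypothetical. Only the merit formula, k * mean(r_cf) / sqrt(k + k(k-1) * mean(r_ff)), comes from Hall's correlation-based feature selection. The Random Tree training step is omitted.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Absolute Pearson correlation; 0.0 if either column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy / sqrt(sxx * syy))


def cfs_merit(subset, r_cf, r_ff):
    """CFS merit: k * mean(r_cf) / sqrt(k + k*(k-1) * mean(r_ff))."""
    k = len(subset)
    mean_cf = sum(r_cf[f] for f in subset) / k
    if k == 1:
        return mean_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    mean_ff = sum(r_ff[a][b] for a, b in pairs) / len(pairs)
    return k * mean_cf / sqrt(k + k * (k - 1) * mean_ff)


def map_select(chunk):
    """Map step (assumed design): greedy forward CFS on one chunk.

    `chunk` is a list of (feature_vector, label) rows. Emits (feature, 1)
    for every feature index selected on this chunk.
    """
    cols = list(zip(*[row for row, _ in chunk]))
    labels = [label for _, label in chunk]
    r_cf = [pearson(c, labels) for c in cols]            # feature-class
    r_ff = [[pearson(a, b) for b in cols] for a in cols]  # feature-feature
    selected, best = [], 0.0
    while True:
        candidates = [(cfs_merit(selected + [f], r_cf, r_ff), f)
                      for f in range(len(cols)) if f not in selected]
        if not candidates:
            break
        merit, f = max(candidates)
        if merit <= best:  # stop when no candidate improves the merit
            break
        selected, best = selected + [f], merit
    return [(f, 1) for f in selected]


def reduce_votes(pairs, n_chunks, min_share=0.5):
    """Reduce step (assumed design): keep features chosen on at least
    `min_share` of the chunks."""
    votes = Counter()
    for f, count in pairs:
        votes[f] += count
    return sorted(f for f, v in votes.items() if v / n_chunks >= min_share)
```

To use the sketch, split the rows into chunks, concatenate the output of `map_select` over all chunks, and pass the result to `reduce_votes` together with the number of chunks. In a real Hadoop job the map and reduce functions would run as distributed tasks, and the surviving feature indices would then be used to train the Random Tree classifier.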

Index Terms

Computer Science

Information Sciences

Keywords

Hadoop, Map Reduce, Random Tree, Big Data, Correlation