International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 160 - Number 5
Year of Publication: 2017
Authors: Aman Gupta, Pranita Jain
DOI: 10.5120/ijca2017913055
Aman Gupta, Pranita Jain. A Map Reduce Hadoop Implementation of Random Tree Algorithm based on Correlation Feature Selection. International Journal of Computer Applications. 160, 5 (Feb 2017), 41-44. DOI=10.5120/ijca2017913055
Random Tree is a popular classifier in machine learning. Feature reduction is an important research issue in big data. Most existing feature reduction algorithms face two challenges. On the one hand, they have rarely taken granular computing into consideration. On the other hand, they still cannot deal with massive data. Massive data processing is a difficult problem in the age of big data, and traditional feature reduction algorithms are generally time-consuming when applied to it. To speed up processing, we introduce a scalable, fast, approximate attribute reduction algorithm based on MapReduce. We divide the original data into many small chunks and apply the reduction algorithm to each chunk. The reduction algorithm is based on correlation feature selection and generates decision rules using the Random Tree classifier. Finally, the feature reduction algorithm is implemented with data and task parallelism using the Hadoop MapReduce framework in the WEKA environment. Experimental results demonstrate that the proposed classifier scales well and processes big data efficiently.
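The chunk-wise reduction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it runs in a single Python process rather than on Hadoop or WEKA, it uses absolute Pearson correlation as a stand-in correlation measure, and it omits the Random Tree classification step. The abstract does not say how per-chunk results are combined, so the majority vote in `reduce_vote` is an assumption, and all function names are hypothetical. The merit score follows the standard correlation feature selection heuristic, which rewards features correlated with the class and penalizes features correlated with each other.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Absolute Pearson correlation; 0.0 when either column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov / sqrt(vx * vy))


def cfs_merit(subset, columns, labels):
    """CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff)."""
    k = len(subset)
    # Mean feature-class correlation.
    r_cf = sum(pearson(columns[f], labels) for f in subset) / k
    # Mean feature-feature correlation (0.0 for a single feature).
    pairs = list(combinations(subset, 2))
    r_ff = 0.0
    if pairs:
        r_ff = sum(pearson(columns[a], columns[b]) for a, b in pairs) / len(pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def map_select(chunk):
    """Map step: greedy forward CFS on one chunk; returns feature indices."""
    rows, labels = chunk
    n_features = len(rows[0])
    columns = [[row[j] for row in rows] for j in range(n_features)]
    selected, best = [], 0.0
    while True:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(cfs_merit(selected + [f], columns, labels), f) for f in candidates]
        if not scored:
            break
        merit, feature = max(scored)
        if merit <= best:  # stop when no candidate improves the merit
            break
        selected.append(feature)
        best = merit
    return sorted(selected)


def reduce_vote(per_chunk, min_share=0.5):
    """Reduce step (assumed): keep features chosen by at least min_share of chunks."""
    votes = Counter(f for selection in per_chunk for f in selection)
    return sorted(f for f, c in votes.items() if c >= min_share * len(per_chunk))


def split_chunks(rows, labels, size):
    """Divide the data into consecutive chunks of at most `size` rows."""
    return [(rows[i:i + size], labels[i:i + size]) for i in range(0, len(rows), size)]


def select_features(rows, labels, chunk_size):
    """Run the map step on every chunk, then combine the results."""
    chunks = split_chunks(rows, labels, chunk_size)
    return reduce_vote([map_select(chunk) for chunk in chunks])
```

Calling `select_features(rows, labels, chunk_size)` returns the indices of the retained features. In a real Hadoop job, `map_select` would correspond to the mapper applied to each input split and `reduce_vote` to the reducer, and the reduced feature set would then be passed to the classifier.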