International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 186 - Number 58
Year of Publication: 2024
Authors: Ahmad Farhan AlShammari
DOI: 10.5120/ijca2024924341
Ahmad Farhan AlShammari. Implementation of Feature Selection using Correlation Matrix in Python. International Journal of Computer Applications. 186, 58 (Dec 2024), 29-34. DOI=10.5120/ijca2024924341
The goal of this research is to develop a feature selection program in Python based on the correlation matrix. Feature selection identifies the most important features in a dataset. It reduces the number of features and the complexity of computations, and it can improve the accuracy and performance of the applied model. The correlation matrix measures the correlation between the input (independent) features and the output (dependent) feature, and also among the input features themselves. Input features that are highly correlated with the output feature are identified, filtered, and selected. The basic steps of feature selection using the correlation matrix are explained: preparing the data (input and output), transposing the input data, building the data matrix, computing the correlation matrix, plotting the correlation matrix, selecting features (adding relevant features and removing redundant ones), and printing the selected features. The developed program was tested on an experimental dataset. It performed each of these steps successfully and produced the required results.
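The steps listed in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's program: the function name `select_features`, the thresholds `relevance=0.5` and `redundancy=0.9`, the greedy rule for dropping redundant features, and the toy data are all hypothetical. The plotting step is omitted to keep the sketch dependency-free.

```python
import numpy as np

def select_features(X, y, relevance=0.5, redundancy=0.9):
    """Return (selected feature indices, correlation matrix).

    Thresholds and the greedy redundancy rule are illustrative assumptions.
    """
    X = np.asarray(X, dtype=float)  # shape (n_samples, n_features)
    y = np.asarray(y, dtype=float)  # shape (n_samples,)

    # Transpose the input so each row is one feature, then append the
    # output as the last row to form the data matrix.
    data = np.vstack([X.T, y])

    # np.corrcoef treats each row as a variable and returns the matrix
    # of Pearson correlation coefficients.
    corr = np.corrcoef(data)
    n = X.shape[1]
    with_output = np.abs(corr[:n, n])  # |correlation| of each input with the output

    # Add relevant features: keep inputs whose correlation with the
    # output reaches the relevance threshold, strongest first.
    candidates = [i for i in np.argsort(-with_output) if with_output[i] >= relevance]

    # Remove redundant features: skip a candidate if it is too strongly
    # correlated with a feature that has already been selected.
    selected = []
    for i in candidates:
        if all(abs(corr[i, j]) < redundancy for j in selected):
            selected.append(int(i))
    return sorted(selected), corr

# Toy data (hypothetical): column 1 nearly duplicates column 0,
# column 2 is unrelated to the output, column 3 is moderately related.
X = np.column_stack([
    [1, 2, 3, 4, 5, 6],
    [2, 4, 6, 8, 10, 12.5],
    [1, -1, 1, -1, 1, -1],
    [1, 3, 2, 5, 4, 6],
])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1, 5.8])

selected, corr = select_features(X, y)
print("Selected features:", selected)
```

On this toy data the sketch keeps features 0 and 3: feature 1 is dropped as redundant with feature 0, and feature 2 is dropped because its correlation with the output is weak. Ranking candidates by their correlation with the output before checking redundancy means that, of two near-duplicate inputs, the one more strongly tied to the output survives.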