International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 127 - Number 15
Year of Publication: 2015
Authors: Arpit Singh, Anuradha Purohit
DOI: 10.5120/ijca2015906677
Arpit Singh, Anuradha Purohit. A Survey on Methods for Solving Data Imbalance Problem for Classification. International Journal of Computer Applications. 127, 15 (October 2015), 37-41. DOI=10.5120/ijca2015906677
Data imbalance is a well-established problem in classification in which a data set has an unbalanced class distribution. A data set is called unbalanced if it contains at least one class that is represented by very few examples. A range of solutions has been proposed for this problem, including data sampling, cost evaluation of the model, bagging, boosting, and Genetic Programming (GP) based methods. This paper surveys the methods introduced by researchers to handle the data imbalance problem in order to improve classification performance, and compares the methods on the basis of their advantages and disadvantages.
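As an illustration of the data sampling family mentioned above, the following is a minimal sketch of random oversampling, in which minority-class examples are duplicated at random until every class is as large as the largest one. It is not taken from the surveyed papers; the function names `random_oversample` and `imbalance_ratio`, the `seed` parameter, and the toy labels are assumptions made for this example.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Ratio of the largest class size to the smallest class size."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen examples of each smaller class until
    every class has as many examples as the largest class.

    Hypothetical helper for illustration; `seed` fixes the random draws.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # All original examples that belong to this class.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        # Draw with replacement to fill the gap to the majority size.
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


if __name__ == "__main__":
    # Toy data set: nine "neg" examples and a single "pos" example.
    toy_labels = ["neg"] * 9 + ["pos"]
    toy_samples = list(range(10))
    print(imbalance_ratio(toy_labels))
    _, balanced = random_oversample(toy_samples, toy_labels)
    print(Counter(balanced))
```

Random oversampling is only the simplest member of this family. Because it duplicates existing examples rather than creating new ones, it can encourage overfitting to the minority class, which is one of the trade-offs a comparison of methods has to weigh.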