International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 183 - Number 6
Year of Publication: 2021
Authors: Ragini Gour, Ramratan Ahirwal |
DOI: 10.5120/ijca2021921342
Ragini Gour, Ramratan Ahirwal. (ISSBM) Improved Synthetic Sampling based on Model for Imbalance Data. International Journal of Computer Applications. 183, 6 (Jun 2021), 29-35. DOI=10.5120/ijca2021921342
In data mining research, imbalanced data is characterized by a severe difference in observation frequency between classes and has received considerable attention. Prediction performance usually deteriorates when classifiers learn from imbalanced data, because most classifiers assume that the class distribution is balanced or that the costs of different types of classification errors are equal. Although several methods have been studied to deal with imbalance problems, it is still difficult to generalize those methods to achieve stable improvement in most cases. In this study, we propose a novel framework called Improved Synthetic Sampling Based on Model (ISSBM) to deal with imbalance problems, in which we integrate improved modeling and sampling techniques to generate synthetic data. The key idea behind the proposed method is to use regression models to capture the relationship between features and to consider data diversity in the process of data generation. We conduct experiments on many datasets and compare the proposed method with five methods. The experimental results indicate that the proposed method is not only competitive but also very stable. We also provide a detailed analysis of the proposed method to demonstrate empirically why it can generate good data samples.
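The general idea of model-based synthetic sampling can be illustrated with a minimal sketch. The abstract does not specify the algorithm, so everything below is an assumption made for illustration: the function name `model_based_oversample`, the use of ordinary least squares as the per-feature model, and the independent resampling of each feature as the source of diversity. It should not be read as the ISSBM procedure itself.

```python
import numpy as np

def model_based_oversample(X_min, n_new, seed=0):
    """Hypothetical sketch: generate n_new synthetic minority-class rows.

    Assumptions (not taken from the paper): one linear least-squares
    model per feature, and per-feature bootstrap draws for diversity.
    """
    X_min = np.asarray(X_min, dtype=float)
    rng = np.random.default_rng(seed)
    n, d = X_min.shape

    # Step 1: for each feature j, fit a linear model that predicts
    # feature j from all remaining features (plus an intercept).
    coefs = []
    for j in range(d):
        others = np.delete(X_min, j, axis=1)
        A = np.column_stack([others, np.ones(n)])
        w, *_ = np.linalg.lstsq(A, X_min[:, j], rcond=None)
        coefs.append(w)

    # Step 2: build temporary rows by drawing every feature value
    # independently from the minority data, which mixes values
    # across the original rows.
    idx = rng.integers(0, n, size=(n_new, d))
    temp = X_min[idx, np.arange(d)]

    # Step 3: replace each feature by its model prediction from the
    # other temporary features, pulling the rows back toward the
    # feature relationships learned in step 1.
    synth = np.empty_like(temp)
    for j in range(d):
        others = np.delete(temp, j, axis=1)
        A = np.column_stack([others, np.ones(n_new)])
        synth[:, j] = A @ coefs[j]
    return synth
```

In use, the returned rows would be appended to the minority class before training a classifier. A nonlinear regressor could be substituted for the least-squares fit without changing the three-step structure.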