Research Article

Handling Class Imbalance in Mobile Telecoms Customer Churn Prediction

by Clement Kirui, Li Hong, Edgar Kirui
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 72 - Number 23
Year of Publication: 2013
Authors: Clement Kirui, Li Hong, Edgar Kirui
DOI: 10.5120/12680-9446

Clement Kirui, Li Hong, Edgar Kirui. Handling Class Imbalance in Mobile Telecoms Customer Churn Prediction. International Journal of Computer Applications. 72, 23 (June 2013), 7-13. DOI=10.5120/12680-9446

@article{ 10.5120/12680-9446,
author = { Clement Kirui and Li Hong and Edgar Kirui },
title = { Handling Class Imbalance in Mobile Telecoms Customer Churn Prediction },
journal = { International Journal of Computer Applications },
issue_date = { June 2013 },
volume = { 72 },
number = { 23 },
month = { June },
year = { 2013 },
issn = { 0975-8887 },
pages = { 7-13 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume72/number23/12680-9446/ },
doi = { 10.5120/12680-9446 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:38:41.274347+05:30
%A Clement Kirui
%A Li Hong
%A Edgar Kirui
%T Handling Class Imbalance in Mobile Telecoms Customer Churn Prediction
%J International Journal of Computer Applications
%@ 0975-8887
%V 72
%N 23
%P 7-13
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Class imbalance is a major problem when predicting rare events, such as customer churn in the mobile telecommunications industry. This work reviews strategies for addressing the problem and demonstrates how under-sampling and the Synthetic Minority Oversampling Technique (SMOTE) can be applied to it. The two techniques are first implemented individually and then combined in a hybrid approach that uses both SMOTE and under-sampling. For performance evaluation, two classifiers, the C4.5 decision tree and the Naïve Bayes classifier, are used with 10-fold cross-validation. True positive rate (TPR) and false positive rate (FPR) values are used to generate ROC curves, from which AUC values are calculated to compare the three approaches. Results show that the hybrid approach achieves better performance than either technique used alone.
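As a rough illustration of the resampling and evaluation steps named in the abstract, the following sketch shows random under-sampling of the majority class, SMOTE-style interpolation between minority neighbours, and a trapezoidal AUC computed from (FPR, TPR) points. It is not the authors' implementation: the toy data, the class sizes, the target of 50 rows per class, `k=5`, the example ROC points, and all function names are assumptions made for illustration, and the classifier training step (C4.5 or Naïve Bayes with 10-fold cross-validation) is omitted.

```python
import random


def undersample(majority, n_keep, rng):
    """Randomly keep n_keep majority-class rows, without replacement."""
    return rng.sample(majority, n_keep)


def smote(minority, n_new, k, rng):
    """Create n_new synthetic minority rows, each placed on the line segment
    between a minority row and one of its k nearest minority neighbours."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by distance to the chosen base row.
        others = [row for row in minority if row is not base]
        neighbours = sorted(others, key=lambda row: sq_dist(base, row))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def auc_from_roc(points):
    """Trapezoidal area under a ROC curve given as (FPR, TPR) pairs."""
    pts = sorted(points)
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


rng = random.Random(0)

# Hypothetical toy data: two numeric features per customer.
churners = [[rng.gauss(2, 1), rng.gauss(2, 1)] for _ in range(10)]
non_churners = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]

# Hybrid approach: shrink the majority class and grow the minority class
# until both have 50 rows (the target size is an assumption).
kept_majority = undersample(non_churners, 50, rng)
balanced_minority = churners + smote(churners, 40, k=5, rng=rng)

# Hypothetical ROC points; in practice these would come from a classifier's
# TPR and FPR at different decision thresholds.
roc_points = [(0.0, 0.0), (0.1, 0.6), (0.3, 0.9), (1.0, 1.0)]
auc = auc_from_roc(roc_points)
print(len(kept_majority), len(balanced_minority), round(auc, 3))
```

In a real experiment the resampling would normally be applied only to the training folds, so that each test fold keeps the original class distribution.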

Index Terms

Computer Science
Information Sciences

Keywords

Class Imbalance, Customer Churn, Over-sampling, Under-sampling, Prediction