Research Article

Enhancing Customer Churn Prediction using Machine Learning and Deep Learning Approaches with Principal Component Analysis

by Md Saidul Islam, Taofica Amrine, Tahmina Akter, Muhammad Anwarul Azim
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 185 - Number 44
Year of Publication: 2023
Authors: Md Saidul Islam, Taofica Amrine, Tahmina Akter, Muhammad Anwarul Azim
10.5120/ijca2023923255

Md Saidul Islam, Taofica Amrine, Tahmina Akter, Muhammad Anwarul Azim. Enhancing Customer Churn Prediction using Machine Learning and Deep Learning Approaches with Principal Component Analysis. International Journal of Computer Applications. 185, 44 (Nov 2023), 21-27. DOI=10.5120/ijca2023923255

@article{ 10.5120/ijca2023923255,
author = { Md Saidul Islam and Taofica Amrine and Tahmina Akter and Muhammad Anwarul Azim },
title = { Enhancing Customer Churn Prediction using Machine Learning and Deep Learning Approaches with Principal Component Analysis },
journal = { International Journal of Computer Applications },
issue_date = { Nov 2023 },
volume = { 185 },
number = { 44 },
month = { Nov },
year = { 2023 },
issn = { 0975-8887 },
pages = { 21-27 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume185/number44/32984-2023923255/ },
doi = { 10.5120/ijca2023923255 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:28:37.407285+05:30
%A Md Saidul Islam
%A Taofica Amrine
%A Tahmina Akter
%A Muhammad Anwarul Azim
%T Enhancing Customer Churn Prediction using Machine Learning and Deep Learning Approaches with Principal Component Analysis
%J International Journal of Computer Applications
%@ 0975-8887
%V 185
%N 44
%P 21-27
%D 2023
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In this research effort, we present a comprehensive approach for predicting customer churn using a combination of traditional Machine Learning and Deep Learning methodologies. The primary focus of this investigation centers on the crucial phase of Data Pre-Processing, involving fundamental tasks such as the handling of missing data, removal of duplicates, and the elimination of outliers. To enhance data quality and representation, techniques such as Data Transformation, Normalization, and Principal Component Analysis (PCA) have been employed. To tackle class imbalance, the method of Random Over-Sampling has been implemented. The process of Feature Extraction encompasses One-Hot Encoding and PCA, further enhancing data representation. Subsequently, a diverse set of predictive models has been evaluated, including Random Forest (RF), Support Vector Classifier (SVC), Gaussian Naive Bayes (GNB), Decision Tree (DT), XGBoost (XGB), Logistic Regression (LR), Artificial Neural Network (ANN), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Recurrent Neural Network (RNN). The results indicate that XGBoost surpasses other models, achieving an exceptional accuracy of 98.26%. Furthermore, a hybrid CNN & XGB model demonstrates an impressive accuracy of 97.53%.
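Three of the pre-processing steps named in the abstract (One-Hot Encoding, Normalization, and Random Over-Sampling) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy data, and the choice of min-max scaling are assumptions made purely for illustration, and it uses only the Python standard library. PCA and model fitting are left out; in practice these would typically be handled by a numerical library.

```python
import random
from collections import Counter


def one_hot(values):
    """Encode a list of categorical values as 0/1 indicator rows."""
    categories = sorted(set(values))  # fixed, reproducible column order
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return categories, rows


def min_max_normalize(column):
    """Rescale a numeric column to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes
    have as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data (not from the paper): contract type, monthly charge,
# and churn label (1 = churned).
contracts = ["monthly", "yearly", "monthly", "monthly", "yearly", "monthly"]
charges = [70.0, 20.0, 90.0, 55.0, 30.0, 80.0]
churn = [1, 0, 0, 0, 0, 1]

categories, encoded = one_hot(contracts)
scaled = min_max_normalize(charges)
features = [enc + [s] for enc, s in zip(encoded, scaled)]
balanced_rows, balanced_labels = random_oversample(features, churn, seed=42)

print(categories)
print(Counter(balanced_labels))
```

In this toy run the two churned rows are duplicated until both classes have four rows. A common design choice, assumed here rather than taken from the abstract, is to over-sample only the training split so that duplicated rows do not leak into the evaluation data.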

Index Terms

Computer Science
Information Sciences

Keywords

Customer Churn Prediction, Principal Component Analysis, Data Pre-Processing, XGBoost, CNN, Customer Retention