Research Article

A Novel Approach to Predict Diabetes Mellitus by Statistical Analysis and using Advanced Classification Algorithm

by Saima Sultana, Mahmudul Hasan Khandaker, Abdullah Al Momen, Mohoshi Haque, Nazmus Sakib
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 175 - Number 38
Year of Publication: 2020
Authors: Saima Sultana, Mahmudul Hasan Khandaker, Abdullah Al Momen, Mohoshi Haque, Nazmus Sakib
DOI: 10.5120/ijca2020920950

Saima Sultana, Mahmudul Hasan Khandaker, Abdullah Al Momen, Mohoshi Haque, Nazmus Sakib. A Novel Approach to Predict Diabetes Mellitus by Statistical Analysis and using Advanced Classification Algorithm. International Journal of Computer Applications. 175, 38 (Dec 2020), 17-24. DOI=10.5120/ijca2020920950

@article{ 10.5120/ijca2020920950,
author = { Saima Sultana and Mahmudul Hasan Khandaker and Abdullah Al Momen and Mohoshi Haque and Nazmus Sakib },
title = { A Novel Approach to Predict Diabetes Mellitus by Statistical Analysis and using Advanced Classification Algorithm },
journal = { International Journal of Computer Applications },
issue_date = { Dec 2020 },
volume = { 175 },
number = { 38 },
month = { Dec },
year = { 2020 },
issn = { 0975-8887 },
pages = { 17-24 },
numpages = {8},
url = { https://ijcaonline.org/archives/volume175/number38/31700-2020920950/ },
doi = { 10.5120/ijca2020920950 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:40:36.905672+05:30
%A Saima Sultana
%A Mahmudul Hasan Khandaker
%A Abdullah Al Momen
%A Mohoshi Haque
%A Nazmus Sakib
%T A Novel Approach to Predict Diabetes Mellitus by Statistical Analysis and using Advanced Classification Algorithm
%J International Journal of Computer Applications
%@ 0975-8887
%V 175
%N 38
%P 17-24
%D 2020
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Diabetes is a severe, chronic disorder with a major impact on the lives and health of patients and the people around them. It results from insufficient production of insulin in the human body. Research on the disease indicates that diagnosing diabetes at an early stage helps patients control it, and that knowing the probability of having the disease helps them take the necessary steps. To predict the disease, this work takes a different approach: developing a mathematical equation that uses basic medical information about a person as its parameters. This equation achieves 80% accuracy. To verify the credibility of the equation, three machine learning algorithms were applied to the dataset: Logistic Regression, Support Vector Machine (SVM), and K-Nearest Neighbor (KNN). The accuracies attained for Logistic Regression, SVM, and KNN are 86%, 91%, and 83%, respectively.
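
The abstract does not give the equation itself, so the following is only a minimal sketch of what an equation-based predictor of this kind could look like. It assumes a logistic form over the six factors listed in the keywords (age, BMI, blood pressure, blood sugar, exercise time, sleeping time). Every coefficient, the intercept, the units, and the function names are hypothetical placeholders chosen for illustration; they are not the authors' values.

```python
import math

# HYPOTHETICAL coefficients: illustrative placeholders only,
# not the equation or weights from the paper.
WEIGHTS = {
    "age": 0.03,             # years
    "bmi": 0.08,             # kg/m^2
    "systolic_bp": 0.02,     # mmHg (assumed unit)
    "blood_sugar": 0.25,     # mmol/L (assumed unit)
    "exercise_hours": -0.40, # hours per day (assumed unit)
    "sleep_hours": -0.15,    # hours per day (assumed unit)
}
INTERCEPT = -6.0  # hypothetical


def diabetes_probability(person):
    """Map a dict of feature values to a probability with a logistic function."""
    z = INTERCEPT + sum(WEIGHTS[name] * person[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def predict(person, threshold=0.5):
    """Return 1 (diabetic) if the probability reaches the threshold, else 0."""
    return 1 if diabetes_probability(person) >= threshold else 0


def accuracy(predictions, labels):
    """Fraction of predictions that match the known labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


if __name__ == "__main__":
    # Made-up example record, not taken from the paper's dataset.
    example = {
        "age": 60, "bmi": 32.0, "systolic_bp": 145,
        "blood_sugar": 11.0, "exercise_hours": 0.0, "sleep_hours": 5.0,
    }
    print(f"probability = {diabetes_probability(example):.3f}")
    print(f"predicted class = {predict(example)}")
```

An accuracy figure such as the 80% reported for the equation would correspond to calling `accuracy` on the predictions for every record in a labelled dataset; the same measure applies to the Logistic Regression, SVM, and KNN results.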

Index Terms

Computer Science
Information Sciences

Keywords

Diabetes Mellitus, Logistic Regression, SVM, KNN, Machine Learning Algorithms, Prediction System, Age, Body Mass Index (BMI), Blood Pressure, Blood Sugar, Exercise Time, Sleeping Time