Research Article

Implementation of Classification using K-Nearest Neighbors (KNN) in Python

by Ahmad Farhan AlShammari
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 186 - Number 33
Year of Publication: 2024
Authors: Ahmad Farhan AlShammari
DOI: 10.5120/ijca2024923894

Ahmad Farhan AlShammari. Implementation of Classification using K-Nearest Neighbors (KNN) in Python. International Journal of Computer Applications. 186, 33 (Jul 2024), 19-24. DOI=10.5120/ijca2024923894

@article{ 10.5120/ijca2024923894,
author = { Ahmad Farhan AlShammari },
title = { Implementation of Classification using K-Nearest Neighbors (KNN) in Python },
journal = { International Journal of Computer Applications },
issue_date = { Jul 2024 },
volume = { 186 },
number = { 33 },
month = { Jul },
year = { 2024 },
issn = { 0975-8887 },
pages = { 19-24 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume186/number33/implementation-of-classification-using-k-nearest-neighbors-knn-in-python/ },
doi = { 10.5120/ijca2024923894 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-07-26T02:24:58+05:30
%A Ahmad Farhan AlShammari
%T Implementation of Classification using K-Nearest Neighbors (KNN) in Python
%J International Journal of Computer Applications
%@ 0975-8887
%V 186
%N 33
%P 19-24
%D 2024
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The goal of this research is to develop a classification program using the K-Nearest Neighbors (KNN) method in Python. Classification predicts the category of each test data point by comparing its features with those of the observed (input) data. The distances between each test point and the input data are computed and sorted to find the k nearest neighbors. The predicted category is then determined by majority vote among those neighbors. The basic steps of classification using k-nearest neighbors are explained: preparing observed data, preparing test data, computing distances, sorting distances, computing neighbors, performing majority voting, computing predictions, computing the confusion matrix, and computing model accuracy. The developed program was tested on an experimental dataset. It performed each of these steps successfully and produced the required results.
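The steps listed in the abstract can be illustrated with a minimal sketch. This is not the author's program: the function names, the toy dataset, the choice of k = 3, and the tie-breaking rule are all assumptions made for illustration. Euclidean distance is used because it appears among the article's keywords.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(X_obs, y_obs, x_test, k=3):
    """Predict the category of one test point by majority vote of its k nearest neighbors."""
    # Compute the distance from the test point to every observed point.
    distances = [(euclidean(x, x_test), label) for x, label in zip(X_obs, y_obs)]
    # Sort by distance and keep the labels of the k closest points.
    distances.sort(key=lambda pair: pair[0])
    neighbors = [label for _, label in distances[:k]]
    # Majority vote. Assumption: a tie goes to the tied label that appears
    # first in the nearest-first neighbor list.
    return Counter(neighbors).most_common(1)[0][0]


def confusion_matrix(y_true, y_pred, labels):
    """Rows are actual categories, columns are predicted categories."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


def accuracy(y_true, y_pred):
    """Fraction of test points whose predicted category matches the actual one."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: two well-separated categories, "A" and "B".
X_obs = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (6.5, 7.0), (7.0, 6.0)]
y_obs = ["A", "A", "A", "B", "B", "B"]
X_test = [(1.2, 1.4), (6.4, 6.2), (2.2, 1.8)]
y_test = ["A", "B", "A"]

y_pred = [knn_predict(X_obs, y_obs, x, k=3) for x in X_test]
cm = confusion_matrix(y_test, y_pred, labels=["A", "B"])
acc = accuracy(y_test, y_pred)

print("Predictions:", y_pred)
print("Confusion matrix:", cm)
print("Accuracy:", acc)
```

The sketch uses only the standard library so that each step stays visible. On larger datasets, vectorised distance computation with NumPy or a ready-made classifier such as scikit-learn's `KNeighborsClassifier` would likely be preferable.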

Index Terms

Computer Science
Information Sciences

Keywords

Artificial Intelligence, Machine Learning, Classification, K-Nearest Neighbors, KNN, Euclidean Distance, Confusion Matrix, Python Programming