A Comprehensive Performance Analysis of Supervised Machine Learning Techniques for Sentiment Analysis

Korakot Matarat; Chaidan Mingmuang; Weerasak Charoenrat

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 186 - Number 7

Year of Publication: 2024

Authors: Korakot Matarat, Chaidan Mingmuang, Weerasak Charoenrat

10.5120/ijca2024923409

Korakot Matarat, Chaidan Mingmuang, Weerasak Charoenrat. A Comprehensive Performance Analysis of Supervised Machine Learning Techniques for Sentiment Analysis. International Journal of Computer Applications. 186, 7 (Feb 2024), 35-42. DOI=10.5120/ijca2024923409

@article{10.5120/ijca2024923409,
  author = {Korakot Matarat and Chaidan Mingmuang and Weerasak Charoenrat},
  title = {A Comprehensive Performance Analysis of Supervised Machine Learning Techniques for Sentiment Analysis},
  journal = {International Journal of Computer Applications},
  issue_date = {Feb 2024},
  volume = {186},
  number = {7},
  month = {Feb},
  year = {2024},
  issn = {0975-8887},
  pages = {35-42},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume186/number7/a-comprehensive-performance-analysis-of-supervised-machine-learning-techniques-for-sentiment-analysis/},
  doi = {10.5120/ijca2024923409},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article

%1 2024-02-22T22:17:52.859284+05:30

%A Korakot Matarat

%A Chaidan Mingmuang

%A Weerasak Charoenrat

%T A Comprehensive Performance Analysis of Supervised Machine Learning Techniques for Sentiment Analysis

%J International Journal of Computer Applications

%@ 0975-8887

%V 186

%N 7

%P 35-42

%D 2024

%I Foundation of Computer Science (FCS), NY, USA

Abstract

Sentiment analysis plays a crucial role in deciphering opinions and emotions expressed in textual data, with wide-ranging business applications such as customer feedback analysis and social media monitoring. This paper presents a performance analysis of supervised machine learning algorithms for sentiment analysis using the Wongnai reviews dataset, which comprises 40,000 reviews. The preprocessing pipeline removes symbols and special characters (e.g., < > % / # + - ; * & @ $), then removes Thai words that carry no meaning for text processing (for example, มี, เฉยๆ, เช่นใด, เพียงแต่, น้อยๆ, ข้างเคียง), and applies hashtag removal, POS tagging, sentiment score computation, and TF-IDF analysis. The research also compares feature extraction methods and introduces an approach to dominant feature extraction that it reports as surpassing traditional bag-of-words (BoW) methods. Six algorithms are applied: Logistic Regression (LR), Multinomial Naïve Bayes (NB), Decision Tree Classifier (DT), Neural Network (NN), Stochastic Gradient Descent (SGD), and Support Vector Machine (SVC). The study compares their accuracy, precision, and recall on the Wongnai reviews. SVC emerges as the top-performing algorithm with an accuracy of 0.73, ahead of LR, NB, NN, and SGD, which all achieve 0.72, while DT exhibits the lowest performance. Further analysis combining TF-IDF with BoW improves both SGD and SVC to an accuracy of 0.74, reinforcing the strong performance of SVC in this experiment.
In conclusion, this paper contributes to the understanding of sentiment analysis performance and serves as a resource for optimising models in other domains. It provides a foundation for practitioners and researchers engaged in sentiment analysis, aiding informed decision-making and paving the way for future exploration with advanced machine learning algorithms.
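
As a rough illustration of the two ideas the abstract relies on, the sketch below builds a combined BoW and TF-IDF feature vector and computes accuracy, precision, and recall. It is not the authors' implementation: the helper names, the toy English tokens, and the smoothed IDF formula ln((1 + N) / (1 + df)) + 1 are all assumptions made here for illustration, and real Wongnai reviews would first need Thai word segmentation, which the abstract does not specify.

```python
import math
from collections import Counter

def build_vocab(docs):
    """Map each distinct token to a column index (sorted for determinism)."""
    tokens = sorted({tok for doc in docs for tok in doc})
    return {tok: i for i, tok in enumerate(tokens)}

def bow_vector(doc, vocab):
    """Bag-of-words: raw term counts for one tokenised document."""
    counts = Counter(doc)
    return [float(counts.get(tok, 0)) for tok in vocab]

def idf_weights(docs, vocab):
    """Smoothed IDF (an assumed variant): ln((1 + N) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + df[tok])) + 1.0 for tok in vocab}

def tfidf_vector(doc, vocab, idf):
    """TF-IDF: term count multiplied by the token's IDF weight."""
    counts = Counter(doc)
    return [counts.get(tok, 0) * idf[tok] for tok in vocab]

def combined_features(doc, vocab, idf):
    """Concatenate BoW counts and TF-IDF weights into one vector."""
    return bow_vector(doc, vocab) + tfidf_vector(doc, vocab, idf)

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall(y_true, y_pred, label):
    """Precision and recall for a single class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical, already-tokenised toy reviews (placeholders, not Wongnai data).
docs = [["good", "food", "good"], ["bad", "service"], ["good", "service"]]
vocab = build_vocab(docs)
idf = idf_weights(docs, vocab)
features = combined_features(docs[0], vocab, idf)

# Hypothetical labels and predictions, only to exercise the metric helpers.
y_true = ["pos", "neg", "pos", "pos"]
y_pred = ["pos", "pos", "pos", "neg"]
acc = accuracy(y_true, y_pred)
prec, rec = precision_recall(y_true, y_pred, "pos")
```

The resulting vectors could be passed to any of the six classifiers named above. Concatenation doubles the feature dimension, which is one simple way to read "combining TF-IDF with BoW"; the paper may combine them differently.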

Index Terms

Computer Science

Information Sciences

Keywords

Performance analysis, Supervised learning, Bag-of-words, TF-IDF analysis, Thai language data analysis, Sentiment analysis