| International Journal of Computer Applications |
| Foundation of Computer Science (FCS), NY, USA |
| Volume 187 - Number 104 |
| Year of Publication: 2026 |
| Authors: Amila Hrnjić, Zerina Altoka |
| DOI: 10.5120/ijca1dc7d28ef5a6 |
Amila Hrnjić, Zerina Altoka. BERT vs. Logistic Regression: Classifying Mental Health-Related Text using Machine Learning and Natural Language Processing. International Journal of Computer Applications. 187, 104 (May 2026), 32-39. DOI=10.5120/ijca1dc7d28ef5a6
With the rise of mental health discussions in online spaces, the ability to automatically detect emotionally sensitive content has become increasingly important. This study compares a traditional machine learning (ML) method with a deep learning model to evaluate their effectiveness in classifying mental health-related texts. Two models were tested: logistic regression (LR) with TF-IDF features and the BERT transformer model. A balanced dataset of labeled text samples was used, with standard natural language processing (NLP) preprocessing applied. Model performance was evaluated using precision, recall, F1-score, and AUC. Results show that BERT outperforms logistic regression on all four metrics, achieving an F1-score of 0.95 and an AUC of 0.99. Confusion matrices and ROC curves confirmed BERT’s higher accuracy and lower number of misclassifications. These findings highlight the strength of deep learning models in understanding nuanced language, which is crucial in the mental health domain. Overall, the study indicates that transformer-based models like BERT offer a more reliable approach to classifying emotionally sensitive content, with promising applications in early detection tools and mental health support systems.
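As a rough illustration of the four evaluation metrics named in the abstract, the following minimal Python sketch computes precision, recall, F1-score, and AUC for a binary classifier. The labels and scores are invented toy values, not the study's data, and the helper names `precision_recall_f1` and `roc_auc` are hypothetical. The 0.5 decision threshold is also an assumption; the abstract does not state which threshold was used.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive example is scored
    higher than a random negative one (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (assumed values): 1 = mental health-related, 0 = other.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]  # hypothetical model probabilities

# Assumed decision threshold of 0.5 to turn scores into hard labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f} auc={auc:.3f}")
```

On this toy input one positive is missed and one negative is flagged, so precision, recall, and F1 all equal 2/3, while the AUC is 8/9 because eight of the nine positive-negative pairs are ranked correctly. Precision, recall, and F1 depend on the chosen threshold, whereas AUC is computed from the score ranking alone, which is why the two kinds of metric are usually reported together.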