| International Journal of Computer Applications |
| Foundation of Computer Science (FCS), NY, USA |
| Volume 187 - Number 91 |
| Year of Publication: 2026 |
| Authors: Azizur Rahman, Nakib Uddin Ahmed |
10.5120/ijca2026926617
Azizur Rahman, Nakib Uddin Ahmed. Enhancing Online Recruitment Fraud Detection: A Comparative Analysis of Gradient Boosting and Transformer Architectures under Severe Class Imbalance. International Journal of Computer Applications. 187, 91 (Mar 2026), 1-10. DOI=10.5120/ijca2026926617
The exponential rise of online recruitment services has greatly simplified job hunting, but it has also produced fraudulent job advertisements that threaten job seekers' data security and finances. Distinguishing legitimate from fraudulent postings is computationally hard because fake advertisements use sophisticated language and because real-world data is severely class-imbalanced. This paper presents an in-depth comparative analysis of Machine Learning (ML), Deep Learning (DL), and Transformer-based architectures for automatically detecting fraudulent job postings. A dataset of 17,883 records was used, and text preprocessing techniques were applied, including semantic representation using Word2Vec embeddings. The Synthetic Minority Over-sampling Technique (SMOTE) was applied to address the significant imbalance between authentic (17,014) and fraudulent (866) samples. A broad range of classifiers was evaluated: Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Decision Tree (DT), XGBoost (XGB), and Logistic Regression (LR), along with Deep Learning models (ANN, LSTM) and state-of-the-art Transformers (BERT, RoBERTa). The experiments showed that ensemble learning and Transformer-based models are considerably more effective than traditional linear classifiers. In particular, XGBoost delivered the best results with 99.44% accuracy and an F1-score of 0.99, followed closely by Random Forest (99.37%) and RoBERTa (98.81%). SVM, by contrast, performed poorly, with an accuracy of 50.44%. The results indicate that combining SMOTE with gradient-boosting algorithms or pre-trained Transformers offers a highly promising framework for protecting the online recruitment ecosystem against fraud.
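The oversampling step mentioned in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority sample is an interpolation between a real minority sample and one of its nearest minority-class neighbours. This is not the authors' code. The function name `smote`, its parameters, and the two-dimensional toy data (standing in for Word2Vec document vectors) are all assumptions made for illustration; in practice a library implementation would normally be used.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Hypothetical helper: create n_synthetic points by interpolating between
    minority samples and their k nearest minority-class neighbours."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")

    # For every minority sample, record the indices of its k nearest
    # minority neighbours (Euclidean distance, excluding itself).
    neighbours = []
    for i, x in enumerate(minority):
        dists = sorted(
            (math.dist(x, y), j) for j, y in enumerate(minority) if j != i
        )
        neighbours.append([j for _, j in dists[:k]])

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))   # pick a real minority sample
        j = rng.choice(neighbours[i])      # pick one of its neighbours
        gap = rng.random()                 # interpolation factor in [0, 1)
        x, y = minority[i], minority[j]
        synthetic.append([a + gap * (b - a) for a, b in zip(x, y)])
    return synthetic


# Toy data (an assumption): 200 "authentic" and 10 "fraudulent" vectors.
data_rng = random.Random(42)
legit = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(200)]
fraud = [[data_rng.gauss(3, 0.5), data_rng.gauss(3, 0.5)] for _ in range(10)]

# Oversample the minority class until both classes have the same size.
new_fraud = smote(fraud, n_synthetic=len(legit) - len(fraud))
balanced_fraud = fraud + new_fraud
print(len(legit), len(balanced_fraud))
```

Because every synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region already occupied by the minority data. Oversampling of this kind is usually applied to the training split only, so that evaluation still reflects the original class distribution.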