Research Article

A New Technique to Classification of Bengali News Grounded on ML and DL Models

by Tamim Al Mahmud, Sazeda Sultana, Antara Mondal
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 185 - Number 18
Year of Publication: 2023
Authors: Tamim Al Mahmud, Sazeda Sultana, Antara Mondal
10.5120/ijca2023922897

Tamim Al Mahmud, Sazeda Sultana, Antara Mondal. A New Technique to Classification of Bengali News Grounded on ML and DL Models. International Journal of Computer Applications. 185, 18 (Jun 2023), 15-21. DOI=10.5120/ijca2023922897

@article{ 10.5120/ijca2023922897,
author = { Tamim Al Mahmud and Sazeda Sultana and Antara Mondal },
title = { A New Technique to Classification of Bengali News Grounded on ML and DL Models },
journal = { International Journal of Computer Applications },
issue_date = { Jun 2023 },
volume = { 185 },
number = { 18 },
month = { Jun },
year = { 2023 },
issn = { 0975-8887 },
pages = { 15-21 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume185/number18/32795-2023922897/ },
doi = { 10.5120/ijca2023922897 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:29:21.049863+05:30
%A Tamim Al Mahmud
%A Sazeda Sultana
%A Antara Mondal
%T A New Technique to Classification of Bengali News Grounded on ML and DL Models
%J International Journal of Computer Applications
%@ 0975-8887
%V 185
%N 18
%P 15-21
%D 2023
%I Foundation of Computer Science (FCS), NY, USA
Abstract

News classification is the process of assigning news articles to predefined categories based on their content. It uses machine learning and natural language processing (NLP) techniques to analyze the textual features of an article and assign it to a relevant category. This enables efficient organization and retrieval of news articles and helps users find information that matches their interests. News covers many topics, including local, national, and international affairs, politics, business, sports, and entertainment, and news in our native Bengali language falls into numerous categories. This research is motivated by the limited prior work on Bengali text and the scarcity of resources for it. This study proposes a new technique for news categorization based on a comparison of the results of different machine learning and deep learning models. First, we collected more than 20K data samples. We then applied several machine learning and deep learning models to the pre-processed and cleaned data. Next, we applied our proposed technique to the predictions produced by these models to make a final decision on the news category. In the proposed technique, we count the categories predicted by three types of machine learning models and two types of deep learning models. Finally, we take the most frequently predicted class and assign it as the category of the specific news item.
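
The counting step described in the abstract resembles a majority vote over five model predictions. The following is a minimal sketch of that idea, not the authors' implementation. The model names are taken from the keyword list and are only assumed to be the five models used; the example labels and the tie-breaking rule (the earliest-listed model's label wins) are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted most often.

    Ties are broken in favor of the label that appears first in
    `predictions` (an assumption; the abstract does not specify a rule).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best = max(counts.values())
    # Walk the predictions in order so that ties resolve deterministically.
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical predictions for a single news article: three ML models
# and two DL models, each voting for one category.
model_predictions = {
    "svm": "sports",
    "linear_svc": "sports",
    "random_forest": "politics",
    "lstm": "sports",
    "gru": "politics",
}

category = majority_vote(list(model_predictions.values()))
print(category)
```

With five voters a two-way split can never tie, but a tie is still possible when three or more categories receive votes, which is why the sketch fixes a deterministic rule.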

Index Terms

Computer Science
Information Sciences

Keywords

Newspaper Classification, Pre-Processing, Data Visualization, Splitting Dataset, Tokenization, Support Vector Machine, Gated Recurrent Unit, Singular Vector Decomposition, Long Short-Term Memory, Bag of Word, Linear Support Vector Classification, Random Forest