Research Article

Comparative Study of Various Techniques for Rude and Threat dialect Detection in Marathi

by Bhushan Nikam, Nita Patil
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 185 - Number 48
Year of Publication: 2023
Authors: Bhushan Nikam, Nita Patil
DOI: 10.5120/ijca2023923314

Bhushan Nikam, Nita Patil. Comparative Study of Various Techniques for Rude and Threat dialect Detection in Marathi. International Journal of Computer Applications. 185, 48 (Dec 2023), 35-40. DOI=10.5120/ijca2023923314

@article{ 10.5120/ijca2023923314,
author = { Bhushan Nikam and Nita Patil },
title = { Comparative Study of Various Techniques for Rude and Threat dialect Detection in Marathi },
journal = { International Journal of Computer Applications },
issue_date = { Dec 2023 },
volume = { 185 },
number = { 48 },
month = { Dec },
year = { 2023 },
issn = { 0975-8887 },
pages = { 35-40 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume185/number48/33017-2023923314/ },
doi = { 10.5120/ijca2023923314 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:29:09.607650+05:30
%A Bhushan Nikam
%A Nita Patil
%T Comparative Study of Various Techniques for Rude and Threat dialect Detection in Marathi
%J International Journal of Computer Applications
%@ 0975-8887
%V 185
%N 48
%P 35-40
%D 2023
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Rude and threatening language recognition aims to protect individuals and online communities from harmful and offensive content. It can be applied in various contexts, such as comment sections and other online communication channels. This paper compares various tools and techniques for abusive and threatening language detection in Marathi, and reports observations on the methods, strategies, and features needed to implement such detection.

Index Terms

Computer Science
Information Sciences

Keywords

Transformer, Monolingual, Multilingual, Algorithms