Comparative Study of Various Techniques for Rude and Threat dialect Detection in Marathi

Bhushan Nikam; Nita Patil

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 185 - Number 48

Year of Publication: 2023

Authors: Bhushan Nikam, Nita Patil

DOI: 10.5120/ijca2023923314

Bhushan Nikam, Nita Patil. Comparative Study of Various Techniques for Rude and Threat dialect Detection in Marathi. International Journal of Computer Applications. 185, 48 (Dec 2023), 35-40. DOI=10.5120/ijca2023923314

@article{10.5120/ijca2023923314,
  author = {Bhushan Nikam and Nita Patil},
  title = {Comparative Study of Various Techniques for Rude and Threat dialect Detection in Marathi},
  journal = {International Journal of Computer Applications},
  issue_date = {Dec 2023},
  volume = {185},
  number = {48},
  month = {Dec},
  year = {2023},
  issn = {0975-8887},
  pages = {35-40},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume185/number48/33017-2023923314/},
  doi = {10.5120/ijca2023923314},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-07T01:29:09.607650+05:30
%A Bhushan Nikam
%A Nita Patil
%T Comparative Study of Various Techniques for Rude and Threat dialect Detection in Marathi
%J International Journal of Computer Applications
%@ 0975-8887
%V 185
%N 48
%P 35-40
%D 2023
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Rude and threatening language recognition aims to protect individuals and online communities from harmful and offensive content. It can be applied in various contexts, such as comment sections and other online communication channels. This paper compares various tools and techniques for abusive and threat language detection in Marathi. It reports research observations on the methods, strategies, and features needed to implement Marathi abusive and threat language detection.

Index Terms

Computer Science

Information Sciences

Keywords

Transformer, Monolingual, Multilingual algorithms.