Research Article

Voice Recognition for Gujarati Dialects: An in-depth Survey

by Meera M. Shah, Hiren R. Kavathiya
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 186 - Number 5
Year of Publication: 2024
Authors: Meera M. Shah, Hiren R. Kavathiya
10.5120/ijca2024923112

Meera M. Shah, Hiren R. Kavathiya. Voice Recognition for Gujarati Dialects: An in-depth Survey. International Journal of Computer Applications. 186, 5 (Jan 2024), 1-4. DOI=10.5120/ijca2024923112

@article{ 10.5120/ijca2024923112,
author = { Meera M. Shah, Hiren R. Kavathiya },
title = { Voice Recognition for Gujarati Dialects: An in-depth Survey },
journal = { International Journal of Computer Applications },
issue_date = { Jan 2024 },
volume = { 186 },
number = { 5 },
month = { Jan },
year = { 2024 },
issn = { 0975-8887 },
pages = { 1-4 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume186/number5/33066-2024923112/ },
doi = { 10.5120/ijca2024923112 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:29:47.637900+05:30
%A Meera M. Shah
%A Hiren R. Kavathiya
%T Voice Recognition for Gujarati Dialects: An in-depth Survey
%J International Journal of Computer Applications
%@ 0975-8887
%V 186
%N 5
%P 1-4
%D 2024
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Voice recognition technology is gaining importance, and considerable work has been done on it for languages such as English, Arabic, Hindi, and Chinese. For Gujarati, however, comparatively little work exists. This paper examines the process of voice recognition for Gujarati and presents a systematic literature review of voice recognition research. It focuses mainly on the problems that arise in voice recognition systems for Gujarati.

Index Terms

Computer Science
Information Sciences

Keywords

Voice Recognition, Speech Processing, Gujarati, Feature Extraction, MFCC, HMM