Research Article

Speech Emotion Recognition of Sanskrit Language using Machine Learning

by Sujay G. Kakodkar, Samarth Borkar
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 179 - Number 51
Year of Publication: 2018
Authors: Sujay G. Kakodkar, Samarth Borkar
10.5120/ijca2018917326

Sujay G. Kakodkar, Samarth Borkar. Speech Emotion Recognition of Sanskrit Language using Machine Learning. International Journal of Computer Applications. 179, 51 (Jun 2018), 23-28. DOI=10.5120/ijca2018917326

@article{10.5120/ijca2018917326,
  author     = {Sujay G. Kakodkar and Samarth Borkar},
  title      = {Speech Emotion Recognition of Sanskrit Language using Machine Learning},
  journal    = {International Journal of Computer Applications},
  issue_date = {Jun 2018},
  volume     = {179},
  number     = {51},
  month      = {Jun},
  year       = {2018},
  issn       = {0975-8887},
  pages      = {23-28},
  numpages   = {6},
  url        = {https://ijcaonline.org/archives/volume179/number51/29524-2018917326/},
  doi        = {10.5120/ijca2018917326},
  publisher  = {Foundation of Computer Science (FCS), NY, USA},
  address    = {New York, USA}
}
%0 Journal Article
%A Sujay G. Kakodkar
%A Samarth Borkar
%T Speech Emotion Recognition of Sanskrit Language using Machine Learning
%J International Journal of Computer Applications
%@ 0975-8887
%V 179
%N 51
%P 23-28
%D 2018
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Speech Emotion Recognition (SER) is a modern development in technology. In partnership with Human-Machine Interaction (HMI), SER has advanced machine intelligence. An emotion-precise HMI is designed by integrating speech processing with a machine learning algorithm, shaped into an automated, smart, and secure application for detecting emotions in household as well as commercial settings. This work presents a study of distinguishing emotions by acoustic speech recognition (ASR) using k-nearest neighbor (K-NN), a machine learning (ML) technique. The most significant paralinguistic information is obtained from spectral features, i.e., the Mel frequency cepstrum coefficients (MFCC). The main processing stages are feature extraction, feature selection, and classification of emotions. A customized dataset, a speech corpus of simulated emotion samples in the Sanskrit language, is used to classify speech into six emotional classes: happy, sad, excitement, fear, anger, and disgust. The emotions are classified using a K-NN algorithm over two separate models, based on soft and high-pitched voices. Models 1 and 2 achieved recognition rates of about 72.95% and 76.96%, respectively.
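The paper does not include an implementation, but the pipeline the abstract describes (MFCC feature extraction followed by K-NN classification) can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the authors' code: it uses librosa and scikit-learn, summarizes each utterance by its mean MFCC vector, and assumes a hypothetical corpus/<emotion>/*.wav layout and k = 5, none of which are specified in the paper.

import glob
import os

import librosa
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

EMOTIONS = ["happy", "sad", "excitement", "fear", "anger", "disgust"]

def mfcc_features(path, n_mfcc=13):
    # Load the clip at its native sampling rate and summarize it as the
    # mean MFCC vector over all frames, giving one fixed-length feature
    # vector per utterance.
    signal, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

# Hypothetical corpus layout: corpus/<emotion>/<clip>.wav
X, y = [], []
for emotion in EMOTIONS:
    for path in glob.glob(os.path.join("corpus", emotion, "*.wav")):
        X.append(mfcc_features(path))
        y.append(emotion)
X, y = np.array(X), np.array(y)

# Hold out a test split and score a K-NN classifier; the paper does not
# state its train/test protocol or its value of k.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
print(f"recognition rate: {knn.score(X_test, y_test):.2%}")

Feature selection and the paper's split into separate soft- and high-pitch models are omitted here for brevity.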

Index Terms

Computer Science
Information Sciences

Keywords

Speech emotion recognition, machine learning, Mel frequency cepstrum coefficient, Sanskrit language, K-NN.