Research Article

Emotion Recognition and Classification in Speech using Artificial Neural Networks

by Akash Shaw, Rohan Kumar Vardhan, Siddharth Saxena
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 145 - Number 8
Year of Publication: 2016
DOI: 10.5120/ijca2016910710

Akash Shaw, Rohan Kumar Vardhan, Siddharth Saxena. Emotion Recognition and Classification in Speech using Artificial Neural Networks. International Journal of Computer Applications 145, 8 (Jul 2016), 5-9. DOI=10.5120/ijca2016910710

@article{10.5120/ijca2016910710,
  author     = {Akash Shaw and Rohan Kumar Vardhan and Siddharth Saxena},
  title      = {Emotion Recognition and Classification in Speech using Artificial Neural Networks},
  journal    = {International Journal of Computer Applications},
  issue_date = {Jul 2016},
  volume     = {145},
  number     = {8},
  month      = {Jul},
  year       = {2016},
  issn       = {0975-8887},
  pages      = {5-9},
  numpages   = {5},
  url        = {https://ijcaonline.org/archives/volume145/number8/25296-2016910710/},
  doi        = {10.5120/ijca2016910710},
  publisher  = {Foundation of Computer Science (FCS), NY, USA},
  address    = {New York, USA}
}
Abstract

To date, relatively little research has been done on emotion classification and recognition in speech. This article discusses why the topic is of interest and presents a system for classifying and recognizing emotions from speech using artificial neural networks. The proposed system is speaker independent, since it is trained on a database of emotional speech samples. Various classifiers are used to differentiate emotions such as neutral, anger, happiness, and sadness. The system uses prosodic features such as pitch, energy, and formant frequencies, together with spectral features such as mel frequency cepstral coefficients (MFCCs). The classifiers are trained on these features to classify emotions accurately, and the trained classifiers are then used to recognize the emotion of a given speech sample. Thus, several components, including pre-processing of speech, MFCC features, prosodic features, and classifiers, come together in the implementation of a speech-based emotion recognition system.
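For illustration, the pipeline sketched in the abstract could be realised roughly as follows, assuming the librosa and scikit-learn libraries; the file names and the small feed-forward network are hypothetical placeholders rather than the authors' implementation, and formant extraction (e.g. via LPC analysis [14]) is omitted for brevity.

```python
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

def extract_features(path, sr=16000):
    """Return one fixed-length vector of spectral (MFCC) and prosodic features."""
    y, sr = librosa.load(path, sr=sr)
    y, _ = librosa.effects.trim(y)                       # pre-processing: trim leading/trailing silence
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # spectral features
    f0 = librosa.yin(y, fmin=50, fmax=400, sr=sr)        # pitch contour (prosodic)
    energy = librosa.feature.rms(y=y)[0]                 # short-time energy (prosodic)
    # Summarise the frame-level features with utterance-level statistics.
    return np.hstack([mfcc.mean(axis=1), mfcc.std(axis=1),
                      [f0.mean(), f0.std(), energy.mean(), energy.std()]])

# Hypothetical labelled database of emotional speech samples (path, emotion).
samples = [("anger_01.wav", "anger"), ("happy_01.wav", "happy"),
           ("sad_01.wav", "sad"), ("neutral_01.wav", "neutral")]

X = np.array([extract_features(p) for p, _ in samples])
labels = np.array([e for _, e in samples])

# A small feed-forward ANN trained on the extracted features (classification),
# then applied to an unseen utterance (recognition).
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)
clf.fit(X, labels)
print(clf.predict([extract_features("unknown_utterance.wav")]))
```

In practice, a larger labelled corpus and cross-validation would be needed before any meaningful accuracy figures could be reported.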

References
  1. L. Rabiner and B. H. Juang, Fundamentals of Speech Recognition, 1993.
  2. L. Rabiner and R. Schafer, Introduction to Digital Speech Processing.
  3. J. Nicholson, K. Takahashi, R. Nakatsu, "Emotion Recognition in Speech using Neural Networks", IEEE Trans. Neural Information Proc., Vol. 2, 1999, 495-501.
  4. W. Gevaert, G. Tsenov, V. Mladenov, "Neural Networks used for Speech Recognition", Journal of Automatic Control, University of Belgrade, 2010.
  5. S. Haykin, Neural Networks: A Comprehensive Foundation, 1999.
  6. J. Ang, R. Dhillon, A. Krupski, E. Shriberg, A. Stolcke, "Prosody-Based Automatic Detection of Annoyance and Frustration in Human-Computer Dialog", Proc. ICSLP, Denver, Colorado, USA, 2002, 2037-2040.
  7. S. Yacoub, S. Simske, X. Lin, J. Burns, "Recognition of Emotions in Interactive Voice Response Systems", Proc. European Conference on Speech Communication and Technology, Geneva, Switzerland, 2003, 729-732.
  8. J. Liscombe, "Detecting Emotion in Speech: Experiments in Three Domains", Proc. HLT/NAACL, New York, NY, USA, 2006, 231-234.
  9. R. P. Hobson, "The autistic child's appraisal of expressions of emotion", Journal of Child Psychology and Psychiatry, 27, 1986, 321-342.
  10. K. A. Loveland, B. Tunali-Kotoski, Y. R. Chen, J. Ortegon, D. A. Pearson, K. A. Brelsford, M. C. Gibbs, "Emotion recognition in autism: Verbal and nonverbal information", Development and Psychopathology, 9(3), 1997, 579-593.
  11. T. Vogt, E. André, "Improving Automatic Emotion Recognition from Speech via Gender Differentiation", Proc. Language Resources and Evaluation Conference, Genoa, Italy, 2006, 1123-1126.
  12. F. Dellaert, T. Polzin, A. Waibel, "Recognizing Emotion in Speech", Proc. ICSLP, Philadelphia, PA, USA, 1996, 1970-1973.
  13. M. W. Bhatti, Y. Wang, and L. Guan, "A neural network approach for human emotion recognition in speech", Proc. IEEE ISCAS, Vancouver, Canada, May 2004, 23-26.
  14. R. Snell, "Formant location from LPC analysis data", IEEE Transactions on Speech and Audio Processing, 1(2), 1993, 129-134.
Index Terms

Computer Science
Information Sciences

Keywords

ANN, MFCC, prosodic features, emotion classification and recognition, pre-processing.