Research Article

Performance Evaluation of Learning Classifiers for Speech Emotions Corpus using Combinations of Prosodic Features

by Syed Abbas Ali, Sitwat Zehra, Afsheen Arif
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 76 - Number 2
Year of Publication: 2013
Authors: Syed Abbas Ali, Sitwat Zehra, Afsheen Arif
10.5120/13221-0634

Syed Abbas Ali, Sitwat Zehra, Afsheen Arif. Performance Evaluation of Learning Classifiers for Speech Emotions Corpus using Combinations of Prosodic Features. International Journal of Computer Applications 76, 2 (August 2013), 35-43. DOI=10.5120/13221-0634

@article{ 10.5120/13221-0634,
author = { Syed Abbas Ali, Sitwat Zehra, Afsheen Arif },
title = { Performance Evaluation of Learning Classifiers for Speech Emotions Corpus using Combinations of Prosodic Features },
journal = { International Journal of Computer Applications },
issue_date = { August 2013 },
volume = { 76 },
number = { 2 },
month = { August },
year = { 2013 },
issn = { 0975-8887 },
pages = { 35-43 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume76/number2/13221-0634/ },
doi = { 10.5120/13221-0634 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
Abstract

This paper introduces a speech emotion corpus, a multilingual speech emotion database recorded in the provincial languages of Pakistan (Urdu, Punjabi, Pashto and Sindhi) for analyzing the emotions present in recorded speech signals across four emotions (Anger, Sadness, Comfort and Happiness). The objective of this paper is to evaluate the performance of learning classifiers (MLP, Naive Bayes, J48 and SMO) on this corpus with different combinations of prosodic features, in terms of classification accuracy and time taken to build the models. The experimental results clearly show that the J48 classifier outperforms all other classifiers in both classification accuracy and model building time. SMO achieves slightly better classification accuracy than Naive Bayes, whereas Naive Bayes exhibits the minimum model building time compared to MLP.
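The evaluation protocol described above can be illustrated with a short sketch. The Java program below is not the authors' code; it only shows, under assumptions, how such a comparison could be set up with the WEKA 3 API, which provides the MultilayerPerceptron, NaiveBayes, J48 and SMO implementations named in the abstract: each classifier is timed while building a model on a prosodic-feature data set and then scored by cross-validation. The ARFF file name, the class-attribute position and the use of 10-fold cross-validation are illustrative assumptions, not details taken from the paper.

import java.util.Random;

import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.functions.MultilayerPerceptron;
import weka.classifiers.functions.SMO;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class EmotionClassifierBenchmark {
    public static void main(String[] args) throws Exception {
        // Prosodic features extracted from the speech emotion corpus, one instance per
        // utterance; the file name is a placeholder and the last attribute is assumed
        // to be the emotion class label.
        Instances data = DataSource.read("speech_emotion_prosodic.arff");
        data.setClassIndex(data.numAttributes() - 1);

        Classifier[] classifiers = {
            new MultilayerPerceptron(), new NaiveBayes(), new J48(), new SMO()
        };

        for (Classifier cls : classifiers) {
            // Model building time: train once on the full data set and time it.
            long start = System.nanoTime();
            cls.buildClassifier(data);
            double buildSeconds = (System.nanoTime() - start) / 1e9;

            // Classification accuracy: 10-fold cross-validation (an assumption here;
            // crossValidateModel retrains fresh copies of the classifier per fold).
            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(cls, data, 10, new Random(1));

            System.out.printf("%-22s accuracy = %5.2f%%   build time = %.3f s%n",
                    cls.getClass().getSimpleName(), eval.pctCorrect(), buildSeconds);
        }
    }
}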

Index Terms

Computer Science
Information Sciences

Keywords

Learning Classifier, Prosodic Features, Speech Emotion Corpus, Emotions