Research Article

Automatic Speaker Age Estimation and Gender Dependent Emotion Recognition

by Shivaji J. Chaudhari, Ramesh M. Kagalkar
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 117 - Number 17
Year of Publication: 2015
Authors: Shivaji J. Chaudhari, Ramesh M. Kagalkar
10.5120/20644-3383

Shivaji J. Chaudhari, Ramesh M. Kagalkar. Automatic Speaker Age Estimation and Gender Dependent Emotion Recognition. International Journal of Computer Applications. 117, 17 (May 2015), 5-10. DOI=10.5120/20644-3383

@article{ 10.5120/20644-3383,
author = { Shivaji J. Chaudhari and Ramesh M. Kagalkar },
title = { Automatic Speaker Age Estimation and Gender Dependent Emotion Recognition },
journal = { International Journal of Computer Applications },
issue_date = { May 2015 },
volume = { 117 },
number = { 17 },
month = { May },
year = { 2015 },
issn = { 0975-8887 },
pages = { 5-10 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume117/number17/20644-3383/ },
doi = { 10.5120/20644-3383 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:59:37.255921+05:30
%A Shivaji J. Chaudhari
%A Ramesh M. Kagalkar
%T Automatic Speaker Age Estimation and Gender Dependent Emotion Recognition
%J International Journal of Computer Applications
%@ 0975-8887
%V 117
%N 17
%P 5-10
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Gender-dependent age and emotion (stress and feeling) are speaker characteristics examined in voice-based speech processing systems, and they play an important role in Human and Computer Interaction (HCI). Classifying speaker attributes is an important task in voice processing, speech synthesis, forensics, language learning, assessment, and speaker identification, where it can improve the performance of a voice processing system. Emotion identification is enhanced by a two-stage recognizer that first identifies the gender of the speaker (male or female) and then recognizes the emotion. A noise elimination technique removes noise from the audio clip. Mel-Frequency Cepstral Coefficients (MFCCs), a feature extraction technique widely used in automatic voice processing, provide the features. The system uses Gaussian mixture model (GMM) supervectors as features for a support vector machine (SVM), which classifies large data sets into groups based on the margin between two classes. Principal component analysis (PCA) is used to reduce the high dimensionality of the feature vectors, improving system performance and accuracy in HCI.
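
The two-stage recognizer described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a simple nearest-centroid classifier stands in for the SVM over GMM supervectors, and the class `NearestCentroid`, the functions `train_two_stage` and `recognize`, the emotion labels, and the 2-D toy feature vectors are all hypothetical names and data chosen for illustration. The sketch shows only the routing idea: predict gender first, then apply the emotion classifier trained for that gender.

```python
from math import dist
from statistics import fmean


class NearestCentroid:
    """Stand-in classifier (NOT the paper's SVM): picks the closest class mean."""

    def fit(self, vectors, labels):
        grouped = {}
        for vec, label in zip(vectors, labels):
            grouped.setdefault(label, []).append(vec)
        # One centroid per label: the per-dimension mean of its training vectors.
        self.centroids = {
            label: [fmean(dim) for dim in zip(*vecs)]
            for label, vecs in grouped.items()
        }
        return self

    def predict(self, vec):
        return min(self.centroids, key=lambda label: dist(vec, self.centroids[label]))


def train_two_stage(samples):
    """samples: list of (feature_vector, gender, emotion) tuples."""
    # Stage 1: a single gender classifier trained on all samples.
    gender_clf = NearestCentroid().fit(
        [s[0] for s in samples], [s[1] for s in samples]
    )
    # Stage 2: one emotion classifier per gender, trained on that gender only.
    emotion_clfs = {}
    for gender in {s[1] for s in samples}:
        subset = [s for s in samples if s[1] == gender]
        emotion_clfs[gender] = NearestCentroid().fit(
            [s[0] for s in subset], [s[2] for s in subset]
        )
    return gender_clf, emotion_clfs


def recognize(vec, gender_clf, emotion_clfs):
    """Predict gender, then route to the matching gender-dependent emotion model."""
    gender = gender_clf.predict(vec)
    emotion = emotion_clfs[gender].predict(vec)
    return gender, emotion


# Hypothetical 2-D toy features; a real system would use MFCC-derived vectors.
TOY_SAMPLES = [
    ([1.0, 1.0], "male", "neutral"),
    ([1.2, 3.0], "male", "angry"),
    ([5.0, 1.1], "female", "neutral"),
    ([5.1, 3.2], "female", "angry"),
]

gender_clf, emotion_clfs = train_two_stage(TOY_SAMPLES)
result = recognize([4.8, 2.9], gender_clf, emotion_clfs)
```

Keeping a separate emotion model per gender means each second-stage classifier only has to separate emotions within one gender's feature range, which is the motivation the abstract gives for identifying gender first.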

Index Terms

Computer Science
Information Sciences

Keywords

Human and Computer Interaction (HCI), Mel-Frequency Cepstral Coefficients (MFCCs), Support Vector Machine (SVM), Gaussian Mixture Model (GMM), Principal Component Analysis (PCA)