International Journal of Computer Applications |
Foundation of Computer Science (FCS), NY, USA |
Volume 186 - Number 37 |
Year of Publication: 2024 |
Authors: Purnima Chandrasekar, Shailendra Pratap Shastri |
10.5120/ijca2024923947 |
Purnima Chandrasekar, Shailendra Pratap Shastri . Step-by-step Approach to Automatic Speech Emotion Recognition. International Journal of Computer Applications. 186, 37 ( Aug 2024), 37-43. DOI=10.5120/ijca2024923947
Humans use emotions to express themselves naturally either through facial expressions or through speech. Emotions play an important role in influencing the decision-making capability of human beings as human mind is influenced by personal experiences as well as physiological, communicative and behavioral reaction to external stimulus. While considering emotions displayed through speech, one needs to understand that a speech signal not only conveys the emotional state of the speaker which is visible from the intent of the message as well as the gender of the person and the language spoken. While an effective communication between humans through speech ensures exchange of right amount of ideas, messages and perceptions, interaction between human and machine with the same intent becomes challenging as a machine is expected to mimic the mechanism of human perception. Automatic Speech Emotion recognition (ASER) systems has found usefulness in several applications viz. healthcare, counseling, call center communication etc. Primary to this system are three basic components viz. creation of emotional speech corpus, extraction of features relevant to emotion detection and classification of emotion in the test speech using appropriate classifiers. This paper surveys extensively the prominent features extracted, several dimension reduction techniques and classifiers commonly used in recent times. It also throws light on the concept of auto encoders being used in recent times in the process of ASER.