Acoustics Speech Processing of Sanskrit Language

Sujay G. Kakodkar; Samarth Borkar

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 180 - Number 38

Year of Publication: 2018

Authors: Sujay G. Kakodkar, Samarth Borkar

DOI: 10.5120/ijca2018917017

Sujay G. Kakodkar, Samarth Borkar. Acoustics Speech Processing of Sanskrit Language. International Journal of Computer Applications. 180, 38 (May 2018), 27-32. DOI=10.5120/ijca2018917017

@article{10.5120/ijca2018917017,
  author = {Sujay G. Kakodkar and Samarth Borkar},
  title = {Acoustics Speech Processing of Sanskrit Language},
  journal = {International Journal of Computer Applications},
  issue_date = {May 2018},
  volume = {180},
  number = {38},
  month = {May},
  year = {2018},
  issn = {0975-8887},
  pages = {27-32},
  numpages = {6},
  url = {https://ijcaonline.org/archives/volume180/number38/29380-2018917017/},
  doi = {10.5120/ijca2018917017},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-07T01:03:02.145368+05:30
%A Sujay G. Kakodkar
%A Samarth Borkar
%T Acoustics Speech Processing of Sanskrit Language
%J International Journal of Computer Applications
%@ 0975-8887
%V 180
%N 38
%P 27-32
%D 2018
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Speech processing (SP) is a current trend in technology. Intelligent and precise human-machine interaction (HMI) makes it possible to engineer automated, smart and secure applications for household and commercial use. Existing methods leave under-resourced languages without speech processing support. The novelty of this work is a study of acoustic speech processing (ASP) of the Sanskrit language using the spectral components of Mel frequency cepstrum coefficients (MFCC). Because no generic Sanskrit database is available, a custom speech database is created, and all processing is carried out on this Sanskrit speech corpus. The processing method comprises speech signal isolation, feature selection, and extraction of the selected features for applications. The spectral features are calculated over 13 coefficients, which provides improved performance. The results show how the performance of the proposed system varies with the lifter parameter.
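The abstract gives no implementation details, so the following is only a minimal sketch of how 13 MFCCs with an adjustable lifter parameter might be computed. Everything beyond "13 coefficients" and "lifter parameter" is an assumption made for illustration: the 16 kHz sampling rate, 25 ms frames with a 10 ms step, 26 mel filters, the 0.97 pre-emphasis factor, and the sinusoidal lifter w[n] = 1 + (L/2) sin(pi n / L) commonly used in speech toolkits. The function names (`mfcc`, `mel_filterbank`, `dct_ii`) are hypothetical and are not taken from the paper.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are evenly spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):       # rising slope
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):      # falling slope
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def dct_ii(x, n_out):
    # Orthonormal DCT-II along the last axis, keeping the first n_out terms.
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    scale = np.full(n_out, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)
    return (x @ basis.T) * scale

def mfcc(signal, sample_rate=16000, frame_len=0.025, frame_step=0.010,
         n_fft=512, n_filters=26, n_ceps=13, lifter=22):
    # All default values are illustrative assumptions, not the paper's settings.
    signal = np.asarray(signal, dtype=float)
    # Pre-emphasis boosts high frequencies before framing.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    flen = int(round(frame_len * sample_rate))
    fstep = int(round(frame_step * sample_rate))
    if len(emphasized) < flen:              # pad very short inputs to one frame
        emphasized = np.pad(emphasized, (0, flen - len(emphasized)))
    n_frames = 1 + (len(emphasized) - flen) // fstep
    idx = np.arange(flen)[None, :] + fstep * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(flen)
    # Power spectrum -> mel filterbank energies -> log -> DCT.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(np.maximum(energies, np.finfo(float).eps))
    ceps = dct_ii(log_energies, n_ceps)
    if lifter > 0:
        # Sinusoidal liftering re-weights the higher-order coefficients.
        n = np.arange(n_ceps)
        ceps = ceps * (1.0 + (lifter / 2.0) * np.sin(np.pi * n / lifter))
    return ceps

# Usage on a synthetic tone standing in for a recorded utterance.
tone = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
features = mfcc(tone, lifter=22)    # one row of 13 coefficients per frame
unliftered = mfcc(tone, lifter=0)   # same pipeline with liftering disabled
```

Calling `mfcc` with several `lifter` values on the same recording is one way to observe how the coefficient weighting changes, in the spirit of the lifter-parameter variation the abstract describes.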

Index Terms

Computer Science

Information Sciences

Keywords

Speech processing, Human-machine interaction, Mel frequency cepstrum coefficient, Sanskrit language