Development of Kannada Speech Corpus for Continuous Speech Recognition

Anand H. Unnibhavi; D. S. Jangamshetti

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 179 - Number 53

Year of Publication: 2018

Authors: Anand H. Unnibhavi, D. S. Jangamshetti

10.5120/ijca2018917255

Anand H. Unnibhavi, D. S. Jangamshetti. Development of Kannada Speech Corpus for Continuous Speech Recognition. International Journal of Computer Applications. 179, 53 (Jun 2018), 1-4. DOI=10.5120/ijca2018917255

@article{10.5120/ijca2018917255,
  author = {Anand H. Unnibhavi and D. S. Jangamshetti},
  title = {Development of Kannada Speech Corpus for Continuous Speech Recognition},
  journal = {International Journal of Computer Applications},
  issue_date = {Jun 2018},
  volume = {179},
  number = {53},
  month = {Jun},
  year = {2018},
  issn = {0975-8887},
  pages = {1-4},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume179/number53/29537-2018917255/},
  doi = {10.5120/ijca2018917255},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article

%1 2024-02-07T00:59:05.485615+05:30

%A Anand H. Unnibhavi

%A D. S. Jangamshetti

%T Development of Kannada Speech Corpus for Continuous Speech Recognition

%J International Journal of Computer Applications

%@ 0975-8887

%V 179

%N 53

%P 1-4

%D 2018

%I Foundation of Computer Science (FCS), NY, USA

Abstract

This paper presents the development of a Kannada speech corpus for speaker-independent continuous speech recognition. A speech corpus plays a key role in building Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) synthesis systems. The corpus is developed for speakers aged 21 to 45 years. The speech corpus for the ASR system is built in four steps: a text corpus is collected, speech corresponding to that text is recorded, the text is transliterated (giving a phonetic representation of the text corpus), and finally a pronunciation dictionary is developed.
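The last two steps of this pipeline, transliteration followed by a pronunciation dictionary, can be illustrated with a minimal sketch. It assumes the text corpus has already been transliterated into a romanized form in which long vowels and aspirated consonants are written as two-letter units. The `DIGRAPHS` table, the function names, and the sample words are hypothetical and chosen only for illustration; the abstract does not specify the transliteration scheme or phone set actually used.

```python
# Hypothetical two-letter units in a romanized transliteration
# (assumed for illustration; not the phone set from the paper).
DIGRAPHS = {
    "aa", "ii", "uu", "ee", "oo", "ai", "au",
    "kh", "gh", "ch", "jh", "th", "dh", "ph", "bh", "sh",
}


def to_phones(word):
    """Split a transliterated word into phones by greedy longest match."""
    phones = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            # Two letters that together stand for one phone.
            phones.append(pair)
            i += 2
        else:
            # Any other letter is treated as a phone on its own.
            phones.append(word[i])
            i += 1
    return phones


def build_pronunciation_dictionary(transliterated_sentences):
    """Map each unique alphabetic word in the corpus to its phone sequence."""
    lexicon = {}
    for sentence in transliterated_sentences:
        for word in sentence.lower().split():
            if word.isalpha() and word not in lexicon:
                lexicon[word] = to_phones(word)
    # Sort entries alphabetically, as is common for lexicon files.
    return dict(sorted(lexicon.items()))


def format_lexicon(lexicon):
    """Render the lexicon as 'word phone phone ...' lines."""
    return "\n".join(
        f"{word} {' '.join(phones)}" for word, phones in lexicon.items()
    )


if __name__ == "__main__":
    corpus = ["kannada bhaashe", "Kannada naadu"]
    print(format_lexicon(build_pronunciation_dictionary(corpus)))
```

Each unique word gets one entry, so the dictionary grows with the vocabulary of the text corpus rather than with the number of recorded utterances. A real system would replace the toy table with rules derived from the Kannada script.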


Index Terms

Computer Science

Information Sciences

Keywords

ASR, TTS, G2P, Speech corpus