Research Article

Improving Accessibility and Independence for Blind/Visually Impaired Persons based on Speech Synthesis Technology

by Manpreet Kaur Dhaliwal, Rohini Sharma
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 186 - Number 28
Year of Publication: 2024
Authors: Manpreet Kaur Dhaliwal, Rohini Sharma
10.5120/ijca2024923768

Manpreet Kaur Dhaliwal, Rohini Sharma. Improving Accessibility and Independence for Blind/Visually Impaired Persons based on Speech Synthesis Technology. International Journal of Computer Applications. 186, 28 (Jul 2024), 12-20. DOI=10.5120/ijca2024923768

@article{ 10.5120/ijca2024923768,
author = { Manpreet Kaur Dhaliwal, Rohini Sharma },
title = { Improving Accessibility and Independence for Blind/Visually Impaired Persons based on Speech Synthesis Technology },
journal = { International Journal of Computer Applications },
issue_date = { Jul 2024 },
volume = { 186 },
number = { 28 },
month = { Jul },
year = { 2024 },
issn = { 0975-8887 },
pages = { 12-20 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume186/number28/improving-accessibility-and-independence-for-blindvisually-impaired-persons-based-on-speech-synthesis-technology/ },
doi = { 10.5120/ijca2024923768 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-07-26T23:00:21.036467+05:30
%A Manpreet Kaur Dhaliwal
%A Rohini Sharma
%T Improving Accessibility and Independence for Blind/Visually Impaired Persons based on Speech Synthesis Technology
%J International Journal of Computer Applications
%@ 0975-8887
%V 186
%N 28
%P 12-20
%D 2024
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Speech is a crucial communication tool, and text-to-speech (TTS) systems enable persons with disabilities to access information and achieve independence. This study investigates the relevance and effects of speech synthesis systems in enhancing the independence and accessibility of blind/visually impaired persons (BVIPs). It gives an overview of speech synthesis technology, followed by the categories of speech synthesis systems, and reviews studies that increase BVIPs' independence and accessibility. To evaluate the speech quality of synthesis systems in terms of naturalness and intelligibility, a pilot study is carried out using the gTTS, pyttsx3, SpeechT5, and Bark models. In this pilot study, SpeechT5 and pyttsx3 performed very well in terms of naturalness and intelligibility.
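
As a rough illustration of how listener ratings for naturalness or intelligibility are commonly summarized, the sketch below computes a mean opinion score on a 1-5 scale with a normal-approximation 95% confidence interval. The function names, the system labels `system_a` and `system_b`, and the ratings are hypothetical placeholders, not the data or the procedure of the pilot study.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for a list of 1-5 listener ratings."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie on the 1-5 opinion scale")
    mos = mean(ratings)
    # Normal-approximation interval; with a single rating there is no
    # sample deviation, so the half-width is reported as 0.
    if len(ratings) > 1:
        half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    else:
        half_width = 0.0
    return mos, half_width


def rank_systems(ratings_by_system):
    """Return system names sorted from highest to lowest MOS."""
    scores = {name: mean_opinion_score(r)[0] for name, r in ratings_by_system.items()}
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Placeholder ratings for two unnamed systems (assumed, illustrative only).
    placeholder = {"system_a": [4, 5, 4, 3], "system_b": [2, 3, 3, 2]}
    for name in rank_systems(placeholder):
        mos, ci = mean_opinion_score(placeholder[name])
        print(f"{name}: MOS = {mos:.2f} +/- {ci:.2f}")
```

Reporting the interval alongside the mean is a design choice that helps when listener panels are small, since two systems with close scores may not be distinguishable.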

Index Terms

Computer Science
Information Sciences

Keywords

Blind/Visually impaired persons, gTTS, Speech Synthesis, SpeechT5, pyttsx3, Mean Opinion Scale