Building a natural sounding Text-To-Speech system for the Nepali language: research and development challenges and solutions

  • Roop Shree Ratna Bajracharya Department of Computer Science and Engineering, Kathmandu University, Nepal
  • Santosh Regmi KEIV Technologies Pvt. Ltd.
  • Bal Krishna Bal Department of Computer Science and Engineering, Kathmandu University, Nepal
  • Balaram Prasain Central Department of Linguistics, Tribhuvan University, Nepal
Keywords: Nepali Text-to-Speech, Festival Speech Synthesis, Unit Selection Speech Synthesis

Abstract

Text-to-Speech (TTS) synthesis has come far from its primitive, monotone synthetic voices to more natural and intelligible-sounding voices. One direct application of a natural-sounding TTS system is screen reader software for the visually impaired and blind community. The Festival Speech Synthesis System uses a concatenative speech synthesis method together with a unit selection process to generate a natural-sounding voice. This work primarily gives an account of the efforts towards developing a natural-sounding TTS system for Nepali using the Festival system. We also shed light on the issues faced and the solutions derived, many of which are likely to apply to other similar under-resourced languages in the region.

Author Biographies

Roop Shree Ratna Bajracharya, Department of Computer Science and Engineering, Kathmandu University, Nepal

Roop Shree Ratna Bajracharya (bajracharya.roop@gmail.com) is a Faculty Member at the Department of Computer Science and Engineering, Kathmandu University.

Santosh Regmi, KEIV Technologies Pvt. Ltd.

Santosh Regmi (regmi.santosh32@gmail.com) is the Managing Director of KEIV Technologies Pvt. Ltd.

Bal Krishna Bal, Department of Computer Science and Engineering, Kathmandu University, Nepal

Dr. Bal Krishna Bal (bal@ku.edu.np) is an Associate Professor at the Department of Computer Science and Engineering, Kathmandu University.

Balaram Prasain, Central Department of Linguistics, Tribhuvan University, Nepal

Dr. Balaram Prasain (prasain2003@yahoo.com) is an Associate Professor at the Central Department of Linguistics, Tribhuvan University.

Published
2019-12-31
How to Cite
Bajracharya, R., Regmi, S., Bal, B., & Prasain, B. (2019). Building a natural sounding Text-To-Speech system for the Nepali language: research and development challenges and solutions. Gipan, 4, 106-116. https://doi.org/10.3126/gipan.v4i0.35461
Section
Articles