Building a natural sounding Text-To-Speech system for the Nepali language: research and development challenges and solutions

Roop Shree Ratna Bajracharya; Santosh Regmi; Bal Krishna Bal; Balaram Prasain

doi:10.3126/gipan.v4i0.35461

Authors

Roop Shree Ratna Bajracharya Department of Computer Science and Engineering, Kathmandu University, Nepal
Santosh Regmi KEIV Technologies Pvt. Ltd.
Bal Krishna Bal Department of Computer Science and Engineering, Kathmandu University, Nepal
Balaram Prasain Central Department of Linguistics, Tribhuvan University, Nepal

Keywords:

Nepali Text-to-Speech, Festival Speech Synthesis, Unit Selection Speech Synthesis

Abstract

Text-to-Speech (TTS) synthesis has come a long way from its primitive, monotone synthetic voices to more natural and intelligible sounding voices. One direct application of a natural-sounding TTS system is screen reader software for the blind and visually impaired community. The Festival Speech Synthesis System uses a concatenative speech synthesis method together with a unit selection process to generate a natural-sounding voice. This work primarily gives an account of the efforts put towards developing a natural-sounding TTS system for Nepali using the Festival system. We also shed light on the issues faced and the solutions derived, many of which are likely to apply to other similar under-resourced languages in the region.


Author Biographies

Roop Shree Ratna Bajracharya, Department of Computer Science and Engineering, Kathmandu University, Nepal

Roop Shree Ratna Bajracharya (bajracharya.roop@gmail.com) is a faculty member at the Department of Computer Science and Engineering, Kathmandu University.

Santosh Regmi, KEIV Technologies Pvt. Ltd.

Santosh Regmi (regmi.santosh32@gmail.com) is the Managing Director of KEIV Technologies Pvt. Ltd.

Bal Krishna Bal, Department of Computer Science and Engineering, Kathmandu University, Nepal

Dr. Bal Krishna Bal (bal@ku.edu.np) is an Associate Professor at the Department of Computer Science and Engineering, Kathmandu University.

Balaram Prasain, Central Department of Linguistics, Tribhuvan University, Nepal

Dr. Balaram Prasain (prasain2003@yahoo.com) is an Associate Professor at the Central Department of Linguistics, Tribhuvan University.
