Title
Estonian Text-to-Speech Synthesis with Non-autoregressive Transformers
Abstract
While text-to-speech synthesis with non-autoregressive Transformers has achieved state-of-the-art quality for many languages, the methodology of Estonian text-to-speech synthesis has not been revised for neural methods. This paper evaluates the quality of Estonian text-to-speech with Transformer-based models using different language-specific data processing steps. Additionally, we conduct a human evaluation to show how well these models can learn the patterns of Estonian pronunciation, given different amounts of training data and varying degrees of phonetic information. Our error analysis shows that using a simple multi-speaker approach can significantly decrease the number of pronunciation errors, while some information can also be helpful to a smaller extent.
Year
2022
DOI
10.22364/bjmc.2022.10.3.17
Venue
BALTIC JOURNAL OF MODERN COMPUTING
Keywords
speech technology, text-to-speech synthesis, Estonian
DocType
Journal
Volume
10
Issue
3
ISSN
2255-8942
Citations
0
PageRank
0.34
References
0
Authors
3
Name            Order  Citations  PageRank
Liisa Ratsep    1      0          0.34
Rasmus Lellep   2      0          0.34
Mark Fishel     3      0          0.68