Title
Phoneme-guided dysarthric speech conversion with non-parallel data by joint training
Abstract
The phonetic structures of dysarthric speech are more difficult to discriminate than those of normal speech. In this paper, we therefore propose a novel voice conversion framework for dysarthric speech that learns disentangled audio-transcription representations. The novelty of this method is that it takes both audio and its corresponding transcription as training inputs simultaneously. We constrain the linguistic representation extracted from the audio input to be close to the linguistic representation extracted from the transcription input, forcing the two to share the same distribution. As a result, the proposed model can generate appropriate linguistic representations without any transcripts during the testing stage. Objective and subjective evaluations show that the proposed method achieves higher intelligibility and better speaker similarity in the converted speech than the baseline approaches.
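The abstract describes a joint-training scheme in which an audio encoder and a transcription encoder produce linguistic representations that are pulled together by a matching constraint, so that at test time the audio encoder alone suffices. Below is a minimal sketch of that idea, assuming a PyTorch implementation; the GRU encoders, module sizes, frame-aligned phoneme inputs, and L1 matching loss are all illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical sketch of the joint-training idea from the abstract.
# An audio encoder and a text (transcription) encoder each produce a
# linguistic representation; a distance loss pulls the two together,
# while a decoder reconstructs speech features from the linguistic
# representation plus a speaker embedding.
import torch
import torch.nn as nn

class AudioEncoder(nn.Module):
    def __init__(self, n_mels=80, dim=128):
        super().__init__()
        self.rnn = nn.GRU(n_mels, dim, batch_first=True)

    def forward(self, mel):                 # mel: (B, T, n_mels)
        out, _ = self.rnn(mel)
        return out                          # (B, T, dim) linguistic embedding

class TextEncoder(nn.Module):
    def __init__(self, n_phonemes=64, dim=128):
        super().__init__()
        self.emb = nn.Embedding(n_phonemes, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)

    def forward(self, phonemes):            # phonemes: (B, T) frame-aligned ids
        out, _ = self.rnn(self.emb(phonemes))
        return out                          # (B, T, dim) linguistic embedding

class Decoder(nn.Module):
    def __init__(self, dim=128, spk_dim=16, n_mels=80):
        super().__init__()
        self.rnn = nn.GRU(dim + spk_dim, dim, batch_first=True)
        self.proj = nn.Linear(dim, n_mels)

    def forward(self, ling, spk):           # spk: (B, spk_dim)
        spk = spk.unsqueeze(1).expand(-1, ling.size(1), -1)
        out, _ = self.rnn(torch.cat([ling, spk], dim=-1))
        return self.proj(out)               # reconstructed mel spectrogram

def joint_loss(mel, phonemes, spk, audio_enc, text_enc, dec):
    z_audio = audio_enc(mel)
    z_text = text_enc(phonemes)
    recon = dec(z_audio, spk)
    # Reconstruction term plus the constraint that audio- and
    # text-derived linguistic representations stay close.
    return (nn.functional.l1_loss(recon, mel)
            + nn.functional.l1_loss(z_audio, z_text))

# Smoke test: a random batch of 2 utterances, 100 frames each.
ae, te, de = AudioEncoder(), TextEncoder(), Decoder()
mel = torch.randn(2, 100, 80)
ph = torch.randint(0, 64, (2, 100))
spk = torch.randn(2, 16)
loss = joint_loss(mel, ph, spk, ae, te, de)
```

Under these assumptions, only the audio encoder and decoder would be run at test time, which is why no transcript is needed for conversion.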
Year
2022
DOI
10.1007/s11760-021-02119-6
Venue
Signal, Image and Video Processing
Keywords
Voice conversion, Dysarthric speech, Autoencoder, Non-parallel data
DocType
Journal
Volume
16
Issue
6
ISSN
1863-1703
Citations
0
PageRank
0.34
References
10
Authors
5
Name                 Order  Citations  PageRank
Chen, Xunquan        1      0          0.34
Oshiro, Atsuki       2      0          0.34
J. Chen              3      112        23.18
Takashima, Ryoichi   4      0          0.34
Tetsuya Takiguchi    5      85         8.77