Title | ||
---|---|---|
Improving Sub-Phone Modeling For Better Native Language Identification With Non-Native English Speech |
Abstract | ||
---|---|---|
Identifying a speaker's native language from their speech in a second language is useful for many human-machine voice interface applications. In this paper, we use a sub-phone-based i-vector approach to identify non-native English speakers' native languages from their English speech input. Time-delay deep neural networks (TDNNs) are trained on LVCSR corpora to improve the alignment of speech utterances with their corresponding sub-phonemic "senone" sequences. The phonetic variability caused by a speaker's native language can be modeled better with the sub-phone models than with the conventional phone-model-based approach. Experimental results on the database released for the 2016 Interspeech ComParE Native Language challenge, which covers 11 different L1s, show that our system outperforms the best challenge system by a large margin (87.2% UAR compared to 81.3% UAR for the best system from the 2016 ComParE challenge). |
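
The abstract reports results as UAR without expanding the term; in the ComParE challenges it stands for unweighted average recall, the mean of the per-class recalls, so each L1 counts equally regardless of how many utterances it has. The following is a minimal sketch of that metric, not the authors' evaluation code; the function name and the toy labels are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; every class counts equally, whatever its size."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one labeled example is required")

    total = defaultdict(int)    # number of reference items per class
    correct = defaultdict(int)  # number of correctly predicted items per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy, made-up labels (NOT data from the paper): two L1 classes of unequal size.
y_true = ["GER", "GER", "GER", "GER", "JPN", "JPN"]
y_pred = ["GER", "GER", "GER", "JPN", "JPN", "GER"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```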
Year | DOI | Venue |
---|---|---|
2017 | 10.21437/Interspeech.2017-245 | 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION |
Keywords | Field | DocType |
---|---|---|
time delay deep neural network, i-vector, native language identification | Computer science, Speech recognition, Phone, Natural language processing, Artificial intelligence, Native-language identification | Conference |
ISSN | Citations | PageRank |
---|---|---|
2308-457X | 0 | 0.34 |
References | Authors |
---|---|
10 | 8 |
Name | Order | Citations | PageRank |
---|---|---|---|
Qian Yao | 1 | 527 | 51.55 |
Keelan Evanini | 2 | 79 | 20.23 |
Xinhao Wang | 3 | 57 | 15.23 |
David Suendermann-Oeft | 4 | 3 | 2.17 |
Robert A. Pugh | 5 | 0 | 0.68 |
Patrick Lange | 6 | 9 | 8.42 |
Hillary R. Molloy | 7 | 0 | 0.34 |
Frank K. Soong | 8 | 1395 | 268.29 |