Title
Improving Sub-Phone Modeling For Better Native Language Identification With Non-Native English Speech
Abstract
Identifying a speaker's native language (L1) from their speech in a second language is useful for many human-machine voice interface applications. In this paper, we use a sub-phone-based i-vector approach to identify the native languages of non-native English speakers from their English speech. Time delay deep neural networks (TDNNs) are trained on LVCSR corpora to improve the alignment of speech utterances with their corresponding sub-phonemic "senone" sequences. The phonetic variability caused by a speaker's native language can be modeled better with sub-phone models than with the conventional phone-model-based approach. Experimental results on the database released for the 2016 Interspeech ComParE Native Language challenge, which covers 11 different L1s, show that our system outperforms the best system from that challenge by a large margin (87.2% UAR compared to 81.3% UAR).
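The results above are reported as UAR (unweighted average recall), that is, the mean of the per-class recalls, so each L1 counts equally regardless of how many utterances it has. The following is a minimal sketch of that standard definition, not the authors' scoring code; the function name and the toy labels are hypothetical and are not taken from the challenge data.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (every class weighted equally)."""
    total = defaultdict(int)    # utterances per true class
    correct = defaultdict(int)  # correctly classified utterances per true class
    for truth, guess in zip(y_true, y_pred):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels for two L1 classes of unequal size.
y_true = ["GER", "GER", "GER", "GER", "KOR", "KOR"]
y_pred = ["GER", "GER", "GER", "KOR", "KOR", "GER"]

# Per-class recall: GER = 3/4, KOR = 1/2, so UAR = (0.75 + 0.5) / 2.
uar = unweighted_average_recall(y_true, y_pred)
# Plain accuracy weights every utterance equally instead: 4 of 6 correct.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this toy case UAR is lower than plain accuracy because the smaller class is recognized less reliably, which is the kind of imbalance the metric is designed to expose.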
Year
2017
DOI
10.21437/Interspeech.2017-245
Venue
18th Annual Conference of the International Speech Communication Association (INTERSPEECH 2017), Vols 1-6: Situated Interaction
Keywords
time delay deep neural network, i-vector, native language identification
Field
Computer science, Speech recognition, Phone, Natural language processing, Artificial intelligence, Native-language identification
DocType
Conference
ISSN
2308-457X
Citations
0
PageRank
0.34
References
10
Authors
8
Name
Order
Citations
PageRank
Qian Yao152751.55
Keelan Evanini27920.23
Xinhao Wang35715.23
David Suendermann-Oeft432.17
Robert A. Pugh500.68
Patrick Lange698.42
Hillary R. Molloy700.34
Frank K. Soong81395268.29