The Effect Of Changes In Speech Features On The Recognition Accuracy Of ASR System: A Study On The Malay Speech Impaired Children
Abstract
Speech impairment refers to a disability that causes human speech production to deviate from the norm. Although several studies have sought to identify the differences between non-impaired and impaired speech, little is known about how these differences affect speech intelligibility and the performance of automatic speech recognition (ASR) systems in recognizing the impaired speech of children. This study investigates the features of impaired speech in relation to intelligibility deficits and degradation in ASR performance. The features examined are formant frequencies, intensity, fundamental frequency (F0), and the perturbation features jitter and shimmer. As no existing speech database was available for the evaluation, we developed a speech database of speech-impaired children and analysed the features of their speech. We identified significant differences in the selected features, and we also identified the relationship between the ASR system's Word Error Rate (WER) on impaired speech and the speech features. The results show significant differences in F0, jitter and shimmer between the Control Group (CG) and the Speech Impaired Group (SIG). This paper explains the differences between impaired and non-impaired speech, which can be used in developing automatic speech recognition systems. We observed that F0 affects ASR performance and found it to be a significant predictor of the recognition accuracy of the vowel phonemes /e/ and /u/.
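For readers unfamiliar with the WER metric mentioned above, the following is a minimal sketch of how it is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recogniser output, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenisation are assumptions made for illustration; the abstract does not describe how the study itself computed WER.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Conventional WER: word-level edit distance / number of reference words.

    Illustrative sketch only; tokenisation by whitespace is an assumption.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

With a four-word reference such as "saya suka makan nasi" and a recogniser output of "saya suka nasi", the sketch counts one deletion and returns 0.25. Because insertions are counted, WER can exceed 1.0 when the output is much longer than the reference.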