• School of Information and Computer Science, Taiyuan University of Technology, Taiyuan 030024, P. R. China;
XUE Peiyun, Email: 236139168@qq.com

This paper proposes a multi-scale mel-domain feature map extraction algorithm to address the difficulty of improving the recognition rate for dysarthric speech. First, empirical mode decomposition is used to decompose the speech signal, and Fbank features and their first-order differences are extracted from each of the three effective components to construct a new feature map that captures detail in the frequency domain. Second, because single-channel neural networks suffer from loss of effective features and high computational complexity during training, a speech recognition network model is proposed. Finally, training and decoding are performed on the public UA-Speech dataset. The experimental results show that the speech recognition model reaches an accuracy of 92.77%, indicating that the proposed algorithm can effectively improve the recognition rate for dysarthric speech.
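The feature-map construction described above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the three effective components from empirical mode decomposition are already available as equal-length arrays (the decomposition itself is left to an external routine), and it approximates the first-order difference with a plain frame-to-frame subtraction. The function names and the frame, hop, FFT, and filter-count settings are illustrative choices, not values taken from the paper.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular mel filters, shape (n_filters, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        left, center, right = hz_points[i:i + 3]
        rising = (freqs - left) / (center - left)
        falling = (right - freqs) / (right - center)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def fbank(signal, sample_rate=16000, frame_len=400, hop=160, n_fft=512, n_filters=40):
    """Log mel filterbank energies, shape (n_frames, n_filters)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Index matrix that slices the signal into overlapping frames.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return np.log(energies + 1e-10)  # small floor avoids log(0)

def first_difference(feats):
    """Frame-to-frame difference along time; the first frame is set to zero."""
    delta = np.zeros_like(feats)
    delta[1:] = feats[1:] - feats[:-1]
    return delta

def multiscale_feature_map(components, **kwargs):
    """Stack Fbank and difference features of each component as channels.

    `components` is assumed to hold the effective EMD components,
    computed elsewhere. Output shape: (2 * n_components, n_frames, n_filters).
    """
    channels = []
    for comp in components:
        static = fbank(np.asarray(comp, dtype=float), **kwargs)
        channels.append(static)
        channels.append(first_difference(static))
    return np.stack(channels, axis=0)
```

With three one-second components sampled at 16 kHz and the default settings, `multiscale_feature_map` returns an array of shape `(6, 98, 40)`: a static and a difference channel per component, which a multi-channel network could consume as an image-like input.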

Citation: ZHAO Jianxing, XUE Peiyun, BAI Jing, SHI Chenkang, YUAN Bo, SHI Tongtong. A multiscale feature extraction algorithm for dysarthric speech recognition. Journal of Biomedical Engineering, 2023, 40(1): 44-50. doi: 10.7507/1001-5515.202205049

Copyright © the editorial department of Journal of Biomedical Engineering of West China Medical Publisher. All rights reserved
