Acoustic detection based on machine learning and signal processing is an important approach to pathological voice detection, and the extraction of voice features is one of its most important steps. The features in wide use today have three drawbacks: they depend on fundamental frequency extraction, they are easily affected by noise, and they have high computational complexity. To address these shortcomings, a new method of pathological voice detection based on multi-band analysis and chaotic analysis is proposed. A gammatone filter bank was used to simulate the auditory characteristics of the human ear and to split the voice into signals in different frequency bands. Because turbulent noise caused by chaos in the voice worsens spectral convergence, the short time Fourier transform was applied to each frequency band of the voice signal, the feature gammatone short time spectral self-similarity (GSTS) was extracted, and the degree of chaos in each band signal was analyzed to distinguish normal from pathological voice. The experimental results showed that, combined with traditional machine learning methods, GSTS reached an accuracy of 99.50% on the pathological voice database of the Massachusetts Eye and Ear Infirmary (MEEI), an improvement of 3.46% over the best existing features. In addition, the extraction time of GSTS was far shorter than that of traditional nonlinear features. These results show that GSTS offers higher extraction efficiency and better recognition performance than the existing features.
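The pipeline described above (gammatone band splitting, a short time Fourier transform per band, then a per-band self-similarity score) can be illustrated with a minimal NumPy sketch. The abstract does not give the exact GSTS definition, so everything specific here is an assumption made for illustration: the helper names (`gammatone_ir`, `stft_mag`, `band_self_similarity`), the centre frequencies, the frame sizes, and above all the use of the mean cosine similarity between adjacent magnitude spectra as a stand-in for the self-similarity measure.

```python
import numpy as np

def erb(fc):
    # Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation).
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone_ir(fc, fs, order=4, duration=0.05):
    # Truncated gammatone impulse response centred at fc (hypothetical helper).
    t = np.arange(int(duration * fs)) / fs
    b = 1.019 * erb(fc)
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.sqrt(np.sum(g ** 2))  # normalise to unit energy

def stft_mag(x, frame_len=512, hop=256):
    # Magnitude short time Fourier transform with a Hann window.
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def band_self_similarity(x, fs, centre_freqs=(200.0, 1000.0, 3000.0)):
    # One score per band: mean cosine similarity between adjacent spectra.
    # ASSUMPTION: this is a stand-in measure, not the paper's GSTS formula.
    scores = []
    for fc in centre_freqs:
        band = np.convolve(x, gammatone_ir(fc, fs), mode="same")
        mag = stft_mag(band)
        prev, nxt = mag[:-1], mag[1:]
        num = np.sum(prev * nxt, axis=1)
        den = np.linalg.norm(prev, axis=1) * np.linalg.norm(nxt, axis=1) + 1e-12
        scores.append(np.mean(num / den))
    return np.array(scores)

if __name__ == "__main__":
    fs = 16000
    t = np.arange(fs) / fs
    tone = np.sin(2 * np.pi * 200.0 * t)                # regular, periodic signal
    noise = np.random.default_rng(0).standard_normal(fs)  # irregular signal
    print("tone :", band_self_similarity(tone, fs))
    print("noise:", band_self_similarity(noise, fs))
```

Under these assumptions, a steady periodic signal yields scores close to 1 in the band that contains it, while a noise-like signal yields lower scores. The resulting per-band vector could then be passed to a conventional classifier, which is the role the abstract assigns to traditional machine learning methods.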
Citation: ZHAO Denghuang, ZHOU Changwei, ZHU Xincheng, ZHANG Xiaojun, TAO Zhi. Pathological voice detection based on gammatone short time spectral self-similarity. Journal of Biomedical Engineering, 2022, 39(4): 694-701, 712. doi: 10.7507/1001-5515.202107037
Copyright © the editorial department of Journal of Biomedical Engineering of West China Medical Publisher. All rights reserved