Likelihood-maximizing-based multiband spectral subtraction for robust speech recognition

Babaali, B ; Sameti, H ; Safayani, M ; Sharif University of Technology

Babaali, B ; Sharif University of Technology | 2009

Type of Document: Article
DOI: 10.1155/2009/878105
Publisher: 2009
Abstract:
Automatic speech recognition performance degrades significantly when speech is affected by environmental noise. The major challenge today is to achieve good robustness in adverse noisy conditions so that automatic speech recognizers can be used in real situations. Spectral subtraction (SS) is a well-known and effective approach; it was originally designed to improve the quality of the speech signal as judged by human listeners. SS techniques usually improve the quality and intelligibility of the speech signal, whereas speech recognition systems need compensation techniques that reduce the mismatch between noisy speech features and acoustic models trained on clean speech. Nevertheless, some correlation can be expected between speech quality improvement and the increase in recognition accuracy. This paper proposes a novel approach to this problem that treats SS and the speech recognizer not as two independent entities cascaded together, but as two interconnected components of a single system sharing the common goal of improved speech recognition accuracy. The approach feeds information from the statistical models of the recognition engine back to the front end in order to tune the SS parameters. With this architecture, the drawbacks of previously proposed methods are overcome and better recognition accuracy is achieved. Experimental evaluations show that the proposed method achieves significant improvements in recognition rate across a wide range of signal-to-noise ratios.
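The idea summarized in the abstract, choosing SS parameters by the likelihood they yield under the recognizer's statistical models, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, the subtraction rule is the common power-domain form with an over-subtraction factor `alpha` per band and a spectral floor `beta`, a single diagonal Gaussian over log-power features stands in for the acoustic model, and an exhaustive grid search stands in for whatever optimization the authors actually use.

```python
import itertools
import math


def multiband_subtract(noisy_power, noise_power, band_edges, alphas, beta=0.01):
    """Power spectral subtraction with one over-subtraction factor per band.

    band_edges is a list of (start, end) bin index pairs; alphas holds one
    factor per band. beta sets a spectral floor relative to the noise power.
    Assumes strictly positive noise power so the floor stays positive.
    """
    cleaned = []
    for (lo, hi), alpha in zip(band_edges, alphas):
        for y, n in zip(noisy_power[lo:hi], noise_power[lo:hi]):
            cleaned.append(max(y - alpha * n, beta * n))
    return cleaned


def log_likelihood(features, means, variances):
    """Log-likelihood under a diagonal Gaussian (toy stand-in for the acoustic model)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(features, means, variances)
    )


def tune_alphas(noisy_power, noise_power, band_edges, means, variances, candidates):
    """Grid search for the per-band alphas that maximize the model likelihood."""

    def score(alphas):
        cleaned = multiband_subtract(noisy_power, noise_power, band_edges, alphas)
        features = [math.log(p) for p in cleaned]  # log-power features
        return log_likelihood(features, means, variances)

    return max(itertools.product(candidates, repeat=len(band_edges)), key=score)


# Toy data: noise adds to the clean power spectrum, two bands of two bins each.
clean = [4.0, 2.0, 1.0, 3.0]
noise = [1.0, 1.0, 0.5, 0.5]
noisy = [c + n for c, n in zip(clean, noise)]
bands = [(0, 2), (2, 4)]
means = [math.log(c) for c in clean]  # model "trained" on the clean spectrum
variances = [1.0] * len(clean)

best = tune_alphas(noisy, noise, bands, means, variances, [0.0, 0.5, 1.0, 1.5, 2.0])
print(best)  # (1.0, 1.0)
```

In this toy setup the noise estimate is exact, so the likelihood peaks when each band subtracts the noise once. With a real recognizer the score would come from the acoustic model's likelihood of the enhanced features, and the selected factors would generally differ from band to band.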
Keywords:
Acoustic models ; Automatic speech recognition ; Automatic speech recognizers ; Compensation techniques ; Environmental noise ; Experimental evaluations ; Human listeners ; Multi bands ; Noisy speech ; Quality of speech ; Real situations ; Recognition accuracies ; Recognition engines ; Recognition rates ; Robust speech recognition ; Spectral subtractions ; Speech qualities ; Speech recognition systems ; Speech recognizers ; Speech signals ; Statistical models ; Signal to noise ratio ; Speech analysis ; Speech intelligibility ; Speech recognition
Source: EURASIP Journal on Advances in Signal Processing ; Volume 2009, 2009 ; 16876172 (ISSN)
URL: https://link.springer.com/article/10.1155/2009/878105
