
A Soft Spectrographic Mask Estimation for Speech Recognition

Esmaeelzadeh, Vahid | 2008

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 39033 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Sameti, Hossein
  7. Abstract: Robustness to noise is a major challenge for Automatic Speech Recognition (ASR) systems. This thesis pursues missing-feature approaches to speech recognition as a route to robust ASR. In these approaches, low-SNR regions of a spectrogram are considered “missing” or “unreliable” and are removed from the spectrogram. Noise compensation is then carried out either by estimating the missing regions from the remaining ones before recognition, or by performing recognition directly on the incomplete spectrogram. Both techniques require a “spectrographic mask” that accurately labels the reliable and unreliable regions of a spectrogram, and estimating this mask is the most difficult aspect of missing-feature methods. In this thesis, a soft spectrographic mask is estimated using speech processing and Bayesian classifiers. Two new measures, subband spectral entropy and sparsity, are proposed for estimating the reliability of voiced speech. The main characteristic of the proposed measures is their robustness against non-stationary noise. Soft masks are estimated from these measures with Bayesian classifiers. The masks generated by the Bayesian classifier yield better recognition accuracy than previously reported mask estimation methods.
  8. Keywords: Spectral Mask ; Automatic Speech Recognition ; Missing Feature Theory ; Sparsity ; Spectral Entropy ; Bayesian Classifier
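
Illustrative note: the abstract does not give the thesis's definitions of its measures or its classifier, so the following is only a minimal sketch of the general ideas under stated assumptions. It assumes the common Shannon-entropy definition of subband spectral entropy, and it uses a sigmoid of local SNR as a stand-in soft mask, which is a generic soft-mask form and not the Bayesian classifier of the thesis. The function names, the threshold, and the slope are hypothetical.

```python
import math


def subband_spectral_entropy(power, n_subbands):
    """Normalized Shannon entropy of each subband of one frame's power spectrum.

    Assumed definition (not taken from the thesis): each subband is normalized
    to a probability distribution and its entropy is divided by log(band size),
    so 0 means all energy in one bin and 1 means a flat subband.
    """
    size = len(power) // n_subbands
    entropies = []
    for b in range(n_subbands):
        band = power[b * size:(b + 1) * size]
        total = sum(band)
        if total <= 0 or size < 2:
            # Treat an empty or degenerate subband as maximally flat.
            entropies.append(1.0)
            continue
        probs = [p / total for p in band]
        h = -sum(p * math.log(p) for p in probs if p > 0)
        entropies.append(h / math.log(size))
    return entropies


def soft_mask_from_snr(snr_db, threshold_db=0.0, slope=0.5):
    """Map a local SNR in dB to a reliability score in (0, 1) with a sigmoid.

    Stand-in for a soft mask value; threshold_db and slope are hypothetical.
    """
    z = slope * (snr_db - threshold_db)
    z = max(-50.0, min(50.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))
```

A harmonic (voiced) subband concentrates energy in a few bins and so has low entropy, while a noise-dominated subband is flatter and has entropy closer to 1. A soft mask assigns each time-frequency cell a reliability between 0 and 1 instead of a hard reliable/unreliable label.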
