Using Discriminative Training Approaches for Large Vocabulary Isolated Word Recognition

Osati, Majid | 2017

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 50905 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Sameti, Hossein
  7. Abstract:
  8. This study addresses large-scale isolated word recognition using several acoustic models. Our proposed approach, acoustic models trained with discriminative methods, is compared with other available training methods. Acoustic models are built and trained based on HMM-GMM, HMM-subspace GMM, and HMM-DNN using different training criteria such as Maximum Mutual Information (MMI), boosted MMI, Minimum Phone Error (MPE), and state-level Minimum Bayes Risk (sMBR). These discriminative approaches improved the speech recognition systems. Boosted MMI with a boosting factor of 0.3 for HMM-DNN yielded a Word Error Rate (WER) of 7.72, a relative WER improvement of 20.7 over the DNN baseline, the state-of-the-art method. When the lexicon size was increased from 5236 to 8084 words, the WER rose to 8.15. A new continuous speech dataset of 47 hours was collected so that continuous and isolated datasets could be combined. This dataset was cleaned and segmented using automatic alignment-based methods, which improved continuous speech recognition performance. An HMM-DNN model was built and trained on the continuous dataset, and its parameters were then optimized on the isolated-word training dataset using the MMI criterion. This approach resulted in a 1.2 percent relative WER improvement over the equivalent model based on the isolated dataset.
  9. Keywords:
  10. Acoustic Model ; Speech Recognition ; Discriminative Training ; Isolated Word Recognition ; Continuous Dataset
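
The abstract reports its gains as relative (comparative) WER changes. As a minimal sketch of that arithmetic, the snippet below computes the relative change between the two WER figures quoted for the 5236-word and 8084-word lexicons. The function name `relative_wer_change` is hypothetical, and the example assumes both figures refer to the same boosted-MMI HMM-DNN system, which the abstract implies but does not state explicitly.

```python
def relative_wer_change(baseline_wer: float, new_wer: float) -> float:
    """Return the relative change in WER, in percent, from baseline to new.

    Negative values mean the new system has a lower (better) WER.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (new_wer - baseline_wer) / baseline_wer


# WER values quoted in the abstract (assumed to come from the same system):
# 7.72 with the 5236-word lexicon, 8.15 with the 8084-word lexicon.
change = relative_wer_change(7.72, 8.15)
print(f"Relative WER change: {change:.2f}%")
```

A positive result here indicates a relative degradation as the lexicon grows; the same function gives a negative value when a new training criterion lowers the WER.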
