
On the Use of Artificial Neural Networks in Automatic Speech Recognition

Hassani, Adel | 2015

  1. Type of Document: M.Sc. Thesis
  2. Language: English
  3. Document No: 48217 (55)
  4. University: Sharif University of Technology, International Campus, Kish Island
  5. Department: Science and Engineering
  6. Advisor(s): Ghorshi, Mohammad Ali; Khayyat, Amir Ali Akbar
  7. Abstract:
  8. In this thesis, Artificial Neural Networks (ANNs) are used in Automatic Speech Recognition (ASR) in place of Hidden Markov Models (HMMs). The HMM is one of the most dominant Bayesian network technologies and the most successful model in current ASR systems. However, excessive training time is a major issue in HMM-based speech recognition. This thesis presents an ANN language model for human speech that maps spectral features of speech, namely the formants, the cepstrum (Mel-Frequency Cepstral Coefficients, MFCCs) and the Power Spectral Density (PSD), extracted from samples of specific words, into a discrete vector space. The network is trained with feedforward and recurrent topologies to operate as a probability estimator. The smooth nature of the resulting distributions reduces perplexity when the vocabulary data set is limited. Optimized feature extraction and discretization algorithms are therefore employed for an efficient implementation of the feedforward and Elman recurrent neural networks during training. A word-class interpretation of the neural network inputs and outputs, which are defined by their relevant features, is shown to improve perplexity over the formant, MFCC and PSD models when training data is limited.
  9. Keywords:
  10. Feedforward Neural Network ; Artificial Neural Network ; Automatic Speech Recognition ; Power Spectral Density (PSD) Analysis ; Formant Speech Synthesis ; Elman Recurrent Neural Networks (ERNN)
