Using Structural Language Modeling in Continuous Speech Recognition Systems

SheikhShab, Golnar | 2009

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 39557 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Sameti, Hossein
  7. Abstract: The language model is one of the most important parts of an automatic speech recognition system; it incorporates knowledge of natural language into the system to improve its accuracy. The language model conventionally used in recognition systems is the n-gram, which is usually extracted from a large corpus using the relative frequency method. The n-gram model approximates the probability of a word sequence by multiplying its n-gram probabilities and thus does not take long-distance dependencies into account, which makes syntactic language models of interest. In this research, after examining different syntactic language models, a method for re-estimating the n-gram model, introduced by Stolcke in 1994, was identified as the most suitable one to implement and use in Farsi speech recognition systems. Speech recognition results with this language model showed that it can approximate the conventional n-gram and that interpolating the two can improve the system's accuracy. However, the most important features of the Stolcke model are its extensibility and the fact that it does not face the sparse data problem. In the course of this research, a method was also introduced for speeding up the well-known Inside-Outside algorithm for estimating the probabilities of a stochastic context-free grammar from a large raw corpus.
  8. Keywords: Inside-Outside Algorithm ; Speech Recognition ; Persian Speech ; Syntactic Language Model ; Stochastic Context-free Grammar
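
The abstract describes an n-gram model estimated by relative frequency, the probability of a word sequence as a product of n-gram probabilities, and the interpolation of two language models. The following is only a minimal illustrative sketch of those three ideas for the bigram case, not the thesis implementation. The function names (train_bigram, sentence_prob, interpolate), the sentence boundary tokens, the toy corpus, and the weight lam are all assumptions introduced for illustration, and no smoothing is applied.

```python
from collections import Counter


def train_bigram(sentences):
    """Estimate bigram probabilities by relative frequency (no smoothing)."""
    context_counts = Counter()
    bigram_counts = Counter()
    for words in sentences:
        # Hypothetical boundary tokens marking sentence start and end.
        padded = ["<s>"] + list(words) + ["</s>"]
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    # P(cur | prev) = count(prev, cur) / count(prev)
    return {bg: c / context_counts[bg[0]] for bg, c in bigram_counts.items()}


def sentence_prob(model, words):
    """Approximate P(sentence) as the product of its bigram probabilities."""
    padded = ["<s>"] + list(words) + ["</s>"]
    p = 1.0
    for prev, cur in zip(padded, padded[1:]):
        # Unseen bigrams get probability 0 here (the sparse data problem).
        p *= model.get((prev, cur), 0.0)
    return p


def interpolate(model_a, model_b, lam):
    """Linearly interpolate two bigram models with weight lam on model_a."""
    keys = set(model_a) | set(model_b)
    return {
        k: lam * model_a.get(k, 0.0) + (1.0 - lam) * model_b.get(k, 0.0)
        for k in keys
    }


# Toy corpus, purely illustrative.
corpus = [["a", "b"], ["a", "c"]]
model = train_bigram(corpus)
other = {("a", "b"): 1.0}  # stand-in for a second (e.g. grammar-derived) model
mixed = interpolate(model, other, 0.5)
```

In this sketch any sentence containing an unseen bigram receives probability zero, which is the sparse data problem mentioned in the abstract; interpolating with a second model is one common way to soften it.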
