Building and incorporating language models for Persian continuous speech recognition systems

Bahrani, M ; Sharif University of Technology | 2006

  1. Type of Document: Article
  2. Publisher: European Language Resources Association (ELRA), 2006
  3. Abstract:
  4. This paper describes building statistical language models for Persian from a corpus and incorporating them into a Persian continuous speech recognition (CSR) system. We used the Persian Text Corpus to build the language models. First, we preprocessed the corpus texts by normalizing the differing orthographies of words. We also reduced the number of POS tags by clustering them manually. We then extracted word-based monogram and POS-based bigram and trigram language models from the corpus. We also present the procedure for incorporating the language models into a Persian CSR system. Using the language models, a 27.4% reduction in word error rate was achieved in the best case.
  5. Keywords:
  6. Character recognition ; Computational linguistics ; Continuous speech recognition ; Natural language processing systems ; Continuous speech ; Language model ; Persian languages ; Persian Text Corpus ; Persians ; Statistical language models ; Tri grams ; Word error rate ; Speech recognition
  7. Source: 5th International Conference on Language Resources and Evaluation, LREC 2006, 22 May 2006 through 28 May 2006; 2006, Pages 2590-2593
  8. URL: https://aclanthology.org/L06-1014
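
The abstract mentions POS-based bigram language models extracted from a tagged corpus. As a rough illustration of that idea only, the sketch below estimates bigram probabilities over POS tags. It is not the authors' implementation: the toy tag sequences, the tag names (`N`, `V`, `ADJ`), the function names, and the add-one smoothing are all assumptions made for the example, since the record does not state which smoothing or tag set was used.

```python
from collections import Counter


def train_pos_bigram(tag_sequences):
    """Count POS-tag contexts and bigrams, padding each sentence with <s> and </s>."""
    context_counts = Counter()
    bigram_counts = Counter()
    tagset = set()
    for tags in tag_sequences:
        padded = ["<s>"] + list(tags) + ["</s>"]
        # Every symbol that can be predicted, including the end marker.
        tagset.update(padded[1:])
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, tagset


def bigram_prob(prev, cur, context_counts, bigram_counts, tagset):
    """Add-one smoothed estimate of P(cur | prev); the smoothing is an assumption."""
    return (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + len(tagset))


# Hypothetical toy corpus of POS-tag sequences (not from the Persian Text Corpus).
corpus = [["N", "V"], ["N", "ADJ", "V"], ["ADJ", "N", "V"]]
ctx, bi, tags = train_pos_bigram(corpus)
p_n_v = bigram_prob("N", "V", ctx, bi, tags)
print(f"P(V | N) = {p_n_v:.4f}")
```

A trigram model would follow the same pattern with two tags of context. How such scores are combined with acoustic scores inside the recognizer is described in the paper itself and is not reproduced here.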