Using ASR methods for OCR

Arora, A ; Sharif University of Technology | 2019

  1. Type of Document: Article
  2. DOI: 10.1109/ICDAR.2019.00111
  3. Publisher: IEEE Computer Society , 2019
  4. Abstract:
  5. Hybrid deep neural network hidden Markov models (DNN-HMM) have achieved impressive results on large vocabulary continuous speech recognition (LVCSR) tasks. However, recent DNN-HMM approaches have not been explored much for text recognition. Inspired by current work in automatic speech recognition (ASR) and machine translation, we present an open-vocabulary sub-word text recognition system. The sub-word lexicon and sub-word language model (LM) help overcome the challenge of recognizing out-of-vocabulary (OOV) words, and a DNN-HMM optical model (OM) based on a time delay neural network (TDNN) and a convolutional neural network (CNN) efficiently models the sequence dependency in the line image. We present results on 12 datasets, with training data varying from 6k lines to 600k lines. The system is built for 8 languages: English, French, Arabic, Chinese, Farsi, Tamil, Russian, and Korean. We report competitive results on several commonly used handwritten and printed text datasets. © 2019 IEEE
  6. Keywords:
  7. ASR ; BPE ; LF MMI ; OCR ; Computational linguistics ; Continuous speech recognition ; Deep neural networks ; Hidden Markov models ; Neural networks ; Optical character recognition ; Speech transmission ; Automatic speech recognition ; Convolution neural network ; Large vocabulary continuous speech recognition ; Machine translations ; Open Vocabulary ; Out of vocabulary words ; Time delay neural networks ; Character recognition
  8. Source: 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019, 20 September 2019 through 25 September 2019 ; 2019 , Pages 663-668 ; 15205363 (ISSN); 9781728128610 (ISBN)
  9. URL: https://ieeexplore.ieee.org/document/8978150
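
Illustrative note on the sub-word lexicon: the keywords list BPE (byte pair encoding), and the abstract credits sub-word units with handling out-of-vocabulary words. The following is a minimal toy sketch of BPE under stated assumptions, not the paper's implementation. The function names (`learn_bpe`, `merge_pair`, `segment`), the `</w>` end-of-word marker, and the sample word list are hypothetical choices made for illustration; the record does not state the actual settings used in the paper.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn an ordered list of merges from training words (toy version)."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Pick the most frequent pair; ties are broken by the pair itself for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split a (possibly unseen) word into sub-word units by replaying the merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Hypothetical training words; "lowest" is not among them.
training_words = ["low"] * 5 + ["lower"] * 2 + ["newest"] * 6 + ["widest"] * 3
merges = learn_bpe(training_words, num_merges=10)
pieces = segment("lowest", merges)
print(pieces)
```

Because segmentation falls back to single characters whenever no learned merge applies, any word written in the training alphabet can be expressed as a sequence of sub-word units, which is the property that lets a sub-word lexicon cover words absent from the training text.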