High accuracy Farsi language character segmentation and recognition

Kiaei, P ; Sharif University of Technology | 2019

  1. Type of Document: Article
  2. DOI: 10.1109/IranianCEE.2019.8786480
  3. Publisher: Institute of Electrical and Electronics Engineers Inc., 2019
  4. Abstract:
  5. Despite many advances in optical character recognition in general, there are still serious challenges remaining in recognizing Farsi text. The main reason is the cursive nature of the letters in written Farsi, i.e., depending on the position of a letter within a word, it might join to its neighboring letters, which consequently changes the shape of the character. As a result, each letter can have up to four different character shapes. In addition to the problem of segmenting the characters, the increased number of characters makes the recognition task even more challenging. This paper introduces a complete framework for character recognition, including a method for segmenting the characters and one for classifying the resulting separated characters. Character segmentation is performed using a new sliding-window algorithm with a high accuracy rate of 98.23%. With a total of 32 Farsi letters resulting in 114 character shapes, an almost perfect character recognition rate of 99.94% is achieved using the proposed Fisher characters method. The final system, including segmentation and recognition modules, achieves a recognition rate of 98.17% and is robust against the scale and rotation of the image, and the font size of the written text.
  6. Keywords:
  7. Connected character segmentation ; Farsi character recognition ; OCR ; Sliding-window algorithm ; Optical character recognition ; Character segmentation ; Connected characters ; Farsi language ; Font size ; High-accuracy ; Scale and rotation ; Sliding window algorithms ; Written texts ; Image segmentation
  8. Source: 27th Iranian Conference on Electrical Engineering, ICEE 2019, 30 April 2019 through 2 May 2019 ; 2019, Pages 1692-1698 ; 9781728115085 (ISBN)
  9. URL: https://ieeexplore.ieee.org/document/8786480
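
The three rates quoted in the abstract can be cross-checked with a short calculation. The sketch below assumes, as a simplification not stated in the record, that a character is read correctly only when it is first segmented correctly and then classified correctly, and that the two stages fail independently. Under that assumption the end-to-end rate is the product of the stage rates. The function name `combined_rate` is hypothetical and used only for illustration.

```python
# Rates reported in the abstract, expressed as fractions.
SEGMENTATION_RATE = 0.9823  # sliding-window segmentation accuracy
RECOGNITION_RATE = 0.9994   # recognition rate on already-separated characters


def combined_rate(segmentation: float, recognition: float) -> float:
    """Estimate the end-to-end rate of a two-stage pipeline.

    Assumption (not taken from the paper): the stages fail independently,
    so the end-to-end success probability is the product of the two rates.
    """
    return segmentation * recognition


if __name__ == "__main__":
    estimate = combined_rate(SEGMENTATION_RATE, RECOGNITION_RATE)
    print(f"{estimate:.2%}")  # prints 98.17%
```

The product, 0.9823 × 0.9994 ≈ 0.9817, agrees with the 98.17% reported for the final system, which suggests the end-to-end figure is consistent with the two per-stage figures under this simple model.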