
A new segmentation technique for multi font Farsi/Arabic texts

Omidyeganeh, M ; Sharif University of Technology | 2005

  1. Type of Document: Article
  2. DOI: 10.1109/ICASSP.2005.1415515
  3. Publisher: 2005
  4. Abstract:
  5. Segmentation is a very important stage of Farsi/Arabic character recognition systems. A new segmentation algorithm for multi-font Farsi/Arabic texts, based on conditional labeling of the up contour and down contour, is presented. A pre-processing technique adjusts the local baseline for each subword, and the algorithm uses this adaptive baseline to improve the segmentation results. In addition to the up and down contours, the algorithm also takes advantage of their curvatures. The algorithm was tested on a data set of printed Farsi texts containing 22236 characters in 18 different fonts; 97% of the characters were correctly segmented. © 2005 IEEE
  6. Keywords:
  7. Adaptive systems ; Algorithms ; Character recognition ; Data reduction ; Image segmentation ; Word processing ; Contour ; Curvatures ; Segmentation algorithm ; Subword ; Text processing
  8. Source: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05, Philadelphia, PA, 18 March 2005 through 23 March 2005; Volume II, 2005, Pages II757-II760; 15206149 (ISSN); 0780388747 (ISBN); 9780780388741 (ISBN)
  9. URL: https://ieeexplore.ieee.org/document/1415515
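
The abstract refers to the up contour, the down contour, and a local baseline computed per subword, without giving the details. As a rough illustration only, the sketch below shows one common way to obtain these quantities from a binary subword image. It is not the authors' algorithm: the function names, the toy image, and the choice of the densest row as the baseline are all assumptions made for this example, and the conditional labeling and curvature steps are not attempted.

```python
def upper_lower_contours(img):
    """Return (upper, lower) for a binary image given as a list of rows.

    For each column, upper holds the row index of the topmost ink pixel (1)
    and lower holds the row index of the bottommost ink pixel. Columns with
    no ink get None in both lists.
    """
    height = len(img)
    width = len(img[0]) if height else 0
    upper, lower = [], []
    for x in range(width):
        ink_rows = [y for y in range(height) if img[y][x]]
        upper.append(ink_rows[0] if ink_rows else None)
        lower.append(ink_rows[-1] if ink_rows else None)
    return upper, lower


def local_baseline(img):
    """Estimate the baseline as the row with the most ink pixels.

    Assumption for this sketch: the peak of the horizontal projection
    approximates the baseline of a printed subword. Ties go to the
    topmost such row.
    """
    if not img:
        raise ValueError("empty image")
    row_sums = [sum(row) for row in img]
    return max(range(len(row_sums)), key=row_sums.__getitem__)


# Hypothetical 4x6 subword image (1 = ink), invented for illustration.
SUBWORD = [
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 0],
]

upper, lower = upper_lower_contours(SUBWORD)
print(upper)                    # [2, 2, 0, 2, 2, 1]
print(lower)                    # [2, 2, 2, 2, 3, 2]
print(local_baseline(SUBWORD))  # 2
```

Calling `local_baseline` once per subword, rather than once per text line, is what makes the estimate local in this sketch; whether the paper's pre-processing works this way is not stated in the abstract.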