Persian Named Entity Recognition

Jalali Farahani, Farane | 2020

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 53326 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Ghassem-Sani, Gholamreza
  7. Abstract:
  8. Named entity recognition (NER) is one of the important tasks in natural language processing (NLP). Named entities are proper nouns such as names of persons, organizations, and locations, which refer to important entities in a text. NER contributes to other NLP tasks such as machine translation, text summarization, and text classification. Over the past decade, the development of deep learning (DL) methods has brought considerable progress in this field. The objective of this thesis is to propose an efficient DL-based method for NER in Farsi (Persian) text. Because deep neural networks require a great deal of training data, and Farsi lacks such data, we apply transfer learning and active learning. We use the pre-trained BERT model, a transfer-learning approach that carries knowledge from a source task over to a target task. BERT supports more than 100 languages, including Farsi. The architecture of our proposed method combines BERT with a conditional random field (CRF). With supervised learning, the method achieves word-level and phrase-level F1-scores of 84.23% and 80.80%, respectively, on the Arman corpus, and 86.14% and 82.05%, respectively, on the PEYMA corpus. With active learning, using only 30% of the Arman corpus and 20% of the PEYMA corpus, the method reaches 92.15% and 92.41%, respectively, of the performance obtained with fully supervised learning.
  9. Keywords:
  10. Natural Language Processing ; Active Learning ; Transfer Learning ; Named Entity Recognition ; Bidirectional Encoder Representations from Transformers (BERT) Model
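
The abstract reports both word-level and phrase-level F1-scores. The record does not say how the thesis computes them, so the following is only a minimal sketch of one common convention. It assumes BIO tags: word-level F1 scores each non-O token by its entity type, and phrase-level F1 requires an exact match of span boundaries and type. All function names here are hypothetical.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def f1_score(true_pos: int, n_pred: int, n_gold: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when nothing matches."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


def entity_type(tag: str) -> str:
    """Strip the B-/I- prefix; 'O' stays 'O'."""
    return tag if tag == "O" else tag[2:]


def word_level_f1(gold: List[str], pred: List[str]) -> float:
    """Assumption: a token counts as correct when its entity type matches."""
    g = [entity_type(t) for t in gold]
    p = [entity_type(t) for t in pred]
    true_pos = sum(1 for a, b in zip(g, p) if a != "O" and a == b)
    return f1_score(true_pos,
                    sum(1 for b in p if b != "O"),
                    sum(1 for a in g if a != "O"))


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((start, i, label))
            if tag == "O":
                start, label = None, None
            else:
                start, label = i, tag[2:]  # B- tag, or a stray I- tag
    return spans


def phrase_level_f1(gold: List[str], pred: List[str]) -> float:
    """Assumption: a phrase counts only if boundaries and type match exactly."""
    g, p = extract_spans(gold), extract_spans(pred)
    return f1_score(len(g & p), len(p), len(g))


gold_tags = ["B-PER", "I-PER", "O", "B-LOC"]
pred_tags = ["B-PER", "O", "O", "B-LOC"]
print(word_level_f1(gold_tags, pred_tags))    # 0.8
print(phrase_level_f1(gold_tags, pred_tags))  # 0.5
```

In this toy example the prediction truncates the two-token person name. That costs one token at the word level but the whole entity at the phrase level, which illustrates why phrase-level scores tend to be the lower of the two, as in the figures reported above.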
