Pattern-Based Relation Extraction on Persian News Articles

Cholmaghani Qaheh, Ali | 2016

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 48643 (31)
  4. University: Sharif University of Technology
  5. Department: Languages and Linguistics Center
  6. Advisor(s): Bahrani, Mohammad; Sameti, Hossein
  7. Abstract:
  8. Relation extraction is a central task in information extraction. There are two main approaches in this field: rule-based and statistical. This thesis applies a rule-based approach. We sought to identify Persian syntactic and morphological patterns for extracting relations between named entities. First, we annotated a news dataset of more than 100 thousand tokens with person, organization, and location named-entity tags. We then identified 1037 relations and 2197 candidate relations. Candidate and labeled relations were extracted between pairs of entities located in the same clause. The relations are "PERS_PERS-COMMENTING", "PERS_PERS-MEETING", "PERS_PERS-COOPERATION", "ORG_ORG-TRADING", "ORG_ORG-COMPETITION", "LOC_LOC-LOCATEDIN", "PERS_ORG-AFFILIATION", "PERS_LOC-LOCATEDIN", "PERS_LOC-BELONGTO", and "ORG_LOC-LOCATEDIN". After identifying the relations, we extracted patterns for each relation using morphological and syntactic rules. We evaluated the extraction patterns on a test dataset and obtained 87% precision, 72% recall, and 79% F-measure.
  9. Keywords:
  10. Natural Language Processing ; Information Extraction ; Relation Extraction ; Computational Linguistics ; Text Mining ; Persian Texts
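
The reported scores can be cross-checked with a short calculation. The sketch below assumes the "F-measure" in the abstract is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state which variant was used, and the function name `f_measure` is illustrative rather than taken from the thesis.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was extracted
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values reported in the abstract, expressed as fractions.
reported_precision = 0.87
reported_recall = 0.72

# Assumption: balanced F1 (beta = 1), which the abstract does not confirm.
f1 = f_measure(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")
```

Under that assumption the result is about 0.788, which rounds to the 79% stated in the abstract.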
