Automatic Extraction of Persian Named Entities’ Knowledge Graph from Web Sources

Azami, Hamid | 2019

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 52416 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Izadi, Mohammad
  7. Abstract:
  8. Knowledge graphs are structured data sources that are widely used in information processing techniques. Both general-purpose and specialized knowledge graphs exist. These graphs will serve as the kernel of future search engines. Because proper, tested Persian knowledge graphs are lacking, this research introduces a method for extracting a knowledge graph from web news sources and implements a system that extracts a knowledge graph from unstructured web sources. To this end, a training dataset for the classifier was first extracted from the semi-structured data of Wikipedia pages. Sentences were then extracted from unstructured web data (news sources) to create the test set. Next, a multiclass logistic classifier was trained with the training and test sets, and this classifier was used to extract relations from news sources. The extracted knowledge graph has nodes that are triples of the form (subject, predicate, object). The entities in this graph are organizations, persons, and locations, and the graph contains 3,592,032 entities. In the assessment, the graph showed high precision on the lexical, uniqueness, and compatibility metrics, which is attributed to the use of the distant supervision technique.
  9. Keywords:
  10. Knowledge Graph ; Structured Data ; Semi-Structured Data ; Logistic Classifier
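
The abstract names distant supervision and (subject, predicate, object) triples but does not give the details of the procedure. The following is a minimal sketch of the general distant supervision idea only, not the thesis's implementation: a sentence that mentions both the subject and the object of an already known triple is labeled with that triple's predicate. The `Triple` type, the `label_sentences` function, the predicate names, and the English example data are all hypothetical and chosen for illustration.

```python
from typing import List, NamedTuple, Tuple


class Triple(NamedTuple):
    """A knowledge graph fact of the form (subject, predicate, object)."""
    subject: str
    predicate: str
    obj: str


def label_sentences(
    seed_triples: List[Triple], sentences: List[str]
) -> List[Tuple[str, str]]:
    """Label sentences by distant supervision.

    A sentence that contains both the subject and the object of a seed
    triple is paired with that triple's predicate. Sentences that match
    no seed triple are left unlabeled and omitted from the result.
    """
    labeled = []
    for sentence in sentences:
        for triple in seed_triples:
            if triple.subject in sentence and triple.obj in sentence:
                labeled.append((sentence, triple.predicate))
    return labeled


# Hypothetical seed facts, standing in for triples taken from
# semi-structured sources such as Wikipedia pages.
seeds = [
    Triple("Sharif University of Technology", "located_in", "Tehran"),
    Triple("Hafez", "born_in", "Shiraz"),
]

# Hypothetical sentences, standing in for unstructured news text.
sentences = [
    "Sharif University of Technology is a public university in Tehran.",
    "Hafez was a poet from Shiraz.",
    "Tehran hosted a conference last week.",
]

training_examples = label_sentences(seeds, sentences)
for sentence, predicate in training_examples:
    print(predicate, "<-", sentence)
```

Plain substring matching keeps the sketch short. A real system would need entity recognition and normalization, since a sentence can mention two entities without expressing the relation between them.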
