Unsupervised Persian Keyword Extraction Using Exemplar Terms

Alidoust, Ali | 2016

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 48739 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Sameti, Hossein; Ghasem Sani, Gholam Reza
  7. Abstract: Keywords, or keyphrases, matter because they are the smallest units that represent the meaning of a text. Automated Keyword Extraction (AKE), a natural language processing task, is used in applications such as search, indexing, and information retrieval. The keywords of scientific articles are usually specified manually by their authors, whereas most of the information available on the internet lacks such keywords. In this research, we automatically extract keywords from a set of Persian paper abstracts using an unsupervised machine learning method. The method first extracts a set of candidate phrases from the text, then clusters the words of the document to find a set of exemplar terms, and finally uses these terms to select keywords from the candidate phrases. Clustering ensures that the extracted keywords cover all of the major subjects expressed in the text. According to recent reviews, this method has outperformed other methods on English paper abstracts. We combine this method with Persian language processing tools and aim to improve its performance. In our experiments, new refinement methods increased the F1 measure by 10.7%.
  8. Keywords: Machine Learning ; Natural Language Processing ; Keyword Extraction ; Unsupervised Learning
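
The three steps named in the abstract (candidate phrases, word clustering into exemplar terms, keyword selection) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the abstract does not name the clustering algorithm, the word representation, or the refinement methods. The sketch assumes co-occurrence count vectors, cosine similarity, and a simple "leader" clustering in which the first word of each cluster serves as its exemplar. The stopword list, the `threshold` and `window` parameters, and all function names are hypothetical. English text is used for readability; a Persian pipeline would replace the tokenizer and stopword list with proper Persian language processing tools.

```python
import re
from collections import Counter, defaultdict
from math import sqrt

# Tiny English stopword list, purely illustrative (assumption, not from the thesis).
STOPWORDS = {"a", "an", "the", "of", "and", "to", "in", "is", "are",
             "for", "on", "with", "by"}


def sentences_as_tokens(text):
    """Split on basic punctuation, then lowercase and tokenize each piece."""
    pieces = re.split(r"[.!?;:,]", text)
    token_lists = [re.findall(r"\w+", piece.lower()) for piece in pieces]
    return [tokens for tokens in token_lists if tokens]


def candidate_phrases(token_lists):
    """Step 1: candidates are maximal runs of consecutive non-stopword tokens."""
    phrases = []
    for tokens in token_lists:
        run = []
        for tok in tokens:
            if tok in STOPWORDS:
                if run:
                    phrases.append(tuple(run))
                run = []
            else:
                run.append(tok)
        if run:
            phrases.append(tuple(run))
    return phrases


def cooccurrence_vectors(token_lists, window=2):
    """Represent each content word by counts of the content words near it."""
    vectors = defaultdict(Counter)
    for tokens in token_lists:
        content = [t for t in tokens if t not in STOPWORDS]
        for i, word in enumerate(content):
            vectors[word]  # ensure isolated words still get an (empty) vector
            for other in content[max(0, i - window): i + window + 1]:
                if other != word:
                    vectors[word][other] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * v[key] for key, count in u.items())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def exemplar_terms(vectors, frequencies, threshold=0.5):
    """Step 2 (assumed leader clustering): visiting words from most to least
    frequent, a word opens a new cluster and becomes its exemplar when its
    similarity to every exemplar chosen so far is below `threshold`."""
    exemplars = []
    for word in sorted(vectors, key=lambda w: (-frequencies[w], w)):
        best = max((cosine(vectors[word], vectors[e]) for e in exemplars),
                   default=-1.0)
        if best < threshold:
            exemplars.append(word)
    return exemplars


def extract_keywords(text, threshold=0.5):
    """Step 3: keep each candidate phrase that contains at least one exemplar."""
    token_lists = sentences_as_tokens(text)
    phrases = candidate_phrases(token_lists)
    frequencies = Counter(t for tokens in token_lists for t in tokens
                          if t not in STOPWORDS)
    exemplars = set(exemplar_terms(cooccurrence_vectors(token_lists),
                                   frequencies, threshold))
    keywords = []
    for phrase in phrases:
        if exemplars & set(phrase) and phrase not in keywords:
            keywords.append(phrase)
    return [" ".join(phrase) for phrase in keywords]


if __name__ == "__main__":
    sample = ("Keyword extraction is used in search and indexing. "
              "Clustering of words finds exemplar terms for keyword extraction.")
    print(extract_keywords(sample))
```

Lowering `threshold` merges more words into each cluster and yields fewer exemplars, so fewer candidate phrases are kept; raising it has the opposite effect. Because every cluster contributes one exemplar, the selected phrases are spread across the clusters rather than concentrated on one topic, which is the coverage property the abstract attributes to clustering.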
