Multidocument Keyphrase Extraction Using Recurrent Neural Networks
Doostmohammadi, Ehsan | 2019
- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 51848 (31)
- University: Sharif University of Technology
- Department: Languages and Linguistics Center
- Advisor(s): Sameti, Hossein; Bokaei, Mohammad Hadi
- Abstract:
- Keyphrase extraction, an important open problem in Natural Language Processing (NLP), is useful both as a stand-alone task in Information Extraction and as an upstream task for Information Retrieval, text summarization, classification, and similar applications. In this study, motivated by the needs of Persian NLP, artificial neural networks are used to extract keyphrases from single documents, and a graph-based re-scoring method is proposed for multidocument keyphrase extraction. The proposed method for extracting keyphrases from multiple documents consists of two steps: (1) extracting the keyphrases of each document in a cluster using a sequence-to-sequence model with attention, and (2) re-scoring the extracted keyphrases using an unsupervised graph-based method so that keyphrases related to all of the documents score higher. The main obstacle to using neural networks is their need for a large amount of training data, which is addressed here by using relatively high-quality keyphrases from news websites and agencies. A separate corpus of 101 news clusters was additionally labeled to measure performance in the multidocument phase. Since sequence-to-sequence models can produce absent keyphrases, that is, keyphrases that do not appear verbatim in the text, the problem of keyphrase generation is addressed in this research as well. In the single-document phase, the deep model obtained an F1-score of 50.59%, while the best baseline model achieved only 21.73%. The deep model also performed well on the keyphrase generation task. The proposed re-scoring method resulted in a 4.1% increase in F1-score in the multidocument phase at k = 10.
- Keywords:
- Multidocument Keyphrase Extraction ; Keyphrase Extraction ; Keyphrase Generation ; Recurrent Neural Networks ; Sequence to Sequence Learning ; Deep Learning
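The re-scoring step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the graph is built, so everything below is an assumption made for illustration and not the thesis's algorithm: the function names `jaccard` and `rescore_cluster` are hypothetical, the graph is a simple phrase-to-document bipartite graph, edge weights are token-level Jaccard overlaps, and the new score is the phrase's best model score multiplied by its normalised weighted degree.

```python
def jaccard(a, b):
    """Token-level Jaccard overlap between two phrases (assumed similarity)."""
    ta, tb = set(a.split()), set(b.split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def rescore_cluster(doc_keyphrases):
    """Re-score keyphrases so that cluster-wide phrases rank higher.

    doc_keyphrases: one dict per document, mapping phrase -> model score
    (for example, scores from a single-document extractor).
    Returns (phrase, score) pairs sorted by descending re-scored value.
    """
    if not doc_keyphrases:
        return []
    candidates = {p for doc in doc_keyphrases for p in doc}
    rescored = {}
    for phrase in candidates:
        # Edge weight phrase -> document: best overlap with any phrase
        # extracted from that document (1.0 for an exact match).
        edges = [
            max((jaccard(phrase, q) for q in doc), default=0.0)
            for doc in doc_keyphrases
        ]
        # Weighted degree of the phrase node, normalised to [0, 1].
        coverage = sum(edges) / len(doc_keyphrases)
        # Best single-document score the phrase received.
        base = max(doc.get(phrase, 0.0) for doc in doc_keyphrases)
        rescored[phrase] = base * coverage
    # Sort by score, breaking ties alphabetically for determinism.
    return sorted(rescored.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy cluster of three documents with made-up scores.
cluster = [
    {"interest rate": 0.9, "central bank": 0.8, "football": 0.85},
    {"interest rate": 0.7, "bank rate": 0.6},
    {"interest rate": 0.8, "central bank": 0.5},
]
for phrase, score in rescore_cluster(cluster):
    print(f"{phrase}: {score:.3f}")
```

In this toy cluster, "interest rate" appears in every document and stays at the top, while "football" scores highly in one document but drops to the bottom because it is unrelated to the other two.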
Book Contents
- Introduction and Overview
- Research Background and Theoretical Discussion
- Proposed Method
- Data Analysis
- Experiments and Results
- Conclusion and Suggestions
- Bibliography
- Persian-to-English Glossary
- English-to-Persian Glossary
- Appendix: Detailed Results