- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 48345 (31)
- University: Sharif University of Technology
- Department: Languages and Linguistics Center
- Advisor(s): Bahrani, Mohammad
- Abstract:
- Pronoun resolution is one of the challenges of natural language processing. Proposed solutions range from heuristic rule-based methods to data-driven machine learning approaches. In this thesis, we built on a previous machine-learning-based work on Persian pronominal anaphora resolution. The primary goal was to improve its results, mainly by extracting more balanced data and by adding more features to the feature vectors used in classification. Using the PCAC2008 dataset, we considered noun phrase structure as a way to extract more suitable training data. The features added to the extracted data include syntactic and semantic features. We then trained and tested different machine learning classifiers in order to compare their results. The best result was achieved using the C4.5 decision tree algorithm. It showed a significant improvement over the previous work, achieving an F-measure of 75% compared to 45%. An analysis of the extracted features and their contribution is also discussed.
- Keywords:
- Natural Language Processing ; Coreference Resolution ; Machine Learning ; Pronoun Resolution ; Pronoun