Persian pronoun resolution using data driven approaches
Nourbakhsh, A ; Sharif University of Technology | 2017
- Type of Document: Article
- DOI: 10.1007/978-3-319-67642-5_48
- Publisher: Springer Verlag , 2017
- Abstract:
- Pronoun resolution is one of the challenges of natural language processing (NLP). Proposed solutions range from heuristic rule-based methods to data-driven machine learning approaches. In this article, we build on a previous machine learning approach to Persian pronoun anaphora resolution. The primary goal of this paper is to improve its results, mainly by extracting more balanced data through heuristic rules in instance sampling and by using more relevant features in classification. Using the PCAC2008 dataset, we consider noun phrase structure as a way to extract more suitable training data. The incorporated features include syntactic and semantic information. Finally, we train and test different classifiers and compare their results. The best result is achieved with the C4.5 decision tree classifier. The results show a significant improvement over the previous work, achieving a 75% F-measure compared to 45%. An analysis of the extracted features and their contribution is also presented. © Springer International Publishing AG 2017
- Keywords:
- Coreference resolution ; Machine learning algorithms ; Persian pronouns ; Pronoun resolution ; Artificial intelligence ; Computational grammars ; Data mining ; Decision trees ; Learning systems ; Natural language processing systems ; Semantics ; Anaphora resolution ; C4.5 Decision tree classifier ; Co-reference resolutions ; Data-driven approach ; Machine learning approaches ; Persians ; Semantic information ; Learning algorithms
- Source: 23rd International Conference on Information and Software Technologies, ICIST 2017, 12 October 2017 through 14 October 2017 ; Volume 756 , 2017 , Pages 574-585 ; 18650929 (ISSN); 9783319676418 (ISBN)
- URL: https://link.springer.com/chapter/10.1007%2F978-3-319-67642-5_48
