Variants of vector space reductions for predicting the compositionality of English noun compounds

Alipoor, P ; Sharif University of Technology | 2020

  1. Type of Document: Article
  2. Publisher: European Language Resources Association (ELRA), 2020
  3. Abstract: Predicting the degree of compositionality of noun compounds such as snowball and butterfly is a crucial ingredient for lexicography and Natural Language Processing applications, to know whether the compound should be treated as a whole, or through its constituents, and what it means. Computational approaches for an automatic prediction typically represent and compare compounds and their constituents within a vector space and use distributional similarity as a proxy to predict the semantic relatedness between the compounds and their constituents as the compound's degree of compositionality. This paper provides a systematic evaluation of vector-space reduction variants across kinds, exploring reductions based on part-of-speech next to and also in combination with Principal Components Analysis using Singular Value Decomposition, and word2vec embeddings. We show that word2vec and nouns-only dimensionality reductions are the most successful and stable vector space reduction variants for our task. © European Language Resources Association (ELRA), licensed under CC-BY-NC
  4. Keywords: Compositionality; Noun Compounds; Dimensionality reduction; Forecasting; Functional analysis; Natural language processing systems; Semantics; Singular value decomposition; Vectors; Automatic prediction; Computational approach; Distributional similarities; Natural language processing applications; Principal components analysis; Semantic relatedness; Space reductions; Systematic evaluation; Vector spaces
  5. Source: 12th International Conference on Language Resources and Evaluation, LREC 2020, 11 May 2020 through 16 May 2020; 2020, Pages 4379-4387
  6. URL: https://www.aclweb.org/anthology/2020.lrec-1.539
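
Illustrative sketch (not part of the record): the abstract describes comparing a compound with its constituents in a vector space, optionally after a PCA reduction computed with SVD, and reading the similarity as a compositionality score. The Python snippet below is a minimal sketch of that general idea only. The count matrix is random toy data, and the names `cosine`, `pca_reduce` and `compositionality` are hypothetical, not taken from the paper. The part-of-speech-based reductions and word2vec embeddings that the paper also evaluates are not shown.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0


def pca_reduce(matrix, k):
    """Project the rows of `matrix` onto the top-k principal components via SVD."""
    centered = matrix - matrix.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k] * s[:k]


def compositionality(vectors, compound, constituents):
    """Similarity of a compound to each constituent, used as a compositionality proxy."""
    return {c: cosine(vectors[compound], vectors[c]) for c in constituents}


# Toy data (assumption): random counts stand in for real co-occurrence statistics.
words = ["snowball", "snow", "ball", "butterfly", "butter", "fly"]
rng = np.random.default_rng(0)
counts = rng.poisson(3.0, size=(len(words), 20)).astype(float)

reduced = pca_reduce(counts, k=3)  # 20 context dimensions reduced to 3
vectors = dict(zip(words, reduced))
scores = compositionality(vectors, "snowball", ["snow", "ball"])
print(scores)
```

With real co-occurrence vectors in place of the random counts, a higher score for a constituent would be read as a stronger semantic contribution of that constituent to the compound.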