Automatic Difficulty Estimation of Thematic Similarity Multiple-Choice Questions
Akef, Soroosh | 2021
- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 53828 (31)
- University: Sharif University of Technology
- Department: Languages and Linguistics Center
- Advisor(s): Sameti, Hossein; Bokaei, Mohammad Hadi
- Abstract:
- This project was conducted in two related phases. In the first phase, we attempted to write a program capable of answering thematic similarity multiple-choice questions without using any training data. The best performance in this phase was attained by the 25-topic LDA model, using the Hellinger distance between the topic probability distributions of the poetic verses. This model reached an accuracy of 42%, which is very close to the average human performance of 43%. In the second phase, two tasks, seven-class classification and binary classification, were defined based on the p-value of the questions (the proportion of examinees who answer a question correctly). To this end, the questions were first ranked according to the performance of the question answering system and a novel confidence coefficient. The questions were then assigned to classes according to the p-value distribution among the classes. The best performance with this approach was attained by the question answering system based on the 30-topic LDA model, with an F1 score of 24% on the seven-class classification task and an F1 score of 62% on the binary classification task. In this project, we introduced the novel task of answering thematic similarity multiple-choice questions, which can be used as a downstream task to evaluate the next generation of language representation models. Moreover, we demonstrated that a correlation between machine performance and human performance must not be assumed without prior verification, and that effort needs to be put into finding a model whose behavior resembles that of a human.
- Keywords:
- Latent Dirichlet Allocation (LDA) ; Question Answering ; Educational Technology ; Poem Processing ; Poetry ; Question Difficulty Estimation ; Bidirectional Encoder Representations from Transformers (BERT) Model
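
The abstract describes choosing an answer by comparing topic distributions of verses with the Hellinger distance. The following is a minimal sketch of that idea, not the thesis's actual implementation: the function names `hellinger` and `pick_most_similar` are hypothetical, the three-topic distributions are invented toy values (the thesis reports 25 and 30 topics), and the rule "select the option with the smallest distance to the stem" is an assumption about how the distance is used.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions.

    Returns a value in [0, 1]: 0 for identical distributions,
    1 for distributions with disjoint support.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    squared_diffs = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared_diffs) / math.sqrt(2)


def pick_most_similar(stem, options):
    """Return the index of the option whose distribution is closest to the stem.

    Assumption: the answer is the option with the smallest Hellinger distance.
    """
    distances = [hellinger(stem, option) for option in options]
    return min(range(len(options)), key=distances.__getitem__)


# Toy topic distributions (invented for illustration, 3 topics only).
stem_topics = [0.7, 0.2, 0.1]
option_topics = [
    [0.1, 0.1, 0.8],
    [0.6, 0.3, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.7, 0.1],
]

chosen = pick_most_similar(stem_topics, option_topics)
print(chosen)  # index of the selected option
```

In practice the distributions would come from an LDA model fitted on a verse corpus; the sketch only covers the comparison step.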