Automatic Difficulty Estimation of Thematic Similarity Multiple-Choice Questions

Akef, Soroosh; Sameti, Hossein; Bokaei, Mohammad Hadi

Akef, Soroosh | 2021

Type of Document: M.Sc. Thesis
Language: Farsi
Document No: 53828 (31)
University: Sharif University of Technology
Department: Languages and Linguistics Center
Advisor(s): Sameti, Hossein; Bokaei, Mohammad Hadi
Abstract:
This project was conducted in two related phases. In the first phase, we attempted to write a program capable of answering thematic similarity multiple-choice questions without using any training data. The best performance in this phase was attained by the 25-topic LDA model using the Hellinger distance between the probability distributions of the poetic verses. This model reached an accuracy of 42%, which is very close to the average human performance of 43%. In the second phase, two tasks, seven-class classification and binary classification, were defined based on the p-value of the questions. To this end, the questions were first ranked according to the performance of the question answering system and a novel confidence coefficient. The questions were then classified according to the p-value distribution among the classes. The best performance with this approach was attained by the question answering system based on the 30-topic LDA model, with an F1 score of 24% on the seven-class classification task and an F1 score of 62% on the binary classification task. In this project, we introduced the novel task of answering thematic similarity multiple-choice questions, which can be used as a downstream task to evaluate the next generation of language representation models. Moreover, we demonstrated that a correlation between machine performance and human performance must not be assumed without prior verification, and that effort needs to be put into finding a model whose behavior resembles that of a human.
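As an illustration of the first phase, the following is a minimal sketch of how the Hellinger distance between topic distributions could be used to pick an answer. It assumes that each verse has already been mapped to a topic distribution (for example by an LDA model) and that the chosen option is the one closest to the question verse; the abstract does not spell out this selection rule, and the names `hellinger` and `pick_answer` as well as the 3-topic vectors are hypothetical.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


def pick_answer(stem, options):
    """Index of the option whose distribution is closest to the stem's.

    Assumption: the thematically most similar verse is the one with the
    smallest Hellinger distance to the question verse.
    """
    distances = [hellinger(stem, option) for option in options]
    return min(range(len(options)), key=distances.__getitem__)


# Hypothetical 3-topic distributions, made up for illustration only.
stem = [0.7, 0.2, 0.1]
options = [
    [0.1, 0.1, 0.8],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.4],
]
best = pick_answer(stem, options)
```

The distance is 0 for identical distributions and 1 for distributions with disjoint support, so smaller values indicate closer thematic content under this assumption.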
Keywords:
Latent Dirichlet Allocation (LDA) ; Question Answering ; Educational Technology ; Poem Processing ; Poetry ; Question Difficulty Estimation ; Bidirectional Encoder Representations from Transformers (BERT) Model
