Improved Medical Visual Question Answering using Deep Learning
Javadi Joortani, Maedeh | 2022
- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 57852 (19)
- University: Sharif University of Technology
- Department: Computer Engineering
- Advisor(s): Manzuri, Mohammad Taghi
- Abstract:
- In recent years, advances in powerful neural networks have made transfer learning and self-supervised methods increasingly popular. Attention mechanism modules lend themselves naturally to self-supervised pre-training, and embedding them in recent transformer-based architectures has improved performance on image and text tasks. This study introduces a way to enhance the visual input vectors of the transformer-based multimodal BERT model. It also presents an effective and interpretable architecture that combines convolution and transformer methods and is pre-trained on data from the medical field before being fine-tuned for a medical visual question-answering task. Finally, the enhanced architecture is fine-tuned on a Persian-language dataset. Two evaluation metrics, the arithmetic mean and the harmonic mean, are introduced to compensate for biases in the datasets, and the approach achieves remarkable results on two medical-domain datasets.
- Keywords:
- Neural Networks ; Transfer Learning ; Transformers ; Attention Mechanism ; Self-Supervised Pre-Training ; Visual Question Answering