A Self-Tag Rectifier Model for Automatic Image Annotation

Ghostan Khatchatoorian, Artin | 2020

  1. Type of Document: Ph.D. Dissertation
  2. Language: Farsi
  3. Document No: 53428 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Jamzad, Mansour; Beigy, Hamid
  7. Abstract:
  8. Automatic image annotation is an image retrieval mechanism that extracts relevant semantic tags from visual content. The number of digital images uploaded online grows rapidly every day, and most of those images are not assigned proper tags or labels. Although automatic image annotation methods have been developed to assign proper tags to images, most of them assign some irrelevant tags and sometimes miss a few relevant ones. So far, newly developed automatic image annotation methods have improved on previous methods by only about one or two percent in F1-score. To reach much better performance, we analyzed most of the state-of-the-art image annotation methods. Each method has its own perspective on the problem and focuses on a small part of it, yet their architectures seem to have room for improvement. Continuing to develop similar approaches will therefore most likely yield more or less the same results. Another challenge in this field is imbalanced datasets and the difficulty of learning tags from them. Even if a nearly balanced dataset existed for image annotation, a single learner would be unlikely to learn all tags with the same accuracy. To overcome these challenges, this thesis designs a detailed general-purpose architecture that allows researchers in the annotation field to improve the performance and accuracy of the different models in their annotation methods. The proposed architecture has three primary parts: feature extraction, learning, and annotation. In feature extraction, we use a deep feature vector. In learning, we define a clustering model that reduces the Hamming loss by learning each cluster instead of each tag. To improve learning performance, we draw on both machine learning and probabilistic foundations.
Our proposed learning model has two learners: a set of SVM classifiers and a tag categorization algorithm. In the annotation part, we introduce the novel idea of post-rectifying methods, which aim to remove irrelevant tags from the annotation result and, where possible, replace them with relevant ones. The post-rectifying methods are independent of feature vectors, datasets, and annotation methods. To address the imbalanced-dataset challenge, we propose a novel integration system that selects an elite group of models from all existing annotation models and combines them to take advantage of each model’s learning technique. In this way, we can learn from the training datasets of those models without needing direct access to them. Because the algorithm is independent of annotation models and datasets, it can combine currently available annotation models with those developed in the future, along with their datasets and learning models. Together, the proposed architecture, algorithms, and ideas set new F1-score milestones on the most commonly used datasets. The N+ measure, which counts the tags with non-zero recall, shows that the proposed architecture recalls all tags in the IAPRTC-12 and ESP-Games datasets. The architecture reaches F1-scores of 0.5934, 0.5439, and 0.4935 on the Corel5k, IAPRTC-12, and ESP-Games datasets, respectively, which are the highest scores reported so far. We believe the ideas suggested in this thesis could serve as an integration ground for automatic image annotation models.
  9. Keywords:
  10. Image Annotation ; Learning ; Image Retrieval ; Integrated System ; Integration Mechanisms ; Automatic Images Annotation (AIA) ; Annotation Post Rectifying ; Annotation Architecture
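
The abstract reports results in terms of F1-score, the N+ measure, and Hamming loss without defining them. The following is a minimal sketch of how these measures are commonly computed in the image annotation literature, assuming per-tag precision and recall are averaged over tags before F1 is taken; the dissertation's exact evaluation protocol may differ. The function name `annotation_metrics` and the toy data are hypothetical and serve only as illustration.

```python
def annotation_metrics(y_true, y_pred):
    """Compute mean per-tag precision/recall, F1, N+ and Hamming loss.

    y_true, y_pred: lists of 0/1 lists with shape (images, tags).
    Assumption: F1 is the harmonic mean of the tag-averaged precision
    and recall, a common convention that may not match the thesis.
    """
    n_images, n_tags = len(y_true), len(y_true[0])
    precisions, recalls = [], []
    n_plus = 0        # number of tags with non-zero recall
    mismatches = 0    # image/tag cells where prediction and truth differ

    for t in range(n_tags):
        truth = [row[t] for row in y_true]
        pred = [row[t] for row in y_pred]
        tp = sum(1 for a, b in zip(truth, pred) if a and b)
        mismatches += sum(1 for a, b in zip(truth, pred) if a != b)
        precision = tp / sum(pred) if sum(pred) else 0.0
        recall = tp / sum(truth) if sum(truth) else 0.0
        precisions.append(precision)
        recalls.append(recall)
        if recall > 0:
            n_plus += 1

    p = sum(precisions) / n_tags
    r = sum(recalls) / n_tags
    f1 = 2 * p * r / (p + r) if (p + r) else 0.0
    hamming = mismatches / (n_images * n_tags)
    return {"precision": p, "recall": r, "f1": f1,
            "n_plus": n_plus, "hamming_loss": hamming}


# Toy example: 3 images, 3 tags (made-up data, not from the thesis).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0]]
metrics = annotation_metrics(y_true, y_pred)
print(metrics)
```

In this toy example the third tag is never predicted, so its recall is zero and N+ is 2 out of 3 tags; recalling all tags, as the abstract reports for IAPRTC-12 and ESP-Games, would mean N+ equals the vocabulary size.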
