Automatic Image Annotation by Multi-view Non-negative Matrix Factorization

Rad, Roya | 2017

  1. Type of Document: Ph.D. Dissertation
  2. Language: Farsi
  3. Document No: 49847 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Jamzad, Mansour
  7. Abstract:
  8. The number of digital images has grown rapidly with advances in internet technology. Managing this volume of data requires an efficient system for browsing, categorizing, and searching images. The goal of this research is to design a system that automatically annotates unseen images to improve search in image databases. Automatic image annotation (AIA) is a multi-label classification problem with a large label set, in which a few words are suggested to describe the content of an image. Designing AIA systems faces challenges such as the semantic gap between low-level image features and high-level human expressions (tags), incomplete tags, and an imbalanced number of images per tag in the datasets. Non-negative matrix factorization (NMF) is a process that, given a matrix of non-negative data, finds two non-negative matrices whose product approximates the original matrix. The strategy of NMF is to extract, from a set of feature vectors, a latent semantic space that represents the patterns making up the content of the images more clearly, free of troublesome complexity and noise. Representing the images in this space yields higher-level feature vectors. These new features better describe the structure present in an image collection and the relationships among its features. In this thesis, we design an AIA system using multi-view non-negative matrix factorization. Here, the term “view” refers to a set of feature vectors extracted from an image dataset. The datasets must consist of natural images with tags, some of which are used for training and some for testing. Each training image must have at least one tag, and each tag in the test set must have at least one training image. The tags must also express visual, non-abstract concepts. In this work, four approaches are proposed for handling multiple views in the NMF framework.
In the first approach, all visual views are concatenated and treated as a single view, which is factorized together with the text view so that the representations of images in the latent spaces of these two views are similar. In the second approach, each visual view is treated as an individual view, and images are represented similarly across the latent spaces of these views. In the third approach, to allow a freer factorization, the constraint that the latent spaces share the same dimension is relaxed. In the fourth approach, the views are divided into homogeneous groups, and both the common and the specific parts of the representation in the latent space are considered. After the latent spaces are extracted, the test images are mapped into these spaces, and the distances from each test image to all training images are computed and combined by a weighted average with learned weights to find the nearest neighbors. Based on the tags of these neighbors and their distances, tags are suggested for the test images. Finally, the proposed approaches are evaluated on three well-known image datasets. Comparison on several metrics shows that the proposed methods often achieve better or competitive results relative to state-of-the-art AIA systems in the literature.
  9. Keywords:
  10. Image Retrieval ; Dimension Reduction ; Non-Negative Matrix Factorization (NMF) ; Latent Factors Test ; Automatic Tagging ; Image Annotation ; Automatic Images Annotation (AIA) ; Multiview Data
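
For readers unfamiliar with the factorization step mentioned in the abstract, the following is a minimal single-view sketch of NMF. It is not the dissertation's multi-view method: it assumes a squared-error objective and the standard multiplicative update rules, neither of which this record confirms. The function names (`nmf`, `frobenius_error`, `matmul`, `transpose`) and the toy matrix are illustrative choices, and only the Python standard library is used.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - W H, i.e. the reconstruction error."""
    wh = matmul(w, h)
    return sum(
        (x - y) ** 2 for row_v, row_wh in zip(v, wh) for x, y in zip(row_v, row_wh)
    ) ** 0.5


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate a non-negative n x m matrix V by W (n x rank) times H (rank x m).

    Uses multiplicative updates, which keep W and H non-negative.
    This is an assumed, generic solver, not the thesis's algorithm.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


if __name__ == "__main__":
    # Toy data: 4 "images" described by 3 non-negative features.
    v = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 1.0, 1.0], [0.0, 1.0, 2.0]]
    w, h = nmf(v, rank=2)
    print("reconstruction error:", frobenius_error(v, w, h))
```

In this sketch each row of `w` is the latent representation of one image and `h` holds the latent basis. A multi-view variant along the lines the abstract describes would factorize one such matrix per view while encouraging the per-view image representations to agree.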
