
Image Annotation Using Semi-supervised Learning

Amiri, Hamid | 2015

  1. Type of Document: Ph.D. Dissertation
  2. Language: Farsi
  3. Document No: 47311 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Jamzad, Mansour
  7. Abstract:
  8. Automatic image annotation, which assigns labels to input images and thereby provides a textual description of their contents, has become an active field in the machine vision community. To design an annotation system, we need a dataset that contains images together with their labels. However, a large amount of manual effort is required to annotate all images in a dataset. One way to reduce the dependence of annotation systems on labeled images is to exploit the useful information embedded in unlabeled images and incorporate it into the learning process. In the machine learning community, semi-supervised learning (SSL) has been introduced with the aim of incorporating unlabeled samples into the training phase of a classifier. In this research, we propose novel approaches for semi-supervised image annotation. In the first step, we use semi-supervised generative models. To this end, images with similar contents are categorized into a semantic class, which is called a concept in this research. We then propose an approach that constructs a generative model for each concept in two main steps. First, a generative model is constructed for each concept based on the labeled images in that concept. Second, the unlabeled images are incorporated using a modified EM algorithm that updates the parameters of the generative models. In the next step of this research, we focus on semi-supervised graph-based learning for image annotation. Conventional graph-based image annotation methods integrate various features into a single descriptor and consider one node for each descriptor on the learning graph. However, such a graph does not capture the information of the individual features, making it unsuitable for propagating the labels of annotated images. To overcome this problem, we treat each visual feature as an independent modality, resulting in a multi-modal representation of images.
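The two-step generative approach described above (fit a per-concept model on labeled images, then refine it with unlabeled images via EM) can be illustrated with a minimal sketch. This is an assumption-laden toy, not the dissertation's modified EM algorithm: it uses a single 1-D Gaussian per concept, and the function names and data layout are invented for illustration.

```python
import numpy as np

def gaussian_pdf(x, mu, var):
    """Density of N(mu, var) evaluated at x (x may be a NumPy array)."""
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

def semi_supervised_em(labeled, unlabeled, n_iter=30):
    """labeled: dict mapping concept name -> 1-D NumPy array of labeled features.
    unlabeled: 1-D NumPy array of unlabeled features.
    Returns per-concept (mean, variance), refined with the unlabeled data."""
    concepts = sorted(labeled)
    n_lab = sum(len(v) for v in labeled.values())
    # Step 1: initialize each concept model from its labeled images only.
    params = {c: (labeled[c].mean(), labeled[c].var() + 1e-6) for c in concepts}
    priors = {c: len(labeled[c]) / n_lab for c in concepts}
    for _ in range(n_iter):
        # E-step: softly assign each unlabeled sample to the concepts.
        lik = np.array([priors[c] * gaussian_pdf(unlabeled, *params[c])
                        for c in concepts])
        resp = lik / lik.sum(axis=0, keepdims=True)  # (n_concepts, n_unlabeled)
        # M-step: update each concept with its labeled samples (weight 1)
        # plus the soft-weighted unlabeled samples.
        for i, c in enumerate(concepts):
            w = resp[i]
            n_eff = len(labeled[c]) + w.sum()
            mu = (labeled[c].sum() + (w * unlabeled).sum()) / n_eff
            var = (((labeled[c] - mu) ** 2).sum()
                   + (w * (unlabeled - mu) ** 2).sum()) / n_eff
            params[c] = (mu, var + 1e-6)
            priors[c] = n_eff / (n_lab + len(unlabeled))
    return params
```

With only a handful of labeled samples per concept, the unlabeled pool pulls the estimated means and variances toward the true concept distributions, which is the effect the semi-supervised step relies on.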
To combine the visual modalities efficiently, a specific subgraph is constructed for each modality, and the subgraphs are then connected to each other to form a supergraph. We aim to conduct label propagation on this supergraph. However, the size of the supergraph grows linearly with the number of visual features, so the large computational complexity of label propagation on the supergraph must be handled. To this end, we extract a set of prototypes from the feature vectors of the images and incorporate them into the supergraph construction. The learning process is then conducted on the prototypes instead of on a large number of feature vectors. We therefore formulate the learning framework so that it infers the labels of the prototypes; finally, the labels of the images are reconstructed from the labels of the prototypes. With this approach, we reach a scalable framework for graph-based image annotation. To evaluate the proposed approaches, we conduct experiments on five standard datasets and compute precision and recall metrics. Our experiments reveal that semi-supervised learning techniques improve the performance of annotation systems. Moreover, in comparison to other semi-supervised image annotation methods, our approaches achieve higher precision and recall for annotating input images.
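The prototype idea above (propagate labels over a small set of prototypes, then reconstruct per-image labels) can be sketched as follows. This is a simplified single-modality illustration, not the dissertation's supergraph formulation: the prototype extraction (farthest-first seeding plus k-means refinement), the Zhou-style normalized-graph propagation, and all function names are assumptions made for the example.

```python
import numpy as np

def rbf(A, B, sigma=1.0):
    """Gaussian affinity between the row vectors of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def extract_prototypes(X, k, n_iter=20):
    """Deterministic farthest-first seeding followed by k-means refinement."""
    idx = [0]
    for _ in range(k - 1):
        d = ((X[:, None] - X[idx][None]) ** 2).sum(-1).min(1)
        idx.append(int(d.argmax()))
    P = X[idx].astype(float).copy()
    for _ in range(n_iter):
        assign = ((X[:, None] - P[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if (assign == j).any():
                P[j] = X[assign == j].mean(0)
    return P

def prototype_label_propagation(X, Y, n_proto=4, alpha=0.9, sigma=1.0):
    """X: (n, d) image features; Y: (n, c) label matrix (all-zero row = unlabeled).
    Propagates labels over n_proto prototypes instead of all n images."""
    P = extract_prototypes(X, n_proto)
    # Soft assignment of every image to the prototypes.
    Z = rbf(X, P, sigma)
    Z = Z / Z.sum(1, keepdims=True)
    # Pull the labels of the annotated images onto the prototypes.
    Yp = Z.T @ Y
    # Normalized affinity graph over the prototypes, closed-form propagation.
    W = rbf(P, P, sigma)
    np.fill_diagonal(W, 0.0)
    d = W.sum(1)
    S = W / (np.sqrt(np.outer(d, d)) + 1e-12)
    Fp = np.linalg.solve(np.eye(n_proto) - alpha * S, (1 - alpha) * Yp)
    # Reconstruct per-image labels from the prototype labels.
    return Z @ Fp
```

The linear system solved here is of size `n_proto`, not `n`, which is what makes the prototype formulation scale: the cost of propagation no longer grows with the number of images or, in the multi-modal setting, with the number of visual features.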
  9. Keywords:
  10. Scalability ; Semi-Supervised Learning ; Generative Model ; Graph-Based Learning ; Image Annotation ; Multi-Modal Representation
