Image Annotation Using Semi-supervised Learning

Amiri, Hamid; Jamzad, Mansour

Amiri, Hamid | 2015

Type of Document: Ph.D. Dissertation
Language: Farsi
Document No: 47311 (19)
University: Sharif University of Technology
Department: Computer Engineering
Advisor(s): Jamzad, Mansour
Abstract:
Automatic image annotation, which assigns labels to input images and thereby provides a textual description of their contents, has become an active field in the machine vision community. Designing an annotation system requires a dataset of images together with their labels, but annotating every image in a dataset demands a large amount of manual effort. One way to reduce the dependence of annotation systems on labeled images is to exploit the useful information embedded in unlabeled images and incorporate it into the learning process. In the machine learning community, semi-supervised learning (SSL) was introduced with the aim of incorporating unlabeled samples into the training phase of a classifier. In this research, we propose novel approaches for semi-supervised image annotation. In the first step, we use semi-supervised generative models. To this end, images with similar contents are grouped into a semantic class, which we call a concept. We then propose an approach that constructs a generative model for each concept in two main steps. First, a generative model is constructed for each concept from the labeled images in that concept. Second, the unlabeled images are incorporated using a modified EM algorithm that updates the parameters of the generative models. In the next step of this research, we focus on semi-supervised graph-based learning for image annotation. Conventional graph-based image annotation methods integrate various features into a single descriptor and assign one node to each descriptor in the learning graph. However, such a graph does not capture the information carried by individual features, making it unsuitable for propagating the labels of annotated images. To overcome this problem, we treat each visual feature as an independent modality, resulting in a multi-modal representation of images.
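The two-step generative approach described above (an initial model per concept from labeled images, then an EM update that brings in unlabeled images) can be illustrated with a minimal sketch. This is not the dissertation's modified EM algorithm: it assumes one diagonal Gaussian per concept, and the function names `m_step`, `log_joint`, `semi_supervised_em`, and `predict` are hypothetical. Every concept is assumed to have at least one labeled example.

```python
import numpy as np

def m_step(X, R, eps=1e-6):
    """Re-estimate priors, means, and diagonal variances from responsibilities R (n, K)."""
    Nk = R.sum(axis=0) + eps                       # soft count of samples per concept
    priors = Nk / Nk.sum()
    means = (R.T @ X) / Nk[:, None]
    var = np.maximum((R.T @ X**2) / Nk[:, None] - means**2, eps)
    return priors, means, var

def log_joint(X, priors, means, var):
    """log p(x, concept k) under diagonal Gaussians (an assumption), shape (n, K)."""
    diff = X[:, None, :] - means[None, :, :]
    ll = -0.5 * (np.log(2 * np.pi * var)[None] + diff**2 / var[None]).sum(axis=2)
    return ll + np.log(priors)[None]

def semi_supervised_em(X_lab, y_lab, X_unl, n_classes, n_iter=25):
    # Step 1: initial model from labeled data only (one-hot responsibilities).
    R_lab = np.eye(n_classes)[y_lab]
    params = m_step(X_lab, R_lab)
    X_all = np.vstack([X_lab, X_unl])
    # Step 2: EM over labeled + unlabeled data; labeled responsibilities stay fixed.
    for _ in range(n_iter):
        lj = log_joint(X_unl, *params)
        lj -= lj.max(axis=1, keepdims=True)        # stabilise before exponentiating
        R_unl = np.exp(lj)
        R_unl /= R_unl.sum(axis=1, keepdims=True)  # E-step: posterior per concept
        params = m_step(X_all, np.vstack([R_lab, R_unl]))  # M-step
    return params

def predict(X, params):
    """Assign each sample to the concept with the highest joint log-probability."""
    return log_joint(X, *params).argmax(axis=1)
```

Keeping the labeled responsibilities fixed is one simple way to stop unlabeled data from overriding known labels; the dissertation may weight the two sources differently.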
To combine the visual modalities efficiently, a separate subgraph is constructed for each modality, and the subgraphs are then connected to each other to form a supergraph. We aim to conduct label propagation on the supergraph. However, the size of the supergraph grows linearly with the number of visual features, so it is essential to handle the high computational complexity of label propagation on it. To this end, we extract prototypes from the feature vectors of images and incorporate them into the supergraph construction. The learning process is then conducted on the prototypes instead of on a large number of feature vectors. Accordingly, we formulate the learning framework so that it first infers the labels of the prototypes. Finally, the labels of images are reconstructed from the labels of the prototypes. With this approach, we obtain a scalable framework for graph-based image annotation. To evaluate the proposed approaches, we conduct experiments on five standard datasets and compute precision and recall. Our experiments show that semi-supervised learning techniques improve the performance of annotation systems. Moreover, in comparison with other semi-supervised image annotation methods, our approaches achieve higher precision and recall in annotating input images.
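The idea of learning labels on prototypes and then reconstructing image labels from them can be sketched for a single modality. The sketch below follows the general style of anchor-graph regularization rather than the dissertation's multi-modal supergraph formulation; the k-means prototype extraction, the Gaussian weights, the parameters `m`, `s`, `sigma`, `gamma`, and all function names are assumptions made for illustration.

```python
import numpy as np

def kmeans(X, m, n_iter=10, seed=0):
    """Plain Lloyd iterations; returns m prototype vectors (assumed prototype extractor)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=m, replace=False)]
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = d2.argmin(axis=1)
        for j in range(m):
            pts = X[assign == j]
            if len(pts):
                centers[j] = pts.mean(axis=0)
    return centers

def sample_to_prototype_weights(X, centers, s=3, sigma=1.0):
    """Row-stochastic matrix Z (n, m): each sample is tied to its s nearest prototypes."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :s]
    rows = np.arange(len(X))[:, None]
    d_near = d2[rows, nearest]
    # Subtracting the row minimum does not change the normalised weights.
    w = np.exp(-(d_near - d_near.min(axis=1, keepdims=True)) / (2 * sigma**2))
    Z = np.zeros_like(d2)
    Z[rows, nearest] = w / w.sum(axis=1, keepdims=True)
    return Z

def propagate_via_prototypes(X, labeled_idx, y_lab, n_classes, m=20, gamma=0.1):
    centers = kmeans(X, m)
    Z = sample_to_prototype_weights(X, centers)
    ZtZ = Z.T @ Z
    lam = np.maximum(Z.sum(axis=0), 1e-12)         # prototype degrees
    L_red = ZtZ - ZtZ @ (ZtZ / lam[:, None])       # reduced (m x m) graph Laplacian
    Z_l = Z[labeled_idx]
    Y_l = np.eye(n_classes)[y_lab]
    # Solve an m x m system for prototype labels A (small ridge keeps it invertible).
    A = np.linalg.solve(Z_l.T @ Z_l + gamma * L_red + 1e-8 * np.eye(m), Z_l.T @ Y_l)
    F = Z @ A                                      # reconstruct sample scores from prototypes
    return F.argmax(axis=1)
```

The linear system has size m x m, independent of the number of images, which is the source of the scalability the abstract refers to.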
Keywords:
Scalability ; Semi-Supervised Learning ; Generative Model ; Graph-Based Learning ; Image Annotation ; Multi-Modal Representation

Table of Contents

Abstract
Introduction
- Problem definition and its importance
- Research framework and contributions
- Structure of the dissertation
Review of previous work
- Semi-supervised learning
  - Semi-supervised generative models
  - Graph-based learning
    - General framework of graph-based learning
    - Neighborhood graph construction methods
  - Scalable semi-supervised learning
    - Nonparametric inference method
    - Prototype vector machine method
    - Anchor graph method
- Automatic image annotation
  - Supervised image annotation
    - One-versus-all supervised learning
    - Latent-variable-based models
    - Supervised multi-class learning
    - Search-based methods
  - Semi-supervised image annotation
    - Nearest spanning chain
    - Semi-supervised multiple-instance learning
    - Annotation with a sparse kNN graph
    - Bipartite graph of images and labels
    - Fast label propagation with a semantic graph
    - Multi-modal label propagation
- Summary
Image annotation using semi-supervised generative models
- Overview of spectral clustering
- Structure of the image database
- Feature extraction
- Prototype extraction
- Building the initial model
  - Extracting the initial prototypes
  - Fitting a probability distribution to each cluster
  - Combining the probability distributions
- Objective function for parameter estimation
- Incorporating unlabeled data
  - Computing posterior probabilities
  - Updating the parameters of the cluster probability distributions
  - Filtering out irrelevant signatures
  - Estimating the parameters of the initial model
  - Proof of the parameter estimation equations
- Image annotation algorithm
- Implementation results
  - Parameter selection
  - Evaluation of the supervised learning framework
  - Evaluation of the semi-supervised learning framework
  - Sensitivity analysis of the proposed method to segmentation
  - Sensitivity analysis of the proposed method to its parameters
  - Comparison with other methods
  - Applying the proposed method to image-level annotation datasets
- Summary
Semi-supervised image annotation on a supergraph
- Overview of the proposed method
- Supergraph construction
  - Data graph construction
    - Distance-level fusion
    - Graph-level fusion
  - Incorporating semantic labels
- Inference on the supergraph
- Time complexity analysis
- Score fusion and annotation
- Implementation results
  - Feature vectors and image datasets
  - Results on the Pascal VOC dataset
  - Results on the Corel5k dataset
  - Results on the IAPR TC-12 dataset
  - Distance-level fusion versus graph-level fusion
- Summary
Scalable semi-supervised annotation on a supergraph
- Mathematical notation and abbreviations
- Supergraph construction
  - Forming the data graph
    - Building the sample graph
    - Forming the prototype graph
    - Connections between the sample and prototype subgraphs
    - Determining the edges between prototype subgraphs
  - Incorporating semantic labels
  - The supergraph from a manifold learning perspective
- Inference on the supergraph
- Annotation procedure
- Time complexity analysis
- Implementation results
  - Datasets and feature vectors
  - Parameter selection
  - Results on the Corel5k dataset
  - Results on the IAPR TC-12 dataset
  - Results on the NUS-WIDE-LITE dataset
  - Evaluation of computational cost
  - Sensitivity analysis of the proposed method with respect to its parameters
- Summary
Conclusion and future work
- Future directions
References
English-to-Farsi glossary
Farsi-to-English glossary