Self-Supervised Image Representation Learning

Aghababazadeh, Arash | 2021

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 54329 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Kasaei, Shohreh
  7. Abstract:
  8. Self-supervised learning is a method to reduce the need for large labeled datasets in supervised learning. In self-supervised learning, the goal is to design a pretext task that can be trained without any labels. This pretext task results in learning a representation of data that can reduce the need for labels when used for different tasks. In the domain of images, data augmenting transformations, which are often a composition of simple transformations such as random cropping and color jitter, have been used for the design of pretext tasks. These simple transformations can cause information loss in some datasets, which limits the usage of the learned representations for various downstream tasks. A systematic approach to designing data augmenting transformations is therefore required. In this research, disentangled data augmenting transforms are proposed under the assumption that images are generated by a two-step generative process: first, a set of meaningful factors of variation is generated; then, images are generated conditioned on those factors of variation. A disentangled data augmenting transform generates an output image given an input image and an arbitrary factor of variation, in such a way that the input and output images share that factor of variation. An adversarial loss is proposed to train the disentangled data augmenting transforms, alongside two regularization terms to ensure disentanglement. The proposed model is evaluated on four benchmark datasets (Dsprites, Scream-Dsprites, 3D Shapes, and CelebA) using the $\beta$-VAE Score and the Factor Score. On the Dsprites dataset, the proposed method outperforms previous methods by 4.6% and 8.6% as evaluated by the $\beta$-VAE Score and the Factor Score, respectively, and gives competitive results on the other datasets. The scalability of the model is tested by training on the CelebA dataset, and qualitative results demonstrate the disentangled augmentation of data.
  9. Keywords:
  10. Self-Supervised Learning ; Deep Learning ; Computer Vision ; Image Representation Learning ; Supervised Learning
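
The abstract mentions that pretext tasks are commonly built from compositions of simple transformations such as random cropping and color jitter. A minimal NumPy sketch of such a composition, producing two augmented views of one image as contrastive pretext tasks do, is shown below; the function names and parameter values are illustrative, not the thesis's implementation:

```python
import numpy as np

def random_crop(img, size, rng):
    """Randomly crop a square patch of side `size` from an HxWxC image."""
    h, w = img.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return img[top:top + size, left:left + size]

def color_jitter(img, rng, strength=0.4):
    """Randomly rescale brightness and contrast; values stay in [0, 1]."""
    brightness = 1.0 + rng.uniform(-strength, strength)
    contrast = 1.0 + rng.uniform(-strength, strength)
    out = img * brightness
    out = (out - out.mean()) * contrast + out.mean()
    return np.clip(out, 0.0, 1.0)

def two_views(img, crop_size=24, seed=None):
    """Produce two independently augmented views of the same image."""
    rng = np.random.default_rng(seed)
    return tuple(color_jitter(random_crop(img, crop_size, rng), rng)
                 for _ in range(2))

# Example: augment a random 32x32 RGB image with values in [0, 1].
img = np.random.default_rng(0).random((32, 32, 3))
v1, v2 = two_views(img, crop_size=24, seed=1)
```

Because the crop location and jitter strengths are drawn independently for each view, the two outputs differ even though they come from the same image; it is exactly this kind of hand-crafted composition whose potential information loss motivates the learned, disentangled transforms proposed in the thesis.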