An efficient semi-supervised multi-label classifier capable of handling missing labels

Hosseini Akbarnejad, A ; Sharif University of Technology | 2019

  1. Type of Document: Article
  2. DOI: 10.1109/TKDE.2018.2833850
  3. Publisher: IEEE Computer Society, 2019
  4. Abstract:
  5. Multi-label classification has received considerable interest in recent years. Multi-label classifiers usually need to address several issues: handling large-scale datasets with many instances and a large set of labels, compensating for missing label assignments in the training set, considering correlations between labels, and exploiting unlabeled data to improve prediction performance. To tackle datasets with a large set of labels, embedding-based methods represent the label assignments in a low-dimensional space. Many state-of-the-art embedding-based methods use linear dimensionality reduction to map the label assignments to this low-dimensional space. However, by doing so, these methods neglect the tail labels, that is, labels that are infrequently assigned to instances. In this paper, we propose an embedding-based method that non-linearly embeds the label vectors using a stochastic approach, thereby predicting the tail labels more accurately. Moreover, the proposed method includes mechanisms for handling missing labels, dealing with large-scale datasets, and exploiting unlabeled data. Experiments on real-world datasets show that our method outperforms state-of-the-art multi-label classifiers by a large margin in both prediction performance and training time. Our implementation of the proposed method is available online at: https://github.com/Akbarnejad/ESMC_Implementation. © 1989-2012 IEEE
  6. Keywords:
  7. Missing labels ; Multi-label classification ; Probabilistic model ; Semi-supervised learning ; Correlation methods ; Forecasting ; Mathematical transformations ; Personnel training ; Probabilistic logics ; Stochastic systems ; Supervised learning ; Dimensionality reduction ; Gaussian processes ; Probabilistic modeling ; Classification (of information)
  8. Source: IEEE Transactions on Knowledge and Data Engineering ; Volume 31, Issue 2, 2019, Pages 229-242 ; 1041-4347 (ISSN)
  9. URL: https://ieeexplore.ieee.org/document/8356120
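
The abstract's remark that linear label embeddings neglect tail labels can be illustrated with a small sketch. The code below is not the method proposed in the paper; it is a minimal example of the generic linear baseline the abstract criticizes, assuming the embedding is a truncated SVD of the label matrix. The function name `linear_label_embedding`, the toy matrix `Y`, and the rank `k` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def linear_label_embedding(Y, k):
    """Embed label vectors linearly via the top-k right singular vectors of Y.

    Hypothetical helper for illustration only (not the paper's method).
    Returns the k-dimensional codes Z and the linear reconstruction Y_hat.
    """
    # Singular values come back in descending order, so the first k rows
    # of Vt span the directions carrying the most label "mass".
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    V_k = Vt[:k].T          # (num_labels, k) linear projection
    Z = Y @ V_k             # low-dimensional label codes
    Y_hat = Z @ V_k.T       # linear decoding back to label space
    return Z, Y_hat


# Toy label matrix: 8 instances, 3 labels.
# Labels 0 and 1 are frequent and co-occur; label 2 is a tail label
# assigned to a single instance.
Y = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
], dtype=float)

Z, Y_hat = linear_label_embedding(Y, k=2)

# Relative reconstruction error per label (column-wise).
rel_err = np.linalg.norm(Y - Y_hat, axis=0) / np.linalg.norm(Y, axis=0)
print(np.round(rel_err, 3))
```

In this toy case the two-dimensional embedding reconstructs the two frequent labels essentially exactly, while the tail label's column is reconstructed as all zeros (relative error 1): the rare label contributes the smallest singular value, so it is the first thing a low-rank linear projection discards. Real datasets are less clear-cut, but the same tendency motivates non-linear embeddings such as the one the abstract describes.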