
Exploiting multiview properties in semi-supervised video classification

Article 2012 6th International Symposium on Telecommunications, IST 2012 ; 2012, Pages 837-842 ; 9781467320733 (ISBN) ; Karimian, M ; Tavassolipour, M ; Kasaei, S ; Sharif University of Technology

Abstract

In large databases, obtaining labeled training data for classification is mostly prohibitive. Semi-supervised algorithms are employed to tackle the lack of labeled training data. Video databases are the epitome of such a scenario, which is why semi-supervised learning has found its niche there. Graph-based methods are a promising platform for semi-supervised video classification. Based on the multiview characteristic of video data, different features have been proposed (such as SIFT, STIP and MFCC) which can be utilized to build a graph. In this paper, we propose a new classification method which fuses the results of manifold regularization over different graphs. Our...

Supervised neighborhood graph construction for semi-supervised classification

Article Pattern Recognition ; Volume 45, Issue 4, April 2012, Pages 1363-1372 ; 00313203 (ISSN) ; Rohban, M. H ; Rabiee, H. R ; Sharif University of Technology

Abstract

Graph-based methods are among the most active and applicable approaches studied in semi-supervised learning. The problem of neighborhood graph construction for these methods is addressed in this paper. Neighborhood graph construction plays a key role in the quality of the classification in graph-based methods. Several unsupervised graph construction methods have been proposed that address issues such as data noise, geometrical properties of the underlying manifold and graph hyper-parameter selection. In contrast, in order to adapt the graph construction to the given classification task, many of the recent graph construction methods take advantage of the data labels. However, these...

Semi-supervised ensemble learning of data streams in the presence of concept drift

Article Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) ; Volume 7209 LNAI, Issue PART 2, 2012, Pages 526-537 ; 03029743 (ISSN) ; 9783642289309 (ISBN) ; Ahmadi, Z ; Beigy, H ; Sharif University of Technology

Abstract

Increasing access to very large and non-stationary datasets in many real problems has made the classical data mining algorithms impractical and made it necessary to design new online classification algorithms. Online learning of data streams has some important features, such as sequential access to the data, limitations on time and space complexity, and the occurrence of concept drift. The infinite nature of data streams makes it hard to label all observed instances. Semi-supervised approaches therefore appear much better suited to the problem. In this paper, we present a new semi-supervised ensemble learning algorithm for data streams. This algorithm uses the majority...

Unilateral semi-supervised learning of extended hidden vector state for Persian language understanding

Article NLP-KE 2011 - Proceedings of the 7th International Conference on Natural Language Processing and Knowledge Engineering, 27 November 2011 through 29 November 2011, Tokushima ; 2011, Pages 165-168 ; 9781612847283 (ISBN) ; Jabbari, F ; Sameti, H ; Bokaei, M. H ; Chinese Association for Artificial Intelligence; IEEE Signal Processing Society ; Sharif University of Technology

2011

Abstract

The key element of a spoken dialogue system is the Spoken Language Understanding (SLU) part. HVS and EHVS are the two most popular statistical methods employed to implement the SLU part, and both need only lightly annotated data. Since annotation is time consuming, we present a novel semi-supervised learning method for EHVS that reduces the human labeling effort using two different statistical classifiers, SVM and KNN. Experiments are done on a Persian corpus, the University Information Kiosk corpus. The experimental results show improvements in the performance of semi-supervised EHVS, trained on both labeled and unlabeled data, compared to EHVS trained on just the initially labeled data. The performance of EHVS improves...

Efficient iterative Semi-Supervised Classification on manifold

Article Proceedings - IEEE International Conference on Data Mining, ICDM ; 2011, Pages 228-235 ; 15504786 (ISSN) ; 9780769544090 (ISBN) ; Farajtabar, M ; Rabiee, H. R ; Shaban, A ; Soltani Farani, A ; National Science Foundation (NSF) - Where Discoveries Begin; University of Technology Sydney; Google; Alberta Ingenuity Centre for Machine Learning; IBM Research ; Sharif University of Technology

Abstract

Semi-Supervised Learning (SSL) has become a topic of recent research that effectively addresses the problem of limited labeled data. Many SSL methods have been developed based on the manifold assumption; among them, Local and Global Consistency (LGC) is a popular method. The problem with most of these algorithms, and in particular with LGC, is that their naive implementations do not scale well with the size of the data. Time and memory limitations are the major obstacles in large-scale problems. In this paper, we provide theoretical bounds on gradient descent, and to overcome the aforementioned problems, a new approximate Newton's method is proposed. Moreover, convergence...

Isograph: Neighbourhood graph construction based on geodesic distance for semi-supervised learning

Article Proceedings - IEEE International Conference on Data Mining, ICDM, 11 December 2011 through 14 December 2011 ; December 2011, Pages 191-200 ; 15504786 (ISSN) ; 9780769544083 (ISBN) ; Ghazvininejad, M ; Mahdieh, M ; Rabiee, H. R ; Roshan, P. K ; Rohban, M. H ; Sharif University of Technology

2011

Abstract

Semi-supervised learning based on manifolds has been the focus of extensive research in recent years. Convenient neighbourhood graph construction is a key component of a successful semi-supervised classification method. Previous graph construction methods fail when there are pairs of data points that have small Euclidean distance, but are far apart over the manifold. To overcome this problem, we start with an arbitrary neighbourhood graph and iteratively update the edge weights by using the estimates of the geodesic distances between points. Moreover, we provide theoretical bounds on the values of estimated geodesic distances. Experimental results on real-world data show significant...

Active learning from positive and unlabeled data

Article Proceedings - IEEE International Conference on Data Mining, ICDM, 11 December 2011 through 11 December 2011 ; December 2011, Pages 244-250 ; 15504786 (ISSN) ; 9780769544090 (ISBN) ; Ghasemi, A ; Rabiee, H. R ; Fadaee, M ; Manzuri, M. T ; Rohban, M. H ; Sharif University of Technology

2011

Abstract

In recent years, active learning has evolved into a popular paradigm for utilizing user feedback to improve the accuracy of learning algorithms. Active learning works by selecting the most informative sample among the unlabeled data and querying the user for the label of that point. Many different methods, such as uncertainty sampling and minimum risk sampling, have been utilized to select the most informative sample in active learning. Although many active learning algorithms have been proposed so far, most of them work with binary or multi-class classification problems and therefore cannot be applied to problems in which only samples from one class as well as a set of unlabeled data are...

HMM based semi-supervised learning for activity recognition

Article SAGAware'11 - Proceedings of the 2011 International Workshop on Situation Activity and Goal Awareness, 18 September 2011 through 18 September 2011, Beijing ; September 2011, Pages 95-99 ; 9781450309264 (ISBN) ; Ghazvininejad, M ; Rabiee, H. R ; Pourdamghani, N ; Khanipour, P ; Sharif University of Technology

2011

Abstract

In this paper, we introduce a novel method for human activity recognition that benefits from the structure and sequential properties of the test data as well as the training data. In the training phase, we obtain a fraction of data labels at constant time intervals and use them in a semi-supervised graph-based method for recognizing the user's activities. We use label propagation on a k-nearest neighbor graph to calculate the probability of association of the unlabeled data to each class in this phase. Then we use these probabilities to train an HMM in a way that each of its hidden states corresponds to one class of activity. These probabilities are used to learn the transition probabilities...
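The label propagation step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the unweighted symmetric k-nearest-neighbor graph, the uniform initialization, the fixed iteration count, and the names `knn_graph` and `label_propagation` are all assumptions made for illustration, and the HMM training stage is not covered.

```python
import math


def knn_graph(points, k):
    """Build a symmetric, unweighted k-nearest-neighbor adjacency matrix
    using Euclidean distance (an assumed choice of graph)."""
    n = len(points)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: math.dist(points[i], points[j]))
        for j in others[:k]:
            adj[i][j] = 1.0
            adj[j][i] = 1.0
    return adj


def label_propagation(points, labels, n_classes, k=2, n_iter=200):
    """Return one row of class probabilities per point.

    labels[i] is a class index for labeled points and None otherwise.
    Labeled rows stay clamped to their one-hot value; each unlabeled row
    is repeatedly replaced by the average of its neighbors' rows.
    """
    adj = knn_graph(points, k)
    n = len(points)
    probs = []
    for y in labels:
        if y is None:
            probs.append([1.0 / n_classes] * n_classes)
        else:
            probs.append([1.0 if c == y else 0.0 for c in range(n_classes)])
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            if labels[i] is not None:
                updated.append(probs[i])  # clamp known labels
                continue
            degree = sum(adj[i])
            updated.append([
                sum(adj[i][j] * probs[j][c] for j in range(n)) / degree
                for c in range(n_classes)
            ])
        probs = updated
    return probs
```

The returned rows play the role of the per-class association probabilities described in the abstract; how they are used to fit the HMM is left to the paper.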

Manifold coarse graining for online semi-supervised learning

Article Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 5 September 2011 through 9 September 2011 ; Volume 6911 LNAI, Issue PART 1, September 2011, Pages 391-406 ; 03029743 (ISSN) ; 9783642237799 (ISBN) ; Farajtabar, M ; Shaban, A ; Rabiee, H. R ; Rohban, M. H ; Sharif University of Technology

2011

Abstract

When labeled data are insufficient, Semi-Supervised Learning (SSL) methods utilize unlabeled data to enhance classification. Recently, many SSL methods have been developed based on the manifold assumption in a batch mode. However, when data arrive sequentially and in large quantities, both computation and storage limitations become a bottleneck. In this paper, we present a new semi-supervised coarse graining (CG) algorithm to reduce the number of data points required to preserve the manifold structure. First, an equivalent formulation of Label Propagation (LP) is derived. Then a novel spectral view of the Harmonic Solution (HS) is proposed. Finally, an algorithm to reduce...

A hybrid supervised semi-supervised graph-based model to predict one-day ahead movement of global stock markets and commodity prices

Article Expert Systems with Applications ; Volume 105, 2018, Pages 159-173 ; 09574174 (ISSN) ; Negahdari Kia, A ; Haratizadeh, S ; Bagheri Shouraki, S ; Sharif University of Technology

Abstract

Market prediction has been an important machine learning research topic in recent decades. A neglected issue in prediction is having a model that can simultaneously pay attention to the interaction of global markets along with the historical data of the target markets being predicted. As a solution, we present a hybrid supervised semi-supervised model called HyS3 for direction of movement prediction. The graph-based semi-supervised part of HyS3 models the markets' global interactions through a network designed with a novel continuous Kruskal-based graph construction algorithm called ConKruG. The supervised part of the model injects results extracted from each market's historical data to the network...

Leveraging multi-modal fusion for graph-based image annotation

Article Journal of Visual Communication and Image Representation ; Volume 55, 2018, Pages 816-828 ; 10473203 (ISSN) ; Amiri, S. H ; Jamzad, M ; Sharif University of Technology

Academic Press Inc 2018

Abstract

Considering each of the visual features as one modality in the image annotation task, efficient fusion of different modalities is essential in graph-based learning. Traditional graph-based methods consider one node for each image and combine its visual features into a single descriptor before constructing the graph. In this paper, we propose an approach that constructs a subgraph for each modality in such a way that the edges of each subgraph are determined using a search-based approach that handles the class-imbalance challenge in the annotation datasets. Multiple subgraphs are then connected to each other to form a supergraph. This is followed by the introduction of a learning framework to infer the tags of...

One step toward a richer model of unsupervised grammar induction

Article International Conference on Recent Advances in Natural Language Processing, RANLP 2005, 21 September 2005 through 23 September 2005 ; Volume 2005-January, 2005, Pages 197-203 ; 13138502 (ISSN) ; 9549174336 (ISBN) ; Feili, H ; Ghassem Sani, G. R ; Angelova G ; Bontcheva K ; Mitkov R ; Nicolov N ; Nikolov N ; Sharif University of Technology

Association for Computational Linguistics (ACL) 2005

Abstract

Probabilistic Context-Free Grammars (PCFGs) are useful tools for syntactic analysis of natural languages. The availability of large treebanks has encouraged many researchers to use PCFGs in language modeling. Automatic learning of PCFGs is divided into three different categories, based on the data set needed for the training phase: supervised, semi-supervised and unsupervised. Most current inductive methods are supervised and need a bracketed data set in the training phase. However, the lack of this kind of data set in many languages has encouraged us to pay more attention to unsupervised approaches. So far, unsupervised approaches have achieved little success. By considering a history-based...

ACoPE: An adaptive semi-supervised learning approach for complex-policy enforcement in high-bandwidth networks

Article Computer Networks ; Volume 166, 2020 ; Noferesti, M ; Jalili, R ; Sharif University of Technology

Elsevier B.V 2020

Abstract

Today's high-bandwidth networks require adaptive analysis approaches to recognize variable network behaviors. These approaches should be robust against the lack of prior knowledge and provide data to impose more complex policies. In this paper, ACoPE is proposed as an adaptive semi-supervised learning approach for complex-policy enforcement in high-bandwidth networks. ACoPE detects and maintains inter-flow relationships to impose complex policies. It employs a statistical process control technique to monitor accuracy. Whenever the accuracy decreases, ACoPE treats this as a change in behavior and uses data from a deep packet inspection module to adapt itself to the change. The...

Semi-supervised parallel shared encoders for speech emotion recognition

Article Digital Signal Processing: A Review Journal ; Volume 118, 2021 ; 10512004 (ISSN) ; Pourebrahim, Y ; Razzazi, F ; Sameti, H ; Sharif University of Technology

Elsevier Inc 2021

Abstract

Supervised speech emotion recognition requires a large number of labeled samples, which limits its use in practice. Because unlabeled samples are easy to obtain, a new semi-supervised method based on auto-encoders is proposed in this paper for speech emotion recognition. The proposed method performs classification by extracting the information contained in unlabeled samples and combining it with the information in labeled samples. In addition, it employs a maximum mean discrepancy cost function to reduce the distribution difference when the labeled and unlabeled samples are gathered from different datasets. Experimental results obtained on different emotional speech datasets...
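For readers unfamiliar with the maximum mean discrepancy (MMD) cost mentioned in this abstract, a minimal sketch of the standard biased estimate of squared MMD follows. The RBF kernel, the `gamma` parameter, and the function names are assumptions chosen for illustration; the abstract does not state which kernel or estimator the paper uses, nor how the cost enters the auto-encoder training.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel exp(-gamma * ||x - y||^2); the kernel choice is an assumption."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between two samples:
    mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)."""
    k_xx = sum(rbf_kernel(a, b, gamma) for a in xs for b in xs) / (len(xs) * len(xs))
    k_yy = sum(rbf_kernel(a, b, gamma) for a in ys for b in ys) / (len(ys) * len(ys))
    k_xy = sum(rbf_kernel(a, b, gamma) for a in xs for b in ys) / (len(xs) * len(ys))
    return k_xx + k_yy - 2.0 * k_xy
```

The estimate is zero when both samples are identical and grows as the two samples move apart, which is why it can serve as a penalty on the mismatch between features drawn from different datasets.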

Automatic image annotation using semi-supervised generative modeling

Article Pattern Recognition ; Volume 48, Issue 1, January 2015, Pages 174-188 ; 00313203 (ISSN) ; Amiri, S. H ; Jamzad, M ; Sharif University of Technology

Elsevier Ltd 2015

Abstract

Image annotation approaches need an annotated dataset to learn a model for the relation between images and words. Unfortunately, preparing a labeled dataset is highly time consuming and expensive. In this work, we describe the development of an annotation system in a semi-supervised learning framework which, by incorporating unlabeled images into the training phase, reduces the system's demand for labeled images. Our approach constructs a generative model for each semantic class in two main steps. First, based on the Gamma distribution, a generative model is constructed for each semantic class using labeled images in that class. The second step incorporates the unlabeled images by using a modified EM...

Classification of NPPs transients using change of representation technique: A hybrid of unsupervised MSOM and supervised SVM

Article Progress in Nuclear Energy ; Volume 117, 2019 ; 01491970 (ISSN) ; Moshkbar Bakhshayesh, K ; Mohtashami, S ; Sharif University of Technology

Elsevier Ltd 2019

Abstract

This study introduces a new identifier for nuclear power plant (NPP) transients. The proposed identifier changes the representation of input patterns. Change of representation is a semi-supervised learning algorithm which employs both labeled and unlabeled input data. In the first step, a modified self-organizing map (MSOM) carries out an unsupervised learning algorithm on labeled and unlabeled patterns and generates a new metric for input data. In the second step, a support vector machine (SVM), as a supervised learning algorithm, classifies the input patterns using the metric generated in the first step. In contrast to unsupervised learning algorithms, the proposed identifier does not...

Network-based direction of movement prediction in financial markets

Article Engineering Applications of Artificial Intelligence ; Volume 88, February 2020 ; Kia, A. N ; Haratizadeh, S ; Shouraki, S. B ; Sharif University of Technology

Elsevier Ltd 2020

Abstract

Market prediction has been an important research problem for decades. Having better predictive models that are both more accurate and faster has been attractive for both researchers and traders. Among many approaches, semi-supervised graph-based prediction has been used as a solution in recent research. Based on this approach, we present two prediction models. In the first model, a new network structure is introduced that can capture more information about markets' direction of movement compared to the previous state-of-the-art methods. Based on this novel network, a new algorithm for semi-supervised label propagation is designed that is able to predict the direction of movement faster...

Incremental evolving domain adaptation

Article IEEE Transactions on Knowledge and Data Engineering ; Volume 28, Issue 8, 2016, Pages 2128-2141 ; 10414347 (ISSN) ; Bitarafan, A ; Soleymani Baghshah, M ; Gheisari, M ; Sharif University of Technology

IEEE Computer Society

Abstract

Almost all of the existing domain adaptation methods assume that all test data belong to a single stationary target distribution. However, in many real-world applications, data arrive sequentially and the data distribution is continuously evolving. In this paper, we tackle the recently introduced problem of adaptation to a continuously evolving target domain. We assume that the available data for the source domain are labeled but the examples of the target domain can be unlabeled and arrive sequentially. Moreover, the distribution of the target domain can evolve continuously over time. We propose the Evolving Domain Adaptation (EDA) method that first finds a new feature space...

An efficient semi-supervised multi-label classifier capable of handling missing labels

Article IEEE Transactions on Knowledge and Data Engineering ; 2018 ; 10414347 (ISSN) ; Hosseini Akbarnejad, A ; Soleymani Baghshah, M ; Sharif University of Technology

IEEE Computer Society 2018

Abstract

Multi-label classification has received considerable interest in recent years. Multi-label classifiers usually need to address many issues, including handling large-scale datasets with many instances and a large set of labels, compensating for missing label assignments in the training set, considering correlations between labels, and exploiting unlabeled data to improve prediction performance. To tackle datasets with a large set of labels, embedding-based methods represent the label assignments in a low-dimensional space. Many state-of-the-art embedding-based methods use a linear dimensionality reduction to map the label assignments to a low-dimensional space. However, by doing so, these...

An Efficient semi-supervised multi-label classifier capable of handling missing labels

Article IEEE Transactions on Knowledge and Data Engineering ; Volume 31, Issue 2, 2019, Pages 229-242 ; 10414347 (ISSN) ; Hosseini Akbarnejad, A ; Soleymani Baghshah, M ; Sharif University of Technology

IEEE Computer Society 2019

Abstract

Multi-label classification has received considerable interest in recent years. Multi-label classifiers usually need to address many issues, including handling large-scale datasets with many instances and a large set of labels, compensating for missing label assignments in the training set, considering correlations between labels, and exploiting unlabeled data to improve prediction performance. To tackle datasets with a large set of labels, embedding-based methods represent the label assignments in a low-dimensional space. Many state-of-the-art embedding-based methods use a linear dimensionality reduction to map the label assignments to a low-dimensional space. However, by doing so, these...