Search for:
semi-supervised-learning
Total 49 records
Regularization from the Machine Learning Point of View
M.Sc. Thesis, Sharif University of Technology; Daneshgar, Amir (Supervisor)
Abstract
In traditional machine learning approaches to classification, one uses only a labeled set to train the classifier. Labeled instances, however, are often difficult, expensive, or time-consuming to obtain, as they require the efforts of experienced human annotators. Meanwhile, unlabeled data may be relatively easy to collect, but there have been few ways to use them. Semi-supervised learning addresses this problem by using a large amount of unlabeled data, together with the labeled data, to build better classifiers. Semi-supervised learning therefore requires less human effort and gives higher accuracy. Formally, this intuition corresponds to estimating a label function f on the graph so that it...
Semi-supervised Learning and its Application to Image Categorization
M.Sc. Thesis, Sharif University of Technology; Rabiee, Hamid Reza (Supervisor)
Abstract
Traditional methods for data classification make use of labeled data only. However, in most applications, labeling data is expensive and time-consuming and requires expert knowledge. To overcome these problems, semi-supervised learning (SSL) methods, which aim to address the problem of limited labeled data effectively, have become an area of recent research. One of the recently introduced SSL approaches is classification based on the geometric structure of the data, namely the data manifold. In this approach, unlabeled data is used to recover the underlying structure of the data. The common assumption is that, despite being represented in a high-dimensional space, data...
Persian Statistical Natural Language Understanding Based on Partially Annotated Corpus
M.Sc. Thesis, Sharif University of Technology; Sameti, Hossein (Supervisor)
Abstract
The spoken language understanding unit is one of the most important parts of a spoken dialogue system. The input of this system is the output of the speech recognition unit. The main function of this unit is to extract the semantic information from the input utterances. There are two main types of approaches to this task: rule-based approaches and data-driven approaches. Today, data-driven approaches are of more interest because they are more flexible and robust than rule-based approaches. The main drawback of these methods is that they need a large amount of fully annotated or, in some cases, treebank data. Preparing such data is time-consuming and expensive. The goal of this thesis is...
Fault Detection and Smart Monitoring of Industrial Fans Based on Vibration Signals
M.Sc. Thesis, Sharif University of Technology; Manzuri Shalmani, Mohammad Taghi (Supervisor)
Abstract
Data-oriented smart monitoring of industrial machinery includes approaches to fault detection and prognosis that rely only on non-stationary signals sampled from sensors, and not on a physical model of the machinery or on expert knowledge. Fault detection is the task of determining the state of the machinery at the present moment using past data, whereas prognosis focuses on predicting the future state of the machinery using past data. Most research in this category is based on supervised algorithms, but in many applications labeling data is expensive. In this thesis, some approaches to semi-supervised diagnosis based on Markov random walk and K-NN have been implemented; also, some improvements for K-NN have...
Unsupervised Domain Adaptation via Representation Learning
M.Sc. Thesis, Sharif University of Technology; Soleymani, Mahdieh (Supervisor)
Abstract
The existing learning methods usually assume that training and test data follow the same distribution, while this is not always true. Thus, in many cases the performance of these learning methods on the test data will be severely degraded. We often have sufficient labeled training data from a source domain but wish to learn a classifier which performs well on a target domain with a different distribution and no labeled training data. In this thesis, we study the problem of unsupervised domain adaptation, where no labeled data in the target domain is available. We propose a framework which finds a new representation for both the source and the target domain in which the distance between these...
Video Classification Using Semi-supervised Learning Methods
M.Sc. Thesis, Sharif University of Technology; Kasaei, Shohreh (Supervisor)
Abstract
In large databases, obtaining labeled training data for classification is often prohibitively expensive. Semi-supervised algorithms are employed to tackle this lack of labeled training data. Video databases are the epitome of such a scenario, which is why semi-supervised learning has found its niche there. Graph-based methods are a promising platform for semi-supervised video classification. Based on the multiview characteristic of video data, different features have been proposed (such as SIFT, STIP, and MFCC) that can be used to build a graph. In this project, we propose a new classification method that fuses the results of manifold regularization over different graphs. Our...
Recognition of Human Activities by Using Machine Learning Methods
M.Sc. Thesis, Sharif University of Technology; Rabiee, Hamid Reza (Supervisor)
Abstract
In this research, we use machine learning methods to approach the problem of human activity recognition. Because labeling the data in this problem is costly and time-consuming, and because unlabeled data are copiously available, semi-supervised methods perform well on this problem. In recent years, graph-based methods have become very popular among semi-supervised learning methods. However, constructing a graph on the data that properly represents their structure has remained a main challenge in these methods. One cause of this problem is the existence of shortcut edges. In this report, we will first introduce a method to solve the problem...
Improving Graph Construction for Semi-supervised Learning in Computer Vision Applications
M.Sc. Thesis, Sharif University of Technology; Rabiee, Hamid Reza (Supervisor)
Abstract
Semi-supervised learning (SSL) is an extremely useful approach in many applications where unlabeled data can be easily obtained. Graph-based methods are among the most studied branches of SSL. Since the neighborhood graph is a key component of these methods, we focus on methods of graph construction in this project. Graph construction methods based on Euclidean distance share the problem of creating shortcut edges, that is, edges that connect two points that are close in Euclidean distance but far apart on the manifold. Specifically, we show both in theory and practice that using geodesic distance for selecting and weighting edges results in more appropriate neighborhood graphs. We propose an...
Adaptation for Evolving Domains
M.Sc. Thesis, Sharif University of Technology; Soleymani Baghshah, Mahdieh (Supervisor)
Abstract
Many domain adaptation methods have been proposed to date. A major limitation of almost all of these methods is their assumption that all test data belong to a single stationary target distribution and that a large amount of unlabeled data is available for modeling this target distribution. In fact, in many real-world applications, such as classifying scene images under gradually changing lighting and spam email identification, data arrives sequentially and the data distribution is continuously evolving. In this thesis, we tackle the recently introduced problem of adaptation to a continuously evolving target domain and propose the Evolving Domain Adaptation (EDA) method to classify...
Behavior-Driven Security Policy Enforcement on High Bandwidth Networks
Ph.D. Dissertation, Sharif University of Technology; Jalili, Rasool (Supervisor)
Abstract
High-bandwidth network analysis is challenging, resource-consuming, and inaccurate due to the high volume, velocity, and variety of the network traffic. Today's high-bandwidth networks require adaptive analysis approaches to recognize variable network behaviors. These approaches should be robust to the lack of prior knowledge and provide data for imposing more complex policies. This thesis introduces the complex policy relation and proposes a two-layer framework, named HB2DS, to enforce complex policies. The proposed framework comprises a mechanism layer and a policy layer. The mechanism layer processes network packet headers and payloads to generate a flow stream....
Semi-Supervised Kernel Learning for Pattern Classification
Ph.D. Dissertation, Sharif University of Technology; Rabiee, Hamid Reza (Supervisor)
Abstract
Supervised kernel learning has been the focus of research in recent years. Although these methods are developed on rigorous frameworks, they fail to improve classification accuracy in real-world applications. To find the origin of this problem, note that the kernel function represents prior knowledge about the labeling function. As in other learning problems, learning this prior knowledge requires further prior knowledge. In supervised kernel learning, only naive assumptions can be used as the prior knowledge. These include minimizing the ℓ1 and ℓ2 norms of the kernel parameters.
As an alternative approach, in Semi-Supervised Learning (SSL), unlabeled...
Content-Based Image Retrieval Using Relevance Feedback and Semi-Supervised Learning
M.Sc. Thesis, Sharif University of Technology; Manzuri, Mohammad Taghi (Supervisor)
Abstract
Content-based image retrieval has been an active research area in recent years, due to the vast amount of digital media available via the Internet. In this work, we formulate content-based image retrieval as a machine learning problem using the relevance feedback technique and propose a learning algorithm adapted to the specific properties of image retrieval. After studying these properties, we propose a Bayesian framework for image retrieval based on one-class learning and test it on different image datasets. The proposed method is a kernel-based approach and can also utilize domain knowledge in the form of prior knowledge in constructing a model...
Application of Semi-Supervised Learning in Image Processing
M.Sc. Thesis, Sharif University of Technology; Rabiee, Hamidreza (Supervisor)
Abstract
In recent years, the emergence of semi-supervised learning methods has broadened the scope of machine learning, especially for pattern classification. Besides reducing the need for experts to label the data, efficient use of unlabeled data significantly improves on supervised learning methods in many applications. Since the advent of statistical learning theory in the late 1980s and the emergence of the concept of regularization, kernel learning has received sustained attention. In recent years, semi-supervised kernel learning, which combines these two viewpoints, has attracted considerable interest.
Large number of dimensions of the input data along with...
Information Retrieval from Incomplete Observations
Ph.D. Dissertation, Sharif University of Technology; Marvasti, Farokh (Supervisor)
Abstract
In this dissertation, data analysis and information retrieval from incomplete observations are investigated in different applications. Incomplete observations may result from missing observations or from part of the data being affected by a specific noise (quantization noise). Data-driven algorithms are currently among the important research topics. Our goal is to process the lost information by imposing certain assumptions on big data structures. The approach is to model the problem of interest mathematically as an optimization problem. Algorithms designed for these optimization problems are then proposed, aiming to reduce computational complexity as well as to enhance recovery accuracy for big data...
Online Distance Metric Learning
M.Sc. Thesis, Sharif University of Technology; Beigy, Hamid (Supervisor)
Abstract
Distance metric learning algorithms have recently been widely used in machine learning methods. In these algorithms, a distance function between objects (data points) is learned based on their labels or on similarity and dissimilarity constraints. Recent works have shown that classification and clustering methods that use these functions achieve good precision. Since in current systems many data points do not exist at the beginning and are added to the training set as the algorithm runs, online methods are needed to update the learned metric as new data arrive.
In this thesis, we proposed a new online distance metric learning method that has higher performance than existing...
Identification and Forecasting of Nuclear Power Plants Transients by Semi-Supervised Method with Change of Representation Technique
M.Sc. Thesis, Sharif University of Technology; Ghofrani, Mohamad Bagher (Supervisor); Moshkbar Bakhshayesh, Khalil (Supervisor)
Abstract
In this work, we aim to find a way to identify and forecast transients in nuclear power plants with the aid of a semi-supervised machine learning algorithm. Forecasting and identifying transients in nuclear power plants at the early stages of formation are essential for safety considerations and precautionary measures. The use of machine learning algorithms provides an intelligent control mechanism that, along with the main operator of the power plant, raises the transient detection and identification rate. Our algorithm of choice is change of representation, a semi-supervised learning approach that changes the way the data is represented. The algorithm consists of two methods: quantum dynamics clustering...
Weakly Supervised Semantic Segmentation Using Deep Neural Networks
M.Sc. Thesis, Sharif University of Technology; Kasaei, Shohreh (Supervisor)
Abstract
Semantic segmentation, which is the classification of every pixel in an input image, is a fundamental task in the fields of computer vision and scene understanding. Applications of semantic segmentation include autonomous vehicles and robotics. Since this task requires dense annotation of the images in the dataset, recent methods have been proposed that use weakly supervised and semi-supervised learning, with weakly labeled data and unlabeled data respectively. Because the amount of fully labeled data might not be sufficient in such methods, some papers have proposed employing depth input data, when available, for its rich geometrical and local information. In this research, an...
Continual Learning Using Unsupervised Data
M.Sc. Thesis, Sharif University of Technology; Soleymani Baghshah, Mahdieh (Supervisor)
Abstract
Existing continual learning methods are mainly focused on fully supervised scenarios and are still not able to take advantage of unlabeled data available in the environment. Some recent works have investigated semi-supervised continual learning (SSCL) settings in which unlabeled data are available, but only from the same distribution as the labeled data. This assumption is still not general enough for real-world applications and restricts the utilization of unsupervised data. In this work, we introduce Open-Set Semi-Supervised Continual Learning (OSSCL), a more realistic semi-supervised continual learning setting in which out-of-distribution (OoD) unlabeled samples in the...
Image Annotation Using Semi-supervised Learning
Ph.D. Dissertation, Sharif University of Technology; Jamzad, Mansour (Supervisor)
Abstract
Automatic image annotation, which assigns labels to input images and provides a textual description of their contents, has become an active field in the machine vision community. To design an annotation system, we need a dataset that contains images and their labels. However, a large amount of manual effort is required to annotate all images in a dataset. To reduce the dependence of annotation systems on labeled images, one solution is to exploit useful information embedded in the unlabeled images and incorporate it into the learning process. In the machine learning community, semi-supervised learning (SSL) has been introduced with the aim of incorporating unlabeled samples into the...
3D Medical Images Segmentation by Effective Use of Unlabeled Data
M.Sc. Thesis, Sharif University of Technology; Soleymani Baghshah, Mahdieh (Supervisor)
Abstract
Image segmentation, one of the most important branches of medical image analysis, often faces the challenge of limited labeled data when deep learning methods are applied. The high cost of data collection and the need for expertise in image segmentation, particularly for three-dimensional images such as MRI and CT or sequence images like CMR, have contributed to this problem, so that even popular networks like U-Net struggle to achieve high accuracy. As a result, research efforts have focused on semi-supervised learning, weakly supervised learning, and multi-instance learning approaches to medical image segmentation. Unfortunately, each of these methods...