Search for: semi-supervised-clustering

    Metric learning for semi-supervised clustering using pairwise constraints and the geometrical structure of data

    Article Intelligent Data Analysis ; Volume 13, Issue 6 , 2009 , Pages 887-899 ; 1088467X (ISSN) Soleymani Baghshah, M ; Bagheri Shouraki, S ; Sharif University of Technology
    Abstract
    Metric learning is a powerful approach for semi-supervised clustering. In this paper, a metric learning method considering both pairwise constraints and the geometrical structure of data is introduced for semi-supervised clustering. At first, a smooth metric is found (based on an optimization problem) using positive constraints as supervisory information. Then, an extension of this method employing both positive and negative constraints is introduced. As opposed to the existing methods, the extended method has the capability of considering both positive and negative constraints while considering the topological structure of data. The proposed metric learning method can improve performance of... 

    Scalable semi-supervised clustering by spectral kernel learning

    Article Pattern Recognition Letters ; Volume 45, Issue 1 , August , 2014 , Pages 161-171 ; 01678655 (ISSN) Soleymani Baghshah, M ; Afsari, F ; Bagheri Shouraki, S ; Eslami, E ; Sharif University of Technology
    Abstract
    Kernel learning is one of the most important and recent approaches to constrained clustering. Many kernel learning methods have been introduced for clustering when side information in the form of pairwise constraints is available. However, almost all of the existing methods either learn a whole kernel matrix or learn a limited number of parameters. Although the non-parametric methods that learn a whole kernel matrix can find clusters of arbitrary structures, they are computationally very expensive and are feasible only on small data sets. In this paper, we propose a kernel learning method that shows flexibility in the number of variables between...

    Study and Proposal for an Improved Method of Semi-Supervised Clustering

    M.Sc. Thesis Sharif University of Technology Abdollahi Alibeik, Mohammad (Author) ; Mahdavi Amiri, Nezameddin (Supervisor) ; Abolhassani, Hassan (Supervisor)
    Abstract
    Nowadays, clustering is one of the most common data mining tasks: it categorizes data by similarity, and the analysis of the resulting groups is of interest in both industry and academia. Clustering does not need any supervision. The theme of this thesis is using prior knowledge to improve clustering algorithms. The prior knowledge takes the form of constraints, determined by supervision, that either prevent certain data from being placed in one cluster or require certain data to be placed in the same cluster. Here, prior knowledge in the form of constraints is used to modify a heuristic optimization algorithm (harmony search) and a novel "semi-supervised harmony clustering" is proposed....

    Clustering based on the Structure of the Data and Side Information

    Ph.D. Dissertation Sharif University of Technology Soleymani Baghshah, Mahdieh (Author) ; Bagheri Shouraki, Saeed (Supervisor)
    Abstract
    Clustering is one of the important problems in machine learning, data mining, and pattern recognition. When the feature space used to represent the data is not suitable for discriminating between data groups, clustering may be a difficult problem that cannot be solved properly. In other words, when the Euclidean distance cannot describe the dissimilarity of data pairs appropriately, the common clustering algorithms may not be helpful and the clusters show arbitrary shapes and spread in such spaces. Although since the late 1990s several algorithms have been proposed for finding clusters of arbitrary structures, these algorithms cannot yield desirable...

    Non-linear metric learning using pairwise similarity and dissimilarity constraints and the geometrical structure of data

    Article Pattern Recognition ; Volume 43, Issue 8 , August , 2010 , Pages 2982-2992 ; 00313203 (ISSN) Soleymani Baghshah, M ; Bagheri Shouraki, S ; Sharif University of Technology
    2010
    Abstract
    The problem of clustering with side information has received much recent attention and metric learning has been considered as a powerful approach to this problem. Until now, various metric learning methods have been proposed for semi-supervised clustering. Although some of the existing methods can use both positive (must-link) and negative (cannot-link) constraints, they are usually limited to learning a linear transformation (i.e., finding a global Mahalanobis metric). In this paper, we propose a framework for learning linear and non-linear transformations efficiently. We use both positive and negative constraints and also the intrinsic topological structure of data. We formulate our metric... 
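
As a rough illustration of the positive (must-link) and negative (cannot-link) constraints mentioned in this abstract, the sketch below counts how many constraints a given cluster assignment violates. It is not taken from the paper; the function name `count_violations` and the toy labels and constraint pairs are assumptions made for the example.

```python
def count_violations(labels, must_link, cannot_link):
    """Count violated pairwise constraints for a cluster assignment.

    labels      -- cluster label of each data point, indexed by point id
    must_link   -- pairs (i, j) that should share a cluster
    cannot_link -- pairs (i, j) that should be in different clusters
    """
    # A must-link pair is violated when its points land in different clusters.
    ml_violations = sum(1 for i, j in must_link if labels[i] != labels[j])
    # A cannot-link pair is violated when its points land in the same cluster.
    cl_violations = sum(1 for i, j in cannot_link if labels[i] == labels[j])
    return ml_violations, cl_violations


# Hypothetical toy data: five points assigned to two clusters.
labels = [0, 0, 1, 1, 0]
must_link = [(0, 1), (2, 4)]
cannot_link = [(0, 2), (3, 2)]

violations = count_violations(labels, must_link, cannot_link)
print(violations)
```

A constrained clustering method would typically try to keep both counts low, either by penalizing them in its objective or by learning a metric under which the constraints are easier to satisfy.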

    Kernel-based metric learning for semi-supervised clustering

    Article Neurocomputing ; Volume 73, Issue 7-9 , 2010 , Pages 1352-1361 ; 09252312 (ISSN) Soleymani Baghshah, M ; Bagheri Shouraki, S ; Sharif University of Technology
    2010
    Abstract
    The distance metric plays an important role in many machine learning algorithms. Recently, there has been growing interest in distance metric learning for the semi-supervised setting. In the last few years, many methods have been proposed for metric learning when pairwise similarity (must-link) and/or dissimilarity (cannot-link) constraints are available along with unlabeled data. Most of these methods learn a global Mahalanobis metric (or equivalently, a linear transformation). Although some recently introduced methods have devised nonlinear extensions of linear metric learning methods, they usually allow only limited forms of distance metrics and also can use only similarity constraints. In this...
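
The remark that a global Mahalanobis metric is equivalent to a linear transformation can be checked numerically. The sketch below is illustrative only and is not the paper's method: the matrix `L`, the points `x` and `y`, and the helper names are assumptions. It builds M = LᵀL and compares the Mahalanobis distance under M with the Euclidean distance after mapping both points through L.

```python
import math


def mat_vec(A, v):
    """Multiply matrix A (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def mahalanobis(x, y, M):
    """Return sqrt((x - y)^T M (x - y)) for a positive semidefinite M."""
    d = [a - b for a, b in zip(x, y)]
    Md = mat_vec(M, d)
    return math.sqrt(sum(a * b for a, b in zip(d, Md)))


# Hypothetical linear map from R^3 to R^2.
L = [[1.0, 2.0, 0.0],
     [0.0, 1.0, -1.0]]

# Induced matrix M = L^T L, which is positive semidefinite by construction.
n = len(L[0])
M = [[sum(row[i] * row[j] for row in L) for j in range(n)] for i in range(n)]

x = [1.0, 0.0, 2.0]
y = [0.0, 1.0, 1.0]

d_metric = mahalanobis(x, y, M)                   # distance under M
d_mapped = math.dist(mat_vec(L, x), mat_vec(L, y))  # Euclidean distance after mapping

print(d_metric, d_mapped)
```

Both values agree up to floating-point error, which is why learning M and learning L are treated as the same problem in the linear case.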

    MRI Semi-Supervised Segmentation

    M.Sc. Thesis Sharif University of Technology Izadi, Azadeh (Author) ; Bagheri Shouraki, Saeed (Supervisor)
    Abstract
    Image segmentation is a technique which divides an image into significant parts. The accuracy of this technique plays an important role when it is applied to medical images. Among various image segmentation methods, clustering methods have been extensively investigated and used. Since clustering is an unsupervised method, a small amount of side information extracted from a specific application (in this case, medical images) could improve its accuracy. Using this side information in clustering methods introduces a new generation of clustering approaches called semi-supervised clustering. This information usually takes the form of pairwise constraints and can be prepared easily...

    Human Genome Sequence Analysis Using Statistical and Machine Learning Methods

    M.Sc. Thesis Sharif University of Technology Alaei, Shervin (Author) ; Manzuri Shalmani, Mohammad Taghi (Supervisor)
    Abstract
    During recent decades, dramatic advances in genetics and molecular biology have provided scientists with enormous amounts of molecular genomic information on different living organisms, from DNA sequences to complex 3D structures of proteins. This information is raw data whose analysis can provide a better understanding of genome mechanisms, discriminate healthy from tumor cells, predict disease type, support drug design based on genome information, and enable many more applications. Here, one important issue is the inevitable use of computer science and statistics to analyze these data, so as to provide, given the vast amount of data, intelligent methods that yield the most accurate...

    Probabilistic non-linear distance metric learning for constrained clustering

    Article MultiClust 2013 - 4th Workshop on Multiple Clusterings, Multi-View Data, and Multi-Source Knowledge-Driven Clustering, in Conj. with the 19th ACM SIGKDD Int. Conf. on KDD 2013 ; 2013 ; 9781450323345 (ISBN) Babagholami Mohamadabadi, B ; Zarghami, A ; Pourhaghighi, H. A ; Manzuri Shalmani, M. T ; Sharif University of Technology
    2013
    Abstract
    Distance metric learning is a powerful approach to deal with the clustering problem with side information. For semi-supervised clustering, usually a set of pairwise similarity and dissimilarity constraints is provided as supervisory information. Although some of the existing methods can use both equivalence (similarity) and inequivalence (dissimilarity) constraints, they are usually limited to learning a global Mahalanobis metric (i.e., finding a linear transformation). Moreover, they find metrics only according to the data points appearing in constraints, and cannot utilize information of other data points. In this paper, we propose a probabilistic metric learning algorithm which uses... 

    A novel semi-supervised clustering algorithm for finding clusters of arbitrary shapes

    Article 13th International Computer Society of Iran Computer Conference on Advances in Computer Science and Engineering, CSICC 2008, Kish Island, 9 March 2008 through 11 March 2008 ; Volume 6 CCIS , 2008 , Pages 876-879 ; 18650929 (ISSN); 3540899847 (ISBN); 9783540899846 (ISBN) Soleymani Baghshah, M ; Bagheri Shouraki, S ; Sharif University of Technology
    2008
    Abstract
    Recently, several algorithms have been introduced for enhancing clustering quality by using supervision in the form of constraints. These algorithms typically use the pairwise constraints either to modify the clustering objective function or to learn the clustering distance measure. Very few of these algorithms can discover clusters of different shapes while also satisfying the provided constraints. In this paper, a novel semi-supervised clustering algorithm is introduced that uses the side information and finds clusters of arbitrary shapes. This algorithm uses a two-stage clustering approach satisfying the pairwise constraints. In the first stage, the data points...

    Low-rank kernel learning for semi-supervised clustering

    Article Proceedings of the 9th IEEE International Conference on Cognitive Informatics, ICCI 2010, 7 July 2010 through 9 July 2010, Beijing ; 2010 , Pages 567-572 ; 9781424480401 (ISBN) Soleymani Baghshah, M ; Bagheri Shouraki, S ; Sharif University of Technology
    2010
    Abstract
    In the last decade, there has been a growing interest in distance function learning for semi-supervised clustering settings. In addition to the earlier methods that learn Mahalanobis metrics (or equivalently, linear transformations), some nonlinear metric learning methods have also been recently introduced. However, these methods either allow limited choice of distance metrics yielding limited flexibility or learn nonparametric kernel matrices and scale very poorly (prohibiting applicability to medium and large data sets). In this paper, we propose a novel method that learns low-rank kernel matrices from pairwise constraints and unlabeled data. We formulate the proposed method as a trace... 

    Semi-supervised metric learning using pairwise constraints

    Article 21st International Joint Conference on Artificial Intelligence, IJCAI-09, Pasadena, CA, 11 July 2009 through 17 July 2009 ; 2009 , Pages 1217-1222 ; 10450823 (ISSN) ; 9781577354260 (ISBN) Soleymani Baghshah, M ; Bagheri Shouraki, S ; Sharif University of Technology
    Abstract
    The distance metric has an important role in many machine learning algorithms. Recently, metric learning for semi-supervised algorithms has received much attention. For semi-supervised clustering, usually a set of pairwise similarity and dissimilarity constraints is provided as supervisory information. Various metric learning methods utilizing pairwise constraints have been proposed. The existing methods that can consider both positive (must-link) and negative (cannot-link) constraints find linear transformations or, equivalently, global Mahalanobis metrics. Additionally, they find metrics only according to the data points appearing in constraints (without considering other data...