Active distance-based clustering using k-medoids

, Article Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 19 April 2016 through 22 April 2016 ; Volume 9651 , 2016 , Pages 253-264 ; 03029743 (ISSN) ; 9783319317526 (ISBN) Aghaee, A ; Ghadiri, M ; Soleymani Baghshah, M ; Sharif University of Technology

Springer Verlag 2016

Abstract

The k-medoids algorithm is a partitional, centroid-based clustering algorithm that uses pairwise distances between data points and tries to directly decompose a dataset of n points into a set of k disjoint clusters. However, k-medoids itself requires all distances between data points, which are not easy to obtain in many applications. In this paper, we introduce a new method which requires only a small proportion of the whole set of distances and estimates an upper bound for the unknown distances using the inquired ones. This algorithm makes use of the triangle inequality to calculate an upper-bound estimate of the unknown distances. Our method is built upon a recursive approach to...
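The triangle-inequality bound mentioned in this abstract can be illustrated with a minimal sketch. This is not the paper's recursive algorithm (the abstract is truncated before describing it); it only shows the basic idea that, for any pivot p whose distances to x and y are known, d(x, y) <= d(x, p) + d(p, y). The function name `upper_bound`, the `known` dictionary layout, and the sample points are assumptions made for illustration.

```python
import math


def upper_bound(known, x, y, points):
    """Tightest triangle-inequality upper bound on d(x, y).

    `known` maps frozenset({u, v}) to an already-queried distance.
    Hypothetical helper for illustration, not the paper's method.
    """
    if x == y:
        return 0.0
    direct = known.get(frozenset((x, y)))
    if direct is not None:
        return direct  # the distance was queried, no estimate needed
    best = math.inf
    for p in points:
        if p == x or p == y:
            continue
        d_xp = known.get(frozenset((x, p)))
        d_py = known.get(frozenset((p, y)))
        if d_xp is not None and d_py is not None:
            # d(x, y) <= d(x, p) + d(p, y) for every pivot p
            best = min(best, d_xp + d_py)
    return best  # math.inf when no pivot connects x and y


# Illustrative data: four queried distances among five points.
points = ["a", "b", "c", "d", "e"]
known = {
    frozenset(("a", "b")): 1.0,
    frozenset(("b", "c")): 2.0,
    frozenset(("a", "d")): 0.5,
    frozenset(("d", "c")): 1.0,
}

print(upper_bound(known, "a", "c", points))  # 1.5, via pivot "d"
```

The bound tightens as more distances are queried, and a pair with no shared pivot stays at infinity until further queries are made.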

An attribute learning method for zero-shot recognition

, Article 2017 25th Iranian Conference on Electrical Engineering, ICEE 2017, 2 May 2017 through 4 May 2017 ; 2017 , Pages 2235-2240 ; 9781509059638 (ISBN) Yazdanian, R ; Shojaee, S. M ; Soleymani Baghshah, M ; Sharif University of Technology

2017

Abstract

Recently, the problem of integrating side information about classes has emerged in learning settings such as zero-shot learning. Although using multiple sources of information about the input space has been investigated in the last decade and many multi-view and multi-modal learning methods have already been introduced, attribute learning for classes (the output space) is a new problem that has received attention only in the last few years. In this paper, we propose an attribute learning method that can use different sources of class descriptions to find new attributes that are better suited to serve as class signatures. Experimental results show that the learned attributes by the proposed...

DGSAN: Discrete generative self-adversarial network

, Article Neurocomputing ; Volume 448 , 2021 , Pages 364-379 ; 09252312 (ISSN) Montahaei, E ; Alihosseini, D ; Soleymani Baghshah, M ; Sharif University of Technology

Elsevier B.V 2021

Abstract

Although GAN-based methods have achieved considerable success in the last few years, they have not been entirely successful in generating discrete data. The most crucial challenge of these methods is the difficulty of passing the gradient from the discriminator to the generator when the generator outputs are discrete. Although several attempts have been made to alleviate this problem, none of the existing GAN-based methods has improved the performance of text generation over the maximum likelihood approach in terms of both quality and diversity. In this paper, we propose a new framework for generating discrete data by an adversarial approach in which there is no...

Transformer-based deep neural network language models for Alzheimer’s disease risk assessment from targeted speech

, Article BMC Medical Informatics and Decision Making ; Volume 21, Issue 1 , 2021 ; 14726947 (ISSN) Roshanzamir, A ; Aghajan, H ; Soleymani Baghshah, M ; Sharif University of Technology

BioMed Central Ltd 2021

Abstract

Background: We developed transformer-based deep learning models based on natural language processing for early risk assessment of Alzheimer’s disease from the picture description test. Methods: The lack of large datasets poses the most important limitation for using complex models that do not require feature engineering. Transformer-based pre-trained deep language models have recently made a large leap in NLP research and application. These models are pre-trained on available large datasets to understand natural language texts appropriately, and are shown to subsequently perform well on classification tasks with small training sets. The overall classification model is a simple classifier on...

Multi-modal deep distance metric learning

, Article Intelligent Data Analysis ; Volume 21, Issue 6 , 2017 , Pages 1351-1369 ; 1088467X (ISSN) Roostaiyan, S. M ; Imani, E ; Soleymani Baghshah, M ; Sharif University of Technology

IOS Press 2017

Abstract

In many real-world applications, data contain heterogeneous input modalities (e.g., web pages include images, text, etc.). Moreover, data such as images are usually described using different views (i.e., different sets of features). Learning a distance metric or similarity measure that originates from all input modalities or views is essential for many tasks such as content-based retrieval. In these cases, similar and dissimilar pairs of data can be used to find a better representation of data in which similarity and dissimilarity constraints are better satisfied. In this paper, we incorporate supervision in the form of pairwise similarity and/or dissimilarity constraints into...

Sample complexity of classification with compressed input

, Article Neurocomputing ; Volume 415 , 2020 , Pages 286-294 Hafez Kolahi, H ; Kasaei, S ; Soleymani Baghshah, M ; Sharif University of Technology

Elsevier B.V 2020

Abstract

One of the most studied problems in machine learning is finding reasonable constraints that guarantee the generalization of a learning algorithm. These constraints are usually expressed as some simplicity assumptions on the target. For instance, in the Vapnik–Chervonenkis (VC) theory the space of possible hypotheses is considered to have a limited VC dimension. One way to formulate the simplicity assumption is via information theoretic concepts. In this paper, the constraint on the entropy H(X) of the input variable X is studied as a simplicity assumption. It is proven that the sample complexity to achieve an ∊-δ Probably Approximately Correct (PAC) hypothesis is bounded by [Formula...

An FPCA-based color morphological filter for noise removal

, Article Scientia Iranica ; Volume 16, Issue 1 D , 2009 , Pages 8-18 ; 10263098 (ISSN) Soleymani Baghshah, M ; Kasaei, S ; Sharif University of Technology

2009

Abstract

Morphological filtering is a useful technique for the processing and analysis of binary and gray scale images. The extension of morphological techniques to color images is not a straightforward task because this extension stems from the multivariate ordering problem. Since multivariate ordering is ambiguous, existing approaches have used known vector ordering schemes for the color ordering purpose. In the last decade, many different color morphological operators have been introduced in the literature. Some of them have focused on noise suppression purposes. However, none has shown good performance, especially on edgy regions. In this paper, new color morphological operators, based on a...

Low-rank kernel learning for semi-supervised clustering

, Article Proceedings of the 9th IEEE International Conference on Cognitive Informatics, ICCI 2010, 7 July 2010 through 9 July 2010, Beijing ; 2010 , Pages 567-572 ; 9781424480401 (ISBN) Soleymani Baghshah, M ; Bagheri Shouraki, S ; Sharif University of Technology

2010

Abstract

In the last decade, there has been a growing interest in distance function learning for semi-supervised clustering settings. In addition to the earlier methods that learn Mahalanobis metrics (or equivalently, linear transformations), some nonlinear metric learning methods have also been recently introduced. However, these methods either allow limited choice of distance metrics yielding limited flexibility or learn nonparametric kernel matrices and scale very poorly (prohibiting applicability to medium and large data sets). In this paper, we propose a novel method that learns low-rank kernel matrices from pairwise constraints and unlabeled data. We formulate the proposed method as a trace...
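The equivalence noted in this abstract between a Mahalanobis metric and a linear transformation can be checked numerically. The sketch below is a generic illustration, not the method of the paper: for any matrix L, the squared Mahalanobis distance under M = L^T L equals the squared Euclidean distance between Lx and Ly. All function names and the sample matrix are assumptions chosen for the example.

```python
def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def gram(L):
    """Return M = L^T L, which is always positive semidefinite."""
    cols = list(zip(*L))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


def mahalanobis_sq(M, x, y):
    """Squared Mahalanobis distance (x - y)^T M (x - y)."""
    diff = [a - b for a, b in zip(x, y)]
    return sum(d * m for d, m in zip(diff, matvec(M, diff)))


def transformed_sq(L, x, y):
    """Squared Euclidean distance between L x and L y."""
    return sum((a - b) ** 2 for a, b in zip(matvec(L, x), matvec(L, y)))


# Illustrative linear transformation and points.
L = [[2.0, 0.0],
     [1.0, 1.0]]
x, y = [1.0, 2.0], [0.0, 0.0]

print(mahalanobis_sq(gram(L), x, y))  # 13.0
print(transformed_sq(L, x, y))        # 13.0
```

Because the two quantities agree for every pair of points, learning M and learning L are interchangeable parameterizations, which is why the abstracts in this listing treat a global Mahalanobis metric as a linear method.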

Non-linear metric learning using pairwise similarity and dissimilarity constraints and the geometrical structure of data

, Article Pattern Recognition ; Volume 43, Issue 8 , August , 2010 , Pages 2982-2992 ; 00313203 (ISSN) Soleymani Baghshah, M ; Bagheri Shouraki, S ; Sharif University of Technology

2010

Abstract

The problem of clustering with side information has received much recent attention and metric learning has been considered as a powerful approach to this problem. Until now, various metric learning methods have been proposed for semi-supervised clustering. Although some of the existing methods can use both positive (must-link) and negative (cannot-link) constraints, they are usually limited to learning a linear transformation (i.e., finding a global Mahalanobis metric). In this paper, we propose a framework for learning linear and non-linear transformations efficiently. We use both positive and negative constraints and also the intrinsic topological structure of data. We formulate our metric...

Efficient kernel learning from constraints and unlabeled data

, Article Proceedings - International Conference on Pattern Recognition, 23 August 2010 through 26 August 2010, Istanbul ; 2010 , Pages 3364-3367 ; 10514651 (ISSN) ; 9780769541099 (ISBN) Soleymani Baghshah, M ; Bagheri Shouraki, S ; Sharif University of Technology

2010

Abstract

Recently, distance metric learning has received increasing attention and has been found to be a powerful approach for semi-supervised learning tasks. In the last few years, several methods have been proposed for metric learning when must-link and/or cannot-link constraints are available as supervisory information. Although many of these methods learn global Mahalanobis metrics, some recently introduced methods have tried to learn more flexible distance metrics using a kernel-based approach. In this paper, we consider the problem of kernel learning from both pairwise constraints and unlabeled data. We propose a method that adapts a flexible distance metric via learning a nonparametric kernel...

Kernel-based metric learning for semi-supervised clustering

, Article Neurocomputing ; Volume 73, Issue 7-9 , 2010 , Pages 1352-1361 ; 09252312 (ISSN) Soleymani Baghshah, M ; Bagheri Shouraki, S ; Sharif University of Technology

2010

Abstract

The distance metric plays an important role in many machine learning algorithms. Recently, there has been growing interest in distance metric learning for the semi-supervised setting. In the last few years, many methods have been proposed for metric learning when pairwise similarity (must-link) and/or dissimilarity (cannot-link) constraints are available along with unlabeled data. Most of these methods learn a global Mahalanobis metric (or equivalently, a linear transformation). Although some recently introduced methods have devised nonlinear extensions of linear metric learning methods, they usually allow only limited forms of distance metrics and can use only similarity constraints. In this...

Semi-supervised metric learning using pairwise constraints

, Article 21st International Joint Conference on Artificial Intelligence, IJCAI-09, Pasadena, CA, 11 July 2009 through 17 July 2009 ; 2009 , Pages 1217-1222 ; 10450823 (ISSN) ; 9781577354260 (ISBN) Soleymani Baghshah, M ; Bagheri Shouraki, S ; Sharif University of Technology

2009

Abstract

The distance metric has an important role in many machine learning algorithms. Recently, metric learning for semi-supervised algorithms has received much attention. For semi-supervised clustering, usually a set of pairwise similarity and dissimilarity constraints is provided as supervisory information. Until now, various metric learning methods utilizing pairwise constraints have been proposed. The existing methods that can consider both positive (must-link) and negative (cannot-link) constraints find linear transformations or equivalently global Mahalanobis metrics. Additionally, they find metrics only according to the data points appearing in constraints (without considering other data...

Metric learning for semi-supervised clustering using pairwise constraints and the geometrical structure of data

, Article Intelligent Data Analysis ; Volume 13, Issue 6 , 2009 , Pages 887-899 ; 1088467X (ISSN) Baghshah Soleymani, B ; Bagheri Shouraki, S ; Sharif University of Technology

2009

Abstract

Metric learning is a powerful approach for semi-supervised clustering. In this paper, a metric learning method considering both pairwise constraints and the geometrical structure of data is introduced for semi-supervised clustering. At first, a smooth metric is found (based on an optimization problem) using positive constraints as supervisory information. Then, an extension of this method employing both positive and negative constraints is introduced. As opposed to the existing methods, the extended method has the capability of considering both positive and negative constraints while considering the topological structure of data. The proposed metric learning method can improve performance of...

Finding arbitrary shaped clusters and color image segmentation

, Article 1st International Congress on Image and Signal Processing, CISP 2008, Sanya, Hainan, 27 May 2008 through 30 May 2008 ; Volume 1 , 2008 , Pages 593-597 ; 9780769531199 (ISBN) Soleymani Baghshah, M ; Bagheri Shouraki, S ; Sharif University of Technology

2008

Abstract

One of the most famous approaches for the segmentation of color images is finding clusters in the color space. Shapes of these clusters are often complex and the time complexity of the existing algorithms for finding clusters of different shapes is usually high. In this paper, a novel clustering algorithm is proposed and used for the image segmentation purpose. This algorithm distinguishes clusters of different shapes using a two-stage clustering approach in a reasonable time. In the first stage, the mean-shift clustering algorithm is used and the data points are grouped into some sub-clusters. In the second stage, connections between sub-clusters are established according to a dissimilarity...

A novel semi-supervised clustering algorithm for finding clusters of arbitrary shapes

, Article 13th International Computer Society of Iran Computer Conference on Advances in Computer Science and Engineering, CSICC 2008, Kish Island, 9 March 2008 through 11 March 2008 ; Volume 6 CCIS , 2008 , Pages 876-879 ; 18650929 (ISSN); 3540899847 (ISBN); 9783540899846 (ISBN) Soleymani Baghshah, M ; Bagheri Shouraki, S ; Sharif University of Technology

2008

Abstract

Recently, several algorithms have been introduced for enhancing clustering quality by using supervision in the form of constraints. These algorithms typically utilize the pairwise constraints either to modify the clustering objective function or to learn the clustering distance measure. Very few of these algorithms are able to discover clusters of different shapes while satisfying the provided constraints. In this paper, a novel semi-supervised clustering algorithm is introduced that uses the side information and finds clusters of arbitrary shapes. This algorithm uses a two-stage clustering approach satisfying the pairwise constraints. In the first stage, the data points...

A fuzzy clustering algorithm for finding arbitrary shaped clusters

, Article 6th IEEE/ACS International Conference on Computer Systems and Applications, AICCSA 2008, Doha, 31 March 2008 through 4 April 2008 ; 2008 , Pages 559-566 ; 9781424419685 (ISBN) Soleymani Baghshah, M ; Bagheri Shouraki, S ; Sharif University of Technology

2008

Abstract

Until now, many algorithms have been introduced for finding arbitrary shaped clusters, but none of these algorithms is able to identify all sorts of cluster shapes and structures that are encountered in practice. Furthermore, the time complexity of the existing algorithms is usually high and applying them on large datasets is time-consuming. In this paper, a novel fast clustering algorithm is proposed. This algorithm distinguishes clusters of different shapes using a two-stage clustering approach. In the first stage, the data points are grouped into a relatively large number of fuzzy ellipsoidal sub-clusters. Then, connections between sub-clusters are established according to the Bhattacharyya...

Adaptation for Evolving Domains

, M.Sc. Thesis Sharif University of Technology Bitarafan, Adeleh (Author) ; Soleymani Baghshah, Mahdieh (Supervisor)

Abstract

Until now, many domain adaptation methods have been proposed. A major limitation of almost all of these methods is their assumption that all test data belong to a single stationary target distribution and that a large amount of unlabeled data is available for modeling this target distribution. In fact, in many real-world applications, such as classifying scene images with gradually changing lighting and spam email identification, data arrive sequentially and the data distribution is continuously evolving. In this thesis, we tackle the recently introduced problem of adaptation to a continuously evolving target domain and propose the Evolving Domain Adaptation (EDA) method to classify...

Cancer Prediction Using cfDNA Methylation Patterns With Deep Learning Approach

, M.Sc. Thesis Sharif University of Technology Mahdavi, Fatemeh (Author) ; Soleymani Baghshah, Mahdieh (Supervisor)

Abstract

Liquid biopsy provides information about the progress of the tumor, the effectiveness of the treatment and the possibility of tumor metastasis. This type of biopsy obtains this information by detecting and enumerating genetic variations in cells and cell-free DNA (cfDNA). Only a small fraction of cfDNA, which might be circulating tumor DNA (ctDNA) fragments, has mutations and is usually identified by epigenetic variations. On the other hand, the use of liquid biopsy has decreased, and tumors in the final stages are often untreatable due to the low accuracy in prediction of cancer. In this research, the aim is to predict cancer using cfDNA methylation patterns. We obtain these...

Answering Questions about Image Contents by Deep Networks

, M.Sc. Thesis Sharif University of Technology Chavoshian, Mohammad (Author) ; Soleymani Baghshah, Mahdieh (Supervisor)

Abstract

Due to the recent advances in the learning of multimodal data, humans tend to use computer systems to solve more complex problems. One of them is Visual Question Answering (VQA), where the goal is to find the answer to a question asked about the visual contents of a given image. This is an interdisciplinary problem between the areas of Computer Vision, Natural Language Processing and Reasoning. Because of recent achievements of Deep Neural Networks in these areas, recent works have used them to address the VQA task. In this thesis, three different methods are proposed, each of which can be added to existing solutions to the VQA problem to improve their results. The first method tries to...

Adversarial Robustness of Deep Neural Networks in Text Domain

, M.Sc. Thesis Sharif University of Technology Behjati, Melika (Author) ; Soleymani Baghshah, Mahdieh (Supervisor)

Abstract

In recent years, neural networks have been widely used in most machine learning domains. However, it has been shown that these networks are vulnerable to adversarial examples. Adversarial examples are inputs with small, imperceptible perturbations that lead the network to produce a wrong output, thus fooling it. This becomes an important issue in security-related applications of deep neural networks, such as self-driving cars and medical diagnostics, since, in the worst-case scenario, even human lives could be threatened. Although many works have focused on crafting adversarial examples for image data, only a few studies have been done on textual data due to the existing...