Search for: multi-modal
Total 29 records

    A new real-coded Bayesian optimization algorithm based on a team of learning automata for continuous optimization

    , Article Genetic Programming and Evolvable Machines ; Vol. 15, Issue. 2 , 2014 , pp. 169-193 ; ISSN: 13892576 Moradabadi, B ; Beigy, H ; Sharif University of Technology
    Abstract
    Estimation of distribution algorithms have evolved as a technique for estimating population distribution in evolutionary algorithms. They estimate the distribution of the candidate solutions and then sample the next generation from the estimated distribution. Bayesian optimization algorithm is an estimation of distribution algorithm, which uses a Bayesian network to estimate the distribution of candidate solutions and then generates the next generation by sampling from the constructed network. The experimental results show that the Bayesian optimization algorithms are capable of identifying correct linkage between the variables of optimization problems. Since the problem of finding the... 

    Multi-Modal Distance Metric Learning

    , M.Sc. Thesis Sharif University of Technology Roostaiyan, Mahdi (Author) ; Soleymani, Mahdieh (Supervisor)
    Abstract
    In many real-world applications, data contain multiple input channels (e.g., web pages include text, images, etc.). In these cases, supervisory information may also be available in the form of distance constraints, such as similar and dissimilar pairs obtained from user feedback. Distance metric learning in these environments can serve different goals such as retrieval and recommendation. In this research, we use dual-wing harmoniums to combine the text and image modalities into a unified latent space when similar-dissimilar pairs are available. The Euclidean distance between data represented in this latent space is used as the distance metric. In this thesis, we extend the dual-wing harmoniums for... 

    Deep Learning for Multimodal Data

    , M.Sc. Thesis Sharif University of Technology Rastegar, Sarah (Author) ; Soleymani, Mahdieh (Supervisor)
    Abstract
    Recent advances in data recording have led to different modalities such as text, image, audio and video. Images are annotated and audio accompanies video. Because the modalities have distinct statistical properties, shallow methods have been unsuccessful in finding a shared representation that retains the most information about the different modalities. Recently, deep networks have been used for extracting high-level representations of multimodal data. In previous methods, one modality-specific network was learned for each modality, and high-level representations for the different modalities were thus extracted. Since these high-level representations differ less than the raw modalities, a shared... 

    Multi-modal distance metric learning: A bayesian non-parametric approach

    , Article Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 6 September 2014 through 12 September 2014 ; Volume 8927 , September , 2015 , Pages 63-77 ; 03029743 (ISSN) ; 9783319161983 (ISBN) Babagholami Mohamadabadi, B ; Roostaiyan, S. M ; Zarghami, A ; Baghshah, M. S ; Rother, C ; Agapito, L ; Bronstein, M. M ; Sharif University of Technology
    Springer Verlag  2015
    Abstract
    In many real-world applications (e.g., social media applications), data usually consist of diverse input modalities that originate from various heterogeneous sources. Learning a similarity measure for such data is of great importance for a vast number of applications such as classification, clustering, retrieval, etc. Defining an appropriate distance metric between data points with multiple modalities is a key challenge that has a great impact on the performance of many multimedia applications. Existing approaches for multi-modal distance metric learning only offer point estimation of the distance matrix and/or latent features, and can therefore be unreliable when the number of training... 

    Efficient multi-modal fusion on supergraph for scalable image annotation

    , Article Pattern Recognition ; Volume 48, Issue 7 , July , 2015 , Pages 2241-2253 ; 00313203 (ISSN) Amiri, S. H ; Jamzad, M ; Sharif University of Technology
    Elsevier Ltd  2015
    Abstract
    Different types of visual features provide multi-modal representation for images in the annotation task. Conventional graph-based image annotation methods integrate various features into a single descriptor and consider one node for each descriptor on the learning graph. However, this graph does not capture the information of individual features, making it unsuitable for propagating the labels of annotated images. In this paper, we address this issue by proposing an approach for fusing the visual features such that a specific subgraph is constructed for each visual modality and then subgraphs are connected to form a supergraph. As the size of supergraph grows linearly with the number of... 

    Two multimodal approaches for single microphone source separation

    , Article European Signal Processing Conference, 28 August 2016 through 2 September 2016 ; Volume 2016-November , 2016 , Pages 110-114 ; 22195491 (ISSN) ; 9780992862657 (ISBN) Sedighin, F ; Babaie Zadeh, M ; Rivet, B ; Jutten, C ; Sharif University of Technology
    European Signal Processing Conference, EUSIPCO  2016
    Abstract
    In this paper, the problem of single microphone source separation via Nonnegative Matrix Factorization (NMF) by exploiting video information is addressed. The audio and video modalities coming from the same human speech usually have similar time changes. This means that changes in one of them usually correspond to changes in the other. It is therefore expected that the activation coefficient matrices of their NMF decompositions are similar. Based on this similarity, in this paper the activation coefficient matrix of the video modality is used as an initialization for audio source separation via NMF. In addition, the mentioned similarity is used for post-processing and for clustering the rows... 

    MDL-CW: A multimodal deep learning framework with cross weights

    , Article 2016 IEEE Conference on Computer Vision and Pattern Recognition, 26 June 2016 through 1 July 2016 ; Volume 2016-January , 2016 , Pages 2601-2609 ; 10636919 (ISSN) ; 9781467388511 (ISBN) Rastegar, S ; Soleymani Baghshah, M ; Rabiee, H. R ; Shojaee, S. M ; Sharif University of Technology
    IEEE Computer Society 
    Abstract
    Deep learning has received much attention as one of the most powerful approaches for multimodal representation learning in recent years. An ideal model for multimodal data can reason about missing modalities using the available ones, and usually provides more information when multiple modalities are being considered. All the previous deep models contain separate modality-specific networks and find a shared representation on top of those networks. Therefore, they only consider high level interactions between modalities to find a joint representation for them. In this paper, we propose a multimodal deep learning framework (MDLCW) that exploits the cross weights between representation of... 

    Multi Modal Traffic Assignment with Mode Choice Models

    , M.Sc. Thesis Sharif University of Technology Azizian, Hossein (Author) ; Zakai Ashtiani, Hedayat (Supervisor)
    Abstract
    Traffic congestion is one of the most important issues facing modern societies, with many impacts such as environmental pollution and the waste of physical and spiritual energy. One of the traffic congestion mitigation strategies is network management, which encompasses a diverse range of tools. Efficient management of an urban transportation network requires information about that network. The traffic equilibrium in the transportation network is one of the fundamental pieces of information required by most network management tools. In this study, a multi-modal traffic assignment model with complementarity structure is proposed that can determine the multi-modal traffic equilibrium... 

    Answering Questions about Image Contents by Deep Networks

    , M.Sc. Thesis Sharif University of Technology Chavoshian, Mohammad (Author) ; Soleymani Baghshah, Mahdieh (Supervisor)
    Abstract
    Due to the recent advances in the learning of multimodal data, humans tend to use computer systems to solve more complex problems. One of them is Visual Question Answering (VQA), where the goal is to find the answer to a question asked about the visual contents of a given image. This is an interdisciplinary problem spanning the areas of Computer Vision, Natural Language Processing and Reasoning. Because of the recent achievements of Deep Neural Networks in these areas, recent works have used them to address the VQA task. In this thesis, three different methods are proposed, each of which can be added to existing solutions to the VQA problem to improve their results. The first method tries to... 

    An attribute learning method for zero-shot recognition

    , Article 2017 25th Iranian Conference on Electrical Engineering, ICEE 2017, 2 May 2017 through 4 May 2017 ; 2017 , Pages 2235-2240 ; 9781509059638 (ISBN) Yazdanian, R ; Shojaee, S. M ; Soleymani Baghshah, M ; Sharif University of Technology
    Abstract
    Recently, the problem of integrating side information about classes has emerged in learning settings such as zero-shot learning. Although using multiple sources of information about the input space has been investigated in the last decade and many multi-view and multi-modal learning methods have already been introduced, attribute learning for classes (the output space) is a new problem that has been addressed only in the last few years. In this paper, we propose an attribute learning method that can use different sources of descriptions for classes to find new attributes that are better suited to being used as class signatures. Experimental results show that the learned attributes by the proposed... 

    Multimodal soft nonnegative matrix co-factorization for convolutive source separation

    , Article IEEE Transactions on Signal Processing ; Volume 65, Issue 12 , 2017 , Pages 3179-3190 ; 1053587X (ISSN) Sedighin, F ; Babaie Zadeh, M ; Rivet, B ; Jutten, C ; Sharif University of Technology
    Abstract
    In this paper, the problem of convolutive source separation via multimodal soft Nonnegative Matrix Co-Factorization (NMCF) is addressed. Different aspects of a phenomenon may be recorded by sensors of different types (e.g., audio and video of human speech), and each of these recorded signals is called a modality. Since the underlying phenomenon of the modalities is the same, they have some similarities. In particular, they usually have similar time changes. This means that changes in one of them usually correspond to changes in the other, so their active and inactive periods are usually similar. Assuming this similarity, it is expected that the activation coefficient matrices of their... 

    Multi-modal deep distance metric learning

    , Article Intelligent Data Analysis ; Volume 21, Issue 6 , 2017 , Pages 1351-1369 ; 1088467X (ISSN) Roostaiyan, S. M ; Imani, E ; Soleymani Baghshah, M ; Sharif University of Technology
    IOS Press  2017
    Abstract
    In many real-world applications, data contain heterogeneous input modalities (e.g., web pages include images, text, etc.). Moreover, data such as images are usually described using different views (i.e., different sets of features). Learning a distance metric or similarity measure that originates from all input modalities or views is essential for many tasks such as content-based retrieval. In these cases, similar and dissimilar pairs of data can be used to find a better representation of data in which similarity and dissimilarity constraints are better satisfied. In this paper, we incorporate supervision in the form of pairwise similarity and/or dissimilarity constraints into... 

    A new algorithm for multimodal soft coupling

    , Article 13th International Conference on Latent Variable Analysis and Signal Separation, LVA/ICA 2017, 21 February 2017 through 23 February 2017 ; Volume 10169 LNCS , 2017 , Pages 162-171 ; 03029743 (ISSN); 9783319535463 (ISBN) Sedighin, F ; Babaie Zadeh, M ; Rivet, B ; Jutten, C ; Sharif University of Technology
    Springer Verlag  2017
    Abstract
    In this paper, the problem of multimodal soft coupling under the Bayesian framework is investigated for the case where the variance of the probabilistic model is unknown. The similarity of the shared factors resulting from Nonnegative Matrix Factorization (NMF) of multimodal data sets is controlled in a soft manner by using a probabilistic model. In previous works, it is supposed that the probabilistic model and its parameters are known. However, this assumption does not always hold. In this paper it is supposed that the probabilistic model is already known but its variance is unknown. The proposed algorithm therefore estimates the variance of the probabilistic model along with the other parameters during the factorization... 

    Leveraging multi-modal fusion for graph-based image annotation

    , Article Journal of Visual Communication and Image Representation ; Volume 55 , 2018 , Pages 816-828 ; 10473203 (ISSN) Amiri, S. H ; Jamzad, M ; Sharif University of Technology
    Academic Press Inc  2018
    Abstract
    Considering each of the visual features as one modality in the image annotation task, efficient fusion of different modalities is essential in graph-based learning. Traditional graph-based methods consider one node for each image and combine its visual features into a single descriptor before constructing the graph. In this paper, we propose an approach that constructs a subgraph for each modality in such a way that the edges of the subgraph are determined using a search-based approach that handles the class-imbalance challenge in annotation datasets. Multiple subgraphs are then connected to each other to form a supergraph. This is followed by introducing a learning framework to infer the tags of... 

    A Solution for Network Design Problem by B.O.T (Build-Operate-Transfer) and Considering Uncertain Parameters of the Problem

    , M.Sc. Thesis Sharif University of Technology Qiam, Shirin (Author) ; Poorzahedy, Hossain (Supervisor)
    Abstract
    Governments are concurrently faced with two problems in Network Development Problem (NDP): limited resources to invest, and the economic justification of the candidate projects. Public-Private Participation (PPP) is a solution for the first problem, in which Build-Operate-Transfer (BOT) is a scheme to implement this partnership. This study formulates a NDP, in which two sources of funds back projects’ construction: that of the private investor and the public budget for this purpose. The problem is seen as bi-level optimization, with the upper level dealing with the government decisions on the level of participation (from 0 to 100 percent of the project costs), and the lower level being a... 

    Multimodal Blind Source Separation

    , Ph.D. Dissertation Sharif University of Technology Sedighin, Farnaz (Author) ; Babaie-Zadeh, Massoud (Supervisor)
    Abstract
    Blind Source Separation (BSS) is a challenging task in signal processing which aims to separate sources from their mixtures when no information is available about the sources or the mixing system. Different approaches have already been proposed for source separation. However, during the last decade, new approaches based on multimodal nature of phenomena have been proposed for source separation. Different aspects of a multimodal phenomenon can be measured by means of different instruments where each of the measured signals is called a modality of that phenomenon. Although the modalities are different signals with different features, due to the same physical origin, they usually have some... 

    Evaluation of modal incremental dynamic analysis, using input energy intensity and modified bilinear curve

    , Article Structural Design of Tall and Special Buildings ; Volume 18, Issue 5 , 2009 , Pages 573-586 ; 15417794 (ISSN) Zarfam, P ; Mofid, M ; Sharif University of Technology
    2009
    Abstract
    In this paper, a technique for studying the nonlinear performance of structures at different earthquake levels is developed. In this method, the Incremental Dynamic Analysis (IDA) curves are not obtained from nonlinear dynamic analysis of the multi-degree-of-freedom (MDF) structure. Instead, the procedure for constructing these curves is based on modelling the entire structure with several single-degree-of-freedom (SDF) structures and evaluating them through the modal pushover analysis method. An innovative idea for approximating pushover curves that is based on error distribution is introduced in this investigation. Furthermore, the total input energy applied towards the SDF oscillator,... 

    Self-attention equipped graph convolutions for disease prediction

    , Article 16th IEEE International Symposium on Biomedical Imaging, ISBI 2019, 8 April 2019 through 11 April 2019 ; Volume 2019-April , 2019 , Pages 1896-1899 ; 19457928 (ISSN) ; 9781538636411 (ISBN) Kazi, A ; Krishna, S. A ; Shekarforoush, S ; Kortuem, K ; Albarqouni, S ; Navab, N ; Sharif University of Technology
    IEEE Computer Society  2019
    Abstract
    Multi-modal data comprising imaging (MRI, fMRI, PET, etc.) and non-imaging (clinical test, demographics, etc.) data can be collected together and used for disease prediction. Such diverse data gives complementary information about the patient's condition to make an informed diagnosis. A model capable of leveraging the individuality of each multi-modal data is required for better disease prediction. We propose a graph convolution based deep model which takes into account the distinctiveness of each element of the multi-modal data. We incorporate a novel self-attention layer, which weights every element of the demographic data by exploring its relation to the underlying disease. We demonstrate... 

    Brain tumor segmentation based on 3D neighborhood features using rule-based learning

    , Article 11th International Conference on Machine Vision, ICMV 2018, 1 November 2018 through 3 November 2018 ; Volume 11041 , 2019 ; 0277786X (ISSN); 9781510627482 (ISBN) Barzegar, Z ; Jamzad, M ; Sharif University of Technology
    SPIE  2019
    Abstract
    In order to plan precise treatment or accurate tumor removal surgery, brain tumor segmentation is critical for detecting all parts of the tumor and its surrounding tissues. To visualize brain anatomy and detect its abnormalities, we use multi-modal Magnetic Resonance Imaging (MRI) as input. This paper introduces an efficient and automated algorithm based on the 3D bit-plane neighborhood concept for brain tumor segmentation using a rule-based learning algorithm. In the proposed approach, in addition to using intensity values in each slice, we consider sets of three consecutive slices to extract information from the 3D neighborhood. We construct a rule base using a sequential covering algorithm. Through... 
