Search for:
gaussian-mixture-model
Total 26 records
Image interpolation using Gaussian Mixture Models with spatially constrained patch clustering
, Article ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 19 April 2014 through 24 April 2014 ; Volume 2015-August , April , 2015 , Pages 1613-1617 ; 15206149 (ISSN) ; 9781467369978 (ISBN) ; Rabbani, H ; Babaie Zadeh, M ; Jutten, C ; Sharif University of Technology
Institute of Electrical and Electronics Engineers Inc
2015
Abstract
In this paper, we address the problem of image interpolation using Gaussian Mixture Models (GMM) as a prior. Previous methods of image restoration with GMM have not considered the spatial (geometric) distance between patches in clustering, and so fail to fully exploit the coherence of nearby patches. The GMM framework in our method for image interpolation is based on the assumption that the accumulation of similar patches in a neighborhood is derived from a multivariate Gaussian probability distribution with a specific covariance and mean. An Expectation Maximization-like (EM-like) algorithm is used to determine the patches in a cluster and restore them. The results show that our image...
Variational bayesian approximation. A rigorous approach
, Article Proceedings of the Romanian Academy Series A - Mathematics Physics Technical Sciences Information Science ; Volume 23, Issue 2 , 2022 , Pages 107-112 ; 14549069 (ISSN) ; Sharif University of Technology
Publishing House of the Romanian Academy
2022
Abstract
We apply the theory of optimal transport to study the mathematical properties of the mean field variational Bayesian approximation. It turns out that if K + C > 0, where C is the convexity coefficient of −log p and K is a lower bound for the Ricci curvature of the underlying parameter space, then the corresponding system of equations of the variational Bayesian approximation admits a unique solution. The uniqueness property in the presence of symmetry leads to preservation of the mode. As an explicit application, we correct the Bayesian Gaussian mixture model in such a way that it turns into a convex model while its (unique) maximum likelihood solution coincides asymptotically with the true solution. Using convexity...
Statistical feature embedding for heart sound classification
, Article Journal of Electrical Engineering ; Volume 70, Issue 4 , 2019 , Pages 259-272 ; 13353632 (ISSN) ; Babaali, B ; Shehnepoor, S ; Sharif University of Technology
De Gruyter Open Ltd
2019
Abstract
Cardiovascular Disease (CVD) is considered one of the principal causes of death in the world. In recent years, researchers have investigated patterns in heart sounds for disease diagnostics. In this study, an approach is proposed for normal/abnormal heart sound classification on the Physionet challenge 2016 dataset. For the first time, a fixed-length feature vector, called an i-vector, is extracted from each heart sound using Mel Frequency Cepstral Coefficient (MFCC) features. Afterwards, a Principal Component Analysis (PCA) transform and a Variational Autoencoder (VAE) are applied to the i-vector to achieve dimension reduction. Eventually, the reduced...
Statistical Video Indexing
, M.Sc. Thesis Sharif University of Technology ; Rabiee, Hamid Reza (Supervisor)
Abstract
Nowadays, video search and retrieval is of interest to computer users and has major uses in multimedia systems. The rate of video generation has increased, and the Internet, as a communication framework, is the means of transferring video around the world. As a result, video files are more important than in the past. Searching for content will be faster if video files have been indexed by a comprehensive system. The biggest step in this direction is the ability to generate indexes that are the same as, or similar to, those of the human mind, in order to improve clustering or classification results. To generate suitable indexes, it is necessary to extract effective features from videos and synthesize these...
Unsupervised Command Detection in EEG-based Brain-computer Interface
, M.Sc. Thesis Sharif University of Technology ; Beigy, Hamid (Supervisor)
Abstract
A Brain–Computer Interface is a system that provides a direct pathway for communication between a brain and a computer device by processing signals from sensors measuring brain activity (here, Electroencephalography signals). Brain signals are known to be stochastic, non-stationary, non-linear, and highly noisy. Therefore, Brain–Computer Interface systems rely on signal preprocessing, feature extraction, and machine learning methods to detect the mental state of the user. Current approaches to the problem are mainly based on supervised learning methods. In this thesis, first some of the freely obtainable datasets with motor or motor-imagery paradigms are...
Text-Independent Speaker Identification in Large Population Applications
, M.Sc. Thesis Sharif University of Technology ; Sameti, Hossein (Supervisor)
Abstract
Human speech conveys much information, such as semantic content, emotion, and even speaker identity. Our goal in this thesis is text-independent speaker identification (SI) in large population applications. Identification (test) time has become one of the most important issues in recent real-time systems. Identification time depends on the cost of likelihood computation between test features and registered speaker models. For real-time application of SI, the system must identify an unknown speaker quickly. Hence the conventional SI methods cannot be used. The main goal in this thesis is to propose several methods that reduce identification time without any loss of identification...
Optimization of cellular lifi network deployment for gaussian mixture user distributions
, Article 9th Iran Workshop on Communication and Information Theory, IWCIT 2021, 19 May 2021 through 20 May 2021 ; 2021 ; 9781665400565 (ISBN) ; Beyranvand, H ; Zolala, E ; Salehi, J.A ; Sharif University of Technology
Institute of Electrical and Electronics Engineers Inc
2021
Abstract
The long-term performance of LiFi networks depends significantly on the location of access points. The optimized placement can be determined based on the distribution of users in the room. In this paper, we investigate placement optimization for average throughput maximization in the presence of asymmetric distributions. In particular, we represent the users' distribution in the indoor environment by a Gaussian mixture model, which is powerful and computationally convenient. We then obtain the optimized deployment for different scenarios using the gradient ascent algorithm. The results show that optimization of deployment significantly improves the average throughput of the network. As the...
Image restoration using gaussian mixture models with spatially constrained patch clustering
, Article IEEE Transactions on Image Processing ; Volume 24, Issue 11 , June , 2015 , Pages 3624-3636 ; 10577149 (ISSN) ; Rabbani, H ; Babaie Zadeh, M ; Sharif University of Technology
Institute of Electrical and Electronics Engineers Inc
2015
Abstract
In this paper, we address the problem of recovering degraded images using a multivariate Gaussian mixture model (GMM) as a prior. The GMM framework in our method for image restoration is based on the assumption that the accumulation of similar patches in a neighborhood is derived from a multivariate Gaussian probability distribution with a specific covariance and mean. Previous methods of image restoration with GMM have not considered the spatial (geometric) distance between patches in clustering. Our experiments show that when Gaussian estimates are constrained to finite-sized windows, the patch clusters are more likely to be derived from the estimated multivariate Gaussian...
Speaker Verification using Limited Enrollment Data
, M.Sc. Thesis Sharif University of Technology ; Sameti, Hossein (Supervisor)
Abstract
In this thesis, we investigate speaker verification as a biometric technology to verify a person based on his/her claim. Text-dependent speaker verification systems are preferred in commercial and security applications, and these systems perform better in limited-data conditions, based on prior knowledge about speakers, who are assumed to be cooperative. The limited amount of enrollment data is a major concern in this thesis. Speaker-dependent model construction and channel variability issues in telephone-based text-dependent speaker verification applications are surveyed. Due to the lack of an appropriate database for the task, we collected a database which is referred to as text-prompt...
Large Vocabulary Isolated Word Recognition Using Neural Networks
, M.Sc. Thesis Sharif University of Technology ; Sameti, Hossein (Supervisor)
Abstract
Speech recognition is an important topic in speech processing. In this thesis, we perform Isolated Word Recognition (IWR) on a large vocabulary dataset. Previous works on large vocabulary IWR have used Hidden Markov Models, Gaussian Mixture Models, and hybrid methods for this purpose, but our approach is based on a Deep Neural Network (DNN). DNNs have recently shown excellent performance in different applications of voice and image processing. A key factor in speech recognition is the availability of appropriate datasets. There has been no acceptable speech corpus in the Persian language for isolated word recognition before this work. In addition, Persian IWR systems reported so far are quite...
Speech Activity Detection Using Deep Networks
, M.Sc. Thesis Sharif University of Technology ; Sameti, Hossein (Supervisor)
Abstract
In this paper, we introduce a new dataset for SAD and evaluate certain common methods such as GMM, ANN, and RNN on it. We collected our dataset with a semi-supervised approach, using subtitled movies, with a labeling accuracy of 95%. This semi-automatic method can help us collect huge amounts of labeled audio data with very high diversity in language, speaker, and channel. We model the problem of SAD as a classification task with two classes, speech and non-speech. When using GMM for this problem, we use two separate mixtures to model speech and non-speech. In the case of neural networks, we use a softmax layer at the end of the network, with two neurons which represent speech and...
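The two-mixture GMM setup mentioned in the abstract above can be illustrated with a minimal sketch. It is not the thesis's implementation: the one-dimensional feature, the hand-picked mixture parameters, and the names `gmm_log_likelihood`, `SPEECH`, `NON_SPEECH`, and `classify_frame` are all assumptions made for illustration. In practice the mixtures would be fitted to multi-dimensional acoustic features. The sketch scores a frame under each class mixture and picks the class with the higher log-likelihood.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-density of a scalar x under a 1-D Gaussian mixture."""
    log_terms = [
        math.log(w) - 0.5 * math.log(2.0 * math.pi * v) - (x - m) ** 2 / (2.0 * v)
        for w, m, v in zip(weights, means, variances)
    ]
    # Log-sum-exp keeps the sum stable when individual terms are very small.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


# Hypothetical parameters (weights, means, variances), chosen by hand for
# illustration only; they are not taken from the thesis.
SPEECH = ([0.6, 0.4], [2.0, 4.0], [1.0, 1.5])
NON_SPEECH = ([0.7, 0.3], [-2.0, 0.0], [1.0, 0.5])


def classify_frame(x):
    """Label a frame by comparing its log-likelihood under the two mixtures."""
    speech_ll = gmm_log_likelihood(x, *SPEECH)
    non_speech_ll = gmm_log_likelihood(x, *NON_SPEECH)
    return "speech" if speech_ll > non_speech_ll else "non-speech"
```

Comparing raw likelihoods assumes equal class priors; unequal priors could be handled by adding their logarithms to the two scores before the comparison.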
Modelling Cell's State in Different Cell Types
, M.Sc. Thesis Sharif University of Technology ; Hossein Khalaj, Babak (Supervisor) ; Motahari, Abolfazl (Co-Supervisor)
Abstract
The existence of heterogeneity in the vital tissues of complex multicellular organisms such as mammals, and in fatal tissues such as cancer, on the one hand, and limited access to the biological properties of their components on the other, make the study of these tissue traits one of the most interesting fields in bioinformatics. One of the hottest subjects in this field is the recognition of the functional components of these tissues using bulk data extracted from the whole tissue. Almost every method that aims to achieve this, particularly using gene expression data, assumes that all of the cell types that constitute the studied tissue have a deterministic expression profile. In this thesis we...
A two layer texture modeling based on curvelet transform and spiculated lesion filters for recognizing architectural distortion in mammograms
, Article Middle East Conference on Biomedical Engineering, MECBME ; 17 - 20 February , 2014 , pp. 21-24 ; Nadjar, H. S ; Fatemizadeh, E ; Mohammadi, E ; Sharif University of Technology
Abstract
This paper presents a two-layer texture modeling method to recognize architectural distortion in mammograms. We propose a method that models a Gaussian mixture on the Curvelet coefficients and the outputs of Spiculated Lesion Filters. The Curvelet transform and the Spiculated Lesion Filters have been applied to extract textural features of mammograms in the literature. However, the key difference between this study and previous ones is that, in our approach, a Gaussian mixture models the textural features extracted by the Curvelet transform and the Spiculated Lesion Filters. The results of the current study are shown in the form of accuracy and the area under the receiver operating...
Speaker phone mode classification using Gaussian mixture models
, Article SPA 2011 - Signal Processing: Algorithms, Architectures, Arrangements, and Applications - Conference Proceedings, 29 September 2011 through 30 September 2011 ; September , 2011 , Pages 112-117 ; 9781457714863 (ISBN) ; Sobhan Manesh, F ; Sameti, H ; BabaAli, B ; Sharif University of Technology
2011
Abstract
This study focuses on classifying the speaker mode of phones using GMM. In this regard, speech data in both enabled and disabled speaker modes of cell phones and telephones were collected, processed, and classified into two different categories. Different mixture numbers (1 to 4) of GMM and wave file sizes of 10, 20, 40, and 80 kb were tested in order to obtain an optimal condition for classification. The GMM method attained an 87.99% correct classification rate on test data. This classification is important for speech-enabled IVR systems [1], dialog systems, and many systems in speech processing in the sense that it could help to load an optimum model for increasing system...
Improvements in audio classification based on sinusoidal modeling
, Article 2008 IEEE International Conference on Multimedia and Expo, ICME 2008, Hannover, 23 June 2008 through 26 June 2008 ; 2008 , Pages 1485-1488 ; 9781424425716 (ISBN) ; Ghaemmaghami, S ; Razzazi, F ; Sharif University of Technology
2008
Abstract
In this paper, a set of features is presented and evaluated based on sinusoidal modeling of audio signals. Amplitude, frequency, and phase parameters of the sinusoidal model are used and compared as input features into an audio classifier system. The performance of sinusoidal model features is evaluated for classification of audio into speech and music classes using both the Gaussian and the GMM (Gaussian Mixture Model) classifiers. Experimental results show superiority of the amplitude parameters of the sinusoidal model, which could be used for the first time for such an audio classification, as compared to the popular cepstral features. By using a set of 40 sinusoidal features, we achieved...
Biometric identification through hand geometry
, Article EUROCON 2005 - The International Conference on Computer as a Tool, Belgrade, 21 November 2005 through 24 November 2005 ; Volume II , 2005 , Pages 1011-1014 ; 142440049X (ISBN); 9781424400492 (ISBN) ; Fatemizadeh, E ; Sharif University of Technology
IEEE Computer Society
2005
Abstract
A new approach for person identification based on hand geometry is presented. After preprocessing, hand features are extracted from a photograph taken while the user has placed his/her hand (either left or right) on the platform of a document scanner with no limits or fixation. Different pattern recognition techniques, such as Gaussian mixture modeling (GMM), radial basis function neural networks (RBF), multilayer perceptron (MLP), k-Nearest Neighbor (k-NN), the Bayes method, and Mahalanobis/Hamming distance, have been used in the classification stage. Experimental results show a rate of success above 90%. © 2005 IEEE
Online Monitoring of Multi-source PD Signals in a Single-phase Transformer Model with IEC 60270 and RF Methods
, Ph.D. Dissertation Sharif University of Technology ; Vakilian, Mehdi (Supervisor)
Abstract
Transformers are the key component in power system transmission and distribution networks. Condition-based maintenance will increase their expected life, and online monitoring is essential to ensure operational reliability. In this work, a new approach to transformer online monitoring is provided based on partial discharge (PD) measurement. Multi-source PD signals are separated using the time-frequency S transform (ST), which is applied to the PD signal waveforms. The resultant ST matrix is then converted to a grayscale image, from which high-level features are extracted using Bag of Words (BoW). Gaussian mixture model (GMM) clustering is used to discover clusters in the feature space. For recognition of...
Discrimination and Identification of Multiple Partial Discharge Sources in a Transformer Insulation
, M.Sc. Thesis Sharif University of Technology ; Vakilian, Mahdi (Supervisor)
Abstract
Partial discharges that occur in transformer insulation generate current pulses. If these pulses are recorded, they can be used for transformer insulation condition assessment. By processing the recorded partial discharge signals, PRPD patterns are generated and used to identify the source type of the partial discharge defect. If multiple partial discharge defects exist in the transformer insulation, the resulting PRPD pattern does not resemble the PRPD pattern of any single defect. In this case, the first step is to discriminate between the partial discharge signals stemming from the multiple partial discharge sources. To simulate the occurrence of multiple partial...
Teaching to Point at different Objects as an Interactive Gesture to Robot by Learning from Demonstration
, M.Sc. Thesis Sharif University of Technology ; Meghdari, Ali (Supervisor) ; Taheri, Alireza (Supervisor)
Abstract
The use of robots as our friends has proliferated these days. Knowing that they are going to be used in ordinary houses, we should develop methods and algorithms that allow end-users to program their own robots for their desired tasks. Learning from Demonstrations (LfD) can play a crucial role in this field. In this study, we taught a non-verbal communication method (pointing) to a robot using LfD. The learning method used was TP-GMM. The rationale for using this method was that it models all the degrees of freedom together, and we thought it might be an essential parameter to make a movement more natural and understandable which could be two vital...
Unsupervised estimation of conceptual classes for semantic image annotation
, Article 2011 19th Iranian Conference on Electrical Engineering, ICEE 2011, 17 May 2011 through 19 May 2011 ; May , 2011 ; 9789644634284 (ISBN) ; Esmaili, H ; Shirazi, A. A. B ; Sharif University of Technology
2011
Abstract
A probabilistic formulation for semantic image annotation and retrieval is proposed. Annotation and retrieval are posed as classification problems where each class is defined as the group of database images labeled with a common semantic label. It is shown that, by establishing this one-to-one correspondence between semantic labels and semantic classes, minimum-probability-of-error annotation and retrieval are feasible with algorithms that are 1) conceptually simple and 2) computationally efficient. In this article, a content-based image retrieval and annotation architecture is proposed. Its aim is to decrease the semantic gap by partitioning the image into its semantic regions and using...