Speaker Verification using Limited Enrollment Data

M.Sc. Thesis, Sharif University of Technology; Kalantari, Elaheh (Author); Sameti, Hossein (Supervisor)

Abstract

In this thesis, we investigate speaker verification as a biometric technology for verifying a person's claimed identity. Text-dependent speaker verification systems are preferred in commercial and security applications, and they perform better under limited-data conditions because they draw on prior knowledge about speakers, who are assumed to be cooperative. The limited amount of enrollment data is a major concern in this thesis. Speaker-dependent model construction and channel variability issues in telephone-based text-dependent speaker verification applications are surveyed. Due to the lack of an appropriate database for the task, we collected a database which is referred to as text-prompt...

Online Monitoring of Multi-source PD Signals in a Single-phase Transformer Model with IEC 60270 and RF Methods

Ph.D. Dissertation, Sharif University of Technology; Firuzi, Keyvan (Author); Vakilian, Mehdi (Supervisor)

Abstract

Transformers are key components in power system transmission and distribution networks. Condition-based maintenance increases their expected life, and online monitoring is essential to ensure operational reliability. In this work, a new approach to transformer online monitoring is provided, based on partial discharge (PD) measurement. Multi-source PD signals are separated using the time-frequency S transform (ST), which is applied to the PD signal waveforms. The resultant ST matrix is then converted to a grayscale image, from which high-level features are extracted using Bag of Words (BoW). Gaussian mixture model (GMM) clustering is used to discover clusters in the feature space. For recognition of...

Modelling Cell's State in Different Cell Types

M.Sc. Thesis, Sharif University of Technology; Saberi, Amir Hossein (Author); Hossein Khalaj, Babak (Supervisor); Motahari, Abolfazl (Co-Supervisor)

Abstract

The existence of heterogeneity in vital tissues of complex multicellular organisms like mammals, and in fatal tissues like cancer, on the one hand, and limited access to the biological properties of their components on the other, make the study of these tissue traits one of the most interesting fields in bioinformatics. One of the most active subjects in this field is the recognition of the functional components of these tissues using bulk data extracted from the whole tissue. Almost every method that aims to achieve this, particularly those using gene expression data, assumes that all of the cell types that constitute the studied tissue have a deterministic expression profile. In this thesis we...

Speech Activity Detection Using Deep Networks

M.Sc. Thesis, Sharif University of Technology; Shahsavari, Sajad (Author); Sameti, Hossein (Supervisor)

Abstract

In this paper, we introduce a new dataset for speech activity detection (SAD) and evaluate several common methods, such as GMM, ANN, and RNN, on it. We collected our dataset with a semi-supervised approach, using subtitled movies, with a labeling accuracy of 95%. This semi-automatic method can help us collect huge amounts of labeled audio data with very high diversity in language, speaker, and channel. We model the SAD problem as a classification task with two classes: speech and non-speech. When using GMM for this problem, we use two separate mixtures to model speech and non-speech. In the case of neural networks, we use a softmax layer at the end of the network, with two neurons which represent speech and...
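The two-mixture idea in this abstract can be illustrated with a minimal sketch: score a frame under a speech mixture and a non-speech mixture, then pick the class with the higher likelihood. The one-dimensional feature, the mixture parameters, and the names `gmm_log_likelihood` and `classify_frame` are assumptions made for illustration; they are not taken from the thesis, which would fit its mixtures to real acoustic features.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of a scalar x under a 1-D Gaussian mixture."""
    log_terms = [
        math.log(w) - 0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for w, m, v in zip(weights, means, variances)
    ]
    # Log-sum-exp keeps the computation stable for very unlikely frames.
    top = max(log_terms)
    return top + math.log(sum(math.exp(t - top) for t in log_terms))


# Hypothetical parameters (weights, means, variances), for illustration only.
SPEECH = ([0.6, 0.4], [5.0, 8.0], [1.0, 2.0])
NON_SPEECH = ([0.7, 0.3], [0.0, 2.0], [1.0, 1.5])


def classify_frame(x):
    """Label a frame by comparing its likelihood under the two mixtures."""
    speech_ll = gmm_log_likelihood(x, *SPEECH)
    non_speech_ll = gmm_log_likelihood(x, *NON_SPEECH)
    return "speech" if speech_ll > non_speech_ll else "non-speech"
```

In practice the mixture parameters would be estimated from labeled data (for example with expectation maximization) rather than fixed by hand as they are here.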

Text-Independent Speaker Identification in Large Population Applications

M.Sc. Thesis, Sharif University of Technology; Zeinali, Hossein (Author); Sameti, Hossein (Supervisor)

Abstract

Human speech conveys much information, such as semantic content, emotion, and even speaker identity. This thesis addresses the task of text-independent speaker identification (SI) in large-population applications. Identification (test) time has become one of the most important issues in recent real-time systems. Identification time depends on the cost of computing the likelihood between test features and registered speaker models. For real-time SI applications, the system must identify an unknown speaker quickly; hence, conventional SI methods cannot be used. The main goal of this thesis is to propose several methods that reduce identification time without any loss of identification...

Statistical Video Indexing

M.Sc. Thesis, Sharif University of Technology; Roozgard, Amin Mohammad (Author); Rabiee, Hamid Reza (Supervisor)

Abstract

Nowadays, video search and retrieval is of interest to computer users and has major uses in multimedia systems. The rate of video generation has increased, and the Internet, as a communication framework, carries this video around the world. As a result, video files are more important than in the past. Searching for content is faster if the video files have been indexed by a comprehensive system. The biggest step in this direction is the ability to generate indexes that are the same as, or similar to, those of the human mind, in order to improve clustering or classification results. To generate suitable indexes, it is necessary to extract effective features from videos and synthesize these...

Teaching to Point at different Objects as an Interactive Gesture to Robot by Learning from Demonstration

M.Sc. Thesis, Sharif University of Technology; Razmjoofard, Amir Reza (Author); Meghdari, Ali (Supervisor); Taheri, Alireza (Supervisor)

Abstract

The use of robots as our friends has proliferated these days. Knowing that they are going to be used in ordinary houses, we should develop methods and algorithms that allow end-users to program their own robots for their desired tasks. Learning from Demonstration (LfD) can play a crucial role in this field. In this study, we taught a non-verbal communication method (pointing) to a robot using LfD. The learning method used was TP-GMM. The rationale for using this method was that it models all the degrees of freedom together, and we thought it might be an essential parameter to make a movement more natural and understandable which could be two vital...

Large Vocabulary Isolated Word Recognition Using Neural Networks

M.Sc. Thesis, Sharif University of Technology; Hajitabar, Alireza (Author); Sameti, Hossein (Supervisor)

Abstract

Speech recognition is an important topic in speech processing. In this thesis, we perform Isolated Word Recognition (IWR) on a large-vocabulary dataset. Previous work on large-vocabulary IWR has used Hidden Markov Models, Gaussian Mixture Models, and hybrid methods for this purpose, but our approach is based on Deep Neural Networks (DNNs). DNNs have recently shown excellent performance in different applications of voice and image processing. A key factor in speech recognition is the availability of appropriate datasets. There has been no acceptable speech corpus in the Persian language for isolated word recognition before this work. In addition, Persian IWR systems reported so far are quite...

Discrimination and Identification of Multiple Partial Discharge Sources in a Transformer Insulation

M.Sc. Thesis, Sharif University of Technology; Javandel Ajirloo, Vahid (Author); Vakilian, Mahdi (Supervisor)

Abstract

Partial discharges that occur in transformer insulation generate current pulses. If these pulses are recorded, they can be used to assess the condition of the transformer insulation. By processing the recorded partial discharge signals, PRPD patterns are generated and used to identify the source type of the partial discharge defect. If multiple partial discharge defects exist in the transformer insulation, the resulting PRPD pattern does not resemble the PRPD pattern of any single defect. In this case, the first step is to discriminate the partial discharge signals stemming from each of the multiple partial discharge sources. To simulate the occurrence of multiple partial...

Unsupervised Command Detection in EEG-based Brain-computer Interface

M.Sc. Thesis, Sharif University of Technology; Behmand, Arash (Author); Beigy, Hamid (Supervisor)

Abstract

A Brain–Computer Interface is a system that provides a direct pathway for communication between a brain and a computer device by processing signals from sensors measuring brain activity (here, electroencephalography signals). Brain signals are known to be stochastic, non-stationary, non-linear, and highly noisy; therefore, Brain–Computer Interface systems rely on signal preprocessing, feature extraction, and machine learning methods to detect the mental state of the Brain–Computer Interface user. Current approaches to the problem are mainly based on supervised learning methods. In this thesis, first some freely obtainable datasets with motor or motor-imagery paradigms are...

HMM-based phrase-independent i-vector extractor for text-dependent speaker verification

Article, IEEE/ACM Transactions on Audio Speech and Language Processing; Volume 25, Issue 7, 2017, Pages 1421-1435; 23299290 (ISSN); Zeinali, H; Sameti, H; Burget, L; Sharif University of Technology

Institute of Electrical and Electronics Engineers Inc 2017

Abstract

The low-dimensional i-vector representation of speech segments is used in state-of-the-art text-independent speaker verification systems. However, i-vectors were deemed unsuitable for the text-dependent task, where simpler and older speaker recognition approaches were found more effective. In this work, we propose a straightforward hidden Markov model (HMM) based extension of the i-vector approach, which allows i-vectors to be successfully applied to text-dependent speaker verification. In our approach, the Universal Background Model (UBM) for training the phrase-independent i-vector extractor is based on a set of monophone HMMs instead of the standard Gaussian Mixture Model (GMM). To...

Unsupervised estimation of conceptual classes for semantic image annotation

Article, 2011 19th Iranian Conference on Electrical Engineering, ICEE 2011, 17 May 2011 through 19 May 2011; May 2011; 9789644634284 (ISBN); Teimoori, F; Esmaili, H; Shirazi, A. A. B; Sharif University of Technology

2011

Abstract

A probabilistic formulation for semantic image annotation and retrieval is proposed. Annotation and retrieval are posed as classification problems where each class is defined as the group of database images labeled with a common semantic label. It is shown that, by establishing this one-to-one correspondence between semantic labels and semantic classes, minimum-probability-of-error annotation and retrieval are feasible with algorithms that are 1) conceptually simple and 2) computationally efficient. In this article, a content-based image retrieval and annotation architecture is proposed. Its approach is to reduce the semantic gap by partitioning the image into its semantic regions and using...

Improvement to speech-music discrimination using sinusoidal model based features

Article, Multimedia Tools and Applications; Volume 50, Issue 2, November 2010, Pages 415-435; 13807501 (ISSN); Shirazi, J; Ghaemmaghami, S; Sharif University of Technology

2010

Abstract

This paper addresses a model-based audio content analysis for classification of speech-music mixed audio signals into speech and music. A set of new features is presented and evaluated based on sinusoidal modeling of audio signals. The new feature set, including variance of the birth frequencies and duration of the longest frequency track in sinusoidal model, as a measure of the harmony and signal continuity, is introduced and discussed in detail. These features are used and compared to typical features as inputs to an audio classifier. Performance of these sinusoidal model features is evaluated through classification of audio into speech and music using both the GMM (Gaussian Mixture Model)...

Improvements in audio classification based on sinusoidal modeling

Article, 2008 IEEE International Conference on Multimedia and Expo, ICME 2008, Hannover, 23 June 2008 through 26 June 2008; 2008, Pages 1485-1488; 9781424425716 (ISBN); Shirazi, J; Ghaemmaghami, S; Razzazi, F; Sharif University of Technology

2008

Abstract

In this paper, a set of features is presented and evaluated based on sinusoidal modeling of audio signals. Amplitude, frequency, and phase parameters of the sinusoidal model are used and compared as input features to an audio classifier system. The performance of the sinusoidal model features is evaluated for classification of audio into speech and music classes using both the Gaussian and the GMM (Gaussian Mixture Model) classifiers. Experimental results show the superiority of the amplitude parameters of the sinusoidal model, used here for the first time for such audio classification, over the popular cepstral features. By using a set of 40 sinusoidal features, we achieved...

Portfolio Value-at-Risk and expected-shortfall using an efficient simulation approach based on Gaussian Mixture Model

Article, Mathematics and Computers in Simulation; Volume 190, 2021, Pages 1056-1079; 03784754 (ISSN); Seyfi, S. M. S; Sharifi, A; Arian, H; Sharif University of Technology

Elsevier B.V 2021

Abstract

Monte Carlo approaches for calculating Value-at-Risk (VaR) are powerful tools widely used by financial risk managers across the globe. However, they are time-consuming and sometimes inaccurate. In this paper, a fast and accurate Monte Carlo algorithm for calculating VaR and expected shortfall (ES) based on Gaussian Mixture Models is introduced. Gaussian Mixture Models are able to cluster input data with respect to market conditions, and therefore no correlation matrices are needed for risk computation. Sampling from each cluster according to its weight and then calculating the volatility-adjusted stock returns leads to possible scenarios for asset prices. Our results on a sample of US stocks show that...
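As a rough illustration of the sampling step this abstract describes, the sketch below draws return scenarios from a one-dimensional Gaussian mixture, choosing each component with probability equal to its weight, and then reads VaR and ES off the simulated losses. It is a simplified sketch under stated assumptions, not the paper's algorithm: the volatility adjustment is omitted, the two-regime parameters are hypothetical, and the names `sample_mixture_returns` and `var_and_es` are invented for this example.

```python
import random


def sample_mixture_returns(weights, means, stds, n, seed=0):
    """Draw n returns from a 1-D Gaussian mixture, picking components by weight."""
    rng = random.Random(seed)
    components = range(len(weights))
    returns = []
    for _ in range(n):
        k = rng.choices(components, weights=weights)[0]
        returns.append(rng.gauss(means[k], stds[k]))
    return returns


def var_and_es(returns, alpha=0.95):
    """Empirical VaR and ES of the losses (negated returns) at level alpha."""
    losses = sorted(-r for r in returns)
    idx = min(int(alpha * len(losses)), len(losses) - 1)
    var = losses[idx]
    tail = losses[idx:]  # losses at or beyond the VaR threshold
    return var, sum(tail) / len(tail)


# Hypothetical calm and stressed regimes, for illustration only.
WEIGHTS = [0.8, 0.2]
MEANS = [0.001, -0.002]
STDS = [0.01, 0.03]

returns = sample_mixture_returns(WEIGHTS, MEANS, STDS, n=10000, seed=42)
var95, es95 = var_and_es(returns, alpha=0.95)
```

Because ES averages the losses at or beyond the VaR threshold, it is never smaller than VaR at the same level.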

Utility of a nonlinear joint dynamical framework to model a pair of coupled cardiovascular signals

Article, IEEE Journal of Biomedical and Health Informatics; Volume 17, Issue 4, 2013, Pages 881-890; 21682194 (ISSN); Sayadi, O; Shamsollahi, M. B; Sharif University of Technology

2013

Abstract

We have recently proposed a correlated model to provide a Gaussian mixture representation of the cardiovascular signals, with promising results in identifying rhythm disturbances. The approach provides a transformation of the data into a set of integrable Gaussians distributed over time. Looking into the model from a new joint modeling perspective, it is capable of assembling a filtered estimation, and can be used to derive temporal information of the waveforms. In this paper, we present a step-by-step derivation of the joint model putting correlation assumptions together to conclude a minimal joint description for a pair of ECG-ABP signals. We then probe novel applications of this model,...

Life-threatening arrhythmia verification in ICU patients using the joint cardiovascular dynamical model and a bayesian filter

Article, IEEE Transactions on Biomedical Engineering; Volume 58, Issue 10 PART 1, 2011, Pages 2748-2757; 00189294 (ISSN); Sayadi, O; Shamsollahi, M. B; Sharif University of Technology

Abstract

In this paper, a novel nonlinear joint dynamical model is presented, which is based on a set of coupled ordinary differential equations of motion and a Gaussian mixture model representation of pulsatile cardiovascular (CV) signals. In the proposed framework, the joint interdependences of CV signals are incorporated by assuming a unique angular frequency that controls the limit cycle of the heart rate. Moreover, the time consequence of CV signals is controlled by the same phase parameter that results in the space dimensionality reduction. These joint equations together with linear assignments to observation are further used in the Kalman filter structure for estimation and tracking. Moreover,...

Image restoration using gaussian mixture models with spatially constrained patch clustering

Article, IEEE Transactions on Image Processing; Volume 24, Issue 11, June 2015, Pages 3624-3636; 10577149 (ISSN); Niknejad, M; Rabbani, H; Babaie Zadeh, M; Sharif University of Technology

Institute of Electrical and Electronics Engineers Inc 2015

Abstract

In this paper, we address the problem of recovering degraded images using a multivariate Gaussian mixture model (GMM) as a prior. The GMM framework in our image restoration method is based on the assumption that the accumulation of similar patches in a neighborhood is derived from a multivariate Gaussian probability distribution with a specific covariance and mean. Previous methods of image restoration with GMM have not considered the spatial (geometric) distance between patches in clustering. Our experiments show that when the Gaussian estimates are constrained to finite-sized windows, the patch clusters are more likely to be derived from the estimated multivariate Gaussian...

Image interpolation using Gaussian Mixture Models with spatially constrained patch clustering

Article, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 19 April 2014 through 24 April 2014; Volume 2015-August, April 2015, Pages 1613-1617; 15206149 (ISSN); 9781467369978 (ISBN); Niknejad, M; Rabbani, H; Babaie Zadeh, M; Jutten, C; Sharif University of Technology

Institute of Electrical and Electronics Engineers Inc 2015

Abstract

In this paper, we address the problem of image interpolation using Gaussian Mixture Models (GMM) as a prior. Previous methods of image restoration with GMM have not considered the spatial (geometric) distance between patches in clustering, failing to fully exploit the coherency of nearby patches. The GMM framework in our image interpolation method is based on the assumption that the accumulation of similar patches in a neighborhood is derived from a multivariate Gaussian probability distribution with a specific covariance and mean. An Expectation Maximization-like (EM-like) algorithm is used to determine the patches in a cluster and restore them. The results show that our image...

A two layer texture modeling based on curvelet transform and spiculated lesion filters for recognizing architectural distortion in mammograms

Article, Middle East Conference on Biomedical Engineering, MECBME; 17-20 February 2014, pp. 21-24; Khoubani, S; Nadjar, H. S; Fatemizadeh, E; Mohammadi, E; Sharif University of Technology

Abstract

This paper presents a two-layer texture modeling method to recognize architectural distortion in mammograms. We propose a method that models a Gaussian mixture on the Curvelet coefficients and the outputs of Spiculated Lesion Filters. The Curvelet transform and the Spiculated Lesion Filters have been applied to extract textural features of mammograms in the literature. However, the key difference between this study and previous ones is that, in our approach, a Gaussian mixture models the textural features extracted by the Curvelet transform and the Spiculated Lesion Filters. The results of the current study are shown in the form of accuracy and the area under the receiver operating...