Speech Activity Detection Using Deep Networks

M.Sc. Thesis, Sharif University of Technology. Shahsavari, Sajad (Author); Sameti, Hossein (Supervisor)

Abstract

In this paper, we introduce a new dataset for speech activity detection (SAD) and evaluate several common methods on it, namely GMM, ANN, and RNN. We collected the dataset with a semi-supervised approach, using subtitled movies, and obtained a labeling accuracy of 95%. This semi-automatic method can help us collect huge amounts of labeled audio data with very high diversity in language, speaker, and channel. We model SAD as a classification task with two classes, speech and non-speech. When using GMM for this problem, we use two separate mixtures to model speech and non-speech. In the case of neural networks, we use a softmax layer at the end of the network, with two neurons which represent speech and...
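The two-mixture decision rule described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the one-dimensional feature, the hand-picked mixture parameters, and the names `gmm_log_likelihood`, `classify_frame`, and `softmax2` are all assumptions made for illustration. A frame is labeled speech when the speech mixture gives it the higher log-likelihood, and `softmax2` shows how a two-neuron output layer would turn two scores into class probabilities.

```python
import math

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of a scalar feature x under a 1-D Gaussian mixture."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_gauss = -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        log_terms.append(math.log(w) + log_gauss)
    # Log-sum-exp for numerical stability.
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms))

# Hypothetical, hand-picked parameters; a real system would fit them
# to labeled speech and non-speech frames (for example with EM).
SPEECH_GMM = dict(weights=[0.6, 0.4], means=[5.0, 8.0], variances=[1.0, 2.0])
NONSPEECH_GMM = dict(weights=[0.7, 0.3], means=[0.0, 2.0], variances=[1.0, 1.5])

def classify_frame(x):
    """Label a frame by comparing the two mixture log-likelihoods."""
    ll_speech = gmm_log_likelihood(x, **SPEECH_GMM)
    ll_nonspeech = gmm_log_likelihood(x, **NONSPEECH_GMM)
    return "speech" if ll_speech > ll_nonspeech else "non-speech"

def softmax2(score_speech, score_nonspeech):
    """Two-class softmax, as in a two-neuron output layer."""
    m = max(score_speech, score_nonspeech)
    e_s = math.exp(score_speech - m)
    e_n = math.exp(score_nonspeech - m)
    return e_s / (e_s + e_n), e_n / (e_s + e_n)
```

With these assumed parameters, a feature value near the speech means (such as 6.0) is labeled speech and one near the non-speech means (such as 0.5) is labeled non-speech.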

Speaker Verification using Limited Enrollment Data

M.Sc. Thesis, Sharif University of Technology. Kalantari, Elaheh (Author); Sameti, Hossein (Supervisor)

Abstract

In this thesis, we investigate speaker verification as a biometric technology for verifying a person's claimed identity. Text-dependent speaker verification systems are preferred in commercial and security applications, and they perform better under limited-data conditions by drawing on prior knowledge about speakers, who are assumed to be cooperative. The limited amount of enrollment data is a major concern in this thesis. Speaker-dependent model construction and channel variability issues in telephone-based text-dependent speaker verification applications are surveyed. Due to the lack of an appropriate database for the task, we collected a database which is referred to as text-prompt...

Text-Independent Speaker Identification in Large Population Applications

M.Sc. Thesis, Sharif University of Technology. Zeinali, Hossein (Author); Sameti, Hossein (Supervisor)

Abstract

Human speech conveys much information, such as semantic content, emotion, and even speaker identity. This thesis addresses text-independent speaker identification (SI) in large population applications. Identification (test) time has become one of the most important issues in recent real-time systems. Identification time depends on the cost of likelihood computation between test features and registered speaker models. For real-time application of SI, the system must identify an unknown speaker quickly. Hence the conventional SI methods cannot be used. The main goal in this thesis is to propose several methods that reduce identification time without any loss of identification...
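The dependence of identification time on likelihood computation can be seen in a minimal sketch of conventional exhaustive scoring. This is an illustrative baseline, not one of the methods proposed in the thesis: each speaker is modeled by a single 1-D Gaussian, and the names `frame_log_likelihood` and `identify` and all parameter values are assumptions. Every test frame is scored against every registered model, so the number of likelihood evaluations grows as (number of speakers) × (number of frames).

```python
import math

def frame_log_likelihood(x, mean, var):
    """Log-density of a scalar frame x under a 1-D Gaussian speaker model."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def identify(test_frames, speaker_models):
    """Exhaustive identification: score all frames against all models.

    Returns the best-scoring speaker id and the number of likelihood
    evaluations performed.
    """
    evaluations = 0
    best_id, best_score = None, -math.inf
    for speaker_id, (mean, var) in speaker_models.items():
        score = 0.0
        for x in test_frames:
            score += frame_log_likelihood(x, mean, var)
            evaluations += 1
        if score > best_score:
            best_id, best_score = speaker_id, score
    return best_id, evaluations

# Hypothetical registered speakers as (mean, variance) pairs.
models = {"spk_a": (0.0, 1.0), "spk_b": (3.0, 1.0), "spk_c": (-2.0, 0.5)}
frames = [2.8, 3.1, 3.4]
best, cost = identify(frames, models)
```

Here three speakers and three frames require nine evaluations; with a large registered population, this linear growth is what makes exhaustive scoring slow.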

Statistical Video Indexing

M.Sc. Thesis, Sharif University of Technology. Roozgard, Amin Mohammad (Author); Rabiee, Hamid Reza (Supervisor)

Abstract

Video search and retrieval is of growing interest to computer users and has important uses in multimedia systems. The rate of video generation has increased, and the Internet, as a communication framework, enables its transfer around the world. As a result, video files are more important than in the past. Searching for content is faster if video files have been indexed by a comprehensive system. The biggest step in this direction is the ability to generate indexes that are the same as or similar to those of the human mind, in order to improve clustering or classification results. To generate suitable indexes, it is necessary to extract effective features from videos and synthesize these...

A two layer texture modeling based on curvelet transform and spiculated lesion filters for recognizing architectural distortion in mammograms

Article, Middle East Conference on Biomedical Engineering, MECBME; 17-20 February 2014, pp. 21-24. Khoubani, S; Nadjar, H. S; Fatemizadeh, E; Mohammadi, E; Sharif University of Technology

Abstract

This paper presents a two-layer texture modeling method to recognize architectural distortion in mammograms. We propose a method that models a Gaussian mixture on the Curvelet coefficients and the outputs of Spiculated Lesion Filters. The Curvelet transform and the Spiculated Lesion Filters have been applied to extract textural features of mammograms in the literature. However, the key difference between this study and the previous ones is that in our approach, a Gaussian mixture models the textural features extracted by the Curvelet transform and the Spiculated Lesion Filters. The results of the current study are shown in the form of accuracy and the area under the receiver operating...

Life-threatening arrhythmia verification in ICU patients using the joint cardiovascular dynamical model and a bayesian filter

Article, IEEE Transactions on Biomedical Engineering; Volume 58, Issue 10 PART 1, 2011, Pages 2748-2757; 00189294 (ISSN). Sayadi, O; Shamsollahi, M. B; Sharif University of Technology

Abstract

In this paper, a novel nonlinear joint dynamical model is presented, which is based on a set of coupled ordinary differential equations of motion and a Gaussian mixture model representation of pulsatile cardiovascular (CV) signals. In the proposed framework, the joint interdependences of CV signals are incorporated by assuming a unique angular frequency that controls the limit cycle of the heart rate. Moreover, the time consequence of CV signals is controlled by the same phase parameter that results in the space dimensionality reduction. These joint equations together with linear assignments to observation are further used in the Kalman filter structure for estimation and tracking. Moreover,...