Speaker Verification using Limited Enrollment Data

Kalantari, Elaheh; Sameti, Hossein

Kalantari, Elaheh | 2015

Type of Document: M.Sc. Thesis
Language: English
Document No: 46911 (55)
University: Sharif University of Technology, International Campus, Kish Island
Department: Science and Engineering
Advisor(s): Sameti, Hossein
Abstract:
In this thesis, we investigate speaker verification as a biometric technology for verifying a person's claimed identity. Text-dependent speaker verification systems are preferred in commercial and security applications, and they perform better under limited-data conditions because they rely on prior knowledge about speakers, who are assumed to be cooperative. The limited amount of enrollment data is a major concern of this thesis. We survey speaker-dependent model construction and channel variability issues in telephone-based text-dependent speaker verification applications. Because no appropriate database existed for the task, we collected our own, referred to as the text-prompted utterance database. The dataset consists of Persian month names and is divided into three separate sets. We propose a new scheme that uses mean supervectors in text-prompted speaker verification. Eigenvoice modeling is applied to construct a speaker-dependent model from the UBM using each speaker's enrollment data, and an HMM is used to model the temporal order of the speech features. In this scheme, a separate model is constructed for each month name, and the final passphrase score is computed by combining the scores of the individual words (models). Results on the telephony dataset of Persian month names show that the proposed method reduces the EER by 17.84% compared to the state-of-the-art State-GMM-MAP method. For the Optimized-GMM-MAP method, we show that, given the training and testing sets, 12 SVM models per speaker can be used instead of 220. The scheme therefore reduces both the EER and the computational burden. In addition, using an HMM instead of a GMM as the word model improves system performance. In the best case, the EER is reduced by 32.3% in comparison with the State-GMM-MAP method.
Furthermore, i-vectors have proved to be the most effective features for text-independent speaker verification in recent research. We therefore propose a second scheme that applies this technique to text-prompted speaker verification in a simple yet effective manner. Experiments show that this scheme yields a relative EER reduction of 23.47% compared to the State-GMM-MAP method. Additionally, using an HMM instead of a GMM for the universal background model leads to a 15.33% reduction in EER.
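The abstract's evaluation quantities can be illustrated with a minimal Python sketch. It is not the thesis implementation. The function names are hypothetical, the per-word scores are combined by a plain average (the abstract does not state the combination rule), and the EER is approximated by sweeping thresholds over the observed scores.

```python
from statistics import mean


def passphrase_score(word_scores):
    """Combine per-word scores into one passphrase score.

    Assumption: a simple average; the abstract does not specify the rule.
    """
    return mean(word_scores)


def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The EER is taken
    at the threshold where false acceptance and false rejection rates are
    closest, as the mean of the two rates.
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: genuine speaker scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: impostor scored at or above the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_reduction(baseline_eer, new_eer):
    """Relative EER reduction in percent with respect to a baseline."""
    return 100.0 * (baseline_eer - new_eer) / baseline_eer
```

Under these assumptions, a system's passphrase scores for genuine and impostor trials would be passed to `equal_error_rate`, and the result compared against a baseline with `relative_reduction`.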
Keywords:
Supervector; Hidden Markov Model; Gaussian Mixture Model; Text-Dependent Verification; Ownership Authentication; Text-Prompted Verification; Telephony Dataset