
KNNDIST: A non-parametric distance measure for speaker segmentation

Article, 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012; Volume 3, 2012, Pages 2279-2282; 9781622767595 (ISBN). Mohammadi, S. H.; Sameti, H.; Langarani, M. S. E.; Tavanaei, A.; Sharif University of Technology

2012

Abstract

A novel distance measure for distance-based speaker segmentation is proposed. Unlike the distance measures commonly used in speaker segmentation systems, which often assume a Gaussian distribution when measuring the distance between two audio segments, this measure is nonparametric: it is essentially a k-nearest-neighbor distance measure. Removal of non-vowel segments in the preprocessing stage is also proposed. Speaker segmentation performance is tested on artificially created conversations from the TIMIT database and on two AMI conversations. For short window lengths, the Missed Detection Rate decreases significantly. For moderate window lengths, a decrease in both...

Detecting Speakers in a Telephone Conversation

M.Sc. Thesis, Sharif University of Technology; Soltani Farani, Ali (Author); Sameti, Hossein (Supervisor)

Abstract

The human speech signal conveys many levels of information, ranging from phonetic content to speaker identity and even emotional state. This thesis deals with the task of open-set speaker identification (SI) from an unconstrained telephone conversation between two speakers. The goal is to find at most two speakers among a known set of target speakers that best match the voice samples of the input speech; the input voice samples are not constrained to the target speaker set. The uni-speaker problem is investigated first. The classic GMM-UBM system for text-independent SI and its adapted form are explored. The use of score-space information is advocated as a complementary source to the...


Speaker Diarization in Adverse Conditions

M.Sc. Thesis, Sharif University of Technology; Mohammadi, Hamid Reza (Author); Sameti, Hossein (Supervisor)

Abstract

The goal of a speaker diarization system is to detect the number of speakers in a conversation and to assign each segment of the conversation to one of them. Such systems assume that the identity of the speakers is completely unknown. Speaker diarization systems usually operate offline: the system assumes that it has the whole conversation at hand before it starts processing. This approach is effective for applications such as spoken document retrieval, but it is not applicable to speech/speaker recognition systems that require online operation. In this dissertation, an online speaker diarization system is implemented. This...
