Speaker Diarization in Adverse Conditions

Mohammadi, Hamid Reza; Sameti, Hossein

Mohammadi, Hamid Reza | 2011

Type of Document: M.Sc. Thesis
Language: Farsi
Document No: 42081 (19)
University: Sharif University of Technology
Department: Computer Engineering
Advisor(s): Sameti, Hossein
Abstract:
The goal of a speaker diarization system is to determine the number of speakers in a conversation and to assign each segment of the conversation to one of them. Such systems assume that the identities of the speakers are completely unknown. Speaker diarization systems usually operate offline: the system assumes that the whole conversation is available before processing starts. This is effective for applications such as spoken document retrieval, but it is not applicable to speech and speaker recognition systems that must operate online. In this dissertation, an online speaker diarization system is implemented; it uses the output of a continuously running offline diarization system to produce its online output. Four ideas are proposed to improve the performance of such a system: removing unvoiced frames, removing non-vowel frames, using wavelet packet features, and using a new distance measure. Experiments show the effectiveness of these ideas. Incorporating them into the system decreases the speaker segmentation missed detection rate (MDR) and false alarm rate (FAR) by 40% and 30%, respectively, and decreases the Diarization Error Rate by about 10% in short system runs.
Keywords:
Speaker Clustering ; Wavelet Packet Transform ; Distance Measurement ; Distance-based Speaker Segmentation ; Online Speaker Diarization
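
The abstract and keywords refer to distance-based speaker segmentation without giving its details. As a rough illustration of the general technique only, the sketch below slides two adjacent windows over a feature sequence and reports a speaker change wherever the distance between them peaks above a threshold. Everything specific here is an assumption and not taken from the thesis: the scalar per-frame features, the symmetric Kullback-Leibler divergence between fitted Gaussians (a stand-in, not the new distance measure the thesis proposes), and the hypothetical names `gaussian_sym_kl`, `detect_changes`, `win`, and `threshold`.

```python
from statistics import fmean, pvariance


def gaussian_sym_kl(a, b, floor=1e-6):
    """Symmetric KL divergence between 1-D Gaussians fitted to windows a and b.

    Stand-in distance for illustration; not the measure proposed in the thesis.
    """
    mu_a, mu_b = fmean(a), fmean(b)
    # Floor the variances so constant windows do not cause division by zero.
    var_a = max(pvariance(a, mu_a), floor)
    var_b = max(pvariance(b, mu_b), floor)
    d2 = (mu_a - mu_b) ** 2
    return 0.5 * (var_a / var_b + var_b / var_a - 2.0) + 0.5 * d2 * (
        1.0 / var_a + 1.0 / var_b
    )


def detect_changes(frames, win=20, threshold=5.0):
    """Return frame indices where the distance curve has a local peak >= threshold.

    frames: sequence of scalar features, one per frame (hypothetical; real
    systems use multi-dimensional features).
    """
    # Distance between the window ending at t and the window starting at t.
    scores = {
        t: gaussian_sym_kl(frames[t - win:t], frames[t:t + win])
        for t in range(win, len(frames) - win + 1)
    }
    changes = []
    for t, s in scores.items():
        left = scores.get(t - 1, float("-inf"))
        right = scores.get(t + 1, float("-inf"))
        if s >= threshold and s > left and s >= right:
            changes.append(t)
    return changes


if __name__ == "__main__":
    # Synthetic two-speaker signal: the feature level shifts at frame 50.
    speaker_a = [0.1 * ((i % 5) - 2) for i in range(50)]
    speaker_b = [5.0 + 0.1 * ((i % 5) - 2) for i in range(50)]
    print(detect_changes(speaker_a + speaker_b))
```

In this kind of scheme the threshold trades the two segmentation errors named in the abstract against each other: raising it tends to miss more true changes (higher MDR), while lowering it tends to report more spurious ones (higher FAR).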
