
Non-speaker information reduction from Cosine Similarity Scoring in i-vector based speaker verification

Zeinali, H ; Sharif University of Technology | 2015

  1. Type of Document: Article
  2. DOI: 10.1016/j.compeleceng.2015.09.003
  3. Publisher: Elsevier Ltd, 2015
  4. Abstract:
  5. Cosine similarity and Probabilistic Linear Discriminant Analysis (PLDA) in i-vector space are two state-of-the-art scoring methods in the speaker verification field. While PLDA usually gives better accuracy, Cosine Similarity Scoring (CSS) remains widely used due to its simplicity and acceptable performance. In this domain, several channel compensation and score normalization methods have been proposed to improve performance. We investigate non-speaker information in the cosine similarity metric and propose a new approach to remove it from the decision-making process. I-vectors hold a large amount of non-speaker information such as channel effects, language, and phonetic content. This type of information increases the verification error rate and hence should be removed from the scoring method. To this end, we propose a method that estimates the non-speaker information between two i-vectors using the development set and subtracts it from the cosine similarity. The results indicate that the proposed method performed better than the other implemented methods based on cosine similarity. Furthermore, in certain cases this method outperformed PLDA, and combining it with PLDA improved performance in most cases.
  6. Keywords:
  7. Decision making ; Discriminant analysis ; Speech recognition ; Vectors ; Acceptable performance ; Cosine similarity ; Cosine similarity metric ; Decision making process ; I vectors ; Non-speaker information ; Probabilistic linear discriminant analysis ; Speaker verification ; Vector spaces
  8. Source: Computers and Electrical Engineering ; Volume 48, November 2015, Pages 226–238 ; 00457906 (ISSN)
  9. URL: http://www.sciencedirect.com/science/article/pii/S0045790615003092
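
The scoring idea summarized in the abstract (compute the cosine similarity between two i-vectors, then subtract an estimate of the non-speaker information obtained from a development set) can be illustrated with a minimal Python sketch. The abstract does not specify how the non-speaker term is estimated, so `placeholder_nonspeaker_estimate` below is a hypothetical stand-in chosen only for illustration and is not the authors' estimator; all function names are likewise assumptions.

```python
import math


def cosine_score(w1, w2):
    """Cosine similarity between two i-vectors given as sequences of floats."""
    dot = sum(a * b for a, b in zip(w1, w2))
    norm1 = math.sqrt(sum(a * a for a in w1))
    norm2 = math.sqrt(sum(b * b for b in w2))
    return dot / (norm1 * norm2)


def placeholder_nonspeaker_estimate(w1, w2, dev_ivectors):
    """ASSUMPTION: hypothetical stand-in, not the estimator from the paper.

    Averages each i-vector's mean cosine similarity to development-set
    i-vectors, treating that similarity as non-speaker information.
    """
    m1 = sum(cosine_score(w1, d) for d in dev_ivectors) / len(dev_ivectors)
    m2 = sum(cosine_score(w2, d) for d in dev_ivectors) / len(dev_ivectors)
    return 0.5 * (m1 + m2)


def corrected_score(w1, w2, dev_ivectors):
    """Cosine similarity minus the (placeholder) non-speaker estimate."""
    return cosine_score(w1, w2) - placeholder_nonspeaker_estimate(
        w1, w2, dev_ivectors
    )
```

In a verification setting, `corrected_score` would be compared against a decision threshold in place of the raw cosine score; only the subtraction structure here follows the abstract, and the estimator itself would need to be taken from the article.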