
Improvement to speech-music discrimination using sinusoidal model based features

Shirazi, J ; Sharif University of Technology | 2010

  1. Type of Document: Article
  2. DOI: 10.1007/s11042-009-0416-3
  3. Publisher: 2010
  4. Abstract:
  5. This paper addresses model-based audio content analysis for classifying mixed speech-music audio signals into speech and music. A set of new features based on sinusoidal modeling of audio signals is presented and evaluated. The new feature set, which includes the variance of the birth frequencies and the duration of the longest frequency track in the sinusoidal model as measures of harmony and signal continuity, is introduced and discussed in detail. These features are used as inputs to an audio classifier and compared with typical features. Their performance is evaluated by classifying audio into speech and music using both GMM (Gaussian Mixture Model) and SVM (Support Vector Machine) classifiers. Experimental results show that the proposed features are quite successful in speech/music discrimination. Using a set of only two sinusoidal model features, extracted from 1-s segments of the signal, we achieved 96.84% accuracy in audio classification. Experimental comparisons also confirm the superiority of the sinusoidal model features over the popular time-domain and frequency-domain features in audio classification.
  6. Keywords:
  7. Audio classification ; Audio content analysis ; Audio signal ; Classification of speech ; Experimental comparison ; Feature sets ; Frequency domains ; Gaussian Mixture Model ; Model-based ; Signal continuity ; Sinusoidal model ; Sinusoidal modeling ; Speech/music discrimination ; SVM (support vector machine) ; Time domain ; Classification (of information) ; Classifiers ; Signal analysis ; Speech recognition ; Time domain analysis ; Audio acoustics
  8. Source: Multimedia Tools and Applications ; Volume 50, Issue 2, November 2010, Pages 415-435 ; 13807501 (ISSN)
  9. URL: http://link.springer.com/article/10.1007%2Fs11042-009-0416-3
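
The two features named in the abstract (variance of the birth frequencies and duration of the longest frequency track) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Track` structure, the 10 ms hop size, the use of population variance, and the assumption that sinusoidal tracks have already been extracted for a 1-s segment are all hypothetical choices made for illustration. The record above does not give the paper's exact definitions.

```python
from dataclasses import dataclass
from statistics import pvariance
from typing import List, Tuple


@dataclass
class Track:
    """Hypothetical frequency track: one sinusoidal partial followed over frames."""
    birth_frame: int        # index of the frame where the track starts
    freqs_hz: List[float]   # one frequency per frame, starting at birth_frame


def birth_frequency_variance(tracks: List[Track]) -> float:
    """Population variance of the frequency at which each track is born."""
    births = [t.freqs_hz[0] for t in tracks if t.freqs_hz]
    return pvariance(births) if births else 0.0


def longest_track_duration(tracks: List[Track], hop_s: float) -> float:
    """Duration in seconds of the longest track, given the frame hop size."""
    if not tracks:
        return 0.0
    return max(len(t.freqs_hz) for t in tracks) * hop_s


def segment_features(tracks: List[Track], hop_s: float = 0.01) -> Tuple[float, float]:
    """Two-element feature vector for one segment (hop_s is an assumed value)."""
    return (birth_frequency_variance(tracks), longest_track_duration(tracks, hop_s))


if __name__ == "__main__":
    demo = [Track(0, [100.0, 101.0, 102.0]), Track(1, [300.0])]
    print(segment_features(demo))
```

A feature pair of this kind, computed per segment, could then be passed to any GMM or SVM classifier. The intuition suggested by the abstract is that the two values reflect harmony and signal continuity, which differ between speech and music.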