Search for: telephony-speech-recognition

    Speaker phone mode classification using Gaussian mixture models

    Article ; SPA 2011 - Signal Processing: Algorithms, Architectures, Arrangements, and Applications - Conference Proceedings, 29 September 2011 through 30 September 2011 ; September 2011 ; Pages 112-117 ; 9781457714863 (ISBN) ; Eghbal Zadeh, H ; Sobhan Manesh, F ; Sameti, H ; BabaAli, B ; Sharif University of Technology
    2011
    Abstract
    This study focuses on classifying the speaker mode of phones using Gaussian mixture models (GMM). To this end, speech data recorded with the speaker mode enabled and disabled on cell phones and telephones were collected, processed and classified into two categories. GMMs with 1 to 4 mixtures and wave file sizes of 10, 20, 40 and 80 kb were tested to find the best conditions for classification. The GMM method attained an 87.99% correct classification rate on test data. This classification is important for speech-enabled interactive voice response (IVR) systems [1], dialog systems and many other speech processing systems, in the sense that it could help to load an optimum model for increasing system...
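    The two-class GMM decision described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which features were used, so the sketch assumes one scalar feature per frame and synthetic data, and the names `fit_gmm`, `classify`, `speaker_on` and `speaker_off` are hypothetical. One GMM is fitted per mode with EM, and an utterance is assigned to the mode whose model gives the higher total log-likelihood.

```python
import math
import random


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def component_logs(x, gmm):
    """Log of weight * density for every (weight, mean, var) component."""
    return [math.log(w) + log_gauss(x, m, v) for w, m, v in gmm]


def log_likelihood(x, gmm):
    """Log-likelihood of one frame under the mixture (log-sum-exp)."""
    terms = component_logs(x, gmm)
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def fit_gmm(data, k, iters=50, var_floor=1e-3):
    """Fit a k-component 1-D GMM with EM (assumed setup, not the paper's)."""
    xs = sorted(data)
    n = len(xs)
    mean_all = sum(xs) / n
    var_all = max(sum((x - mean_all) ** 2 for x in xs) / n, var_floor)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    gmm = [(1.0 / k, xs[(2 * j + 1) * n // (2 * k)], var_all) for j in range(k)]
    for _ in range(iters):
        # E-step: responsibility of each component for each frame.
        resp = []
        for x in data:
            terms = component_logs(x, gmm)
            top = max(terms)
            ps = [math.exp(t - top) for t in terms]
            total = sum(ps)
            resp.append([p / total for p in ps])
        # M-step: re-estimate weight, mean and variance of each component.
        new_gmm = []
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-10)
            mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
            new_gmm.append((max(nj / n, 1e-6), mean, max(var, var_floor)))
        gmm = new_gmm
    return gmm


def classify(frames, models):
    """Return the label whose GMM gives the highest total log-likelihood."""
    scores = {
        label: sum(log_likelihood(x, gmm) for x in frames)
        for label, gmm in models.items()
    }
    return max(scores, key=scores.get)


# Synthetic stand-in data: the two modes are assumed to differ in one feature.
rng = random.Random(0)
train_off = [rng.gauss(0.0, 1.0) for _ in range(400)]
train_on = [rng.gauss(4.0, 1.0) for _ in range(400)]
models = {
    "speaker_off": fit_gmm(train_off, k=2),
    "speaker_on": fit_gmm(train_on, k=2),
}
```

    Changing `k` from 1 to 4 corresponds to the mixture numbers compared in the abstract; a real system would use multi-dimensional acoustic features rather than the single synthetic feature assumed here.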

    Introducing a framework to create telephony speech databases from direct ones

    Article ; 14th International Conference on Systems Signals and Image Processing, IWSSIP 2007 and 6th EURASIP Conference Focused on Speech and Image Processing, Multimedia Communications and Services, EC-SIPMCS 2007, Maribor, 27 June 2007 through 30 June 2007 ; November 2007 ; Pages 327-330 ; 9789612480295 (ISBN) ; Momtazi, S ; Sameti, H ; Vaisipour, S ; Tefagh, M ; Sharif University of Technology
    2007
    Abstract
    A comprehensive speech database is one of the important tools for developing speech recognition systems, and such databases are necessary for telephony recognition as well. Although adequate databases exist for direct speech recognizers, there is no appropriate database for telephony speech recognizers. Most methods suggested for solving this problem either build new databases, which consumes much time and many resources, or use a filter that simulates circuit-switch behavior to transform direct databases into telephony ones; in the latter case, the resulting databases differ in many ways from real telephony databases. In this paper we introduce a framework for creating telephony speech...