Niusha, the first Persian speech-enabled IVR platform

Article ; 2010 5th International Symposium on Telecommunications, IST 2010, 4 December 2010 through 6 December 2010, Tehran ; 2010 ; Pages 591-595 ; 9781424481835 (ISBN) ; Bokaei, M. H ; Sameti, H ; Eghbal-Zadeh, H ; BabaAli, B ; Hosseinzadeh, K. H ; Bahrani, M ; Veisi, H ; Sanian, A ; Sharif University of Technology

2010

Abstract

This paper introduces Niusha, the first Persian speech-enabled IVR platform. The platform uses a Persian speech recognizer and a Persian text-to-speech synthesizer to interact with users. It is designed so that it can easily be customized for various domains, and its components can be adjusted with new words

Introducing a framework to create telephony speech databases from direct ones

Article ; 14th International Conference on Systems Signals and Image Processing, IWSSIP 2007 and 6th EURASIP Conference Focused on Speech and Image Processing, Multimedia Communications and Services, EC-SIPMCS 2007, Maribor, 27 June 2007 through 30 June 2007 ; November 2007 ; Pages 327-330 ; 9789612480295 (ISBN) ; Momtazi, S ; Sameti, H ; Vaisipour, S ; Tefagh, M ; Sharif University of Technology

2007

Abstract

A comprehensive speech database is one of the important tools for developing speech recognition systems; such databases are necessary for telephony recognition, too. Although adequate databases for direct speech recognizers exist, there is no appropriate database for telephony speech recognizers. Most methods suggested for solving this problem either build new databases, which tends to consume much time and many resources, or use a filter that simulates circuit-switched behavior to transform direct databases into telephony ones; in the latter case, the resulting databases differ in many ways from real telephony databases. In this paper we introduce a framework for creating telephony speech...

Speaker phone mode classification using Gaussian mixture models

Article ; SPA 2011 - Signal Processing: Algorithms, Architectures, Arrangements, and Applications - Conference Proceedings, 29 September 2011 through 30 September 2011 ; September 2011 ; Pages 112-117 ; 9781457714863 (ISBN) ; Eghbal Zadeh, H ; Sobhan Manesh, F ; Sameti, H ; BabaAli, B ; Sharif University of Technology

2011

Abstract

This study focuses on classifying the speaker mode of phones using Gaussian mixture models (GMM). Speech data were collected from cell phones and telephones with the speaker mode both enabled and disabled, then processed and classified into two categories. Different numbers of GMM mixtures (1 to 4) and wave file sizes of 10, 20, 40 and 80 kb were tested to find an optimal condition for classification. The GMM method attained an 87.99% correct classification rate on test data. This classification is important for speech-enabled IVR systems [1], dialog systems and many other speech processing systems, in the sense that it could help to load an optimum model for increasing system...
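The two-class GMM decision described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which acoustic features were used, so the code assumes a single hypothetical scalar feature per frame, fits one 1-D GMM per class with a basic EM loop, and picks the class with the higher total log-likelihood. The names `fit_gmm`, `log_likelihood`, `classify` and the labels `speaker_on` / `speaker_off` are illustrative assumptions.

```python
import math


def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm(data, k, iters=50, var_floor=1e-3):
    """Fit a 1-D GMM with k components by EM; returns (weights, means, variances)."""
    n = len(data)
    srt = sorted(data)
    # Deterministic initialisation: means at evenly spaced quantiles, shared variance.
    means = [srt[int((i + 0.5) * n / k)] for i in range(k)]
    mean_all = sum(data) / n
    var_all = max(sum((x - mean_all) ** 2 for x in data) / n, var_floor)
    variances = [var_all] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            s = sum(p) or 1e-300
            resp.append([pi / s for pi in p])
        # M-step: re-estimate weights, means and (floored) variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, var_floor)
            weights[j] = nj / n
    return weights, means, variances


def log_likelihood(frames, gmm):
    """Total log-likelihood of a list of feature values under a fitted GMM."""
    weights, means, variances = gmm
    total = 0.0
    for x in frames:
        p = sum(w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(max(p, 1e-300))
    return total


def classify(frames, gmm_on, gmm_off):
    """Return the (hypothetical) label of the model that explains the frames better."""
    if log_likelihood(frames, gmm_on) >= log_likelihood(frames, gmm_off):
        return "speaker_on"
    return "speaker_off"
```

Here `k` plays the role of the mixture number (1 to 4 in the abstract). A real system would use multi-dimensional features and a tested library implementation rather than this hand-written EM loop.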

Kalman filter method for packet loss replacement in presence of background noise

Article ; International Multi-Conference on Systems, Signals and Devices, SSD 2012 - Summary Proceedings, 20 March 2012 through 23 March 2012 ; March 2012 ; 9781467315906 (ISBN) ; Miralavi, S. R ; Ghorshi, S ; Tahaei, A ; Rahimi, A ; Sharif University of Technology

2012

Abstract

A major problem in real-time packet-based communication systems is misrouted or delayed packets, which degrade perceived voice quality. Packets that are not available on time are considered lost. The easiest solution at a network terminal receiver is to substitute silence for the duration of the lost speech segments. In a high-quality communication system, a suitable method or algorithm is needed to replace the missing speech segments and so avoid degradation in speech quality due to packet loss. In this paper, we introduce an adaptive filter for replacing lost speech segments. In this method, a Kalman filter, as a state-space-based method, is used to predict the clean...
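The general idea of filling a gap with Kalman predictions can be sketched as follows. This is a generic illustration, not the paper's method, whose details are cut off in the abstract. It assumes the signal follows a second-order autoregressive model x[n] = a1·x[n-1] + a2·x[n-2] + w[n] with known coefficients, and that lost samples are marked as `None`. The function name `kalman_conceal` and the noise variances `q` and `r` are illustrative assumptions. While samples arrive, the filter runs its usual predict and update steps. During a loss it runs only the predict step and outputs the prediction instead of silence.

```python
import math


def kalman_conceal(samples, a1, a2, q=1e-4, r=1e-2):
    """Kalman filter over `samples`; None entries (lost samples) are predicted.

    State is [x[n], x[n-1]] with transition F = [[a1, a2], [1, 0]],
    observation H = [1, 0], process noise q on x[n], measurement noise r.
    Returns the filter's estimate of x[n] for every n.
    """
    x0, x1 = 0.0, 0.0                        # state estimate
    p00, p01, p10, p11 = 1.0, 0.0, 0.0, 1.0  # state covariance P
    out = []
    for z in samples:
        # Predict: x = F x, P = F P F^T + Q.
        x0, x1 = a1 * x0 + a2 * x1, x0
        fp00 = a1 * p00 + a2 * p10
        fp01 = a1 * p01 + a2 * p11
        p00, p01, p10, p11 = (
            a1 * fp00 + a2 * fp01 + q,
            fp00,
            a1 * p00 + a2 * p01,
            p00,
        )
        # Update only when the sample actually arrived.
        if z is not None:
            s = p00 + r                      # innovation variance
            k0, k1 = p00 / s, p10 / s        # Kalman gain
            innov = z - x0
            x0, x1 = x0 + k0 * innov, x1 + k1 * innov
            p00, p01, p10, p11 = (
                (1.0 - k0) * p00,
                (1.0 - k0) * p01,
                p10 - k1 * p00,
                p11 - k1 * p01,
            )
        out.append(x0)
    return out


# Usage: a pure tone obeys x[n] = 2*cos(w)*x[n-1] - x[n-2] exactly,
# so the assumed AR(2) model matches it and a 20-sample gap can be filled.
w = 0.2
clean = [math.cos(w * n) for n in range(200)]
received = [None if 100 <= n < 120 else s for n, s in enumerate(clean)]
concealed = kalman_conceal(received, 2.0 * math.cos(w), -1.0)
```

In practice the AR coefficients would have to be estimated from the speech received before the gap, and real speech is only approximately autoregressive, so prediction quality falls as the gap gets longer.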