Search for: speech-recognition
Total 131 records

    A new solution for password key transferring in steganography methods by CAPTCHA through MMS technology

    , Article 2007 International Conference on Information and Emerging Technologies, ICIET, Karachi, 6 July 2007 through 7 July 2007 ; 2007 , Pages 136-141 ; 1424412463 (ISBN); 9781424412464 (ISBN) Shirali Shahreza, M ; Shirali Shahreza, M. H ; Sharif University of Technology
    2007
    Abstract
    The Multimedia Messaging System (MMS) allows a user of a mobile phone to send messages containing multimedia objects, such as images, audio or video clips. On the other hand, establishing hidden communication is an important subject that has gained increasing importance with the development of the Internet. One of the methods introduced for establishing hidden communication is steganography. Therefore steganography in MMS is an interesting idea. One of the problems in steganography methods is securely transferring the password key used for steganography between the sender and receiver of secure data. In this paper a new method is proposed for solving this problem using...

    Localized CAPTCHA for illiterate people

    , Article 2007 International Conference on Intelligent and Advanced Systems, ICIAS 2007, Kuala Lumpur, 25 November 2007 through 28 November 2007 ; 2007 , Pages 675-679 ; 1424413559 (ISBN); 9781424413553 (ISBN) Shirali Shahreza, M. H ; Shirali Shahreza, M ; Sharif University of Technology
    2007
    Abstract
    Nowadays, many daily human activities such as education, commerce, talks, etc. are carried out through the Internet. In cases such as registering on websites, some hackers write programs to make automatic false enrolments, which waste the resources of the website and may even stop the entire website from working. Therefore, it is necessary to tell human users apart from computer programs; methods for doing so are known as CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart). CAPTCHA methods are mainly based on the weak points of OCR (Optical Character Recognition) systems, while using them is undesirable to human users. So the Non-OCR-Based CAPTCHA methods are...

    Deep learning in analytical chemistry

    , Article TrAC - Trends in Analytical Chemistry ; Volume 145 , 2021 ; 01659936 (ISSN) Debus, B ; Parastar, H ; Harrington, P ; Kirsanov, D ; Sharif University of Technology
    Elsevier B.V  2021
    Abstract
    In recent years, extensive research in the field of Deep Learning (DL) has led to the development of a wide array of machine learning algorithms dedicated to solving complex tasks such as image classification or speech recognition. Due to their unprecedented ability to explore large volumes of data and extract meaningful hidden structures, DL models have naturally drawn attention from various fields in science. Analytical chemistry, in particular, has successfully benefited from the application of DL tools for extracting qualitative and quantitative information from high-dimensional and complex chemical measurements. This report provides introductory reading for understanding DL machinery... 

    Light-SERNet: a lightweight fully convolutional neural network for speech emotion recognition

    , Article 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022, 23 May 2022 through 27 May 2022 ; Volume 2022-May , 2022 , Pages 6912-6916 ; 15206149 (ISSN); 9781665405409 (ISBN) Aftab, A ; Morsali, A ; Ghaemmaghami, S ; Champagne, B ; Chinese and Oriental Languages Information Processing Society (COLPIS); Singapore Exhibition and Convention Bureau; The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen); The Institute of Electrical and Electronics Engineers Signal Processing Society ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2022
    Abstract
    Detecting emotions directly from a speech signal plays an important role in effective human-computer interactions. Existing speech emotion recognition models require massive computational and storage resources, making them hard to implement concurrently with other machine-interactive tasks in embedded systems. In this paper, we propose an efficient and lightweight fully convolutional neural network for speech emotion recognition in systems with limited hardware resources. In the proposed FCNN model, various feature maps are extracted via three parallel paths with different filter sizes. This helps deep convolution blocks to extract high-level features, while ensuring sufficient separability.... 

    Towards MPEG4 compatible face representation via hierarchical clustering-based facial feature extraction

    , Article ISCI 2011 - 2011 IEEE Symposium on Computers and Informatics ; 2011 , Pages 436-441 ; 9781612846903 (ISBN) Ghahari, A ; Mosleh, M ; Sharif University of Technology
    Abstract
    Multi-view imaging and display systems have taken a divide-and-conquer approach to 3D sensing and visualization. We aim to make more reliable and robust automatic feature extraction and natural 3D feature construction from 2D features detected on a pair of frontal and profile view face images. We propose several heuristic algorithms to minimize possible errors introduced by prevalent imperfect orthogonal condition and non-coherent luminance, trying to address the problems incurred with illumination discrepancies on common surface points in accommodation of multi-views. In our approach, we first extract the 2D features that are visible to both cameras in both views. Then, we estimate the...

    Speech signal modeling using multivariate distributions

    , Article Eurasip Journal on Audio, Speech, and Music Processing ; Volume 2015, Issue 1 , 2015 , Pages 1-14 ; 16874714 (ISSN) Aroudi, A ; Veisi, H ; Sameti, H ; Mafakheri, Z ; Sharif University of Technology
    Springer International Publishing  2015
    Abstract
    Using a proper distribution function for the speech signal or for its representations is of crucial importance in statistical-based speech processing algorithms. Although the most commonly used probability density function (pdf) for speech signals is Gaussian, recent studies have shown the superiority of super-Gaussian pdfs. A large research effort has focused on the investigation of the univariate case of speech signal distribution; however, in this paper, we study the multivariate distributions of the speech signal and its representations using the conventional distribution functions, e.g., multivariate Gaussian and multivariate Laplace, and the copula-based multivariate distributions as candidates....

    Hybrid clustering-based 3D face modeling upon non-perfect orthogonality of frontal and profile views

    , Article 2010 International Conference on Computer Information Systems and Industrial Management Applications, CISIM 2010, 8 October 2010 through 10 October 2010, Krakow ; 2010 , Pages 578-584 ; 9781424478170 (ISBN) Ghahari, A ; Mosleh, M ; Sharif University of Technology
    2010
    Abstract
    Multi view imaging has attracted increasing attention recently and has become one of the potential avenues in future video systems. We aim to make more reliable and robust automatic feature extraction and natural 3D feature construction from 2D features detected on a pair of frontal and profile view face images. We propose several heuristic algorithms to minimize possible errors introduced by prevalent non-perfect orthogonal condition and non-coherent luminance. In our approach, we first extract the 2D features that are visible to both cameras in both views. Then, we estimate the coordinates of the features in the hidden profile view based on the visible features extracted in the two... 

    Automatic MPEG4 compatible face representation using clustering-based modeling schemes

    , Article 2010 International Conference on Computer Information Systems and Industrial Management Applications, CISIM 2010, 8 October 2010 through 10 October 2010, Krakow ; 2010 , Pages 96-102 ; 9781424478170 (ISBN) Ghahari, A ; Mosleh, M ; Sharif University of Technology
    2010
    Abstract
    Multi view imaging has attracted increasing attention recently and has become one of the potential avenues in future video systems. We aim to make more reliable and robust automatic feature extraction and natural 3D feature construction from 2D features detected on a pair of frontal and profile view face images. We propose several heuristic algorithms to minimize possible errors introduced by prevalent non-perfect orthogonal condition and non-coherent luminance. In our approach, we first extract the 2D features that are visible to both cameras in both views. Then, we estimate the coordinates of the features in the hidden profile view based on the visible features extracted in the two... 

    Capacity bounds and detection schemes for data over voice

    , Article IEEE Transactions on Vehicular Technology ; Volume 65, Issue 11 , 2016 , Pages 8964-8977 ; 00189545 (ISSN) Kazemi, R ; Boloursaz, M ; Etemadi, S. M ; Behnia, F ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2016
    Abstract
    Cellular networks provide widespread and reliable voice communications among subscribers through mobile voice channels. These channels benefit from superior priority and higher availability compared with conventional cellular data communication services, such as General Packet Radio Service, Enhanced Data Rates for GSM Evolution, and High-Speed Downlink Packet Access. These properties are of major interest to applications that require transmitting small volumes of data urgently and reliably, such as an emergency call in vehicular applications. This encourages extensive research to make digital communication through voice channels feasible, leading to the emergence of Data over Voice (DoV)...

    Estimation of current-induced scour depth around pile groups using neural network and adaptive neuro-fuzzy inference system

    , Article Applied Soft Computing Journal ; Volume 9, Issue 2 , 2009 , Pages 746-755 ; 15684946 (ISSN) Zounemat Kermani, M ; Beheshti, A. A ; Ataie Ashtiani, B ; Sabbagh Yazdi, S. R ; Sharif University of Technology
    2009
    Abstract
    The process of local scour around bridge piers is fundamentally complex due to the three-dimensional flow patterns interacting with bed materials. For geotechnical and economic reasons, multiple pile bridge piers have become more and more popular in bridge design. Although many studies have been carried out to develop relationships for the maximum scour depth at pile groups under clear-water scour conditions, existing methods do not always produce reasonable results for scour predictions. This is partly due to the complexity of the phenomenon involved and partly because of limitations of the traditional analytical tool of statistical regression. This paper addresses the latter part and...

    Significant pathological voice discrimination by computing posterior distribution of balanced accuracy

    , Article Biomedical Signal Processing and Control ; Volume 73 , 2022 ; 17468094 (ISSN) Pakravan, M ; Jahed, M ; Sharif University of Technology
    Elsevier Ltd  2022
    Abstract
    The ability to speak lucidly plays a key role in social relations. Consequently, the role of the larynx is quite important, and timely diagnosis of laryngeal diseases has proved to be crucial. In this study, a simple computational model for the inverse of the speech production model is employed to extract the glottal waveform from the speech signal. This waveform carries useful information about vocal fold performance, providing evidence for distinguishing pathological disorders. Furthermore, obtaining the significance of classification results is important, because it leads to reliable inferences. This study utilizes the sustained vowel sound /a/ and a well-referenced database, namely MEEI. In...