
A Persian Dialog System with Sequence to Sequence Learning

M.Sc. Thesis, Sharif University of Technology; Ghafourian, Mohammad (Author); Sameti, Hossein (Supervisor)

Abstract

Conversation modeling is one of the most important goals in natural language understanding and machine intelligence. Recently, with the enormous growth of the Internet and social networks, the amount of data available on the Web has increased significantly. This makes it possible to use data-driven approaches to the conversation modeling problem. One of the most recent data-driven methods is sequence-to-sequence modeling. In this document, after providing the necessary prerequisites, we examine the various models that have used the sequence-to-sequence approach for conversation modeling. We further examine ways of improving the efficiency of this modeling...


Deep Learning for Speech Recognition

M.Sc. Thesis, Sharif University of Technology; Azadi Yazdi, Saman (Author); Sameti, Hossein (Supervisor)

Abstract

Speech recognition is one of the first goals of speech processing. Our goal in this thesis is to use deep learning for speech recognition. In recent years, little improvement in speech recognition accuracy has been reported. Deep learning is a recent learning approach that has improved results in many machine learning tasks. Following the improvements that deep learning has brought to English speech recognition, in this thesis we try to improve accuracy over common and recent recognition methods for Persian.
First, the overall structure of a typical speech recognition system is introduced, together with the modules of such a system. Deep multilayer...


Learning Dialogue Management in Spoken Dialogue Systems

M.Sc. Thesis, Sharif University of Technology; Habibi, Maryam (Author); Sameti, Hossein (Supervisor)

Abstract

The use of spoken dialogue systems (SDSs) in real life is growing rapidly because of advances in the design and management of these systems. Traditional touch-tone computer telephony systems are being replaced by SDSs. In a typical SDS, the user speaks naturally to the system over a phone line and the system provides the required information or performs the required action. Banking and ticket reservation are typical examples of prevalent SDSs. A spoken dialogue system has four units: automatic speech recognition (ASR), natural language understanding (NLU), dialogue management (DM), and spoken language generation (SLG). In this work, the first spoken dialogue...


Uncertainty Reduction in Speaker Verification with Short Duration Utterances

Ph.D. Dissertation, Sharif University of Technology; Maghsoodi, Nooshin (Author); Sameti, Hossein (Supervisor)

Abstract

Voice biometrics are used in today's telephone-based speaker verification because of their unique suitability for remote access. However, there are significant challenges in implementing such systems. One of these challenges is the need for sufficient data in the enrollment phase. In fact, the speaker verification system needs a dataset that covers the phonetic variations of the language to be able to discriminate between different speakers. In real applications, it is not easy to ask speakers to say long utterances. Therefore, an ideal speaker verification system should be able to detect impostors without any constraint on the input lexicon, whether the utterances are long or short. The results of...


Stock Market Prediction Using Deep Learning based on Social Networks Data

M.Sc. Thesis, Sharif University of Technology; Shafiei Masoleh, Mohammad (Author); Sameti, Hossein (Supervisor)

Abstract

Stock market prediction has always been a challenging task. Due to its stochastic nature, naive models cannot solve the problem. In the past, statistical models were used; nowadays, however, with the rise of deep learning and more complex models, aggregating data in order to predict the stock price has become feasible. Moreover, the emergence of social networks enables researchers to design models for stock prediction. Researchers have used recurrent networks and word vector representations to solve this problem. However, newer models such as generative models based on VAEs and attention have recently gained interest. Newer models also don't rely on a single data source and use multiple...


Prediction of Stock Market Based on Corporate Financial Reports Using Deep Learning

M.Sc. Thesis, Sharif University of Technology; Shafiei Masoleh, Hamed (Author); Sameti, Hossein (Supervisor)

Abstract

Creating tools for automated trading and advisory tools is of great importance for stock markets. Information about stock markets varies in type, e.g. financial disclosures, news, price history, audit reports, etc. The variety and volume of this information, together with the large number of factors affecting the stock price, make the stock market hard to predict as a whole. Therefore, predictions are usually limited to a subset of the data. The goal of this research is to take advantage of the newest language processing techniques in order to analyze financial disclosure documents and predict their effect on the related stock price. Financial disclosures usually have a longer...


Conversational Question Answering in Partial Context

M.Sc. Thesis, Sharif University of Technology; Satvaty, Ali (Author); Sameti, Hossein (Supervisor)

Abstract

Conversational Question Answering (CQA) has gained significant attention in recent years due to its potential to facilitate natural language interactions between humans and machines. The ability to effectively incorporate relevant history turns, which are previous utterances in a conversation, plays a crucial role in improving the overall performance of CQA systems. In this master's thesis, we explore the importance of conversational question answering and propose a novel approach for selecting relevant history turns to enhance the accuracy and relevance of the system's responses. Initially, we provide an overview of the recent models developed for addressing the CQA challenge. We analyze...


Keyword Spotting in Continuous Speech Based on Hidden Markov Model

M.Sc. Thesis, Sharif University of Technology; Tavanaei, Amirhossein (Author); Sameti, Hossein (Supervisor)

Abstract

In this thesis we describe keyword spotting in continuous speech based on hidden Markov modeling. The aim of keyword spotting is to detect the specified keywords and reject the rest of the speech stream, using a network of keyword models and a garbage model. Phoneme recognition is the basis of this work, and we obtain appropriate feature vectors and models for phonemes. The two main parts of keyword spotting are the keyword models and the filler model, connected together by a network grammar. The Viterbi algorithm can recognize keywords and non-keywords using the network grammar. Each keyword model is created by concatenating phoneme HMMs. In experiments, keyword models with one skip in states of HMMs...
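
As an illustration of the decoding step mentioned in this abstract, the following is a minimal Python sketch of Viterbi decoding over a toy two-state network with one "keyword" state and one "filler" state. The state names, the symbols "k" and "x", and all probabilities are hypothetical values chosen for illustration; they are not taken from the thesis, whose keyword models are concatenated phoneme HMMs joined by a network grammar.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely state sequence for a discrete HMM.

    Works in the log domain to avoid underflow on long sequences.
    All probabilities passed in must be strictly positive.
    """
    # best[t][s]: best log-probability of any path that ends in state s at time t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states,
                       key=lambda p: best[t - 1][p] + math.log(trans_p[p][s]))
            best[t][s] = (best[t - 1][prev]
                          + math.log(trans_p[prev][s])
                          + math.log(emit_p[s][observations[t]]))
            back[t][s] = prev

    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: "k" is a keyword-like symbol, "x" is anything else.
STATES = ["filler", "keyword"]
START_P = {"filler": 0.5, "keyword": 0.5}
TRANS_P = {"filler": {"filler": 0.7, "keyword": 0.3},
           "keyword": {"filler": 0.3, "keyword": 0.7}}
EMIT_P = {"filler": {"k": 0.2, "x": 0.8},
          "keyword": {"k": 0.9, "x": 0.1}}

if __name__ == "__main__":
    print(viterbi(["x", "x", "k", "k", "x"], STATES, START_P, TRANS_P, EMIT_P))
```

In a real keyword spotter the states would be HMM states of phoneme models and the observations would be acoustic feature vectors scored by emission densities; the dynamic-programming recursion is the same.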


Normalization of Non-standard Texts for Persian Language Using Neural Networks

M.Sc. Thesis, Sharif University of Technology; Seyyedi, Javad (Author); Sameti, Hossein (Supervisor)

Abstract

The purpose of this research is to normalize non-standard Persian texts. We propose a method to transform texts with any non-standard structure into a formal, standard form. One of the major complications of text normalization is the large variety of non-standard structures, and the fact that this diversity cannot be captured by a single structural pattern. Furthermore, the concept of text normalization has different definitions in different situations, and each of these settings needs a distinct normalization method. Supervised learning methods are not suitable for normalization due to the variety of both standard and non-standard texts as well as the absence of...


Robust Speech Recognition Based on Data Compensation and MDT Methods

M.Sc. Thesis, Sharif University of Technology; BabaAli, Bagher (Author); Sameti, Hossein (Supervisor)

Abstract

Automatic speech recognition performance degrades significantly when speech is affected by environmental noise. Nowadays, the major challenge is to achieve good robustness in adverse noisy conditions so that automatic speech recognizers can be used in real situations. Spectral subtraction (SS) is a well-known and effective approach; it was originally designed to improve the quality of the speech signal as judged by human listeners. SS techniques usually improve the quality and intelligibility of the speech signal, whereas speech recognition systems need compensation techniques that reduce the mismatch between noisy speech features and acoustic models trained on clean speech. Nevertheless, correlation can be expected...
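
For readers unfamiliar with spectral subtraction, the following is a minimal Python sketch of the basic magnitude-domain form applied to a single frame: a noise magnitude estimate is subtracted from each frequency bin, and a spectral floor keeps the result from going negative. The function name, the over-subtraction factor `alpha`, the floor factor `beta`, and the sample values are illustrative assumptions, not the variant or settings used in the thesis.

```python
def spectral_subtraction(noisy_mag, noise_mag, alpha=1.0, beta=0.01):
    """Basic magnitude spectral subtraction for one frame.

    noisy_mag: magnitude spectrum of the noisy frame, one value per bin
    noise_mag: estimated noise magnitude per bin (e.g. averaged over
               non-speech frames); must have the same length
    alpha:     over-subtraction factor (assumed parameter)
    beta:      spectral floor as a fraction of the noisy magnitude
               (assumed parameter)
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")

    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        estimate = y - alpha * n   # subtract the noise estimate
        floor = beta * y           # never drop below a small fraction of y
        cleaned.append(max(estimate, floor))
    return cleaned


if __name__ == "__main__":
    # Hypothetical three-bin frame; the second bin is dominated by noise.
    print(spectral_subtraction([1.0, 0.5, 2.0], [0.2, 0.6, 0.5]))
```

The floor is what prevents negative magnitudes in noise-dominated bins; the cleaned magnitudes would normally be recombined with the noisy phase before resynthesis or feature extraction.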


Improving Robustness of Speaker Verification Systems Against Non-Identity Information

Ph.D. Dissertation, Sharif University of Technology; Zeinali, Hossein (Author); Sameti, Hossein (Supervisor)

Abstract

Speaker verification, a biometric method, aims to verify the identity of a person from the characteristics of their voice. This method faces many challenges, such as voice imitation (spoofing), use of recorded voice, high sensitivity to convolutive distortions caused by the channel, and a large performance degradation for short-duration utterances. The aim of this thesis is to propose different methods for reducing the effects of non-identity information, especially the channel, and also to develop new methods for text-dependent speaker verification with very short utterances. The i-vector has been the best speaker modeling method in recent years, but it does not result in good...


Discriminative Articulatory Models for Spoken Term Detection in Low-Resource Conditions

M.Sc. Thesis, Sharif University of Technology; Gomar, Zahra (Author); Sameti, Hossein (Supervisor)

Abstract

This thesis focuses on a spoken term detection system based on speech recognition in low-resource conditions. A spoken term detection system is composed of two parts: speech recognition and search. In the word search, the proxy-word method is used as the basic approach to overcome the problem of out-of-vocabulary (OOV) words. The main challenges in the low-resource setting are poorly trained acoustic and language models and a small lexicon in speech recognition. A small lexicon increases the number of OOV words. In this thesis, two innovations are proposed to improve the basic system. The first is training a bottleneck neural network to extract the articulatory features of...


Markov Logic Networks for Persian Spoken Language Understanding

M.Sc. Thesis, Sharif University of Technology; Hemmatan Attarbashi, Ensieh (Author); Bahrani, Mohammad (Supervisor); Khosravizadeh, Parvaneh (Co-Advisor); Sameti, Hossein (Co-Advisor)

Abstract

Spoken Language Understanding (SLU) is aimed at extracting meaning from natural spoken language. Meaning extraction ranges from "extracting specific phrases" to "extracting users' intentions from their speech" and goes as far as "extracting the entities and details of their intentions". Extracting the exact intended meaning of the user is a sophisticated process. In this research, considering the lack of standard data in Persian, an SLU system for this language has been implemented using Markov Logic Networks (MLNs), in order to reduce the need for extra datasets. MLNs combine the explanatory power and orderliness of First-Order Logic with the uncertainty of probabilities. Therefore, these...


Language Modeling for Persian using Recurrent Neural Networks

M.Sc. Thesis, Sharif University of Technology; Pourbagheri, Mohammad (Author); Sameti, Hossein (Supervisor)

Abstract

In recent years, neural networks have been used for language modeling in tasks related to natural language processing. Various neural network structures have been used in these models, and recurrent neural networks (RNNs) have achieved good results in these tasks. Since RNNs are not limited to a fixed number of words when predicting the next word, they have achieved better results than feedforward networks. However, these networks have difficulty learning long sequences, and long short-term memory (LSTM) networks have been proposed to solve this problem. In this research, language models for Persian are built using RNNs and LSTMs and are compared with n-gram-based models. For...


Language Modeling Using Recurrent Neural Networks

M.Sc. Thesis, Sharif University of Technology; Rahimi, Adel (Author); Sameti, Hossein (Supervisor)

Abstract

This thesis examines the differences and similarities between two well-known RNN blocks, the Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU). It measures different aspects such as computational complexity, word error rate (WER), and subjective human evaluation in the task of text generation. In the computational complexity experiment, the results show that the LSTM takes more time to compute than the GRU. In the next experiment, the GRU slightly outperforms the LSTM in terms of WER, but the perplexity of the language models tested was the same. This shows that slight differences in perplexity do not drastically change the WER. That said, the results suggest that the...
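
Word error rate, one of the measures named in this abstract, is conventionally defined as the word-level edit distance between a reference and a hypothesis divided by the number of reference words. The following minimal Python sketch shows that standard definition; the function name and the example sentences are illustrative, and the thesis's own evaluation setup is not described in the text above.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length.

    Both arguments are whitespace-tokenized strings. The edit distance is
    computed with the usual dynamic programme, keeping one row at a time.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j]: edit distance between the first i-1 reference words
    # and the first j hypothesis words
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 / 6
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.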


Persian Statistical Natural Language Understanding Based on Partially Annotated Corpus

M.Sc. Thesis, Sharif University of Technology; Jabbari, Fattaneh (Author); Sameti, Hossein (Supervisor)

Abstract

The spoken language understanding unit is one of the most important parts of a spoken dialogue system. The input of this unit is the output of the speech recognition unit. Its main function is to extract the semantic information from the input utterances. There are two main types of approaches to this task: rule-based approaches and data-driven approaches. Today, data-driven approaches are of more interest because they are more flexible and robust than rule-based approaches. The main drawback of these methods is that they need a large amount of fully annotated or, in some cases, treebank data. Preparing such data is time-consuming and expensive. The goal of this thesis is...


Designing a General Persian Text to Speech System

M.Sc. Thesis, Sharif University of Technology; Jamshidian, Hamed (Author); Sameti, Hossein (Supervisor)

Abstract

In recent years, with advances in artificial intelligence, numerous methods have been proposed for tasks that are sometimes difficult for humans or take a long time to complete. Text-to-speech systems are among the methods that make everyday life easier in different applications. The goal of this research is to propose a method for designing a Persian text-to-speech system that can be used across a wide domain of Persian texts and whose output sounds natural. In recent years, significant advances have been made in designing these systems for widely spoken languages like English. Most of these advances are due to deep learning methods that are suitable for these...


Design of a Knowledge-Grounded Open Domain Dialogue System

M.Sc. Thesis, Sharif University of Technology; Samiei Paghale, Mohammad Mahdi (Author); Sameti, Hossein (Supervisor)

Abstract

Despite significant advances in dialog systems, data-driven dialog systems are often unable to hold content-driven conversations and present real-world knowledge in context. This is due to the lack of knowledge-based conversations in research datasets and the lack of external knowledge in their architectures. As a result, they are far from real-world and open-domain use cases. The goal of this research is to introduce a deep learning dialogue system based on external knowledge and facts, in which the external knowledge can be updated and the model adapts to take the updates into account and hold a rich conversation. It must be noted that external knowledge is assumed as a...


Design and Performance Improvement of a Spoken Term Detection System

M.Sc. Thesis, Sharif University of Technology; Ghadirinia, Marzieh (Author); Sameti, Hossein (Supervisor)

Abstract

Recently, the widespread use of video and radio data has made efficient speech information retrieval systems highly important. In the present work, our focus is on spoken term detection, which is one of the most important approaches to information retrieval. The present system includes two main steps: first, speech processing by means of automatic speech recognition. In the recognition step, we apply a large vocabulary. In all recent approaches, the main concern is retrieving words that are out of vocabulary (OOV). The state-of-the-art way to tackle the problem is to exploit proxy keywords, which are in-vocabulary words that can be recognized instead of OOV words. Such proxies have...


Detecting Speakers in a Telephone Conversation

M.Sc. Thesis, Sharif University of Technology; Soltani Farani, Ali (Author); Sameti, Hossein (Supervisor)

Abstract

The human speech signal conveys many levels of information ranging from phonetic content to speaker identity and even emotional status. This thesis deals with the task of open-set speaker identification (SI) from an unconstrained telephone conversation between two speakers. The goal is to find at most two speakers among a known set of target speakers that best match the voice samples of the input speech; the input voice samples are not constrained to the target speaker set. The uni-speaker problem is investigated first. The classic GMM-UBM system for text-independent SI and its adapted form are explored. The use of score-space information is advocated as a complementary source to the...
