Speaker Adaptation in HMM-Based Persian Speech Synthesis

M.Sc. Thesis, Sharif University of Technology ; Bahmaninezhad, Fahimeh (Author) ; Sameti, Hossein (Supervisor)

Abstract

Text-to-speech synthesis, one of the key technologies in speech processing, is a technique for generating a speech signal from arbitrary text with a target speaker's voice characteristics and various speaking styles and emotional expressions. Statistical parametric speech synthesis has recently been shown to be very effective in generating acceptable synthesized speech. Therefore, this study focuses on one instance of these techniques, hidden Markov model-based speech synthesis. In text-to-speech systems, it is desirable to synthesize high-quality speech using a small amount of speech data; this goal can be achieved by employing a speaker adaptation framework and...

Improving Persian Word Embeddings Using Neural Networks

M.Sc. Thesis, Sharif University of Technology ; Aliramezani, Mohammad (Author) ; Sameti, Hossein (Supervisor) ; Bokaei, Mohammad Hadi (Co-Supervisor)

Abstract

In recent years, word embeddings as a word representation have captured the attention of natural language processing (NLP) researchers. One of the great advantages of word embeddings is their ability to represent the relationships between words. Therefore, using word embeddings in NLP applications results in better performance. Despite widespread attention to word embeddings in recent years, Persian word embeddings have not made significant progress. One difficulty for Persian word embeddings is that Persian is a low-resource language compared with widely spoken languages. Therefore, the quality of Persian word embeddings is lower than that of English ones. Consequently, the accuracy of...

Cross-Lingual Speaker Adaptation for Statistical Parametric Speech Synthesis

M.Sc. Thesis, Sharif University of Technology ; Saleh, Fatemeh Sadat (Author) ; Sameti, Hossein (Supervisor)

Abstract

Speech synthesis and its applications have attracted considerable interest recently. The main purpose of this technique is to produce a speech signal with the natural characteristics of human speech, such as prosody and emotion. Among existing methods for speech synthesis, statistical parametric methods are more promising because of their higher flexibility compared with other methods. One application of speech synthesis is speech-to-speech translation. In these systems, the generated voice in the target language should have the same characteristics as the input voice in the source language. The main purpose of this research is to review and evaluate the cross-lingual speaker adaptation...

Speaker Adaptation in Eigen Voice Space for Statistical Parametric Speech Synthesis

M.Sc. Thesis, Sharif University of Technology ; Shams, Boshra (Author) ; Sameti, Hossein (Supervisor)

Abstract

Recently, various speaker adaptation methods have been proposed for HMM-based speech synthesis. Adaptation techniques are important because they make it possible to design a system that generates high-quality speech with the target speaker's characteristics from a limited adaptation data set.
In this research, we focus on clustering-based adaptation and develop a novel method that uses eigenvoices to adapt to a new speaker. We employ this approach for the first time in HSMM-based speech synthesis systems; its goal is to reduce the number of parameters and the amount of adaptation data the system requires. In our proposed method, some speaker-dependent models are trained first. For each model we combine the...

HMM-based Persian speech synthesis using limited adaptation data

Article, International Conference on Signal Processing Proceedings, ICSP ; Volume 1, 2012, Pages 585-589 ; 9781467321945 (ISBN) ; Bahmaninezhad, F ; Sameti, H ; Khorram, S ; Sharif University of Technology

2012

Abstract

Speech synthesis systems developed for the Persian language so far need various large-scale speech corpora to synthesize the voices of several target speakers. Accordingly, synthesizing speech from a small amount of data is essential for Persian. Speaker adaptation in speech synthesis systems makes it possible to generate speech of remarkable quality when the speaker's data are limited. Here we applied this method for the first time in Persian. This paper describes speaker adaptation based on Hidden Markov Models (HMMs) in a Persian speech synthesis system for the FARsi Speech DATabase (FARSDAT). In this regard, we prepared the whole FARSDAT, then for...

Pre-trained Model Utilization Using Cross-lingual Methods

M.Sc. Thesis, Sharif University of Technology ; Hosseini, Mohammad (Author) ; Sameti, Hossein (Supervisor) ; Motahari, Abolfazl (Supervisor)

Abstract

Following the dramatic changes brought about by deep learning methods in natural language processing tasks, the Transformer architecture became popular. Building on it, the BERT language model was introduced and achieved state-of-the-art results on many language processing tasks, marking a turning point in the field. In parallel, the cross-lingual line of research, motivated by developing a common space for representing language units (e.g., words and sentences) in more than one language, achieved some remarkable improvements. However, for languages distant from English, such as Persian or Arabic, the performance of these methods was not clear. In this work, we performed some innovative...
