- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 52770 (19)
- University: Sharif University of Technology
- Department: Computer Engineering
- Advisor(s): Soleymani, Mahdieh
- Abstract:
- With the advances in machine learning, and especially deep learning, over the past decade, these methods have been used ever more widely for language modeling. Because language modeling is such a fundamental task, huge networks have recently been trained with a language-modeling objective and then fine-tuned on target tasks such as question answering and sentiment analysis, a promising sign of the task's importance even for other NLP applications. The task nevertheless still has severe problems. Methods based on teacher forcing suffer from the so-called exposure bias problem, which arises from the discrepancy between the training and test procedures. Solutions such as reinforcement learning, which suffers from high variance, and other approximate approaches have been proposed. For models with a latent space, it has been reported that the decoder ignores the latent space.
- The more practical task of conditional text generation, where the condition may be as simple as the tense of the output sentence or as complex as a context or topic, is of great importance. Such models are not restricted to text generation: they can also be used to create drug molecules with specific characteristics, music of a specific genre, or graphs with specific features. Conditional text generation adds a further problem: the generated sentence may not match the desired condition.
- This project discusses two latent-variable perspectives on conditional text generation with discrete conditions. In the first, the latent space is completely determined by the condition value. The proposed method is a latent-variable model that controls the latent-space ignorance problem and partitions the prior distribution over the latent space according to the condition values; by incorporating normalizing-flow networks, which have recently attracted attention, it learns the latent-space distribution of each condition. In the second perspective, the latent space is independent of the condition: it captures only the content of the sentences, not the condition, so the latent space is not separated by condition value.
- Finally, the baseline and proposed methods are evaluated on several metrics, including quality, diversity, and the percentage of generated text that matches the desired condition. The first proposed method outperforms the others in condition-match percentage while keeping the quality and diversity of its samples close to the baselines. The second method achieves results similar to the latent-variable baseline while using a new training approach, and attains superior quality and diversity on some datasets.
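- To make the first perspective concrete, below is a minimal sketch, assuming PyTorch, of what a condition-partitioned latent prior p(z|c) could look like: each discrete condition value owns a simple affine normalizing flow that warps a standard-Gaussian base density. The class name `ConditionalAffinePrior` and its methods are illustrative assumptions, not the thesis implementation, whose flows are presumably more expressive than a single affine layer.

```python
# Sketch (not the thesis code): a condition-partitioned latent prior p(z | c),
# realized as one affine normalizing flow per discrete condition value.
import math
import torch
import torch.nn as nn

class ConditionalAffinePrior(nn.Module):
    """Per-condition affine flow: z = mu_c + exp(log_sigma_c) * eps, eps ~ N(0, I)."""

    def __init__(self, num_conditions: int, latent_dim: int):
        super().__init__()
        self.latent_dim = latent_dim
        # Each condition id indexes its own flow parameters,
        # so the prior is partitioned by condition value.
        self.mu = nn.Embedding(num_conditions, latent_dim)
        self.log_sigma = nn.Embedding(num_conditions, latent_dim)

    def sample(self, c: torch.Tensor) -> torch.Tensor:
        """Draw latent codes for a batch of condition ids c."""
        eps = torch.randn(c.shape[0], self.latent_dim, device=c.device)
        return self.mu(c) + self.log_sigma(c).exp() * eps

    def log_prob(self, z: torch.Tensor, c: torch.Tensor) -> torch.Tensor:
        """Change of variables: log p(z|c) = log N(eps; 0, I) - sum_d log sigma_c."""
        eps = (z - self.mu(c)) / self.log_sigma(c).exp()
        log_base = -0.5 * (eps ** 2).sum(-1) \
                   - 0.5 * self.latent_dim * math.log(2 * math.pi)
        return log_base - self.log_sigma(c).sum(-1)

# Usage: score latent codes under the condition-specific prior, e.g. as the
# prior term of a VAE-style objective.
prior = ConditionalAffinePrior(num_conditions=2, latent_dim=16)
c = torch.tensor([0, 1, 1])
z = prior.sample(c)
print(prior.log_prob(z, c).shape)  # torch.Size([3])
```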
- Keywords:
- Neural Networks ; Deep Learning ; Generative Models ; Conditional Text Generation ; Generative Models with Latent Space
- Table of Contents
- 1 Introduction
- 2 Related Work
- 3 Proposed Method
- 4 Implementation, Experiments, and Evaluation
- 5 Conclusion and Future Work
- A Network Training Plots
- References
- Persian-to-English Glossary
- English-to-Persian Glossary