Identification of the Set of Single Nucleotide Variants in Genome Responsible for the Differentiation of Expression of Genes

Khatami, Mahshid | 2021

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 54761 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Rabiee, Hamid Reza; Beigi, Hamid
  7. Abstract:
  8. Single nucleotide polymorphisms (SNPs) are changes caused by a mutation at a single nucleotide in the DNA sequence, and they are the most common type of genetic variation. Some of these changes have little or no effect on cells, while others significantly alter gene expression, which can lead to disease or to resistance to certain diseases. Because of the importance of these changes and their effect on cell function, the relationships between them are also important. Over the past decade, genome-wide association studies have identified thousands of disease-related SNPs. Studies in this field have shown that gene expression in humans is not random and can be predicted with the help of SNPs. In this study, the effect of each SNP on gene expression was first evaluated separately. To do this, samples were grouped by their possible genotypes, and a statistical analysis was performed to determine the significance of this grouping. The significant SNPs were then used as input features to different machine learning models to predict gene expression. Next, the importance of each SNP was calculated by interpreting the trained models. In the last step, the models were retrained using only a fixed number of the most important SNPs as input features. When the proposed method is used to assign importance coefficients to the features, the gene expression predictions do not differ greatly from those of models trained with all the features. As a result, selecting a small set of features and training the model on them does not greatly reduce the accuracy of gene expression estimation.
Finally, the proposed method performed better than previous methods.
  9. Keywords:
  10. Single Nucleotide Polymorphism (SNP) ; Statistical Analysis ; Machine Learning ; Gene Expression
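
The per-SNP screening step described in the abstract can be illustrated with a minimal sketch. The record does not say which statistical test the thesis uses, so the one-way ANOVA F statistic below is an assumption, as are the function names (anova_f, rank_snps), the 0/1/2 genotype coding, and the toy data. The sketch covers only grouping expression values by genotype and ranking SNPs; it does not reproduce the machine learning models or the proposed importance method.

```python
from statistics import mean


def anova_f(expression, genotypes):
    """One-way ANOVA F statistic for expression values grouped by genotype.

    Assumption: genotypes are coded 0, 1, 2 (copies of the minor allele).
    """
    groups = {}
    for value, genotype in zip(expression, genotypes):
        groups.setdefault(genotype, []).append(value)

    k, n = len(groups), len(expression)
    if k < 2 or n <= k:
        return 0.0  # too few groups or samples to compare

    grand_mean = mean(expression)
    # Variation of the group means around the overall mean.
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    # Variation of the samples around their own group mean.
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0

    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_snps(expression, snp_genotypes, top_k):
    """Return the top_k SNP names, ordered by decreasing F statistic."""
    scores = {name: anova_f(expression, g) for name, g in snp_genotypes.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical toy data: expression of one gene in six samples.
expression = [1.0, 1.2, 2.0, 2.1, 3.0, 3.2]
snps = {
    "snp_a": [0, 0, 1, 1, 2, 2],  # genotype tracks expression
    "snp_b": [0, 1, 2, 0, 1, 2],  # genotype unrelated to expression
}
top = rank_snps(expression, snps, top_k=1)
print(top)
```

In a fuller pipeline, the SNPs kept by such a screen would serve as input features to a regression model, and the model would then be retrained on a small, highest-ranked subset to check how much prediction accuracy is lost.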