
Isoform Function Prediction Using Deep Neural Network

Ghazanfari, Sara | 2021

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 54237 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Motahari, Abolfazl; Soleymani, Mahdieh
  7. Abstract: Isoforms are mRNAs produced from the same gene locus through a process called alternative splicing. Studies have shown that more than 95% of multi-exon genes in humans undergo alternative splicing. Although the resulting changes in mRNA sequence are small, they may have a systematic effect on cell function and regulation. It is widely reported that isoforms of the same gene can have distinct or even contrasting functions. Studies have also shown that alternative splicing plays a significant role in human health and disease. Despite the wide range of gene function studies, there is little information about the functions of individual isoforms. Recently, some computational methods based on Multiple Instance Learning have been proposed to predict isoform function from gene function and gene expression profiles. However, their performance is limited by the lack of labeled training data. In addition, probabilistic models such as the Conditional Random Field (CRF) have been used to model the relations between isoforms. In this project, we intend to use all available data and useful information, namely isoform sequences, a graph between isoforms built from their expression profiles, and gene ontology graphs, and to propose a comprehensive model based on deep neural networks. The UniProt Gene Ontology (GO) database, a standard reference for gene function, is used to obtain the function of each gene. The NCBI RefSeq database is used to extract gene and isoform sequences, and the NCBI SRA database is used for expression profile data. To measure prediction accuracy, metrics such as the area under the receiver operating characteristic curve (ROC AUC) and the area under the precision-recall curve (PR AUC) are used.
  8. Keywords: Alternative Splicing Learning; Deep Neural Networks; Gene Expression Data; Isoform Function Prediction; Conditional Random Fields (CRF)
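
The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the thesis's evaluation code: the function names `roc_auc` and `average_precision`, the toy labels, and the toy scores are all assumptions made for illustration, and average precision is used here as one common step-wise estimate of PR AUC.

```python
def roc_auc(labels, scores):
    """ROC AUC computed as the fraction of (positive, negative) pairs
    in which the positive example receives the higher score.
    Tied pairs count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: the mean of the precision values measured at the
    rank of each positive example. Assumes distinct scores (no tie handling)."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos


# Hypothetical example: 1 = isoform annotated with a GO term, 0 = not annotated.
toy_labels = [1, 0, 1, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6]
print(roc_auc(toy_labels, toy_scores))
print(average_precision(toy_labels, toy_scores))
```

When positive labels are rare, as is typical for function annotation, the precision-recall view is usually the more informative of the two, which is one reason both metrics are commonly reported together.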

 Bookmark

  • Introduction
    • Definitions and Prerequisites
    • Problem Statement
    • Importance of the Problem
    • Thesis Structure
  • Previous Work
    • Multiple Instance Learning Methods
      • The iMILP Method [li2014high]
      • The WLRM Method [luo2017functional]
      • The DIFFUSE Method [chen2019diffuse]
      • The DisoFun Method [li2016proteogenomic]
    • Domain Adaptation
      • The DeepIsoFun Method [shaw2019deepisofun]
    • Overview of the Protein Function Prediction Problem [shehu2016survey]
    • Comparison of the Gene, Isoform, and Protein Function Prediction Problems
    • Summary
  • Proposed Approach
    • Datasets
      • Rationale for the Selected Datasets
      • Collecting the Selected Datasets
    • Data Preprocessing
      • Isoform Sequence
      • Conserved Domains
      • Isoform Expression Profile
    • Proposed Model
    • Training the Proposed Model
      • Challenges in Training the Model
      • Multi-Task Learning
    • Summary
  • New Results
    • Evaluation Metrics
      • The AUC Metric
      • The AUPRC Metric
    • Evaluation of the Proposed Approach
    • Comparison with Previous Methods
    • Analysis of the Effects of Model Components
    • Consistency with UniProt Sequence Features
    • Evaluation of the Model on Known Isoforms
    • Summary
  • Conclusion