Analyzing Cancer Cell Identity and Appropriative Subnetworks using Machine Learning

Saberi, Ali | 2018

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 50826 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Rabiee, Hamid Reza; Sharifi Zarchi, Ali
  7. Abstract:
  8. Cancer has long threatened human health, and researchers have grappled with it for many years. Over the course of this struggle, cancer's victims have outnumbered its survivors to such an extent that, until recently, suffering from cancer was perceived as equivalent to death. This persistent defeat stems from an incomplete understanding of the disease. In recent years, technologies that extract information from within cells at the genome and transcriptome levels have made possible a deeper understanding of cancer, its behavior, and its operation. Now that cancer is regarded as a genetic disease, identifying it more accurately and treating it more effectively requires returning to its source, the genome, and analyzing this facet of cancer as thoroughly as possible. Previously, the analysis of key events at the genome level amounted to studying genes individually and in isolation. Although this approach worked for many diseases, cancer, owing to its high complexity, requires a more thorough examination, and over the years researchers have concluded that they must investigate how genes function as groups in order to uncover cancer's behavioral patterns and detect its weak points. Determining the identity of cancer cells from the group behavior of genes can serve as a basis for much cancer research and lead to better planning for fighting cancer, ultimately overcoming it and improving the health of millions of people. To this end, this study inferred the relations between genes from gene expression data in healthy and cancer cells and, after the necessary processing, sought to gather and construct a comprehensive gene regulatory network.
From this gene-interaction network, specialized subnetworks for four cancer types (breast, prostate, lung, and kidney) were extracted in an entirely novel way based on Markov random processes. The behavioral pattern of the random process in subnetworks that are active in cancer is shown to differ significantly from that in other subnetworks, and examining this pattern makes it possible to determine the identity of the cell. These subnetworks are fully consistent with what has been discovered about cancer so far; the most important evidence for this claim is that curated cancer-related cellular pathways follow the same pattern. Identifying and extracting these patterns yields outstanding performance in recognizing cancer cell identity: with the aid of these subnetworks and their characteristics, cancer cells were categorized with a precision of 100%.
  9. Keywords:
  10. Cancer Cells; Machine Learning; Gene Expression Data; Cancer Cell Identity; Cancer Cell Subnetworks

 Bookmark

  • List of Figures
  • List of Tables
  • Introduction
    • Significance of the Research
    • Research Objectives
    • Research Resources and Limitations
    • Theoretical Definition of the Research Variables
  • Literature Review and Scientific Background
    • Introduction
      • Cell
      • Gene
      • Gene Expression
      • Genome
      • Gene Expression Data
      • Relationships Between Genes (Gene Network)
      • Gene Subnetworks
      • Onset of Embryonic Development
      • Stem Cell
      • Cellular Dedifferentiation
      • Cellular Transdifferentiation
      • Engineering Cellular Transdifferentiation
      • The CellNet Framework
      • Data
      • Data Layers
      • Research Data
      • Types of Gene Expression Data
      • Steady-State Data
      • Extracting Gene Expression Data
      • RNA Sequencing
    • Datasets
    • Data Normalization
    • Normalization Methods
      • Ways of Evaluating a Normalization Method
    • Gene Network Inference
      • Using Biological Knowledge as Prior Knowledge
      • Modeling Gene Networks
      • Decomposing the Cell's Gene Network Inference Problem into Several Feature Selection Subproblems
    • Solving the Feature Selection Problem
      • Filter Methods
      • Wrapper Methods
      • Embedded Methods
      • Comparison of Feature Selection Methods
    • Evaluating Inferred Gene Networks
    • Extracting Gene Subnetworks
      • Subnetwork Extraction Methods
    • Data Classification
  • Proposed Method
    • Research Roadmap
    • Examining the Weaknesses of CellNet
      • Analysis of CellNet's Poor Performance
    • Proposed Method for Determining Cancer Cell Identity
      • Phase 1. Data Collection
      • Phase 2. Data Preprocessing
      • Phase 3. Gene Regulatory Network Inference
      • Extracting Specific Gene Subnetworks
  • Evaluation
    • Phase 1. Data Collection
    • Phase 2. Data Normalization
    • Phase 3. Gene Regulatory Network Inference
    • Phase 4. Extracting Specific Subnetworks
    • Phase 5. Classifying Cancer Types
    • Summary
  • Conclusion and Future Work
  • References
  • English-to-Persian Glossary
  • Persian-to-English Glossary