- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 49426 (19)
- University: Sharif University of Technology
- Department: Computer Engineering
- Advisor(s): Motahari, Abolfazl; Beigy, Hamid
- Abstract:
- Many advances in biology and medicine, such as disease diagnosis and drug discovery, depend heavily on the analysis of biological datasets collected from advanced instruments. DNA microarrays are one such instrument; among other applications, they are used to measure the expression levels of thousands of genes and to genotype sets of single nucleotide polymorphism sites. Compared to the more advanced Next Generation Sequencing (NGS) technology, the microarray platform produces lower-quality datasets. However, a great deal of effort has gone into producing, processing, and curating microarray datasets using well-designed protocols for sample preparation, hybridization, image processing, and statistical learning algorithms. Hence, there is good motivation to enhance and normalize microarray datasets using newer and more accurate observations from NGS technologies, making them more reliable in research as well as in practice. In this research, we propose a new approach that enhances microarray data by using the equivalent RNA-Seq gene expression data as a reference, since RNA-Seq is a more accurate method for assessing gene expression. We first assume that genes are independent of one another. We then model the RNA-Seq expression of each gene as a normal distribution whose parameters are functions of that gene's microarray expression. These parameters are learned using linear regression. The learned model is then used to estimate RNA-Seq expression values from microarray measurements. The proposed method is successful in predicting absolute gene expression values and in detecting fold changes. However, it is not successful in detecting differentially expressed genes.
- Keywords:
- Microarray ; Normalization ; RNA Sequencing ; Enhancing
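
The modelling steps described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes paired samples measured on both platforms, log-scale expression values, a mean that is linear in the microarray value, and a constant per-gene standard deviation. The function names `fit_per_gene` and `predict_rnaseq` and the synthetic data are hypothetical and exist only for illustration.

```python
import numpy as np


def fit_per_gene(micro, rnaseq):
    """Fit one simple linear regression per gene (genes treated as independent).

    micro, rnaseq: arrays of shape (n_samples, n_genes) holding paired
    measurements of the same samples (assumption: log-scale values, and each
    gene's microarray value varies across samples).
    Returns per-gene intercept, slope, and residual standard deviation, so that
    rnaseq[:, g] is modelled as Normal(intercept[g] + slope[g] * micro[:, g], sigma[g]).
    """
    n_samples = micro.shape[0]
    x_mean = micro.mean(axis=0)
    y_mean = rnaseq.mean(axis=0)
    xc = micro - x_mean
    yc = rnaseq - y_mean
    # Closed-form ordinary least squares for each column (gene).
    slope = (xc * yc).sum(axis=0) / (xc ** 2).sum(axis=0)
    intercept = y_mean - slope * x_mean
    residuals = rnaseq - (intercept + slope * micro)
    # Unbiased residual standard deviation (two fitted parameters per gene).
    sigma = np.sqrt((residuals ** 2).sum(axis=0) / (n_samples - 2))
    return intercept, slope, sigma


def predict_rnaseq(micro, intercept, slope):
    """Point estimate of RNA-Seq expression: the mean of the fitted normal."""
    return intercept + slope * micro


# Synthetic paired data, purely illustrative (not from the thesis).
rng = np.random.default_rng(0)
true_intercept = np.array([1.0, -0.5, 2.0])
true_slope = np.array([0.8, 1.2, 1.0])
micro_train = rng.normal(8.0, 2.0, size=(200, 3))
rnaseq_train = true_intercept + true_slope * micro_train + rng.normal(0.0, 0.1, size=(200, 3))

intercept, slope, sigma = fit_per_gene(micro_train, rnaseq_train)
enhanced = predict_rnaseq(micro_train, intercept, slope)
```

Because genes are treated as independent, each column is fitted separately, and the enhanced value for a new microarray measurement is simply the mean of the fitted normal for that gene.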