Search for: gene-expression-data
Total 27 records

    Using Transductive Learning Classification in Bioinformatics

    M.Sc. Thesis Sharif University of Technology Tajari, Hossein (Author); Beigy, Hamid (Supervisor)
    Abstract
    Classification is one of the most important problems in machine learning. Reliable classification is essential for diagnosing patients before further treatment. In many applications, such as bioinformatics, unlabeled data is abundant, whereas labeled data is much more difficult and expensive to obtain. This dissertation presents a novel transductive approach to robust microarray data classification. The transduction problem is to estimate the value of the classification function at the given points in the working set. This contrasts with the standard inductive learning problem of estimating the classification function at all possible values and...

    Analysis of Gene Expression Data in Bioinformatics Data Sets Using Machine Learning Approaches

    M.Sc. Thesis Sharif University of Technology Bagherian, Misagh (Author); Beigy, Hamid (Supervisor)
    Abstract
    Because robust and accurate classification of tumors is necessary for successful cancer treatment, classification of DNA microarray data has been widely used in diagnosing cancers and some other biological diseases. The main challenge in classifying microarray data is the extreme asymmetry between the number of features (usually thousands or even tens of thousands of genes) and the number of tissue samples (a few hundred). Because of this curse of dimensionality, a class prediction model may classify one type of dataset very successfully yet perform poorly on others. Overfitting is another problem that prevents conventional learning...

    A Semi-Supervised Algorithms for Clustering Microarray Data

    M.Sc. Thesis Sharif University of Technology Eslamzadeh, Habibollah (Author); Mahdavi Amiri, Nezamoddin (Supervisor); Madadkar Sobhani, Armin (Supervisor)
    Abstract
    A microarray, also known as a biochip, is a flat glass substrate about 1 × 1 cm in size on which a large number of biosensors are arranged in an array. DNA microarrays are used to measure the expression levels of thousands of genes. Repeating these experiments under different conditions yields patterns of expression. After preparation, the fluorescent sample is hybridized with the sensors on the microarray surface, and the fluorescence intensities of the spots are measured by a special camera called a CCD. The resulting images are examined by a computer, and the spot intensities are converted into numerical data by image processing algorithms. Putting these numbers into matrices of size m×n is...

    Bayesian Filtering Approach to Improve Gene Regulatory Networks Inference Using Gene Expression Time Series

    M.Sc. Thesis Sharif University of Technology Fouladi, Ramouna (Author); Fatemizadeh, Emadoddin (Supervisor); Arab, Shahriar (Co-Advisor)
    Abstract
    Modeling gene regulation in different species is one of the main aims of bioinformatics. Given the limitations of the available data and the perspectives that must be taken into account when modeling such networks, the methods proposed so far have not yielded a comprehensive model. In one recent study, the gene regulation process is treated as a nonlinear dynamic stochastic process and described by state-space equations, and Extended Kalman Filtering is then used to estimate the unknown parameters. In this thesis, gene complexes are first considered instead of genes, and afterwards Extended Kalman...

    Distributed Processing of Next Generation Sequencing Data Set

    M.Sc. Thesis Sharif University of Technology Hadadian Nejad Yousefi, Mostafa (Author); Goudarzi, Maziar (Supervisor); Motahari, Abolfazl (Supervisor)
    Abstract
    DNA analysis plays a significant role in fields such as pharmacy, agriculture, genealogy, and forensics. Next-generation sequencing datasets cover a gene several times because of the large number of reads. Therefore, the initial data volume is several times the amount of memory required to store the DNA strand. First, the DNA sequence of a sample must be reconstructed from the primary data; then the differences are found by comparing the sample DNA sequence with the reference DNA sequence. By finding these differences, one can extract the characteristics of the tested species. The extracted properties are valuable to genetics researchers. For example, they can produce drugs that are...

    Identifying Core Genes in Estimation of Missing Gene Expressions

    M.Sc. Thesis Sharif University of Technology Darvish Shafighi, Shadi (Author); Motahari, Abolfazl (Supervisor)
    Abstract
    Characterizing cellular states in response to various disease conditions is an important problem, addressed by methods such as large-scale gene expression profiling. One of the most important challenges facing bioinformaticians is missing data, because expression profiling is still very expensive. It is understood that profiling a selected group of genes can be enough to infer the entire gene expression profile. In this research, we propose a fast method for estimating missing values in low-rank matrices. We treat the highly correlated expression profiles as a low-rank matrix. We then use this new method in a proposed algorithm which will select...

    Analysis of Genes Regulating Beta Cells Cell Cycle

    M.Sc. Thesis Sharif University of Technology Saraei, Tannaz (Author); Motahari, Abolfazl (Supervisor)
    Abstract
    Diabetes mellitus is a group of disorders in which the blood sugar level remains high for a long period of time. This increase may be due to reduced insulin secretion from the pancreas, insulin resistance, or both. Another key cause is the destruction of beta cells due to a functional defect in the body’s immune system. Current treatments include diet control, insulin injection, and pancreatic transplantation, all of which are temporary. For this reason, finding the genetic factors that participate in the progression of the disease and adapting treatments to these factors are under intensive study. In this thesis, available information resources including genomic, biological...

    Detection and Estimation of Key Parameters in Traffic Models Using Data Mining Tools

    M.Sc. Thesis Sharif University of Technology Moadab, Amir Hossein (Author); Khedmati, Majid (Supervisor)
    Abstract
    Investigating the factors that affect traffic models from different perspectives, such as metropolitan planning under present conditions, can help high-level decision-makers and, at the micro level, help travelers make appropriate decisions about scheduling, route selection, and vehicle type. Given the importance of this topic, this study presents a framework that evaluates the impact of identified factors such as travel distance, climate, and urban events, and then expresses these factors in mathematical formulas. Finally, the travel time is predicted based on the model. In this framework, gene expression...

    Identification of Driver Genes in Glioblastoma Based on Single-Cell Gene Expression Data Utilizing the Concept of Pseudotime and Phylogenetic Analysis

    M.Sc. Thesis Sharif University of Technology Mirza Abolhassani, Fatemeh (Author); Foroughmand Aarabi, Mohammad Hadi (Supervisor); Kavousi, Kaveh (Co-Supervisor); Zare Mirakabad, Fatemeh (Co-Supervisor)
    Abstract
    Genetic heterogeneity within a tumor, which arises during cancer evolution, is one of the reasons for treatment failure and increased chances of drug resistance. Cancer cells initially derive from a mutated progenitor cell, resulting in shared mutated genes. Throughout tumor formation and progression, new mutations can occur, leading to cancer cells with various mutated genes. An appropriate approach is to identify the sequence of mutations that have occurred in the tumor, which can be inferred from single-cell sequencing data. Single-cell data provides valuable information about branching events in the evolution of a cancerous tumor. In...

    Using Statistical Pattern Recognition on Gene Expression Data for Prediction of Cancer

    M.Sc. Thesis Sharif University of Technology Hajiloo, Mohsen (Author); Rabiee, Hamid Reza (Supervisor)
    Abstract
    The classification of different tumor types is of great importance in cancer diagnosis and drug discovery. However, most previous cancer classification studies are clinically based and have limited diagnostic ability. Cancer classification using gene expression data is known to hold the keys to addressing the fundamental problems of cancer diagnosis. The recent advent of DNA microarray techniques has made it possible to monitor thousands of gene expressions simultaneously. With this abundance of gene expression data, researchers have started to explore the possibilities of cancer classification using gene expression data, and quite a number of pattern recognition approaches have been...

    Gene Selection and Reduction in DNA Microarrays to Improve Classification Accuracy of Cancerous Samples

    M.Sc. Thesis Sharif University of Technology Baharvand Irannia, Zohreh (Author); Rabiee, Hamid Reza (Supervisor)
    Abstract
    The DNA microarray is the state-of-the-art technology for analyzing gene expression data. It has made it possible to measure the expression levels of thousands of genes simultaneously. Microarray classification has been widely used in the effective diagnosis of cancers and some other biological diseases. The most challenging issue, however, is the intense asymmetry between the dimensionality of genes and the number of tissue samples, which can severely degrade classification performance. This dissertation focuses on gene selection and reduction solutions and presents a novel classification scheme which uses both gene selection and dimension reduction in its different stages. We have improved one of the recently proposed topology...

    Human Genome Sequence Analysis Using Statistical and Machine Learning Methods

    M.Sc. Thesis Sharif University of Technology Alaei, Shervin (Author); Manzuri Shalmani, Mohammad Taghi (Supervisor)
    Abstract
    In recent decades, dramatic advances in genetics and molecular biology have provided scientists with enormous amounts of molecular genomic information about different living organisms, from DNA sequences to complex 3D structures of proteins. This information is raw data whose analysis can provide a better understanding of genome mechanisms, help discriminate healthy from tumor cells, predict disease type, support drug design based on genome information, and enable many more applications. One important issue here is the unavoidable need for computer science and statistics to analyze these data; given the vast amount of data, these fields would provide intelligent methods, which yield most accurate...

    Semi-supervised Breast Cancer Subtype Clustering Using Microarray Datasets

    M.Sc. Thesis Sharif University of Technology Vasei, Hamed (Author); Motahhari, Abolfazl (Supervisor)
    Abstract
    Gene expression microarrays can be used for precision medicine and targeted therapies. The data generated by microarrays are high-dimensional, which makes statistical inference of any parameter a daunting task. This thesis shows that, despite the high dimensionality of microarray datasets, inference can be robust in the sense that random selection of features leads to the same conclusion as long as the number of selected features is chosen appropriately. Stratifying patients with breast cancer based on their gene expression levels shows that patient subtypes are almost independent of the feature selection strategy. Moreover, using less noisy datasets coming from RNAseq...

    Analyzing Cancer Cell Identity and Appropriative Subnetworks using Machine Learning

    M.Sc. Thesis Sharif University of Technology Saberi, Ali (Author); Rabiee, Hamid Reza (Supervisor); Sharifi Zarchi, Ali (Supervisor)
    Abstract
    Cancer has long threatened human health, and researchers have been grappling with it for many years. Over the course of this struggle, cancer victims have outnumbered survivors to such an extent that, until recently, suffering from cancer was perceived as equivalent to death. The persistent failure to defeat cancer stems from an incomplete understanding of the phenomenon. In recent years, with the advent of technologies that extract information from within cells at the genome and transcriptome levels, researchers have acquired a deeper understanding of cancer, its behavior, and its operation. Now that cancer is regarded as a genetic disease,...

    Modelling Cell's State in Different Cell Types

    M.Sc. Thesis Sharif University of Technology Saberi, Amir Hossein (Author); Hossein Khalaj, Babak (Supervisor); Motahari, Abolfazl (Co-Supervisor)
    Abstract
    The heterogeneity of vital tissues in complex multicellular organisms such as mammals, and of fatal tissues such as cancer, on the one hand, and limited access to the biological properties of their components on the other, make the study of these tissue traits one of the most interesting fields in bioinformatics. One of the most active topics in this field is identifying the functional components of these tissues using bulk data extracted from the whole tissue. Almost every method with this aim, particularly those using gene expression data, assumes that every cell type constituting the studied tissue has a deterministic expression profile. In this thesis we...

    Isoform Function Prediction Using Deep Neural Network

    M.Sc. Thesis Sharif University of Technology Ghazanfari, Sara (Author); Motahari, Abolfazl (Supervisor); Soleymani, Mahdieh (Supervisor)
    Abstract
    Isoforms are mRNAs produced from the same gene locus through a phenomenon called alternative splicing. Studies have shown that more than 95% of multiexon genes in humans undergo alternative splicing. Although the changes in mRNA sequence are small, they may have a systematic effect on cell function and regulation. It is widely reported that isoforms of a gene have distinct or even contrasting functions. Most studies have shown that alternative splicing plays a significant role in human health and disease. Despite the wide range of gene function studies, little is known about the functions of isoforms. Recently, some computational methods based on Multiple Instance...

    Identifying Cancer-related Genes Via Network Feature Learning and Multi-Omics Data Integration

    M.Sc. Thesis Sharif University of Technology Safari, Monireh (Author); Rabiee, Hamid Reza (Supervisor)
    Abstract
    Highly developed biological data collection methods enable scientists to capture protein-protein interactions (PPIs) in the human body, which can be analyzed as biological networks such as PPI networks. These networks reveal essential information about biological processes in human cells and can be used to identify genes associated with cancers. Effectively identifying disease-related genes would contribute to improving the treatment and diagnosis of various diseases. Current methods for identifying disease-related genes mainly rely on the guilt-by-association hypothesis and do not consider the global information in the PPI network. Besides, most methods pay...

    Motif Finding Application Using Edit Distance Approach

    M.Sc. Thesis Sharif University of Technology Mohammadi, Farzin (Author); Koohi, Somayyeh (Supervisor)
    Abstract
    The motif finding problem in biology is the search for repeated patterns that reveal information about gene expression, one of the most complex subsystems in genomics. ChIP-seq technology has enabled researchers to investigate the locations of protein-DNA interactions, but analyzing the downstream results of such experiments to find actual regulatory signals in the genome is challenging. For many years, motif finding applications have relied on models with limiting assumptions in exchange for lower computational complexity. Results: The AKAGI program is built upon upgraded methods and new general models to investigate statistical and experimental evidence for accurately finding significant patterns among biological...

    Drug Synergy Prediction on Diverse Cancer Cell-Lines Using Deep Learning

    M.Sc. Thesis Sharif University of Technology Labbaf, Farzaneh (Author); Hossein Khalaj, Babak (Supervisor)
    Abstract
    Despite significant progress in cancer treatment, drug resistance remains a major challenge. Synergistic drug combinations offer a promising approach to overcoming drug resistance and reducing side effects. Still, despite high-throughput testing technologies, existing drug combination databases suffer from biases and a lack of diversity in tested cancer cell lines, which makes it difficult to predict drug response on novel cell targets. To address this critical need, we designed a two-level deep learning method that uses large-scale gene expression datasets to estimate the score and synergy of drug compounds on a wide variety of cancer cell lines. Our model includes an auto-encoder that trains on...

    Exploration of Existing Patterns in Copy Number Variations of Genetic Diseases and Disorders

    Ph.D. Dissertation Sharif University of Technology Rahaie, Zahra (Author); Rabiee, Hamid Reza (Supervisor)
    Abstract
    One of the main sources of genetic variation is structural variation, including the widespread Copy Number Variations (CNVs). CNVs are of two types, duplication (copying of genetic material) and deletion (loss of part of a genetic sequence), and typically range from one kilobase pair (Kbp) to several megabase pairs (Mbp) in size. Most copy number variations occur in healthy people; however, these variants can also contribute to numerous diseases through several genetic mechanisms (e.g., changing gene dosage through insertions, duplications, or deletions). Studying CNVs can provide greater insight into the etiology of disease phenotypes. Nowadays, with the huge amount of investment...