Statistical association mapping of population-structured genetic data

Article IEEE/ACM Transactions on Computational Biology and Bioinformatics ; 2017 ; 15455963 (ISSN) Najafi, A ; Janghorbani, S ; Motahari, S. A ; Fatemizadeh, E ; Sharif University of Technology

Abstract

Association mapping of genetic diseases has attracted extensive research interest in recent years. However, most of the methodologies introduced so far suffer from spurious inference of the associated sites due to population inhomogeneities. In this paper, we introduce a statistical framework to compensate for this shortcoming by equipping the current methodologies with a state-of-the-art clustering algorithm that is widely used in population genetics applications. The proposed framework jointly infers the disease-associated factors and the hidden population structures. In this regard, a Markov chain Monte Carlo (MCMC) procedure has been employed to assess the posterior probability...

Genome-Wide Association Study via Machine Learning Techniques

M.Sc. Thesis Sharif University of Technology Najafi, Amir (Author) ; Fatemizadeh, Emad (Supervisor) ; Motahari, Abolfazl (Co-Advisor)

Abstract

The development of DNA sequencing technologies in recent years has magnified the need for computational tools in genomic data processing, and has thus attracted intensive research interest in this area. Among these tools, Genome-Wide Association Study (GWAS) refers to discovering causal relationships between the genetic sequences of living organisms and the macroscopic phenotypes present in their physiological structure. The phenotypes chosen for genomic association studies are mostly vulnerability or immunity to common genetic diseases. Conventional methods in GWAS consist of statistical hypothesis testing algorithms in case/control approaches, most of which are based upon single-locus analysis and...

Genome-Wide Association Studies: Information Theoretic Limits of Reliable Learning

Article 2018 IEEE International Symposium on Information Theory, ISIT 2018, 17 June 2018 through 22 June 2018 ; Volume 2018-June, 2018, Pages 2231-2235 ; 21578095 (ISSN) ; 9781538647806 (ISBN) Tahmasebi, B ; Maddah Ali, M. A ; Motahari, A. S ; Sharif University of Technology

Institute of Electrical and Electronics Engineers Inc 2018

Abstract

In Genome-Wide Association Study (GWAS) problems, the objective is to associate subsequences of individuals' genomes to the observable characteristics called phenotypes. The genome containing the biological information of an individual can be represented by a sequence of length G. Many observable characteristics of the individuals can be related to a subsequence of a given length L, called the causal subsequence. Environmental effects make the relation between the causal subsequence and the observable characteristics a stochastic function. Our objective in this paper is to detect the causal subsequence of a specific phenotype using a dataset of N individuals and their observed...

Information theory of mixed population genome-wide association studies

Article 2018 IEEE Information Theory Workshop, ITW 2018, 25 November 2018 through 29 November 2018 ; 2019 ; 9781538635995 (ISBN) Tahmasebi, B ; Maddah Ali, M. A ; Motahari, S. A ; Sun Yat-Sen University ; Sharif University of Technology

Institute of Electrical and Electronics Engineers Inc 2019

Abstract

Genome-Wide Association Study (GWAS) addresses the problem of associating subsequences of individuals' genomes to the observable characteristics called phenotypes. In a genome of length G, it is observed that each characteristic is only related to a specific subsequence of it with length L, called the causal subsequence. The objective is to recover the causal subsequence, using a dataset of N individuals' genomes and their observed characteristics. Recently, the problem has been investigated from an information theoretic point of view in [1]. It has been shown that there is a threshold effect for reliable learning of the causal subsequence at Gh ( N L/G ) by characterizing the capacity of...

The Capacity of associated subsequence retrieval

Article IEEE Transactions on Information Theory ; Volume 67, Issue 2, 2021, Pages 790-804 ; 00189448 (ISSN) Tahmasebi, B ; Maddah Ali, M. A ; Motahari, S. A ; Sharif University of Technology

Institute of Electrical and Electronics Engineers Inc 2021

Abstract

The objective of a genome-wide association study (GWAS) is to associate subsequences of individuals' genomes to the observable characteristics called phenotypes (e.g., high blood pressure). Motivated by the GWAS problem, in this paper we introduce the information-theoretic problem of associated subsequence retrieval, where a dataset of N (possibly high-dimensional) sequences of length G, and their corresponding observable (binary) characteristics is given. The sequences are chosen independently and uniformly at random from X^G, where X is a finite alphabet. The observable (binary) characteristic is only related to a specific unknown subsequence of length L of the sequences, called associated...

Statistical association mapping of population-structured genetic data

Article IEEE/ACM Transactions on Computational Biology and Bioinformatics ; Volume 16, Issue 2, 2019, Pages 636-649 ; 15455963 (ISSN) Najafi, A ; Janghorbani, S ; Motahari, A ; Fatemizadeh, E ; Sharif University of Technology

Institute of Electrical and Electronics Engineers Inc 2019

Abstract

Association mapping of genetic diseases has attracted extensive research interest in recent years. However, most of the methodologies introduced so far suffer from spurious inference of the associated sites due to population inhomogeneities. In this paper, we introduce a statistical framework to compensate for this shortcoming by equipping the current methodologies with a state-of-the-art clustering algorithm that is widely used in population genetics applications. The proposed framework jointly infers the disease-associated factors and the hidden population structures. In this regard, a Markov chain Monte Carlo (MCMC) procedure has been employed to assess the posterior probability...

MaxHiC: A robust background correction model to identify biologically relevant chromatin interactions in Hi-C and capture Hi-C experiments

Article PLoS Computational Biology ; Volume 18, Issue 6, 2022 ; 1553734X (ISSN) Alinejad Rokny, H ; Modegh, R. G ; Rabiee, H. R ; Sarbandi, E. R ; Rezaie, N ; Tam, K. T ; Forrest, A. R. R ; Sharif University of Technology

Public Library of Science 2022

Abstract

Hi-C is a genome-wide chromosome conformation capture technology that detects interactions between pairs of genomic regions and exploits higher order chromatin structures. Conceptually Hi-C data counts interaction frequencies between every position in the genome and every other position. Biologically functional interactions are expected to occur more frequently than transient background and artefactual interactions. To identify biologically relevant interactions, several background models that take biases such as distance, GC content and mappability into account have been proposed. Here we introduce MaxHiC, a background correction tool that deals with these complex biases and robustly...

Fundamental Limits of Population Stratification From an Information Theoretic View

M.Sc. Thesis Sharif University of Technology Tahmasebi, Behrooz (Author) ; Maddah-Ali, Mohammad Ali (Supervisor) ; Motahari, Abolfazl (Co-Supervisor)

Abstract

This thesis consists of two parts. In the first part, we study the identifiability of finite mixtures of finite product measures. This class of mixture models has a large number of applications in real-world data modeling. An important example is their application in population genetics to modeling mixed-population datasets. Identifiability means that the mapping between the class parameters and the mixture distributions is one-to-one. In this manuscript, we define some separability metrics inspired by methods used in clustering mixture models and study the fundamental trade-off between identifiability and the number of separable variables of the mixture model. For the second part of...

Genome-wide DNA methylation profiling in ectopic and eutopic of endometrial tissues

Article Journal of Assisted Reproduction and Genetics ; Volume 36, Issue 8, 2019, Pages 1743-1752 ; 10580468 (ISSN) Barjaste, N ; Shahhoseini, M ; Afsharian, P ; Sharifi Zarchi, A ; Masoudi Nejad, A ; Sharif University of Technology

Springer New York LLC 2019

Abstract

Purpose: Endometriosis is a gynecological disease that causes the uterine lining to appear in other organs outside the uterus. As DNA methylation has an important role in this disorder, its profiling can reveal new information to improve the diagnosis and treatment of endometriosis patients. Methods: We conducted a genome-wide methylation profiling of ectopic and eutopic endometrial tissues from women with and without endometriosis using Infinium Human Methylation 450K BeadChip arrays. DNA methylation samples were collected from nine ectopic and nine eutopic endometrial tissues of endometriosis and six endometrial tissues of healthy controls. Results: Correlation heatmaps and the principal...

DNA-RNA hybrid (R-loop): From a unified picture of the mammalian telomere to the genome-wide profile

Article Cells ; Volume 10, Issue 6, 2021 ; 20734409 (ISSN) Rassoulzadegan, M ; Sharifi Zarchi, A ; Kianmehr, L ; Sharif University of Technology

MDPI 2021

Abstract

Local three-stranded DNA/RNA hybrid regions of genomes (R-loops) have been detected either by binding of a monoclonal antibody (DRIP assay) or by enzymatic recognition by RNaseH. Such a structure has been postulated for mouse and human telomeres, clearly suggested by the identification of the complementary RNA Telomeric repeat-containing RNA “TERRA”. However, the tremendous disparity in the information obtained with antibody-based technology drove us to investigate a new strategy. Based on the observation that DNA/RNA hybrids in a triplex complex genome co-purify with the double-stranded chromosomal DNA fraction, we developed a direct preparative approach from total protein-free cellular...