Search for: genomic-sequence

    Haploblock Detection Based on Reads and Population Structure

    , M.Sc. Thesis Sharif University of Technology Akbari, Elahe (Author) ; Motahari, Abolfazl (Supervisor)
    Abstract
    Humans are a diploid species: each individual inherits one set of chromosomes from the mother and one from the father. Separating the nucleotide content of the maternal and paternal chromosomes of an individual or a population is called phasing the genome of that individual or population. Any two variants in a diploid species can be placed relative to each other in two ways: cis (both variants on the same chromosome) and trans (the variants on different chromosomes). Each of these arrangements leads to a different phenotype. Thus, understanding how variants are placed relative to each other is a crucial problem in human biology which is... 

    Inferring Relation between World and Iranian Populations from Microarray Data

    , M.Sc. Thesis Sharif University of Technology Saberi, Sasan (Author) ; Hossein Khalaj, Babak (Supervisor) ; Motahhari, Abolfazl (Supervisor)
    Abstract
    Population genetics is one of the branches of genetic studies. Each population has its own characteristics, shaped by its evolutionary history, culture and geography, which distinguish it from other populations. Scientific and technological advances in recent decades have led to next-generation sequencing machines and to the creation of large genetic data sets. These data contain important genetic information and answers to many questions about the origin of humans, the history of populations and their evolutionary process. A deeper understanding of the human genome and of the distance between populations can help to better understand biological mechanisms and deal... 

    Study of Energy and Compression-Ratio Tradeoff in Portable Sequencers

    , M.Sc. Thesis Sharif University of Technology Sojoodi, Hossein (Author) ; Goudarzi, Maziar (Supervisor)
    Abstract
    Recently, portable genome sequencing devices have been introduced to the market, making it possible to provide sequencing services in remote locations or outside the laboratory. The raw read data that a sequencer produces for the entire genome of a human or a plant can amount to hundreds of gigabytes, which makes it difficult and expensive to store and to transfer to a sequencing center. Fortunately, these reads are highly redundant, and many new algorithms have been proposed to compress them based on the intrinsic properties of the data. Sequencing devices were mainly used in the laboratory environment, which naturally had virtually unlimited access to urban... 

    Performance Improvement of Compression Algorithms for Gene Sequencing Reads by Cache Miss Improvement

    , M.Sc. Thesis Sharif University of Technology Shadab, Mohammad (Author) ; Goudarzi, Maziar (Supervisor)
    Abstract
    Nowadays, one of the challenges in bioinformatics is the sheer volume of data to be processed: sequencing the complete genome of a species can produce up to hundreds of gigabytes. Whenever data volume grows, the storage, transfer, and processing of that data become concerns. Moreover, given the presence of portable sequencing devices on the market and the limits on processing outside the lab environment, the problem becomes even more critical. Fortunately, owing to the nature of genome data and its redundancy, specific algorithms for compressing it have been introduced. In this thesis, we chose... 

    Comparative Analysis of Haplotype Assembly Algorithms to Identify and Propose Optimal Methods

    , M.Sc. Thesis Sharif University of Technology Bagher, Melina (Author) ; Jahed, Mehran (Supervisor) ; Hossein Khalaj, Babak (Supervisor)
    Abstract
    Humans, as a diploid species, have two nucleotide sequences for each pair of homologous chromosomes in their genomes: one set is inherited from the mother, and the other comes from the father. The Single Individual Haplotype assembly problem (SIH) refers to the reconstruction of these two distinct nucleotide sequences of a chromosome from the sequencing reads. It is currently considered one of the most important issues in computational genomics and plays an essential role in solving various genetic and medical problems. Nowadays, direct experimental methods are not favored because of their high cost and labor intensity, and they are limited to certain regions of the genome; therefore,... 

    High-speed all-optical DNA local sequence alignment based on a three-dimensional artificial neural network

    , Article Journal of the Optical Society of America A: Optics and Image Science, and Vision ; Volume 34, Issue 7 , 2017 , Pages 1173-1186 ; 10847529 (ISSN) Maleki, E ; Babashah, H ; Koohi, S ; Kavehvash, Z ; Sharif University of Technology
    OSA - The Optical Society  2017
    Abstract
    This paper presents an optical processing approach for exploring a large number of genome sequences. Specifically, we propose an optical correlator for global alignment and an extended moiré matching technique for local analysis of spatially coded DNA, whose output is fed to a novel three-dimensional artificial neural network for local DNA alignment. All-optical implementation of the proposed 3D artificial neural network is developed and its accuracy is verified in Zemax. Thanks to its parallel processing capability, the proposed structure performs local alignment of 4 million sequences of 150 base pairs in a few seconds, which is much faster than its electrical counterparts, such as the... 

    Optical pattern generator for efficient bio-data encoding in a photonic sequence comparison architecture

    , Article PLoS ONE ; Volume 16, Issue 1 January 2021 , 2021 ; 19326203 (ISSN) Akbari Rokn Abadi, S ; Dijujin, N. H ; Koohi, S ; Sharif University of Technology
    Public Library of Science  2021
    Abstract
    In this study, optical technology is considered as a solution to SA issues, with the potential to increase speed, overcome memory limitations, reduce power consumption, and increase output accuracy. We therefore examine the effect of bio-data encoding and the creation of input images on the pattern-recognition error rate at the output of an optical Vander Lugt correlator. Moreover, we present a genetic-algorithm-based coding approach, named GAC, to minimize the output noise of cross-correlating data. As a case study, we adopt the proposed coding approach within a correlation-based optical architecture for counting k-mers in a DNA string. As verified by the simulations on Salmonella whole-genome,... 

    A Novel Method to Improve Binning of Metagenomic Sequence Fragments by Using Gene Ontology Graphs

    , M.Sc. Thesis Sharif University of Technology Abolhassani, Ilia (Author) ; Beigy, Hamid (Supervisor) ; Marashi, Amir (Supervisor)
    Abstract
    Bacteria were the first organisms to appear on Earth and are, at the same time, its most diverse forms of life; without their functions, life on Earth would not be possible. Therefore, recognition of bacteria is greatly important and can lead to the recognition of new genes, production of new antibiotics and anti-cancer compounds, etc. The problem of identifying the bacterial genome from DNA sequences obtained directly from environmental samples is known as Metagenomics. One of the main stages of Metagenomics is the binning of genetic components, and because bio-marker based binning methods are only able to bin 1% of bacterial species, bio-marker free methods have been considered... 

    All-optical DNA variant discovery utilizing extended DV-curve-based wavelength modulation

    , Article Journal of the Optical Society of America A: Optics and Image Science, and Vision ; Volume 35, Issue 11 , 2018 , Pages 1929-1940 ; 10847529 (ISSN) Maleki, E ; Babashah, H ; Koohi, S ; Kavehvash, Z ; Sharif University of Technology
    OSA - The Optical Society  2018
    Abstract
    This paper presents a novel optical processing approach for exploring genome sequences built upon an optical correlator for global alignment and the extended dual-vector curve (DV-curve) method for local alignment. To overcome the problem of the traditional DV-curve method for presenting an accurate and simplified output, we propose the hybrid amplitude wavelength polarization optical DV-curve (HAWPOD) method, built upon the DV-curve method, to analyze genome sequences in three steps: DNA coding, alignment, and post-analysis. For this purpose, a tunable graphene-based color filter is designed for wavelength modulation of optical signals. Moreover, all-optical implementation of the... 

    A novel pattern matching algorithm for genomic patterns related to protein motifs

    , Article Journal of Bioinformatics and Computational Biology ; Volume 18, Issue 1 , 2020 Foroughmand Araabi, M. H ; Goliaei, S ; Goliaei, B ; Sharif University of Technology
    World Scientific Publishing Co. Pte Ltd  2020
    Abstract
    Patterns in protein and genomic sequences are widely analyzed, extracted and collected in databases. Although protein patterns originate from genomic coding regions, very few works have directly or indirectly dealt with coding region patterns induced from protein patterns. Results: In this paper, we define a new genomic pattern structure suitable for representing patterns induced from proteins. This pattern structure, called "Consecutive Positions Scoring Matrix (CPSSM)", is a replacement for protein patterns and profiles in the genomic context. CPSSMs can be identified, discovered, and searched in genomes. Then, we have presented a novel pattern matching algorithm... 

    Whole-genome analysis of de novo somatic point mutations reveals novel mutational biomarkers in pancreatic cancer

    , Article Cancers ; Volume 13, Issue 17 , 2021 ; 20726694 (ISSN) Ghareyazi, A ; Mohseni, A ; Dashti, H ; Beheshti, A ; Dehzangi, A ; Rabiee, H. R ; Alinejad Rokny, H ; Sharif University of Technology
    MDPI  2021
    Abstract
    It is now known that at least 10% of samples with pancreatic cancers (PC) contain a causative mutation in the known susceptibility genes, suggesting the importance of identifying cancer-associated genes that carry the causative mutations in high-risk individuals for early detection of PC. In this study, we develop a statistical pipeline using a new concept, called gene-motif, that utilizes both mutated genes and mutational processes to identify 4211 3-nucleotide PC-associated gene-motifs within 203 significantly mutated genes in PC. Using these gene-motifs as distinguishable features for pancreatic cancer subtyping results in identifying five PC subtypes with distinguishable phenotypes and... 

    Developing a Deep Neural Network for Bio-sequence Classification Capable of Optical Computing

    , M.Sc. Thesis Sharif University of Technology Mohammadi, Amir Hossein (Author) ; Koohi, Somayyeh (Supervisor)
    Abstract
    The classification of biological sequences is an open issue for a variety of data sets, such as viral and metagenomic sequences. Many studies therefore use neural networks, the best-known methods in this field, and focus on designing customized network structures. However, few works focus on more effective factors, such as the input encoding method or the implementation technology, to address accuracy and efficiency issues in this area. Therefore, in this work, we propose an image-based encoding method, called WalkIm, whose adoption, even in a simple neural network, provides competitive accuracy and superior efficiency, compared to the existing classification methods (e.g. VGDC,... 

    Building an Iranian Reference Panel by Imputing Low-coverage Genomic Data

    , M.Sc. Thesis Sharif University of Technology Poursoleymani, Rooholla (Author) ; Foroughmand Araabi, Mohammad Hadi (Supervisor)
    Abstract
    One of the most widely available kinds of genomic data in Iran is non-invasive prenatal testing (NIPT) data, obtained from the blood of pregnant mothers after the tenth week of pregnancy using next-generation sequencing technology. Sequencer output is a combination of maternal and fetal read data, most of which (about 90%) is from maternal DNA. These data have very low coverage of the genome, but their advantage is that they span the entire human genome. Low coverage means that large parts of the genome are lost in any one sample, but having a large number of samples helps to compensate for this problem. The purpose of this project is to use this data with the help of imputation methods to build a reference for... 

    Genome annotation and comparative genomic analysis of Bacillus subtilis MJ01, a new bio-degradation strain isolated from oil-contaminated soil

    , Article Functional and Integrative Genomics ; Volume 18, Issue 5 , 2018 , Pages 533-543 ; 1438793X (ISSN) Rahimi, T ; Niazi, A ; Deihimi, T ; Taghavi, S. M ; Ayatollahi, S ; Ebrahimie, E ; Sharif University of Technology
    Springer Verlag  2018
    Abstract
    One of the main challenges in eliminating oil contamination from polluted environments is improving biodegradation by highly efficient microorganisms. Bacillus subtilis MJ01 has been evaluated as a new resource for producing biosurfactant compounds. This bacterium, which produces surfactin, is able to enhance bio-accessibility to oil hydrocarbons in contaminated soils. The genome of B. subtilis MJ01 was sequenced and assembled using PacBio RS sequencing technology. One large contig with a length of 4,108,293 bp and no gaps was assembled. Genome annotation and gene prediction showed that the MJ01 genome is very similar to B. subtilis spizizenii TU-B-10 (95% similarity). The comparison... 

    Point-of-use rapid detection of sars-cov-2: Nanotechnology-enabled solutions for the covid-19 pandemic

    , Article International Journal of Molecular Sciences ; Volume 21, Issue 14 , 2020 , Pages 1-23 Rabiee, N ; Bagherzadeh, M ; Ghasemi, A ; Zare, H ; Ahmadi, S ; Fatahi, Y ; Dinarvand, R ; Rabiee, M ; Ramakrishna, S ; Shokouhimehr, M ; Varma, R. S ; Sharif University of Technology
    MDPI AG  2020
    Abstract
    Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) caused the COVID-19 pandemic that has been spreading around the world since December 2019. More than 10 million affected cases and more than half a million deaths have been reported so far, while no vaccine is yet available as a treatment. Considering the global healthcare urgency, several techniques, including whole genome sequencing and computed tomography imaging have been employed for diagnosing infected people. Considerable efforts are also directed at detecting and preventing different modes of community transmission. Among them is the rapid detection of virus presence on different surfaces with which people may come in... 

    Whole genome sequencing of SARS-CoV2 strains circulating in Iran during five waves of pandemic

    , Article PLoS ONE ; Volume 17, Issue 5 May , 2022 ; 19326203 (ISSN) Yavarian, J ; Nejati, A ; Salimi, V ; Jandaghi, N.Z.S ; Sadeghi, K ; Abedi, A ; Zarchi, A. S ; Gouya, M. M ; Mokhtari Azad, T ; Sharif University of Technology
    Public Library of Science  2022
    Abstract
    Purpose: Whole genome sequencing of SARS-CoV2 is important to find useful information about the viral lineages, variants of interest and variants of concern. As there are not enough data about the circulating SARS-CoV2 variants in Iran, we sequenced 54 SARS-CoV2 genomes during the 5 waves of pandemic in Iran. Methods: After viral RNA extraction from clinical samples collected during the COVID-19 pandemic, next generation sequencing was performed using the Nextseq platform. The sequencing data were analyzed and compared with reference sequences. Results: During the 1st wave, V and L clades were detected. The second wave was recognized by G, GH and GR clades. Circulating clades during the 3rd... 

    Homozygous mutations in C14orf39/SIX6OS1 cause non-obstructive azoospermia and premature ovarian insufficiency in humans

    , Article American Journal of Human Genetics ; Volume 108, Issue 2 , 2021 , Pages 324-336 ; 00029297 (ISSN) Fan, S ; Jiao, Y ; Khan, R ; Jiang, X ; Javed, A. R ; Ali, A ; Zhang, H ; Zhou, J ; Naeem, M ; Murtaza, G ; Li, Y ; Yang, G ; Zaman, Q ; Zubair, M ; Guan, H ; Zhang, X ; Ma, H ; Jiang, H ; Ali, H ; Dil, S ; Shah, W ; Ahmad, N ; Zhang, Y ; Shi, Q ; Sharif University of Technology
    Cell Press  2021
    Abstract
    Human infertility is a multifactorial disease that affects 8%–12% of reproductive-aged couples worldwide. However, the genetic causes of human infertility are still poorly understood. The synaptonemal complex (SC) is a conserved tripartite structure that holds homologous chromosomes together and plays an indispensable role in meiotic progression. Here, we identified three homozygous mutations in the SC coding gene C14orf39/SIX6OS1 in infertile individuals from different ethnic populations by whole-exome sequencing (WES). These mutations include a frameshift mutation (c.204_205del [p.His68Glnfs∗2]) from a consanguineous Pakistani family with two males suffering from non-obstructive...