Search for: computational-biology

    afpCOOL: a tool for antifreeze protein prediction

    , Article Heliyon ; Volume 4, Issue 7 , 2018 ; 24058440 (ISSN) Eslami, M ; Shirali Hossein Zade, R ; Takalloo, Z ; Mahdevar, G ; Emamjomeh, A ; Sajedi, R. H ; Zahiri, J ; Sharif University of Technology
    Elsevier Ltd  2018
    Abstract
    Various cold-adapted organisms produce antifreeze proteins (AFPs), which prevent the freezing of cell fluids by inhibiting the growth of ice crystals. AFPs are currently being identified in various organisms that live at extremely low temperatures. AFPs have several important applications in increasing the freeze tolerance of plants, maintaining tissue in frozen conditions, and producing cold-hardy plants by applying transgenic technology. Substantial differences in the sequence and structure of AFPs pose a challenge for researchers trying to identify these proteins. In this paper, we propose a novel method to identify AFPs using a support vector machine (SVM) by incorporating 4 types of... 

    FAME: fast and memory efficient multiple sequences alignment tool through compatible chain of roots

    , Article Bioinformatics ; Volume 36, Issue 12 , 15 June , 2020 , Pages 3662-3668 Etminan, N ; Parvinnia, E ; Sharifi Zarchi, A ; Sharif University of Technology
    Oxford University Press  2020
    Abstract
    Motivation: Multiple sequence alignment (MSA) is an important and challenging problem in computational biology. Most existing methods can only produce short multiple alignments in an acceptable time. When researchers face genome-sized sequences in multiple alignment, however, the process requires huge amounts of processing space and time. Accordingly, a method that can align genome-sized sequences rapidly and precisely would have a great effect, especially on the analysis of very long alignments. Herein, we propose an efficient method, called FAME, which vertically divides sequences at the places where they have common areas; these are then arranged in consecutive order. Then... 

    A Simulation for Better Understanding of Integrin Clustering and Activation

    , M.Sc. Thesis Sharif University of Technology Shams, Shahab (Author) ; Motahhari, Abolfazl (Supervisor)
    Abstract
    Today, computer simulations are widely employed by researchers to understand biological processes. The tendency to use computational methods has increased recently due to the high costs and errors of experimental methods. Integrins are membrane proteins that mechanically attach cells to the extracellular matrix (ECM) and drive some behaviors such as cell migration. Moreover, integrins have biochemical functions. They transduce environmental signals and trigger chemical pathways. As a result, integrins regulate cell shape, motility, etc. The investigation of integrin behavior in a real environment is very difficult due to the presence of many other proteins that interfere with the... 

    Erratum to A linear genetic programming approach for the prediction of solar global radiation

    , Article Neural Computing and Applications ; Volume 23, Issue 3-4 , 2013 , Pages 1205- ; 09410643 (ISSN) Shavandi, H ; Saeidi Ramiyani, S ; Sharif University of Technology
    2013

    PFP-WGAN: Protein function prediction by discovering gene ontology term correlations with generative adversarial networks

    , Article PLoS ONE ; Volume 16, Issue 2 , 2021 ; 19326203 (ISSN) Seyyedsalehi, S. F ; Soleymani, M ; Rabiee, H. R ; Kaazempur Mofrad, M. R ; Sharif University of Technology
    Public Library of Science  2021
    Abstract
    Understanding the functionality of proteins has emerged as a critical problem in recent years due to the significant roles of these macromolecules in biological mechanisms. However, in-laboratory techniques for protein function prediction are not as efficient as the methods developed and processed for protein sequencing. While more than 70 million protein sequences are available today, the functionality of only around one percent of them is known. These facts have encouraged researchers to develop computational methods to infer protein functionalities from their sequences. Gene Ontology is the most well-known database for protein functions; it has a hierarchical structure, where deeper terms are... 

    Game Theory in Computational Biology

    , M.Sc. Thesis Sharif University of Technology Safarnejad Boroujeni, Mahdi (Author) ; Ghodsi, Mohammad (Supervisor)
    Abstract
    In recent years, game theory has spread from economics and the social sciences to other fields such as computational biology and bioinformatics. Several evolutionary games and cooperative games have been defined to predict the behavior of nonrational agents in interaction situations arising from computational biology. One of the main applications is the use of the Shapley value. The Shapley value allocates a fair value to each player in games in which each coalition has a profit. But in many cases the computation of the Shapley value is #P-complete. Thus, the goal is to find the Shapley value optimally or to approximate it in each game. Another option is the Banzhaf index. One of the essential games on genes is the Pathway game.... 
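As a rough illustration of the Shapley value mentioned in this abstract, the following minimal Python sketch computes it exactly by averaging each player's marginal contribution over all orderings of the players. It is not taken from the thesis: the function names `shapley_values` and `toy_value` and the three-player toy game are assumptions made for illustration only. The enumeration is factorial in the number of players, which is consistent with the abstract's remark that exact computation is hard in general.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


def toy_value(coalition):
    """Hypothetical game: a coalition earns 1 only if it contains both 'a' and 'b'."""
    return 1.0 if {"a", "b"} <= coalition else 0.0


phi = shapley_values(["a", "b", "c"], toy_value)
print(phi)
```

In this toy game the players "a" and "b" share the profit equally and "c", who never changes a coalition's value, receives nothing.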

    1 + ϵ approximation of tree edit distance in quadratic time

    , Article 51st Annual ACM SIGACT Symposium on Theory of Computing, STOC 2019, 23 June 2019 through 26 June 2019 ; 2019 , Pages 709-720 ; 07378017 (ISSN); 9781450367059 (ISBN) Boroujeni, M ; Ghodsi, M ; Hajiaghayi, M ; Seddighin, S ; Sharif University of Technology
    Association for Computing Machinery  2019
    Abstract
    Edit distance is one of the most fundamental problems in computer science. Tree edit distance is a natural generalization of edit distance to ordered rooted trees. Such a generalization extends the applications of edit distance to areas such as computational biology, structured data analysis (e.g., XML), image analysis, and compiler optimization. Perhaps the most notable application of tree edit distance is in the analysis of RNA molecules in computational biology where the secondary structure of RNA is typically represented as a rooted tree. The best-known solution for tree edit distance runs in cubic time. Recently, Bringmann et al. show that an O(n^2.99) algorithm for weighted tree edit... 
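For readers unfamiliar with the base problem, the sketch below shows the classic quadratic-time dynamic program for edit distance between two strings, which is the problem that tree edit distance generalizes. It is not the approximation algorithm of the paper; the function name `edit_distance` and the unit costs for insertion, deletion, and substitution are assumptions chosen for illustration.

```python
def edit_distance(s, t):
    """Edit distance between strings s and t with unit-cost edits.

    Uses the standard row-by-row dynamic program, O(len(s) * len(t)) time.
    """
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]  # distance from s[:i] to the empty prefix of t
        for j, ct in enumerate(t, start=1):
            substitution = prev[j - 1] + (0 if cs == ct else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))
print(edit_distance("GAUC", "GUC"))
```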

    Learning of Statistical Mixture Models in High Dimensions

    , Ph.D. Dissertation Sharif University of Technology Najafi, Amir (Author) ; Motahari, Abolfazl (Supervisor) ; Rabiee, Hamid Reza (Supervisor)
    Abstract

    Using statistical tools in machine learning and artificial intelligence to infer knowledge from high-dimensional data, namely data science, has attracted significant research interest over the past two decades. The number of real databases around the world, which are used to store huge amounts of high-dimensional data points of various types, continues to grow at an increasing pace. However, applying machine learning tools to high-dimensional data has also raised potential concerns, especially with respect to the fundamental capability of such tools to be useful in practical situations. In fact, the large dimension of the data could eventually damage the outcome of any statistical... 

    Capturing single-cell heterogeneity via data fusion improves image-based profiling

    , Article Nature Communications ; Volume 10, Issue 1 , 2019 ; 20411723 (ISSN) Rohban, M. H ; Abbasi, H. S ; Singh, S ; Carpenter, A. E ; Sharif University of Technology
    Nature Publishing Group  2019
    Abstract
    Single-cell resolution technologies warrant computational methods that capture cell heterogeneity while allowing efficient comparisons of populations. Here, we summarize cell populations by adding features’ dispersion and covariances to population averages, in the context of image-based profiling. We find that data fusion is critical for these metrics to improve results over the prior alternatives, providing at least ~20% better performance in predicting a compound’s mechanism of action (MoA) and a gene’s pathway. © 2019, The Author(s)  

    AntAngioCOOL: computational detection of anti-angiogenic peptides

    , Article Journal of Translational Medicine ; Volume 17, Issue 1 , 2019 ; 14795876 (ISSN) Zahiri, J ; Khorsand, B ; Yousefi, A. A ; Kargar, M. J ; Shirali Hossein Zade, R ; Mahdevar, G ; Sharif University of Technology
    BioMed Central Ltd  2019
    Abstract
    Background: Angiogenesis inhibition research is a cutting-edge area in angiogenesis-dependent disease therapy, especially in cancer therapy. Recently, studies on anti-angiogenic peptides have provided promising results in the field of cancer treatment. Methods: A non-redundant dataset of 135 anti-angiogenic peptides (positive instances) and 135 non-anti-angiogenic peptides (negative instances) was used in this study. Also, 20% of each class was selected to construct an independent test dataset (see Additional files 1, 2). We proposed an effective machine-learning-based R package (AntAngioCOOL) to predict anti-angiogenic peptides. We have examined more than 200 different classifiers to build... 

    Statistical association mapping of population-structured genetic data

    , Article IEEE/ACM Transactions on Computational Biology and Bioinformatics ; Volume 16, Issue 2 , 2019 , Pages 636-649 ; 15455963 (ISSN) Najafi, A ; Janghorbani, S ; Motahari, A ; Fatemizadeh, E ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2019
    Abstract
    Association mapping of genetic diseases has attracted extensive research interest in recent years. However, most of the methodologies introduced so far suffer from spurious inference of the associated sites due to population inhomogeneities. In this paper, we introduce a statistical framework to compensate for this shortcoming by equipping the current methodologies with a state-of-the-art clustering algorithm that is widely used in population genetics applications. The proposed framework jointly infers the disease-associated factors and the hidden population structures. In this regard, a Markov chain Monte Carlo (MCMC) procedure has been employed to assess the posterior probability... 

    IMOS: improved meta-aligner and minimap2 on spark

    , Article BMC Bioinformatics ; Volume 20, Issue 1 , 2019 ; 14712105 (ISSN) Hadadian Nejad Yousefi, M ; Goudarzi, M ; Motahari, A ; Sharif University of Technology
    BioMed Central Ltd  2019
    Abstract
    Background: Long reads provide valuable information regarding the sequence composition of genomes. Long reads are usually very noisy, which renders their alignment to the reference genome a daunting task. It may take days on a single node to process datasets large enough to sequence a human genome. Hence, it is of primary importance to have an aligner which can operate on distributed clusters of computers with high performance in accuracy and speed. Results: In this paper, we present IMOS, an aligner for mapping noisy long reads to the reference genome. It can be used on a single node as well as on distributed nodes. In its single-node mode, IMOS is an Improved version of Meta-aligner (IM)... 

    OptCAM: An ultra-fast all-optical architecture for DNA variant discovery

    , Article Journal of Biophotonics ; Volume 13, Issue 1 , August , 2020 Maleki, E ; Koohi, S ; Kavehvash, Z ; Mashaghi, A ; Sharif University of Technology
    Wiley-VCH Verlag  2020
    Abstract
    Nowadays, the accelerated expansion of genetic data challenges the speed of current DNA sequence alignment algorithms due to their electrical implementations. The essential need for an efficient and accurate method for DNA variant discovery demands new approaches for parallel processing in real time. Fortunately, photonics, as an emerging technology in data computing, proposes optical correlation as a fast similarity measurement algorithm, while the complexity of existing local alignment algorithms severely limits their applicability. Hence, in this paper, employing optical correlation for global alignment, we present an optical processing approach for local DNA sequence alignment to benefit both... 

    The performances of the chi-square test and complexity measures for signal recognition in biological sequences

    , Article Journal of Theoretical Biology ; Volume 251, Issue 2 , 2008 , Pages 380-387 ; 00225193 (ISSN) Pirhaji, L ; Kargar, M ; Sheari, A ; Poormohammadi, H ; Sadeghi, M ; Pezeshk, H ; Eslahchi, C ; Sharif University of Technology
    2008
    Abstract
    With large amounts of experimental data, modern molecular biology needs appropriate methods to deal with biological sequences. In this work, we apply a statistical method (Pearson's chi-square test) to recognize the signals that appear in the whole genome of Escherichia coli. To show the effectiveness of the method, we compare Pearson's chi-square test with linguistic complexity on the complete genome of E. coli. The results suggest that Pearson's chi-square test is an efficient method for distinguishing genes (coding regions) from pseudogenes (noncoding regions). On the other hand, the performance of linguistic complexity is much lower than that of the chi-square test method. We also use the... 
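To make the statistic named in this abstract concrete, the sketch below computes Pearson's chi-square statistic, the sum over symbols of (observed - expected)^2 / expected, for the nucleotide counts of a short sequence against assumed background frequencies. It does not reproduce the paper's procedure: the function name `chi_square_statistic`, the uniform background, and the example sequences are illustrative assumptions.

```python
from collections import Counter


def chi_square_statistic(sequence, expected_freqs):
    """Pearson's chi-square statistic of symbol counts vs. expected frequencies.

    expected_freqs maps each symbol to its assumed background probability.
    """
    counts = Counter(sequence)
    n = len(sequence)
    statistic = 0.0
    for symbol, freq in expected_freqs.items():
        expected = n * freq
        observed = counts.get(symbol, 0)
        statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical uniform background over the four DNA bases.
uniform = {base: 0.25 for base in "ACGT"}

print(chi_square_statistic("ACGTACGT", uniform))  # perfectly balanced composition
print(chi_square_statistic("AAAA", uniform))      # strongly biased composition
```

A larger statistic indicates a composition that deviates more from the assumed background; turning it into a p-value would additionally require the chi-square distribution with the appropriate degrees of freedom.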

    Fuzzy support vector machine: An efficient rule-based classification technique for microarrays

    , Article BMC Bioinformatics ; Volume 14, Issue SUPPL13 , 2013 ; 14712105 (ISSN) Hajiloo, M ; Rabiee, H. R ; Anooshahpour, M ; Sharif University of Technology
    2013
    Abstract
    Background: The abundance of gene expression microarray data has led to the development of machine learning algorithms applicable for tackling disease diagnosis, disease prognosis, and treatment selection problems. However, these algorithms often produce classifiers with weaknesses in terms of accuracy, robustness, and interpretability. This paper introduces the fuzzy support vector machine, a learning algorithm based on a combination of fuzzy classifiers and kernel machines for microarray classification. Results: Experimental results on public leukemia, prostate, and colon cancer datasets show that fuzzy support vector machine applied in combination with filter or wrapper feature selection... 

    Inferring causal molecular networks: Empirical assessment through a community-based effort

    , Article Nature Methods ; Volume 13, Issue 4 , 2016 , Pages 310-322 ; 15487091 (ISSN) Hill, S. M ; Heiser, L. M ; Cokelaer, T ; Linger, M ; Nesser, N. K ; Carlin, D. E ; Zhang, Y ; Sokolov, A ; Paull, E. O ; Wong, C. K ; Graim, K ; Bivol, A ; Wang, H ; Zhu, F ; Afsari, B ; Danilova, L. V ; Favorov, A. V ; Lee, W. S ; Taylor, D ; Hu, C. W ; Long, B. L ; Noren, D. P ; Bisberg, A. J ; Mills, G. B ; Gray, J. W ; Kellen, M ; Norman, T ; Friend, S ; Qutub, A. A ; Fertig, E. J ; Guan, Y ; Song, M ; Stuart, J. M ; Spellman, P. T ; Koeppl, H ; Stolovitzky, G ; Saez Rodriguez, J ; Mukherjee, S ; Afsari, B ; Al-Ouran, R ; Anton, B ; Arodz, T ; Askari Sichani, O ; Bagheri, N ; Berlow, N ; Bisberg, A. J ; Bivol, A ; Bohler, A ; Bonet, J ; Bonneau, R ; Budak, G ; Bunescu, R ; Caglar, M ; Cai, B ; Cai, C ; Carlin, D. E ; Carlon, A ; Chen, L ; Ciaccio, M. F ; Cokelaer, T ; Cooper, G ; Coort, S ; Creighton, C. J ; Daneshmand, S. M. H ; De La Fuente, A ; Di Camillo, B ; Danilova, L. V ; Dutta-Moscato, J ; Emmett, K ; Evelo, C ; Fassia, M. K. H ; Favorov, A. V ; Fertig, E. J ; Finkle, J. D ; Finotello, F ; Friend, S ; Gao, X ; Gao, J ; Garcia Garcia, J ; Ghosh, S ; Giaretta, A ; Graim, K ; Gray, J. W ; Großeholz, R ; Guan, Y ; Guinney, J ; Hafemeister, C ; Hahn, O ; Haider, S ; Hase, T ; Heiser, L. M ; Hill, S. M ; Hodgson, J ; Hoff, B ; Hsu, C. H ; Hu, C. W ; Hu, Y ; Huang, X ; Jalili, M ; Jiang, X ; Kacprowski, T ; Kaderali, L ; Kang, M ; Kannan, V ; Kellen, M ; Kikuchi, K ; Kim, D. C ; Kitano, H ; Knapp, B ; Komatsoulis, G ; Koeppl, H ; Krämer, A ; Kursa, M. B ; Kutmon, M ; Lee, W. S ; Li, Y ; Liang, X ; Liu, Z ; Liu, Y ; Long, B. L ; Lu, S ; Lu, X ; Manfrini, M ; Matos, M. R. A ; Meerzaman, D ; Mills, G. B ; Min, W ; Mukherjee, S ; Müller, C. L ; Neapolitan, R. E ; Nesser, N. K ; Noren, D. P ; Norman, T ; Oliva, B ; Opiyo, S. O ; Pal, R ; Palinkas, A ; Paull, E. O ; Planas Iglesias, J ; Poglayen, D ; Qutub, A. A ; Saez Rodriguez, J ; Sambo, F ; Sanavia, T ; Sharifi-Zarchi, A ; Slawek, J ; Sokolov, A ; Song, M ; Spellman, P. T ; Streck, A ; Stolovitzky, G ; Strunz, S ; Stuart, J. M ; Taylor, D ; Tegnér, J ; Thobe, K ; Toffolo, G. M ; Trifoglio, E ; Unger, M ; Wan, Q ; Wang, H ; Welch, L ; Wong, C. K ; Wu, J. J ; Xue, A. Y ; Yamanaka, R ; Yan, C ; Zairis, S ; Zengerling, M ; Zenil, H ; Zhang, S ; Zhang, Y ; Zhu, F ; Zi, Z ; Sharif University of Technology
    Nature Publishing Group  2016
    Abstract
    It remains unclear whether causal, rather than merely correlational, relationships in molecular networks can be inferred in complex biological settings. Here we describe the HPN-DREAM network inference challenge, which focused on learning causal influences in signaling networks. We used phosphoprotein data from cancer cell lines as well as in silico data from a nonlinear dynamical model. Using the phosphoprotein data, we scored more than 2,000 networks submitted by challenge participants. The networks spanned 32 biological contexts and were scored in terms of causal validity with respect to unseen interventional data. A number of approaches were effective, and incorporating known biology was... 

    A tale of two symmetrical tails: Structural and functional characteristics of palindromes in proteins

    , Article BMC Bioinformatics ; Volume 9 , 2008 ; 14712105 (ISSN) Sheari, A ; Kargar, M ; Katanforoush, A ; Arab, S ; Sadeghi, M ; Pezeshk, H ; Eslahchi, C ; Marashi, S. A ; Sharif University of Technology
    2008
    Abstract
    Background: It has been previously shown that palindromic sequences are frequently observed in proteins. However, our knowledge about their evolutionary origin and their possible importance is incomplete. Results: In this work, we tried to revisit this relatively neglected phenomenon. Several questions are addressed in this work. (1) It is known that there is a large chance of finding a palindrome in low complexity sequences (i.e. sequences with extreme amino acid usage bias). What is the role of sequence complexity in the evolution of palindromic sequences in proteins? (2) Do palindromes coincide with conserved protein sequences? If yes, what are the functions of these conserved segments?... 

    Genome annotation and comparative genomic analysis of Bacillus subtilis MJ01, a new bio-degradation strain isolated from oil-contaminated soil

    , Article Functional and Integrative Genomics ; Volume 18, Issue 5 , 2018 , Pages 533-543 ; 1438793X (ISSN) Rahimi, T ; Niazi, A ; Deihimi, T ; Taghavi, S. M ; Ayatollahi, S ; Ebrahimie, E ; Sharif University of Technology
    Springer Verlag  2018
    Abstract
    One of the main challenges in eliminating oil contamination from polluted environments is improving biodegradation by highly efficient microorganisms. Bacillus subtilis MJ01 has been evaluated as a new resource for producing biosurfactant compounds. This bacterium, which produces surfactin, is able to enhance bio-accessibility to oil hydrocarbons in contaminated soils. The genome of B. subtilis MJ01 was sequenced and assembled using PacBio RS sequencing technology. One large contig with a length of 4,108,293 bp and no gaps was assembled. Genome annotation and gene prediction showed that the MJ01 genome is very similar to that of B. subtilis spizizenii TU-B-10 (95% similarity). The comparison...