Search for:
bioinformatics
Total 68 records
Graph traversal edit distance and extensions
, Article Journal of Computational Biology ; Volume 27, Issue 3 , 2020 , Pages 317-329 ; Shrestha, A ; Sharifi Zarchi, A ; Gallagher, S. R ; Sahinalp, S. C ; Chitsaz, H ; Sharif University of Technology
Mary Ann Liebert Inc
2020
Abstract
Many problems in applied machine learning deal with graphs (also called networks), including social networks, security, web data mining, protein function prediction, and genome informatics. The kernel paradigm beautifully decouples the learning algorithm from the underlying geometric space, which renders graph kernels important for the aforementioned applications. In this article, we give a new graph kernel, which we call graph traversal edit distance (GTED). We introduce the GTED problem and give the first polynomial time algorithm for it. Informally, the GTED is the minimum edit distance between two strings formed by the edge labels of respective Eulerian traversals of the two graphs....
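The string edit distance mentioned in this abstract can be illustrated with the standard dynamic program below. This is only a sketch of plain Levenshtein distance between two fixed strings, assuming unit costs for insertion, deletion, and substitution; it does not search over Eulerian traversals and is not the authors' polynomial-time GTED algorithm. The function name `edit_distance` is hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings.

    Assumes unit cost for insert, delete, and substitute (an assumption
    made for illustration; the article may use a different cost model).
    """
    # prev[j] holds the distance between the processed prefix of `a`
    # and the first j characters of `b`.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            substitute = prev[j - 1] + (0 if ca == cb else 1)
            delete = prev[j] + 1
            insert = curr[j - 1] + 1
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]
```

In the GTED setting, the two input strings would be the edge-label sequences of one Eulerian traversal of each graph, and the quantity of interest is the minimum of this distance over all such traversal pairs.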
PyGTED: Python application for computing graph traversal edit distance
, Article Journal of Computational Biology ; Volume 27, Issue 3 , 2020 , Pages 436-439 ; Shrestha, A ; Sharifi Zarchi, A ; Gallagher, S. R ; Sahinalp, S. C ; Chitsaz, H ; Sharif University of Technology
Mary Ann Liebert Inc
2020
Abstract
Graph Traversal Edit Distance (GTED) is a measure of distance (or dissimilarity) between two graphs. This measure is based on the minimum edit distance between two strings formed by the edge labels of respective Eulerian traversals of the two graphs. GTED was motivated by and provides the first mathematical formalism for sequence coassembly and de novo variation detection in bioinformatics. Many problems in applied machine learning deal with graphs (also called networks), including social networks, security, web data mining, protein function prediction, and genome informatics. The kernel paradigm beautifully decouples the learning algorithm from the underlying geometric space,...
Observations on using probabilistic c-means for solving a typical bioinformatics problem
, Article EMS 2008, European Modelling Symposium, 2nd UKSim European Symposium on Computer Modelling and Simulation, Liverpool, 8 September 2008 through 10 September 2008 ; 2008 , Pages 236-239 ; 9780769533254 (ISBN) ; Ghazinezhad, A ; Rasooli Valaghozi, A ; Nadi, A ; Asgarian, E ; Salmani, V ; Najafi Ardabili, A ; Moeinzadeh, M. H ; Sharif University of Technology
2008
Abstract
Recently, there has been great interest in bioinformatics among researchers from various disciplines such as computer science, mathematics, statistics and artificial intelligence. Bioinformatics mainly deals with solving biological problems at the molecular level. One of the classic problems of bioinformatics that has gained a lot of attention lately is haplotyping, the goal of which is categorizing SNP fragments into two clusters and deducing a haplotype for each. Since the problem has been proved to be NP-hard, several computational and heuristic methods have addressed the problem seeking feasible answers. In this work it is shown that using PCM to solve the haplotyping problem on the DALY dataset yields...
An in-vitro measurement of temperature changes in phacoemulsification system during different modes
, Article 2nd International Conference on Bioinformatics and Biomedical Engineering, iCBBE 2008, Shanghai, 16 May 2008 through 18 May 2008 ; 2008 , Pages 1569-1574 ; 9781424417483 (ISBN) ; Fattahi, H ; Amjadi, A ; Sharif University of Technology
IEEE Computer Society
2008
Abstract
Ultrasound waves are nowadays used in most surgeries. They are applied to remove body tissue during surgery, for example to remove the eye lens in cataract surgery. In a nutshell, a 3 mm incision is made near the cornea, after which a folded lens is implanted, and the patient can be released soon after the operation. The procedure uses a vibrating metal tip to emulsify the lens, and the debris is aspirated through the hollow center of the tip. In some cases, the problem with this procedure is thermal damage. This research addresses the aforementioned problem through an important parameter, the different operating modes of the system. The...
A hybrid fuzzy based algorithm for 3D human airway segmentation
, Article 2nd International Conference on Bioinformatics and Biomedical Engineering, iCBBE 2008, Shanghai, 16 May 2008 through 18 May 2008 ; 2008 , Pages 2295-2298 ; 9781424417483 (ISBN) ; Ahmadian, A ; Sahba, N ; Tavakoli, V ; Alirezaie, J ; Fatemizadeh, E ; Rezaie, N ; Sharif University of Technology
IEEE Computer Society
2008
Abstract
Segmentation of the human airway tree from volumetric computed tomography images is an important stage for many clinical applications such as virtual bronchoscopy. The main challenges of previously developed methods are two problems, namely leaking into the surrounding lung parenchyma during segmentation and the need to manually adjust the parameters. To overcome these problems, a multi-seeded fuzzy-based region growing approach in conjunction with the spatial information of voxels is proposed. Comparison with a commonly used region growing segmentation algorithm shows that the proposed method retrieves more accurate results by achieving the specificity and sensitivity of 98.81%...
Three heuristic clustering methods for haplotype reconstruction problem with genotype information
, Article Innovations'07: 4th International Conference on Innovations in Information Technology, IIT, Dubai, 18 November 2007 through 20 November 2007 ; 2007 , Pages 402-406 ; 9781424418411 (ISBN) ; Asgarian, E ; Najafi Ardabili, A ; Sharifian R, S ; Sheikhaei, M. S ; Mohammadzadeh, J ; Sharif University of Technology
IEEE Computer Society
2007
Abstract
Most positions of the human genome are typically invariant (99%), and only some positions (1%) are commonly variant; these are associated with complex genetic diseases. Haplotype reconstruction divides aligned SNP fragments (SNPs being the most frequent form of difference relevant to genetic diseases) into two classes, thus inferring a pair of haplotypes from them. Minimum error correction (MEC) is an important model for this problem but is only effective when the error rate of the fragments is low. MEC/GI, an extension of MEC, employs the related genotype information besides the SNP fragments and so yields a more accurate inference. The haplotyping problem, due to its NP-hardness, may...
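The MEC objective named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not any of the three heuristic methods of the paper: fragments are assumed to be equal-length strings over `0`, `1`, and `-` (gap), each class's haplotype is assumed to be the per-position majority vote with ties going to `0`, and the names `consensus` and `mec_score` are hypothetical. The genotype information used by MEC/GI is not modeled here.

```python
from typing import List, Sequence


def consensus(fragments: Sequence[str]) -> str:
    """Majority-vote haplotype for one class of aligned fragments.

    '-' marks a gap and does not vote; ties go to '0' (an assumption).
    """
    if not fragments:
        return ""
    haplotype: List[str] = []
    for pos in range(len(fragments[0])):
        ones = sum(f[pos] == "1" for f in fragments)
        zeros = sum(f[pos] == "0" for f in fragments)
        haplotype.append("1" if ones > zeros else "0")
    return "".join(haplotype)


def mec_score(fragments: Sequence[str], assignment: Sequence[int]) -> int:
    """Number of fragment entries that must be corrected so that every
    fragment agrees with the consensus haplotype of its class (0 or 1)."""
    classes: List[List[str]] = [[], []]
    for frag, label in zip(fragments, assignment):
        classes[label].append(frag)
    total = 0
    for members in classes:
        hap = consensus(members)
        for frag in members:
            # Gaps carry no information, so they never count as errors.
            total += sum(1 for f, h in zip(frag, hap) if f != "-" and f != h)
    return total
```

A clustering heuristic for this problem would search over `assignment` vectors to make `mec_score` as small as possible; the function above only evaluates one candidate partition.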
Learning and Associating Phenotypic Behavior of Organisms using Biological data
, M.Sc. Thesis Sharif University of Technology ; Beigy, Hamid (Supervisor) ; Motahari, Abolfazl (Supervisor)
Abstract
Datasets extracted from gene expression microarrays contain information about the phenotypic behavior of organisms. Turning this information into knowledge, i.e. finding genes associated with a given phenotype, is a daunting task. This is due to the high dimensionality of the data, as the number of features on a gene expression microarray is usually very large. Moreover, a phenotype may change the expression pattern of a set of genes rather than changing each gene’s expression independently. To tackle the second problem, integrating other sources of information such as Protein-Protein Interaction (PPI) networks is required. In this thesis, the PPI network extracted from the String database...
Gene Selection and Reduction in DNA Microarrays to Improve Classification Accuracy of Cancerous Samples
, M.Sc. Thesis Sharif University of Technology ; Rabiee, Hamid Reza (Supervisor)
Abstract
DNA Microarray is the state-of-the-art technology in analyzing gene expression data. It has made it possible to measure expression levels of thousands of genes simultaneously. Microarray classification has been widely used in effective diagnosis of cancers and some other biological diseases. But the most challenging issue is the intense asymmetry between the dimensionality of genes and tissue samples, which can wreck the classification performance. This dissertation focuses on gene selection and reduction solutions and presents a novel classification scheme which uses both gene selection and dimension reduction in its different stages. We have improved one of the recently proposed topology...
Human Genome Sequence Analysis Using Statistical and Machine Learning Methods
, M.Sc. Thesis Sharif University of Technology ; Manzuri Shalmani, Mohammad Taghi (Supervisor)
Abstract
During recent decades, dramatic advances in Genetics and Molecular Biology have provided scientists with enormous amounts of molecular genomic information of different living organisms, from DNA sequences to complex 3D structures of proteins. This information is raw data whose analysis can provide better understanding of genome mechanisms, discriminating healthy and tumor cells, predicting disease type, making drugs based on genome information, and many more applications. Here, one important issue is the inevitable use of computer science and statistics to analyze these data; such that according to the vast amount of data, would provide intelligent methods, which yield most accurate...
Fast Alignment-free Protein Comparison Approach based on FPGA Implementation
, M.Sc. Thesis Sharif University of Technology ; Koohi, Somayyeh (Supervisor)
Abstract
Protein, as the functional unit of the cell, plays a vital role in its biological function. With the advent of advanced sequencing techniques in recent years and the consequent exponential growth of the number of protein sequences extracted from diverse biological samples, their analysis, comparison, and classification have faced a considerable challenge. Existing methods for comparing proteins divide into two categories: methods based on alignment and alignment-free. Although alignment-based methods are among the most accurate methods, they face inherent limitations such as poor analysis of protein groups with low sequence similarity, time complexity, computational complexity, and memory...
Exploring Pivot Genes and Clinical Prognosis Using Combined Bioinformatics Approaches in the Colon Cancer
, M.Sc. Thesis Sharif University of Technology ; Foroughmand Araabi, Mohammad Hadi (Supervisor)
Abstract
Colorectal cancer (CRC) is one of the most common causes of cancer death worldwide. Identification of pivot genes in colorectal cancer can play an important role, as biomarkers, in prediction and early diagnosis and in reducing the number of deaths caused by this disease. In this study, the aim of which is to discover pivot genes in colorectal cancer, six microarray datasets were selected from the GEO database, including 277 tumor tissue samples and 325 normal colon tissue samples. After data processing, differentially expressed genes and CRC-related genes were screened and 285 shared genes between them were identified for subsequent analysis. Based on 285 shared genes, the protein-protein interaction...
Fundamental Bounds for Clustering of Bernoulli Mixture Models
, M.Sc. Thesis Sharif University of Technology ; Motahari, Abolfazl (Supervisor)
Abstract
A random vector with binary components that are independent of each other is referred to as a Bernoulli random vector. A Bernoulli Mixture Model (BMM) is a combination of a finite number of Bernoulli models, where each sample is generated randomly according to one of these models. The important challenge is to estimate the parameters of a Bernoulli Mixture Model or to cluster samples based on their source models. This problem has applications in bioinformatics, image recognition, text classification, social networks, and more. For example, in bioinformatics, it pertains to clustering ethnic groups based on genetic data. Many studies have introduced algorithms for solving this problem without...
Exploration of Existing Patterns in Copy Number Variations of Genetic Diseases and Disorders
, Ph.D. Dissertation Sharif University of Technology ; Rabiee, Hamid Reza (Supervisor)
Abstract
One of the main sources of genetic variation is structural variation, including the widespread Copy Number Variations (CNVs). CNVs include two types, copy of genetic material (duplication) and loss of part of a genetic sequence (deletion), and typically range from one kilobase pair (Kbp) to several megabase pairs (Mbp) in size. Most copy number variations occur in healthy people; however, these variants can also contribute to numerous diseases through several genetic mechanisms (e.g. changing gene dosage through insertions, duplications or deletions). The CNV study can provide greater insight into the etiology of disease phenotypes. Nowadays, with the huge amount of investment...
Inference of gene regulatory networks by extended Kalman filtering using gene expression time series data
, Article BIOINFORMATICS 2012 - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms ; 2012 , Pages 150-155 ; 9789898425904 (ISBN) ; Fatemizadeh, E ; Arab, S. S ; Sharif University of Technology
2012
Abstract
In this paper, the Extended Kalman filtering (EKF) approach has been used to infer gene regulatory networks using time-series gene expression data. Gene expression values are considered stochastic processes and the gene regulatory network, a dynamical nonlinear stochastic model. Using these values and a modified Kalman filtering approach, the model's parameters and consequently the interactions amongst genes are predicted. In this paper, each gene-gene interaction is modeled using a linear term, a nonlinear one, and a constant term. The linear and nonlinear term coefficients are included in the state vector together with the gene expressions' true values. Through the extended Kalman...
Multiple cell tracking algorithm assessment using simulation of spermatozoa movement
, Article 2015 IEEE 15th International Conference on Bioinformatics and Bioengineering, BIBE 2015, 2 November 2015 through 4 November 2015 ; 2015 ; 9781467379830 (ISBN) ; Vahdat, B. V ; Sharif University of Technology
Institute of Electrical and Electronics Engineers Inc
2015
Abstract
In this research, a web-based simulator is developed, which can be used for generating image sequences of moving spermatozoa cells. It can be used for assessment of multiple object tracking algorithms, especially Computer Aided Sperm Analysis (CASA) systems. The developed software has many useful parameters such as blurring images or adding noise and it also gives full control of sperm counts and types. To illustrate performance of the developed simulator, three parameters (spermatozoa population, standard deviation of Gaussian blur filter and noise intensity) have been swept and the results of three different multiple object tracking algorithms were compared as an application of this...
An intelligent multi-agent system architecture for enhancing self-management of type 2 diabetic patients
, Article IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2015, 12 August 2015 through 15 August 2015 ; August , 2015 , Page(s): 1 - 8 ; 9781479969265 (ISBN) ; Zarandi, M. H. F ; Solgi, S. S ; Turksen, I. B ; Sharif University of Technology
Institute of Electrical and Electronics Engineers Inc
2015
Abstract
This paper presents a multi-agent system which can help type 2 diabetic patients. Because of the nature of diabetes, medical care and lifestyle can both prevent the complications of this illness, so self-management is crucial for these patients. Therefore, each effective factor in controlling diabetes and increasing patients' knowledge, including a suitable diet and monitoring of blood glucose level, is assigned to an agent which works independently while coordinating and cooperating with other agents. The proposed architecture of this multi-agent system organizes all the various agents of this system in a way that not only each of them can accomplish their responsibility, but also they can have...
Investigation of heavy metals containing acidic waste waters from coal mine
, Article 2010 4th International Conference on Bioinformatics and Biomedical Engineering, iCBBE 2010, 18 June 2010 through 20 June 2010 ; June , 2010 ; 9781424447138 (ISBN) ; Sharif University of Technology
2010
Abstract
Waste water from ore plant layout central Alborz's coal has been investigated and analyzed. Studies showed that, because of its acidic character, the waste water contains heavy metals including Cu, Cr, Ba, Zn, V, Sr, Pb, Ni. Considering the importance of this topic for outgoing acidic waste waters, which are hazardous for the environment, the heavy metals should not be transferred into the environment. Since the waste water is transferred into nearby agricultural ground, a method was developed so that waste water is primarily directed into a slug pond for sedimentation. In the second stage, in another pond, the outflow of waste water was treated with chemicals. Also the pH was...
Stiffer double-stranded DNA in two-dimensional confinement due to bending anisotropy
, Article Physical Review E - Statistical, Nonlinear, and Soft Matter Physics ; Volume 94, Issue 6 , 2016 ; 15393755 (ISSN) ; Eslami Mossallam, B ; Ranjbar, H. F ; Ejtehadi, M. R ; Sharif University of Technology
American Physical Society
2016
Abstract
Using an analytical approach and Monte Carlo (MC) simulations, we study the elastic behavior of intrinsically twisted elastic ribbons with bending anisotropy, such as double-stranded DNA (dsDNA), in two-dimensional (2D) confinement. We show that, due to the bending anisotropy, the persistence length of dsDNA in 2D conformations is always greater than in three-dimensional (3D) conformations. This result is consistent with the measured values for DNA persistence length in 2D and 3D in equal biological conditions. We also show that in two dimensions, an anisotropic, intrinsically twisted polymer exhibits an implicit twist-bend coupling, which leads to the transient curvature increasing with a...
Modeling and control of cell cycle
, Article 3rd International Conference on Bioinformatics and Biomedical Engineering, iCBBE 2009 ; 2009 ; 9781424429028 (ISBN) ; Seifipour, N ; Sharif University of Technology
Abstract
Applying engineering approaches to non-engineering systems such as biological systems has brought a new research field to the surface. The sole purpose of this research is mathematical modeling of the cell cycle control system in the fission yeast cell using Ordinary Differential Equations (ODE). This paper, in fact, introduces the capabilities of engineering knowledge, and of control engineering in particular, in this field. ©2009 IEEE
A combination of PSO and K-means methods to solve haplotype reconstruction problem
, Article 2009 International Conference on Innovations in Information Technology, IIT '09, 15 December 2009 through 17 December 2009 ; 2009 , Pages 190-194 ; 9781424456987 (ISBN) ; Baharian, A ; Asgarian, E ; Rasooli, A ; Sharif University of Technology
Abstract
Disease association study is of great importance among various fields of study in bioinformatics. Computational methods happen to be advantageous specifically when experimental approaches fail to obtain accurate results. Haplotypes are believed to be the most responsible biological data for genetic diseases. In this paper, the problem of reconstructing haplotypes from error-containing SNP fragments is discussed. For this purpose, two new methods have been proposed by a combination of k-means clustering and particle swarm optimization algorithm. The methods and their implementation results on real biological and simulation datasets are presented, which show that they outperform the methods...