A clustering-based algorithm for de novo motif discovery in DNA sequences

Ebrahim Abadi, M. H ; Sharif University of Technology

  1. Type of Document: Article
  2. DOI: 10.1109/ICBME.2017.8430242
  3. Abstract:
  4. Motif discovery is a challenging problem in molecular biology and has been attracting researchers' attention for years. Different kinds of data and computational methods have been used to tackle this problem, but there is still room for improvement. In this study, our goal was to develop a method able to identify all the TFBS signals, both known and unknown, inside the input set of sequences. As part of our algorithm, we developed a specialized clustering method that outperforms existing clustering methods such as DNACLUST and CD-HIT-EST in clustering short sequences. A scoring system was needed to determine how close a cluster is to being a real motif. Multiple features are calculated from the contents of each cluster to determine the cluster's score. These features comprise a set of divergence measures together with positional and occurrence information. The resulting scores are combined so that a trade-off among them determines the cluster's status. The final results can optionally be compared against motif databases such as Jolma2013 and UniProbe using the Tomtom motif comparison tool. The algorithm was evaluated on three datasets from the ABS database. © 2017 IEEE
  5. Keywords:
  6. De novo motif prediction ; Transcription factor binding site (TFBS) ; Binding sites ; Bioinformatics ; Biomedical engineering ; Biophysics ; Cluster analysis ; DNA sequences ; Economic and social effects ; Molecular biology ; Algorithm evaluation ; Clustering ; Clustering methods ; Clustering-based algorithms ; Divergence measures ; Multiple features ; Scoring systems ; Transcription factor binding sites ; Clustering algorithms
  7. Source: 2017 24th Iranian Conference on Biomedical Engineering and 2017 2nd International Iranian Conference on Biomedical Engineering, ICBME 2017, 30 November 2017 through 1 December 2017 ; 2018 ; 9781538636091 (ISBN)
  8. URL: https://ieeexplore.ieee.org/document/8430242
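
The abstract says each cluster is scored in part by a set of divergence measures, but it does not name them or say how they are combined. As an illustration only, the sketch below assumes one common choice: the mean per-column Kullback-Leibler divergence of a cluster's base frequencies from a uniform background. The function names, the pseudocount, the uniform background, and the example sequences are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: sequences contain only these four bases


def position_frequencies(cluster, pseudocount=1.0):
    """Per-column base frequencies for equal-length sequences.

    A pseudocount is added to every base so that no frequency is zero.
    """
    length = len(cluster[0])
    if any(len(seq) != length for seq in cluster):
        raise ValueError("all sequences in a cluster must have the same length")
    total = len(cluster) + pseudocount * len(ALPHABET)
    columns = []
    for i in range(length):
        counts = Counter(seq[i] for seq in cluster)
        columns.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return columns


def kl_divergence_score(cluster, background=None, pseudocount=1.0):
    """Mean per-column KL divergence (in bits) from the background.

    Higher values mean the cluster's columns are more conserved relative
    to the background, which is one way a cluster could look motif-like.
    """
    if background is None:
        # Assumption: uniform background; a real pipeline might estimate
        # it from the input sequences instead.
        background = {b: 1.0 / len(ALPHABET) for b in ALPHABET}
    columns = position_frequencies(cluster, pseudocount)
    total = sum(
        p * math.log2(p / background[b])
        for col in columns
        for b, p in col.items()
    )
    return total / len(columns)


# Hypothetical clusters: one nearly identical, one with mixed columns.
conserved = ["TGACTCA", "TGACTCA", "TGAGTCA", "TGACTCA"]
mixed = ["TGACTCA", "ACGTACG", "GTCAGTC", "CATGCAT"]

if __name__ == "__main__":
    print("conserved:", round(kl_divergence_score(conserved), 3))
    print("mixed:    ", round(kl_divergence_score(mixed), 3))
```

Under these assumptions the conserved cluster scores higher than the mixed one. A full scoring system of the kind the abstract describes would also need the positional and occurrence features, which this sketch leaves out.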