
Fast Alignment-free Protein Comparison Approach based on FPGA Implementation

Abdosalehi, Azam Sadat | 2022

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 55053 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Koohi, Somayyeh
  7. Abstract: Proteins, as the functional units of the cell, play a vital role in its biological function. With the advent of advanced sequencing techniques in recent years and the consequent exponential growth in the number of protein sequences extracted from diverse biological samples, their analysis, comparison, and classification have become considerably challenging. Existing methods for comparing proteins fall into two categories: alignment-based and alignment-free. Although alignment-based methods are among the most accurate, they face inherent limitations such as poor analysis of protein groups with low sequence similarity, high time and computational complexity, and high memory consumption. Alignment-free methods have therefore been proposed to overcome these limitations, based on properties such as similarity of physicochemical properties, the number and distribution of k-mers or a combination of the two, graphical representation, and information theory. Although these methods improve on the computational complexity and memory consumption of alignment-based methods, they still face a trade-off between accuracy on the one hand and time complexity and memory consumption on the other, and their accuracy remains a problem. Accordingly, the main goal of this project is to design and evaluate a protein-level classification model that combines data from different domains (the number of patterns, their distribution, and their chemical properties) while improving accuracy, speed, and memory consumption. We then optimize the proposed model for simulation on an FPGA board, which can accelerate genomic applications and protein sequence analysis. Finally, we compare the proposed model with similar tools in terms of accuracy, speed, and memory consumption.
  8. Keywords: Bioinformatics ; Proteins ; Field Programmable Gate Array (FPGA) ; Phylogenetics ; Protein Sequence ; Alignment-Free Techniques ; Protein Classification
