Speeding up of Genetic Structural Variation Detection

Akbari Nejad Mousavi, Shaya | 2020

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 52781 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Goudarzi, Maziar
  7. Abstract: Large differences in chromosome structure relative to the reference genome are a major source of genetic variation. These differences, called structural variations, are associated with numerous diseases, including schizophrenia, cancer, and autism. Calling these variations is therefore essential for the downstream stages of analysis. However, because discovering them is computationally intensive, structural variation calling lags behind the rate at which sequencers produce data. Hence, detecting these variations with adequate accuracy and in reasonable time is of paramount importance. In this research, we implement a fast yet accurate structural variation calling pipeline for long reads that takes raw reads as input and detects structural variants larger than 50 bp. The accuracy and sensitivity of our method are especially evident in low-coverage scenarios, where similar tools need high coverage to achieve high sensitivity. To reduce running time, we use a neural network to separate useful processes, meaning those that help find structural variations, from non-useful ones, and we run only the useful ones. We further reorder the processes to improve cache utilization and, in turn, speed. To increase accuracy, we combine the results of Sniffles and SVIM, two state-of-the-art tools. In conclusion, our method is up to 2 times faster than the most sensitive method, which is a naive combination of different tools. Accuracy-wise, our method is up to 17% more sensitive than the base method.
  8. Keywords: Aligner; Long Reads; PacBio Sequencers; Structural Variation Calling Speed-up; Structured Variability
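
The abstract states that results from Sniffles and SVIM are combined but does not say how. The following is only a minimal sketch of one plausible approach: a union of two call sets that drops near-duplicate events and keeps variants larger than 50 bp. The `SVCall` record, the `merge_calls` function, and the 500 bp matching window are all hypothetical names and values chosen for illustration; they are not taken from the thesis or from either tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SVCall:
    """Hypothetical minimal record for one structural variant call."""
    chrom: str
    pos: int      # start position on the reference
    svtype: str   # e.g. "DEL", "INS", "INV"
    length: int   # variant size in bp
    caller: str   # which tool produced the call


MIN_SV_LEN = 50  # the abstract targets variants larger than 50 bp


def same_event(a: SVCall, b: SVCall, max_dist: int) -> bool:
    """Treat two calls as the same event if they share chromosome and type
    and their start positions lie within max_dist bp (assumed criterion)."""
    return (a.chrom == b.chrom
            and a.svtype == b.svtype
            and abs(a.pos - b.pos) <= max_dist)


def merge_calls(first, second, max_dist=500):
    """Union of two call sets: keep all sufficiently large calls from `first`,
    then add calls from `second` that match nothing already kept."""
    merged = [c for c in first if abs(c.length) > MIN_SV_LEN]
    for cand in second:
        if abs(cand.length) <= MIN_SV_LEN:
            continue  # below the size threshold
        if not any(same_event(cand, kept, max_dist) for kept in merged):
            merged.append(cand)
    return sorted(merged, key=lambda c: (c.chrom, c.pos))


# Toy inputs standing in for parsed output of the two callers.
sniffles_calls = [
    SVCall("chr1", 1000, "DEL", 120, "sniffles"),
    SVCall("chr1", 5000, "INS", 30, "sniffles"),   # too small, dropped
]
svim_calls = [
    SVCall("chr1", 1100, "DEL", 118, "svim"),      # duplicate of the first call
    SVCall("chr2", 200, "INS", 300, "svim"),
]
merged = merge_calls(sniffles_calls, svim_calls)
```

A union like this favors sensitivity, since a variant reported by either tool is kept; an intersection would favor precision instead.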
