PCR Amplification Prediction using Machine Learning

Latifian, Niloofar | 2023

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 56656 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Hossein Khalaj, Babak
  7. Abstract:
  8. Polymerase Chain Reaction (PCR) is a laboratory method for amplifying a segment of DNA. It is used for determining gene sequences, detecting pathogenic agents in epidemics, and introducing genetic changes in bacteria, diseases, plants, and even animals. Many factors affect the quality of the reaction, and each of them can influence whether the target region of the DNA is amplified. If the result of a PCR can be predicted from the factors involved in the reaction, a great deal of money and time can be saved. The aim of this research is to predict the result of PCR amplification using machine learning methods. Two methods are proposed for this purpose: a feature-based method and a string-based method. In the feature-based method, the most important features must be selected first. To do this, we first used several algorithms to rank the features by importance. These features were then extracted for each dataset, and a neural network model was trained on each of these datasets. We also built two further datasets by combining these datasets and trained the neural network model on the combined datasets as well. The accuracy of the model was 94%, 85%, and 83% on the existing datasets and 84% and 77% on the combined datasets. In the string-based method, given the primer and template strings, we first located the place where the primer binds to the template. We then introduced a new way to encode this binding. The pairing of each nucleotide pair in this region is either a match or a mismatch, and each match or mismatch is assigned one English letter. The letters assigned to the nucleotide pairs are then concatenated to form a new string. This procedure yields a specific string for any combination of primer pair and template in the dataset. These strings are then fed into a recurrent neural network (RNN). With this binding encoding, the final accuracy of the model reaches 96%.
  9. Keywords:
  10. Polymerase Chain Reaction (PCR) ; Feature Selection ; Artificial Neural Network ; Amplification Prediction ; Recurrent Neural Networks ; Machine Learning ; Primer
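
The string-based encoding described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not give the letter assignment or the method used to locate the binding site, so the alphabet in PAIR_TO_LETTER, the ungapped best-match search in find_binding_site, and the assumption that the template is written in the orientation that lines up base by base with the primer are all hypothetical choices made for illustration.

```python
from itertools import product

BASES = "ACGT"
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical alphabet (not from the thesis): one lowercase letter for each
# ordered (primer base, template base) pair, 16 letters in total.
# Four of them denote Watson-Crick matches, the other twelve denote mismatches.
PAIR_TO_LETTER = {
    pair: chr(ord("a") + i) for i, pair in enumerate(product(BASES, repeat=2))
}


def is_match(primer_base: str, template_base: str) -> bool:
    """True when the two bases form a Watson-Crick pair (A-T or C-G)."""
    return COMPLEMENT[primer_base] == template_base


def find_binding_site(primer: str, template: str) -> int:
    """Return the template offset with the most matching pairs.

    Assumption: an ungapped sliding-window search stands in for whatever
    binding-site search the thesis uses. Ties go to the leftmost offset.
    """
    if len(primer) > len(template):
        raise ValueError("primer is longer than template")
    best_offset, best_score = 0, -1
    for offset in range(len(template) - len(primer) + 1):
        window = template[offset:offset + len(primer)]
        score = sum(is_match(p, t) for p, t in zip(primer, window))
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset


def encode_binding(primer: str, template: str) -> str:
    """Encode the primer-template binding region as a string of letters."""
    offset = find_binding_site(primer, template)
    window = template[offset:offset + len(primer)]
    return "".join(PAIR_TO_LETTER[(p, t)] for p, t in zip(primer, window))


if __name__ == "__main__":
    print(encode_binding("ACGT", "GGTGCATT"))  # fully matched binding
    print(encode_binding("ACGT", "GGTGAATT"))  # one mismatch in the binding
```

Under these assumptions, each primer-template pair maps to one fixed-length string over a 16-letter alphabet, and a sequence model such as an RNN can consume it one letter at a time. A coarser alphabet (for example, one letter for all matches and one per mismatch type) would be an equally plausible reading of the abstract.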
