- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 56255 (02)
- University: Sharif University of Technology
- Department: Mathematical Sciences
- Advisor(s): Foroughmand Aarabi, Mohammad Hadi
- Abstract:
- Genome-Wide Association Studies (GWAS) play a vital role in understanding the genetic basis of complex traits and diseases. This thesis investigates the effectiveness of two approaches that combine Differential Evolution (DE) with Random Forest (RF) and with Support Vector Machine (SVM) for feature selection in the context of GWAS. An Arabidopsis thaliana dataset is used as the experimental dataset for comparative analysis. The main goal is to achieve more efficient feature selection while maintaining accuracy competitive with RF and SVM used without DE. The research consists of experiments using DE with RF and DE with SVM, followed by a comprehensive evaluation of the results. Key performance measures, such as the number of selected features and classification accuracy, are used to evaluate the approaches. The number of features is reduced from an initial 278 to a final 141, a significant result that shows the efficiency of the feature selection process. The results show that integrating DE with RF and SVM leads to improved feature selection and classification accuracy. Despite the additional computational complexity introduced by DE, the area under the curve (AUC) shows a small increase on average compared to RF and SVM alone. These findings highlight the importance of DE for feature selection in GWAS. Efficient feature selection is essential for identifying the genetic markers most relevant to traits or diseases. The dimensionality reduction obtained through DE contributes to a more focused and interpretable analysis and facilitates a better understanding of the genetic factors influencing the phenotype.
- Keywords:
- Genome Analysis ; Differential Evolution Algorithm ; Random Forest Algorithm ; Support Vector Machine (SVM) ; Genome-Wide Association Studies (GWAS) ; Disease Diagnosis
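The wrapper-style feature selection described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the common DE/rand/1/bin variant, encodes each candidate as a vector in [0, 1] whose entries above 0.5 mark selected features, and replaces the cross-validated RF or SVM score with a hypothetical `toy_fitness` function so that the example runs with the standard library alone. The names `de_feature_selection`, `toy_fitness`, and `INFORMATIVE`, and all parameter values, are assumptions made for illustration.

```python
import random

# Hypothetical stand-in for "cross-validated AUC of RF/SVM on the selected
# features": rewards three informative features, penalises subset size.
INFORMATIVE = {0, 1, 2}

def toy_fitness(mask):
    hits = sum(1 for j, keep in enumerate(mask) if keep and j in INFORMATIVE)
    return hits - 0.1 * sum(mask)

def to_mask(vector):
    # A feature is selected when its continuous gene exceeds 0.5.
    return [x > 0.5 for x in vector]

def de_feature_selection(fitness, n_features, pop_size=20, generations=30,
                         F=0.5, CR=0.9, seed=0):
    """DE/rand/1/bin over [0, 1]^n_features, maximising fitness(mask)."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    scores = [fitness(to_mask(v)) for v in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n_features)  # forces at least one mutated gene
            trial = []
            for j in range(n_features):
                if rng.random() < CR or j == j_rand:
                    gene = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    gene = min(1.0, max(0.0, gene))  # clip back into [0, 1]
                else:
                    gene = pop[i][j]
                trial.append(gene)
            trial_score = fitness(to_mask(trial))
            # Greedy selection: keep the trial if it is no worse than the target.
            if trial_score >= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = max(range(pop_size), key=scores.__getitem__)
    return to_mask(pop[best]), scores[best]

mask, score = de_feature_selection(toy_fitness, n_features=10)
print("selected features:", [j for j, keep in enumerate(mask) if keep])
print("fitness:", score)
```

In a setting like the one the abstract describes, `toy_fitness` would be replaced by a function that trains RF or SVM on the selected columns and returns a validation score such as AUC, possibly combined with a penalty on the number of selected features. That substitution is where the additional computational cost mentioned in the abstract arises, since every trial vector requires a model fit.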