Search for: genome-analysis
Privacy in DNA Sequencing
M.Sc. Thesis, Sharif University of Technology; Maddah-ali, Mohammad Ali (Supervisor); Motahari, Abolfazl (Co-Supervisor)
Abstract
A DNA sequence is lifelong private information about an individual: it can reveal that individual's personal traits, health status, and medical risks; it can be abused by entities such as insurance companies; and it can be used for identity theft. Unfortunately, due to cost, regulations, or other restrictions, we may not be able to complete DNA sequencing in-house and may have to outsource it to unreliable companies in foreign countries. This would compromise DNA privacy from the beginning, and it raises the question of how we can guarantee DNA privacy during sequencing. Here we propose a solution for private DNA sequencing by exploiting the fact that the process...
Distributed Processing of Next Generation Sequencing Data Set
M.Sc. Thesis, Sharif University of Technology; Goudarzi, Maziar (Supervisor); Motahari, Abolfazl (Supervisor)
Abstract
DNA analysis plays a significant role in fields such as pharmacy, agriculture, genealogy, and forensics. Next generation sequencing datasets cover a gene several times because of the large number of reads. Therefore, the initial data volume is several times the amount of memory required to store the DNA strand. First, the DNA sequence of a sample must be constructed from the primary data; then the differences are found by comparing the sample DNA sequence with the reference DNA sequence. By finding these differences, one can extract the characteristics of the tested species. The extracted properties are valuable to genetics researchers. For example, they can produce drugs that are...
Human Genome Sequence Analysis Using Statistical and Machine Learning Methods
M.Sc. Thesis, Sharif University of Technology; Manzuri Shalmani, Mohammad Taghi (Supervisor)
Abstract
During recent decades, dramatic advances in genetics and molecular biology have provided scientists with enormous amounts of molecular genomic information about different living organisms, from DNA sequences to complex 3D structures of proteins. This information is raw data whose analysis can provide a better understanding of genome mechanisms, discriminate between healthy and tumor cells, predict disease type, support drug design based on genome information, and enable many more applications. Here, one important issue is the unavoidable need for computer science and statistics to analyze these data, so that, given the vast amount of data, intelligent methods can be provided which yield the most accurate...
Algorithms of Genome-Wide Association Studies
M.Sc. Thesis, Sharif University of Technology; Foroughmand Aarabi, Mohammad Hadi (Supervisor)
Abstract
The field of Genome-Wide Association Studies (GWAS) plays a vital role in understanding the genetic basis of complex traits and diseases. This thesis investigates the effectiveness of two approaches that combine Differential Evolution (DE) with Random Forest (RF) and with Support Vector Machine (SVM) for feature selection in the context of GWAS. An Arabidopsis thaliana dataset is used as the experimental dataset for comparative analysis. The main goal is to achieve more efficient feature selection while maintaining competitive accuracy compared to RF and SVM without DE. This research includes conducting experiments using DE with RF and DE with SVM followed by a comprehensive...
Genome annotation and comparative genomic analysis of Bacillus subtilis MJ01, a new bio-degradation strain isolated from oil-contaminated soil
Article, Functional and Integrative Genomics; Volume 18, Issue 5, 2018, Pages 533-543; 1438793X (ISSN); Niazi, A; Deihimi, T; Taghavi, S. M; Ayatollahi, S; Ebrahimie, E; Sharif University of Technology
Springer Verlag
2018
Abstract
One of the main challenges in eliminating oil contamination from polluted environments is improving biodegradation by highly efficient microorganisms. Bacillus subtilis MJ01 has been evaluated as a new resource for producing biosurfactant compounds. This bacterium, which produces surfactin, is able to enhance bio-accessibility to oil hydrocarbons in contaminated soils. The genome of B. subtilis MJ01 was sequenced and assembled using PacBio RS sequencing technology. A single large contig of 4,108,293 bp, without any gaps, was assembled. Genome annotation and gene prediction showed that the MJ01 genome is very similar to B. subtilis spizizenii TU-B-10 (95% similarity). The comparison...
A hierarchical machine learning model based on Glioblastoma patients' clinical, biomedical, and image data to analyze their treatment plans
Article, Computers in Biology and Medicine; Volume 150, 2022; 00104825 (ISSN); Rahimi Rise, Z; Akhavan Niaki, S. T; Sharif University of Technology
Elsevier Ltd
2022
Abstract
Aim of study: Glioblastoma Multiforme (GBM) is an aggressive brain cancer in adults that kills most patients within the first year due to ineffective treatment. Analyzing GBM requires different clinical, biomedical, and image data features, which increases complexity. Moreover, these features lead to weak performance in machine learning models when physicians' knowledge is ignored. Therefore, this paper proposes a hierarchical model based on Fuzzy C-means (FCM) clustering, wrapper feature selection, and twelve classifiers to analyze treatment plans. Methodology/Approach: The proposed method finds the effectiveness of previous and current treatment plans, hierarchically determining the best decision for...