Using Statistical Pattern Recognition on Gene Expression Data for Prediction of Cancer

Hajiloo, Mohsen | 2009

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 39717 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Rabiee, Hamid Reza
  7. Abstract:
  8. Classifying tumor types is of great importance in cancer diagnosis and drug discovery. However, most previous cancer classification studies are clinically based and have limited diagnostic ability. Gene expression data are believed to hold the key to fundamental problems in cancer diagnosis. The recent advent of DNA microarray technology has made it possible to monitor the expression of thousands of genes simultaneously. With this abundance of data, researchers have begun to explore cancer classification using gene expression data, and a number of pattern recognition approaches have been proposed in recent years with promising results. Many issues nevertheless remain poorly understood. The literature emphasizes building cancer classification models that are accurate, robust, and biologically relevant. Fuzzy SVM is a rule-based classification model with high generalization ability that was introduced about six years ago. Since no research has been reported on its application to cancer classification, we study its capabilities in this domain. Experimental results show that the proposed method performs as well as or better than previous methods. Furthermore, it provides valuable rule sets for further analysis. As the next step in this research, we build an ensemble model in which each Fuzzy SVM classifier is trained on a dataset produced by a feature sub-sampling technique. Experimental results show that this model is more robust and less sensitive to irrelevant genes.

  9. Keywords:
  10. Cancer ; Microarray Data ; Classification ; Gene Expression Data ; Ensemble Learning ; Fuzzy Support Vector Machine ; Biological Relevance
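
The feature sub-sampling ensemble described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the base learner here is a simple nearest-centroid classifier standing in for Fuzzy SVM, the combination rule is assumed to be majority voting, and all function names, parameters, and the toy expression values are hypothetical. The sketch only shows the general idea that each ensemble member is trained on a random subset of genes.

```python
import random
from collections import Counter

def fit_centroids(X, y, features):
    """Per-class mean vector, restricted to the chosen gene indices."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, j in enumerate(features):
            acc[k] += row[j]
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}

def predict_centroid(centroids, row, features):
    """Class whose centroid is nearest in squared Euclidean distance."""
    def dist(c):
        return sum((row[j] - m) ** 2 for j, m in zip(features, centroids[c]))
    return min(centroids, key=dist)

def fit_subspace_ensemble(X, y, n_members=15, n_features=3, seed=0):
    """Train each member on its own random subset of genes (assumed scheme)."""
    rng = random.Random(seed)
    n_genes = len(X[0])
    members = []
    for _ in range(n_members):
        features = rng.sample(range(n_genes), n_features)
        members.append((features, fit_centroids(X, y, features)))
    return members

def predict_ensemble(members, row):
    """Combine member predictions by majority vote (assumed rule)."""
    votes = Counter(predict_centroid(c, row, f) for f, c in members)
    return votes.most_common(1)[0][0]

# Toy data (made up): genes 0-3 separate the classes, genes 4-5 are noise.
X = [
    [5.1, 4.9, 5.2, 5.0, 0.3, 0.7],
    [4.8, 5.2, 5.0, 5.1, 0.6, 0.2],
    [5.0, 5.1, 4.9, 4.8, 0.1, 0.9],
    [1.0, 0.9, 1.1, 1.2, 0.5, 0.4],
    [1.2, 1.1, 0.8, 1.0, 0.8, 0.1],
    [0.9, 1.0, 1.2, 0.9, 0.2, 0.6],
]
y = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]

members = fit_subspace_ensemble(X, y, n_members=5, n_features=3, seed=1)
print(predict_ensemble(members, [4.7, 5.3, 5.1, 4.9, 0.9, 0.0]))
```

Because each member sees only a few genes, a single irrelevant gene can affect at most the members that happened to sample it, which is one plausible reading of why such an ensemble would be less sensitive to irrelevant genes.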
