- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 39783 (19)
- University: Sharif University of Technology
- Department: Computer Engineering
- Advisor(s): Beigy, Hamid
- Abstract:
- Decisions often have consequences of unequal importance. Most classifiers try to minimize the rate of misclassified instances, which implicitly assumes that every type of misclassification costs the same. This assumption fails in many real-world problems, where different types of misclassification carry different costs. These differences can be taken into account by introducing cost into the learning process, so that the total misclassification cost, rather than the error rate, becomes the evaluation metric. Applying this metric requires new learning algorithms, and cost-sensitive learning is the area of machine learning that addresses them. Cost-sensitive algorithms have been deployed in many data mining applications, such as intrusion detection, bioinformatics, and spam filtering. Stacked generalization is one of the successful ensemble learners, and this thesis aims to make it cost-sensitive. The first solution changes the structure of stacked generalization by incorporating cost into the learning process, yielding a direct cost-sensitive algorithm. Empirical results show that this algorithm performs better than the cost-insensitive version. The second solution applies thresholds of different types to the outputs of the different levels of the stacked generalization algorithm. Experiments show that threshold-based cost-sensitive stacked generalization performs substantially better than other thresholding cost-sensitive algorithms, and also better than direct cost-sensitive stacked generalization. Finally, a new algorithm is presented to improve the performance of cost-sensitive stacked generalization. This algorithm is based on the cascade-correlation neural network and is called cost-sensitive cascade generalization. Empirical results show that cost-sensitive cascade generalization achieves better results than direct cost-sensitive stacked generalization.
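The two ideas the abstract relies on, total misclassification cost as the evaluation metric and thresholding a classifier's output, can be illustrated with a minimal sketch. This is not the thesis's algorithm: the binary cost matrix `COST`, its values, and the function names are all hypothetical, and the threshold is the standard minimum-expected-cost rule for a two-class problem, assuming the classifier outputs a calibrated probability for class 1.

```python
from typing import List, Sequence

# Hypothetical cost matrix, indexed as COST[true_class][predicted_class].
# The values are illustrative only: a false negative costs 5x a false positive.
COST = [
    [0.0, 1.0],  # true class 0: correct = 0, false positive = 1
    [5.0, 0.0],  # true class 1: false negative = 5, correct = 0
]


def total_cost(y_true: Sequence[int], y_pred: Sequence[int],
               cost: Sequence[Sequence[float]] = COST) -> float:
    """Sum the cost of every (true, predicted) pair; replaces error rate."""
    return sum(cost[t][p] for t, p in zip(y_true, y_pred))


def min_cost_threshold(cost: Sequence[Sequence[float]] = COST) -> float:
    """Threshold on P(class 1) at or above which predicting 1 is cheaper.

    Predicting 1 has lower expected cost than predicting 0 when
    p >= (c01 - c00) / ((c01 - c00) + (c10 - c11)).
    """
    gain_if_0 = cost[0][1] - cost[0][0]  # extra cost of a false positive
    gain_if_1 = cost[1][0] - cost[1][1]  # extra cost of a false negative
    return gain_if_0 / (gain_if_0 + gain_if_1)


def predict(probs: Sequence[float], threshold: float) -> List[int]:
    """Turn class-1 probabilities into labels using the given threshold."""
    return [1 if p >= threshold else 0 for p in probs]
```

With equal costs the rule reduces to the usual 0.5 threshold; as false negatives become more expensive, the threshold drops and the classifier predicts the costly class more readily.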
- Keywords:
- Classification ; Machine Learning ; Stacked Generalization ; Ensemble Learning ; Cost-Sensitive Learning ; Cascade Generalization ; Class Imbalance Problem