Concept Drift Detection in Spam Filtering

Nosrati, Leili | 2011

  1. Type of Document: M.Sc. Thesis
  2. Language: English
  3. Document No: 41735 (52)
  4. University: Sharif University of Technology, International Campus, Kish Island
  5. Department: Science and Engineering
  6. Advisor(s): Beigy, Hamid
  7. Abstract:
  8. Concept drift is an online learning problem in which the target concept changes over time. These changes must be monitored and their implications for learning recognized. Spam filtering is a typical application that needs concept drift detection: an effective spam filter must handle changes in the user’s criteria for filtering spam, changes in message topics, and changes introduced by the people who send spam. This thesis addresses spam detection in a system that receives emails sequentially and learns from them one by one. Because the concept underlying the emails may change, the filter must be able to identify and learn the new concept. To this end, a concept drift detection algorithm based on ensemble learning is proposed. The proposed method, named DWM-CDD, monitors the predictive accuracy of a single online classifier and detects significant decreases in that accuracy, which are caused by concept drift. When concept drift is detected, the online classifier is reinitialized to prepare for learning the next concept. Experimental results show that the proposed algorithm detects concept drift more quickly and accurately than related algorithms and performs well in learning and detecting concept drift on both synthetic and real datasets.
  9. Keywords:
  10. Spam ; Concept Drift ; Online Learning ; Machine Learning ; Spam Filtering
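
The mechanism outlined in the abstract (track the accuracy of one online classifier, flag a significant drop, then reinitialize the classifier) can be illustrated with a rough sketch. This is not the thesis's DWM-CDD algorithm, whose details the record does not give. It swaps in a simple sliding-window accuracy check, and every name here (MajorityClassLearner, AccuracyDropMonitor, run_stream, window, drop) is a hypothetical chosen for illustration.

```python
from collections import Counter, deque


class MajorityClassLearner:
    """Toy online classifier (assumption): predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        if not self.counts:
            return 0  # arbitrary default before any label has been seen
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y):
        self.counts[y] += 1


class AccuracyDropMonitor:
    """Flags drift when windowed accuracy falls `drop` below its best value.

    Illustrative stand-in only; the threshold rule is an assumption,
    not the detection test used in the thesis.
    """

    def __init__(self, make_learner, window=20, drop=0.25):
        self.make_learner = make_learner
        self.window = window
        self.drop = drop
        self.learner = make_learner()
        self.recent = deque(maxlen=window)  # correctness of the last `window` predictions
        self.best = 0.0                     # best windowed accuracy for the current concept
        self.seen = 0
        self.drift_points = []              # 1-based positions where drift was flagged

    def process(self, x, y):
        # Test-then-train: predict first, then learn from the true label.
        correct = self.learner.predict(x) == y
        self.recent.append(correct)
        self.learner.learn(x, y)
        self.seen += 1

        if len(self.recent) == self.window:
            accuracy = sum(self.recent) / self.window
            self.best = max(self.best, accuracy)
            if self.best - accuracy >= self.drop:
                # Significant accuracy drop: record it and start over
                # with a fresh classifier for the next concept.
                self.drift_points.append(self.seen)
                self.learner = self.make_learner()
                self.recent.clear()
                self.best = 0.0
        return correct


def run_stream(stream, window=20, drop=0.25):
    """Feed (features, label) pairs one by one and return the monitor."""
    monitor = AccuracyDropMonitor(MajorityClassLearner, window=window, drop=drop)
    for x, y in stream:
        monitor.process(x, y)
    return monitor
```

On a stream whose label distribution flips abruptly, a monitor like this would be expected to flag drift a few examples after the change and then recover with the fresh classifier. A larger window gives steadier accuracy estimates at the cost of slower detection.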
