
Concept Drift Detection in Data Streams Using Ensemble Classifiers

Dehghan, Mahdie | 2012

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 42627 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Beigy, Hamid
  7. Abstract: Concept drift is a challenging problem in data stream processing. As applications of data streams have grown, including network intrusion detection, weather forecasting, and detection of unconventional behavior in financial transactions, numerous studies have addressed concept drift detection. An ideal detection method should identify a variety of changes quickly and correctly and adapt quickly to new concepts, all under limited memory and processing power. This thesis proposes a new explicit concept drift detection method for data streams based on ensemble classifiers. The method processes samples one by one and monitors the error of the ensemble classifier to detect concept drift. When a drift is detected, a batch classifier is built on the new concept and added to the existing batch classifiers. The thesis also proposes an ensemble learning method for data stream classification that uses the proposed drift detection method and introduces a new approach to weighting and combining the outputs of the classifiers. The proposed method is evaluated on synthetic and real datasets and compared with other existing methods. The results show that it can detect and adjust to concept drifts of different speeds and severities. Compared with existing methods, its advantages are earlier drift detection, more correct detections, fewer false detections, and a smaller loss of accuracy after a drift. In particular, the method achieves noticeable results in detecting fast, severe concept drifts. In addition to these advantages, the proposed classification method, when used with the proposed detection method, achieves higher accuracy than the other methods.
  8. Keywords: Data Stream; Classification; Ensemble Learning; Concept Drift; Online Learning
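
The loop described in the abstract (test each sample, monitor the ensemble's error, and add a batch classifier when a drift is flagged) can be illustrated with a minimal sketch. This is not the thesis's algorithm: the class names `CentroidClassifier` and `DriftAwareEnsemble`, the parameters `window` and `threshold`, the windowed error-rate test, the choice to train on the newest half of the window, and the accuracy-based weighting are all assumptions made for illustration.

```python
from collections import deque


class CentroidClassifier:
    """Toy batch learner (hypothetical): predicts the class with the nearest feature mean."""

    def __init__(self, batch):
        sums, counts = {}, {}
        for x, y in batch:
            sums[y] = sums.get(y, 0.0) + x
            counts[y] = counts.get(y, 0) + 1
        self.centroids = {y: sums[y] / counts[y] for y in sums}

    def predict(self, x):
        return min(self.centroids, key=lambda y: abs(x - self.centroids[y]))


class DriftAwareEnsemble:
    """Illustrative only: flags drift when the windowed ensemble error rate exceeds a threshold."""

    def __init__(self, window=30, threshold=0.5):
        self.window = window
        self.threshold = threshold            # assumed detection rule, not the thesis's test
        self.members = []                     # one batch classifier per learned concept
        self.recent = deque(maxlen=window)    # latest labelled samples
        self.errors = deque(maxlen=window)    # 1 where the ensemble was wrong
        self.drift_points = []
        self.seen = 0

    def predict(self, x):
        # Weight each member by its accuracy on the recent window (assumed scheme).
        votes = {}
        for clf in self.members:
            acc = sum(clf.predict(xi) == yi for xi, yi in self.recent) / len(self.recent)
            label = clf.predict(x)
            votes[label] = votes.get(label, 0.0) + acc
        return max(votes, key=votes.get)

    def update(self, x, y):
        """Test on (x, y) first, then learn from it. Returns True if a drift was flagged."""
        self.seen += 1
        if self.members:
            self.errors.append(int(self.predict(x) != y))
        self.recent.append((x, y))
        if not self.members:
            # Bootstrap: train the first classifier once a full window is available.
            if len(self.recent) == self.window:
                self.members.append(CentroidClassifier(self.recent))
            return False
        window_full = len(self.errors) == self.window
        if window_full and sum(self.errors) / self.window > self.threshold:
            # Train on the newest half of the window, which is most likely post-drift.
            newest = list(self.recent)[-(self.window // 2):]
            self.members.append(CentroidClassifier(newest))
            self.errors.clear()
            self.drift_points.append(self.seen)
            return True
        return False


# Synthetic stream: the labelling rule flips abruptly after 100 samples.
ens = DriftAwareEnsemble(window=30, threshold=0.5)
for i in range(200):
    x = 0.2 if i % 2 == 0 else 0.8
    y = int(x >= 0.5) if i < 100 else int(x < 0.5)
    ens.update(x, y)

print("drift flagged at samples:", ens.drift_points)
print("ensemble size:", len(ens.members))
```

The sketch keeps old classifiers in the ensemble and lets the weighting decide how much each one counts, which mirrors the abstract's statement that the new classifier is added to the existing ones. A fixed error-rate threshold is the simplest possible trigger and reacts slowly on purpose here; a real detector would use a statistically grounded test.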


 Bookmark

  • Introduction
    • Objective of the thesis
    • Structure of the thesis
  • Concept drift in data streams
    • Data streams
    • Concept drift
      • Causes of concept drift
    • Types of concept drift
    • Concept drift detection
    • Challenges in concept drift detection
    • Applications of concept drift detection
    • Summary
  • Existing concept drift detection methods
    • Taxonomy of concept drift handling methods
    • Concept drift detection methods
      • Concept drift detection methods for single classifiers
      • Concept drift detection methods using ensemble classifiers
    • Ensemble learning methods
    • Summary
  • Proposed method for concept drift detection
    • Proposed method
    • Testing phase and computing the ensemble classifier output
    • Computing the weights of the base classifiers
    • Concept drift detection
    • Removing a base classifier
    • Classifier training phase
    • Summary
  • Implementation and evaluation
    • Datasets studied
      • Synthetic datasets
      • Real datasets
    • Evaluation criteria
    • Experiments and results
      • Results on synthetic datasets
      • Results on real datasets
    • Performance of the proposed method in the presence of noise
    • Analysis of the parameters of the proposed method
      • Effect of the data window size parameter (W)
      • Effect of the number-of-errors parameter (E)
      • Effect of the distance-between-errors parameter (D)
      • Effect of the confidence level parameter ()
    • Summary
  • Conclusion and future work
    • Conclusion
    • Future work
  • References