
Concept Drift Handling in Data Stream using Domain Adaptation Approach

Karimian, Mahmood | 2024

  1. Type of Document: Ph.D. Dissertation
  2. Language: Farsi
  3. Document No: 57578 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Beigy, Hamid
  7. Abstract: The growing volume of data generated across diverse platforms calls for robust methods for data stream classification. Prediction on data streams is particularly challenging because concepts evolve and because processing time and memory are limited. Concept drift, a change in the data distribution over time, significantly degrades prediction accuracy. This dissertation studies data stream prediction and implicit concept drift handling through a domain adaptation approach, in two scenarios. In the first scenario, multiple sources contribute to the stream, and we employ two different domain adaptation approaches; the aim is to demonstrate, both theoretically and practically, that domain adaptation is effective for handling concept drift. We analyze this scenario under two assumptions: that true labels become available immediately after prediction, and that true labels arrive with a delay, which poses an unsupervised learning challenge. The second scenario covers real-world applications in which storing historical data is infeasible. Here neither true labels nor historical data are accessible, and we tackle concept drift in a test-time adaptation setting using deep learning models such as Vision Transformers, proposing a domain adaptation approach for continual learning. We conduct a series of experiments for each scenario to evaluate the proposed methods. For the first scenario, we provide theoretical insight into the behavior of CDDA (our first proposed method) by deriving its generalization bounds for data stream prediction, and extensive experiments on synthetic and real-world data streams confirm its effectiveness. For the second scenario, comprehensive experiments on continual image classification benchmarks in non-stationary environments show that CPT4 (our second proposed method) significantly outperforms the original Vision Transformer model across various adaptation strategies.
  8. Keywords: Data Stream Classification ; Domain Adaptation ; Continual Learning ; Concept Drift ; Test-Time Training
