Management of Classifiers Pool in Data Stream Classification Using Probabilistic Graphical Models

Talebi, Hesamoddin | 2015

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 47108 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Beigy, Hamid
  7. Abstract:
  8. Concept drift is common in data streams: the distribution from which the data are generated changes over time, for example because of environmental changes. This phenomenon poses a serious challenge for classification. Recent studies that keep a pool of classifiers, each modeling one concept, have achieved promising results. Storing previously used classifiers in a pool makes it possible to exploit prior knowledge of a concept when it recurs. Most methods presented so far introduce a similarity measure between the current and past concepts and select the closest stored concept as the current one. These methods do not consider possible relations and dependencies between observed concepts. This thesis presents an improved method that models these dependencies. To achieve this, we propose using high-order Markov chains. The large number of parameters in an n-order Markov chain discourages its use in practice, so we use one of its approximations that does not suffer from this problem but can still capture most of the dependencies. In addition, to combine our knowledge of the relations between concepts with the similarity measures provided by other methods, we present a discriminative classification model inspired by conditional random fields. This model can handle data streams according to how strongly their concepts relate to each other. Experimental results on real and synthetic data streams demonstrate that the proposed method models dependencies successfully and applies this knowledge in the classification task. For non-dependent concepts and even non-recurring drifts, our method also delivers acceptable results, which indicates that it generalizes well to all kinds of concept drift.
  9. Keywords:
  10. Data Stream ; Conditional Random Fields (CRF) ; Recurring Concept Drift ; Classifiers Pool ; Probabilistic Graphical Models
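
The abstract's point about the parameter count of an n-order Markov chain can be made concrete with a small sketch. The abstract does not say which approximation the thesis uses. The mixture transition distribution (MTD) model appears below only as one well-known example of a low-parameter approximation, and the function names and the choice of 5 concepts are illustrative assumptions, not taken from the thesis.

```python
def full_chain_params(num_concepts: int, order: int) -> int:
    """Free parameters of a full order-n Markov chain over K concepts."""
    # One categorical distribution (K - 1 free values) for each of the
    # K**n possible histories of length n.
    return num_concepts ** order * (num_concepts - 1)


def mtd_params(num_concepts: int, order: int) -> int:
    """Free parameters of an MTD model (illustrative approximation only)."""
    # One shared K x K transition matrix with K * (K - 1) free values,
    # plus n lag weights that sum to 1 (n - 1 free values).
    return num_concepts * (num_concepts - 1) + (order - 1)


if __name__ == "__main__":
    # Hypothetical pool of 5 stored concepts.
    for order in (1, 2, 3, 5):
        print(order, full_chain_params(5, order), mtd_params(5, order))
```

The full chain grows exponentially with the order, while the approximation grows linearly, which is why an approximation of this kind is practical when only a limited number of concept transitions has been observed.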
