Approximation Algorithms for Clustering Points in the Data Stream Model
Hatami Varzaneh, Behnam | 2015
- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 47499 (19)
- University: Sharif University of Technology
- Department: Computer Engineering
- Advisor(s): Zarrabi Zadeh, Hamid
- Abstract:
- The k-center problem, which asks for k congruent balls of minimum radius that cover a given set of points, is a well-known clustering model in computer science with a wide range of applications, and it is NP-hard. As data sizes grow, the data stream model of the problem becomes increasingly relevant; moreover, in real-world applications where input points are noisy, it is important to account for outliers. In this thesis, we study the 1-center and 2-center problems with outliers in high-dimensional data streams in Euclidean space. We provide a 1.7-approximation streaming algorithm for 1-center with z outliers (for constant z), which improves on the previous 1.73-approximation algorithm. We also provide a (1.8+ε)-approximation streaming algorithm for the 2-center problem with outliers, improving upon the previous (4+ε)-approximation algorithm. The space complexity and update time of both algorithms are poly(z, d, 1/ε), independent of the size of the stream.
- Keywords:
- Clustering ; Streaming Algorithm ; Data Stream Clustering ; Approximate Algorithm ; Data Stream ; K-Center Problem
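
To make the streaming setting in the abstract concrete, the following is a minimal Python sketch of a textbook baseline for the 1-center problem without outliers. It is not the thesis's algorithm: it fixes the first point of the stream as the center and tracks the largest distance seen, which gives a 2-approximation of the optimal radius using O(d) space. The names `StreamingOneCenter` and `cover_stream` are hypothetical and chosen only for illustration.

```python
import math
from typing import Iterable, Optional, Sequence, Tuple

Point = Tuple[float, ...]


class StreamingOneCenter:
    """Baseline streaming 1-center (no outliers), assumed for illustration.

    The first point becomes the center. Any ball of radius r* that covers
    all points also contains the first point, so every point lies within
    2 * r* of it. The tracked radius is therefore at most twice optimal.
    """

    def __init__(self) -> None:
        self.center: Optional[Point] = None
        self.radius: float = 0.0

    def update(self, point: Sequence[float]) -> None:
        # Store only the first point; later points just grow the radius.
        if self.center is None:
            self.center = tuple(point)
            return
        self.radius = max(self.radius, math.dist(self.center, point))


def cover_stream(points: Iterable[Sequence[float]]) -> Tuple[Optional[Point], float]:
    """Process the stream in one pass and return (center, radius)."""
    summary = StreamingOneCenter()
    for p in points:
        summary.update(p)
    return summary.center, summary.radius
```

For example, on the stream `(0, 0), (2, 0), (1, 1)` the sketch keeps `(0, 0)` as the center, whereas the optimal ball is centered at `(1, 0)` with radius 1. Handling z outliers and reaching factors such as 1.7 requires the more involved techniques developed in the thesis, which this sketch does not attempt to reproduce.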