- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 56511 (19)
- University: Sharif University of Technology
- Department: Computer Engineering
- Advisor(s): Zarrabizadeh, Hamid
- Abstract:
- Clustering is a fundamental problem in data analysis with many variants. This thesis focuses on the k-center problem, one of the most popular and well-studied variants of clustering. In this problem we are given a set of points X in a metric space and a parameter k ⩽ |X|. The goal is to find a set of k centers in X that minimizes the maximum distance from any point of X to its closest center. We study a harder version of the problem with an extra parameter z, the maximum number of points that may be left unclustered; these points are referred to as outliers. The growth of data that needs to be processed calls for strategies that remain efficient on large inputs. Massively parallel computation (MPC) is one such model, and it is the model used in this thesis. The best known approximation factor for the problem in the MPC model is 13, and we improve it to 11 + ϵ.
- Keywords:
- Clustering ; Outliers ; Massively Parallel Computation ; Outlier Removal ; K-Center Problem
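The objective described in the abstract can be illustrated with a small sketch. The code below only evaluates the cost of a given set of centers when up to z points may be ignored as outliers; it is not the thesis's MPC algorithm, and the function name `kcenter_outlier_cost`, the use of Euclidean distance, and the sample points are assumptions made for illustration.

```python
import math

def kcenter_outlier_cost(points, centers, z):
    """Smallest radius covering all but the z farthest points (hypothetical helper)."""
    # Distance from each point to its closest center, in ascending order.
    dists = sorted(min(math.dist(p, c) for c in centers) for p in points)
    # Ignore the z largest distances: those points are treated as outliers.
    kept = dists[:max(len(dists) - z, 0)]
    return kept[-1] if kept else 0.0

# Illustrative data: two tight groups plus one far-away point.
points = [(0, 0), (1, 0), (10, 0), (11, 0), (100, 100)]
centers = [(0, 0), (10, 0)]  # k = 2

cost_with_outlier = kcenter_outlier_cost(points, centers, z=1)
cost_without = kcenter_outlier_cost(points, centers, z=0)
print(cost_with_outlier, cost_without)
```

With z = 1 the far-away point is discarded and the cost is the radius of the two tight groups; with z = 0 that single point dominates the cost, which is why allowing outliers changes the problem.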
