Active Constraint Clustering by Instance-level Constraint Ranking Using Estimated Cluster Boundaries
Abbasi, Mohammad Javad | 2017
- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 49384 (19)
- University: Sharif University of Technology
- Department: Computer Engineering
- Advisor(s): Beigy, Hamid
- Abstract:
- Given the rapid and ever-increasing growth of data, clustering algorithms have become key tools for data analysis in recent research. Clustering partitions data into groups such that similar data points fall in the same cluster. Each algorithm clusters according to a set of initial assumptions, without knowing the shape of the clusters or the purpose of the clustering. Hence, when the initial assumptions do not match the clustering objective, the algorithm cannot be expected to produce adequate results. Exploiting side information can play an important role in bringing the real model of the data into clustering algorithms. Constrained clustering methods use instance-level side information, in the form of must-link and cannot-link constraints, to generate or modify the initial assumptions of a clustering algorithm. Because not every set of constraints improves clustering performance, active constraint selection aims to choose the subsets of constraints that are most informative to the clustering algorithm. This thesis investigates active constrained clustering by instance-level constraint ranking using estimated cluster boundaries. The aim is to select beneficial constraints from boundary points according to the current state of the clustering algorithm, and to improve the clustering iteratively using the resulting set of useful constraints. To evaluate the proposed method, its final performance was examined on several real-world datasets and compared with related active constraint selection methods. The desired performance of the proposed method on the different datasets was analyzed.
- Keywords:
- Ranking ; Constrained Clustering ; Side Information ; Algorithm ; Clustering ; Must-Link Constraint ; Cannot-Link Constraint ; Active Selection