- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 47456 (19)
- University: Sharif University of Technology
- Department: Computer Engineering
- Advisor(s): Zarrabi-Zadeh, Hamid
- Abstract:
- The near-neighbour problem is as follows: given a set of n points, build a data structure that, for any query point, can quickly find all points within distance r of the query point. The problem has applications in various areas of computer science, such as data mining, pattern recognition, databases, and search engines. An important factor is the number of points to report. If this number is too small, the answers may be too homogeneous (all similar to the query point) and therefore convey little useful information. On the other hand, if the number of reported points is too large, informativeness again decreases because of the output size. Therefore, in recent years, a considerable amount of work has been done on diversity-aware search, in which the goal is to find answers that are both relevant (close to the query point) and diverse (distant from each other). Among the most well-studied diversity measures is remote-edge, whose objective is to maximize the minimum pairwise distance in the selected subset. This problem is known to be NP-hard, and therefore the focus has been on finding approximate solutions. The best known algorithm for this problem achieves an approximation factor of 6 using the composable coresets framework. In this thesis, we present composable coresets with near-optimal approximation factors for several notions of diversity, including remote-clique, remote-cycle, and remote-tree. We also prove a general lower bound on the approximation factor of composable coresets for a large class of diversity maximization problems.
- Keywords:
- Near Neighbor ; Approximate Algorithm ; Computational Geometry ; Diversity Theory ; Core Sets
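
To make the definitions in the abstract concrete, the following is a minimal Python sketch of a near-neighbour query and of two diversity objectives. It is not the thesis's method: it uses brute-force scans and exhaustive subset search rather than composable coresets, and it is only practical for tiny inputs. All function names are hypothetical. The remote-clique objective is assumed here to be the sum of pairwise distances, as it is usually defined in the diversity maximization literature; the abstract itself only defines remote-edge.

```python
import math
from itertools import combinations


def near_neighbours(points, query, r):
    """Brute-force scan: return every point within distance r of the query."""
    return [p for p in points if math.dist(p, query) <= r]


def remote_edge(subset):
    """Remote-edge diversity: the minimum pairwise distance in the subset."""
    return min(math.dist(p, q) for p, q in combinations(subset, 2))


def remote_clique(subset):
    """Remote-clique diversity (assumed definition): sum of pairwise distances."""
    return sum(math.dist(p, q) for p, q in combinations(subset, 2))


def most_diverse(candidates, k, measure):
    """Exhaustive search over all k-subsets; exponential time, illustration only."""
    return max(combinations(candidates, k), key=measure)


# Illustrative data, not taken from the thesis.
points = [(0, 0), (1, 0), (0, 1), (3, 4), (1, 1), (10, 10)]
near = near_neighbours(points, query=(0, 0), r=2)
best = most_diverse(near, k=3, measure=remote_edge)
```

Here `near` holds the points that are relevant to the query, and `best` is the size-3 subset of them with the largest minimum pairwise distance. Passing `measure=remote_clique` instead selects by total pairwise distance.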