Randomized algorithms for comparison-based search

Tschopp, D ; Sharif University of Technology

  1. Type of Document: Article
  2. Abstract:
  3. This paper addresses the problem of finding the nearest neighbor (or one of the R-nearest neighbors) of a query object q in a database of n objects when only a comparison oracle is available. Given two reference objects and a query object, the comparison oracle returns the reference object most similar to the query object. The main problem we study is how to search the database for the nearest neighbor (NN) of a query while minimizing the number of questions asked. The difficulty of this problem depends on properties of the underlying database. We show the importance of one characterization: the combinatorial disorder D, which defines approximate triangle inequalities on ranks. We present a lower bound of $\Omega(D \log(n/D) + D^2)$ on the average number of questions in the search phase for any randomized algorithm, which demonstrates the fundamental role of D for worst-case behavior. We develop a randomized scheme for NN retrieval in $O(D^3 \log^2 n + D \log^2 n \log\log n^{D^3})$ questions. The learning phase requires asking $O(n D^3 \log^2 n + D \log^2 n \log\log n^{D^3})$ questions and $O(n \log^2 n / \log(2D))$ bits to store.
  4. Keywords:
  5. Average numbers ; Lower bounds ; Nearest neighbors ; Query object ; Randomized Algorithms ; Reference objects ; Triangle inequality ; Algorithms ; Database systems
  6. Source: Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011, 12 December 2011 through 14 December 2011 ; December 2011 ; 9781618395993 (ISBN)
  7. URL: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.231.315
  8. URL: http://papers.nips.cc/paper/4381-randomized-algorithms-for-comparison-based-search.pdf
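
The comparison-oracle setting described in the abstract can be illustrated with a minimal sketch. This is not the randomized scheme from the paper: it is only the naive baseline that scans the whole database and asks n - 1 questions, which is the cost the paper's search phase aims to improve on. The names `make_distance_oracle` and `linear_scan_nn`, the question counter, and the use of a numeric distance to simulate the oracle are all assumptions made for illustration.

```python
from typing import Callable, Dict, Sequence, Tuple, TypeVar

T = TypeVar("T")
# An oracle takes two reference objects and a query, and returns
# whichever reference object is more similar to the query.
Oracle = Callable[[T, T, T], T]


def make_distance_oracle(
    dist: Callable[[T, T], float]
) -> Tuple[Oracle, Dict[str, int]]:
    """Simulate a comparison oracle from a hidden distance (hypothetical helper).

    The search code never sees distances, only the oracle's answers.
    The returned counter tracks how many questions were asked.
    """
    counter = {"questions": 0}

    def oracle(a: T, b: T, q: T) -> T:
        counter["questions"] += 1
        return a if dist(a, q) <= dist(b, q) else b

    return oracle, counter


def linear_scan_nn(database: Sequence[T], q: T, oracle: Oracle) -> T:
    """Naive baseline: keep the current winner and compare it with each object."""
    best = database[0]
    for candidate in database[1:]:
        best = oracle(best, candidate, q)
    return best


# Toy database of points on a line; the oracle hides the absolute difference.
points = [10.0, 3.5, 7.2, 4.1, 9.9]
oracle, counter = make_distance_oracle(lambda x, y: abs(x - y))
nearest = linear_scan_nn(points, 4.0, oracle)
print(nearest, counter["questions"])
```

The baseline always asks exactly n - 1 questions and needs no learning phase. The bounds quoted in the abstract concern schemes that spend questions and storage up front so that each later query needs a number of questions that grows only polylogarithmically in n when the disorder D is small.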