Improved MPC algorithms for edit distance and Ulam distance

Boroujeni, M ; Sharif University of Technology | 2019

  1. Type of Document: Article
  2. DOI: 10.1145/3323165.3323205
  3. Publisher: Association for Computing Machinery, 2019
  4. Abstract:
  5. Edit distance is one of the most fundamental problems in combinatorial optimization. Ulam distance is a special case of edit distance where no character is allowed to appear more than once in a string. Recent developments have been very fruitful for obtaining fast and parallel algorithms for both edit distance and Ulam distance. In this work, we present an almost optimal MPC algorithm for Ulam distance and improve MPC algorithms for edit distance. Our algorithm for Ulam distance is optimal in the sense that (1) the approximation factor of our algorithm is 1 + ϵ, (2) the round complexity of our algorithm is constant, (3) the total memory of our algorithm is almost linear (Õ(n)), and (4) the overall running time of our algorithm is almost linear, which is the best known for Ulam distance. Similar to edit distance and longest common subsequence (LCS), which are considered as dual problems, Ulam distance and longest increasing subsequence (LIS) are also seen as dual problems. LIS is equivalent to a special case of LCS where each string can contain each character at most once. In that sense, our result for Ulam distance complements the work of Im et al., wherein a similar result is presented for LIS. We also improve the work of Hajiaghayi et al. for edit distance in terms of total memory. The best previously known MPC algorithm for edit distance requires Õ(n^{2x}) machines when the memory of each machine is bounded by Õ(n^{1−x}). In this work, we improve the number of machines to Õ(n^{1.75x}) while keeping the memory limit intact. Moreover, the round complexity of our algorithm is constant and the total running time of our algorithm is truly subquadratic. However, our improvement comes at the expense of a constant factor in the approximation guarantee of the algorithm. This improvement is inspired by the recent techniques of Boroujeni et al. and Chakraborty et al. for obtaining truly subquadratic time algorithms for edit distance.
  6. Keywords:
  7. Approximation algorithms ; Edit distance ; MapReduce ; Parallel algorithms ; Ulam distance ; Combinatorial optimization ; Computational complexity ; Memory architecture ; Approximation factor ; Constant factors ; Longest common subsequences ; Longest increasing subsequences ; Map reduce ; Round complexity
  8. Source: 31st ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2019, 22 June 2019 through 24 June 2019 ; 2019, Pages 31-40 ; 9781450361842 (ISBN)
  9. URL: https://dl.acm.org/doi/10.1145/3323165.3323205
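
The abstract notes that LIS is equivalent to a special case of LCS in which each string contains each character at most once. The following is a minimal sequential sketch of that equivalence, not the MPC algorithm from the paper: the function names (lis_length, lcs_length, lcs_via_lis) are hypothetical and chosen only for illustration, and the code assumes both inputs are repetition-free strings.

```python
from bisect import bisect_left


def lis_length(seq):
    """Length of the longest strictly increasing subsequence (patience sorting)."""
    tails = []  # tails[k] = smallest possible tail of an increasing subsequence of length k + 1
    for value in seq:
        i = bisect_left(tails, value)
        if i == len(tails):
            tails.append(value)
        else:
            tails[i] = value
    return len(tails)


def lcs_length(a, b):
    """Textbook quadratic dynamic program for the longest common subsequence."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            if ca == cb:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def lcs_via_lis(a, b):
    """LCS of two repetition-free strings, computed as an LIS (illustrative assumption)."""
    if len(set(a)) != len(a) or len(set(b)) != len(b):
        raise ValueError("both strings must contain each character at most once")
    position_in_a = {ch: i for i, ch in enumerate(a)}
    # Map each character of b to its position in a, skipping characters absent from a.
    # A common subsequence corresponds exactly to an increasing run of positions.
    return lis_length([position_in_a[ch] for ch in b if ch in position_in_a])
```

On repetition-free inputs, lcs_via_lis(a, b) and lcs_length(a, b) should agree, while the LIS route runs in O(n log n) time instead of quadratic time.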