Improved MPC algorithms for Edit distance and Ulam distance

Boroujeni, M ; Sharif University of Technology | 2021

  1. Type of Document: Article
  2. DOI: 10.1109/TPDS.2021.3076534
  3. Publisher: IEEE Computer Society , 2021
  4. Abstract:
  5. Edit distance is one of the most fundamental problems in combinatorial optimization to measure the similarity between strings. Ulam distance is a special case of edit distance where no character is allowed to appear more than once in a string. Recent developments have been very fruitful for obtaining fast and parallel algorithms for both edit distance and Ulam distance. In this work, we present an almost optimal MPC (massively parallel computation) algorithm for Ulam distance and improve MPC algorithms for edit distance. Our algorithm for Ulam distance is almost optimal in the sense that (1) the approximation factor of our algorithm is $1+\epsilon$, (2) the round complexity of our algorithm is constant, (3) the total memory of our algorithm is almost linear ($\widetilde{O}_\epsilon(n)$), and (4) the overall running time of our algorithm is almost linear, which is the best known for Ulam distance. We also improve the work of Hajiaghayi et al. for edit distance in terms of total memory. The best previously known MPC algorithm for edit distance requires $\widetilde{O}(n^{2x})$ machines when the memory of each machine is bounded by $\widetilde{O}(n^{1-x})$. In this work, we improve the number of machines to $\widetilde{O}(n^{(9/5)x})$ while keeping the memory limit intact. Moreover, the round complexity of our algorithm is constant and the total running time of our algorithm is truly subquadratic. However, our improvement comes at the expense of a constant factor in the approximation guarantee of the algorithm. This improvement is inspired by the recent techniques of Boroujeni et al. and Chakraborty et al. for obtaining truly subquadratic time algorithms for edit distance. © 1990-2012 IEEE
  6. Keywords:
  7. Approximation algorithms ; Combinatorial optimization ; Approximation factor ; Constant factors ; Edit distance ; Massively parallels ; Round complexity ; Running time ; Computational complexity
  8. Source: IEEE Transactions on Parallel and Distributed Systems ; Volume 32, Issue 11 , 2021 , Pages 2764-2776 ; 10459219 (ISSN)
  9. URL: https://ieeexplore.ieee.org/document/9419758
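
Note on the two distances named in the abstract: the following is a minimal sketch of the textbook quadratic-time dynamic program for edit distance, with Ulam distance obtained by restricting the inputs to strings without repeated characters. It is not the MPC algorithm from the article, which is approximate and distributed. The function names `edit_distance` and `ulam_distance` are hypothetical and chosen only for illustration.

```python
def edit_distance(a, b):
    """Exact edit distance via the classic O(len(a) * len(b)) dynamic program.

    Allowed operations: insert, delete, or substitute one character.
    Illustrative only; this is the sequential baseline, not the article's method.
    """
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def ulam_distance(a, b):
    """Edit distance restricted to inputs in which no character repeats."""
    if len(set(a)) != len(a) or len(set(b)) != len(b):
        raise ValueError("Ulam distance requires strings without repeated characters")
    return edit_distance(a, b)


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
    print(ulam_distance("abcde", "bcdea"))
```

The sketch keeps only two rows of the table, so it uses linear memory but still quadratic time; the article's contribution concerns doing better than this baseline in the MPC setting.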