Distributed Computing in Heterogeneous Environments using Coded Redundancy

Sadeghi Arkami, Homayoun | 2025

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 58049 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Zarrabizadeh, Hamid
  7. Abstract:
  8. As data volumes grow rapidly, implementing algorithms in a distributed or parallel fashion has become a common strategy for speeding up processing by leveraging multiple computing nodes. However, because these nodes differ in computational power, the overall performance of such algorithms is often dictated by the slowest machines. To overcome this limitation, approaches based on task replication, computational load balancing, intelligent data distribution, and polynomial-based coding have been proposed. These methods allow the final result to be obtained from the responses of only a subset of the machines, rather than requiring outputs from all of them. In other words, instead of distributing the data directly in the traditional way, the operations themselves are encoded and executed in a distributed manner, which improves both performance and fault tolerance in the presence of straggler nodes or node failures. Even when some nodes are delayed, the result can still be recovered quickly and accurately from the partial results of the other nodes. Such approaches are widely used in machine learning tasks and in computational algorithms such as matrix multiplication. In this research, following a comprehensive review of existing coded distributed computing methods, particularly distributed matrix multiplication, and an identification of the minimum number of healthy nodes required in these heterogeneous environments, we investigate polynomial coding as a means of mitigating the impact of straggler nodes through redundancy. Building on insights from evaluations of similar coding schemes, we then design and introduce a distributed algorithm based on coded redundancy to solve the high-dimensional diameter problem.
Compared to previous algorithms with a cost of O(n^2 d), the proposed algorithm can, under certain conditions, surpass this complexity bound and achieve a more efficient running time.
  9. Keywords:
  10. Polynomial Codes ; Points Diameter ; Matrix Multiplication ; Heterogeneous Distributed Environments ; Straggler Nodes ; Distributed Computing
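
The abstract's central idea, recovering a result from only a subset of workers by means of polynomial coding, can be illustrated with a small sketch. This is not the thesis's algorithm: it is a simplified, assumed setup for a coded matrix-vector product, in which the matrix A is split into k row blocks, each of n workers receives one evaluation of the polynomial A_0 + A_1 z + ... + A_(k-1) z^(k-1), and any k responses suffice to decode. The function names, the evaluation points 1..n, and the choice of exact rational arithmetic are all illustrative assumptions.

```python
from fractions import Fraction


def matvec(M, x):
    """Plain matrix-vector product on lists of lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def encode(blocks, point):
    """Evaluate the matrix polynomial sum_j blocks[j] * point**j at one point."""
    rows, cols = len(blocks[0]), len(blocks[0][0])
    return [[sum(blocks[j][r][c] * point ** j for j in range(len(blocks)))
             for c in range(cols)] for r in range(rows)]


def solve_vandermonde(points, values):
    """Recover polynomial coefficients (vectors) from k evaluations.

    Uses Gauss-Jordan elimination over exact fractions; distinct points
    guarantee that the Vandermonde system is invertible.
    """
    k = len(points)
    aug = [[Fraction(z) ** j for j in range(k)] + [Fraction(v) for v in vals]
           for z, vals in zip(points, values)]
    for col in range(k):
        pivot = next(r for r in range(col, k) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = 1 / aug[col][col]
        aug[col] = [e * inv for e in aug[col]]
        for r in range(k):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def coded_matvec(A, x, k, n, responders):
    """Compute A @ x from any k of n coded workers (hypothetical helper).

    Assumes len(A) is divisible by k. `responders` holds the indices of the
    workers that finished; all other workers are treated as stragglers.
    """
    if len(responders) < k:
        raise ValueError("need at least k responding workers to decode")
    size = len(A) // k
    blocks = [A[i * size:(i + 1) * size] for i in range(k)]
    points = list(range(1, n + 1))  # one distinct evaluation point per worker
    # Each responding worker multiplies its coded block by x.
    outputs = {i: matvec(encode(blocks, points[i]), x) for i in responders}
    chosen = sorted(outputs)[:k]  # any k responses are enough
    coeffs = solve_vandermonde([points[i] for i in chosen],
                               [outputs[i] for i in chosen])
    # Coefficient j of the decoded polynomial equals blocks[j] @ x.
    return [v for block in coeffs for v in block]


A = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, 1]
# Workers 0 and 2 are stragglers; workers 1 and 3 alone recover A @ x.
result = coded_matvec(A, x, k=2, n=4, responders={1, 3})
```

With k = 2 blocks and n = 4 workers, the sketch tolerates up to two stragglers. In a heterogeneous setting one would additionally size the coded blocks according to each node's speed, which this sketch does not attempt.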
