Distributed Fault-tolerant Computation for Massive Data
Mahvari Habibabadi, Mohammad Mahdi | 2019
- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 52135 (05)
- University: Sharif University of Technology
- Department: Electrical Engineering
- Advisor(s): Maddah-Ali, Mohammad Ali
- Abstract:
- This thesis considers the problem of distributed computation across many processors, concentrating on matrix multiplication because of its importance. The distributed system consists of N worker processors and one master processor. The master distributes the computation among the workers and, once each worker has finished, collects the results. Straggler processors can delay the overall computation, and we use coding methods to mitigate their effect. We first introduce a method for multiplying any number of matrices together. The method runs in one shot, without any communication between worker processors: the master encodes each matrix and sends the encoded data to the workers, each worker computes its product and sends the result back, and the master must then be able to decode the final result from the results of only a subset of the workers. The main goal of this thesis is to construct methods that minimize the size of this subset. We then show how to extend the approach to computing a multivariate polynomial of matrices; in other words, how to multiply and add an arbitrary number of matrices in a distributed system with straggler processors. In a second problem, we combine two areas, fast matrix multiplication and coding, to obtain better results. Fast matrix multiplication is a long-standing research area with many achievements, and by drawing on it we improve previous results in distributed matrix multiplication. More specifically, we consider K problems, each of which is the multiplication of two matrices. The master encodes the matrices and distributes them among the workers, and each worker sends its result back to the master after finishing its computation. The objective is to decode the final result from a subset of the workers.
- Keywords:
- Distributed Computing ; Fault Tolerance ; Massive Data
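The one-shot pipeline described in the abstract (the master encodes, the workers multiply, the master decodes from a subset of workers) can be illustrated for a single product AB with a polynomial-style code. This is a minimal sketch and not the thesis's own construction: the block split (2 row blocks of A, 2 column blocks of B), the evaluation points 1 to 6, the choice of which workers straggle, and every function name are assumptions made for illustration. Exact rational arithmetic is used so that the decoded result can be compared with the direct product without rounding error.

```python
from fractions import Fraction

def matmul(X, Y):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def lin_comb(blocks, coeffs):
    """Entrywise sum of coeffs[i] * blocks[i] over equally sized blocks."""
    rows, cols = len(blocks[0]), len(blocks[0][0])
    return [[sum(c * blk[r][s] for c, blk in zip(coeffs, blocks))
             for s in range(cols)] for r in range(rows)]

def encode(A_blocks, B_blocks, x):
    """Master side: evaluate the two encoding polynomials at the worker's point x."""
    m = len(A_blocks)
    A_enc = lin_comb(A_blocks, [x ** j for j in range(m)])
    B_enc = lin_comb(B_blocks, [x ** (k * m) for k in range(len(B_blocks))])
    return A_enc, B_enc

def invert(M):
    """Gauss-Jordan inverse over the rationals (M is assumed invertible)."""
    n = len(M)
    aug = [list(map(Fraction, row)) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def decode(points, results, m, n):
    """Master side: interpolate the degree m*n-1 product polynomial from m*n results."""
    V = [[Fraction(x) ** d for d in range(m * n)] for x in points]
    V_inv = invert(V)
    coeffs = [lin_comb(results, V_inv[d]) for d in range(m * n)]
    # The coefficient of x^(j + k*m) is the block A_j * B_k.
    return [[coeffs[j + k * m] for k in range(n)] for j in range(m)]

def assemble(C_blocks):
    """Stitch a grid of blocks back into one matrix."""
    return [sum((blk[r] for blk in block_row), [])
            for block_row in C_blocks for r in range(len(block_row[0]))]

# Illustrative inputs (assumed): A is 4x2, B is 2x4.
A = [[1, 2], [3, 4], [5, 6], [7, 8]]
B = [[1, 0, 2, 1], [0, 1, 1, 3]]
A_blocks = [A[0:2], A[2:4]]                                       # m = 2 row blocks
B_blocks = [[row[0:2] for row in B], [row[2:4] for row in B]]     # n = 2 column blocks

points = [1, 2, 3, 4, 5, 6]           # one distinct evaluation point per worker, N = 6
worker_results = []
for x in points:
    A_enc, B_enc = encode(A_blocks, B_blocks, x)
    worker_results.append(matmul(A_enc, B_enc))   # each worker's single local product

finished = [0, 2, 3, 5]               # assume workers 1 and 4 are stragglers
C_blocks = decode([points[i] for i in finished],
                  [worker_results[i] for i in finished], m=2, n=2)
C_decoded = assemble(C_blocks)
C_direct = matmul(A, B)
```

In this sketch any 4 of the 6 workers are enough to recover AB, because each worker's product is an evaluation of a matrix polynomial of degree 3 and distinct evaluation points give an invertible Vandermonde system. The number of workers the master must wait for is the "size of the subset" that the abstract aims to minimize.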