Design and Implementation of an Offline Scheduling and Resource Allocating Algorithm for Distributed Big Data Stream Processing Systems

Divband, Arman | 2017

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 50747 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Goudarzi, Maziar
  7. Abstract:
  8. Stream processing is one of the most important categories of big data processing: data is processed at the same time as it is produced. One of the best-known stream processing frameworks is Apache Storm. By default, Storm uses a round-robin scheduler to allocate tasks to physical machines. This scheduler assigns tasks without considering either the processing power of the machines or the processing demands of the tasks, so the processing resources cannot be utilized properly. This thesis proposes a scheduling and resource allocation algorithm based on the processing time of each task on the different physical machines; using this information, tasks are assigned to machines so that CPU utilization is maximized on all machines. The proposed algorithm was implemented in Storm 0.95 and evaluated on Storm Micro-Benchmark. The results show that the scheduler improves throughput by 7% to 44% compared to Storm's default scheduler and finds solutions within 4% of the optimum, which is obtained by an exhaustive search of the problem design space for the best task schedule.
  9. Keywords:
  10. Scheduling ; Apache Storm ; Big Data ; Stream Data Processing ; Resource Allocation
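
The contrast drawn in the abstract between round-robin placement and placement based on per-machine processing times can be illustrated with a small sketch. This is not the algorithm from the thesis, which this record does not describe in detail; it is a generic greedy heuristic, and the names `cost`, `round_robin`, `greedy`, `exhaustive` and `bottleneck` are hypothetical. It assumes that `cost[t][m]` is the measured processing time of task `t` on machine `m`, and that the load of the busiest machine is a rough proxy for throughput (lower is better).

```python
from itertools import product

def bottleneck(cost, assignment):
    """Total processing time on the busiest machine for a given assignment."""
    load = [0] * len(cost[0])
    for task, machine in enumerate(assignment):
        load[machine] += cost[task][machine]
    return max(load)

def round_robin(cost):
    """Assign tasks to machines in turn, ignoring processing times."""
    num_machines = len(cost[0])
    return [task % num_machines for task in range(len(cost))]

def greedy(cost):
    """Place each task on the machine that would finish it earliest.

    Tasks are visited from the most to the least expensive (by their
    best-case processing time). This is an illustrative heuristic only.
    """
    num_machines = len(cost[0])
    load = [0] * num_machines
    assignment = [0] * len(cost)
    order = sorted(range(len(cost)), key=lambda t: -min(cost[t]))
    for task in order:
        machine = min(range(num_machines),
                      key=lambda m: load[m] + cost[task][m])
        assignment[task] = machine
        load[machine] += cost[task][machine]
    return assignment

def exhaustive(cost):
    """Try every possible assignment; feasible only for tiny instances."""
    num_machines = len(cost[0])
    return min(product(range(num_machines), repeat=len(cost)),
               key=lambda a: bottleneck(cost, a))

# Six identical tasks; machine 0 is assumed to be twice as fast as machine 1.
cost = [[3, 6]] * 6
rr_load = bottleneck(cost, round_robin(cost))
greedy_load = bottleneck(cost, greedy(cost))
best_load = bottleneck(cost, exhaustive(cost))
print(rr_load, greedy_load, best_load)
```

In this toy instance, round-robin splits the tasks evenly and leaves the slower machine as the bottleneck, while the processing-time-aware heuristic shifts more tasks to the faster machine and matches the exhaustive optimum. The exhaustive search grows exponentially with the number of tasks, which is why it serves only as a reference point for small cases.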
