MapReduce service provisioning for frequent big data jobs on clouds considering data transfers
Nabavinejad, S. M ; Sharif University of Technology | 2018
- Type of Document: Article
- DOI: 10.1016/j.compeleceng.2018.08.005
- Publisher: Elsevier Ltd, 2018
- Abstract:
- Many companies regularly run Big Data analyses and need to optimize their resource usage, considering cost, deadline, and environmental impact simultaneously. The cloud allows choosing from various virtual machines (VMs), where the number and type of VMs affect outcomes such as the time for the data placement and data shuffle phases, a task's energy consumption and execution time, and the makespan of jobs. We provide provisioning and scheduling algorithms to minimize environmental impact, considering the above factors, for frequently executed MapReduce jobs. To mathematically model the problem and obtain the optimal solution, we present an Integer Linear Programming (ILP) model and then continue with two heuristic algorithms. We compare the proposed algorithms against a number of rivals using extensive simulations based on publicly available real-world data. The results demonstrate that our algorithms can achieve near-optimal solutions, e.g., sometimes even within 0.39% of the optimal solution obtained by ILP regarding energy consumption. © 2018 Elsevier Ltd
- Keywords:
- Big data ; MapReduce ; Cloud computing ; Data transfer ; Energy efficiency ; Energy utilization ; Environmental impact ; Green computing ; Heuristic algorithms ; Integer programming ; Job shop scheduling ; Optimal systems ; Scheduling algorithms ; Data placement ; Extensive simulations ; Hadoop ; Integer linear programming models ; Map-reduce ; Near-optimal solutions ; Optimal solutions ; Service provisioning
- Source: Computers and Electrical Engineering ; Volume 71, 2018, Pages 594-610 ; 00457906 (ISSN)
- URL: https://www.sciencedirect.com/science/article/pii/S0045790617317482