A fault tolerant scheduling algorithm for DAG applications in cluster environments

Tabbaa, N ; Sharif University of Technology | 2011

  1. Type of Document: Article
  2. DOI: 10.1007/978-3-642-22389-1_18
  3. Publisher: 2011
  4. Abstract:
  5. Fault tolerance is an essential requirement in systems whose applications must continue executing when some system components fail. In this paper, a fault-tolerant task scheduling algorithm is proposed for mapping task graphs to heterogeneous processing nodes in cluster computing systems. The starting point of the algorithm is a DAG representing an application, together with information about its tasks. This information consists of the execution time of each task on the target system processors, the communication times between tasks that have data dependencies, and the number of processor failures (ε) that the scheduling algorithm should tolerate. The algorithm is based on the active replication scheme, and it schedules ε+1 replicas of each task to achieve the required fault tolerance. Simulation results show the efficiency of the proposed algorithm in spite of its lower complexity.
  6. Keywords:
  7. Active replication ; Cluster computing system ; Cluster environments ; DAG Tasks ; Data dependencies ; Execution time ; Fault tolerant scheduling ; Fault-tolerant ; Heterogeneous processing ; Lower complexity ; Processor failures ; Running applications ; Simulation result ; System components ; Target systems ; Task graph ; Task-scheduling algorithms ; Cluster computing ; Data processing ; Fault tolerance ; Fault tolerant computer systems ; Multitasking ; Scheduling algorithms ; Clustering algorithms
  8. Source: Communications in Computer and Information Science, 7 July 2011 through 9 July 2011 ; Volume 188 CCIS, Issue PART 1, July 2011, Pages 189-199 ; 18650929 (ISSN) ; 9783642223884 (ISBN)
  9. URL: http://link.springer.com/chapter/10.1007%2F978-3-642-22389-1_18
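
The abstract names the inputs (per-processor execution times, communication times between dependent tasks, and the failure count ε) and the active replication idea of placing ε+1 replicas of each task. The following is a minimal illustrative sketch of that idea only, not the paper's algorithm, whose details this record does not give. It assumes a simple greedy rule (earliest finish time), a pessimistic data-ready time, and hypothetical names such as `schedule_with_replicas`, `exec_time`, `comm_time`, and `preds`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def schedule_with_replicas(exec_time, comm_time, preds, eps):
    """Greedy sketch: place eps + 1 replicas of each task on distinct processors.

    exec_time[t][p]   execution time of task t on processor p (heterogeneous)
    comm_time[(u, t)] transfer time from u to t when they run on different processors
    preds[t]          set of tasks that t depends on
    eps               number of processor failures to tolerate
    Returns {task: [(processor, start, finish), ...]}.
    """
    n_procs = len(next(iter(exec_time.values())))
    if eps + 1 > n_procs:
        raise ValueError("need at least eps + 1 processors")

    proc_free = [0.0] * n_procs  # time at which each processor becomes idle
    placement = {}
    graph = {t: preds.get(t, ()) for t in exec_time}

    # Visit tasks so that every predecessor is placed before its successors.
    for t in TopologicalSorter(graph).static_order():
        candidates = []
        for p in range(n_procs):
            # Assumption: be pessimistic and wait for the latest-arriving
            # replica of each predecessor, since any earlier one may have failed.
            ready = 0.0
            for u in graph[t]:
                arrival = max(
                    finish + (0.0 if q == p else comm_time[(u, t)])
                    for q, _, finish in placement[u]
                )
                ready = max(ready, arrival)
            start = max(ready, proc_free[p])
            candidates.append((start + exec_time[t][p], start, p))

        # One candidate per processor, so the eps + 1 best are on distinct processors.
        placement[t] = []
        for finish, start, p in sorted(candidates)[: eps + 1]:
            proc_free[p] = finish
            placement[t].append((p, start, finish))
    return placement
```

With ε = 1 and three processors, each task in the returned mapping has two entries on different processors, so the loss of any single processor still leaves one replica of every task. The pessimistic ready time keeps the sketch short at the cost of a longer schedule.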