
A Data Replication Algorithm to Improve Performance of Cloud Data Centers

Mehri, Saeedeh | 2014

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 45667 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Movaghar, Ali
  7. Abstract:
  8. The daily growth of cloud computing for data-based applications and Internet services raises challenges in storage cost, data access performance, and the provision of QoS properties such as availability, scalability, and conformance to the pay-as-you-go model. Data replication is one of the most important mechanisms for data management in distributed systems: it creates replicas of data and distributes them across the network. Key challenges in data replication include the number of replicas, the time at which a new replica is created, the way replicas are distributed among nodes, and the replica replacement strategy when storage is unavailable. Replication is used extensively in commercial cloud systems such as Amazon, the Google File System, and the Hadoop Distributed File System. By default, these systems use a three-replica strategy, which imposes a large, constant storage cost.
    In this thesis, we propose a dynamic data replication algorithm that decides when to replicate based on users' access patterns and reduces storage and replication cost by switching from a static payment model to the pay-as-you-go model, which is the dominant model in the cloud. The proposed algorithm identifies popular data at the right time from users' access patterns and places its replica in a suitable data center. Predicting users' future access behavior also makes the algorithm auto-scaling and keeps it consistent with the pay-as-you-go model, so paying for extra storage is avoided. To implement the proposed algorithm, a new capability is added to the CloudSim cloud simulator, providing a simulation toolkit for data clouds and replica management. In the simulated scenarios, the proposed algorithm shows improvements of about 18% in storage cost, 14% in replication cost, and 0.7% in request finish time, in exchange for a 0.02% reduction in availability.
  9. Keywords:
  10. Cloud Computing ; Availability ; Scalability ; Data Center ; Data Replication ; Data Popularity
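
The idea in the abstract of sizing the replica count to predicted popularity, instead of always keeping three replicas, can be illustrated with a minimal sketch. This is not the thesis's algorithm: the exponentially weighted predictor, the `accesses_per_replica` capacity, the price, and all function names below are assumptions chosen only for illustration.

```python
import math

STATIC_REPLICAS = 3  # default replica count named in the abstract


def predict_next(history, alpha=0.5):
    """Assumed predictor: exponentially weighted average of per-interval access counts."""
    estimate = 0.0
    for count in history:
        estimate = alpha * count + (1 - alpha) * estimate
    return estimate


def target_replicas(history, accesses_per_replica=100,
                    min_replicas=1, max_replicas=STATIC_REPLICAS):
    """Replica count sized to predicted demand, clamped to [min_replicas, max_replicas]."""
    predicted = predict_next(history)
    needed = math.ceil(predicted / accesses_per_replica)
    return max(min_replicas, min(max_replicas, needed))


def storage_cost(replica_counts, size_gb, price_per_gb):
    """Pay-as-you-go cost: each interval is billed for the replicas actually held."""
    return sum(n * size_gb * price_per_gb for n in replica_counts)


# Hypothetical access history for one file, one entry per billing interval.
accesses = [0, 20, 200, 400, 400]
dynamic = [target_replicas(accesses[:i + 1]) for i in range(len(accesses))]
static = [STATIC_REPLICAS] * len(accesses)

print("dynamic replicas per interval:", dynamic)
print("dynamic cost:", storage_cost(dynamic, size_gb=10, price_per_gb=0.5))
print("static cost: ", storage_cost(static, size_gb=10, price_per_gb=0.5))
```

In this toy setting, a file that is rarely accessed keeps a single replica and is billed accordingly, while a file whose predicted demand grows is scaled up toward the static default. Choosing the data center that hosts each replica, which the abstract also mentions, is left out of the sketch.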
