
Evaluating Data Prefetching Methods and Proposing an Energy-aware First Level Cache for Cloud Workloads

Naderan Tahan, Mahmood | 2015

  1. Type of Document: Ph.D. Dissertation
  2. Language: Farsi
  3. Document No: 48246 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Sarbazi Azad, Hamid
  7. Abstract:
  8. Data is being generated far faster than technology scales: by 2020, the gap between the data generation rate and the technology scaling rate is projected to reach 40x. On one hand, unlike traditional HPC clusters, processors in data centers are not fully utilized; on the other hand, unlike traditional embedded processors, they are not idle most of the time. The energy consumption of such processors is therefore an important issue; otherwise, dealing with huge volumes of data will become problematic in the near future. In this dissertation, we show that while the first-level data cache suffers a high miss rate, traditional approaches such as data prefetching, which were effective for traditional workloads, are ineffective for modern cloud workloads: a large fraction of prefetched blocks are evicted from the cache without ever being accessed. The reason for such weak predictions is the irregularity of cache misses in the spatial dimension and their low group repetition in the temporal dimension. Given that energy consumption is an important concern, the question is how the prefetcher's storage budget can be used in other ways to reduce cache misses while also saving energy. Domino, the cache architecture presented in this dissertation, attaches the prefetcher's storage budget to the first-level data cache as a means of keeping more data blocks in the cache and filtering accesses to the upper levels of the memory hierarchy. Two important issues were considered in this architecture: keeping the cache's critical path unchanged, and controlling the energy consumed when accessing a large number of cache ways. For this reason, the additional ways are separated from the baseline cache and are not searched in parallel. Additionally, a dynamic voltage scaling (DVS) approach is employed to further reduce leakage power when the added ways are inactive.
Simulation results show that Domino reduces read/write misses and snoop traffic by about 30%, which in turn reduces access energy by about 22% (up to 40%). By using DVS, average leakage power is reduced by 83% compared to a large L1 cache without DVS. From the performance point of view, Domino improves performance only slightly, since the L1 cache is already backed by a large L2 cache.
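
The two-phase lookup described in the abstract can be sketched as follows. This is a hypothetical illustration, not code from the dissertation: the class name, fields, and the access-count energy proxy are all assumptions. The baseline ways are probed in parallel (as in a conventional set-associative L1), and the extra ways built from the repurposed prefetcher storage are probed serially, and only on a baseline miss, which keeps them off the critical path and avoids the energy cost of searching all ways at once.

```python
# Hypothetical sketch of a Domino-style cache set with serialized extra ways.
# All names and the energy model are illustrative assumptions.

class DominoSet:
    def __init__(self, baseline_ways, extra_ways):
        self.baseline = [None] * baseline_ways  # fast ways, searched in parallel
        self.extra = [None] * extra_ways        # repurposed prefetcher storage
        self.way_accesses = 0                   # proxy for dynamic access energy

    def lookup(self, tag):
        # Phase 1: all baseline ways are compared in parallel,
        # so every baseline way consumes access energy.
        self.way_accesses += len(self.baseline)
        if tag in self.baseline:
            return "baseline hit"
        # Phase 2: only on a baseline miss, probe the extra ways one
        # at a time; each probe costs a single way access.
        for t in self.extra:
            self.way_accesses += 1
            if t == tag:
                return "extra hit"
        return "miss"
```

A hit in the extra ways is slower than a baseline hit but still avoids an access to L2, which is the source of the miss and snoop-traffic reductions the abstract reports.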
  9. Keywords:
  10. Cache Memory ; Multicore Processors ; Prefetching ; Energy Consumption ; Cloud Work Loads
