A New Temporal Locality Method for Multi-Core Processor Data Cache
Banihashemi, Borzoo | 2014
- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 45929 (19)
- University: Sharif University of Technology
- Department: Computer Engineering
- Advisor(s): Jahangir, AmirHossein
- Abstract:
- As the speed gap between microprocessors and the off-chip Last Level Cache (LLC) grows, optimizing the LLC improves overall system performance. With the new generation of multi-core processors sharing the LLC among cores, the so-called Memory Wall problem has made the LLC's effect on system performance increasingly significant. There are three approaches to using this memory more efficiently:
1. Increasing cache capacity
2. Making cache hierarchical and adding different layers to hierarchy
3. Improvement of replacement algorithms in cache memory
The first approach has not been adopted because of technology limitations and the growth in access time that accompanies larger cache capacities; the second approach is discussed briefly in the following chapters but is not the focus, because of data coherency problems and the search time incurred at each layer. The third approach is therefore discussed in detail.
Throughout this thesis, we predict the temporal locality of a cache block based on its reuse distance at run time and show that this prediction achieves acceptable precision. We also analyze the behavior of different programs from the cache-miss viewpoint and classify the programs according to the results. Based on this classification and on the observed behavior of the data and the program, we estimate the probability of the next access to a cache block. We then propose an LLC replacement method for multi-core processor systems that comes very close to the optimal replacement algorithm, choosing the best candidate by taking both past and predicted behavior into account. The method prioritizes data that exhibit predictable behavior and programs that behave acceptably, and it prevents useless data from entering and residing in the cache. Our performance evaluations on a subset of SPEC2006 applications using the CMP$im simulator show that the proposed method achieves an IPC improvement over traditional LRU of 17% at best and 4% on average for single-core processors, and an IPC improvement over DIP of 16% at best and 4% on average. Finally, for multi-core processors, our approach achieves a weighted speedup of 13% at best and -2% in the worst case relative to traditional LRU.
- Keywords:
- Collocation Method ; Last Level Cache (LLC) ; Multicore Processors ; Prediction ; Performance ; Memory Wall ; Reuse Distance
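The replacement scheme outlined in the abstract lends itself to a short illustration. The sketch below is not the thesis implementation; it is a minimal Python mock-up, under stated assumptions, of a set-associative cache whose victim selection and bypass decisions are driven by a predicted per-block reuse distance. The last-value predictor and the bypass threshold of four times the associativity are purely hypothetical stand-ins for the prediction and prioritization scheme the thesis describes.

```python
from collections import defaultdict


class ReuseDistancePredictor:
    """Last-value reuse-distance predictor (a deliberately simple stand-in
    for the more refined prediction scheme described in the abstract)."""
    def __init__(self):
        self.last_access = {}                        # addr -> time of last access
        self.predicted = defaultdict(lambda: None)   # addr -> predicted reuse distance

    def observe(self, addr, now):
        if addr in self.last_access:
            # Predict that the next reuse distance equals the last observed one.
            self.predicted[addr] = now - self.last_access[addr]
        self.last_access[addr] = now


class RDReplacementSet:
    """One cache set: evict the block predicted to be reused farthest in the
    future, and bypass blocks whose predicted reuse distance is too large to
    be satisfied by this set (hypothetical threshold: 4 * associativity)."""
    def __init__(self, ways, predictor):
        self.ways = ways
        self.blocks = []          # resident block addresses
        self.predictor = predictor
        self.clock = 0

    def access(self, addr):
        self.clock += 1
        self.predictor.observe(addr, self.clock)
        if addr in self.blocks:
            return "hit"
        pred = self.predictor.predicted[addr]
        if pred is not None and pred > 4 * self.ways:
            return "miss (bypassed)"       # keep useless data out of the cache
        if len(self.blocks) >= self.ways:
            # Victim = resident block with the largest predicted reuse distance;
            # blocks with no prediction default to 0 and are therefore kept.
            victim = max(self.blocks, key=lambda b: self.predictor.predicted[b] or 0)
            self.blocks.remove(victim)
        self.blocks.append(addr)
        return "miss (inserted)"


if __name__ == "__main__":
    predictor = ReuseDistancePredictor()
    cache_set = RDReplacementSet(ways=2, predictor=predictor)
    for addr in [1, 2, 1, 3, 1, 2, 4, 1]:
        print(addr, cache_set.access(addr))
```

Treating unpredicted blocks as having a reuse distance of zero keeps them resident, loosely mirroring the abstract's idea of prioritizing data with predictable behavior; a full policy would additionally fold in the observed per-program miss behavior.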