Improving CPU-GPU System Performance Through Dynamic Management of LLC and NoC

Rostamnejad Khatir, Maede | 2020

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 53244 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Sarbazi Azad, Hamid
  7. Abstract:
  8. CPU-GPU Heterogeneous System Architectures (HSAs) play an important role in today's computing systems. Because of the fast growth of technology and the need for high-performance computing, HSAs are widely used platforms. Integrating a multi-core Central Processing Unit (CPU) with a many-core Graphics Processing Unit (GPU) on the same die combines the features of both processors and provides better performance. The capacity of HSAs to deliver high computing throughput has led to their widespread use. Alongside this high performance, however, HSAs also pose challenges, which arise from placing two processors with different behaviors and requirements on the same die. Optimally managing the interaction between the processors and their use of on-chip shared resources is key to achieving high performance. In this thesis, we first investigate the challenges of on-chip resource management in HSAs. We then analyze prior work in this direction and illustrate its weaknesses. Next, we introduce a cache management method for these systems: dynamic cache partitioning. The shared last-level cache (LLC) is one of the most important resources shared between the CPU and the GPU in HSAs. Therefore, we use a dynamic cache partitioning mechanism to allocate memory resources to the processors and improve the performance of the Chai benchmark applications. Since both the CPU and the GPU access the same shared cache, and the GPU issues many accesses due to its high degree of parallelism, GPU traffic can interfere with CPU accesses and degrade CPU performance, which in turn may reduce the performance of the whole system. By applying management constraints to control accesses and find the optimal allocation of shared resources, we reduce this potential interference and obtain better performance when executing collaborative benchmarks. The results show that, on average, we achieve a 17 percent improvement in HSA performance.
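The abstract describes repartitioning the shared LLC between the CPU and the GPU so that heavy GPU traffic cannot crowd out CPU accesses. As a minimal illustrative sketch (not the thesis's actual mechanism), one common way-partitioning policy reassigns cache ways each epoch in proportion to each processor's recent miss count, with a guaranteed minimum per processor; `TOTAL_WAYS`, `MIN_WAYS`, and the proportional rule here are all assumptions for illustration:

```python
# Hypothetical sketch of dynamic LLC way partitioning between a CPU and a GPU.
# Each epoch, ways are reassigned in proportion to each processor's recent
# miss count, with a minimum guarantee so the GPU cannot starve the CPU.

TOTAL_WAYS = 16   # assumed LLC associativity
MIN_WAYS = 2      # minimum ways guaranteed to each processor

def repartition(cpu_misses: int, gpu_misses: int) -> tuple[int, int]:
    """Return (cpu_ways, gpu_ways) for the next epoch."""
    total = cpu_misses + gpu_misses
    if total == 0:
        # No cache pressure observed this epoch: split evenly.
        return TOTAL_WAYS // 2, TOTAL_WAYS - TOTAL_WAYS // 2
    # Allocate ways proportionally to observed miss pressure.
    cpu_ways = round(TOTAL_WAYS * cpu_misses / total)
    # Clamp so both processors keep at least MIN_WAYS.
    cpu_ways = max(MIN_WAYS, min(TOTAL_WAYS - MIN_WAYS, cpu_ways))
    return cpu_ways, TOTAL_WAYS - cpu_ways
```

With this policy, a GPU-dominated epoch (e.g. 990 GPU misses vs. 10 CPU misses) still leaves the CPU its minimum share of ways, which is the kind of interference constraint the abstract refers to.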
  9. Keywords:
  10. Last Level Cache (LLC) ; Network-on-Chip (NOC) ; Heterogeneous Architecture ; Collaborative Benchmarks ; High Performance Computing
