A Reconfigurable and Adaptive Shared-memory Architecture for GPUs

Abbasitabar, Hamed; Sarbazi Azad, Hamid

Abbasitabar, Hamed | 2013

Type of Document: M.Sc. Thesis
Language: Farsi
Document No: 44566 (19)
University: Sharif University of Technology
Department: Computer Engineering
Advisor(s): Sarbazi Azad, Hamid
Abstract:
The importance of shared memory (scratchpad memory) in GPGPU programming, the limited memory capacity of GPGPUs, and the influence of shared memory on overall GPGPU performance have motivated work on optimizing it. Moreover, the trend in recent GPGPU designs shows that the ratio of shared memory to processing elements is shrinking. As a result, the limited capacity of shared memory becomes a bottleneck that keeps a GPU from hosting a large number of thread blocks, limiting the otherwise available thread-level parallelism (TLP). In this thesis, we introduce a reconfigurable and adaptive shared-memory architecture for GPGPUs, based on resource sharing, that can be exploited to improve throughput in various GPGPU architectures. Because of the structure of GPGPU programs, in which grids contain homogeneous blocks of threads, streaming multiprocessors (SMs) have the same memory requirements; therefore, sharing resources, including memory, among them is not feasible. Multi-programming on the GPGPU is a good way to introduce variety in the SMs' resource requirements. We propose an architecture for running multiple programs on one GPU, which improves utilization and makes resource sharing among SMs possible. We then propose a method for sharing shared memory that dynamically permits SMs to use each other's shared-memory space so that more thread blocks can run simultaneously. Our experiments, performed with GPGPU-Sim and a simulator we developed, using an NVIDIA Fermi configuration, show average throughput improvements of 74%, 121%, and 137% for the multi-programming method with 2, 4, and 8 programs, respectively. In addition, the proposed shared-memory sharing method improves throughput by up to 60%, 78%, and 39% for 2, 4, and 8 programs, respectively.
Keywords:
General-Purpose Graphics Processing Units (GPGPU) ; Shared Memory ; Throughput Improvement ; Multithreaded Systems ; Reconfigurable Architecture
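
The abstract's central argument, that shared-memory capacity caps the number of resident thread blocks and that borrowing another SM's unused space can lift that cap, can be illustrated with a small back-of-the-envelope calculation. The sketch below is not the thesis's mechanism or simulator; the function names are hypothetical, the per-block shared-memory demands (20 KB and 4 KB) are made-up example values, and the limits of 48 KB of shared memory and 8 resident blocks per SM are assumed as a typical Fermi-class configuration.

```python
KB = 1024
SMEM_PER_SM = 48 * KB      # assumed shared-memory capacity of one SM
MAX_BLOCKS_PER_SM = 8      # assumed hardware limit on resident blocks per SM


def resident_blocks(smem_per_block, smem_available=SMEM_PER_SM,
                    max_blocks=MAX_BLOCKS_PER_SM):
    """Thread blocks that fit on one SM when only shared memory
    and the hardware block limit are considered."""
    if smem_per_block <= 0:
        return max_blocks
    return min(max_blocks, smem_available // smem_per_block)


def spare_smem(smem_per_block):
    """Shared memory left unused on an SM that is already running
    as many blocks of its program as it can."""
    used = resident_blocks(smem_per_block) * smem_per_block
    return SMEM_PER_SM - used


def resident_blocks_with_borrowing(borrower_per_block, lender_per_block):
    """Blocks the borrower SM could host if it may also use the
    lender SM's spare shared memory (illustrative model only)."""
    extra = spare_smem(lender_per_block)
    return resident_blocks(borrower_per_block, SMEM_PER_SM + extra)


if __name__ == "__main__":
    heavy, light = 20 * KB, 4 * KB   # hypothetical per-block demands
    print("heavy program alone:", resident_blocks(heavy))
    print("light program spare bytes:", spare_smem(light))
    print("heavy program with borrowing:",
          resident_blocks_with_borrowing(heavy, light))
```

Under these assumptions, a program with heavy shared-memory use is limited by capacity while a light one hits the block limit first and leaves space idle, which is the kind of heterogeneity across SMs that running multiple programs is meant to create.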
