
A Reconfigurable and Adaptive Shared-memory Architecture for GPUs

Abbasitabar, Hamed | 2013

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 44566 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Sarbazi Azad, Hamid
  7. Abstract:
  8. The importance of shared memory (scratchpad memory) in GPGPU programming, the limited memory capacity of GPGPUs, and the influence of shared memory on overall GPGPU performance have motivated efforts to optimize it. Moreover, the trend in new GPGPU designs shows that the ratio of shared memory to processing elements is shrinking. As a result, the limited capacity of shared memory becomes a bottleneck that prevents a GPU from hosting a large number of thread blocks, limiting the otherwise available thread-level parallelism (TLP). In this thesis, we introduce a reconfigurable and adaptive shared-memory architecture for GPGPUs, based on resource sharing, that can be exploited to improve throughput in various GPGPU architectures. Because of the structure of GPGPU programs, in which grids contain homogeneous blocks of threads, streaming multiprocessors (SMs) have the same memory requirements; therefore, sharing resources, including memories, is not feasible. Multiprogramming on a GPGPU is a good way to introduce variety in the resource requirements of SMs. We propose an architecture for running multiple programs on one GPU, which improves utilization and enables resource sharing among SMs. We then propose a method for sharing shared memory that dynamically permits SMs to use each other's shared memory space so that more thread blocks can run simultaneously. Our experiments, performed with GPGPU-Sim and a simulator we developed, using an NVIDIA Fermi configuration, show average throughput improvements of 74%, 121%, and 137% for the multiprogramming method with 2, 4, and 8 programs, respectively. In addition, the proposed shared-memory sharing method improves throughput by up to 60%, 78%, and 39% for 2, 4, and 8 programs, respectively.
  9. Keywords:
  10. General Purpose Graphic Processing Units (GPGPU) ; Shared Memory ; Throughput Improvement ; Multithread Systems ; Reconfigurable Architecture
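
The bottleneck described in the abstract can be illustrated with a small back-of-the-envelope model. The sketch below is not the thesis's architecture or simulator. It is a simplified illustration that assumes Fermi-like per-SM limits (48 KB of shared memory, 8 resident thread blocks, 1536 resident threads) and a hypothetical greedy policy in which SMs lend unused shared memory to one another. The function names `blocks_per_sm` and `blocks_with_borrowing` and the example kernel parameters are invented for this example.

```python
# Assumed Fermi-like per-SM limits (illustrative, not taken from the thesis).
SMEM_CAPACITY = 48 * 1024   # bytes of shared memory per SM
MAX_BLOCKS = 8              # resident thread blocks per SM
MAX_THREADS = 1536          # resident threads per SM


def blocks_per_sm(smem_per_block, threads_per_block):
    """Resident blocks on one SM that uses only its own shared memory."""
    limits = [MAX_BLOCKS, MAX_THREADS // threads_per_block]
    if smem_per_block > 0:
        limits.append(SMEM_CAPACITY // smem_per_block)
    return min(limits)


def blocks_with_borrowing(kernels):
    """Resident blocks per SM when unused shared memory is pooled.

    kernels: one (smem_per_block, threads_per_block) pair per SM.
    Hypothetical greedy policy: SMs are visited in order and take
    spare bytes from the pool until another limit is reached.
    """
    result = [blocks_per_sm(s, t) for s, t in kernels]
    # Bytes left unused on every SM after the initial placement.
    spare = sum(SMEM_CAPACITY - b * s for b, (s, _) in zip(result, kernels))
    for i, (smem, threads) in enumerate(kernels):
        cap = min(MAX_BLOCKS, MAX_THREADS // threads)
        while result[i] < cap and spare >= smem:
            result[i] += 1
            spare -= smem
    return result


# Two co-running programs: one shared-memory hungry, one not.
kernels = [(20 * 1024, 256), (2 * 1024, 256)]
alone = [blocks_per_sm(s, t) for s, t in kernels]
shared = blocks_with_borrowing(kernels)
print(alone, shared)
```

With homogeneous kernels every SM hits the same limit and has little useful spare capacity to lend, which matches the abstract's argument that multiprogramming is what makes sharing worthwhile in the first place.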
