General-Purpose Graphics Processing Unit (GPGPU) applications exploit on-chip scratchpad memory available in the Graphics Processing Units (GPUs) to improve performance. The amount of thread-level parallelism (TLP) present in the GPU is limited by the number of resident threads, which in turn depends on the availability of scratchpad memory in its streaming multiprocessor (SM). Since the scratchpad memory is allocated at thread block granularity, part of the memory may remain unutilized. In this article, we propose architectural and compiler optimizations to improve the scratchpad memory utilization. Our approach, called Scratchpad Sharing, addresses scratchpad under-utilization by launching additional thread blocks in each SM. These thread...
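The under-utilization described in the abstract above follows from simple arithmetic: because scratchpad is handed out in whole thread blocks, any remainder smaller than one block's requirement sits idle. The sketch below illustrates that arithmetic only; it is not the Scratchpad Sharing mechanism itself. The function name `scratchpad_occupancy`, its parameters, and the sample sizes are hypothetical and do not come from the article.

```python
from __future__ import annotations


def scratchpad_occupancy(
    scratchpad_per_sm: int,
    scratchpad_per_block: int,
    max_blocks_per_sm: int,
) -> tuple[int, int]:
    """Return (resident_blocks, unused_bytes) for one SM.

    Assumes scratchpad is the only limiting resource besides a fixed
    cap on resident blocks (a simplification for illustration).
    """
    if scratchpad_per_block <= 0:
        raise ValueError("scratchpad_per_block must be positive")
    # Allocation happens at thread block granularity, so only whole
    # blocks fit; the integer division drops any partial block.
    fit_by_scratchpad = scratchpad_per_sm // scratchpad_per_block
    resident_blocks = min(max_blocks_per_sm, fit_by_scratchpad)
    # Whatever is left over cannot be used by a default launch.
    unused_bytes = scratchpad_per_sm - resident_blocks * scratchpad_per_block
    return resident_blocks, unused_bytes


# Hypothetical sizes: 16 KB of scratchpad per SM, 7 KB needed per block,
# and at most 8 resident blocks per SM.
blocks, unused = scratchpad_occupancy(16 * 1024, 7 * 1024, 8)
```

With these assumed sizes, two blocks are resident and 2 KB of scratchpad is left unallocated, which is the kind of leftover capacity the abstract says the proposed approach targets by launching additional thread blocks.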
The massive amount of fine-grained parallelism exposed by a GPU program makes it difficult to exploi...
This paper presents a dynamic scratchpad memory (SPM) code allocation technique for embedded systems...
General purpose GPU (GPGPU) is an effective many-core architecture that can yield high throughput fo...
Graphics Processing Units (GPUs) have become the accelerator of choice for data-parallel application...
In recent years, Field Programmable Gate Arrays and Graphics Processing Units have become incre...
A key factor in GPU performance efficiency is the number of active threads that can run simultaneous...
GPUs are increasingly used as compute accelerators. With a large number of cores executing...
There has been tremendous growth in the use of Graphics Processing Units (GPUs) for the acceleratio...
2018-02-23: Graphics Processing Units (GPUs) are designed primarily to execute multimedia, and game re...
Graphics processing units (GPUs) have become prevalent in modern computing systems. While their high...
Graphics processing units (GPUs) have become ubiquitous for general purpose applications due to thei...
Portable embedded systems require diligence in managing their energy consumption. Thus, power efficie...
This paper presents a compiler strategy to optimize data accesses in regular array-intensiv...
Scratchpad memory has been introduced as a replacement for cache memory as it improves the performan...