Scratchpad memories in GPU architectures serve as software-controlled caches that increase the effective GPU memory bandwidth. Well-known optimization techniques, such as privatization and tiling, are used to exploit them. Typically, they are banked memories addressed with a mod 2^N bank indexing scheme. Their bandwidth is fully exploited for linear memory accesses, but performance suffers when memory access patterns contain non-unit strides, because these provoke bank conflicts. This paper explores the use of configurable bit-vector and bitwise XOR-based hash functions to distribute the memory addresses of the access patterns evenly over the memory banks, reducing the number of bank conflict...
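The effect described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's hash function: it assumes 32 banks (N = 5), a group of 32 threads accessing memory together, and word-granular addresses, and it uses one simple, hypothetical XOR fold (`bank_xor`) purely for illustration. All names and constants are assumptions introduced here.

```python
from collections import Counter

NUM_BANK_BITS = 5                 # assumption: 2**5 = 32 banks
NUM_BANKS = 1 << NUM_BANK_BITS
WARP_SIZE = 32                    # assumption: 32 threads access memory together


def bank_mod(addr: int) -> int:
    """Conventional mod 2^N indexing: the bank is the low N bits of the word address."""
    return addr & (NUM_BANKS - 1)


def bank_xor(addr: int) -> int:
    """Illustrative XOR hash (hypothetical): XOR the next N address bits into the low N bits."""
    return (addr ^ (addr >> NUM_BANK_BITS)) & (NUM_BANKS - 1)


def max_conflict_degree(bank_fn, stride: int) -> int:
    """Largest number of threads that hit the same bank when thread t reads word t * stride."""
    hits = Counter(bank_fn(t * stride) for t in range(WARP_SIZE))
    return max(hits.values())


if __name__ == "__main__":
    for stride in (1, 2, 16, 32, 33):
        print(
            f"stride {stride:2d}: "
            f"mod -> {max_conflict_degree(bank_mod, stride):2d}-way, "
            f"xor -> {max_conflict_degree(bank_xor, stride):2d}-way"
        )
```

Under these assumptions, unit stride is conflict-free with both schemes, and strides of 2, 16, and 32 conflict under mod 2^N indexing but not under the XOR fold. A stride of 33, however, sends every thread to bank 0 under this particular fold: a single fixed hash only relocates the problematic strides, which suggests why a hash that can be configured per access pattern is attractive.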
Exhaustive search is generally a last resort for solving a problem: each possible state of a system ...
Hash functions play an important role in various cryptographic applications. Modern cryptography rel...
In this paper, we analyze the special requirements of a dynamic memory allocator that is designed fo...
Stringent power and performance constraints, coupled with detailed knowledge of the target applicati...
GPUs are increasingly used as compute accelerators. With a large number of cores executing an even l...
In recent years, Field Programmable Gate Arrays and Graphics Processing Units have become incre...
General-Purpose Graphics Processing Unit (GPGPU) applications exploit on-chip scratchpad memory avai...
Parallel memory modules are widely used to increase memory bandwidth in parallel image processing an...
Graphics Processing Units (GPUs) have become the accelerator of choice for data-parallel application...
This paper exploits the parallel computing power of graphics cards to accelerate state space search. We ...
Graphics processor units (GPUs) are designed to efficiently exploit thread level parallelism (TLP), ...
The continued evolution of GPUs has enabled the use of irregular algorithms which involve fine-grai...
Hashing is one of the most fundamental operations that provides a means for a program to ob...
Graphics processing units (GPUs) have become ubiquitous for general purpose applications due to thei...