GPU Array Access Auto-Tuning

Weber, Nicolas

Publication date

January 2017

Abstract

GPUs have been used for years in compute-intensive applications. Their massively parallel processing capabilities can speed up calculations significantly. However, to leverage this speedup it is necessary to rethink existing algorithms and develop new ones that allow parallel processing. These algorithms are only one piece of achieving high performance. Nearly as important as suitable algorithms are the actual implementation and the use of special hardware features such as intra-warp communication, shared memory, caches, and memory access patterns. Optimizing these factors is usually a time-consuming task that requires a deep understanding of the algorithms and the underlying hardware. Unlike CPUs, the internal structure of GPUs has changed significantly and w...
