Hardware Transactional Memory for GPU Architectures

Wilson W. L. Fung
Inderpreet Singh
Andrew Brownsword
Tor M. Aamodt

Publication date

January 2011

DOI

10.1145/2155620.2155655

Abstract

Graphics processor units (GPUs) are designed to efficiently exploit thread-level parallelism (TLP), multiplexing execution of 1000s of concurrent threads on a relatively smaller set of single-instruction, multiple-thread (SIMT) cores to hide various long-latency operations. While threads within a CUDA block/OpenCL workgroup can communicate efficiently through an intra-core scratchpad memory, threads in different blocks can only communicate via global memory accesses. Programmers wishing to exploit such communication have to consider data races that may occur when multiple threads modify the same memory location. Recent GPUs provide a form of inter-block communication through atomic operations for single 32-bit/64-bit words. Although fine...
