Efficient Implementation of Reductions on GPU Architectures

Timcheck, Stephen W

Publication date

January 2017

Publisher

IdeaExchange@UAkron

Language

English

Abstract

With serial, or sequential, computational operations' growth rate slowing over the past few years, parallel computing has become paramount to achieve speedup. In particular, GPUs (Graphics Processing Units) can be used to program parallel applications using a SIMD (Single Instruction Multiple Data) architecture. We studied SIMD applications constructed using the NVIDIA CUDA language and MERCATOR (Mapping EnumeRATOR for CUDA), a framework developed for streaming dataflow applications on the GPU. A type of operation commonly performed by streaming applications is reduction, a function that performs some associative operation on multiple data points such as summing a list of numbers (additive operator, +). By exploring numerous SIMD imp...
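
To make the notion of a reduction concrete, here is a minimal host-side sketch in Python of the pairwise (tree-shaped) pattern that parallel reductions commonly follow. It is not the thesis's implementation and does not use CUDA or MERCATOR; the function name `tree_reduce` and its parameters are hypothetical, chosen only for illustration. The one assumption it relies on is the one the abstract states: the operator is associative.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def tree_reduce(values: Sequence[T], op: Callable[[T, T], T], identity: T) -> T:
    """Combine `values` with an associative `op` using a pairwise tree pattern.

    Hypothetical illustration: each pass combines elements that are `stride`
    apart. The combinations within one pass touch disjoint elements, so on a
    SIMD machine they could be carried out by separate lanes at the same time.
    Here they simply run one after another on the CPU.
    """
    buf = list(values)
    n = len(buf)
    if n == 0:
        return identity  # reducing nothing yields the operator's identity

    stride = 1
    while stride < n:
        # One "parallel step": every i below is independent of the others.
        for i in range(0, n - stride, 2 * stride):
            # The left operand always covers earlier elements, so the
            # original order is preserved (commutativity is not required).
            buf[i] = op(buf[i], buf[i + stride])
        stride *= 2
    return buf[0]


def add(a: int, b: int) -> int:
    """The additive operator (+) named in the abstract."""
    return a + b
```

For example, `tree_reduce([1, 2, 3, 4, 5], add, 0)` sums the list in three passes (strides 1, 2, and 4) rather than four sequential additions in a chain; associativity is what allows the additions to be regrouped this way without changing the result.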
