Accelerating Data Transfer for Throughput Processors.

Jamshidi, Davoud

Publication date

January 2016

Abstract

Graphics processing units (GPUs) have become prevalent in modern computing systems. While their highly parallel architectures are traditionally used as accelerators for rendering graphics, GPUs are also adept at handling data-parallel workloads when provided with large blocks of data for processing. Extracting performance from a GPU requires the programmer to provide enough work to keep the device fully utilized. Unlike CPUs, which are highly optimized to reduce memory access latency, GPUs are optimized for throughput and tend to have high memory access latency. The naive approach to obtaining performance is to provide a GPU with hundreds to thousands of threads so that some threads will be able to perform computation while others are waiting for da...
