In this paper we investigate the use of distributed graphics processing unit (GPU)-based architectures to accelerate pipelined wavefront applications, a ubiquitous class of parallel algorithms used to solve a number of scientific and engineering problems. Specifically, we employ a recently developed port of the LU solver (from the NAS Parallel Benchmark suite) to investigate the performance of these algorithms on high-performance computing solutions from NVIDIA (Tesla C1060 and C2050) as well as on traditional clusters (AMD/InfiniBand and IBM BlueGene/P). Benchmark results are presented for problem classes A to C, and a recently developed performance model is used to provide projections for problem classes D and E, the latter...
The computational speed on microprocessors is increasing faster than the communication speed, especi...
Emerging massively parallel architectures such as a general-purpose processor plus many-cor...
The research presented in this thesis investigates parallel implementations of the Fast Sweeping Met...
Pipelined wavefront applications form a large portion of the high performance scientific computing w...
With the limits to frequency scaling in microprocessors due to power constraints, many-core and mult...
Using two full applications with different characteristics, this thesis explores the performance and...
Chandrasekaran, Sunita. Processor architectures have been rapidly evolving for decades. From the intro...
In recent years, GPUs have become ubiquitous in HPC installations around the world. Today, they provi...
Computing on graphics processors is perhaps one of the most important developments in computational sc...
In this paper, we address the problem of efficient execution of a computation pattern, referred to h...
The evolution of GPUs (graphics processing units) has been enormous in the past few years. Their cal...
Graphics processors are becoming faster and faster. Computational power within graphics processing uni...
Graphical processing units (GPUs) have recently attracted attention for scientific applications such...