We describe our experience using NVIDIA's CUDA (Compute Unified Device Architecture) C programming environment to implement a two-dimensional second-order MUSCL-Hancock ideal magnetohydrodynamics (MHD) solver on a GTX 480 Graphics Processing Unit (GPU). Taking a simple approach in which the MHD variables are stored exclusively in the global memory of the GTX 480 and accessed in a cache-friendly manner (without further optimizing memory access by, for example, staging data in the GPU's faster shared memory), we achieved a maximum speedup of approximately 126 for a 1024 × 1024 grid relative to the sequential C code running on a single Intel Nehalem (2.8 GHz) core. This speedup is consistent with simple estimates based on the known floating point per...
We investigate multi-level parallelism on GPU clusters with MPI-CUDA and hybrid MPI-OpenMP-CUDA para...
Parabolic partial differential equations are often used to model systems involving heat transfer, ac...
Design optimization relies heavily on time-consuming simulations, especially when using gradient-fre...
This paper introduces the Sheffield Magnetohydrodynamics Algorithm Using GPUs (SMAUG+), an advanced ...
Graphics processing units (GPUs) that were traditionally designed for graphics rendering have emerged a...
Graphics processing units (GPUs) that were originally designed for graphics rendering have emerged as m...
We present the FARGO3D code, recently publicly released. It is a magnetohydrodynamics code developed...
Parallelization techniques have been exploited most successfully by the gaming/graphics industry wit...
This paper describes the GPU-accelerated MBFLO2 multi-block turbulent flow solver completely in doub...
Modern graphics processing units (GPUs) with many-core architectures have emerged as general-purpose...
We present HORIZON, a new graphics processing unit (GPU)-accelerated code to solve the equations of ...
A new high-performance general-purpose graphics processing unit (GPGPU) computational fluid dynamics...
Modern graphics processing units (GPUs) with many-core architectures have emerged as general-purpose...
Recent advances in the technologies incorporated in graphics hardware have enabled general-purpose com...
Computational Science has emerged as a third pillar of science along with theory and experiment, whe...