We discuss the CUDA approach to the simulation of pure gauge lattice SU(2). CUDA is a hardware and software architecture developed by NVIDIA for computing on the GPU. We present an analysis and performance comparison between the GPU and CPU in single precision. Analyses with single and multiple GPUs, using CUDA and OpenMP, are also presented. To obtain high performance, the code must be optimized for the GPU architecture, i.e., the implementation must exploit the memory hierarchy of the CUDA programming model. Using GPU texture memory and minimizing the data transfers between CPU and GPU, we achieve a speedup of 200× using two NVIDIA 295 GTX cards relative to a serial CPU, which demonstrates that GPUs can serve as an efficien...
The GPU, based on the CUDA architecture developed by NVIDIA, is a high-performance computing device...
Over the last 20 years, the computing revolution has created many social benefits. The computing ene...
We describe the steps which lead to a speed efficiency of about 48% for a code for the simulation of...
In this work we explore the performance of CUDA in quenched lattice SU(2) simulations. CUDA, NVIDIA ...
Graphics Processing Units (GPUs) are being used in many areas of physics, since the performance vers...
The starting point of any lattice QCD computation is the generation of a Markov chain of gauge field...
We discuss how the steepest descent method with Fourier acceleration for Landau gauge fixing in latt...
We adopt CUDA-capable Graphic Processing Units (GPUs) for Landau, Coulomb and maximally Abelian gaug...
The compute unified device architecture (CUDA) is a programming approach for performing scientific c...
Data-parallel accelerator devices such as Graphical Processing Units (GPUs) are providing dramatic p...
In this work, we consider the GPU implementation of the steepest descent method with Fourier acceler...
Lattice spin models are useful for studying critical phenomena and allow the extraction of...
Here we present the cuLGT code for gauge fixing in lattice gauge field theories with graphic proce...
Computing platforms equipped with accelerators like GPUs have proven to provide great compu...
Restricted solid on solid surface growth models can be mapped onto binary lattice gases. We show tha...