In this work we explore the performance of CUDA in quenched lattice SU(2) simulations. CUDA, NVIDIA Compute Unified Device Architecture, is a hardware and software architecture developed by NVIDIA for computing on the GPU. We present an analysis and performance comparison between the GPU and CPU in single and double precision. Analyses with multiple GPUs and two different architectures (G200 and Fermi) are also presented. To obtain high performance, the code must be optimized for the GPU architecture, i.e., the implementation must exploit the memory hierarchy of the CUDA programming model. We produce codes for the Monte Carlo generation of SU(2) lattice gauge configurations, for the mean plaquette, for the Polyak...
Lattice QCD is widely considered the correct theory of the strong force and is able to make quantita...
We consider the implementation of a parallel Monte Carlo code for high-performance simulations on PC...
Accelerators are an increasingly common option to boost performance of codes that require extensive ...
We discuss the CUDA approach to the simulation of pure gauge Lattice SU(2). CUDA is a hardware and s...
The starting point of any lattice QCD computation is the generation of a Markov chain of gauge field...
Graphics Processing Units (GPUs) are being used in many areas of physics, since the performance vers...
We discuss how the steepest descent method with Fourier acceleration for Landau gauge fixing in latt...
We adopt CUDA-capable Graphic Processing Units (GPUs) for Landau, Coulomb and maximally Abelian gaug...
In this work, we consider the GPU implementation of the steepest descent method with Fourier acceler...
The compute unified device architecture (CUDA) is a programming approach for performing scientific c...
Here we present the cuLGT code for gauge fixing in lattice gauge field theories with graphic proce...
Data-parallel accelerator devices such as Graphical Processing Units (GPUs) are providing dramatic p...
We describe the steps which lead to a speed efficiency of about 48% for a code for the simulation of...
Lattice spin models are useful for studying critical phenomena and allow the extraction of...
We report on our implementation of the RHMC algorithm for the simulation of lattice QCD with two sta...