Lattice Quantum Chromodynamics simulations typically spend most of the runtime in inversions of the fermion matrix. This part is therefore frequently optimized for various HPC architectures. Here we compare the performance of the Intel® Xeon Phi™ to current Kepler-based NVIDIA® Tesla™ GPUs running a conjugate gradient solver. By exposing more parallelism to the accelerator through inverting multiple vectors at the same time, we obtain a performance greater than 300 GFlop/s on both architectures. This more than doubles the performance of the inversions. We also give a short overview of the Knights Corner architecture, discuss some details of the implementation and the effort required to obtain the achieved performance.
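The idea of inverting multiple vectors at the same time can be illustrated with a minimal sketch. It is not the implementation described in the abstract: a small dense symmetric positive definite matrix stands in for the fermion matrix, the code is plain Python rather than Xeon Phi or GPU code, and the names `dot`, `apply_to_block` and `cg_multi_rhs` are hypothetical. The sketch only shows the structure: each right-hand side keeps its own conjugate gradient recurrence, while the matrix is traversed once per iteration and reused for all vectors.

```python
from typing import List

Vector = List[float]


def dot(x: Vector, y: Vector) -> float:
    """Euclidean inner product of two real vectors."""
    return sum(a * b for a, b in zip(x, y))


def apply_to_block(A: List[Vector], X: List[Vector]) -> List[Vector]:
    """Compute A @ x for every x in X, visiting each matrix row only once."""
    out = [[0.0] * len(A) for _ in X]
    for i, row in enumerate(A):        # each row is read once ...
        for k, x in enumerate(X):      # ... and reused for all vectors
            out[k][i] = dot(row, x)
    return out


def cg_multi_rhs(A: List[Vector], B: List[Vector],
                 tol: float = 1e-10, max_iter: int = 1000) -> List[Vector]:
    """Solve A x_k = b_k for every b_k in B (A assumed symmetric positive definite).

    Each system runs an independent CG recurrence starting from x = 0;
    only the matrix application is shared between the systems.
    """
    X = [[0.0] * len(b) for b in B]
    R = [list(b) for b in B]           # residuals, equal to b because x0 = 0
    P = [list(b) for b in B]           # search directions
    rr = [dot(r, r) for r in R]
    # A system counts as converged once |r|^2 <= tol^2 * |b|^2
    # (|b|^2 is replaced by 1 for a zero right-hand side).
    limit = [tol * tol * (dot(b, b) or 1.0) for b in B]
    for _ in range(max_iter):
        active = [k for k in range(len(B)) if rr[k] > limit[k]]
        if not active:
            break
        AP = apply_to_block(A, P)      # one pass over A for all vectors
        for k in active:
            alpha = rr[k] / dot(P[k], AP[k])
            X[k] = [x + alpha * p for x, p in zip(X[k], P[k])]
            R[k] = [r - alpha * ap for r, ap in zip(R[k], AP[k])]
            rr_new = dot(R[k], R[k])
            beta = rr_new / rr[k]
            P[k] = [r + beta * p for r, p in zip(R[k], P[k])]
            rr[k] = rr_new
    return X
```

For example, `cg_multi_rhs([[4.0, 1.0], [1.0, 3.0]], [[1.0, 2.0], [2.0, 1.0]])` solves two systems with the same matrix in one call. On real hardware the benefit would come from loading the matrix data once and applying it to several vectors, which raises the ratio of arithmetic to memory traffic; the sketch mirrors that loop order but makes no claim about performance.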
The Conjugate Gradient method is a popular iterative method to solve a system of linear equations an...
Abstract—A new sparse high performance conjugate gradient benchmark (HPCG) has been recently release...
This paper illustrates how GPU computing can be used to accelerate computation...
Kaczmarek O, Schmidt C, Steinbrecher P, Wagner M. Conjugate gradient solvers on Intel Xeon Phi and N...
Mukherjee S, Kaczmarek O, Schmidt C, Steinbrecher P, Wagner M. HISQ inverter on Intel Xeon Phi and N...
Abstract. Lattice Quantum Chromodynamics (LQCD) is currently the only known model independent, non pe...
In this paper we describe a single-node, double precision Field Programmable Gate Array (FPGA) imple...
FPGA devices used in the HPC context promise an increased energy efficiency, enhancing the computing...
Results of porting parts of the Lattice Quantum Chromodynamics code to modern FPGA devices are prese...
The conjugate gradient (CG) algorithm is among the most essential and time consuming parts of lattic...
Scientific computing applications demand ever-increasing performance while traditional microprocesso...
Lattice QCD calculations require significant computational effort, with the dominant fraction of res...
We present the first GPU-based conjugate gradient (CG) solver for lattice QCD with domain-wall fermi...