In this chapter we describe the architecture of a torus interconnect and its implementation on FPGAs, which has so far been used in two different HPC systems. The network design is optimized for applications that benefit from a tightly coupled network, and it allows relatively small messages to be exchanged between nearest neighbours at a high rate. Examples of such applications are lattice quantum chromodynamics (LQCD) simulations and fluid dynamics applications using the Lattice Boltzmann method (LBM). We describe the details of the implementation of our torus network architecture for two massively parallel machines, QCD Parallel Computing on Cell (QPACE) and AuroraScience, and present details on the FPGA resource usage. Furthermore, we discuss...
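To make the nearest-neighbour pattern concrete, the following is a minimal sketch of how node coordinates map to neighbours on a torus with periodic wraparound. It assumes a three-dimensional torus; the function name `torus_neighbours`, the coordinate convention, and the 4x4x4 size in the usage note are illustrative assumptions, not details of the QPACE or AuroraScience implementations.

```python
from typing import Dict, Tuple

Coord = Tuple[int, int, int]


def torus_neighbours(coord: Coord, dims: Coord) -> Dict[str, Coord]:
    """Return the six nearest neighbours of `coord` on a 3D torus.

    Hypothetical helper for illustration: `dims` gives the number of
    nodes along x, y and z. The modulo makes each axis wrap around,
    which is what turns a plain mesh into a torus.
    """
    neighbours: Dict[str, Coord] = {}
    for axis, name in enumerate("xyz"):
        for step, sign in ((1, "+"), (-1, "-")):
            shifted = list(coord)
            # Periodic boundary: the last node's "+" neighbour is node 0.
            shifted[axis] = (coord[axis] + step) % dims[axis]
            neighbours[sign + name] = (shifted[0], shifted[1], shifted[2])
    return neighbours
```

For example, on an assumed 4x4x4 torus, `torus_neighbours((0, 0, 0), (4, 4, 4))["-x"]` gives `(3, 0, 0)`: the node at the edge is linked back to the opposite face, so every node has exactly six direct links.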
The Three Dimensional Fast Fourier Transform (3D-FFT) is commonly used to solve the partial differen...
Application-driven computers for Lattice Gauge Theory simulations have often been based on system-on...
In this paper we describe a single-node, double precision Field Programmable Gate Array (FPGA) imple...
We describe the design and FPGA implementation of a 3D torus network (TNW) to provide nearest-neighb...
Modern Graphics Processing Units (GPUs) are now considered accelerators for general purpos...
QPACE is a novel massively parallel architecture optimized for lattice QCD simulations. A single QPA...
Thesis (M.S.)--Boston University. Applications that require highly parallel computing along with low l...
The computation of a one-dimensional FFT on a c-dimensional torus multicomputer is analyzed. Differe...
The path towards realizing peta-scale computing is increasingly dependent on building supercomputer...
Many parallel algorithms use hypercubes as the communication topology among their processes. When su...
In this paper, the computation of a one-dimensional FFT on a c-dimensional torus multicomputer is an...
The goal of the QPACE project is to build a novel cost-efficient massive parallel supercomputer opti...
In this paper, we discuss the implementation of a lattice Quantum Chromodynamics (QCD) application t...