Abstract. Modern Graphics Processing Units (GPUs) are now widely used as accelerators for general-purpose computation. Tight interaction between the GPU and the interconnection network is needed to exploit the full capability-computing potential of multi-GPU systems on large HPC clusters; an efficient and scalable interconnect is therefore a key technology for bringing GPUs to scientific HPC. In this paper we present the latest architectural and performance improvements of the APEnet+ network fabric, an FPGA-based PCIe board with 6 fully bidirectional off-board links offering 34 Gbps of raw bandwidth per direction, and x8 Gen2 bandwidth towards the host PC. The board implements a Remote Direct Memory Access (RDMA) protoco...
Today’s heterogeneous computer systems combine CPUs, GPUs, and FPGAs with different architectures. G...
In recent years two main platforms emerged as powerful key players in the domain of parallel computi...
Efficient data movement in multi-node systems is a crucial issue at the crossroads of scientific com...
APEnet+ is an INFN (Italian Institute for Nuclear Physics) project aiming to develop a custom 3-Dime...
Many scientific computations need multi-node parallelism for matching up both space (memory) and tim...
In this chapter we describe the architecture of a torus interconnect and its implementation on FPGAs...
Despite dramatic improvements in GPU and interconnect architectures, inter-GPU communication remains...
Thesis (M.S.)--Boston University. Applications that require highly parallel computing along with low l...
We describe the design and FPGA implementation of a 3D torus network (TNW) to provide nearest-neighb...
We present the current status of APENet, our custom 3-dimensional interconnect architecture for PC c...
Systems is dealing with the challenge of providing high-performance ECUs as an enabling technology a...
General-purpose computing on GPUs is emerging as a new paradigm in several fields of science, althou...
To achieve high throughput, core count in compute accelerators such as General-Purpose Graphics Proc...
In this paper we detail the key features, architectural design, and implementation of rCUDA, an adv...