Sustained memory throughput is a key determinant of performance in HPC devices. An accurate estimate of this parameter is essential for manual or automated design space exploration on any HPC device. While there are benchmarks for measuring the sustained memory bandwidth of CPUs and GPUs, such a benchmark for FPGAs has been missing. We present MP-STREAM, an OpenCL-based synthetic micro-benchmark for measuring sustained memory bandwidth, optimized for FPGAs but usable on multiple platforms. Our main contribution is the introduction of various generic as well as device-specific parameters that can be tuned to measure their effect on memory bandwidth. We present results of running our benchmark on a CPU, a GPU ...
In recent years the High Performance Computing (HPC) industry has benefited from the development of ...
The Hewlett-Packard X- and V-Class ccNUMA systems appear well suited to exploiting coarse and fine-g...
High performance computing (HPC) demands huge memory bandwidth and computing resources to achieve ma...
In recent years, the world of high performance computing has been developing rapidly. The goal of t...
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-96983-1_10. Des...
FPGA designs have an immense design space, and there can be an order of magnitude performance differ...
This paper investigates the development of a molecular dynamics code that is highly portable between...
Performance modeling, the science of understanding and predicting application performance, is import...
Finely tuning MPI applications (number of processes, granularity, collective op...
The increasing computation capability of servers comes with a dramatic increas...
The next generation of supercomputers will feature a diverse mix of accelerator devices. The increas...
Current HPC systems provide memory resources that are statically configured and tightly coupled with...
The landscape of High Performance Computing (HPC) system architectures keeps expanding with new tech...
This paper reports on the development of an MPI/OpenCL implementation of LU, an application-level be...
As the capabilities of high performance computing (HPC) resources have grown over the last decades, ...