ABSTRACT. The parallel effective I/O bandwidth benchmark (b_eff_io) is aimed at producing a characteristic average number of the I/O bandwidth achievable with parallel MPI-I/O applications exhibiting various access patterns and using various buffer lengths. It is designed so that 15 minutes should be sufficient for a first pass of all access patterns. First results of the b_eff_io benchmark are given for the IBM SP, Cray T3E, Hitachi SR 8000, and NEC SX-5 systems, and a discussion follows about problematic issues of our current approach. We show how a redesign of our time-driven approach allows for rapid benchmarking of I/O bandwidth with various compute partition sizes. Next, we present how implementation specific file hints can be enabl...
The increasing number of cores per node has propelled the performance of leadership-scale systems fro...
Abstract. We explore several methods utilizing system-wide shared memory to improve the performance ...
Today's most advanced scientific applications run on large clusters consisting of hundreds of th...
The effective I/O bandwidth benchmark (b_eff_io) covers two goals: (1) to achieve a characterist...
Besides the computational scalability of an HPC application, its I/O behaviour can significantly infl...
of the I/O subsystem plays a significant role in parallel applications that need to access large amo...
The broadening disparity between the performance of I/O devices and the performance of processors an...
Input/Output (I/O) operations can represent a significant proportion of the run-time of parallel sci...
Presents the results of a study conducted to evaluate the performance of parallel I/O on...
Parallel computers are increasingly being used to run large-scale applications that also have huge I...
Various layers of the parallel I/O subsystem offer tunable parameters for improving I/O performance ...
Solving the bottleneck of I/O is key in the move towards exascale computing. Research communities mu...
Input/output (I/O) operations can represent a significant proportion of the run-time when large scie...
The purpose of this report is to investigate parallel I/O on HPCx, to compare its performance with s...