Various layers of the parallel I/O subsystem offer tunable parameters for improving I/O performance on large-scale computers. However, searching through a large parameter space is challenging. We are working towards an autotuning framework for determining the parallel I/O parameters that can achieve good I/O performance for different data write patterns. In this paper, we characterize parallel I/O and discuss the development of predictive models for use in effectively reducing the parameter space. Applying our technique on tuning an I/O kernel derived from a large-scale simulation code shows that the search time can be reduced from 12 hours to 2 hours, while achieving 54X I/O performance speedup.
The parallel effective I/O bandwidth benchmark (b_eff_io) is aimed at producing a characte...
Parallel I/O library performance can vary greatly in response to user-tunable parameter v...
I/O-intensive parallel programs have emerged as one of the leading consumers of cycles on parallel m...
Parallel I/O performance depends highly on the interactions among multiple layers of the parallel I...
The contemporary parallel I/O software stack is complex due to a large number of configurations for ...
Parallel input/output is an essential component of modern high-performance computing (HPC). Obtainin...
As high performance computing (HPC) heads towards the exascale era, the computing power sur...
Getting good I/O performance from parallel programs is a critical problem for many application domai...
The broadening disparity between the performance of I/O devices and the performance of processors an...
The area of parallel and distributed computing has grown very fast in the past few decades with the ...
Parallel I/O is an essential component of modern High Performance Computing (HPC). Obtaining good I/...
The 2014 TOP500 supercomputer list includes over 40 deployed petascale systems, and the high perform...
The CPUs, memory, interconnection network, operating system, runtime system, I/O subsystem, and appl...
Besides the computational scalability of an HPC application, its I/O behaviour can significantly infl...
In high-performance computing (HPC) environments, an appropriate amount of hardware resources must b...