Abstract—The performance of reading scientific data from a parallel file system depends on how the data is organized on physical storage devices. Data is often immutable once its producers, such as large-scale simulations, experiments, and observations, have written it to the parallel file system. As a result, data analysis tasks often read slowly when their read pattern does not match the original organization of the data. For example, reading small non-contiguous chunks of data from a large array is many times slower than reading the same amount of data in contiguous chunks. To improve data read performance during the analysis phase, we are developing the Scientific Data Services (SDS) framework for automatically ...
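The contiguous versus non-contiguous contrast in the abstract can be illustrated with a minimal sketch. It assumes a two-dimensional array of 8-byte values stored in row-major order in a single file, and it only counts the byte ranges a reader would have to request; it does not measure real I/O time. The function name `contiguous_runs` and the array sizes are hypothetical and are not part of SDS.

```python
def contiguous_runs(shape, row_range, col_range, itemsize=8):
    """Return the (offset, length) byte ranges needed to read a 2D sub-block
    from a row-major array stored contiguously in a file (assumed layout)."""
    n_rows, n_cols = shape
    r0, r1 = row_range
    c0, c1 = col_range
    runs = []
    for r in range(r0, r1):
        start = (r * n_cols + c0) * itemsize
        length = (c1 - c0) * itemsize
        # Merge with the previous run when the two are adjacent in the file.
        if runs and runs[-1][0] + runs[-1][1] == start:
            runs[-1] = (runs[-1][0], runs[-1][1] + length)
        else:
            runs.append((start, length))
    return runs


shape = (1000, 1000)                                    # illustrative size
row_read = contiguous_runs(shape, (0, 1), (0, 1000))    # one full row
col_read = contiguous_runs(shape, (0, 1000), (0, 1))    # one full column

# Both reads fetch the same number of bytes, but in very different request counts.
print(len(row_read), sum(n for _, n in row_read))
print(len(col_read), sum(n for _, n in col_read))
```

Under these assumptions, the row read is a single contiguous request, while the column read of the same total size breaks into one small request per row. That is the kind of mismatch between read pattern and stored layout that the abstract describes.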
Scientific data analysis typically involves reading massive amounts of data that was generated by si...
The distributed file system, HDFS, is widely deployed as the bedrock for many parallel big data anal...
We present an I/O optimization method for parallel volume rendering based on visibility a...
As scientific simulations and experiments move toward extremely large scales and generate massive am...
Data producers typically optimize the layout of data files to minimize the write time. In most cases...
Abstract—Data producers typically optimize the layout of data files to minimize the write time. In m...
Large-scale scientific applications typically write their data to parallel file systems with organiz...
The applications being developed within the U.S. Exascale Computing Project (ECP) to run on imminent...
I/O data access is a recognized performance bottleneck of high-end computing. Several commercial and...
As high-performance computing approaches exascale, the existing I/O system design is having trouble ...
In this study, the authors propose a simple performance model to promote a better integration betwee...
Parallel input/output in high performance computing is a field of increasing importance. In particul...
Disk drives are the bottleneck in the processing of large amounts of data used in almost all common ...
In this thesis, we propose a self-tuning approach for automatically selecting and refining the file ...
Scientific workflows contain an increasing number of interacting applications, often with big dispari...