Abstract—We seek to enable efficient large-scale parallel execution of applications in which a shared filesystem abstraction is used to couple many tasks. Such parallel scripting (many-task computing, MTC) applications suffer poor performance and utilization on large parallel computers because of the volume of filesystem I/O and a lack of appropriate optimizations in the shared filesystem. Thus, we design and implement a scalable MTC data management system that uses aggregated compute node local storage for more efficient data movement strategies. We co-design the data management system with the data-aware scheduler to enable dataflow pattern identification and automatic optimization. The framework reduces the time to solution of parallel ...
Parallel scientific applications require high-performance I/O support from underlying file systems. ...
Computing systems are becoming increasingly data-intensive because of the explosion of data and the ...
Distributed systems are growing exponentially in computing capacity. On the high-performance com...
Many scientific applications can be efficiently expressed with the parallel scripting (many-task com...
The success of modern applications depends on the insights they collect from their data repositories...
A fundamental problem of parallel computing is that applications often require large-size inst...
In order to run tasks in a parallel and load-balanced fashion, existing scientific parallel applicat...
In many-task computing (MTC), applications such as scientific workflows or parameter sweeps communic...
It is now widely recognized that increased levels of parallelism are a necessary condition for impro...
Effective high-level data management is becoming an important issue with more and more scientific a...
Due to the explosive growth in the size of scientific data sets, data-intensive computing is an emer...