This paper describes a new approach to managing array data layouts to optimize performance for scientific codes. Prior research has shown that changing data layouts (e.g., interleaving arrays) can improve performance. However, two major obstacles have kept such optimizations from being widely used: (1) the need to select different layouts for different computing platforms, and (2) the cost of rewriting codes to use new layouts. We describe a source-to-source translation process that allows us to generate codes with different array interleavings, based on a data layout specification. We used this process to generate 19 different data layouts for an ASC benchmark code (IRSmk) and 32 different data layouts for the DARPA UHPC challenge...
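As a small illustration of what "interleaving arrays" means here, the sketch below packs several equal-length arrays into one buffer so that values sharing an index sit next to each other in memory. It is only a toy under stated assumptions: the function names `interleave` and `deinterleave` are hypothetical, and this is not the source-to-source translation process or the layout specification format described in the abstract.

```python
from array import array


def interleave(arrays):
    """Pack K equal-length arrays into one buffer.

    Element i of array j is stored at position i * K + j, so all values
    that share an index i are adjacent. (Hypothetical helper.)
    """
    k = len(arrays)
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same length")
    out = array("d", [0.0] * (n * k))
    for i in range(n):
        for j in range(k):
            out[i * k + j] = arrays[j][i]
    return out


def deinterleave(buf, k):
    """Recover the K original arrays from an interleaved buffer."""
    return [list(buf[j::k]) for j in range(k)]


a = [1.0, 2.0, 3.0]
b = [10.0, 20.0, 30.0]
packed = interleave([a, b])
print(list(packed))
print(deinterleave(packed, 2))
```

A loop that reads `a[i]` and `b[i]` together touches one contiguous region in the interleaved buffer instead of two separate ones. Whether that helps depends on the platform and the access pattern, which is the layout-selection problem the abstract raises.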
Abstract—Performance of reading scientific data from a parallel file system depends on the organizat...
Supercomputers need not only to have fast functional units, but also to have rapid access to massive...
This paper introduces a dynamic layout optimization strategy to minimize the number of cycles spent ...
The system efficiency and throughput of most architectures are critically dependent on the ability o...
Besides the algorithm selection, the data layout choice is the key intellectual step in writing an e...
The goal of languages like Fortran D or High Performance Fortran (HPF) is to provide a simple yet ef...
Input-output (I/O) optimization at the low-level design of data layout on disk drastically impacts t...
Part 2: Parallel and Multi-Core Technologies. Applying appropriate data structur...
This work was also published as a Rice University thesis/dissertation: http://hdl.handle.net/1911/1...
As scientific simulations and experiments move toward extremely large scales and generate massive am...
Recently, multi-core chips have become omnipresent in computer systems ranging from high-end server...
High Performance Fortran (HPF) is rapidly gaining acceptance as a language for parallel programming....
Abstract. As the ever-increasing gap between the speed of processor and the speed of memory has beco...