Existing techniques can enhance the locality of arrays indexed by affine functions of induction variables. This paper presents a technique to localize non-affine array references, such as the indirect memory references common in sparse-matrix computations. Our optimization combines elements of tiling, data-centric tiling, data remapping, and inspector-executor parallelization. We describe our technique, bucket tiling, which includes the tasks of permutation generation, data remapping, and loop regeneration. We show that profitability cannot generally be determined at compile time, but requires an extension to run time. We demonstrate our technique on three codes: integer sort, conjugate gradient, and a kernel used in simulating a beating...
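The three tasks named in the abstract (permutation generation, data remapping, and loop regeneration) can be illustrated on a simple indirect accumulation, out[idx[i]] += val[i]. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the counting-sort inspector, and the bucket_size parameter are all hypothetical choices made for illustration, and the update is assumed to be a reorderable reduction.

```python
# Illustrative sketch only: names and structure are assumptions,
# not taken from the paper.

def bucket_permutation(idx, num_targets, bucket_size):
    """Permutation generation (inspector step).

    Stable counting sort of iteration numbers by the bucket that
    idx[i] falls into. Returns the permutation and the start offset
    of each bucket within the permuted iteration order.
    """
    num_buckets = (num_targets + bucket_size - 1) // bucket_size
    starts = [0] * (num_buckets + 1)
    for t in idx:
        starts[t // bucket_size + 1] += 1
    for b in range(num_buckets):
        starts[b + 1] += starts[b]
    cursor = starts[:-1].copy()
    perm = [0] * len(idx)
    for i, t in enumerate(idx):
        b = t // bucket_size
        perm[cursor[b]] = i
        cursor[b] += 1
    return perm, starts


def remap(data, perm):
    """Data remapping: copy an array into the permuted iteration order."""
    return [data[i] for i in perm]


def accumulate_original(idx, val, num_targets):
    """Original loop: writes to out[] jump around unpredictably."""
    out = [0] * num_targets
    for i in range(len(idx)):
        out[idx[i]] += val[i]
    return out


def accumulate_bucket_tiled(idx, val, num_targets, bucket_size):
    """Regenerated loop (executor step): one tile per bucket.

    Within a tile, every write lands in the same bucket_size-wide
    slice of out[], so that slice can stay resident in cache.
    """
    perm, starts = bucket_permutation(idx, num_targets, bucket_size)
    idx_r, val_r = remap(idx, perm), remap(val, perm)
    out = [0] * num_targets
    for b in range(len(starts) - 1):
        for k in range(starts[b], starts[b + 1]):
            out[idx_r[k]] += val_r[k]
    return out
```

In this sketch the inspector and remapping passes each traverse the whole index array once, so the transformation can only pay off when that cost is recovered by improved locality in the regenerated loop. This depends on the index values and array sizes seen at run time, which is in line with the abstract's point that profitability cannot generally be decided at compile time.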
On modern computers, the performance of programs is often limited by memory latency rather than by p...
This paper presents compilation techniques to compress holes, which are caused by the non-unit align...
Applications that manipulate sparse data structures contain memory reference patterns that are unkn...
Keywords: scratch pad memory, affine reference. This paper considers compiler management of fast, local memorie...
Many scientific applications are organized in a data parallel way: as sequences of parallel...
Programming languages that provide multidimensional arrays and a flat linear model of memory must im...
Exploiting locality of references has become extremely important in realizing the potential...
This work presents a novel strategy for the parallelization of applications containing sparse matrix...
Automatic scheduling in parallel/distributed systems for coarse grained irregular problems such as s...
This paper presents an efficient compilation technique to generate the local memory acce...
Address generation for compiling programs, written in HPF, to executable SPMD code is an...
This paper presents compilation techniques used to compress holes, which are caused by t...
The bandwidth mismatch between processor and main memory is one major limiting problem. Although str...
Global locality optimization is a technique for improving the cache performance of a sequence of loo...
Many large-scale computational applications contain irregular data access patterns related to unstru...