Abstract—This paper presents a data layout optimization technique for sequential and parallel programs based on the theory of hyperplanes from linear algebra. Given a program, our framework automatically determines suitable memory layouts that can be expressed by hyperplanes for each array that is referenced. We discuss the cases where data transformations are preferable to loop transformations and show that under certain conditions a loop nest can be optimized for perfect spatial locality by using data transformations. We argue that data transformations can also optimize spatial locality for some arrays without distorting temporal/spatial locality exhibited by others. We divide the problem of optimizing data layout into two independent sub...
Despite decades of work in this area, the construction of effective loop nest optimizers and paralle...
Many optimizations (of programs with loops) used in parallelizing compilers and systolic array desig...
Two issues in linear algebra algorithms for multicomputers are addressed. First, how to unify paralle...
This paper presents a data layout optimization technique based on the theory of hyperplanes from lin...
The delivered performance on modern processors that employ deep memory hierarchies is closely relate...
Abstract. The delivered performance on modern processors that employ deep memory hierarchies is close...
The actual performance of programs on modern processors that employ deep memory hierarchies is clos...
Abstract. Exploiting locality of references has become extremely important in realizing the potential...
Abstract. This paper aims to improve locality of references by suitably choosing array layouts. We u...
The actual performance of programs on modern processors that employ deep memory hierarchies is clos...
Global locality optimization is a technique for improving the cache performance of a sequence of loo...
This paper aims to improve locality of references by suitably choosing array layouts. We use a ne...
This paper presents a technique for finding good distributions of arrays and suitable loop restructu...
Global locality optimization is a technique for improving the cache performance of a sequence of loo...
Data and computation alignment is an important part of compiling sequential programs to architecture...