Abstract. Data copy is an important compiler optimization which dynamically rearranges the layout of arrays by copying their elements into local buffers. Traditionally, array copy is considered expensive and has been applied only to the working sets of fully blocked computations. This paper presents an algorithm which automatically applies data copy to optimize the performance of general computations independent of blocking. The algorithm automatically decides where to insert copy operations and which regions of arrays to copy. In addition, when specialized, it is equivalent to a general scalar replacement algorithm on arbitrary array computations. The algorithm is fully implemented and has been applied to optimize several scientific kern...
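The data-copy idea in the abstract above can be illustrated with a small hand-written sketch. This is not the paper's algorithm, which decides automatically where to copy and which regions to copy. Here the copy is placed by hand, and the names `matmul_naive`, `matmul_with_copy`, and `tile` are hypothetical, chosen only for illustration. The sketch copies a column strip of one matrix into a contiguous local buffer, so the innermost loop reads it with unit stride and reuses it across every row of the other matrix.

```python
def matmul_naive(a, b, n):
    """Reference product c = a * b for n x n lists of lists."""
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # b is walked down a column: a strided access pattern.
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_with_copy(a, b, n, tile=2):
    """Same product, with each column strip of b copied into a local buffer.

    Assumption: the copy is inserted by hand at the strip level; a compiler
    implementing data copy would choose this placement itself.
    """
    c = [[0.0] * n for _ in range(n)]
    for j0 in range(0, n, tile):
        width = min(tile, n - j0)
        # Copy step: store the strip column by column, so that column jj
        # occupies buf[jj * n : (jj + 1) * n] contiguously.
        buf = [b[k][j0 + jj] for jj in range(width) for k in range(n)]
        # Compute step: the buffer is reused for every row of a.
        for i in range(n):
            row = a[i]
            for jj in range(width):
                base = jj * n
                acc = 0.0
                for k in range(n):
                    acc += row[k] * buf[base + k]
                c[i][j0 + jj] = acc
    return c


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    # Both versions compute the same product.
    assert matmul_with_copy(a, b, 2) == matmul_naive(a, b, 2)
```

The copy costs one pass over each strip, and that cost is spread over the `n` rows that reuse the buffer. This is the kind of trade-off a data-copy algorithm has to weigh when it decides whether a copy is worthwhile.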
In many scientific applications, array redistribution is usually required to enhance dat...
Abstract. It has been observed that memory access performance can be improved by restructuring data d...
We detail an algorithm implemented in the R-Stream compiler to perform controlled array expansion ...
The system efficiency and throughput of most architectures are critically dependent on the ability o...
The focus of this paper is on a data flow-transformation called advanced copy propagation. After an ...
The literature has witnessed much work aimed at improving the efficiency of memory systems. The mot...
In the past decade, processor speed has become significantly faster than memory speed. Small, fast c...
We consider the well-known problem of avoiding unnecessary costly copying that arises in languages w...
Thesis (Ph. D.)--University of Rochester. Dept. of Computer Science, 1997. Simultaneously published...
© 1994 ACM. In the past decade, processor speed has become significantly faster than memory speed. S...
Programming languages like Fortran or C define exactly the layout of array elements in memory. Progr...
In this paper, we discuss a program transformation technique called array reshaping. Array reshaping...
Numerical applications frequently contain nested loop structures that process large arrays of data. ...